Docker input plugin #14319

clement-deltel · 2023-11-20T14:09:20Z

Relevant telegraf.conf

# # Read metrics about docker containers
[[inputs.docker]]
#   ## Docker Endpoint
#   ##   To use TCP, set endpoint = "tcp://[ip]:[port]"
#   ##   To use environment variables (ie, docker-machine), set endpoint = "ENV"
  endpoint = "unix:///var/run/docker.sock"
#
#   ## Set to true to collect Swarm metrics(desired_replicas, running_replicas)
#   ## Note: configure this in one of the manager nodes in a Swarm cluster.
#   ## configuring in multiple Swarm managers results in duplication of metrics.
  gather_services = false
#
#   ## Only collect metrics for these containers. Values will be appended to
#   ## container_name_include.
#   ## Deprecated (1.4.0), use container_name_include
#   container_names = []
#
#   ## Set the source tag for the metrics to the container ID hostname, eg first 12 chars
  source_tag = false
#
#   ## Containers to include and exclude. Collect all if empty. Globs accepted.
  container_name_include = []
  container_name_exclude = []
#
#   ## Container states to include and exclude. Globs accepted.
#   ## When empty only containers in the "running" state will be captured.
#   ## example: container_state_include = ["created", "restarting", "running", "removing", "paused", "exited", "dead"]
#   ## example: container_state_exclude = ["created", "restarting", "running", "removing", "paused", "exited", "dead"]
#   # container_state_include = []
#   # container_state_exclude = []
#
#   ## Objects to include for disk usage query
#   ## Allowed values are "container", "image", "volume"
#   ## When empty disk usage is excluded
  storage_objects = ["container", "image", "volume"]

#   ## Timeout for docker list, info, and stats commands
  timeout = "5s"
#
#   ## Whether to report for each container per-device blkio (8:0, 8:1...),
#   ## network (eth0, eth1, ...) and cpu (cpu0, cpu1, ...) stats or not.
#   ## Usage of this setting is discouraged since it will be deprecated in favor of 'perdevice_include'.
#   ## Default value is 'true' for backwards compatibility, please set it to 'false' so that 'perdevice_include' setting
#   ## is honored.
  perdevice = false
#
#   ## Specifies for which classes a per-device metric should be issued
#   ## Possible values are 'cpu' (cpu0, cpu1, ...), 'blkio' (8:0, 8:1, ...) and 'network' (eth0, eth1, ...)
#   ## Please note that this setting has no effect if 'perdevice' is set to 'true'
  perdevice_include = ["blkio", "cpu", "network"]
#
#   ## Whether to report for each container total blkio and network stats or not.
#   ## Usage of this setting is discouraged since it will be deprecated in favor of 'total_include'.
#   ## Default value is 'false' for backwards compatibility, please set it to 'true' so that 'total_include' setting
#   ## is honored.
  total = false
#
#   ## Specifies for which classes a total metric should be issued. Total is an aggregated of the 'perdevice' values.
#   ## Possible values are 'cpu', 'blkio' and 'network'
#   ## Total 'cpu' is reported directly by Docker daemon, and 'network' and 'blkio' totals are aggregated by this plugin.
#   ## Please note that this setting has no effect if 'total' is set to 'false'
  total_include = ["blkio", "cpu", "network"]
#
#   ## docker labels to include and exclude as tags.  Globs accepted.
#   ## Note that an empty array for both will include all labels as tags
#   docker_label_include = []
#   docker_label_exclude = []
#
#   ## Which environment variables should we use as a tag
#   tag_env = ["JAVA_HOME", "HEAP_SIZE"]
#
#   ## Optional TLS Config
#   # tls_ca = "/etc/telegraf/ca.pem"
#   # tls_cert = "/etc/telegraf/cert.pem"
#   # tls_key = "/etc/telegraf/key.pem"
#   ## Use TLS but skip chain & host verification
#   # insecure_skip_verify = false

Logs from Telegraf

Job for telegraf.service failed because the control process exited with error code.
See "systemctl status telegraf.service" and "journalctl -xeu telegraf.service" for details.


telegraf.service - Telegraf
     Loaded: loaded (/lib/systemd/system/telegraf.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Mon 2023-11-20 07:56:44 CST; 4s ago
       Docs: https://github.com/influxdata/telegraf
    Process: 2002175 ExecStart=/usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d $TELEGRAF_OPTS (code=exited, status=1/FAILURE)
   Main PID: 2002175 (code=exited, status=1/FAILURE)
        CPU: 145ms

Nov 20 07:56:44 home-server systemd[1]: telegraf.service: Scheduled restart job, restart counter is at 5.
Nov 20 07:56:44 home-server systemd[1]: Stopped Telegraf.
Nov 20 07:56:44 home-server systemd[1]: telegraf.service: Start request repeated too quickly.
Nov 20 07:56:44 home-server systemd[1]: telegraf.service: Failed with result 'exit-code'.
Nov 20 07:56:44 home-server systemd[1]: Failed to start Telegraf.

System info

Telegraf 1.28.5, Ubuntu 22.04, Docker version 24.0.7, build afdd53b

Docker

No response

Steps to reproduce

Install Docker and Telegraf.
Enable the Telegraf input plugin for Docker, and specifically enable the storage_objects option.
Restart Telegraf with the command "systemctl restart telegraf".

If the storage_objects option is commented out, Telegraf works fine. If I uncomment it, Telegraf exits on startup and refuses to start. I tried each of the following values, without success:

storage_objects = ["container", "image", "volume"]
storage_objects = ["container"]
storage_objects = []

Expected behavior

Since Telegraf supports Docker metrics, enabling the storage_objects option for Docker disk usage should work, and restarting Telegraf should not fail.

Actual behavior

As soon as I enable the "storage_objects" option and restart Telegraf to apply the change, the service fails to start. The systemd output above is all I have; it shows the process exiting with status 1 but not the underlying error.
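The systemd status above only shows the exit code. A minimal sketch of how the underlying error could be surfaced, assuming the default paths from the ExecStart line in that output; `telegraf_diagnose` is a hypothetical helper name, not part of Telegraf:

```shell
#!/bin/sh
# Sketch: print recent unit logs, then run the Docker input once in the foreground.
# telegraf_diagnose is a hypothetical helper; the default paths are taken from
# the ExecStart line in the systemctl output and may differ on other hosts.
telegraf_diagnose() {
    conf="${1:-/etc/telegraf/telegraf.conf}"
    conf_dir="${2:-/etc/telegraf/telegraf.d}"

    # Last 50 journal lines for the unit; the startup error usually appears here.
    journalctl -u telegraf.service -n 50 --no-pager

    # One-shot run: gather once, print metrics to stdout, and exit, so a
    # configuration error shows up on the terminal instead of via systemd.
    telegraf --config "$conf" --config-directory "$conf_dir" \
        --input-filter docker --test
}

# Only run on a host that actually has Telegraf installed.
if command -v telegraf >/dev/null 2>&1; then
    telegraf_diagnose
fi
```

Running the one-shot test as the same user as the service (typically `telegraf`) keeps socket permissions comparable to the systemd case.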

Additional info

No response


R290 · 2023-11-20T16:21:09Z

Good to see that there is demand for the storage objects part of the docker input plugin. The pull request that actually adds the functionality (#13894) is scheduled to be added in v1.29.0 (https://github.com/influxdata/telegraf/milestone/94?closed=1). Would you be interested in testing the Telegraf nightly version? You can download it at: https://www.influxdata.com/downloads/
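Since the option only arrives with v1.29.0, a quick check of the installed version before enabling it can avoid the failed start. The following is a rough sketch, assuming GNU `sort -V` is available and that `telegraf --version` prints the version number as its second field; `version_at_least` is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: compare two dotted version strings using GNU sort's version ordering.
# version_at_least is a hypothetical helper, not something shipped with Telegraf.
version_at_least() {
    # Succeeds (exit 0) when $1 >= $2: the smaller of the two must be $2.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Only query the binary on a host that actually has Telegraf installed.
if command -v telegraf >/dev/null 2>&1; then
    # Assumption: output looks like "Telegraf 1.28.5 (git: ...)".
    installed="$(telegraf --version | awk '{print $2}')"
    if version_at_least "$installed" "1.29.0"; then
        echo "storage_objects should be available in $installed"
    else
        echo "storage_objects needs v1.29.0 or a nightly build; found $installed"
    fi
fi
```

Nightly builds may carry a suffix after the version number, so the comparison is only a rough guide for them.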

clement-deltel · 2023-11-21T11:04:56Z

Hello,

Thank you for letting me know. I did not realize the feature had not been released yet, since it is already documented. Yes, I would be interested in testing. I installed the nightly build and so far it works fine for all types of Docker objects: container, image, and volume.

srebhan · 2023-11-24T11:44:52Z

@clement-deltel so your issue was using a not-yet-released feature? Everything works and we can close the issue?

clement-deltel · 2023-11-24T12:58:04Z

Yes, this is exactly what happened. You can close this issue.

clement-deltel added the "bug" label (unexpected problem or unintended behavior) · Nov 20, 2023

srebhan added the "waiting for response" label (waiting for response from contributor) · Nov 24, 2023

telegraf-tiger bot removed the "waiting for response" label (waiting for response from contributor) · Nov 24, 2023

srebhan closed this as completed Nov 27, 2023
