Let's say I have two output plugins, influxdb and opentsdb. When I take down opentsdb, telegraf is unable to send metrics to either of them, as it reports a connection failure. I suggest that if one of the outputs is unavailable, telegraf should still send metrics to the available ones. For the unavailable output, telegraf could either drop the data or keep it in a buffer or on disk, with a size limit.
This is a mostly known issue; there is a bit more discussion in #2919. However, so long as the output does not block for too long, the other outputs should continue working, though it can still cause other performance issues. One of the challenges in fixing this is that, under normal use, this blocking behavior is used to throttle service inputs that read from queuing systems.
The best thing that can be done is to ensure there are timeouts configured on all of your outputs, and that they are not too large. This way the output cannot block the main process for too long. However, as I look over the OpenTSDB output I don't see any timeout configuration, so if it blocked it would totally halt Telegraf.
I think we should change this ticket to be "OpenTSDB output can block Telegraf". Does that sound alright?
danielnelson changed the title from "[Bug?] Telegraf doesn't work when one of multiple outputs is not available" to "OpenTSDB output can block Telegraf due to no timeout" on Jul 12, 2017
This is expected behavior currently. When a metric is ready to be sent to outputs, it's added to the outputs one by one. If one output takes a long time sending, buffers can fill up and metrics can be dropped.
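The behavior requested in the original report (keep feeding healthy outputs, and drop data for the unavailable one once a size limit is reached) can be sketched in Go roughly as follows. This is only an illustration under assumptions, not how Telegraf is implemented: `Metric`, `BoundedOutput`, and `FanOut` are hypothetical names, and the sketch is single-goroutine, so the drop counter is not synchronized.

```go
package main

import "fmt"

// Metric is a hypothetical stand-in for a Telegraf metric.
type Metric string

// BoundedOutput is a hypothetical per-output buffer with a fixed size limit.
type BoundedOutput struct {
	Name    string
	buf     chan Metric
	Dropped int // metrics discarded because the buffer was full
}

// NewBoundedOutput creates an output whose buffer holds at most limit metrics.
func NewBoundedOutput(name string, limit int) *BoundedOutput {
	return &BoundedOutput{Name: name, buf: make(chan Metric, limit)}
}

// Add enqueues m without blocking. If the buffer is full, m is dropped
// and counted, so a stalled output never delays the caller.
func (o *BoundedOutput) Add(m Metric) bool {
	select {
	case o.buf <- m:
		return true
	default:
		o.Dropped++
		return false
	}
}

// Len reports how many metrics are currently buffered.
func (o *BoundedOutput) Len() int { return len(o.buf) }

// FanOut adds the metric to the outputs one by one.
func FanOut(outputs []*BoundedOutput, m Metric) {
	for _, o := range outputs {
		o.Add(m)
	}
}

func main() {
	influx := NewBoundedOutput("influxdb", 10)
	tsdb := NewBoundedOutput("opentsdb", 2) // pretend this one is down and never drained

	for i := 0; i < 5; i++ {
		FanOut([]*BoundedOutput{influx, tsdb}, Metric(fmt.Sprintf("cpu value=%d", i)))
	}
	fmt.Println(influx.Len(), tsdb.Len(), tsdb.Dropped) // 5 2 3
}
```

The trade-off is the one mentioned earlier in the thread: a non-blocking add like this removes the back-pressure that throttles service inputs reading from queuing systems.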