telegraf taking too long to collect net metrics #3318
Comments
Is there more to the SIGQUIT dumps?
No, it seems that journald cuts off the end.
I can switch to file logging and repeat the SIGQUIT if needed.
On Tue, Oct 10, 2017, 20:17, Daniel Nelson <[email protected]> wrote:
… Is there more to the SIGQUIT dumps?
Yeah, could you do that? Also, when this occurs, try to make a request to
It is not failing now, although the kubernetes input times out constantly (and the docker socket from time to time).
It might be helpful to enable the
Here are a few things to check into:
Yes, the issue with the net input is still happening.
I haven't been able to figure out the problem; let's try to get the full SIGQUIT stack trace. It might be easier to run with
@adrianlzt I've learned that it can be quite time-consuming for the net input to discover interfaces, because of the cost of checking whether each interface is a loopback and whether it is up. It may help if you add a list or glob (only in 1.5.0+) of interfaces:
Let me know if this helps.
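A minimal sketch of what such a restriction could look like in telegraf.conf. The interface names `eth0` and `eth1` and the `eth*` pattern are placeholders, not taken from the reporter's hosts; the glob form assumes Telegraf 1.5.0 or later, as noted above.

```toml
[[inputs.net]]
  ## Collect only from the named interfaces instead of discovering all of them.
  ## "eth0" and "eth1" are hypothetical names; substitute your own.
  interfaces = ["eth0", "eth1"]

  ## With 1.5.0+, a glob can be used instead (again a placeholder pattern):
  # interfaces = ["eth*"]
```

On hosts running docker or kubernetes, the number of virtual interfaces can be large, so an explicit list may reduce the per-interval discovery work the most.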
Thanks for the tip, but I cannot test anymore. New job :)
Congrats! I'm going to close this issue then. If someone else reading this has the issue and the tip above doesn't help, please open a new issue.
I am having this same issue and have seen multiple related reports, but haven't found anything that has helped resolve it. I just enabled the internal and net inputs as described and will see what data that produces.
Bug report
We are seeing this message every 10 seconds on some of our servers:
Net metrics are not being sent, but the rest are working correctly.
If I run telegraf in test mode it works correctly:
I restarted telegraf on one of the failing nodes and it is now working correctly.
Killed another node with SIGQUIT: https://gist.github.com/7e999b78093bb41fd89e1314fe7e4b1b
Another SIGQUIT: https://gist.github.com/adrianlzt/09f3c4dcd5ff54d1ddc5fcb156003d7d
12 hours later, one of the servers has the same problem. Before it failed, some other plugins were reporting errors. Errors and SIGQUIT: https://gist.github.com/6c8173e791e26d78545ee7f5a00ba08e
Relevant telegraf.conf:
Loaded outputs: influxdb
Loaded inputs: inputs.disk inputs.diskio inputs.kernel inputs.mem inputs.processes inputs.swap inputs.system inputs.cpu inputs.procstat inputs.docker inputs.procstat inputs.prometheus inputs.kubernetes inputs.net inputs.netstat inputs.prometheus inputs.procstat inputs.procstat
System info:
When the agents were killed, they were running telegraf-1.3.5-1.x86_64.
Now they are running 1.4.1-1.x86_64
OS: Red Hat Enterprise Linux Server release 7.3 (Maipo)
Additional info:
Maybe related to #2870 and #3107