Possible truncation outputting to udp in influx format #2881
Comments
cc @oplehto
Valid case, but it should be rather hard to trigger. IP will fragment the packet if it's too large. The maximum payload size is a little under 64KiB, which would be an insanely huge point. Edit: Actually, I think what's likely to happen is that the write operation will return an error, so nothing will be written. Not truncation.
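For reference, the "a little under 64KiB" figure can be worked out from the header sizes. The sketch below assumes IPv4 with no IP options; the constant names and the `fitsInDatagram` helper are hypothetical and are not part of Telegraf.

```go
package main

import "fmt"

const (
	maxIPv4PacketSize = 65535 // limit of the 16-bit IPv4 total-length field
	ipv4HeaderSize    = 20    // minimum IPv4 header, assuming no options
	udpHeaderSize     = 8     // fixed UDP header
	// Largest UDP payload that fits in one IPv4 datagram.
	maxUDPPayloadIPv4 = maxIPv4PacketSize - ipv4HeaderSize - udpHeaderSize
)

// fitsInDatagram is a hypothetical helper: it reports whether a serialized
// point could be sent as a single UDP datagram over IPv4.
func fitsInDatagram(line []byte) bool {
	return len(line) <= maxUDPPayloadIPv4
}

func main() {
	fmt.Println("max UDP payload over IPv4:", maxUDPPayloadIPv4)
	fmt.Println("70000-byte point fits:", fitsInDatagram(make([]byte, 70000)))
}
```

A point larger than this limit cannot be sent in one datagram at all, regardless of IP fragmentation.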
There are no errors thrown, at least in our case. When the bug is triggered by a single large metric in a metrics batch (what is gathered during an interval), the rest of the batch is quietly dropped. Thus a subset of the metrics will be sent cleanly but there are random gaps in data.
That's not what I get. When I test I get:
It looks like we send one point per packet, so there shouldn't be gaps in data due to this. Maybe the network is being overloaded due to the batching, causing packets not to be received? Looks like I was concerned about nothing here. We could run the output through the Split function, but I think it's not needed at this time.
@danielnelson Solving this is really tricky. The reason to return an error is that telegraf will re-call the
All of the points should be resent, so what would happen in this case is that the socket_writer would be completely stuck, right? A possible exception to this would be an ICMP error for a previous send.
When I looked at the code, it seemed like telegraf would eventually give up on the write and skip to the next batch. It looked to be driven by new data coming in, but the code was somewhat hard to follow, so I didn't dig into it too deeply.
Closing due to inactivity. This might still be an issue, but after so many years a new issue with a reproducible use case would be a better way to help resolve it.
Bug report
Triggered by reports of malformed metrics in #2862 and based on a code inspection, it appears to me that when using udp with the socket_writer, points will be truncated when serializing in influx format.
This could occur anywhere we have a fixed output buffer.
Relevant telegraf.conf:
N/A
System info:
1.3.1
Steps to reproduce:
I have not tested this!
Expected behavior:
Use field splitting where possible to fit into buffer, warn if cannot possibly fit.
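A minimal sketch of what field splitting could look like, assuming the series key (measurement plus tags) and each `name=value` field are already escaped and encoded. The `splitFields` function, its parameters, and the sample values are hypothetical illustrations; this is not Telegraf's serializer or its Split function.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFields is a hypothetical helper. It packs pre-encoded "name=value"
// fields into as few line-protocol lines as possible, keeping each line
// (including its trailing newline) within maxBytes. It returns an error
// when a single field cannot fit even on a line of its own.
func splitFields(key string, fields []string, timestamp int64, maxBytes int) ([]string, error) {
	suffix := fmt.Sprintf(" %d\n", timestamp)
	overhead := len(key) + 1 + len(suffix) // key, separating space, timestamp and newline

	var lines, cur []string
	curLen := 0 // bytes used by the fields (and commas) of the current line
	flush := func() {
		if len(cur) > 0 {
			lines = append(lines, key+" "+strings.Join(cur, ",")+suffix)
			cur, curLen = nil, 0
		}
	}

	for _, f := range fields {
		if overhead+len(f) > maxBytes {
			return nil, fmt.Errorf("field %q cannot fit in %d bytes", f, maxBytes)
		}
		need := len(f)
		if len(cur) > 0 {
			need++ // comma separating this field from the previous one
		}
		if overhead+curLen+need > maxBytes {
			flush()
			need = len(f) // first field on a new line needs no comma
		}
		cur = append(cur, f)
		curLen += need
	}
	flush()
	return lines, nil
}

func main() {
	lines, err := splitFields("cpu,host=a", []string{"a=1", "b=2", "c=3"}, 10, 22)
	if err != nil {
		fmt.Println("warning:", err)
		return
	}
	for _, l := range lines {
		fmt.Print(l)
	}
}
```

Every emitted line repeats the series key and timestamp, so the split lines describe the same point; the error path is where a caller would log the "cannot possibly fit" warning.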
Actual behavior:
I think it will be truncated.
Additional info:
#2880 (comment)