in_emitter: Fix single record chunks and respect mem_buf_limit pause #8473
@edsiper @nokute78 @leonardo-albertovich @pwhelan
1b8aec2 to e61f412
e61f412 to 350fe62
…uf_limit The current code creates a situation where only one record is created per chunk. If no ring buffer exists, the old mechanism is used. In addition, the in_emitter plugin continued to accept records even after the configured emitter_mem_buf_limit was reached. This commit adds a check for whether the plugin is paused and returns accordingly. Signed-off-by: Richard Treu <[email protected]>
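The pause check described in this commit message can be illustrated with a minimal sketch. This is not the code from the PR: the struct `emitter_sketch` and the functions `emitter_sketch_add` and `emitter_sketch_release` are hypothetical names, and the byte accounting is an assumed simplification of how a memory buffer limit might be tracked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified emitter state; not Fluent Bit's real struct. */
struct emitter_sketch {
    size_t mem_buf_limit;  /* limit in bytes, 0 means unlimited */
    size_t mem_used;       /* bytes currently buffered */
    int    paused;         /* 1 once the limit has been reached */
};

/*
 * Try to add a record of record_size bytes.
 * Returns 0 if accepted, -1 if rejected because the emitter is paused.
 */
int emitter_sketch_add(struct emitter_sketch *ctx, size_t record_size)
{
    assert(ctx != NULL);

    /* The check this commit describes: refuse new records while paused. */
    if (ctx->paused) {
        return -1;
    }

    ctx->mem_used += record_size;

    /* Pause as soon as the configured limit is reached. */
    if (ctx->mem_buf_limit > 0 && ctx->mem_used >= ctx->mem_buf_limit) {
        ctx->paused = 1;
    }
    return 0;
}

/* Account for bytes flushed downstream and resume if below the limit. */
void emitter_sketch_release(struct emitter_sketch *ctx, size_t bytes)
{
    assert(ctx != NULL);

    ctx->mem_used = (bytes > ctx->mem_used) ? 0 : ctx->mem_used - bytes;

    if (ctx->paused &&
        (ctx->mem_buf_limit == 0 || ctx->mem_used < ctx->mem_buf_limit)) {
        ctx->paused = 0;
    }
}
```

In this sketch the caller sees the -1 return value and can hold back or retry the record instead of silently losing it, which is the behavior the commit message asks for.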
350fe62 to feb4243
While rolling Fluent Bit out in my environment, we ran into this bug. We have built and run Fluent Bit from this PR with no sign of failure. It has been moving 10+ TB/day of logs for 12 days across 500-1500 collectors. We would very much like to see a fix introduced, whether this one or another, to avoid log loss.
@drbugfinder-work thanks for triaging and fixing this. In order to merge it, would you please split the commits per component, e.g.:
thanks
This commit will pause the inputs (sending to multiline) so that no in-flight records are lost. Signed-off-by: Richard Treu <[email protected]>
This commit will pause the inputs (sending to rewrite_tag) so that no in-flight records are lost. Signed-off-by: Richard Treu <[email protected]>
This commit will pause all known inputs (sending to multiline) so that no in-flight records are lost. in_emitter will keep track of all sending input plugins and actively pause/resume them when in_emitter itself is paused/resumed. Signed-off-by: Richard Treu <[email protected]>
This commit will add a resume message that is logged when a paused input plugin is resumed. Signed-off-by: Richard Treu <[email protected]>
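The tracking of sending inputs described in the commits above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `input_sketch`, `emitter_tracker`, `tracker_register` and `tracker_set_paused` are hypothetical names, and a fixed-size array stands in for whatever list structure the real code uses.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_TRACKED_INPUTS 8

/* Hypothetical stand-in for an input plugin instance. */
struct input_sketch {
    const char *name;
    int paused;
};

/* Hypothetical registry of inputs that have sent records to the emitter. */
struct emitter_tracker {
    struct input_sketch *inputs[MAX_TRACKED_INPUTS];
    int count;
};

/* Remember an input once; returns 0 on success, -1 if the registry is full. */
int tracker_register(struct emitter_tracker *t, struct input_sketch *in)
{
    int i;

    assert(t != NULL && in != NULL);

    for (i = 0; i < t->count; i++) {
        if (t->inputs[i] == in) {
            return 0; /* already tracked */
        }
    }
    if (t->count >= MAX_TRACKED_INPUTS) {
        return -1;
    }
    t->inputs[t->count++] = in;
    return 0;
}

/* Propagate the emitter's pause (1) or resume (0) to every tracked input. */
void tracker_set_paused(struct emitter_tracker *t, int paused)
{
    int i;

    assert(t != NULL);

    for (i = 0; i < t->count; i++) {
        struct input_sketch *in = t->inputs[i];

        if (in->paused != paused) {
            in->paused = paused;
            /* Log both transitions, including the resume message. */
            printf("input %s %s\n", in->name, paused ? "paused" : "resumed");
        }
    }
}
```

Registering an input at the moment it first hands a record to the emitter means the emitter never needs to know its senders in advance; it only pauses those that have actually produced in-flight data.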
…line This commit will add a test for the pause functionality of in_emitter. The test uses a small emitter buffer size, so in_emitter will definitely be paused. Signed-off-by: Richard Treu <[email protected]>
4ac237c to 3162d0c
@edsiper Done! I've split the commits according to components.
This PR covers two issues:
This PR corrects the behavior when mem_buf_limit is reached. The plugin used to accept further records even after being paused, which leads to the issue "Potential log loss during high load at Multiline & Rewrite Tag Filter (in_emitter)" (#8198).
The current in_emitter implementation results in only one record per chunk being created, which is suboptimal. This PR fixes the collector handling. Please refer to line:
fluent-bit/plugins/in_emitter/emitter.c, line 165 in 9652b0d
To validate this observation, please use:
Check the output:
Full logs here:
I'm uncertain whether this modification is the correct approach to resolve this issue, as it appears that the ring buffer is unused. This PR partially reverts some parts to an older version of the code.
However, implementing this does address the behavior of only one record being generated per chunk.
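To make the "one record per chunk" problem concrete, here is a small sketch of one way several records can end up in the same chunk: records are accumulated per tag instead of starting a new chunk for each record. Whether this matches the older code path the PR reverts to is an assumption; `chunk_sketch`, `chunk_list` and `chunk_list_add` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <string.h>

#define MAX_CHUNKS 16

/* Hypothetical pending chunk: a tag plus the number of records it holds. */
struct chunk_sketch {
    char tag[32];
    int  records;
};

struct chunk_list {
    struct chunk_sketch chunks[MAX_CHUNKS];
    int count;
};

/*
 * Add one record for the given tag. A pending chunk with the same tag is
 * reused; a new chunk is created only when no such chunk exists.
 * Returns the chunk that received the record, or NULL if the list is full.
 */
struct chunk_sketch *chunk_list_add(struct chunk_list *l, const char *tag)
{
    int i;
    struct chunk_sketch *c;

    assert(l != NULL && tag != NULL);

    for (i = 0; i < l->count; i++) {
        if (strcmp(l->chunks[i].tag, tag) == 0) {
            l->chunks[i].records++;
            return &l->chunks[i];
        }
    }

    if (l->count >= MAX_CHUNKS) {
        return NULL;
    }

    c = &l->chunks[l->count++];
    strncpy(c->tag, tag, sizeof(c->tag) - 1);
    c->tag[sizeof(c->tag) - 1] = '\0';
    c->records = 1;
    return c;
}
```

With this kind of accumulation, three records tagged the same way produce one chunk holding three records, whereas the behavior reported in this PR would correspond to three chunks holding one record each.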
Please also see the valgrind output:
https://gist.github.com/drbugfinder-work/83352c59799659db9741f33a22083eaa
Other comments, not directly related to this PR:
Additionally, the entire file appears to require restructuring, as some comments are inaccurate, and DEFAULT_EMITTER_RING_BUFFER_FLUSH_FREQUENCY, which I think should represent a time duration, is instead used as a 'buffer size'.
fluent-bit/plugins/in_emitter/emitter.c, line 245 in 9652b0d
The scheduler seems to be empty here, so the ring buffer will never be started (confirmed by another debug output):
fluent-bit/plugins/in_emitter/emitter.c, lines 241 to 259 in 9652b0d
Enter [N/A] in the box if an item is not applicable to your change.
Testing
Before we can approve your change, please submit the following in a comment:
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-test label to test for all targets (requires maintainer to do).
Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.