[POC, Do not merge] input_chunk: split incoming buffer when it's too big #9385
Open
braydonk wants to merge 8 commits into fluent:master from braydonk:chunk_size_poc
+169
−23
braydonk requested review from edsiper, leonardo-albertovich, fujimotos and koleini as code owners September 13, 2024 19:44
Add a configuration value for the storage chunk max size. Signed-off-by: braydonk <[email protected]>
This PR is a proof of concept for mitigating the issue where a chunk can be too large when received from an input plugin.
Bug Explanation
When a large set of data is read at one time, all of the records are appended to whichever chunk is most recently active, and they are all written at once. The check for the chunk size only happens after the data has been written to the chunk. So despite the chunk size being "limited" to `2M`, there is no guarantee that a chunk won't exceed that size. In this case, we could easily have a chunk that is right up close to the `2M` limit and then have loads of data written to it, leading to an excessively large chunk that, once encoded, can exceed the write limits of output plugin APIs.
Proposed Solution
This solution is an attempt to mitigate the problem without immense restructuring. The strategy is to examine the size of the incoming buffer, and if it exceeds `FLB_CHUNK_FS_MAX_SIZE` (`2M`), the buffer is split into separate buffers that are each under the max size, and each one is appended to a chunk separately. This is paired with a check when retrieving a new input chunk: if appending the current buffer would exceed the chunk size limit, a new chunk is created instead.
This solution is not perfect, but it was the best way I could find within my power (i.e. I don't consider major restructures to this code or `chunkio` to be "within my power").
Issues: #9374, #1938
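To make the description above concrete, here is a minimal, language-neutral model of the two behaviours, written in Python rather than Fluent Bit's C. It is a sketch under stated assumptions, not the code in this PR: `CHUNK_MAX_SIZE` stands in for `FLB_CHUNK_FS_MAX_SIZE` (the exact value in the source may differ from `2 * 1024 * 1024`), chunks are modelled as plain byte counts, and the function names `append_then_check`, `split_records` and `append_with_precheck` are hypothetical. It also assumes that splitting happens on record boundaries, so a single record larger than the limit still lands in one oversized batch.

```python
from typing import List

# Assumption: stand-in for the "2M" limit; the real constant may differ.
CHUNK_MAX_SIZE = 2 * 1024 * 1024


def append_then_check(chunk_size: int, incoming_size: int) -> int:
    """Model of the bug: all data is written first, the limit is checked after."""
    return chunk_size + incoming_size


def split_records(records: List[bytes],
                  max_size: int = CHUNK_MAX_SIZE) -> List[List[bytes]]:
    """Group records into batches whose total size stays at or under max_size.

    A record that is itself larger than max_size still gets a batch of its
    own, because a record cannot be cut in half.
    """
    batches: List[List[bytes]] = []
    current: List[bytes] = []
    current_size = 0
    for record in records:
        # Close the current batch before it would grow past the limit.
        if current and current_size + len(record) > max_size:
            batches.append(current)
            current, current_size = [], 0
        current.append(record)
        current_size += len(record)
    if current:
        batches.append(current)
    return batches


def append_with_precheck(chunks: List[int], batch_size: int,
                         max_size: int = CHUNK_MAX_SIZE) -> None:
    """Append to the active (last) chunk, or open a new chunk first if the
    batch would push the active chunk past the limit."""
    if not chunks or chunks[-1] + batch_size > max_size:
        chunks.append(0)
    chunks[-1] += batch_size


# One large read of 3000 records of 1 KiB each, arriving while the active
# chunk is already 1 KiB short of the limit.
records = [b"x" * 1024] * 3000
active = CHUNK_MAX_SIZE - 1024

# Old behaviour: a single chunk ends up well over the limit.
print("append then check:", append_then_check(active, sum(map(len, records))))

# Mitigation: split the read, and check the limit before each append.
chunks = [active]
for batch in split_records(records):
    append_with_precheck(chunks, sum(len(r) for r in batch))
print("split with pre-check:", chunks)
```

In this model the pre-check alone would not be enough: without the split, the whole read would still be appended to one freshly created chunk and exceed the limit, which is why the two measures are paired.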
Enter [N/A] in the box, if an item is not applicable to your change.
Testing
Before we can approve your change, please submit the following in a comment:
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
`ok-package-test` label to test for all targets (requires maintainer to do).
Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.