Split large batches of documents if received 413 from Elasticsearch #29778
Comments
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
ping @nimarezainia for prioritization and awareness
@rdner this sounds like the right thing to do and makes our product more robust. What would be the level of effort involved in getting this done? Ideally the buffers would match dynamically so we wouldn't hit these issues, but I know that is near impossible. Could you please ensure that each of these actions is logged, in particular:
- when the batch is dropped, please state it in the log for info;
- when the batch is being cut to size.
@nimarezainia Regarding the estimation of effort: I'm quite new to the project, so it's hard for me to give a precise estimate. I would ask @faec for help here since we already touched on this topic once. @cmacknz it might be worth considering introducing this kind of behaviour into the shipper design too.
@cmacknz Should I "close" this one and focus on the shipper, as you have already included it in the V2 implementation?
Let's keep the issue as it is a good description of the work to do. We could remove the release target and labels, though. This will happen as part of the shipper work at some to-be-determined point in the future.
Is this not going to be fixed in the current Beats implementation?
@Foxboron Even if we are talking about fixing it in the shipper here, that doesn't mean it will not be fixed in standalone Beats.
The current plan is to address this in Beats so the fix is available sooner, and then port it into the shipper afterwards so we aren't tied to the date when the shipper is ready to be released.
@nimarezainia why is it near impossible? Is it because Agent/Beats can send to multiple ES clusters with different size limits? I agree with the conclusion, I just want to make sure I understand all the reasons for it.
I believe I was told that the ES buffer size is not known to us; this may have changed. If there were an API for us to read it, perhaps our output could be set to match, minimizing drops. Perhaps things have changed since that comment.
Describe the enhancement:
Currently, after seeing a 413 response from Elasticsearch, the whole batch is dropped and the error is logged (#29368). Some of our customers would like to preserve at least some data from the batch instead of discarding it entirely.
The proposal is: when a 413 is received, instead of dropping the whole batch, split it into smaller batches that fit under the http.max_content_length threshold in Elasticsearch and retry them. Something similar was done in this PR: logstash-plugins/logstash-output-elasticsearch#497
Please ensure that each of these actions is logged, in particular:
- when the batch is dropped, please state it in the log for info;
- when the batch is being cut to size, log the new size relative to bulk_max_size.
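A minimal sketch of how such split-and-retry logic could work, assuming a hypothetical sendBatch function standing in for the bulk request to Elasticsearch (this is an illustration of the idea, not the actual Beats output code):

```go
package main

import (
	"errors"
	"fmt"
)

// errTooLarge stands in for a 413 Request Entity Too Large response from
// Elasticsearch. In a real output this would be derived from the HTTP status.
var errTooLarge = errors.New("413: request entity too large")

// sendBatch is a placeholder for the bulk request to Elasticsearch.
// Here any batch larger than 2 events pretends to exceed http.max_content_length.
func sendBatch(events []string) error {
	if len(events) > 2 {
		return errTooLarge
	}
	fmt.Printf("indexed %d event(s)\n", len(events))
	return nil
}

// publishWithSplit retries a rejected batch by splitting it in half until it
// either fits under the limit or a single event is still too large, in which
// case only that one event is dropped (and the drop is logged).
func publishWithSplit(events []string) error {
	if err := sendBatch(events); err == nil || !errors.Is(err, errTooLarge) {
		return err
	}
	if len(events) == 1 {
		fmt.Println("dropping single event that exceeds http.max_content_length")
		return nil
	}
	mid := len(events) / 2
	fmt.Printf("413 received, splitting batch of %d into %d + %d\n",
		len(events), mid, len(events)-mid)
	if err := publishWithSplit(events[:mid]); err != nil {
		return err
	}
	return publishWithSplit(events[mid:])
}

func main() {
	_ = publishWithSplit([]string{"e1", "e2", "e3", "e4", "e5"})
}
```

Splitting in half keeps the number of retries logarithmic in the batch size and isolates any single oversized event, so only that event is dropped rather than the whole batch.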
Describe a specific use case for the enhancement or feature:
Some of our clients are more sensitive to data loss than others, and this enhancement would allow us to preserve more data in case of a misconfiguration of http.max_content_length in Elasticsearch or bulk_max_size in Beats. This would improve the situation in most cases, but it would not completely solve the data loss problem.