Ergonomically set the search.max_buckets setting #91776

Open
jpountz opened this issue Nov 21, 2022 · 4 comments
Labels
:Analytics/Aggregations, >enhancement, Team:Analytics

Comments

@jpountz
Contributor

jpountz commented Nov 21, 2022

Description

search.max_buckets is similar to the maximum number of clauses in queries in that it is a proxy for resource usage, which is much easier to count than actual resource usage. Having a limit on the number of buckets produced helps ensure that aggregations do not exceed resource limits, such as heap.
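
To illustrate why a bucket count is an easy proxy to enforce, here is a minimal sketch of a counter that fails a request once it has produced more buckets than a configured limit. The class and method names are hypothetical and chosen for illustration; this is not Elasticsearch's actual implementation.

```java
/** Hypothetical illustration of a bucket-count limit; not Elasticsearch's actual classes. */
final class BucketLimitSketch {
    private final int maxBuckets; // stands in for the value of search.max_buckets
    private long count;           // buckets produced so far by this request

    BucketLimitSketch(int maxBuckets) {
        this.maxBuckets = maxBuckets;
    }

    /** Records newly created buckets and fails once the total exceeds the limit. */
    void accept(int newBuckets) {
        count += newBuckets;
        if (count > maxBuckets) {
            throw new IllegalStateException(
                "Trying to create too many buckets: " + count + " > " + maxBuckets);
        }
    }

    long count() {
        return count;
    }
}
```

Counting is cheap and deterministic, but it ignores how large each bucket actually is, which is why the limit is only a proxy for heap usage.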

But not all nodes have the same resources, and the fixed limit on search.max_buckets doesn't reflect this. In #46433 (comment) we decided to move from a fixed limit on the number of clauses in queries to a value that is set ergonomically based on resources. Should we do the same for the similar search.max_buckets setting?

Questions:

  • What should be the inputs for the maximum number of buckets?
  • Should we also stop making it configurable, as we did for the maximum number of clauses in queries?

elasticsearchmachine added the Team:Analytics label Nov 21, 2022
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@jpountz
Contributor Author

jpountz commented Nov 21, 2022

What should be the inputs for the maximum number of buckets?

My initial reaction to this question is that we'll want the maximum heap size to be an input to this: the more heap you have, the more buckets you can have. The search coordination queue size could be another input: the more concurrent requests, the fewer buckets each request can have?
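
As a rough sketch of how those two inputs could combine, the snippet below gives all in-flight requests a share of the heap, splits that share across the queue size, and converts the result to a bucket count. Every constant here (the heap share, the per-bucket cost, the floor) is an assumption made up for illustration, as are the class and method names; none of them come from Elasticsearch.

```java
/** Hypothetical sketch of an ergonomic bucket limit; all constants are illustrative assumptions. */
final class ErgonomicMaxBuckets {
    // Assumption: aggregations across all queued requests may use 1/10 of the heap.
    static final long HEAP_DIVISOR = 10;
    // Assumption: rough average memory cost of one bucket, in bytes.
    static final long BYTES_PER_BUCKET = 1024;
    // Assumption: never go below this many buckets, even on tiny heaps.
    static final int MIN_BUCKETS = 1_000;

    /** More heap raises the limit; a larger queue (more concurrent requests) lowers it. */
    static int compute(long maxHeapBytes, int queueSize) {
        if (maxHeapBytes <= 0 || queueSize <= 0) {
            throw new IllegalArgumentException("heap and queue size must be positive");
        }
        long perRequestBytes = maxHeapBytes / HEAP_DIVISOR / queueSize;
        long buckets = perRequestBytes / BYTES_PER_BUCKET;
        return (int) Math.max(MIN_BUCKETS, Math.min(Integer.MAX_VALUE, buckets));
    }

    public static void main(String[] args) {
        // Runtime.maxMemory() reports the maximum heap the JVM will attempt to use.
        long heap = Runtime.getRuntime().maxMemory();
        System.out.println("limit for queue size 1000: " + compute(heap, 1000));
    }
}
```

With these made-up constants, doubling the heap doubles the limit and doubling the queue size halves it, which matches the intuition above; the right constants would need to be measured.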

@martijnvg
Member

We discussed this during our weekly sync. For now we'd like to keep search.max_buckets as is until we have a circuit breaker at reduce time. After that we will revisit what to do with the search.max_buckets limit.
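
One way to picture a circuit breaker at reduce time is a memory accountant that is charged as per-shard results are merged and that aborts the reduction once an estimate exceeds its budget. The sketch below is only that picture: the class, the `reserve` method, and the fixed per-bucket estimate are hypothetical and do not reflect Elasticsearch's circuit breaker API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of memory accounting while reducing shard results. */
final class ReduceBreakerSketch {
    private final long limitBytes;
    private long usedBytes;

    ReduceBreakerSketch(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    /** Charges an estimate against the budget, failing before the limit is exceeded. */
    void reserve(long bytes) {
        if (usedBytes + bytes > limitBytes) {
            throw new IllegalStateException(
                "reduce would use " + (usedBytes + bytes) + " bytes, limit is " + limitBytes);
        }
        usedBytes += bytes;
    }

    long usedBytes() {
        return usedBytes;
    }

    /** Merges per-shard (bucket key -> doc count) maps, charging the breaker for each new bucket. */
    static Map<String, Long> reduce(List<Map<String, Long>> shardResults,
                                    ReduceBreakerSketch breaker,
                                    long bytesPerBucket) {
        Map<String, Long> merged = new HashMap<>();
        for (Map<String, Long> shard : shardResults) {
            for (Map.Entry<String, Long> bucket : shard.entrySet()) {
                if (!merged.containsKey(bucket.getKey())) {
                    // Assumption: every bucket costs the same fixed estimate.
                    breaker.reserve(bytesPerBucket);
                }
                merged.merge(bucket.getKey(), bucket.getValue(), Long::sum);
            }
        }
        return merged;
    }
}
```

Unlike a plain bucket count, this kind of accounting could in principle use a per-bucket estimate that varies with the aggregation, which is what would make a fixed search.max_buckets less necessary.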

@martijnvg
Member

We're still relying on search.max_buckets in order to avoid memory pressure. Previously we had plans to redesign the internal aggregation responses (binary based), which would then allow us to account for memory. It is unlikely that this will happen. So I think we should look into ergonomically setting the search.max_buckets cluster setting.

3 participants