IndexSearcher max clause count #46433
Comments
Pinging @elastic/es-search
@jimczi @rjernst While the max-clause threshold is known and set in configuration, users may also want to know how many clauses their queries are actually hitting, so they can tune or increase the max-clause setting appropriately. Could this information (the clause count) be added to the max-clause-exceeded error message?
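The request above could look roughly like the following. This is only a sketch: `ClauseLimitMessageSketch`, `checkClauseCount`, and the message wording are hypothetical names chosen for illustration, not Elasticsearch or Lucene code.

```java
// Sketch only: a hypothetical limit check whose error message reports the observed
// clause count next to the configured maximum, so users can see how far over they are.
final class ClauseLimitMessageSketch {

    // Throws if `observed` exceeds `max`; both numbers appear in the message.
    static void checkClauseCount(int observed, int max) {
        if (observed > max) {
            throw new IllegalArgumentException(
                "Query contains " + observed + " clauses, which exceeds the configured maximum of " + max);
        }
    }

    public static void main(String[] args) {
        checkClauseCount(1024, 1024); // at the limit: accepted
        try {
            checkClauseCount(1500, 1024);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Reporting the full count assumes the check finishes counting before it throws. A check that aborts as soon as the limit is crossed could only report "at least" that many clauses.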
@colings86 @qhoxie and I had a discussion about this problem and we'd like to propose the following plan:
Finding the right formula to determine the maximum number of clauses ergonomically is going to require some testing. I did some quick math for the worst-case scenario, which would give something like this in terms of overhead per boolean clause:
If you have 1GB of heap and 8 processors, this gives a limit of ~5k clauses. If you have 30GB of heap and 48 processors, this gives a limit of ~27k clauses. (I'm assuming the default search threadpool size of There are other things that can consume heap in the JVM, so we couldn't use the entire heap only for query clauses. One assumption is that other memory consumers would hopefully check the real-memory circuit breaker so that they would figure out that they shouldn't allocate more memory because it's already used by query clauses. And maybe we could also find a way to check the real-memory circuit breaker while Lucene is building the tree of scorers to further improve protection against OOMEs.
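As a rough illustration of the arithmetic in the comment above, here is a sketch that derives a clause limit from heap size and search thread count. The constants are assumptions, since the per-clause overhead and the thread pool size are cut off in the text: a budget of 64 clauses per MB of heap divided across search threads, a search thread pool of `(processors * 3) / 2 + 1`, and a floor of 1024. Under those assumptions the two quoted figures come out as expected, but this should not be read as the formula the author actually used.

```java
// Sketch only: the constants below are assumptions, not values stated in the thread.
final class ClauseBudgetSketch {

    // Assumed floor: the existing default limit of 1024 clauses.
    static final int DEFAULT_MAX_CLAUSES = 1024;

    // Assumed budget: 64 clauses per MB of heap, divided across search threads.
    static final long CLAUSES_PER_HEAP_MB = 64;

    // Assumed search thread pool sizing: (processors * 3) / 2 + 1.
    static int searchThreads(int processors) {
        return (processors * 3) / 2 + 1;
    }

    // Returns the clause limit for a given heap size (in MB) and search thread count.
    static int maxClauseCount(long heapMb, int searchThreads) {
        if (heapMb <= 0 || searchThreads <= 0) {
            return DEFAULT_MAX_CLAUSES;
        }
        long budget = heapMb * CLAUSES_PER_HEAP_MB / searchThreads;
        // Never go below the default, and never overflow an int.
        return (int) Math.max(DEFAULT_MAX_CLAUSES, Math.min(budget, Integer.MAX_VALUE));
    }

    public static void main(String[] args) {
        // 1 GB heap, 8 processors -> 13 threads -> 5041 (~5k)
        System.out.println(maxClauseCount(1024, searchThreads(8)));
        // 30 GB heap, 48 processors -> 73 threads -> 26932 (~27k)
        System.out.println(maxClauseCount(30 * 1024, searchThreads(48)));
    }
}
```

Dividing by the thread count reflects the worst case in which every search thread is building a maximally large query at the same time.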
elastic#81525) This commit deprecates the indices.query.bool.max_clause_count node setting, and instead configures the maximum clause count for lucene based on the available heap and the size of the thread pool. Closes elastic#46433
In Lucene 9.0, index searchers will start to check the overall number of clauses in a Query and will throw an error if it is above a certain threshold. The default value is the same as the max boolean clause count (1024) but is configurable per index searcher. This is a breaking change, so only Elasticsearch 8.0 could be affected, but I am opening this issue to raise awareness and discuss the best options for introducing this change in the next major version.
Today we have a setting in Elasticsearch that controls the maximum number of clauses that a single boolean query can handle (indices.query.bool.max_clause_count), and we use it in different layers as a hard limit (filter aggregations, fuzzy query expansion, ...). Since the check on the number of clauses is now global to a query, one first actionable item would be to change the name of the setting to reflect the new behavior: instead of indices.query.bool.max_clause_count, it would be indices.query.max_clause_count.
Another point for discussion is the handling of stored queries (percolator, ...) in Elasticsearch that break the new limit: should we accept them if they were created in a version where the limit was per boolean query?
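To make the difference between a per-boolean-query limit and a global limit concrete, here is a small sketch using a hypothetical query tree (`Node` is not a Lucene class). It assumes the global check counts leaf clauses across the whole tree, which is a simplification of whatever Lucene actually counts.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: a hypothetical query tree used to contrast the two kinds of limit.
final class GlobalClauseCountSketch {

    // A node with children stands for a boolean query; a node without children is a leaf clause.
    static final class Node {
        final List<Node> children = new ArrayList<>();

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Per-boolean view: the largest number of direct clauses under any single boolean node.
    static int maxClausesInOneBoolean(Node node) {
        int max = node.children.size();
        for (Node child : node.children) {
            max = Math.max(max, maxClausesInOneBoolean(child));
        }
        return max;
    }

    // Global view: the total number of leaf clauses in the whole tree.
    static int totalLeafClauses(Node node) {
        if (node.children.isEmpty()) {
            return 1;
        }
        int total = 0;
        for (Node child : node.children) {
            total += totalLeafClauses(child);
        }
        return total;
    }

    // Builds a boolean node holding `leaves` leaf clauses.
    static Node booleanOf(int leaves) {
        Node bool = new Node();
        for (int i = 0; i < leaves; i++) {
            bool.add(new Node());
        }
        return bool;
    }

    public static void main(String[] args) {
        // Three nested boolean queries of 500 clauses each under one root.
        Node root = new Node().add(booleanOf(500)).add(booleanOf(500)).add(booleanOf(500));
        int limit = 1024;
        System.out.println("per-boolean max = " + maxClausesInOneBoolean(root)
            + ", within limit: " + (maxClausesInOneBoolean(root) <= limit));
        System.out.println("global total = " + totalLeafClauses(root)
            + ", within limit: " + (totalLeafClauses(root) <= limit));
    }
}
```

A query shaped like this one passes a per-boolean limit of 1024 but fails a global limit of the same value, which is the kind of stored query the last question in the issue is about.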