
Configure IndexSearcher.maxClauseCount() based on Node characteristics #81525

Merged: 15 commits, Dec 17, 2021

Conversation

romseygeek
Contributor

This commit deprecates the indices.query.bool.max_clause_count node setting,
and instead configures the maximum clause count for Lucene based on the available
heap and the size of the thread pool.

Closes #46433

@romseygeek romseygeek added :Search/Search Search-related issues that do not fall into other categories >deprecation v8.0.0 v8.1.0 labels Dec 8, 2021
@romseygeek romseygeek self-assigned this Dec 8, 2021
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Dec 8, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@romseygeek
Contributor Author

I'm not totally happy with the testing on this yet, and it will need some documentation as well, but it's good enough for a preliminary review I think.

public static void configureMaxClauses(ThreadPool threadPool) {
    int searchThreadPoolSize = threadPool.info(ThreadPool.Names.SEARCH).getMax();
    long heapSize = JvmStats.jvmStats().getMem().getHeapMax().getGb();
    configureMaxClauses(searchThreadPoolSize, heapSize);
}
Contributor

What happens if you start Elasticsearch with 512MB of heap, does it mean you get zero max clauses?

Contributor Author

If heapSize is zero then we stick with the default value from lucene, currently 1024. But maybe we should stick with the current ES default of 4096 instead?

Contributor Author

Have updated this so that we will always return a minimum of 4096, or larger if the heap/cpu size permits.
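A minimal sketch of the clamping behaviour described here, built from the `calculateMaxClauseValue` snippets quoted elsewhere in this thread (the method name and the `65_536` scaling factor are taken from the review hunks; treat the exact constants as assumptions, not the final implementation):

```java
// Sketch only: mirrors the calculateMaxClauseValue snippets quoted in this
// review. Falls back to the 4096 floor when sizing information is missing.
public final class SearchUtilsSketch {
    static final int MIN_MAX_CLAUSE_COUNT = 4096; // previous ES default, now a floor

    static int calculateMaxClauseValue(int threadPoolSize, long heapInGb) {
        if (threadPoolSize <= 0 || heapInGb <= 0) {
            return MIN_MAX_CLAUSE_COUNT; // unknown sizing: keep the minimum
        }
        int computed = Math.toIntExact(heapInGb * 65_536 / threadPoolSize);
        return Math.max(MIN_MAX_CLAUSE_COUNT, computed);
    }
}
```

With 4 search threads and a 1GB heap this yields 16384; with a heap that rounds down to 0GB it stays at the 4096 floor, matching the behaviour discussed above.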

Contributor

Would this work if we used the heap size in GB as a fractional number instead?

return; // If we don't know how to size things, keep the lucene default
}

int maxClauseCount = Math.toIntExact(heapInGb * 65_536 / threadPoolSize);
Contributor

Let's add some explanation as to where this 65,536 number comes from?
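For scale, here are worked values of the formula above for a few hypothetical node shapes. The rationale for the 65,536 constant is exactly what this comment asks to be documented, so it is treated as an opaque scaling factor here:

```java
// Illustrative arithmetic only: the formula comes from the hunk quoted
// above, applied to a few made-up node shapes.
public class MaxClauseArithmetic {
    static long maxClauses(long heapInGb, int threadPoolSize) {
        return heapInGb * 65_536 / threadPoolSize;
    }

    public static void main(String[] args) {
        System.out.println(maxClauses(1, 4));   // small node  -> 16384
        System.out.println(maxClauses(8, 16));  // mid node    -> 32768
        System.out.println(maxClauses(31, 32)); // large node  -> 63488
    }
}
```

Roughly: the limit grows with heap size and shrinks as the search thread pool (i.e. the number of concurrent searches) grows.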

"Number of filters is too large, must be less than or equal to: [" + maxFilters + "] but was [" + maxFiltersPlusOne + "]. "
    + "This limit can be set by changing the ["
    + SearchModule.INDICES_MAX_CLAUSE_COUNT_SETTING.getKey()
    + "] setting."
Contributor

Let's add guidance as to how to address this error? E.g. can we check if the user overrode the size of the search thread pool and suggest undoing the change? Or otherwise recommend scaling up, or avoiding running queries with such large numbers of clauses?

@@ -268,7 +267,8 @@
     4096,
     1,
     Integer.MAX_VALUE,
-    Setting.Property.NodeScope
+    Setting.Property.NodeScope,
+    Setting.Property.DeprecatedWarning
Contributor

When would we raise a deprecation warning for node settings, only on startup?

Contributor Author

Have confirmed with @pgomulka that this will only be emitted on startup, yes.

@romseygeek
Contributor Author

I've reworked things a bit: the utils class now only calculates the max clauses value, which is set directly in the node startup code, and we use the current default (4096) as a minimum so that we won't inadvertently limit users' query sizes when they upgrade.

@romseygeek
Contributor Author

Interesting, this causes QueryStringIT#testLimitOnExpandedFieldsButIgnoreUnmappedFields to fail on CI as we run into a too many fields exception, but this doesn't reproduce locally because my machine has more memory than the CI worker. I'll rework the test to be a unit test for the query string query builder, rather than an integration test.

@romseygeek
Contributor Author

@elasticmachine run elasticsearch-ci/part-2

@romseygeek romseygeek requested a review from jpountz December 14, 2021 12:15
of clauses of a single `bool` query. It now applies to the total number of
clauses of the rewritten query. To reduce chances of breaks, its
default value has been bumped from 1024 to 4096.
Elasticsearch will now dynamically set the maximum number of allowed clauses
Contributor
@jpountz Dec 14, 2021

Maybe we need a word about the fact that Elasticsearch now checks the number of clauses across the entire query rather than on a per-booleanquery basis? So that anyone who has used the hack that consists of splitting boolean queries knows that they can undo it.

Elasticsearch will now dynamically set the maximum number of allowed clauses
in a query, using a heuristic based on the size of the search thread pool and
the size of the heap allocated to the JVM. This limit has a minimum value of
4096 (the previous default) and will in most cases be larger (for example,
Contributor

1024 was the previous value, not 4096?

you might need to increase it further.
Queries with many clauses should be avoided whenever possible.
If you previously bumped this setting to accommodate heavy queries,
you might need to increase the amount of memory available to elasticsearch,
Contributor

Suggested change
you might need to increase the amount of memory available to elasticsearch,
you might need to increase the amount of memory available to Elasticsearch,

public static void configureMaxClauses(ThreadPool threadPool) {
    int searchThreadPoolSize = threadPool.info(ThreadPool.Names.SEARCH).getMax();
    long heapSize = JvmStats.jvmStats().getMem().getHeapMax().getGb();
    configureMaxClauses(searchThreadPoolSize, heapSize);
}
Contributor

Would this work if we used the heap size in GB as a fractional number instead?

default value has been bumped from 1024 to 4096.
Elasticsearch will now dynamically set the maximum number of allowed clauses
in a query, using a heuristic based on the size of the search thread pool and
the size of the heap allocated to the JVM. This limit has a minimum value of
Contributor

I'm tempted to clarify that this number increases with heap size and decreases with the size of the threadpool, wdyt?

assertEquals(4096, SearchUtils.calculateMaxClauseValue(4, 0));

// Number of processors not available
assertEquals(4096, SearchUtils.calculateMaxClauseValue(-1, 1));
Contributor

is it actually something that can happen? While the number of cores can be unknown, we are actually using the threadpool size, which should always be defined?

Contributor Author

I'm fairly sure that this will always be set, but I put it in as a safety net. Maybe it should be replaced with an assertion that the thread pool size is >= 1?

@romseygeek
Contributor Author

Would this work if we used the heap size in GB as a fractional number instead?

I'm not sure what you mean here?

@jpountz
Contributor

jpountz commented Dec 14, 2021

Would this work if we used the heap size in GB as a fractional number instead?

I'm not sure what you mean here?

We currently get the Java heap in GB as a long. A side effect of this is that the formula always returns 0 as the maximum number of clauses if the heap is less than 1GB, since the amount of memory allocated to the JVM is rounded down. So I wonder if we should use a fractional representation of the heap size instead, e.g. 0.5 if the JVM is given 512MB of heap? This way we might also not need to resort to taking the max of the return value and 4096; it should scale better to small heaps.
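A sketch of the fractional-heap suggestion, assuming the heap size is available in bytes (a hypothetical helper, not the PR's code): computing GB as a double lets a 512MB heap contribute 0.5 instead of rounding down to 0.

```java
// Sketch of the fractional-heap idea: heap in GB as a double, so sub-GB
// heaps still produce a proportional clause limit instead of zero.
public class FractionalHeapSketch {
    static int maxClauses(long heapBytes, int threadPoolSize) {
        double heapInGb = heapBytes / (double) (1L << 30);
        int computed = (int) (heapInGb * 65_536 / threadPoolSize);
        return Math.max(4096, computed); // floor retained as a safety net
    }
}
```

A 512MB heap with 4 search threads then yields 8192 rather than falling through to a zero computation.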

or to reduce the size of your search thread pool so that more memory is
available to each concurrent search.

In previous versions of lucene you could get around this limit by nesting
Contributor

Suggested change
In previous versions of lucene you could get around this limit by nesting
In previous versions of Lucene you could get around this limit by nesting
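The nesting workaround mentioned above can be illustrated with a toy clause counter (hypothetical types, not Lucene classes): a per-BooleanQuery check sees at most three clauses at any single level, while a count across the whole rewritten tree, which is how the limit now applies, sees all six leaves.

```java
import java.util.List;

// Toy model of clause counting across a rewritten query tree. Per-level
// checks can be evaded by nesting boolean queries; a total count cannot.
public class ClauseCountSketch {
    sealed interface Query permits Leaf, Bool {}
    record Leaf() implements Query {}
    record Bool(List<Query> clauses) implements Query {}

    // Counts leaf clauses across the entire tree, like the new behaviour.
    static int totalClauses(Query q) {
        if (q instanceof Bool b) {
            return b.clauses().stream().mapToInt(ClauseCountSketch::totalClauses).sum();
        }
        return 1;
    }
}
```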

@@ -280,7 +279,7 @@ public void testTooLargeMatrix() throws Exception {

// Create more filters than is permitted by Lucene Bool clause settings.
MapBuilder filtersMap = new MapBuilder();
-    int maxFilters = SearchModule.INDICES_MAX_CLAUSE_COUNT_SETTING.get(Settings.EMPTY);
+    int maxFilters = IndexSearcher.getMaxClauseCount();
Contributor

This number might be super high now, how slow is the test? Maybe we need to set an artificially low limit for this test to keep it reasonable?

-    + SearchModule.INDICES_MAX_CLAUSE_COUNT_SETTING.getKey()
-    + "] setting."
+    + "]. "
+    + "You can increase this limit by scaling up your java heap or number of CPUs"
Contributor

scaling up the number of CPUs would increase the size of the search threadpool and reduce the number of allowed clauses, so maybe just mention memory?

@elasticsearchmachine
Collaborator

💚 Backport successful

Status Branch Result
8.0

romseygeek added a commit to romseygeek/elasticsearch that referenced this pull request Dec 17, 2021
elastic#81525)

This commit deprecates the indices.query.bool.max_clause_count node setting,
and instead configures the maximum clause count for lucene based on the available
heap and the size of the thread pool.

Closes elastic#46433
elasticsearchmachine pushed a commit that referenced this pull request Dec 17, 2021
#81525) (#81850)

This commit deprecates the indices.query.bool.max_clause_count node setting,
and instead configures the maximum clause count for lucene based on the available
heap and the size of the thread pool.

Closes #46433
Labels
>deprecation :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team v8.0.0-rc2 v8.1.0
Successfully merging this pull request may close these issues.

IndexSearcher max clause count
5 participants