Configure IndexSearcher.maxClauseCount() based on Node characteristics #81525

Merged · 15 commits · Dec 17, 2021
@@ -16,7 +16,7 @@ The default for the `action.destructive_requires_name` setting changes from `fal
to `true` in {es} 8.0.0.

Previously, defaulting to `false` allowed users to use wildcard
patterns to delete, close, or change index blocks on indices.
To prevent the accidental deletion of indices that happen to match a
wildcard pattern, we now default to requiring that destructive
operations explicitly name the indices to be modified.
@@ -44,19 +44,28 @@ non-frozen node will result in an error on startup.
====

[[max_clause_count_change]]
.The `indices.query.bool.max_clause_count` setting now limits all query clauses.
.The `indices.query.bool.max_clause_count` setting has been deprecated, and no longer has any effect.
[%collapsible]
====
*Details* +
Previously, the `indices.query.bool.max_clause_count` would apply to the number
of clauses of a single `bool` query. It now applies to the total number of
clauses of the rewritten query. To reduce chances of breaks, its
default value has been bumped from 1024 to 4096.
Elasticsearch will now dynamically set the maximum number of allowed clauses
[Review comment — @jpountz (Contributor), Dec 14, 2021]
Maybe we need a word about the fact that Elasticsearch now checks the number of clauses across the entire query rather than on a per-BooleanQuery basis? So that anyone who has used the hack that consists of splitting boolean queries knows that they can undo it.

in a query, using a heuristic based on the size of the search thread pool and
the size of the heap allocated to the JVM. This limit has a minimum value of
[Review comment — Contributor]
I'm tempted to clarify that this number increases with heap size and decreases with the size of the threadpool, wdyt?

1024 and will in most cases be larger (for example, a node with 30GB RAM and
48 CPUs will have a maximum clause count of around 27,000). Larger heaps lead
to higher values, and larger thread pools result in lower values.
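As a rough sketch of the heuristic described above (illustrative only — the exact formula lives in the Elasticsearch source; the thread-pool sizing and the 16 KiB per-clause constant below are assumptions chosen so that the documented example, roughly 27,000 clauses for a 30GB/48-CPU node, works out when the quoted 30Gb is treated as the JVM heap):

```java
// Illustrative sketch only -- not the exact Elasticsearch formula.
public class MaxClauseHeuristic {

    // The search thread pool defaults to (allocatedProcessors * 3) / 2 + 1.
    static int searchThreadPoolSize(int allocatedProcessors) {
        return (allocatedProcessors * 3) / 2 + 1;
    }

    // Assume each concurrent search thread needs some fixed memory headroom
    // per clause (16 KiB here, an invented constant), floored at 1024.
    static int maxClauseCount(long heapBytes, int searchThreads) {
        return (int) Math.max(1024, heapBytes / searchThreads / (16 * 1024));
    }

    public static void main(String[] args) {
        long heap = 30L * 1024 * 1024 * 1024;     // 30 GiB heap
        int threads = searchThreadPoolSize(48);   // 48 CPUs -> 73 search threads
        System.out.println(maxClauseCount(heap, threads)); // roughly 27,000
    }
}
```

Under these assumptions a bigger heap raises the limit while a bigger search thread pool lowers it, matching the behaviour stated in the docs.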

*Impact* +
Queries with many clauses should be avoided whenever possible.
If you previously bumped this setting to accommodate heavy queries,
you might need to increase it further.
Queries with many clauses should be avoided whenever possible.
If you previously bumped this setting to accommodate heavy queries,
you might need to increase the amount of memory available to Elasticsearch,
or to reduce the size of your search thread pool so that more memory is
available to each concurrent search.

In previous versions of lucene you could get around this limit by nesting
[Review comment — Contributor]
Suggested change:
- In previous versions of lucene you could get around this limit by nesting
+ In previous versions of Lucene you could get around this limit by nesting

boolean queries within each other, but the limit is now based on the total
number of leaf queries within the query as a whole and this workaround will
no longer help.
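To make the behavioural difference concrete, here is a toy sketch (not Lucene code — the query types and counting helpers are invented for illustration) contrasting the old per-BooleanQuery check with the new whole-query leaf count:

```java
import java.util.List;

public class ClauseCounting {
    interface Query {}
    record Term(String field, String text) implements Query {}
    record Bool(List<Query> clauses) implements Query {}

    // Old-style check: the largest direct clause list of any single bool query.
    // Splitting clauses across nested bools kept each list under the limit.
    static int maxDirectClauses(Query q) {
        if (q instanceof Bool b) {
            int max = b.clauses().size();
            for (Query c : b.clauses()) {
                max = Math.max(max, maxDirectClauses(c));
            }
            return max;
        }
        return 0;
    }

    // New-style check: the total number of leaf queries in the whole tree,
    // which nesting cannot reduce.
    static int totalLeafClauses(Query q) {
        if (q instanceof Bool b) {
            int sum = 0;
            for (Query c : b.clauses()) {
                sum += totalLeafClauses(c);
            }
            return sum;
        }
        return 1;
    }

    static Bool boolWithTerms(int n) {
        return new Bool(java.util.stream.IntStream.range(0, n)
            .<Query>mapToObj(i -> new Term("tag", "tag" + i)).toList());
    }

    public static void main(String[] args) {
        // The old workaround: split 2000 terms into two nested 1000-clause bools.
        Query nested = new Bool(List.of(boolWithTerms(1000), boolWithTerms(1000)));
        System.out.println(maxDirectClauses(nested)); // 1000: passed a per-query limit of 1024
        System.out.println(totalLeafClauses(nested)); // 2000: trips a whole-query limit of 1024
    }
}
```

The nested query passes the per-query check (1000 < 1024) but fails the whole-query check (2000 > 1024), which is why the workaround no longer helps.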
====

[[ilm-poll-interval-limit]]
@@ -224,7 +233,7 @@ Remove the `http.content_type.required` setting from `elasticsearch.yml`. Specif
The `http.tcp_no_delay` setting was deprecated in 7.x and has been removed in 8.0. Use `http.tcp.no_delay` instead.

*Impact* +
Replace the `http.tcp_no_delay` setting with `http.tcp.no_delay`.
Specifying `http.tcp_no_delay` in `elasticsearch.yml` will
result in an error on startup.
====
@@ -237,7 +246,7 @@ The `network.tcp.connect_timeout` setting was deprecated in 7.x and has been rem
was a fallback setting for `transport.connect_timeout`.

*Impact* +
Remove the `network.tcp.connect_timeout` setting.
Use the `transport.connect_timeout` setting to change the default connection
timeout for client connections. Specifying
`network.tcp.connect_timeout` in `elasticsearch.yml` will result in an
@@ -282,7 +291,7 @@ since the 5.2 release of {es}.

*Impact* +
Remove the `xpack.security.authz.store.roles.index.cache.max_size`
and `xpack.security.authz.store.roles.index.cache.ttl` settings from `elasticsearch.yml`.
Specifying these settings will result in an error on startup.
====

@@ -8,14 +8,13 @@

package org.elasticsearch.search.aggregations.bucket;

import org.apache.lucene.search.IndexSearcher;
import org.elasticsearch.ElasticsearchException;
import org.elasticsearch.action.index.IndexRequestBuilder;
import org.elasticsearch.action.search.SearchPhaseExecutionException;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.search.SearchModule;
import org.elasticsearch.search.aggregations.InternalAggregation;
import org.elasticsearch.search.aggregations.bucket.adjacency.AdjacencyMatrix;
import org.elasticsearch.search.aggregations.bucket.adjacency.AdjacencyMatrix.Bucket;
@@ -280,7 +279,7 @@ public void testTooLargeMatrix() throws Exception {

// Create more filters than is permitted by Lucene Bool clause settings.
MapBuilder filtersMap = new MapBuilder();
int maxFilters = SearchModule.INDICES_MAX_CLAUSE_COUNT_SETTING.get(Settings.EMPTY);
int maxFilters = IndexSearcher.getMaxClauseCount();
[Review comment — Contributor]
This number might be super high now, how slow is the test? Maybe we need to set an artificially low limit for this test to keep it reasonable?

for (int i = 0; i <= maxFilters; i++) {
filtersMap.add("tag" + i, termQuery("tag", "tag" + i));
}
@@ -8,6 +8,7 @@

package org.elasticsearch.search.query;

import org.apache.lucene.search.IndexSearcher;
import org.elasticsearch.ExceptionsHelper;
import org.elasticsearch.action.index.IndexRequestBuilder;
import org.elasticsearch.action.search.SearchResponse;
@@ -17,12 +18,10 @@
import org.elasticsearch.index.query.QueryStringQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.SearchModule;
import org.elasticsearch.test.ESIntegTestCase;
import org.elasticsearch.xcontent.XContentBuilder;
import org.elasticsearch.xcontent.XContentType;
import org.junit.Before;
import org.junit.BeforeClass;

import java.io.IOException;
import java.util.ArrayList;
@@ -42,28 +41,13 @@

public class QueryStringIT extends ESIntegTestCase {

private static int CLUSTER_MAX_CLAUSE_COUNT;

@BeforeClass
public static void createRandomClusterSetting() {
CLUSTER_MAX_CLAUSE_COUNT = randomIntBetween(50, 100);
}

@Before
public void setup() throws Exception {
String indexBody = copyToStringFromClasspath("/org/elasticsearch/search/query/all-query-index.json");
prepareCreate("test").setSource(indexBody, XContentType.JSON).get();
ensureGreen("test");
}

@Override
protected Settings nodeSettings(int nodeOrdinal, Settings otherSettings) {
return Settings.builder()
.put(super.nodeSettings(nodeOrdinal, otherSettings))
.put(SearchModule.INDICES_MAX_CLAUSE_COUNT_SETTING.getKey(), CLUSTER_MAX_CLAUSE_COUNT)
.build();
}

public void testBasicAllQuery() throws Exception {
List<IndexRequestBuilder> reqs = new ArrayList<>();
reqs.add(client().prepareIndex("test").setId("1").setSource("f1", "foo bar baz"));
@@ -250,40 +234,18 @@ public void testAllFieldsWithSpecifiedLeniency() throws IOException {
assertThat(e.getCause().getMessage(), containsString("unit [D] not supported for date math [-2D]"));
}

// The only expectation for this test is to not throw exception
public void testLimitOnExpandedFieldsButIgnoreUnmappedFields() throws Exception {
XContentBuilder builder = jsonBuilder();
builder.startObject();
builder.startObject("_doc");
builder.startObject("properties");
for (int i = 0; i < CLUSTER_MAX_CLAUSE_COUNT; i++) {
builder.startObject("field" + i).field("type", "text").endObject();
}
builder.endObject(); // properties
builder.endObject(); // type1
builder.endObject();

assertAcked(prepareCreate("ignoreunmappedfields").setMapping(builder));

client().prepareIndex("ignoreunmappedfields").setId("1").setSource("field1", "foo bar baz").get();
refresh();
public void testLimitOnExpandedFields() throws Exception {

QueryStringQueryBuilder qb = queryStringQuery("bar");
if (randomBoolean()) {
qb.field("*").field("unmappedField1").field("unmappedField2").field("unmappedField3").field("unmappedField4");
}
client().prepareSearch("ignoreunmappedfields").setQuery(qb).get();
}
final int maxClauseCount = randomIntBetween(50, 100);

public void testLimitOnExpandedFields() throws Exception {
XContentBuilder builder = jsonBuilder();
builder.startObject();
{
builder.startObject("_doc");
{
builder.startObject("properties");
{
for (int i = 0; i < CLUSTER_MAX_CLAUSE_COUNT; i++) {
for (int i = 0; i < maxClauseCount; i++) {
builder.startObject("field_A" + i).field("type", "text").endObject();
builder.startObject("field_B" + i).field("type", "text").endObject();
}
@@ -296,25 +258,34 @@ public void testLimitOnExpandedFields() throws Exception {

assertAcked(
prepareCreate("testindex").setSettings(
Settings.builder().put(MapperService.INDEX_MAPPING_TOTAL_FIELDS_LIMIT_SETTING.getKey(), CLUSTER_MAX_CLAUSE_COUNT + 100)
Settings.builder().put(MapperService.INDEX_MAPPING_TOTAL_FIELDS_LIMIT_SETTING.getKey(), maxClauseCount + 100)
).setMapping(builder)
);

client().prepareIndex("testindex").setId("1").setSource("field_A0", "foo bar baz").get();
refresh();

// single field shouldn't trigger the limit
doAssertOneHitForQueryString("field_A0:foo");
// expanding to the limit should work
doAssertOneHitForQueryString("field_A\\*:foo");
int originalMaxClauses = IndexSearcher.getMaxClauseCount();
try {

IndexSearcher.setMaxClauseCount(maxClauseCount);

// single field shouldn't trigger the limit
doAssertOneHitForQueryString("field_A0:foo");
// expanding to the limit should work
doAssertOneHitForQueryString("field_A\\*:foo");

// adding a non-existing field on top shouldn't overshoot the limit
doAssertOneHitForQueryString("field_A\\*:foo unmapped:something");
// adding a non-existing field on top shouldn't overshoot the limit
doAssertOneHitForQueryString("field_A\\*:foo unmapped:something");

// the following should exceed the limit
doAssertLimitExceededException("foo", CLUSTER_MAX_CLAUSE_COUNT * 2, "*");
doAssertLimitExceededException("*:foo", CLUSTER_MAX_CLAUSE_COUNT * 2, "*");
doAssertLimitExceededException("field_\\*:foo", CLUSTER_MAX_CLAUSE_COUNT * 2, "field_*");
// the following should exceed the limit
doAssertLimitExceededException("foo", IndexSearcher.getMaxClauseCount() * 2, "*");
doAssertLimitExceededException("*:foo", IndexSearcher.getMaxClauseCount() * 2, "*");
doAssertLimitExceededException("field_\\*:foo", IndexSearcher.getMaxClauseCount() * 2, "field_*");

} finally {
IndexSearcher.setMaxClauseCount(originalMaxClauses);
}
}

private void doAssertOneHitForQueryString(String queryString) {
@@ -340,7 +311,7 @@ private void doAssertLimitExceededException(String queryString, int exceedingFie
"field expansion for ["
+ inputFieldPattern
+ "] matches too many fields, limit: "
+ CLUSTER_MAX_CLAUSE_COUNT
+ IndexSearcher.getMaxClauseCount()
+ ", got: "
+ exceedingFieldCount
)
@@ -11,6 +11,7 @@
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.search.IndexSearcher;
import org.elasticsearch.ExceptionsHelper;
import org.elasticsearch.action.admin.indices.create.CreateIndexRequestBuilder;
import org.elasticsearch.action.index.IndexRequestBuilder;
@@ -29,13 +30,11 @@
import org.elasticsearch.plugins.Plugin;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.SearchModule;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.test.ESIntegTestCase;
import org.elasticsearch.xcontent.XContentBuilder;
import org.elasticsearch.xcontent.XContentFactory;
import org.elasticsearch.xcontent.XContentType;
import org.junit.BeforeClass;

import java.io.IOException;
import java.util.ArrayList;
@@ -69,21 +68,6 @@
*/
public class SimpleQueryStringIT extends ESIntegTestCase {

private static int CLUSTER_MAX_CLAUSE_COUNT;

@BeforeClass
public static void createRandomClusterSetting() {
CLUSTER_MAX_CLAUSE_COUNT = randomIntBetween(60, 100);
}

@Override
protected Settings nodeSettings(int nodeOrdinal, Settings otherSettings) {
return Settings.builder()
.put(super.nodeSettings(nodeOrdinal, otherSettings))
.put(SearchModule.INDICES_MAX_CLAUSE_COUNT_SETTING.getKey(), CLUSTER_MAX_CLAUSE_COUNT)
.build();
}

@Override
protected Collection<Class<? extends Plugin>> nodePlugins() {
return Collections.singletonList(MockAnalysisPlugin.class);
@@ -578,11 +562,14 @@ public void testAllFieldsWithSpecifiedLeniency() throws IOException {
}

public void testLimitOnExpandedFields() throws Exception {

final int maxClauseCount = randomIntBetween(50, 100);

XContentBuilder builder = jsonBuilder();
builder.startObject();
builder.startObject("_doc");
builder.startObject("properties");
for (int i = 0; i < CLUSTER_MAX_CLAUSE_COUNT + 1; i++) {
for (int i = 0; i < maxClauseCount + 1; i++) {
builder.startObject("field" + i).field("type", "text").endObject();
}
builder.endObject(); // properties
@@ -591,15 +578,21 @@ public void testLimitOnExpandedFields() throws Exception {

assertAcked(
prepareCreate("toomanyfields").setSettings(
Settings.builder().put(MapperService.INDEX_MAPPING_TOTAL_FIELDS_LIMIT_SETTING.getKey(), CLUSTER_MAX_CLAUSE_COUNT + 100)
Settings.builder().put(MapperService.INDEX_MAPPING_TOTAL_FIELDS_LIMIT_SETTING.getKey(), maxClauseCount + 100)
).setMapping(builder)
);

client().prepareIndex("toomanyfields").setId("1").setSource("field1", "foo bar baz").get();
refresh();

doAssertLimitExceededException("*", CLUSTER_MAX_CLAUSE_COUNT + 1);
doAssertLimitExceededException("field*", CLUSTER_MAX_CLAUSE_COUNT + 1);
int originalMaxClauses = IndexSearcher.getMaxClauseCount();
try {
IndexSearcher.setMaxClauseCount(maxClauseCount);
doAssertLimitExceededException("*", maxClauseCount + 1);
doAssertLimitExceededException("field*", maxClauseCount + 1);
} finally {
IndexSearcher.setMaxClauseCount(originalMaxClauses);
}
}

private void doAssertLimitExceededException(String field, int exceedingFieldCount) {
@@ -610,7 +603,9 @@ private void doAssertLimitExceededException(String field, int exceedingFieldCoun
});
assertThat(
ExceptionsHelper.unwrap(e, IllegalArgumentException.class).getMessage(),
containsString("field expansion matches too many fields, limit: " + CLUSTER_MAX_CLAUSE_COUNT + ", got: " + exceedingFieldCount)
containsString(
"field expansion matches too many fields, limit: " + IndexSearcher.getMaxClauseCount() + ", got: " + exceedingFieldCount
)
);
}

@@ -8,12 +8,12 @@

package org.elasticsearch.index.search;

import org.apache.lucene.search.IndexSearcher;
import org.elasticsearch.common.regex.Regex;
import org.elasticsearch.core.Nullable;
import org.elasticsearch.index.mapper.MappedFieldType;
import org.elasticsearch.index.mapper.TextSearchInfo;
import org.elasticsearch.index.query.SearchExecutionContext;
import org.elasticsearch.search.SearchModule;

import java.util.Collection;
import java.util.HashMap;
@@ -94,7 +94,7 @@ static Map<String, Float> resolveMappingFields(
resolvedFields.put(field.getKey(), boost);
}
}
checkForTooManyFields(resolvedFields.size(), context, null);
checkForTooManyFields(resolvedFields.size(), null);
return resolvedFields;
}

@@ -151,8 +151,8 @@ static Map<String, Float> resolveMappingField(
return fields;
}

static void checkForTooManyFields(int numberOfFields, SearchExecutionContext context, @Nullable String inputPattern) {
Integer limit = SearchModule.INDICES_MAX_CLAUSE_COUNT_SETTING.get(context.getIndexSettings().getSettings());
static void checkForTooManyFields(int numberOfFields, @Nullable String inputPattern) {
int limit = IndexSearcher.getMaxClauseCount();
if (numberOfFields > limit) {
StringBuilder errorMsg = new StringBuilder("field expansion ");
if (inputPattern != null) {
@@ -285,7 +285,7 @@ private Map<String, Float> extractMultiFields(String field, boolean quoted) {
} else {
extractedFields = fieldsAndWeights;
}
checkForTooManyFields(extractedFields.size(), this.context, field);
checkForTooManyFields(extractedFields.size(), field);
return extractedFields;
}
