
Add safeguards to prevent simple user errors #11511

Closed · 16 of 24 tasks
clintongormley opened this issue Jun 5, 2015 · 24 comments
Labels
:Core/Infra/Core · :Data Management/Indices APIs · >enhancement · Meta · :Search Foundations/Mapping · Team:Search Foundations · v6.0.3

Comments

@clintongormley
Contributor

clintongormley commented Jun 5, 2015

There are a number of places where a naive user can break Elasticsearch very easily. We should add more (dynamically overridable) safeguards that prevent users from hurting themselves.

Note:

  • We are starting with high limits so that we don't suddenly break things that users already do today, while still giving sysadmins tools they can use to protect their clusters. We can revisit the limits later on.
  • All these settings should be prefixed by `policy.` to make them easier to document together and to understand their purpose (see the sketch after this list for how such a dynamic setting would be adjusted at runtime).
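For illustration, a minimal sketch of the runtime-override mechanism, assuming a local unsecured cluster and using the existing dynamic setting `cluster.routing.allocation.total_shards_per_node` as a stand-in (the proposed `policy.*` settings do not exist; they would be updated the same way):

```python
# Sketch: adjusting a dynamically overridable soft limit through the cluster
# settings API. The proposed "policy.*" settings would follow this pattern;
# here an existing dynamic shard-allocation limit is used as a stand-in.
import requests

ES = "http://localhost:9200"  # assumption: local, unsecured dev cluster


def set_soft_limit(name, value):
    """Persistently set a dynamic cluster setting and return the API response."""
    resp = requests.put(
        f"{ES}/_cluster/settings",
        json={"persistent": {name: value}},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__":
    # Stand-in for a "policy."-prefixed soft limit; caps shards per node.
    print(set_soft_limit("cluster.routing.allocation.total_shards_per_node", 200))
```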

Accepted limits:


For discussion:

Any other ideas?

@jpountz
Contributor

jpountz commented Jun 5, 2015

Limit the max number of shards

I'm wondering if we should do it per index or per cluster. If we do it per index, then we might also want to have a max number of indices per cluster.

Limit the size of a bulk request

I guess it would also apply to multi-get and multi-search.
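As a client-side illustration of the bulk-size concern, a sketch that keeps each `_bulk` body under a byte budget (the 10 MB cap and the index name are arbitrary assumptions; the server-side limit discussed here would be the actual safeguard):

```python
# Sketch: splitting a stream of bulk actions so that no single _bulk request
# body exceeds a chosen byte budget. Purely illustrative client-side logic.
import json

MAX_BULK_BYTES = 10 * 1024 * 1024  # assumption: 10 MB per-request budget


def chunk_bulk(actions, max_bytes=MAX_BULK_BYTES):
    """Yield newline-delimited _bulk bodies, each at most max_bytes long."""
    chunk, size = [], 0
    for action, source in actions:
        lines = json.dumps(action) + "\n" + json.dumps(source) + "\n"
        encoded = len(lines.encode("utf-8"))
        if chunk and size + encoded > max_bytes:
            yield "".join(chunk)
            chunk, size = [], 0
        chunk.append(lines)
        size += encoded
    if chunk:
        yield "".join(chunk)


# Usage: each item is an (action, document) pair in bulk API form; each yielded
# body would be POSTed to /_bulk with Content-Type: application/x-ndjson.
docs = [({"index": {"_index": "logs"}}, {"msg": f"event {i}"}) for i in range(1000)]
print(len(list(chunk_bulk(docs))), "bulk request(s)")
```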

@alexbrasetvik
Contributor

Some of this could go into a "sanity checker" kind of plugin, akin to the migration plugin, that runs a bunch of checks as well.

It could warn when e.g. minimum master nodes looks wrong, or when the number of shards/indexes/fields looks unreasonable or approaches the limits above.
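A rough sketch of what such an external check could look like, assuming a local unsecured cluster and made-up thresholds (a real checker or plugin would read the cluster's configured limits instead):

```python
# Sketch: a tiny external "sanity checker" that warns when the index or shard
# count looks suspicious. Thresholds are invented for illustration only.
import requests

ES = "http://localhost:9200"  # assumption: local, unsecured dev cluster


def sanity_check(max_indices=500, max_shards=5000):
    """Return warnings when index or shard counts exceed made-up thresholds."""
    health = requests.get(
        f"{ES}/_cluster/health", params={"level": "indices"}, timeout=10
    ).json()
    warnings = []
    index_count = len(health.get("indices", {}))
    if index_count > max_indices:
        warnings.append(f"too many indices: {index_count} > {max_indices}")
    if health.get("active_shards", 0) > max_shards:
        warnings.append(f"too many shards: {health['active_shards']} > {max_shards}")
    return warnings


if __name__ == "__main__":
    for warning in sanity_check():
        print("WARNING:", warning)
```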

@clintongormley
Contributor Author

@alexbrasetvik that requires the user to actually run the check. Often poor sysadmins are at the mercy of their users. What I'd like to do is prevent users from blowing things up by mistake.

@alexbrasetvik
Contributor

@clintongormley Agreed! I still think there's room for both, though such a tool should be another issue.

For example, a high number of indexes with few documents and identical mappings can be a sign that the user is doing per-user index partitioning when they shouldn't. That will turn into a problem even if the current values are far from the above-mentioned limits.

@pickypg
Member

pickypg commented Jul 15, 2015

Any other ideas?

  • Limit the max number of indices
    • It's effectively covered by limiting shards, but touching too many indices may indicate a logical issue more clearly than the shard count does (e.g., with daily indices, it's much easier to realize that a request touching 5 indices represents five days than that it touches 25 shards with default counts).
  • Limit the concurrent request size
    • Request circuit breaker across all concurrent requests

@dakrone
Member

dakrone commented Jul 16, 2015

Limit the concurrent request size

This is already available with the thread pools and queue_sizes to limit the number of requests per-node and apply backpressure.

EDIT: I guess I am reading "size" as "count"; is that what you mean?

@pickypg
Member

pickypg commented Jul 17, 2015

@dakrone Size of an actual request. For instance, if one request comes in with an aggregation that uses `size: 0` at the same time as another, then maybe we should block the second one (or at least delay it).
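For context, a related knob that already exists is the request circuit breaker, which caps the memory used by per-request data structures such as aggregations; a sketch of tightening it dynamically (the 20% value and the localhost endpoint are assumptions):

```python
# Sketch: tightening the existing request circuit breaker via the dynamic
# cluster setting indices.breaker.request.limit. The 20% value is arbitrary;
# this is not the concurrent-request limit proposed above, just a nearby knob.
import requests

ES = "http://localhost:9200"  # assumption: local, unsecured dev cluster

resp = requests.put(
    f"{ES}/_cluster/settings",
    json={"persistent": {"indices.breaker.request.limit": "20%"}},
    timeout=10,
)
resp.raise_for_status()
print(resp.json())
```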

@jpountz
Contributor

jpountz commented Oct 30, 2015

Another protection to add: check mapping depth #14370
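A sketch of exercising that safeguard at index-creation time, assuming the setting from #14370 is exposed as `index.mapping.depth.limit` (as in later releases) and a local unsecured cluster; the index name is made up:

```python
# Sketch: creating an index with a mapping depth cap, assuming the safeguard
# from elastic#14370 is the index.mapping.depth.limit index setting. Mappings
# with objects nested deeper than the limit are then rejected.
import requests

ES = "http://localhost:9200"  # assumption: local, unsecured dev cluster

resp = requests.put(
    f"{ES}/shallow-index",  # hypothetical index name for illustration
    json={"settings": {"index.mapping.depth.limit": 5}},
    timeout=10,
)
resp.raise_for_status()
print(resp.json())
```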

@ppf2
Member

ppf2 commented Nov 2, 2015

Limit the max value that can be set for queue_size for our search, bulk, index, etc. thread pools, so users can't set them to unlimited, millions, and so on?
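Until such a cap exists, a sysadmin can at least spot suspicious values; a sketch using the nodes info API, with a made-up threshold and a local unsecured cluster assumed:

```python
# Sketch: flagging nodes whose search/bulk/index thread pool queue_size looks
# unreasonably large (or unbounded, reported as -1). The threshold is invented;
# the point of this issue is that nothing currently prevents such values.
import requests

ES = "http://localhost:9200"  # assumption: local, unsecured dev cluster
SUSPICIOUS = 10_000            # assumption: made-up threshold for illustration

info = requests.get(f"{ES}/_nodes/thread_pool", timeout=10).json()
for node_id, node in info.get("nodes", {}).items():
    for pool_name in ("search", "bulk", "index"):
        queue = node.get("thread_pool", {}).get(pool_name, {}).get("queue_size")
        # -1 means unbounded; anything huge is a candidate for the proposed cap.
        if isinstance(queue, int) and (queue < 0 or queue > SUSPICIOUS):
            print(f"{node.get('name', node_id)}: {pool_name} queue_size={queue} looks dangerous")
```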

@s1monw
Contributor

s1monw commented Mar 28, 2017

@clintongormley I think we missed one rather important aspect when it comes to soft limits. Today the user can override those limits via dynamic settings, which is fine most of the time. But in the case of a cloud hosting infrastructure, where the org that runs the infrastructure needs full control over these limits, should they be able to disable the dynamic property, or disable setting these settings entirely?

@javanna javanna removed the discuss label May 5, 2017
jimczi added a commit to jimczi/elasticsearch that referenced this issue Aug 30, 2017
This change adds a dynamic cluster setting named `search.max_keep_alive`.
It is used as an upper limit for scroll expiry time in scroll queries and defaults to 1 hour.
This change also ensures that the existing setting `search.default_keep_alive` is always smaller than `search.max_keep_alive`.

Relates elastic#11511
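Since the new setting is dynamic, it can be tightened at runtime; a sketch assuming a local unsecured cluster and an existing index named `logs`:

```python
# Sketch: capping scroll keep-alive with the dynamic search.max_keep_alive
# setting added by this change, then opening a scroll that stays under the cap.
# The index name "logs" and the 10-minute cap are assumptions for illustration.
import requests

ES = "http://localhost:9200"  # assumption: local, unsecured dev cluster

# Lower the cluster-wide ceiling for scroll keep-alive from the 1h default.
requests.put(
    f"{ES}/_cluster/settings",
    json={"persistent": {"search.max_keep_alive": "10m"}},
    timeout=10,
).raise_for_status()

# A scroll asking for 5m stays under the cap; asking for e.g. 2h would now fail.
resp = requests.post(
    f"{ES}/logs/_search", params={"scroll": "5m"}, json={"size": 100}, timeout=10
)
resp.raise_for_status()
print(resp.json().get("_scroll_id"))
```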
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Sep 5, 2017
jimczi added a commit that referenced this issue Sep 6, 2017
This change adds a dynamic cluster setting named `search.max_keep_alive`.
It is used as an upper limit for scroll expiry time in scroll queries and defaults to 1 hour.
This change also ensures that the existing setting `search.default_keep_alive` is always smaller than `search.max_keep_alive`.

Relates #11511

* check style

* add skip for bwc

* iter

* Add a maximum throttle wait time of 1h for reindex

* review

* remove empty line
jimczi added a commit that referenced this issue Sep 7, 2017
@lcawl lcawl added v6.0.1 and removed v6.0.0 labels Nov 13, 2017
@lcawl lcawl added v6.0.2 and removed v6.0.1 labels Dec 6, 2017
@jaymode jaymode added v6.0.3 and removed v6.0.2 labels Dec 13, 2017
@jpountz
Contributor

jpountz commented Mar 14, 2018

Most of the work has been done, and the items that have not been done have an assigned issue, so I'll close this one. Thanks everyone!

@jpountz jpountz closed this as completed Mar 14, 2018
@javanna javanna added Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch and removed :Search/Search Search-related issues that do not fall into other categories labels Jul 16, 2024