
Request Circuit Breaker not limiting memory usage #20250

Closed
sarathrs opened this issue Aug 31, 2016 · 12 comments
Labels
:Core/Infra/Circuit Breakers Track estimates of memory consumption to prevent overload >enhancement

Comments

@sarathrs

sarathrs commented Aug 31, 2016

Elasticsearch version: 2.3.4

Plugins installed: []

JVM version: 1.7.0

OS version: Ubuntu 12.04.5 LTS; also reproduced on CentOS

Description of the problem including expected versus actual behavior:
We are using Elasticsearch 2.3.4 with 1 GB of heap allocated. When we execute a dis_max query that contains 200K multi_match sub-queries, Elasticsearch's memory usage increases to 40%. We then updated the circuit breaker configuration in elasticsearch.yml to limit memory usage for the request, but that did not fix the memory usage.
Configuration change:
indices.breaker.fielddata.limit: 20%
indices.breaker.request.limit: 5%
indices.breaker.total.limit: 25%

Note: We have only one document in Elasticsearch, and the query will not match any document.
We also noticed that there is no memory usage by fielddata. The same issue is reproduced in our production cluster with 19 GB of heap allocated for Elasticsearch.

Expected behavior: setting "indices.breaker.request.limit" should cap the memory used by the request.

Steps to reproduce:

  1. Execute a dis_max query with 200K multi_match sub-queries

Elasticsearch

PUT /user
PUT /user/profile/1
{ "name" : "sarath", "location" : "chennai" }

Java Code

import java.net.InetSocketAddress;
import java.util.concurrent.TimeUnit;

import org.elasticsearch.action.search.SearchRequestBuilder;
import org.elasticsearch.action.search.SearchType;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.settings.Settings.Builder;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.index.query.DisMaxQueryBuilder;
import org.elasticsearch.index.query.MultiMatchQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHits;

Builder builder = Settings.settingsBuilder();
builder.put("cluster.name", "sarath-es-tv");
TransportClient client = TransportClient.builder().settings(builder.build()).build();
client.addTransportAddress(new InetSocketTransportAddress(new InetSocketAddress("localhost", 9302)));

DisMaxQueryBuilder dqb = QueryBuilders.disMaxQuery();
for (int i = 0; i < 100000; i++) {
    MultiMatchQueryBuilder nameQueryBuilder = QueryBuilders.multiMatchQuery("" + i, "name");
    dqb.add(nameQueryBuilder);
    MultiMatchQueryBuilder locationQueryBuilder = QueryBuilders.multiMatchQuery("" + i, "location");
    dqb.add(locationQueryBuilder);
    dqb.add(nameQueryBuilder);
    dqb.add(locationQueryBuilder);
}

SearchRequestBuilder searchQuery = client.prepareSearch("user")
        .setTypes("profile")
        .setQuery(dqb)
        .setSearchType(SearchType.QUERY_THEN_FETCH)
        .setSize(10);
SearchHits sh = searchQuery.execute().actionGet(new TimeValue(1, TimeUnit.MINUTES)).getHits();

Provide logs (if relevant):
Attached are screenshots of node stats with and without the circuit breaker; in both cases memory usage increases.
[screenshot: node stats without circuit breaker]
[screenshot: node stats with circuit breaker]

@clintongormley clintongormley added discuss :Core/Infra/Circuit Breakers Track estimates of memory consumption to prevent overload labels Aug 31, 2016
@clintongormley
Contributor

@danielmitterdorfer could you take a look at this please?

@danielmitterdorfer
Member

@clintongormley Yes, I can have a look.

@danielmitterdorfer
Member

danielmitterdorfer commented Sep 1, 2016

First of all, this is a great reproduction scenario. Thanks a lot.

In my tests, the majority of the memory (around 570 MB) was allocated in org.elasticsearch.search.query.QueryPhase.execute(SearchContext, IndexSearcher), which calls Lucene's IndexSearcher. As Lucene allocates this memory on-heap at that point, I fear (A) that we cannot do much about it and (B) that this is not the intent of the request circuit breaker. We could estimate the expected size, but I doubt that is easy to predict.

I think some of the confusion comes from the fact that we call this a "request" circuit breaker so you may think that this limits the total memory allocated in the context of a request. So maybe it makes sense to change the name of this breaker to make the intent clearer?

What do you think about all this @dakrone?

@danielmitterdorfer
Member

Btw, the new in-flight request circuit breaker (added in 2.4) will not help either, because the actual request is only roughly 8 MB in size (so that is not the problem).

@jpountz
Contributor

jpountz commented Sep 1, 2016

I think the problem is more about expectations than naming: users seem to expect the breaker to be 100% accurate, while in practice it is only best-effort (we cannot intercept every object allocation). But its goal is still to abort requests that may use too much memory.

@danielmitterdorfer
Member

@jpountz my (implicit) reasoning was that it's the name that triggers this expectation. I agree with everything else that you've said.

@jimczi
Contributor

jimczi commented Sep 1, 2016

Maybe we can add a limit on the number of disjuncts in the DisjunctionMaxQuery? This would be similar to BooleanQuery, which throws an exception if the number of clauses is greater than 1024 (by default).
I also tried with the BM25 similarity and it's worse: the query needs more than 400 MB just for the norms (we allocate an array of 256 floats for each term query => 400,000 * 256 * 4 bytes).
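That back-of-the-envelope figure can be checked in a few lines. The constants below are simply the numbers quoted in the comment above (400,000 term queries, a 256-float norm cache per term query, 4 bytes per float), not values read from Lucene itself:

```java
public class NormsEstimate {
    // Estimated on-heap norm-cache bytes, using the figures from the comment above.
    static long totalBytes() {
        long termQueries = 400_000L; // term queries in the reproduction
        long normCacheFloats = 256;  // floats cached per term query
        long bytesPerFloat = 4;
        return termQueries * normCacheFloats * bytesPerFloat;
    }

    public static void main(String[] args) {
        // 409,600,000 bytes, i.e. "more than 400 MB" just for the norms.
        System.out.println(totalBytes() + " bytes");
    }
}
```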

@jpountz
Contributor

jpountz commented Sep 1, 2016

+1 @jimferenczi

@dakrone
Member

dakrone commented Sep 1, 2016

I agree with Adrien that the expectation is certainly that something named a "request" breaker will encompass all memory used by a request. I don't think it should be renamed, though; we may want to try to fold other allocations (like the dismax allocations) into it if possible, so that it eventually grows into a true request breaker.

If we know the size of an array for each term, is it possible for us to hook into the disjunct generation so we can calculate the memory usage based on the number of disjuncts?
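A minimal sketch of that kind of up-front accounting might look like the following. This is purely illustrative: the SimpleBreaker-style check is collapsed into a `wouldBreak` predicate, the real breaker interface lives in org.elasticsearch.common.breaker.CircuitBreaker, and the per-disjunct cost (256 floats) is the figure from the earlier comment, not a value taken from Lucene:

```java
// Hypothetical sketch: charge an estimated per-disjunct norm-cache cost
// against a breaker limit before the query executes. Constants and class
// names are illustrative stand-ins, not Elasticsearch internals.
public class DisjunctAccounting {
    static final long NORM_BYTES_PER_DISJUNCT = 256 * 4; // 256 floats per term query (see above)

    // Estimated on-heap cost of the norms for a dis_max with this many disjuncts.
    static long estimateNormBytes(long disjuncts) {
        return disjuncts * NORM_BYTES_PER_DISJUNCT;
    }

    // Would a breaker with the given byte limit reject this query up front?
    static boolean wouldBreak(long disjuncts, long breakerLimitBytes) {
        return estimateNormBytes(disjuncts) > breakerLimitBytes;
    }

    public static void main(String[] args) {
        long limit = 50L * 1024 * 1024; // e.g. roughly 5% of a 1 GB heap
        System.out.println(wouldBreak(400_000L, limit)); // the reported query: rejected
        System.out.println(wouldBreak(1_000L, limit));   // a small query: allowed
    }
}
```

The point is only that the estimate can be computed from the disjunct count alone, before any Lucene allocation happens, which is what would make it usable inside a breaker.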

@Tim-Brooks
Contributor

@jimczi - Is this still something we want to do? This issue has not been active in a while.

@danielmitterdorfer
Member

Maybe we can add a limit for the number of disjuncts in the DisjunctionMaxQuery ?

This has been implemented in Lucene in https://issues.apache.org/jira/browse/LUCENE-8811 and will be included in Lucene 9. As this addresses the issue that was originally brought up here, I'm closing this ticket.

@danielmitterdorfer
Member

The follow-up work to make this limit configurable in Elasticsearch is now tracked in #46433.

Projects
None yet
Development

No branches or pull requests

7 participants