Search request performance drops significantly when setting size to Integer.MAX_VALUE #13125
Comments
I'm pretty sure this is a duplicate. I'll hunt down what this is a duplicate of. OTOH you've left a workaround, so thanks for that.

Have you considered using a scroll? I would advise against setting the result
Note that such a request opens a "scroll context" that will enable you to continue to fetch the results of the initial search request.

Yeah - this is really much better than your workaround. I'd love for this to be less of a thing - for Elasticsearch to reject requests that are unreasonably large with a helpful message pointing you to scroll. And for some requests to be streamable. But neither of those is implemented yet. I thought there were issues opened for them but I've misplaced them.

There we go. Thanks.

I think I've got the solution: open a scroll to get a snapshot of the request that I can work with, then scroll through the list and collect all the items. I'll post my working solution when I find time to test and finish it. Thanks for the good hints and answers. 🙇

That's exactly right.

Here is my solution; hopefully it helps people who come across the same problem:

    SearchRequestBuilder sizeQuery = client.prepareSearch(SearchProvider.INDEX)
            .setTypes(TYPE)
            .setSearchType(SearchType.SCAN)
            .setSize(10)
            .setScroll(TimeValue.timeValueSeconds(60))
            .setQuery(queryBuilder);

    // With SearchType.SCAN the initial response carries no hits,
    // only the scroll ID and the total hit count.
    SearchResponse scrollResponse = sizeQuery.execute().actionGet();
    long totalHits = scrollResponse.getHits().getTotalHits();

    // Might be an optimization: delete the scroll when there are no hits.
    // It's deleted anyway when getting the first result...
    //if (totalHits == 0) {
    //    ClearScrollRequest clear = new ClearScrollRequest();
    //    clear.addScrollId(scrollResponse.getScrollId());
    //    client.clearScroll(clear);
    //    return result;
    //}

    String scrollID = scrollResponse.getScrollId();
    SearchResponse response;
    do {
        // Fetch the next batch and renew the scroll timeout.
        response = client.prepareSearchScroll(scrollID)
                .setScroll(TimeValue.timeValueSeconds(60))
                .get();
        scrollID = response.getScrollId();
        for (SearchHit searchHit : response.getHits()) {
            EsEntry esEntry = new EsEntry();
            esEntry.setId(searchHit.getId());
            esEntry.setRawData(searchHit.getSource());
            result.getEntryList().add(esEntry);
        }
    } while (response.getHits().hits().length > 0); // an empty batch ends the scroll
    return result;

Well, the scan and count search types are now deprecated (#1745). So how would this solution be implemented now?
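
On the deprecation question: as far as I understand, a regular scroll request sorted by `_doc` is the suggested replacement for scan, and the draining loop stays essentially the same. One difference to watch for: the initial response of a regular scroll already contains the first batch of hits (with SCAN it contained none), so that batch has to be processed before asking for the next page. Below is a minimal sketch of just that loop shape in plain Java. `FakeScroll` and `ScrollDrainSketch` are hypothetical in-memory stand-ins, not Elasticsearch API, so treat this as an illustration under those assumptions rather than a drop-in replacement.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a scroll cursor; NOT Elasticsearch API.
class FakeScroll {
    private final List<String> docs;
    private final int batchSize;
    private int position = 0;

    FakeScroll(List<String> docs, int batchSize) {
        this.docs = docs;
        this.batchSize = batchSize;
    }

    // Returns the next batch; an empty list means the cursor is exhausted.
    List<String> nextBatch() {
        int end = Math.min(position + batchSize, docs.size());
        List<String> batch = new ArrayList<>(docs.subList(position, end));
        position = end;
        return batch;
    }
}

class ScrollDrainSketch {
    // Drains the cursor: handle the current batch first, then request the next one.
    static List<String> drain(FakeScroll scroll) {
        List<String> result = new ArrayList<>();
        // Plays the role of the initial search response, which already holds hits.
        List<String> batch = scroll.nextBatch();
        while (!batch.isEmpty()) {
            result.addAll(batch);
            // Plays the role of the follow-up scroll request.
            batch = scroll.nextBatch();
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            docs.add("doc-" + i);
        }
        List<String> all = drain(new FakeScroll(docs, 10));
        System.out.println(all.size());
    }
}
```

Compared with the do/while loop in the solution above, the only structural change is that the first batch is consumed before the first follow-up request.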
We are updating some data in our Elasticsearch store. The data has a creation timestamp, so we fetch buckets of one hour's length from the database. Therefore we need to fetch all items from a given date bucket. Since we don't know how many items could be in the bucket, we set the size of the search to Integer.MAX_VALUE.
We're seeing a dramatic drop in performance when setting the hit size that high. When we set the size closer to the actual number of hits, the request is as fast as expected again.
Scroll to my comment for a solution using the scroll API
We tried to reproduce the problem using the REST API directly as well:
With Integer.MAX_VALUE (POST {{elastic_search_host}}/resolve_log_v1/entry/_search?pretty)
Result:
The same request with size set to 21:
When setting the size to 0, we still get a quick response.
The total count of entry objects (:
To optimize, we now do two queries: one to count, and one to get the hits:
This is not ideal, as the bucket could theoretically change between the two queries.