Result streaming on the discover tab #15662
Splitting up the requests was always a bit hacky and significantly complicated the codebase. We could explore some options if it affects a large number of users, but this is the first I've heard of this problem. Out of curiosity, what takes longer for you, the query itself or the aggregation for building the date histogram? You can grab the request Discover is sending from your browser's dev tools and play around with it in Kibana's Dev Tools app to get some timings.
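For anyone trying this, the request in question is roughly of the following shape. This is a hypothetical sketch for experimenting in Dev Tools, not Kibana's exact body; the field name, size, interval, and time zone are all assumptions:

```python
import json

# Rough shape of a Discover-style search: the document query plus a
# date_histogram aggregation for the chart. All parameter values here
# are illustrative assumptions, not Kibana's exact request.
def discover_style_body(field="@timestamp", interval="30m", tz="Europe/London"):
    return {
        "size": 500,
        "query": {"match_all": {}},
        "aggs": {
            "2": {  # agg id is arbitrary; Kibana uses numeric ids like this
                "date_histogram": {
                    "field": field,
                    "interval": interval,
                    "time_zone": tz,
                }
            }
        },
    }

print(json.dumps(discover_style_body(), indent=2))
```

Running the same body with and without the `aggs` section is a quick way to separate query time from histogram time.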
I've done some profiling, and if I'm interpreting this correctly, the DateHistogramAggregator is what is slow here, specifically the collect phase.
Thanks, I'll have to think about this some more and do some additional testing. It might take a little while with the holidays coming up. Just a guess, but if the date histogram is the bottleneck, it might speed things up to select a larger date interval instead of relying on whatever the automatic interval resolves to.
Turns out this might be solvable in ES.
I've done some basic tests and not found any speed issues due to the timezone settings in Kibana; I can do more thorough tests on Tuesday. My issue is that a query to view the Discover histogram on a billion documents takes about 5 minutes (2-node cluster) and times out. If this query is split up into chunks (one per index, à la Kibana 4) then it doesn't time out. I'm not sure whether the timezone issue is causing the query to take a long time, or whether a histogram query over that many documents is simply likely to take that long. As I say, I will test more thoroughly on Tuesday, thanks.
I've done some testing and the timezone is definitely one factor in this, but the query still times out and takes too long to be useful. This ticket is a request for the return of the histogram loading behaviour from Kibana 4 (loading the data one index at a time), which meant that we did not get timeouts. With that behaviour we would start to see results after just a few seconds, and the data would then fill in as it completed. As it is, the new behaviour makes the Discover tab mostly unusable for us. The ability to disable it in #17065 is helpful, but not really the solution we were looking for.
Thanks for the additional info @jgough. Out of curiosity, how slow is the query if you remove the date histogram agg completely? I agree some form of progressive loading might be nice for slow queries. Interval-based patterns and field stats are going away, so we couldn't implement it the same way as in Kibana 4, but we could still use simple date ranges to break up the query. Also keep in mind that, in the short term, you can increase the timeout settings in kibana.yml if you don't mind waiting for the slow queries to load.
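Breaking the query up by simple date ranges could look something like the following. This is a hypothetical helper assuming equal-width sub-ranges, not an actual Kibana implementation:

```python
from datetime import datetime

def split_time_range(start, end, n):
    """Split [start, end) into n equal sub-ranges (hypothetical helper).

    Each sub-range would become its own search, issued one after another
    (or in parallel), so partial results can render before the whole
    time range has finished querying.
    """
    step = (end - start) / n
    return [(start + i * step, start + (i + 1) * step) for i in range(n)]

# e.g. a 7-day Discover time range split into one query per day
ranges = split_time_range(datetime(2018, 1, 1), datetime(2018, 1, 8), 7)
```

Unlike the Kibana 4 approach, this needs no knowledge of which concrete indices exist, which is why it survives the removal of interval-based patterns and field stats.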
Just managed to get a few more quick benchmarks. Note this is a slightly different time range, so the results are not quite comparable to the above.
With histogram agg (Europe/London): >~120s (timed out)
We have a similar concern with the loading of the Discover tab. We have a 2B+ document (and growing) logging cluster. A search of 7 days of logs often matches over 1B results. While we don't get timeouts, the Discover tab loads pretty slowly, about 30s when caches are warm. I tested the query without aggs and it does cut the time roughly in half, but Kibana doesn't give an option to remove the histogram from the Discover tab. Changing the timezone doesn't make much difference since we are on a later version of ES that solves the timezone problem.

All this loading is blocking, and you get nothing until the entire search is done. If I search an individual day that matches ~250M documents I get a 6s load time (with caches warm). I would think you could get a much better user experience using progressive and/or parallel loading. If I took that same 7-day search and broke it up into 7 one-day searches, the total query time would be higher, but Kibana could start showing results on the screen sooner. There would also be the opportunity to parallelize the requests, which would bring the total time to full results down much lower.

We are running Elasticsearch and Kibana 7.4 on AWS.
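The parallel variant described above could be sketched like this. It is hypothetical: `search_one_day` is a stand-in for a real range-filtered Elasticsearch request, and the dates are made up:

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date, timedelta

def search_one_day(day):
    # Placeholder: a real implementation would POST a search whose query
    # includes a range filter on the timestamp field for this single day.
    return {"day": day.isoformat(), "status": "ok"}

days = [date(2019, 11, 1) + timedelta(days=i) for i in range(7)]

# Issue the 7 one-day searches concurrently; each result could be drawn
# on screen as soon as it returns, instead of waiting for all of them.
with ThreadPoolExecutor(max_workers=7) as pool:
    results = list(pool.map(search_one_day, days))
```

The trade-off is as the commenter describes: total work on the cluster goes up, but time-to-first-result drops sharply.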
Closing because |
In version 4 of Kibana, the Discover tab was filled by making multiple `_msearch` HTTP requests, one to each index matching the index pattern. In version 6, a single `_msearch` request is made to display the Discover tab.
This meant that when viewing the Discover tab in Kibana 4 there was one request per index which, while inefficient, performed much better on slower Elasticsearch hardware. Queries over billions of documents in a hundred indexes would load progressively (slowly, as 100 small HTTP requests), but they would not time out.
With Kibana 6 and a single `_msearch` request there is no progressive loading; the single query for everything must complete and return before showing any data (one big HTTP request). On slower Elasticsearch instances this means it can often time out on large collections of data. It would seem very useful to have a toggle to bring back the progressive loading with multiple queries, one per index. This is currently a blocker for us upgrading to Kibana 6.
I discussed this issue previously here:
https://discuss.elastic.co/t/discover-tab-timing-out-with-single-msearch-request/110325