Composite aggs seems to sort too slowly with filter queries #70035

benwtrent · 2021-03-05T20:12:28Z

Piggy-backing off of previous work: #28745

During the work in #69970 some troubling performance data has reared its ugly head.

Given the following query:

{"bool":{"filter":[{"term":{"event.dataset":"nginx.access"}}]}}

The following composite agg moves at an almost glacial pace:

  "aggs": {
    "buckets": {
      "composite": {
        "size": 1000,
        "sources": [
          {
            "date": {
              "date_histogram": {
                "field": "@timestamp",
                "fixed_interval": "15m"
              }
            }
          },
          {
            "source.address": {
              "terms": {
                "field": "source.address"
              }
            }
          }
        ]
      },
      "aggregations": {
        "@timestamp": {
          "max": {
            "field": "@timestamp"
          }
        }
      }
    }
  }

Here are some doc stats:

total_hits: 14479391
cardinality(source.address): 851502
max_timestamp: "2017-03-11T23:59:56.537Z"
min_timestamp: "2017-02-01T00:00:00.189Z"

In datafeeds, we "chunk" the time range when scrolling through data. Consequently, we still hit every document, but spread over multiple queries. We do this because sorting by timestamp can be costly when a single search hits many docs.

So, our scrolling datafeed had the following performance:

search_count | 16,649
bucket_count | 935
average_search_time_per_bucket_ms | 81.901
~4.5 ms per search, computed as (bucket_count * average_search_time_per_bucket_ms) / search_count

Job finished in ~6 minutes

Doing composite agg without chunking:
🐌 🐌 🐌

search_count | 3,795
bucket_count | 935
average_search_time_per_bucket_ms | 2,705.224
~666.5 ms per search

🐌 🐌 🐌
Job finished in 40+ minutes
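
For anyone wanting to double-check the per-search figures, here is a small sketch that applies the formula quoted above to the two runs reported so far. The function name `per_search_ms` is made up for illustration; the inputs are copied straight from the stats above.

```python
def per_search_ms(bucket_count: int, avg_search_ms_per_bucket: float, search_count: int) -> float:
    """Total search time (buckets * average time per bucket) spread over the searches issued."""
    return bucket_count * avg_search_ms_per_bucket / search_count


# Figures copied from the two runs reported above.
scroll_ms = per_search_ms(935, 81.901, 16_649)       # scrolling datafeed
composite_ms = per_search_ms(935, 2_705.224, 3_795)  # composite agg, no chunking

print(f"scroll:    {scroll_ms:.1f} ms per search")
print(f"composite: {composite_ms:.1f} ms per search")
print(f"ratio:     {composite_ms / scroll_ms:.0f}x")
```

The scroll figure comes out just under 4.6 ms, in line with the ~4.5 ms quoted above, and the composite run is roughly two orders of magnitude slower per search.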

It seems to me that the composite agg is doing WAY too much work. I think it may be sorting WAY too many documents given the sources.

As an experiment, I added some time-based query chunking in 25264688 ms intervals (calculated based on term cardinality, count, and total time range).
🔥 🔥 🔥

search_count | 4,124
bucket_count | 935
average_search_time_per_bucket_ms | 112.775
~25 ms per search

🔥 🔥 🔥
Job finished in ~4 minutes
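
A minimal sketch of what time-based chunking like this could look like on the client side. This is an assumption about the shape of the experiment, not the code that was actually run: the helper names (`to_epoch_ms`, `time_chunks`, `chunked_query`) are hypothetical, and the derivation of the 25264688 ms interval is not shown in this issue, so it is taken as a given constant. Each chunk simply adds a half-open `range` filter on `@timestamp` next to the user's `term` filter.

```python
from datetime import datetime, timedelta, timezone

CHUNK_MS = 25_264_688  # interval quoted above; how it was derived is not shown here
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(iso_utc: str) -> int:
    """Parse a UTC timestamp such as 2017-02-01T00:00:00.189Z into epoch milliseconds."""
    dt = datetime.strptime(iso_utc, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
    return (dt - EPOCH) // timedelta(milliseconds=1)


def time_chunks(start_ms: int, end_ms_exclusive: int, chunk_ms: int = CHUNK_MS):
    """Yield contiguous half-open [lo, hi) windows covering the whole time range."""
    lo = start_ms
    while lo < end_ms_exclusive:
        hi = min(lo + chunk_ms, end_ms_exclusive)
        yield lo, hi
        lo = hi


def chunked_query(lo: int, hi: int) -> dict:
    """The filter query from this issue, restricted to one time window."""
    return {"bool": {"filter": [
        {"term": {"event.dataset": "nginx.access"}},
        {"range": {"@timestamp": {"gte": lo, "lt": hi, "format": "epoch_millis"}}},
    ]}}


# min/max timestamps from the doc stats above; +1 so the last doc falls inside the final window.
min_ms = to_epoch_ms("2017-02-01T00:00:00.189Z")
max_ms = to_epoch_ms("2017-03-11T23:59:56.537Z")
chunks = list(time_chunks(min_ms, max_ms + 1))
queries = [chunked_query(lo, hi) for lo, hi in chunks]
print(len(queries), "chunked queries")
```

Each query would then be paired with the composite agg and paged to completion before moving on to the next window.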

Datafeeds (and transforms) will ALWAYS use a filter-based query (ignoring scores). These queries are user-provided, so they could definitely be anything. But it seems to me that there is still room for improvement in the composite agg.

elasticmachine · 2021-03-05T20:12:30Z

Pinging @elastic/es-analytics-geo (Team:Analytics)

benwtrent · 2021-03-05T20:22:55Z

OK, for curiosity's sake, I ran the datafeed without that simple filter query:
⚡ ⚡ ⚡ ⚡

search_count | 3,795
bucket_count | 935
average_search_time_per_bucket_ms | 59.694

⚡ ⚡ ⚡ ⚡
So much faster than my garbage chunking.
And the job finished in 3 minutes.

Oh man, if we could get these speeds with filter queries!!!

benwtrent · 2021-03-05T20:48:04Z

Note: for machine learning datafeeds, the first composite agg source will always be a date_histogram. We do this to make sure we get the buckets in a known time order.

benwtrent · 2021-03-16T19:08:48Z

I dug into the code some and discussed the various execution paths with @nik9000.

elasticsearch/server/src/main/java/org/elasticsearch/search/aggregations/bucket/composite/CompositeAggregator.java

Lines 402 to 411 in 94b9c4b

    
    } else {
        final LeafBucketCollector inner = queue.getLeafCollector(ctx, getFirstPassCollector(docIdSetBuilder, sortPrefixLen));
        return new LeafBucketCollector() {
            @Override
            public void collect(int doc, long zeroBucket) throws IOException {
                assert zeroBucket == 0L;
                inner.collect(doc);
            }
        };
    }

This is the path we are hitting because:

The source index is NOT sorted
We have more than a range query in our user-provided query

But, it does seem weird to hit every document.

Assume that the top source is a date_histogram and you have the after_key and size in hand. That seems to provide ample opportunity for reducing the number of docs hit as one can bring down the range of docs considered. Obviously, this is easier said than done.
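
To make the idea concrete, here is a rough client-side approximation, not a proposal for how the aggregator itself should do it. It assumes the date source sorts ascending (the default), that its key in the `after_key` is the bucket start in epoch milliseconds, and that `missing_bucket` is not enabled. Under those assumptions no document older than the `after_key` bucket can land in a later page, so a `range` filter can drop those docs up front. The function `next_page_request` and the sample `after_key` values are hypothetical.

```python
import copy


def next_page_request(user_query: dict, composite_agg: dict, after_key: dict = None,
                      time_field: str = "@timestamp", date_source: str = "date") -> dict:
    """Build one page of the composite search, narrowing the time range once an after_key exists."""
    filters = [copy.deepcopy(user_query)]
    composite = copy.deepcopy(composite_agg)
    if after_key is not None:
        composite["after"] = after_key
        # Assumption: ascending date_histogram source, key = bucket start in epoch millis.
        filters.append({"range": {time_field: {"gte": after_key[date_source],
                                               "format": "epoch_millis"}}})
    return {
        "size": 0,
        "query": {"bool": {"filter": filters}},
        "aggs": {"buckets": {"composite": composite}},
    }


user_query = {"bool": {"filter": [{"term": {"event.dataset": "nginx.access"}}]}}
composite_agg = {
    "size": 1000,
    "sources": [
        {"date": {"date_histogram": {"field": "@timestamp", "fixed_interval": "15m"}}},
        {"source.address": {"terms": {"field": "source.address"}}},
    ],
}

first_page = next_page_request(user_query, composite_agg)
# Hypothetical after_key, shaped like what a previous page would return.
after_key = {"date": 1485907200000, "source.address": "10.0.0.1"}
second_page = next_page_request(user_query, composite_agg, after_key)
```

This only trims the lower end of the time range. An upper bound would need to know how many buckets fit in `size`, which depends on the cardinality of the other sources, so it is left out here.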

Making that "slow path" faster will greatly improve throughput for transforms (which constantly use composite aggs with range queries and term filter queries) and datafeeds (which allow arbitrary user-provided filter queries).

The nice thing about datafeeds is that the top source will ALWAYS be a date_histogram.

For transforms AND datafeeds, the composite agg will be the only top level aggregation.

I leave this in more capable hands than mine :).

dimitris-athanasiou · 2022-12-07T14:40:26Z

I've raised #92197 which might be able to help with this issue.

wchaparro · 2024-02-15T20:32:50Z

closing as not planned.

benwtrent added the >enhancement and :Analytics/Aggregations labels Mar 5, 2021

elasticmachine added the Team:Analytics label Mar 5, 2021

benwtrent mentioned this issue Mar 5, 2021

[ML] adding support for composite aggs in anomaly detection #69970

Merged

not-napoleon added the :Analytics/CompositeAggs label and removed the :Analytics/Aggregations label Jun 23, 2021

ywelsch mentioned this issue Jun 22, 2022

Add runner for paging through composite aggregations elastic/rally#1526

Merged

wchaparro closed this as completed Feb 15, 2024
