[Concurrent Segment Search] Optimize Significant Terms Agg to not perform count query for each segment slice #8789

Open · jed326 opened this issue Jul 20, 2023 · 2 comments
Labels: distributed framework, enhancement, Search:Aggregations

jed326 (Collaborator) commented Jul 20, 2023

This issue is a follow-up to #8703 and #8735

For Significant Terms queries, the bg_count is gathered by performing a count query against the shard. For the top-level bg_count, we only perform a query if a backgroundFilter is present:

supersetNumDocs = backgroundFilter == null ? searcher.getIndexReader().maxDoc() : searcher.count(this.backgroundFilter);

For the inner bg_count, however, the query is performed for every bucket regardless:

/**
 * Get the background frequency of a {@code long} term.
 */
private long getBackgroundFrequency(long term) throws IOException {
    return getBackgroundFrequency(fieldType.termQuery(format.format(term).toString(), context));
}

private long getBackgroundFrequency(Query query) throws IOException {
    if (query instanceof TermQuery) {
        // for types that use the inverted index, we prefer using a terms
        // enum that will do a better job at reusing index inputs
        Term term = ((TermQuery) query).getTerm();
        TermsEnum termsEnum = getTermsEnum(term.field());
        if (termsEnum.seekExact(term.bytes())) {
            return termsEnum.docFreq();
        }
        return 0;
    }
    // otherwise do it the naive way
    if (backgroundFilter != null) {
        query = new BooleanQuery.Builder().add(query, Occur.FILTER).add(backgroundFilter, Occur.FILTER).build();
    }
    return context.searcher().count(query);
}

The top-level bg_count will be the same for all buckets in the same shard, while the inner bg_count will be the same for every bucket with the same key in the same shard. Since both of these are shard-level counts, we do not need to perform the query for each bucket; each count could be computed once per shard and reused, saving the duplicated queries.
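One possible shape for that optimization is sketched below. This is a hypothetical illustration, not the actual aggregator change: the class name ShardBackgroundFrequencies and the injected count function are invented, and a real fix would live inside the significant terms aggregator and cover the TermQuery fast path as well.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: memoize per-term background frequencies at the
// shard level so that slices producing buckets with the same key share
// one count query instead of each issuing their own.
final class ShardBackgroundFrequencies {
    // Keyed by the formatted term; safe to share across slice threads.
    private final Map<String, Long> cache = new ConcurrentHashMap<>();
    private final Function<String, Long> countQuery; // the expensive per-term count

    ShardBackgroundFrequencies(Function<String, Long> countQuery) {
        this.countQuery = countQuery;
    }

    long backgroundFrequency(String term) {
        // computeIfAbsent runs the count query at most once per unique term
        return cache.computeIfAbsent(term, countQuery);
    }
}

With a cache like this, the number of count queries per shard is bounded by the number of unique bucket keys rather than unique keys times slice count; the top-level supersetNumDocs could similarly be computed once and shared across slices.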

jed326 added the enhancement and untriaged labels on Jul 20, 2023
sohami removed the status in Concurrent Search on Sep 12, 2023
jed326 (Collaborator, Author) commented Apr 18, 2024

Wanted to follow up on this with some data. The count query gets performed per bucket to obtain the bg_count of each bucket. In the concurrent segment search case this means the query is performed roughly slice_count × bucket_count times.
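As a hypothetical illustration of the multiplier: with 4 slices each building 500 buckets whose keys largely overlap, roughly 4 × 500 = 2,000 count queries hit the shard, versus about 500 with concurrent search disabled.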

Moreover, this happens in the buildAggregation step, which was executed sequentially in the search threadpool in 2.13 until #11673 changed that for 2.14.

This means that for significant terms aggregations with a large bucket count (nested aggregations, for example) we start to see latency regressions, since every count query issued during buildAggregation forks a request to the index_searcher threadpool. Moreover, since we are using the index_searcher task executor in those cases, we would also spam the TaskResourceTrackingService.

In 2.14 this TaskResourceTrackingService overhead is resolved: the buildAggregation step was moved onto the index_searcher thread, and the deadlock protection provided by Lucene's TaskExecutor makes the count queries run in the same calling index_searcher thread via Runnable::run without invoking the executor.
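The same-thread fallback being described looks roughly like the following. This is a generic sketch of the pattern, not Lucene's actual TaskExecutor code; the class name and the thread-group check are invented for illustration.

import java.util.concurrent.Executor;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Generic sketch of deadlock protection via same-thread execution:
// if the submitting thread already belongs to the pool, run the task
// inline via Runnable::run instead of re-submitting it, so pool
// threads can never block waiting on tasks queued behind themselves.
final class InlineFallbackExecutor implements Executor {
    private final ThreadGroup poolGroup = new ThreadGroup("index-searcher");
    private final ThreadPoolExecutor pool = new ThreadPoolExecutor(
        4, 4, 0L, TimeUnit.MILLISECONDS,
        new LinkedBlockingQueue<>(),
        r -> new Thread(poolGroup, r));

    @Override
    public void execute(Runnable task) {
        if (Thread.currentThread().getThreadGroup() == poolGroup) {
            task.run(); // same calling thread, no executor hand-off
        } else {
            pool.execute(task);
        }
    }
}

The consequence noted above follows directly: once buildAggregation itself runs on an index_searcher thread, every nested count query takes the inline path, so the executor (and the per-task resource tracking around it) is never touched.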

As an example, here are some performance numbers for the range-numeric-significant-terms operation in the noaa workload, with the top line showing the 0-slice numbers and the bottom line the 4-slice numbers.
[Screenshot: benchmark comparison of 0-slice vs. 4-slice latency, Apr 18 2024]

Additionally, here is a sample CPU profile for the 0-slice case:
[Screenshot: CPU profiler sample for the 0-slice case, Apr 18 2024]

jed326 (Collaborator, Author) commented Apr 18, 2024

The regression with respect to the task resource tracking overhead is fixed in 2.14 due to #11673:
[Screenshot: 2.14 benchmark results with the task resource tracking overhead resolved]

However, the underlying issue, where we perform approximately slice_count times as many duplicated count queries in the concurrent search case, still exists, and we can see it is the reason there is roughly no performance improvement when comparing the 0-slice case against the concurrent search disabled case.
