Remove shortcutTotalHitCount optimization #89047

javanna · 2022-08-02T17:13:08Z

Our TopDocsCollectorContext has an optimization that tries to avoid counting the total hit count for queries like match all docs, term query and field exists query, relying on the statistics from each segment instead. This optimization was recently streamlined in Lucene through the introduction of Weight#count, which TotalHitCountCollector now leverages directly as of https://issues.apache.org/jira/browse/LUCENE-10620, and was later complemented by #88396 within Elasticsearch.

With this, we can remove the internal optimization and instead rely on the default Lucene behaviour, which covers more queries and may be expanded further in the future.
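For readers less familiar with the idea, here is a minimal sketch of counting from per-segment statistics. It is a toy model, not the Lucene or Elasticsearch code: `HitCountSketch`, `Segment`, `cheapCount`, `slowCount` and `totalHits` are hypothetical names made up for illustration. The only thing it borrows is the contract of Weight#count as I understand it: return an exact count when it is cheap to obtain, or -1 to signal that documents must be visited.

```java
import java.util.List;

/** Toy model of count-by-statistics; none of these types are Lucene or Elasticsearch classes. */
final class HitCountSketch {

    /** Hypothetical segment: which docs contain the term, which docs are live, and a stored docFreq. */
    static final class Segment {
        final boolean[] hasTerm;
        final boolean[] live;
        final int docFreq;          // precomputed statistic; still includes deleted docs
        final boolean hasDeletions;

        Segment(boolean[] hasTerm, boolean[] live) {
            this.hasTerm = hasTerm;
            this.live = live;
            int freq = 0;
            boolean deletions = false;
            for (int i = 0; i < hasTerm.length; i++) {
                if (hasTerm[i]) freq++;
                if (!live[i]) deletions = true;
            }
            this.docFreq = freq;
            this.hasDeletions = deletions;
        }
    }

    /** Cheap exact count from the statistic, or -1 when deletions make the statistic unreliable. */
    static int cheapCount(Segment s) {
        return s.hasDeletions ? -1 : s.docFreq;
    }

    /** Fallback that visits every document in the segment. */
    static int slowCount(Segment s) {
        int count = 0;
        for (int i = 0; i < s.hasTerm.length; i++) {
            if (s.hasTerm[i] && s.live[i]) count++;
        }
        return count;
    }

    /** Sums per-segment counts, iterating documents only where the statistic cannot be trusted. */
    static int totalHits(List<Segment> segments) {
        int total = 0;
        for (Segment s : segments) {
            int cheap = cheapCount(s);
            total += cheap >= 0 ? cheap : slowCount(s);
        }
        return total;
    }

    public static void main(String[] args) {
        Segment clean = new Segment(new boolean[] {true, false, true}, new boolean[] {true, true, true});
        Segment withDeletes = new Segment(new boolean[] {true, true, false}, new boolean[] {true, false, true});
        // The first segment is answered from its statistic, the second by iteration.
        System.out.println(totalHits(List.of(clean, withDeletes)));
    }
}
```

The point of the sketch is that the decision is made per segment, so a single segment with deletions does not force a full count of the whole index.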

Closes #81034

elasticsearchmachine · 2022-08-02T19:45:43Z

Pinging @elastic/es-search (Team:Search)

dnhatn

LGTM. Thanks @javanna!

jpountz · 2022-08-03T08:42:24Z

server/src/main/java/org/elasticsearch/search/query/TopDocsCollectorContext.java

-                    totalHitsSupplier = () -> topDocsSupplier.get().totalHits;
-                } else {
-                    // don't compute hit counts via the collector
-                    topDocsCollector = createCollector(sortAndFormats, numHits, searchAfter, 1);


I like this change. This is the only place where I think we may see regressions. Before your change, we would tell the top docs collector that it doesn't have to count hits at all (the 1 here), since we computed the count up-front. With your change, we would always count up to trackTotalHitsUpTo documents, which slightly delays skipping.

Maybe we could run benchmarks with the geonames track to quantify the impact. I'll be especially interested in the impact on the default and term queries.
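To make the concern concrete, here is a rough sketch of why a larger threshold delays skipping. It is a toy model under stated assumptions, not Lucene's collector: `ThresholdSketch`, `collectedDocs` and the `threshold` parameter are hypothetical names, and the model folds scoring and skipping into one loop, whereas in Lucene the skipping happens inside the scorer.

```java
import java.util.PriorityQueue;

/** Toy model of threshold-gated skipping; not Lucene's actual collector logic. */
final class ThresholdSketch {

    /**
     * Returns how many documents are collected when non-competitive documents may only be
     * skipped after more than {@code threshold} hits have been counted.
     */
    static int collectedDocs(float[] scores, int numHits, int threshold) {
        PriorityQueue<Float> topK = new PriorityQueue<>(); // min-heap: head is the weakest kept score
        int counted = 0;
        for (float score : scores) {
            boolean maySkip = counted > threshold && topK.size() == numHits;
            if (maySkip && score <= topK.peek()) {
                continue; // cannot enter the top hits and no longer needs to be counted
            }
            counted++;
            if (topK.size() < numHits) {
                topK.add(score);
            } else if (score > topK.peek()) {
                topK.poll();
                topK.add(score);
            }
        }
        return counted;
    }

    public static void main(String[] args) {
        float[] scores = new float[100];
        for (int i = 0; i < scores.length; i++) {
            scores[i] = 100 - i; // best documents first, so skipping could start early
        }
        System.out.println(collectedDocs(scores, 3, 1));  // hit count already known up-front
        System.out.println(collectedDocs(scores, 3, 50)); // skipping starts only once the threshold is exceeded
    }
}
```

In this model the low threshold stops collecting as soon as the top hits are filled, while the higher threshold keeps collecting documents that can no longer be competitive. Whether that difference is measurable in practice is what the benchmarks should tell us.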

javanna · 2022-12-20T10:37:10Z

I revived this PR and ran the geonames benchmarks. Nothing in the benchmark results caught my eye; can you double check too, @jpountz? Would you like me to run other benchmarks?

Baseline (current main without my change):

|                                                         Metric |                           Task |           Value |    Unit |
|---------------------------------------------------------------:|-------------------------------:|----------------:|--------:|
|                     Cumulative indexing time of primary shards |                                |    14.7545      |     min |
|             Min cumulative indexing time across primary shards |                                |     2.9129      |     min |
|          Median cumulative indexing time across primary shards |                                |     2.96518     |     min |
|             Max cumulative indexing time across primary shards |                                |     2.99017     |     min |
|            Cumulative indexing throttle time of primary shards |                                |     0           |     min |
|    Min cumulative indexing throttle time across primary shards |                                |     0           |     min |
| Median cumulative indexing throttle time across primary shards |                                |     0           |     min |
|    Max cumulative indexing throttle time across primary shards |                                |     0           |     min |
|                        Cumulative merge time of primary shards |                                |     0.167567    |     min |
|                       Cumulative merge count of primary shards |                                |     6           |         |
|                Min cumulative merge time across primary shards |                                |     0.0027      |     min |
|             Median cumulative merge time across primary shards |                                |     0.00398333  |     min |
|                Max cumulative merge time across primary shards |                                |     0.143083    |     min |
|               Cumulative merge throttle time of primary shards |                                |     0.03035     |     min |
|       Min cumulative merge throttle time across primary shards |                                |     0           |     min |
|    Median cumulative merge throttle time across primary shards |                                |     0           |     min |
|       Max cumulative merge throttle time across primary shards |                                |     0.03035     |     min |
|                      Cumulative refresh time of primary shards |                                |     1.65617     |     min |
|                     Cumulative refresh count of primary shards |                                |    48           |         |
|              Min cumulative refresh time across primary shards |                                |     0.271283    |     min |
|           Median cumulative refresh time across primary shards |                                |     0.306783    |     min |
|              Max cumulative refresh time across primary shards |                                |     0.445217    |     min |
|                        Cumulative flush time of primary shards |                                |     0.959783    |     min |
|                       Cumulative flush count of primary shards |                                |    10           |         |
|                Min cumulative flush time across primary shards |                                |     0.140083    |     min |
|             Median cumulative flush time across primary shards |                                |     0.199433    |     min |
|                Max cumulative flush time across primary shards |                                |     0.225283    |     min |
|                                        Total Young Gen GC time |                                |     1.505       |       s |
|                                       Total Young Gen GC count |                                |    46           |         |
|                                          Total Old Gen GC time |                                |     0           |       s |
|                                         Total Old Gen GC count |                                |     0           |         |
|                                                     Store size |                                |     2.96172     |      GB |
|                                                  Translog size |                                |     2.56114e-07 |      GB |
|                                         Heap used for segments |                                |     0           |      MB |
|                                       Heap used for doc values |                                |     0           |      MB |
|                                            Heap used for terms |                                |     0           |      MB |
|                                            Heap used for norms |                                |     0           |      MB |
|                                           Heap used for points |                                |     0           |      MB |
|                                    Heap used for stored fields |                                |     0           |      MB |
|                                                  Segment count |                                |    87           |         |
|                                    Total Ingest Pipeline count |                                |     0           |         |
|                                     Total Ingest Pipeline time |                                |     0           |       s |
|                                   Total Ingest Pipeline failed |                                |     0           |         |
|                                                     error rate |                   index-append |     0           |       % |
|                                       100th percentile latency |            refresh-after-index | 10493.2         |      ms |
|                                  100th percentile service time |            refresh-after-index | 10493.2         |      ms |
|                                                     error rate |            refresh-after-index |   100           |       % |
|                                                 Min Throughput |                    index-stats |    89.95        |   ops/s |
|                                                Mean Throughput |                    index-stats |    89.98        |   ops/s |
|                                              Median Throughput |                    index-stats |    89.98        |   ops/s |
|                                                 Max Throughput |                    index-stats |    89.99        |   ops/s |
|                                        50th percentile latency |                    index-stats |     2.87281     |      ms |
|                                        90th percentile latency |                    index-stats |     3.78207     |      ms |
|                                        99th percentile latency |                    index-stats |     4.25682     |      ms |
|                                      99.9th percentile latency |                    index-stats |     4.55479     |      ms |
|                                       100th percentile latency |                    index-stats |     4.58303     |      ms |
|                                   50th percentile service time |                    index-stats |     1.65261     |      ms |
|                                   90th percentile service time |                    index-stats |     1.89706     |      ms |
|                                   99th percentile service time |                    index-stats |     2.16371     |      ms |
|                                 99.9th percentile service time |                    index-stats |     2.88629     |      ms |
|                                  100th percentile service time |                    index-stats |     3.40735     |      ms |
|                                                     error rate |                    index-stats |     0           |       % |
|                                                 Min Throughput |                     node-stats |    89.78        |   ops/s |
|                                                Mean Throughput |                     node-stats |    89.92        |   ops/s |
|                                              Median Throughput |                     node-stats |    89.93        |   ops/s |
|                                                 Max Throughput |                     node-stats |    89.97        |   ops/s |
|                                        50th percentile latency |                     node-stats |     2.99969     |      ms |
|                                        90th percentile latency |                     node-stats |     4.06034     |      ms |
|                                        99th percentile latency |                     node-stats |     5.01334     |      ms |
|                                      99.9th percentile latency |                     node-stats |     6.46364     |      ms |
|                                       100th percentile latency |                     node-stats |     6.596       |      ms |
|                                   50th percentile service time |                     node-stats |     2.0371      |      ms |
|                                   90th percentile service time |                     node-stats |     2.42635     |      ms |
|                                   99th percentile service time |                     node-stats |     4.08543     |      ms |
|                                 99.9th percentile service time |                     node-stats |     4.85342     |      ms |
|                                  100th percentile service time |                     node-stats |     4.86696     |      ms |
|                                                     error rate |                     node-stats |     0           |       % |
|                                                 Min Throughput |                        default |    49.99        |   ops/s |
|                                                Mean Throughput |                        default |    49.99        |   ops/s |
|                                              Median Throughput |                        default |    49.99        |   ops/s |
|                                                 Max Throughput |                        default |    50           |   ops/s |
|                                        50th percentile latency |                        default |     3.06582     |      ms |
|                                        90th percentile latency |                        default |     4.31493     |      ms |
|                                        99th percentile latency |                        default |     4.75226     |      ms |
|                                      99.9th percentile latency |                        default |     7.30601     |      ms |
|                                       100th percentile latency |                        default |     9.2979      |      ms |
|                                   50th percentile service time |                        default |     2.03371     |      ms |
|                                   90th percentile service time |                        default |     2.34846     |      ms |
|                                   99th percentile service time |                        default |     2.72376     |      ms |
|                                 99.9th percentile service time |                        default |     6.79293     |      ms |
|                                  100th percentile service time |                        default |     8.94201     |      ms |
|                                                     error rate |                        default |     0           |       % |
|                                                 Min Throughput |                           term |    99.88        |   ops/s |
|                                                Mean Throughput |                           term |    99.92        |   ops/s |
|                                              Median Throughput |                           term |    99.93        |   ops/s |
|                                                 Max Throughput |                           term |    99.95        |   ops/s |
|                                        50th percentile latency |                           term |     2.59975     |      ms |
|                                        90th percentile latency |                           term |     3.03838     |      ms |
|                                        99th percentile latency |                           term |     3.4622      |      ms |
|                                      99.9th percentile latency |                           term |     7.59964     |      ms |
|                                       100th percentile latency |                           term |    11.1005      |      ms |
|                                   50th percentile service time |                           term |     1.80133     |      ms |
|                                   90th percentile service time |                           term |     2.06376     |      ms |
|                                   99th percentile service time |                           term |     2.28196     |      ms |
|                                 99.9th percentile service time |                           term |     6.80313     |      ms |
|                                  100th percentile service time |                           term |    10.8138      |      ms |
|                                                     error rate |                           term |     0           |       % |
|                                                 Min Throughput |                         phrase |   109.65        |   ops/s |
|                                                Mean Throughput |                         phrase |   109.79        |   ops/s |
|                                              Median Throughput |                         phrase |   109.81        |   ops/s |
|                                                 Max Throughput |                         phrase |   109.86        |   ops/s |
|                                        50th percentile latency |                         phrase |     2.58116     |      ms |
|                                        90th percentile latency |                         phrase |     3.01626     |      ms |
|                                        99th percentile latency |                         phrase |     3.32541     |      ms |
|                                      99.9th percentile latency |                         phrase |    15.9282      |      ms |
|                                       100th percentile latency |                         phrase |    18.9218      |      ms |
|                                   50th percentile service time |                         phrase |     1.80603     |      ms |
|                                   90th percentile service time |                         phrase |     2.01516     |      ms |
|                                   99th percentile service time |                         phrase |     2.26491     |      ms |
|                                 99.9th percentile service time |                         phrase |    10.6534      |      ms |
|                                  100th percentile service time |                         phrase |    18.4407      |      ms |
|                                                     error rate |                         phrase |     0           |       % |
|                                                 Min Throughput |           country_agg_uncached |     3           |   ops/s |
|                                                Mean Throughput |           country_agg_uncached |     3           |   ops/s |
|                                              Median Throughput |           country_agg_uncached |     3           |   ops/s |
|                                                 Max Throughput |           country_agg_uncached |     3           |   ops/s |
|                                        50th percentile latency |           country_agg_uncached |   134.2         |      ms |
|                                        90th percentile latency |           country_agg_uncached |   145.774       |      ms |
|                                        99th percentile latency |           country_agg_uncached |   162.576       |      ms |
|                                       100th percentile latency |           country_agg_uncached |   176.927       |      ms |
|                                   50th percentile service time |           country_agg_uncached |   133.062       |      ms |
|                                   90th percentile service time |           country_agg_uncached |   144.698       |      ms |
|                                   99th percentile service time |           country_agg_uncached |   161.493       |      ms |
|                                  100th percentile service time |           country_agg_uncached |   175.857       |      ms |
|                                                     error rate |           country_agg_uncached |     0           |       % |
|                                                 Min Throughput |             country_agg_cached |    98.62        |   ops/s |
|                                                Mean Throughput |             country_agg_cached |    99.03        |   ops/s |
|                                              Median Throughput |             country_agg_cached |    99.07        |   ops/s |
|                                                 Max Throughput |             country_agg_cached |    99.3         |   ops/s |
|                                        50th percentile latency |             country_agg_cached |     2.28683     |      ms |
|                                        90th percentile latency |             country_agg_cached |     3.45561     |      ms |
|                                        99th percentile latency |             country_agg_cached |     3.83689     |      ms |
|                                      99.9th percentile latency |             country_agg_cached |     4.39798     |      ms |
|                                       100th percentile latency |             country_agg_cached |     4.6001      |      ms |
|                                   50th percentile service time |             country_agg_cached |     1.43025     |      ms |
|                                   90th percentile service time |             country_agg_cached |     1.70903     |      ms |
|                                   99th percentile service time |             country_agg_cached |     1.95273     |      ms |
|                                 99.9th percentile service time |             country_agg_cached |     2.63577     |      ms |
|                                  100th percentile service time |             country_agg_cached |     2.77928     |      ms |
|                                                     error rate |             country_agg_cached |     0           |       % |
|                                                 Min Throughput |                         scroll |    20.05        | pages/s |
|                                                Mean Throughput |                         scroll |    20.06        | pages/s |
|                                              Median Throughput |                         scroll |    20.06        | pages/s |
|                                                 Max Throughput |                         scroll |    20.07        | pages/s |
|                                        50th percentile latency |                         scroll |   123.728       |      ms |
|                                        90th percentile latency |                         scroll |   126.994       |      ms |
|                                        99th percentile latency |                         scroll |   146.606       |      ms |
|                                       100th percentile latency |                         scroll |   161.786       |      ms |
|                                   50th percentile service time |                         scroll |   121.415       |      ms |
|                                   90th percentile service time |                         scroll |   124.792       |      ms |
|                                   99th percentile service time |                         scroll |   144.102       |      ms |
|                                  100th percentile service time |                         scroll |   159.452       |      ms |
|                                                     error rate |                         scroll |     0           |       % |
|                                                 Min Throughput |                     expression |     1.5         |   ops/s |
|                                                Mean Throughput |                     expression |     1.5         |   ops/s |
|                                              Median Throughput |                     expression |     1.5         |   ops/s |
|                                                 Max Throughput |                     expression |     1.5         |   ops/s |
|                                        50th percentile latency |                     expression |   332.406       |      ms |
|                                        90th percentile latency |                     expression |   336.178       |      ms |
|                                        99th percentile latency |                     expression |   349.702       |      ms |
|                                       100th percentile latency |                     expression |   350.252       |      ms |
|                                   50th percentile service time |                     expression |   331.126       |      ms |
|                                   90th percentile service time |                     expression |   334.905       |      ms |
|                                   99th percentile service time |                     expression |   348.412       |      ms |
|                                  100th percentile service time |                     expression |   348.679       |      ms |
|                                                     error rate |                     expression |     0           |       % |
|                                                 Min Throughput |                painless_static |     1.4         |   ops/s |
|                                                Mean Throughput |                painless_static |     1.4         |   ops/s |
|                                              Median Throughput |                painless_static |     1.4         |   ops/s |
|                                                 Max Throughput |                painless_static |     1.4         |   ops/s |
|                                        50th percentile latency |                painless_static |   424.343       |      ms |
|                                        90th percentile latency |                painless_static |   431.837       |      ms |
|                                        99th percentile latency |                painless_static |   441.434       |      ms |
|                                       100th percentile latency |                painless_static |   442.237       |      ms |
|                                   50th percentile service time |                painless_static |   422.939       |      ms |
|                                   90th percentile service time |                painless_static |   430.634       |      ms |
|                                   99th percentile service time |                painless_static |   440.14        |      ms |
|                                  100th percentile service time |                painless_static |   440.874       |      ms |
|                                                     error rate |                painless_static |     0           |       % |
|                                                 Min Throughput |               painless_dynamic |     1.4         |   ops/s |
|                                                Mean Throughput |               painless_dynamic |     1.4         |   ops/s |
|                                              Median Throughput |               painless_dynamic |     1.4         |   ops/s |
|                                                 Max Throughput |               painless_dynamic |     1.4         |   ops/s |
|                                        50th percentile latency |               painless_dynamic |   431.19        |      ms |
|                                        90th percentile latency |               painless_dynamic |   435.314       |      ms |
|                                        99th percentile latency |               painless_dynamic |   447.68        |      ms |
|                                       100th percentile latency |               painless_dynamic |   451.084       |      ms |
|                                   50th percentile service time |               painless_dynamic |   429.399       |      ms |
|                                   90th percentile service time |               painless_dynamic |   433.318       |      ms |
|                                   99th percentile service time |               painless_dynamic |   446.788       |      ms |
|                                  100th percentile service time |               painless_dynamic |   450.219       |      ms |
|                                                     error rate |               painless_dynamic |     0           |       % |
|                                                 Min Throughput | decay_geo_gauss_function_score |     1           |   ops/s |
|                                                Mean Throughput | decay_geo_gauss_function_score |     1           |   ops/s |
|                                              Median Throughput | decay_geo_gauss_function_score |     1           |   ops/s |
|                                                 Max Throughput | decay_geo_gauss_function_score |     1           |   ops/s |
|                                        50th percentile latency | decay_geo_gauss_function_score |   374.378       |      ms |
|                                        90th percentile latency | decay_geo_gauss_function_score |   389.191       |      ms |
|                                        99th percentile latency | decay_geo_gauss_function_score |   391.85        |      ms |
|                                       100th percentile latency | decay_geo_gauss_function_score |   392.543       |      ms |
|                                   50th percentile service time | decay_geo_gauss_function_score |   372.867       |      ms |
|                                   90th percentile service time | decay_geo_gauss_function_score |   387.9         |      ms |
|                                   99th percentile service time | decay_geo_gauss_function_score |   390.2         |      ms |
|                                  100th percentile service time | decay_geo_gauss_function_score |   390.619       |      ms |
|                                                     error rate | decay_geo_gauss_function_score |     0           |       % |
|                                                 Min Throughput |   decay_geo_gauss_script_score |     1           |   ops/s |
|                                                Mean Throughput |   decay_geo_gauss_script_score |     1           |   ops/s |
|                                              Median Throughput |   decay_geo_gauss_script_score |     1           |   ops/s |
|                                                 Max Throughput |   decay_geo_gauss_script_score |     1           |   ops/s |
|                                        50th percentile latency |   decay_geo_gauss_script_score |   401.698       |      ms |
|                                        90th percentile latency |   decay_geo_gauss_script_score |   410.091       |      ms |
|                                        99th percentile latency |   decay_geo_gauss_script_score |   428.357       |      ms |
|                                       100th percentile latency |   decay_geo_gauss_script_score |   431.643       |      ms |
|                                   50th percentile service time |   decay_geo_gauss_script_score |   399.86        |      ms |
|                                   90th percentile service time |   decay_geo_gauss_script_score |   408.528       |      ms |
|                                   99th percentile service time |   decay_geo_gauss_script_score |   426.957       |      ms |
|                                  100th percentile service time |   decay_geo_gauss_script_score |   430.013       |      ms |
|                                                     error rate |   decay_geo_gauss_script_score |     0           |       % |
|                                                 Min Throughput |     field_value_function_score |     1.5         |   ops/s |
|                                                Mean Throughput |     field_value_function_score |     1.5         |   ops/s |
|                                              Median Throughput |     field_value_function_score |     1.5         |   ops/s |
|                                                 Max Throughput |     field_value_function_score |     1.51        |   ops/s |
|                                        50th percentile latency |     field_value_function_score |   136.794       |      ms |
|                                        90th percentile latency |     field_value_function_score |   138.645       |      ms |
|                                        99th percentile latency |     field_value_function_score |   140.362       |      ms |
|                                       100th percentile latency |     field_value_function_score |   141.246       |      ms |
|                                   50th percentile service time |     field_value_function_score |   135.269       |      ms |
|                                   90th percentile service time |     field_value_function_score |   136.876       |      ms |
|                                   99th percentile service time |     field_value_function_score |   138.512       |      ms |
|                                  100th percentile service time |     field_value_function_score |   139.136       |      ms |
|                                                     error rate |     field_value_function_score |     0           |       % |
|                                                 Min Throughput |       field_value_script_score |     1.5         |   ops/s |
|                                                Mean Throughput |       field_value_script_score |     1.5         |   ops/s |
|                                              Median Throughput |       field_value_script_score |     1.5         |   ops/s |
|                                                 Max Throughput |       field_value_script_score |     1.5         |   ops/s |
|                                        50th percentile latency |       field_value_script_score |   197.463       |      ms |
|                                        90th percentile latency |       field_value_script_score |   201.006       |      ms |
|                                        99th percentile latency |       field_value_script_score |   241.363       |      ms |
|                                       100th percentile latency |       field_value_script_score |   269.868       |      ms |
|                                   50th percentile service time |       field_value_script_score |   195.967       |      ms |
|                                   90th percentile service time |       field_value_script_score |   199.248       |      ms |
|                                   99th percentile service time |       field_value_script_score |   239.841       |      ms |
|                                  100th percentile service time |       field_value_script_score |   268.763       |      ms |
|                                                     error rate |       field_value_script_score |     0           |       % |
|                                                 Min Throughput |                    large_terms |     1.1         |   ops/s |
|                                                Mean Throughput |                    large_terms |     1.1         |   ops/s |
|                                              Median Throughput |                    large_terms |     1.1         |   ops/s |
|                                                 Max Throughput |                    large_terms |     1.1         |   ops/s |
|                                        50th percentile latency |                    large_terms |   543.912       |      ms |
|                                        90th percentile latency |                    large_terms |   547.238       |      ms |
|                                        99th percentile latency |                    large_terms |   571.009       |      ms |
|                                       100th percentile latency |                    large_terms |   578.712       |      ms |
|                                   50th percentile service time |                    large_terms |   534.931       |      ms |
|                                   90th percentile service time |                    large_terms |   538.092       |      ms |
|                                   99th percentile service time |                    large_terms |   561.818       |      ms |
|                                  100th percentile service time |                    large_terms |   569.064       |      ms |
|                                                     error rate |                    large_terms |     0           |       % |
|                                                 Min Throughput |           large_filtered_terms |     1.1         |   ops/s |
|                                                Mean Throughput |           large_filtered_terms |     1.1         |   ops/s |
|                                              Median Throughput |           large_filtered_terms |     1.1         |   ops/s |
|                                                 Max Throughput |           large_filtered_terms |     1.1         |   ops/s |
|                                        50th percentile latency |           large_filtered_terms |   548.604       |      ms |
|                                        90th percentile latency |           large_filtered_terms |   555.543       |      ms |
|                                        99th percentile latency |           large_filtered_terms |   578.383       |      ms |
|                                       100th percentile latency |           large_filtered_terms |   586.958       |      ms |
|                                   50th percentile service time |           large_filtered_terms |   539.855       |      ms |
|                                   90th percentile service time |           large_filtered_terms |   546.654       |      ms |
|                                   99th percentile service time |           large_filtered_terms |   569.742       |      ms |
|                                  100th percentile service time |           large_filtered_terms |   578.006       |      ms |
|                                                     error rate |           large_filtered_terms |     0           |       % |
|                                                 Min Throughput |         large_prohibited_terms |     1.1         |   ops/s |
|                                                Mean Throughput |         large_prohibited_terms |     1.1         |   ops/s |
|                                              Median Throughput |         large_prohibited_terms |     1.1         |   ops/s |
|                                                 Max Throughput |         large_prohibited_terms |     1.1         |   ops/s |
|                                        50th percentile latency |         large_prohibited_terms |   527.754       |      ms |
|                                        90th percentile latency |         large_prohibited_terms |   531.253       |      ms |
|                                        99th percentile latency |         large_prohibited_terms |   535.281       |      ms |
|                                       100th percentile latency |         large_prohibited_terms |   535.465       |      ms |
|                                   50th percentile service time |         large_prohibited_terms |   518.843       |      ms |
|                                   90th percentile service time |         large_prohibited_terms |   522.73        |      ms |
|                                   99th percentile service time |         large_prohibited_terms |   526.707       |      ms |
|                                  100th percentile service time |         large_prohibited_terms |   527.02        |      ms |
|                                                     error rate |         large_prohibited_terms |     0           |       % |
|                                                 Min Throughput |           desc_sort_population |     1.5         |   ops/s |
|                                                Mean Throughput |           desc_sort_population |     1.51        |   ops/s |
|                                              Median Throughput |           desc_sort_population |     1.51        |   ops/s |
|                                                 Max Throughput |           desc_sort_population |     1.51        |   ops/s |
|                                        50th percentile latency |           desc_sort_population |     5.5268      |      ms |
|                                        90th percentile latency |           desc_sort_population |     5.91803     |      ms |
|                                        99th percentile latency |           desc_sort_population |     6.12278     |      ms |
|                                       100th percentile latency |           desc_sort_population |     6.14367     |      ms |
|                                   50th percentile service time |           desc_sort_population |     3.78196     |      ms |
|                                   90th percentile service time |           desc_sort_population |     3.97551     |      ms |
|                                   99th percentile service time |           desc_sort_population |     4.14683     |      ms |
|                                  100th percentile service time |           desc_sort_population |     4.19459     |      ms |
|                                                     error rate |           desc_sort_population |     0           |       % |
|                                                 Min Throughput |            asc_sort_population |     1.5         |   ops/s |
|                                                Mean Throughput |            asc_sort_population |     1.51        |   ops/s |
|                                              Median Throughput |            asc_sort_population |     1.51        |   ops/s |
|                                                 Max Throughput |            asc_sort_population |     1.51        |   ops/s |
|                                        50th percentile latency |            asc_sort_population |     4.73636     |      ms |
|                                        90th percentile latency |            asc_sort_population |     5.1548      |      ms |
|                                        99th percentile latency |            asc_sort_population |    41.998       |      ms |
|                                       100th percentile latency |            asc_sort_population |    78.611       |      ms |
|                                   50th percentile service time |            asc_sort_population |     3.0662      |      ms |
|                                   90th percentile service time |            asc_sort_population |     3.18948     |      ms |
|                                   99th percentile service time |            asc_sort_population |    39.9719      |      ms |
|                                  100th percentile service time |            asc_sort_population |    76.6236      |      ms |
|                                                     error rate |            asc_sort_population |     0           |       % |
|                                                 Min Throughput | asc_sort_with_after_population |     1.5         |   ops/s |
|                                                Mean Throughput | asc_sort_with_after_population |     1.51        |   ops/s |
|                                              Median Throughput | asc_sort_with_after_population |     1.51        |   ops/s |
|                                                 Max Throughput | asc_sort_with_after_population |     1.51        |   ops/s |
|                                        50th percentile latency | asc_sort_with_after_population |     6.19328     |      ms |
|                                        90th percentile latency | asc_sort_with_after_population |     6.72275     |      ms |
|                                        99th percentile latency | asc_sort_with_after_population |     6.88792     |      ms |
|                                       100th percentile latency | asc_sort_with_after_population |     6.89045     |      ms |
|                                   50th percentile service time | asc_sort_with_after_population |     4.54978     |      ms |
|                                   90th percentile service time | asc_sort_with_after_population |     4.88187     |      ms |
|                                   99th percentile service time | asc_sort_with_after_population |     5.03946     |      ms |
|                                  100th percentile service time | asc_sort_with_after_population |     5.05401     |      ms |
|                                                     error rate | asc_sort_with_after_population |     0           |       % |
|                                                 Min Throughput |            desc_sort_geonameid |     6.02        |   ops/s |
|                                                Mean Throughput |            desc_sort_geonameid |     6.02        |   ops/s |
|                                              Median Throughput |            desc_sort_geonameid |     6.02        |   ops/s |
|                                                 Max Throughput |            desc_sort_geonameid |     6.03        |   ops/s |
|                                        50th percentile latency |            desc_sort_geonameid |     5.61848     |      ms |
|                                        90th percentile latency |            desc_sort_geonameid |     5.94872     |      ms |
|                                        99th percentile latency |            desc_sort_geonameid |     6.14102     |      ms |
|                                       100th percentile latency |            desc_sort_geonameid |     6.14694     |      ms |
|                                   50th percentile service time |            desc_sort_geonameid |     4.4259      |      ms |
|                                   90th percentile service time |            desc_sort_geonameid |     4.68748     |      ms |
|                                   99th percentile service time |            desc_sort_geonameid |     4.97231     |      ms |
|                                  100th percentile service time |            desc_sort_geonameid |     5.01756     |      ms |
|                                                     error rate |            desc_sort_geonameid |     0           |       % |
|                                                 Min Throughput | desc_sort_with_after_geonameid |     6.02        |   ops/s |
|                                                Mean Throughput | desc_sort_with_after_geonameid |     6.02        |   ops/s |
|                                              Median Throughput | desc_sort_with_after_geonameid |     6.02        |   ops/s |
|                                                 Max Throughput | desc_sort_with_after_geonameid |     6.03        |   ops/s |
|                                        50th percentile latency | desc_sort_with_after_geonameid |    14.0931      |      ms |
|                                        90th percentile latency | desc_sort_with_after_geonameid |    16.115       |      ms |
|                                        99th percentile latency | desc_sort_with_after_geonameid |    18.0218      |      ms |
|                                       100th percentile latency | desc_sort_with_after_geonameid |    18.0224      |      ms |
|                                   50th percentile service time | desc_sort_with_after_geonameid |    12.9568      |      ms |
|                                   90th percentile service time | desc_sort_with_after_geonameid |    15.1084      |      ms |
|                                   99th percentile service time | desc_sort_with_after_geonameid |    16.7593      |      ms |
|                                  100th percentile service time | desc_sort_with_after_geonameid |    16.9827      |      ms |
|                                                     error rate | desc_sort_with_after_geonameid |     0           |       % |
|                                                 Min Throughput |             asc_sort_geonameid |     6.02        |   ops/s |
|                                                Mean Throughput |             asc_sort_geonameid |     6.02        |   ops/s |
|                                              Median Throughput |             asc_sort_geonameid |     6.02        |   ops/s |
|                                                 Max Throughput |             asc_sort_geonameid |     6.03        |   ops/s |
|                                        50th percentile latency |             asc_sort_geonameid |     4.62365     |      ms |
|                                        90th percentile latency |             asc_sort_geonameid |     5.05052     |      ms |
|                                        99th percentile latency |             asc_sort_geonameid |     5.22239     |      ms |
|                                       100th percentile latency |             asc_sort_geonameid |     5.25253     |      ms |
|                                   50th percentile service time |             asc_sort_geonameid |     3.44913     |      ms |
|                                   90th percentile service time |             asc_sort_geonameid |     3.60107     |      ms |
|                                   99th percentile service time |             asc_sort_geonameid |     3.68395     |      ms |
|                                  100th percentile service time |             asc_sort_geonameid |     3.71943     |      ms |
|                                                     error rate |             asc_sort_geonameid |     0           |       % |
|                                                 Min Throughput |  asc_sort_with_after_geonameid |     6.02        |   ops/s |
|                                                Mean Throughput |  asc_sort_with_after_geonameid |     6.02        |   ops/s |
|                                              Median Throughput |  asc_sort_with_after_geonameid |     6.02        |   ops/s |
|                                                 Max Throughput |  asc_sort_with_after_geonameid |     6.03        |   ops/s |
|                                        50th percentile latency |  asc_sort_with_after_geonameid |     5.03019     |      ms |
|                                        90th percentile latency |  asc_sort_with_after_geonameid |     5.4246      |      ms |
|                                        99th percentile latency |  asc_sort_with_after_geonameid |     5.76356     |      ms |
|                                       100th percentile latency |  asc_sort_with_after_geonameid |     5.85535     |      ms |
|                                   50th percentile service time |  asc_sort_with_after_geonameid |     3.75358     |      ms |
|                                   90th percentile service time |  asc_sort_with_after_geonameid |     4.06571     |      ms |
|                                   99th percentile service time |  asc_sort_with_after_geonameid |     4.25657     |      ms |
|                                  100th percentile service time |  asc_sort_with_after_geonameid |     4.2727      |      ms |
|                                                     error rate |  asc_sort_with_after_geonameid |     0           |       % |

Results from my branch, which includes the fix:

|                                                         Metric |                           Task |           Value |    Unit |
|---------------------------------------------------------------:|-------------------------------:|----------------:|--------:|
|                     Cumulative indexing time of primary shards |                                |    14.9185      |     min |
|             Min cumulative indexing time across primary shards |                                |     2.85472     |     min |
|          Median cumulative indexing time across primary shards |                                |     2.9371      |     min |
|             Max cumulative indexing time across primary shards |                                |     3.15958     |     min |
|            Cumulative indexing throttle time of primary shards |                                |     0           |     min |
|    Min cumulative indexing throttle time across primary shards |                                |     0           |     min |
| Median cumulative indexing throttle time across primary shards |                                |     0           |     min |
|    Max cumulative indexing throttle time across primary shards |                                |     0           |     min |
|                        Cumulative merge time of primary shards |                                |     0.159183    |     min |
|                       Cumulative merge count of primary shards |                                |     5           |         |
|                Min cumulative merge time across primary shards |                                |     0.00345     |     min |
|             Median cumulative merge time across primary shards |                                |     0.0157167   |     min |
|                Max cumulative merge time across primary shards |                                |     0.0852167   |     min |
|               Cumulative merge throttle time of primary shards |                                |     0.0234333   |     min |
|       Min cumulative merge throttle time across primary shards |                                |     0           |     min |
|    Median cumulative merge throttle time across primary shards |                                |     0           |     min |
|       Max cumulative merge throttle time across primary shards |                                |     0.0234333   |     min |
|                      Cumulative refresh time of primary shards |                                |     1.67967     |     min |
|                     Cumulative refresh count of primary shards |                                |    46           |         |
|              Min cumulative refresh time across primary shards |                                |     0.27745     |     min |
|           Median cumulative refresh time across primary shards |                                |     0.305833    |     min |
|              Max cumulative refresh time across primary shards |                                |     0.469167    |     min |
|                        Cumulative flush time of primary shards |                                |     1.25548     |     min |
|                       Cumulative flush count of primary shards |                                |    10           |         |
|                Min cumulative flush time across primary shards |                                |     0.21925     |     min |
|             Median cumulative flush time across primary shards |                                |     0.2472      |     min |
|                Max cumulative flush time across primary shards |                                |     0.27925     |     min |
|                                        Total Young Gen GC time |                                |     1.121       |       s |
|                                       Total Young Gen GC count |                                |    45           |         |
|                                          Total Old Gen GC time |                                |     0           |       s |
|                                         Total Old Gen GC count |                                |     0           |         |
|                                                     Store size |                                |     2.94586     |      GB |
|                                                  Translog size |                                |     2.56114e-07 |      GB |
|                                         Heap used for segments |                                |     0           |      MB |
|                                       Heap used for doc values |                                |     0           |      MB |
|                                            Heap used for terms |                                |     0           |      MB |
|                                            Heap used for norms |                                |     0           |      MB |
|                                           Heap used for points |                                |     0           |      MB |
|                                    Heap used for stored fields |                                |     0           |      MB |
|                                                  Segment count |                                |    89           |         |
|                                    Total Ingest Pipeline count |                                |     0           |         |
|                                     Total Ingest Pipeline time |                                |     0           |       s |
|                                   Total Ingest Pipeline failed |                                |     0           |         |
|                                                     error rate |                   index-append |     0           |       % |
|                                       100th percentile latency |            refresh-after-index | 10732.7         |      ms |
|                                  100th percentile service time |            refresh-after-index | 10732.7         |      ms |
|                                                     error rate |            refresh-after-index |   100           |       % |
|                                                 Min Throughput |                    index-stats |    89.98        |   ops/s |
|                                                Mean Throughput |                    index-stats |    89.98        |   ops/s |
|                                              Median Throughput |                    index-stats |    89.98        |   ops/s |
|                                                 Max Throughput |                    index-stats |    89.99        |   ops/s |
|                                        50th percentile latency |                    index-stats |     2.67475     |      ms |
|                                        90th percentile latency |                    index-stats |     3.49978     |      ms |
|                                        99th percentile latency |                    index-stats |     3.88879     |      ms |
|                                      99.9th percentile latency |                    index-stats |     5.07837     |      ms |
|                                       100th percentile latency |                    index-stats |     5.85732     |      ms |
|                                   50th percentile service time |                    index-stats |     1.36093     |      ms |
|                                   90th percentile service time |                    index-stats |     1.57976     |      ms |
|                                   99th percentile service time |                    index-stats |     1.97002     |      ms |
|                                 99.9th percentile service time |                    index-stats |     2.02621     |      ms |
|                                  100th percentile service time |                    index-stats |     2.03324     |      ms |
|                                                     error rate |                    index-stats |     0           |       % |
|                                                 Min Throughput |                     node-stats |    89.75        |   ops/s |
|                                                Mean Throughput |                     node-stats |    89.9         |   ops/s |
|                                              Median Throughput |                     node-stats |    89.92        |   ops/s |
|                                                 Max Throughput |                     node-stats |    89.95        |   ops/s |
|                                        50th percentile latency |                     node-stats |     3.23653     |      ms |
|                                        90th percentile latency |                     node-stats |     4.12945     |      ms |
|                                        99th percentile latency |                     node-stats |     5.11187     |      ms |
|                                      99.9th percentile latency |                     node-stats |     6.08176     |      ms |
|                                       100th percentile latency |                     node-stats |     6.18163     |      ms |
|                                   50th percentile service time |                     node-stats |     2.29206     |      ms |
|                                   90th percentile service time |                     node-stats |     2.76165     |      ms |
|                                   99th percentile service time |                     node-stats |     4.14635     |      ms |
|                                 99.9th percentile service time |                     node-stats |     5.36388     |      ms |
|                                  100th percentile service time |                     node-stats |     5.47781     |      ms |
|                                                     error rate |                     node-stats |     0           |       % |
|                                                 Min Throughput |                        default |    49.94        |   ops/s |
|                                                Mean Throughput |                        default |    49.96        |   ops/s |
|                                              Median Throughput |                        default |    49.97        |   ops/s |
|                                                 Max Throughput |                        default |    49.98        |   ops/s |
|                                        50th percentile latency |                        default |     3.10619     |      ms |
|                                        90th percentile latency |                        default |     4.08708     |      ms |
|                                        99th percentile latency |                        default |     4.64534     |      ms |
|                                      99.9th percentile latency |                        default |     8.88669     |      ms |
|                                       100th percentile latency |                        default |    10.9282      |      ms |
|                                   50th percentile service time |                        default |     1.95389     |      ms |
|                                   90th percentile service time |                        default |     2.36799     |      ms |
|                                   99th percentile service time |                        default |     2.92647     |      ms |
|                                 99.9th percentile service time |                        default |     7.24396     |      ms |
|                                  100th percentile service time |                        default |     9.73177     |      ms |
|                                                     error rate |                        default |     0           |       % |
|                                                 Min Throughput |                           term |    99.73        |   ops/s |
|                                                Mean Throughput |                           term |    99.83        |   ops/s |
|                                              Median Throughput |                           term |    99.85        |   ops/s |
|                                                 Max Throughput |                           term |    99.89        |   ops/s |
|                                        50th percentile latency |                           term |     2.92205     |      ms |
|                                        90th percentile latency |                           term |     3.42724     |      ms |
|                                        99th percentile latency |                           term |     3.82527     |      ms |
|                                      99.9th percentile latency |                           term |     7.63367     |      ms |
|                                       100th percentile latency |                           term |    11.1277      |      ms |
|                                   50th percentile service time |                           term |     2.12522     |      ms |
|                                   90th percentile service time |                           term |     2.43555     |      ms |
|                                   99th percentile service time |                           term |     2.75201     |      ms |
|                                 99.9th percentile service time |                           term |     6.49252     |      ms |
|                                  100th percentile service time |                           term |     9.82611     |      ms |
|                                                     error rate |                           term |     0           |       % |
|                                                 Min Throughput |                         phrase |   109.73        |   ops/s |
|                                                Mean Throughput |                         phrase |   109.83        |   ops/s |
|                                              Median Throughput |                         phrase |   109.85        |   ops/s |
|                                                 Max Throughput |                         phrase |   109.9         |   ops/s |
|                                        50th percentile latency |                         phrase |     2.53479     |      ms |
|                                        90th percentile latency |                         phrase |     3.01195     |      ms |
|                                        99th percentile latency |                         phrase |     3.54029     |      ms |
|                                      99.9th percentile latency |                         phrase |    15.006       |      ms |
|                                       100th percentile latency |                         phrase |    18.005       |      ms |
|                                   50th percentile service time |                         phrase |     1.74877     |      ms |
|                                   90th percentile service time |                         phrase |     2.10537     |      ms |
|                                   99th percentile service time |                         phrase |     2.60654     |      ms |
|                                 99.9th percentile service time |                         phrase |    10.0943      |      ms |
|                                  100th percentile service time |                         phrase |    17.3029      |      ms |
|                                                     error rate |                         phrase |     0           |       % |
|                                                 Min Throughput |           country_agg_uncached |     3           |   ops/s |
|                                                Mean Throughput |           country_agg_uncached |     3           |   ops/s |
|                                              Median Throughput |           country_agg_uncached |     3           |   ops/s |
|                                                 Max Throughput |           country_agg_uncached |     3           |   ops/s |
|                                        50th percentile latency |           country_agg_uncached |   138.293       |      ms |
|                                        90th percentile latency |           country_agg_uncached |   149.663       |      ms |
|                                        99th percentile latency |           country_agg_uncached |   156.159       |      ms |
|                                       100th percentile latency |           country_agg_uncached |   158.189       |      ms |
|                                   50th percentile service time |           country_agg_uncached |   136.876       |      ms |
|                                   90th percentile service time |           country_agg_uncached |   148.403       |      ms |
|                                   99th percentile service time |           country_agg_uncached |   154.805       |      ms |
|                                  100th percentile service time |           country_agg_uncached |   156.452       |      ms |
|                                                     error rate |           country_agg_uncached |     0           |       % |
|                                                 Min Throughput |             country_agg_cached |    98.58        |   ops/s |
|                                                Mean Throughput |             country_agg_cached |    99           |   ops/s |
|                                              Median Throughput |             country_agg_cached |    99.05        |   ops/s |
|                                                 Max Throughput |             country_agg_cached |    99.29        |   ops/s |
|                                        50th percentile latency |             country_agg_cached |     2.24814     |      ms |
|                                        90th percentile latency |             country_agg_cached |     3.45649     |      ms |
|                                        99th percentile latency |             country_agg_cached |     3.76037     |      ms |
|                                      99.9th percentile latency |             country_agg_cached |     3.91796     |      ms |
|                                       100th percentile latency |             country_agg_cached |     3.9418      |      ms |
|                                   50th percentile service time |             country_agg_cached |     1.35517     |      ms |
|                                   90th percentile service time |             country_agg_cached |     1.68261     |      ms |
|                                   99th percentile service time |             country_agg_cached |     2.06296     |      ms |
|                                 99.9th percentile service time |             country_agg_cached |     2.29668     |      ms |
|                                  100th percentile service time |             country_agg_cached |     2.31993     |      ms |
|                                                     error rate |             country_agg_cached |     0           |       % |
|                                                 Min Throughput |                         scroll |    20.05        | pages/s |
|                                                Mean Throughput |                         scroll |    20.05        | pages/s |
|                                              Median Throughput |                         scroll |    20.05        | pages/s |
|                                                 Max Throughput |                         scroll |    20.07        | pages/s |
|                                        50th percentile latency |                         scroll |   136.009       |      ms |
|                                        90th percentile latency |                         scroll |   142.206       |      ms |
|                                        99th percentile latency |                         scroll |   147.463       |      ms |
|                                       100th percentile latency |                         scroll |   149.218       |      ms |
|                                   50th percentile service time |                         scroll |   133.091       |      ms |
|                                   90th percentile service time |                         scroll |   139.803       |      ms |
|                                   99th percentile service time |                         scroll |   145.222       |      ms |
|                                  100th percentile service time |                         scroll |   146.898       |      ms |
|                                                     error rate |                         scroll |     0           |       % |
|                                                 Min Throughput |                     expression |     1.5         |   ops/s |
|                                                Mean Throughput |                     expression |     1.5         |   ops/s |
|                                              Median Throughput |                     expression |     1.5         |   ops/s |
|                                                 Max Throughput |                     expression |     1.5         |   ops/s |
|                                        50th percentile latency |                     expression |   321.258       |      ms |
|                                        90th percentile latency |                     expression |   334.713       |      ms |
|                                        99th percentile latency |                     expression |   342.656       |      ms |
|                                       100th percentile latency |                     expression |   343.252       |      ms |
|                                   50th percentile service time |                     expression |   319.339       |      ms |
|                                   90th percentile service time |                     expression |   332.754       |      ms |
|                                   99th percentile service time |                     expression |   340.651       |      ms |
|                                  100th percentile service time |                     expression |   340.659       |      ms |
|                                                     error rate |                     expression |     0           |       % |
|                                                 Min Throughput |                painless_static |     1.4         |   ops/s |
|                                                Mean Throughput |                painless_static |     1.4         |   ops/s |
|                                              Median Throughput |                painless_static |     1.4         |   ops/s |
|                                                 Max Throughput |                painless_static |     1.4         |   ops/s |
|                                        50th percentile latency |                painless_static |   399.607       |      ms |
|                                        90th percentile latency |                painless_static |   411.235       |      ms |
|                                        99th percentile latency |                painless_static |   445.419       |      ms |
|                                       100th percentile latency |                painless_static |   466.456       |      ms |
|                                   50th percentile service time |                painless_static |   398.064       |      ms |
|                                   90th percentile service time |                painless_static |   409.713       |      ms |
|                                   99th percentile service time |                painless_static |   443.995       |      ms |
|                                  100th percentile service time |                painless_static |   465.206       |      ms |
|                                                     error rate |                painless_static |     0           |       % |
|                                                 Min Throughput |               painless_dynamic |     1.4         |   ops/s |
|                                                Mean Throughput |               painless_dynamic |     1.4         |   ops/s |
|                                              Median Throughput |               painless_dynamic |     1.4         |   ops/s |
|                                                 Max Throughput |               painless_dynamic |     1.4         |   ops/s |
|                                        50th percentile latency |               painless_dynamic |   403.304       |      ms |
|                                        90th percentile latency |               painless_dynamic |   415.35        |      ms |
|                                        99th percentile latency |               painless_dynamic |   420.506       |      ms |
|                                       100th percentile latency |               painless_dynamic |   421.215       |      ms |
|                                   50th percentile service time |               painless_dynamic |   402.141       |      ms |
|                                   90th percentile service time |               painless_dynamic |   413.886       |      ms |
|                                   99th percentile service time |               painless_dynamic |   419.586       |      ms |
|                                  100th percentile service time |               painless_dynamic |   420.293       |      ms |
|                                                     error rate |               painless_dynamic |     0           |       % |
|                                                 Min Throughput | decay_geo_gauss_function_score |     1           |   ops/s |
|                                                Mean Throughput | decay_geo_gauss_function_score |     1           |   ops/s |
|                                              Median Throughput | decay_geo_gauss_function_score |     1           |   ops/s |
|                                                 Max Throughput | decay_geo_gauss_function_score |     1           |   ops/s |
|                                        50th percentile latency | decay_geo_gauss_function_score |   352.713       |      ms |
|                                        90th percentile latency | decay_geo_gauss_function_score |   354.868       |      ms |
|                                        99th percentile latency | decay_geo_gauss_function_score |   362.09        |      ms |
|                                       100th percentile latency | decay_geo_gauss_function_score |   366.608       |      ms |
|                                   50th percentile service time | decay_geo_gauss_function_score |   351.167       |      ms |
|                                   90th percentile service time | decay_geo_gauss_function_score |   353.26        |      ms |
|                                   99th percentile service time | decay_geo_gauss_function_score |   360.291       |      ms |
|                                  100th percentile service time | decay_geo_gauss_function_score |   365.044       |      ms |
|                                                     error rate | decay_geo_gauss_function_score |     0           |       % |
|                                                 Min Throughput |   decay_geo_gauss_script_score |     1           |   ops/s |
|                                                Mean Throughput |   decay_geo_gauss_script_score |     1           |   ops/s |
|                                              Median Throughput |   decay_geo_gauss_script_score |     1           |   ops/s |
|                                                 Max Throughput |   decay_geo_gauss_script_score |     1           |   ops/s |
|                                        50th percentile latency |   decay_geo_gauss_script_score |   384.917       |      ms |
|                                        90th percentile latency |   decay_geo_gauss_script_score |   393.224       |      ms |
|                                        99th percentile latency |   decay_geo_gauss_script_score |   403.572       |      ms |
|                                       100th percentile latency |   decay_geo_gauss_script_score |   404.466       |      ms |
|                                   50th percentile service time |   decay_geo_gauss_script_score |   383.385       |      ms |
|                                   90th percentile service time |   decay_geo_gauss_script_score |   391.183       |      ms |
|                                   99th percentile service time |   decay_geo_gauss_script_score |   402.103       |      ms |
|                                  100th percentile service time |   decay_geo_gauss_script_score |   402.781       |      ms |
|                                                     error rate |   decay_geo_gauss_script_score |     0           |       % |
|                                                 Min Throughput |     field_value_function_score |     1.5         |   ops/s |
|                                                Mean Throughput |     field_value_function_score |     1.5         |   ops/s |
|                                              Median Throughput |     field_value_function_score |     1.5         |   ops/s |
|                                                 Max Throughput |     field_value_function_score |     1.5         |   ops/s |
|                                        50th percentile latency |     field_value_function_score |   129.766       |      ms |
|                                        90th percentile latency |     field_value_function_score |   141.085       |      ms |
|                                        99th percentile latency |     field_value_function_score |   145.17        |      ms |
|                                       100th percentile latency |     field_value_function_score |   145.696       |      ms |
|                                   50th percentile service time |     field_value_function_score |   128.224       |      ms |
|                                   90th percentile service time |     field_value_function_score |   139.273       |      ms |
|                                   99th percentile service time |     field_value_function_score |   143.382       |      ms |
|                                  100th percentile service time |     field_value_function_score |   144.433       |      ms |
|                                                     error rate |     field_value_function_score |     0           |       % |
|                                                 Min Throughput |       field_value_script_score |     1.5         |   ops/s |
|                                                Mean Throughput |       field_value_script_score |     1.5         |   ops/s |
|                                              Median Throughput |       field_value_script_score |     1.5         |   ops/s |
|                                                 Max Throughput |       field_value_script_score |     1.5         |   ops/s |
|                                        50th percentile latency |       field_value_script_score |   201.995       |      ms |
|                                        90th percentile latency |       field_value_script_score |   209.175       |      ms |
|                                        99th percentile latency |       field_value_script_score |   245.659       |      ms |
|                                       100th percentile latency |       field_value_script_score |   277.35        |      ms |
|                                   50th percentile service time |       field_value_script_score |   200.701       |      ms |
|                                   90th percentile service time |       field_value_script_score |   207.558       |      ms |
|                                   99th percentile service time |       field_value_script_score |   243.796       |      ms |
|                                  100th percentile service time |       field_value_script_score |   276.325       |      ms |
|                                                     error rate |       field_value_script_score |     0           |       % |
|                                                 Min Throughput |                    large_terms |     1.1         |   ops/s |
|                                                Mean Throughput |                    large_terms |     1.1         |   ops/s |
|                                              Median Throughput |                    large_terms |     1.1         |   ops/s |
|                                                 Max Throughput |                    large_terms |     1.1         |   ops/s |
|                                        50th percentile latency |                    large_terms |   558.621       |      ms |
|                                        90th percentile latency |                    large_terms |   575.088       |      ms |
|                                        99th percentile latency |                    large_terms |   584.8         |      ms |
|                                       100th percentile latency |                    large_terms |   585.265       |      ms |
|                                   50th percentile service time |                    large_terms |   549.546       |      ms |
|                                   90th percentile service time |                    large_terms |   565.672       |      ms |
|                                   99th percentile service time |                    large_terms |   575.41        |      ms |
|                                  100th percentile service time |                    large_terms |   575.685       |      ms |
|                                                     error rate |                    large_terms |     0           |       % |
|                                                 Min Throughput |           large_filtered_terms |     1.1         |   ops/s |
|                                                Mean Throughput |           large_filtered_terms |     1.1         |   ops/s |
|                                              Median Throughput |           large_filtered_terms |     1.1         |   ops/s |
|                                                 Max Throughput |           large_filtered_terms |     1.1         |   ops/s |
|                                        50th percentile latency |           large_filtered_terms |   562.047       |      ms |
|                                        90th percentile latency |           large_filtered_terms |   579.132       |      ms |
|                                        99th percentile latency |           large_filtered_terms |   587.518       |      ms |
|                                       100th percentile latency |           large_filtered_terms |   589.435       |      ms |
|                                   50th percentile service time |           large_filtered_terms |   553.251       |      ms |
|                                   90th percentile service time |           large_filtered_terms |   570.79        |      ms |
|                                   99th percentile service time |           large_filtered_terms |   579.485       |      ms |
|                                  100th percentile service time |           large_filtered_terms |   581.571       |      ms |
|                                                     error rate |           large_filtered_terms |     0           |       % |
|                                                 Min Throughput |         large_prohibited_terms |     1.1         |   ops/s |
|                                                Mean Throughput |         large_prohibited_terms |     1.1         |   ops/s |
|                                              Median Throughput |         large_prohibited_terms |     1.1         |   ops/s |
|                                                 Max Throughput |         large_prohibited_terms |     1.1         |   ops/s |
|                                        50th percentile latency |         large_prohibited_terms |   544.585       |      ms |
|                                        90th percentile latency |         large_prohibited_terms |   562.376       |      ms |
|                                        99th percentile latency |         large_prohibited_terms |   576.493       |      ms |
|                                       100th percentile latency |         large_prohibited_terms |   576.652       |      ms |
|                                   50th percentile service time |         large_prohibited_terms |   536.069       |      ms |
|                                   90th percentile service time |         large_prohibited_terms |   553.927       |      ms |
|                                   99th percentile service time |         large_prohibited_terms |   567.9         |      ms |
|                                  100th percentile service time |         large_prohibited_terms |   568.545       |      ms |
|                                                     error rate |         large_prohibited_terms |     0           |       % |
|                                                 Min Throughput |           desc_sort_population |     1.5         |   ops/s |
|                                                Mean Throughput |           desc_sort_population |     1.51        |   ops/s |
|                                              Median Throughput |           desc_sort_population |     1.51        |   ops/s |
|                                                 Max Throughput |           desc_sort_population |     1.51        |   ops/s |
|                                        50th percentile latency |           desc_sort_population |     5.83199     |      ms |
|                                        90th percentile latency |           desc_sort_population |     6.40678     |      ms |
|                                        99th percentile latency |           desc_sort_population |     6.56707     |      ms |
|                                       100th percentile latency |           desc_sort_population |     6.62493     |      ms |
|                                   50th percentile service time |           desc_sort_population |     4.17976     |      ms |
|                                   90th percentile service time |           desc_sort_population |     4.36236     |      ms |
|                                   99th percentile service time |           desc_sort_population |     4.56165     |      ms |
|                                  100th percentile service time |           desc_sort_population |     4.56992     |      ms |
|                                                     error rate |           desc_sort_population |     0           |       % |
|                                                 Min Throughput |            asc_sort_population |     1.5         |   ops/s |
|                                                Mean Throughput |            asc_sort_population |     1.51        |   ops/s |
|                                              Median Throughput |            asc_sort_population |     1.51        |   ops/s |
|                                                 Max Throughput |            asc_sort_population |     1.51        |   ops/s |
|                                        50th percentile latency |            asc_sort_population |     5.32994     |      ms |
|                                        90th percentile latency |            asc_sort_population |     5.82029     |      ms |
|                                        99th percentile latency |            asc_sort_population |     6.22858     |      ms |
|                                       100th percentile latency |            asc_sort_population |     6.38707     |      ms |
|                                   50th percentile service time |            asc_sort_population |     3.6178      |      ms |
|                                   90th percentile service time |            asc_sort_population |     3.80185     |      ms |
|                                   99th percentile service time |            asc_sort_population |     3.98445     |      ms |
|                                  100th percentile service time |            asc_sort_population |     4.01286     |      ms |
|                                                     error rate |            asc_sort_population |     0           |       % |
|                                                 Min Throughput | asc_sort_with_after_population |     1.5         |   ops/s |
|                                                Mean Throughput | asc_sort_with_after_population |     1.51        |   ops/s |
|                                              Median Throughput | asc_sort_with_after_population |     1.51        |   ops/s |
|                                                 Max Throughput | asc_sort_with_after_population |     1.51        |   ops/s |
|                                        50th percentile latency | asc_sort_with_after_population |     6.4384      |      ms |
|                                        90th percentile latency | asc_sort_with_after_population |     7.0876      |      ms |
|                                        99th percentile latency | asc_sort_with_after_population |     7.36099     |      ms |
|                                       100th percentile latency | asc_sort_with_after_population |     7.43844     |      ms |
|                                   50th percentile service time | asc_sort_with_after_population |     4.91549     |      ms |
|                                   90th percentile service time | asc_sort_with_after_population |     5.10619     |      ms |
|                                   99th percentile service time | asc_sort_with_after_population |     5.27522     |      ms |
|                                  100th percentile service time | asc_sort_with_after_population |     5.29588     |      ms |
|                                                     error rate | asc_sort_with_after_population |     0           |       % |
|                                                 Min Throughput |            desc_sort_geonameid |     6.02        |   ops/s |
|                                                Mean Throughput |            desc_sort_geonameid |     6.02        |   ops/s |
|                                              Median Throughput |            desc_sort_geonameid |     6.02        |   ops/s |
|                                                 Max Throughput |            desc_sort_geonameid |     6.03        |   ops/s |
|                                        50th percentile latency |            desc_sort_geonameid |     5.48076     |      ms |
|                                        90th percentile latency |            desc_sort_geonameid |     6.00734     |      ms |
|                                        99th percentile latency |            desc_sort_geonameid |     6.26044     |      ms |
|                                       100th percentile latency |            desc_sort_geonameid |     6.29781     |      ms |
|                                   50th percentile service time |            desc_sort_geonameid |     4.34716     |      ms |
|                                   90th percentile service time |            desc_sort_geonameid |     4.5983      |      ms |
|                                   99th percentile service time |            desc_sort_geonameid |     4.80134     |      ms |
|                                  100th percentile service time |            desc_sort_geonameid |     4.82802     |      ms |
|                                                     error rate |            desc_sort_geonameid |     0           |       % |
|                                                 Min Throughput | desc_sort_with_after_geonameid |     6.02        |   ops/s |
|                                                Mean Throughput | desc_sort_with_after_geonameid |     6.02        |   ops/s |
|                                              Median Throughput | desc_sort_with_after_geonameid |     6.02        |   ops/s |
|                                                 Max Throughput | desc_sort_with_after_geonameid |     6.02        |   ops/s |
|                                        50th percentile latency | desc_sort_with_after_geonameid |    18.3231      |      ms |
|                                        90th percentile latency | desc_sort_with_after_geonameid |    20.3584      |      ms |
|                                        99th percentile latency | desc_sort_with_after_geonameid |    27.2554      |      ms |
|                                       100th percentile latency | desc_sort_with_after_geonameid |    27.3688      |      ms |
|                                   50th percentile service time | desc_sort_with_after_geonameid |    17.168       |      ms |
|                                   90th percentile service time | desc_sort_with_after_geonameid |    19.3803      |      ms |
|                                   99th percentile service time | desc_sort_with_after_geonameid |    25.7227      |      ms |
|                                  100th percentile service time | desc_sort_with_after_geonameid |    25.8356      |      ms |
|                                                     error rate | desc_sort_with_after_geonameid |     0           |       % |
|                                                 Min Throughput |             asc_sort_geonameid |     6.02        |   ops/s |
|                                                Mean Throughput |             asc_sort_geonameid |     6.02        |   ops/s |
|                                              Median Throughput |             asc_sort_geonameid |     6.02        |   ops/s |
|                                                 Max Throughput |             asc_sort_geonameid |     6.03        |   ops/s |
|                                        50th percentile latency |             asc_sort_geonameid |     4.92353     |      ms |
|                                        90th percentile latency |             asc_sort_geonameid |     5.24371     |      ms |
|                                        99th percentile latency |             asc_sort_geonameid |     5.54651     |      ms |
|                                       100th percentile latency |             asc_sort_geonameid |     5.55844     |      ms |
|                                   50th percentile service time |             asc_sort_geonameid |     3.74649     |      ms |
|                                   90th percentile service time |             asc_sort_geonameid |     3.89364     |      ms |
|                                   99th percentile service time |             asc_sort_geonameid |     4.0023      |      ms |
|                                  100th percentile service time |             asc_sort_geonameid |     4.00615     |      ms |
|                                                     error rate |             asc_sort_geonameid |     0           |       % |
|                                                 Min Throughput |  asc_sort_with_after_geonameid |     6.02        |   ops/s |
|                                                Mean Throughput |  asc_sort_with_after_geonameid |     6.02        |   ops/s |
|                                              Median Throughput |  asc_sort_with_after_geonameid |     6.02        |   ops/s |
|                                                 Max Throughput |  asc_sort_with_after_geonameid |     6.03        |   ops/s |
|                                        50th percentile latency |  asc_sort_with_after_geonameid |     5.07973     |      ms |
|                                        90th percentile latency |  asc_sort_with_after_geonameid |     5.53334     |      ms |
|                                        99th percentile latency |  asc_sort_with_after_geonameid |     5.83067     |      ms |
|                                       100th percentile latency |  asc_sort_with_after_geonameid |     5.83473     |      ms |
|                                   50th percentile service time |  asc_sort_with_after_geonameid |     4.00845     |      ms |
|                                   90th percentile service time |  asc_sort_with_after_geonameid |     4.19059     |      ms |
|                                   99th percentile service time |  asc_sort_with_after_geonameid |     4.40769     |      ms |
|                                  100th percentile service time |  asc_sort_with_after_geonameid |     4.41827     |      ms |
|                                                     error rate |  asc_sort_with_after_geonameid |     0           |       % |

jpountz · 2023-02-02T09:37:08Z

Sorry for the lag @javanna and thanks for running benchmarks. If the slowdown to match_all and term queries is not significant, then I feel better about the performance impact of this change. It's a great cleanup/simplification.

This reverts commit 283f8ac.

This reverts commit 283f8ac in the 8.7 branch (#89047). We have found a performance regression around executing search requests with size greater than zero that hold queries that can shortcut their total hit count, like term and match_all. The previous shortcut total hit count optimization done in ES was able to shortcut those while the top score docs collector in Lucene does not support that. This can be improved further on main but for 8.7 we are going the safe path of reverting and leaving things how they were.

We removed shortcut total hit count in elastic#89047 and later noticed a couple of benchmark regressions. When not collecting hits (e.g. size=0), Elasticsearch uses TotalHitCountCollector and the shortcutting is supported natively in Lucene, so in that case we have moved to skip counting whenever possible. When hits are collected, the total hit count is computed as part of the collection in TopScoreDocCollector and TopFieldCollector, where Lucene does not support skipping the counting because it is hard to determine whether more competitive hits need to be collected. The previous change caused a regression specifically when collecting hits, because we removed our manual shortcut in favour of counting, which adds overhead. With this change we reintroduce the shortcut total hit count method and only use it when strictly necessary: when size is 0 we rely entirely on Lucene to shortcut the total hit counting, while when hits are collected we do it our way, for now. While at it, a few more tests are added to cover situations that were not covered before.
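The split described above can be sketched roughly as follows. This is a minimal illustration, not the actual TopDocsCollectorContext code: the class, enum, and method names are hypothetical, and `shortcutAvailable` is an assumed flag standing in for "the query is one the manual shortcut supports". The threshold of 1 mirrors the review comment about telling the top docs collector that it does not have to count hits.

```java
/** Hypothetical sketch: these names are illustrative, not Elasticsearch or Lucene APIs. */
final class HitCountStrategySketch {

    enum Strategy {
        LUCENE_COUNT_ONLY,      // size == 0: count-only collection, Lucene shortcuts natively
        PRECOMPUTED_SHORTCUT,   // hits collected, count known up front from segment statistics
        COUNT_WHILE_COLLECTING  // hits collected, no shortcut available
    }

    /**
     * @param size              number of hits requested
     * @param shortcutAvailable assumed flag: the query is supported by the manual shortcut
     */
    static Strategy choose(int size, boolean shortcutAvailable) {
        if (size == 0) {
            return Strategy.LUCENE_COUNT_ONLY;
        }
        return shortcutAvailable ? Strategy.PRECOMPUTED_SHORTCUT : Strategy.COUNT_WHILE_COLLECTING;
    }

    /** Counting threshold handed to the top docs collector when hits are collected. */
    static int topDocsCountingThreshold(Strategy strategy, int trackTotalHitsUpTo) {
        if (strategy == Strategy.LUCENE_COUNT_ONLY) {
            throw new IllegalArgumentException("no top docs collector is used when size is 0");
        }
        // A threshold of 1 means the collector need not count hits, so it can start skipping early.
        return strategy == Strategy.PRECOMPUTED_SHORTCUT ? 1 : trackTotalHitsUpTo;
    }

    public static void main(String[] args) {
        Strategy strategy = choose(10, true);
        System.out.println(strategy + " -> threshold " + topDocsCountingThreshold(strategy, 10_000));
    }
}
```

With the shortcut available the collector threshold drops to 1; without it the collector has to count up to trackTotalHitsUpTo, which is where the regression was observed.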

Reverts #89047. We removed shortcut total hit count with #89047 and later noticed a couple of benchmark regressions. This PR reverts that change and reinstates the original logic for shortcut total hit count.

We removed shortcut total hit count with #89047 and later noticed a couple of benchmark regressions, which made us restore our shortcut total hit count mechanism. When not collecting hits (e.g. size=0), we can leverage Lucene's skipping mechanism instead of our handmade shortcut total hit count, as Elasticsearch uses TotalHitCountCollector, which calls Weight#count. The advantage is that it supports shortcutting for many more queries than the three that our manual mechanism supports (match_all, term and field exists). While at it, a few more tests are added to cover situations that were not covered before.
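To illustrate why a Weight#count-style shortcut saves work when size is 0, here is a self-contained sketch that does not use Lucene. It assumes the convention, as I understand Lucene's contract, that a per-segment count of -1 means the count cannot be computed cheaply and documents have to be visited instead. `Segment`, `statsCount`, and `matchingDocs` are hypothetical stand-ins, not Lucene types.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of per-segment counting with a statistics-based shortcut. */
final class SegmentCountSketch {

    /** Stand-in for one index segment. */
    static final class Segment {
        final int statsCount;   // count derived from segment statistics, or -1 if unavailable
        final int matchingDocs; // number of matches found only by visiting documents

        Segment(int statsCount, int matchingDocs) {
            this.statsCount = statsCount;
            this.matchingDocs = matchingDocs;
        }
    }

    /** Returns a two-element array: {totalHits, docsVisited}. */
    static int[] count(List<Segment> segments) {
        int totalHits = 0;
        int docsVisited = 0;
        for (Segment segment : segments) {
            if (segment.statsCount != -1) {
                // Shortcut: take the count from statistics, visit nothing.
                totalHits += segment.statsCount;
            } else {
                // Fallback: simulate visiting every matching document.
                for (int doc = 0; doc < segment.matchingDocs; doc++) {
                    totalHits++;
                    docsVisited++;
                }
            }
        }
        return new int[] { totalHits, docsVisited };
    }

    public static void main(String[] args) {
        List<Segment> segments = Arrays.asList(new Segment(1000, 1000), new Segment(-1, 250));
        int[] result = count(segments);
        System.out.println("totalHits=" + result[0] + " docsVisited=" + result[1]);
        // prints "totalHits=1250 docsVisited=250"
    }
}
```

Only the segment without usable statistics pays the cost of visiting documents; the more query types that can answer from statistics, the fewer segments fall back to iteration.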

elasticsearchmachine added the v8.5.0 label Aug 2, 2022

javanna added :Search/Search Search-related issues that do not fall into other categories >refactoring labels Aug 2, 2022

javanna requested review from jpountz and dnhatn August 2, 2022 19:44

javanna marked this pull request as ready for review August 2, 2022 19:45

elasticsearchmachine added the Team:Search Meta label for search team label Aug 2, 2022

javanna mentioned this pull request Aug 2, 2022

Use Weight#count API instead shortcutTotalHitCount #84778

Closed

dnhatn approved these changes Aug 2, 2022

jpountz reviewed Aug 3, 2022

csoulios added v8.6.0 and removed v8.5.0 labels Sep 21, 2022

Merge branch 'main' into refactoring/remove_shortcut_total_hit_count

de01d55

kingherc added v8.7.0 and removed v8.6.0 labels Nov 16, 2022

Merge branch 'main' into refactoring/remove_shortcut_total_hit_count

e90a295

Merge branch 'main' into refactoring/remove_shortcut_total_hit_count

4474d47

javanna merged commit 283f8ac into elastic:main Feb 6, 2023

javanna deleted the refactoring/remove_shortcut_total_hit_count branch February 6, 2023 10:24

javanna added a commit to javanna/elasticsearch that referenced this pull request Feb 27, 2023

Revert "Remove shortcutTotalHitCount optimization (elastic#89047)"

e1037ba

This reverts commit 283f8ac.

javanna mentioned this pull request Feb 27, 2023

Revert "Remove shortcutTotalHitCount optimization (#89047)" #94159

Merged

javanna mentioned this pull request Feb 27, 2023

Reintroduce shortcut total hit count when collecting hits #94170

Closed

javanna removed the v8.7.0 label Feb 28, 2023

javanna added the v8.8.0 label Mar 10, 2023

javanna mentioned this pull request Mar 29, 2023

Leverage Weight#count when size is set to 0 #94858

Merged

javanna added a commit that referenced this pull request Mar 29, 2023

Revert "Remove shortcutTotalHitCount optimization (#89047)"

6d40c5d

This reverts commit 283f8ac.

javanna mentioned this pull request Mar 29, 2023

Revert "Remove shortcutTotalHitCount optimization" #94876

Merged

javanna removed the v8.8.0 label Mar 29, 2023

javanna mentioned this pull request Mar 29, 2023

Clean up shortcutTotalHitCount using the new Weight#count API #81034

Open
