Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize sort on numeric long and date fields #39770

Conversation

mayya-sharipova
Copy link
Contributor

@mayya-sharipova mayya-sharipova commented Mar 6, 2019

Rewrite sort on numeric long or date field,
as LongPoint.newDistanceFeatureQuery.
This allow to significantly speed up the sorting.

The rewritten query will be be a bool query,
with FILTER clause on the original query,
and SHOULD clause on LongDistanceFeatureQuery.

closes #37043

@mayya-sharipova mayya-sharipova added the :Search/Search Search-related issues that do not fall into other categories label Mar 6, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search


// 2. Iterate over DocValues and fill fieldDocs
Long missingValue = (Long) sortFields[0].getMissingValue(); // for rewritten numeric sort, when we had a single sort on Long field
SortedNumericDocValues sdvs = MultiDocValues.getSortedNumericValues(reader, sortFields[0].getField());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpountz Is it fine to use MultiDocValues.getSortedNumericValues or is there is some other way to get values for docs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this could be simplified if we add the numeric field as the secondary sort. This way you can access the values directly in the returned field docs ?

@mayya-sharipova
Copy link
Contributor Author

mayya-sharipova commented Mar 29, 2019

@jpountz Adrien, I have finished the implementation of hits optimization on sort, but would like to get your feedback before writing tests

The main question is what to do with multiple values per document? Two challenges here:

  1. LongDistanceFeatureQuery has its own way to select a value, which can be different from what a user specifies in sort mode (min, max, avg, median)
  2. Difficulty in filling a proper value when covering from topDocs to fieldDocs

To resolve an issue with multiple values, we can either:

  1. Detect that documents have multiple values and don't run optimizations in this case ( I don't know how to do this detection efficiently except just iterating through docValues and checking docValuesCount for every doc)
  2. Add another parameter to the sort request (e.g. "optimized" : true, which is by default is false). With this parameter set, the user gives consent to run the optimization. In this case the user is either having only single valued docs, or agrees that a value is picked from multiple values based on the algorithm from LongDistanceFeatureQuery

What do you think?

@mayya-sharipova mayya-sharipova marked this pull request as ready for review March 29, 2019 19:02
// check if this is a field of type Long, that is indexed and has doc values
String fieldName = sortField.getField();
final MappedFieldType fieldType = searchContext.mapperService().fullName(fieldName);
if (fieldType == null) return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the field isn't mapped, we could maybe return eg. a MatchAllDocsQuery?

String fieldName = sortField.getField();
final MappedFieldType fieldType = searchContext.mapperService().fullName(fieldName);
if (fieldType == null) return null;
if (fieldType.typeName() != "long") return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use .equals instead? I don't we would get an interned string all the time, so it shouldn't really matter, but I'd rather like to avoid relying on it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's because it's a work in progress but the optimization can work on any numeric and date field ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jimczi The optimization only works for the numeric long field and date field. Date field was added as well

if (fieldType.hasDocValues() == false) return null;
byte[] minValueBytes = PointValues.getMinPackedValue(reader, fieldName);
byte[] maxValueBytes = PointValues.getMaxPackedValue(reader, fieldName);
if ((maxValueBytes == null) || (minValueBytes == null)) return null; // no values on this shard
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe return a MatchAllDocsQuery in that case

final MappedFieldType fieldType = searchContext.mapperService().fullName(fieldName);
if (fieldType == null) return null;
if (fieldType.typeName() != "long") return null;
if (fieldType.indexOptions() == IndexOptions.NONE) return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is always going to be NONE on long fields, which are indexed with points, you should check the point dimension count instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpountz Actually for long fields, for fieldType.indexOptions() I am getting IndexOptions.DOCS_AND_FREQS_AND_POSITIONS and for fieldType.pointDataDimensionCount() I am getting 0. Should we modify first NumberFieldType to have IndexOptions.NONE and dataDimensionCount = 1?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to set the dimension count on numeric and date fields.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh sorry I got confused between what the actual fieldtype would look like at the Lucene level and what the Elasticsearch fieldtype looks like. Changing the MappedFieldType would be nice but from what I remember it might not be straightforward due to the fact that we have a fair number of functions that assume that indexOptions != NONE means "indexed". So let's not try to do it as part of this PR.

byte[] minValueBytes = PointValues.getMinPackedValue(reader, fieldName);
byte[] maxValueBytes = PointValues.getMaxPackedValue(reader, fieldName);
if ((maxValueBytes == null) || (minValueBytes == null)) return null; // no values on this shard

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to check the missing value as well to make sure this optimization is applicable.

long minValue = LongPoint.decodeDimension(minValueBytes, 0);
long maxValue = LongPoint.decodeDimension(maxValueBytes, 0);
final long origin = (sortField.getReverse()) ? maxValue : minValue;
final long pivotDistance = maxValue == minValue ? maxValue/2 : (maxValue - minValue)/2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe minValue == maxValue is a case when we should return a DocValueFieldExistsQuery since there is no point in computing distances in such a case.

@jpountz
Copy link
Contributor

jpountz commented Apr 1, 2019

Detect that documents have multiple values and don't run optimizations in this case

+1 to this approach. You can do that easily by comparing PointValues#size and docCount. I like the 2nd approach less since it is less transparent.

Copy link
Contributor

@jimczi jimczi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments but I like the approach @mayya-sharipova .
Regarding the handling of multi-valued fields I think that the default mode to select the sort values are compatible with this optimization. When sorting by descending order we take the max value in the field and the minimum value is picked for ascending order. I think this is compatible with the distance feature query since it will pick the value closest to the origin. We don't need to handle multi-valued fields in the first version but this is something that could be added in a follow up.

Sort sort = searchContext.sort().sort;
if (Sort.RELEVANCE.equals(sort)) return null;
if (Sort.INDEXORDER.equals(sort)) return null;
if (sort.getSort().length > 1) return null; // we need only a single sort
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should prevent the optimization to kick in. The only condition is that the primary sort is performed on a numeric field indexed with points and doc_values no matter what fields is used for tiebreaking.

String fieldName = sortField.getField();
final MappedFieldType fieldType = searchContext.mapperService().fullName(fieldName);
if (fieldType == null) return null;
if (fieldType.typeName() != "long") return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's because it's a work in progress but the optimization can work on any numeric and date field ?

final MappedFieldType fieldType = searchContext.mapperService().fullName(fieldName);
if (fieldType == null) return null;
if (fieldType.typeName() != "long") return null;
if (fieldType.indexOptions() == IndexOptions.NONE) return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to set the dimension count on numeric and date fields.


// 2. Iterate over DocValues and fill fieldDocs
Long missingValue = (Long) sortFields[0].getMissingValue(); // for rewritten numeric sort, when we had a single sort on Long field
SortedNumericDocValues sdvs = MultiDocValues.getSortedNumericValues(reader, sortFields[0].getField());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this could be simplified if we add the numeric field as the secondary sort. This way you can access the values directly in the returned field docs ?

@mayya-sharipova mayya-sharipova changed the title WIP Optimize sort on numeric long and date fields Apr 5, 2019
@mayya-sharipova mayya-sharipova force-pushed the hits-optimizations-on-sort branch from 34a3c84 to 74457c5 Compare April 7, 2019 23:41
}
// add the numeric field as the last sort, so we can access its values later
newSortFields[oldSortFields.length] = oldSortFields[0];
newFormats[oldSortFields.length] = oldFormats[0];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually we should make it second, not last. In some cases, different values will be mapped to the same score because floats don't have infinite accuracy. So we should tie-break on the field that we are sorting on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpountz thanks Adrien, corrected!

Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mayya-sharipova for the iterations, I think it's getting close.

for (int i = 0; i < oldSortFields.length; i++) {
newSortFields[i + 1] = oldSortFields[i];
newFormats[i + 1] = oldFormats[i];
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's replace this for-loop with a call to System#arraycopy?

if (searchContext.mapperService() == null) return null; // mapperService can be null in tests
final MappedFieldType fieldType = searchContext.mapperService().fullName(fieldName);
if (fieldType == null) return null; // for unmapped fields, default behaviour depending on "unmapped_type" flag
if ((fieldType.typeName().equals("long") == false) && (fieldType instanceof DateFieldType == false)) return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra indentation?

@@ -303,6 +349,67 @@ static boolean execute(SearchContext searchContext,
}
}

private static Query tryRewriteNumericLongOrDateSort(SearchContext searchContext, IndexReader reader) throws IOException {
// child docs can inherit scores from parent, so avoid has_parent query
if (searchContext.request().source().query().getName().equals("has_parent")) return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you explain this one, I don't understand why this is required.

if (searchContext.collapse() != null) return null;
Sort sort = searchContext.sort().sort;
if (Sort.RELEVANCE.equals(sort)) return null;
if (Sort.INDEXORDER.equals(sort)) return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think those two cases would be covered below by the fieldName == null check?

Object[] newFieldValues = new Object[oldFieldValues.length - 1];
for (int i = 0; i < oldFieldValues.length - 1; i++) {
newFieldValues[i] = oldFieldValues[i + 1];
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use Arrays#copyOfRange?


byte[] minValueBytes = PointValues.getMinPackedValue(reader, fieldName);
byte[] maxValueBytes = PointValues.getMaxPackedValue(reader, fieldName);
if ((maxValueBytes == null) || (minValueBytes == null)) return new MatchAllDocsQuery(); // no values on this shard
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the back-and-forth, I just realized that returning a query that produces constant-scores would not help because when there are multiple sorting criteria, we can only ask hits to have a score that is greater than or equal to to best k-th score, rather than strictly greater to like when sorting only by score.

@mayya-sharipova mayya-sharipova force-pushed the hits-optimizations-on-sort branch 3 times, most recently from ac43ae1 to 3f5535d Compare April 26, 2019 14:45
@mayya-sharipova
Copy link
Contributor Author

@jpountz Adrien, thanks for the review. I have tried to address your comments, and this PR is ready for another round of review.

Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just did another review round, sorry it took so long.

}
} finally {
// in case of errors do nothing - keep the same query
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm the comment suggests that we want to ignore errors, but this is not what an empty finally block would do, the error would still be propagated

if (searchContext.request() != null && searchContext.request().source().query() != null) {
if (searchContext.request().source().query().getName().equals("has_parent"))
return null;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this only works if the has_parent query is the top-level query? Can you point me to a test that fails without this check so that I can better understand what the issue is?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpountz Thanks I will remove this condition.
The test that previously was failing for me was the last test in ChildQuerySearchIT::testScoreForParentChildQueriesWithFunctionScore.

But after I added another check that there is NO _score sort field among any sort fields, this test doesn't fail anymore.

has_parent query with the parameter "score" = true produces child docs with scores from their parents even if there is a sort field for the child docs. But this apparently only happens when there is a secondary sort field on _score as in the test .addSort(SortBuilders.fieldSort("c_field3")).addSort(SortBuilders.scoreSort())
Thus the check that there is not _score sort field is enough to ensure we don't run sort optimizations for this kind of queries.

private static Query tryRewriteNumericLongOrDateSort(SearchContext searchContext, IndexReader reader) throws IOException {
if (searchContext.searchAfter() != null) return null;
if (searchContext.scrollContext() != null) return null;
if (searchContext.collapse() != null) return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether we need to check that track_scores is set to false too. Maybe we don't, but in this case too I think it'd be worth to leave a comment about it.

// check that there is NO _score sort field among sort fields, as it will be overwritten with score from DistanceFeatureQuery
for (int i = 1; i < sort.getSort().length; i++) {
if (SortField.FIELD_SCORE.equals(sort.getSort()[i])) return null;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we need to disable the optimization when sorting by a script as well, since scripts may use the score. Or turn it into a whitelist rather than a blacklist and only enable this optimization when all sort fields are either actual fields or _doc.

pivotDistance = 1;
} else if (pivotDistance < 0) { // negative if overflow happened
pivotDistance = - pivotDistance;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's compute pivotDistance = (maxValue - minValue) >>> 1 to avoid the overflow entirely? (>>>1 is a division by 2 on the unsigned representation)

@jpountz
Copy link
Contributor

jpountz commented May 21, 2019

Something I meant to ask as well: have you tried it out on some dataset, does it make sorting faster?


// check if this is a field of type Long or Date, that is indexed and has doc values
String fieldName = sortField.getField();
if (fieldName == null) return null; // happens when _score or _doc is the 1st sort field
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably also return null when sortField.getType is not long. Not doing it would not cause bugs to my knowledge, but Lucene supports providing custom comparators, while we are assuming the natural number ordering here.

@mayya-sharipova
Copy link
Contributor Author

@jpountz Thanks Adrien for the feedback. I have addressed your comments in the last commit. One thing still left is to do an evaluation on a dataset. I will do the evaluation, and will get back with results.

@mayya-sharipova
Copy link
Contributor Author

mayya-sharipova commented May 23, 2019

Here are results on running on the geonames rally's track.
Thanks @jimczi for corrections and help.

Running with an operation:

{
      "name": "long_sort_population",
      "operation-type": "search",
      "body": {
        "query": {
          "match_all": {}
        },
        "sort" : [
          {"population" : "desc"}
        ]
      }
    }

Results without optimization:

|       50th percentile latency | long_sort_population |   58.0295 |     ms |
|       90th percentile latency | long_sort_population |    62.525 |     ms |
|       99th percentile latency | long_sort_population |   68.5235 |     ms |
|      100th percentile latency | long_sort_population |    72.394 |     ms |
|  50th percentile service time | long_sort_population |   54.1314 |     ms |
|  90th percentile service time | long_sort_population |   57.6252 |     ms |
|  99th percentile service time | long_sort_population |   63.1359 |     ms |
| 100th percentile service time | long_sort_population |   67.0069 |     ms |

Results with optimization:

|        50th percentile latency | long_sort_population |   7.19378 |     ms |
|        90th percentile latency | long_sort_population |   9.63573 |     ms |
|        99th percentile latency | long_sort_population |   15.9834 |     ms |
|       100th percentile latency | long_sort_population |   17.1643 |     ms |
|   50th percentile service time | long_sort_population |   4.39315 |     ms |
|   90th percentile service time | long_sort_population |   4.75856 |     ms |
|   99th percentile service time | long_sort_population |   10.8195 |     ms |
|  100th percentile service time | long_sort_population |   13.7886 |     ms |

Optimization clearly demonstrates speedups:
Screenshot 2019-05-23 17 50 51

queries with optimizations ran 5-7 times faster.

Copy link
Contributor

@jimczi jimczi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some additional comments but it looks good @mayya-sharipova . The benchmark results are great but we should also test on a field with more variations. Most of the document in population field for the geonames track have a value of 0, this is why sorting in ascending order is much slower with the new optimization. I compared ascending vs descending with the optimization so it would be good to also compare the times without optimization to ensure that there is no regression there (for the ascending sort).

@@ -306,6 +352,73 @@ static boolean execute(SearchContext searchContext,
}
}

private static Query tryRewriteNumericLongOrDateSort(SearchContext searchContext, IndexReader reader) throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not activate the optim if trackTotalHits is true or set to Integer.MAX_VALUE since we cannot skip hits in this case. I think that moving these checks to TopDocsCollectorContext#SimpleTopDocsCollectorContext would be easier since you can find the final value of trackTotalHits easily. We should also not activate this optim if aggregations are part of the query because we don't apply max-score in this case either.

if (missingValuesAccordingToSort == false) return null;

// check for multiple values
if (PointValues.size(reader, fieldName) != PointValues.getDocCount(reader, fieldName)) return null; //TODO: handle multiple values
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should work if the multi-valued sort mode is set to max when you sort in reverse order (that's the default) and min if you sort in natural order (that's the default too). You can check the SortedNumericSortField to find these values.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is already large, maybe we should keep making this optimization work for multi-valued fields as a follow-up?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

@mayya-sharipova
Copy link
Contributor Author

Here are the results of running of the geonames rally's track on sort ascending:

{
      "name": "long_sort_population",
      "operation-type": "search",
      "body": {
        "query": {
          "match_all": {}
        },
        "sort" : [
          {"population" : "asc"}
        ]
      }
    }

Without optimization:

|       50th percentile latency | long_sort_population |   56.2071 |     ms |
|       90th percentile latency | long_sort_population |   60.0372 |     ms |
|       99th percentile latency | long_sort_population |   66.0915 |     ms |
|      100th percentile latency | long_sort_population |   67.7832 |     ms |
|  50th percentile service time | long_sort_population |   52.0731 |     ms |
|  90th percentile service time | long_sort_population |   55.3643 |     ms |
|  99th percentile service time | long_sort_population |   60.7257 |     ms |
| 100th percentile service time | long_sort_population |   65.8864 |     ms |

with optimization:

|       50th percentile latency | long_sort_population |   175.018 |     ms |
|       90th percentile latency | long_sort_population |   184.525 |     ms |
|       99th percentile latency | long_sort_population |   206.141 |     ms |
|      100th percentile latency | long_sort_population |   210.183 |     ms |
|  50th percentile service time | long_sort_population |   171.161 |     ms |
|  90th percentile service time | long_sort_population |   179.845 |     ms |
|  99th percentile service time | long_sort_population |   205.672 |     ms |
| 100th percentile service time | long_sort_population |   209.223 |     ms |

Screenshot 2019-05-24 18 47 51

Looks like there is 3-4 times degradation in the performance

@mayya-sharipova mayya-sharipova force-pushed the hits-optimizations-on-sort branch from cd36d69 to 7f7a54e Compare May 27, 2019 13:13
@mayya-sharipova
Copy link
Contributor Author

mayya-sharipova commented May 27, 2019

Main differences identified through the profile API:

Asc Desc
set_min_competitive_score 149,741 1,167,420
next_doc 424,122,609 678,521
next_doc_count 2,279,537 2,844
score 1,155,162,705 127,9540
score_count 2,279,537 2,844
SimpleFieldCollector 556,742,784 2,412,882

Look like in the asc case next_doc and score functions were applied to every document in the shard.

Profile from one shard asc

"searches": [
    {
        "query": [
            {
                "type": "BooleanQuery",
                "description": "#*:* LongDistanceFeatureQuery(field=,origin=0,pivotDistance=787892250)",
                "time_in_nanos": 1595958190,
                "breakdown": {
                    "set_min_competitive_score_count": 100,
                    "match_count": 0,
                    "shallow_advance_count": 0,
                    "set_min_competitive_score": 231962,
                    "next_doc": 425294028,
                    "match": 0,
                    "next_doc_count": 2279537,
                    "score_count": 2279537,
                    "compute_max_score_count": 0,
                    "compute_max_score": 0,
                    "advance": 30213,
                    "advance_count": 21,
                    "score": 1165159282,
                    "build_scorer_count": 42,
                    "create_weight": 33343,
                    "shallow_advance": 0,
                    "create_weight_count": 1,
                    "build_scorer": 650124
                },
                "children": [
                    {
                        "type": "MatchAllDocsQuery",
                        "description": "*:*",
                        "time_in_nanos": 345442099,
                        "breakdown": {
                            "set_min_competitive_score_count": 0,
                            "match_count": 0,
                            "shallow_advance_count": 0,
                            "set_min_competitive_score": 0,
                            "next_doc": 0,
                            "match": 0,
                            "next_doc_count": 0,
                            "score_count": 0,
                            "compute_max_score_count": 0,
                            "compute_max_score": 0,
                            "advance": 343032288,
                            "advance_count": 2279558,
                            "score": 0,
                            "build_scorer_count": 63,
                            "create_weight": 8618,
                            "shallow_advance": 0,
                            "create_weight_count": 1,
                            "build_scorer": 121571
                        }
                    },
                    {
                        "type": "LongDistanceFeatureQuery",
                        "description": "LongDistanceFeatureQuery(field=,origin=0,pivotDistance=787892250)",
                        "time_in_nanos": 777135963,
                        "breakdown": {
                            "set_min_competitive_score_count": 100,
                            "match_count": 0,
                            "shallow_advance_count": 63,
                            "set_min_competitive_score": 143873,
                            "next_doc": 0,
                            "match": 0,
                            "next_doc_count": 0,
                            "score_count": 2279537,
                            "compute_max_score_count": 42,
                            "compute_max_score": 6321,
                            "advance": 345943515,
                            "advance_count": 2279537,
                            "score": 426409120,
                            "build_scorer_count": 42,
                            "create_weight": 1816,
                            "shallow_advance": 9563,
                            "create_weight_count": 1,
                            "build_scorer": 62433
                        }
                    }
                ]
            }
        ],
        "rewrite_time": 53904,
        "collector": [
            {
                "name": "CancellableCollector",
                "reason": "search_cancelled",
                "time_in_nanos": 1217521280,
                "children": [
                    {
                        "name": "SimpleFieldCollector",
                        "reason": "search_top_hits",
                        "time_in_nanos": 556742784
                    }
                ]
            }
        ]
    }
]

Profile from one shard desc

"searches": [
    {
        "query": [
            {
                "type": "BooleanQuery",
                "description": "#*:* LongDistanceFeatureQuery(field=,origin=1575784500,pivotDistance=787892250)",
                "time_in_nanos": 3533470,
                "breakdown": {
                    "set_min_competitive_score_count": 122,
                    "match_count": 0,
                    "shallow_advance_count": 0,
                    "set_min_competitive_score": 1167420,
                    "next_doc": 678521,
                    "match": 0,
                    "next_doc_count": 2844,
                    "score_count": 2844,
                    "compute_max_score_count": 0,
                    "compute_max_score": 0,
                    "advance": 22470,
                    "advance_count": 21,
                    "score": 1279540,
                    "build_scorer_count": 42,
                    "create_weight": 27929,
                    "shallow_advance": 0,
                    "create_weight_count": 1,
                    "build_scorer": 351716
                },
                "children": [
                    {
                        "type": "MatchAllDocsQuery",
                        "description": "*:*",
                        "time_in_nanos": 503232,
                        "breakdown": {
                            "set_min_competitive_score_count": 0,
                            "match_count": 0,
                            "shallow_advance_count": 0,
                            "set_min_competitive_score": 0,
                            "next_doc": 0,
                            "match": 0,
                            "next_doc_count": 0,
                            "score_count": 0,
                            "compute_max_score_count": 0,
                            "compute_max_score": 0,
                            "advance": 417693,
                            "advance_count": 3243,
                            "score": 0,
                            "build_scorer_count": 63,
                            "create_weight": 7558,
                            "shallow_advance": 0,
                            "create_weight_count": 1,
                            "build_scorer": 74674
                        }
                    },
                    {
                        "type": "LongDistanceFeatureQuery",
                        "description": "LongDistanceFeatureQuery(field=,origin=1575784500,pivotDistance=787892250)",
                        "time_in_nanos": 2072289,
                        "breakdown": {
                            "set_min_competitive_score_count": 122,
                            "match_count": 0,
                            "shallow_advance_count": 63,
                            "set_min_competitive_score": 1131340,
                            "next_doc": 0,
                            "match": 0,
                            "next_doc_count": 0,
                            "score_count": 2844,
                            "compute_max_score_count": 42,
                            "compute_max_score": 5662,
                            "advance": 396083,
                            "advance_count": 2863,
                            "score": 475990,
                            "build_scorer_count": 42,
                            "create_weight": 1228,
                            "shallow_advance": 6652,
                            "create_weight_count": 1,
                            "build_scorer": 49357
                        }
                    }
                ]
            }
        ],
        "rewrite_time": 43716,
        "collector": [
            {
                "name": "CancellableCollector",
                "reason": "search_cancelled",
                "time_in_nanos": 3154714,
                "children": [
                    {
                        "name": "SimpleFieldCollector",
                        "reason": "search_top_hits",
                        "time_in_nanos": 2412882
                    }
                ]
            }
        ]
    }
]

@mayya-sharipova
Copy link
Contributor Author

mayya-sharipova commented May 29, 2019

Results on the index http_logs with optimization:

desc:

|   50th percentile latency | desc_sort_timestamp |   334.230 |      ms |
|   90th percentile latency | desc_sort_timestamp |   436.808 |      ms |
|   99th percentile latency | desc_sort_timestamp |   568.251 |      ms |
|  100th percentile latency | desc_sort_timestamp |  2381.240 |      ms |

asc:

|   50th percentile latency |  asc_sort_timestamp |    98.879 |      ms |
|   90th percentile latency |  asc_sort_timestamp |   108.682 |      ms |
|   99th percentile latency |  asc_sort_timestamp |   120.685 |      ms |
|  100th percentile latency |  asc_sort_timestamp |   130.796 |      ms |

@mayya-sharipova
Copy link
Contributor Author

I experimented why LongDistanceFeatureQuery doesn't not bring optimizations in the asc case on population on geonames. And it looks like setMinCompetitiveScore in this case is called, but because how we use score calculation to compute minValue and maxValue for the range, we end up with broader ranges then necessary. The cause of this is rounding errors in score calculation.

Several things to help to improve situation:

  • In the QueryPhase for pivotDistance instead of average value for the field, use median value. This help to avoid a situation of super small scores, where we loose some precision because of float calculation.
  • If I set a proper median value for the pivotDistance, in the LongDistanceFeatureQuery::minCompetiveScore I will end up with a range of minValue=0 and maxValue = 0. I wonder if we can exclude this type of ranges where minValue is equal to maxValue.

@mayya-sharipova mayya-sharipova force-pushed the hits-optimizations-on-sort branch from 13c487c to 075a7bf Compare May 31, 2019 19:26
@mayya-sharipova
Copy link
Contributor Author

@jimczi @jpountz As agreed in our last conversation I have added an optional sort parameter "optimized": true that is false by default. Can you please review this PR, so we can merge it and continue further iterations on top of it.

Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to merge this to a public branch on the elasticsearch repo

searchContext.sort(new SortAndFormats(new Sort(newSortFields), newFormats));
}
} catch (IOException e) {
// in case of errors do nothing - keep the same query
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't look safe as we might have changed the query already when this exception is caught?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpountz Can you please explain how this may happen? The way I see it is that the only way we can change the query if there is not exception in tryRewriteLongSort.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imagine the case that the exception is thrown after query = rewrittenQuery;. Maybe there is nothing that throws an exception today, but even then this would be fragile and this invariant could get broken by a seemingly safe refactoring.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpountz Thanks, Adrien; makes sense. This addressed in the last commit, I have removed the try block. I think it is reasonable if there are some IO exceptions during sort optimization, we can fail the whole search request.

if (hasFilterCollector) return null;
// if we can't pre-calculate hitsCount based on the query type, optimization does't make sense
if (shortcutTotalHitCount(reader, query) == -1) return null;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the above lines seem to be indented by one more space than the lines below

@mayya-sharipova mayya-sharipova changed the base branch from master to long_sort_optimization June 7, 2019 13:20
@mayya-sharipova
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/2

@mayya-sharipova mayya-sharipova merged commit 567a739 into elastic:long_sort_optimization Jun 10, 2019
@mayya-sharipova
Copy link
Contributor Author

from now on, the work will continue of the feature branch https://github.com/elastic/elasticsearch/tree/long_sort_optimization

mayya-sharipova added a commit that referenced this pull request Nov 26, 2019
* Optimize sort on numeric long and date fields (#39770)

Optimize sort on numeric long and date fields, when 
the system property `es.search.long_sort_optimized` is true.

* Skip optimization if the index has duplicate data (#43121)

Skip sort optimization if the index has 50% or more data
with the same value.
When index has a lot of docs with the same value, sort
optimization doesn't make sense, as DistanceFeatureQuery
will produce same scores for these docs, and Lucene
will use the second sort to tie-break. This could be slower
than usual sorting.

* Sort leaves on search according to the primary numeric sort field (#44021)

This change pre-sort the index reader leaves (segment) prior to search
when the primary sort is a numeric field eligible to the distance feature
optimization. It also adds a tie breaker on `_doc` to the rewritten sort
in order to bypass the fact that leaves will be collected in a random order.
I ran this patch on the http_logs benchmark and the results are very promising:

```
|                                       50th percentile latency | desc_sort_timestamp |    220.706 |      136544 |   136324 |     ms |
|                                       90th percentile latency | desc_sort_timestamp |    244.847 |      162084 |   161839 |     ms |
|                                       99th percentile latency | desc_sort_timestamp |    316.627 |      172005 |   171688 |     ms |
|                                      100th percentile latency | desc_sort_timestamp |    335.306 |      173325 |   172989 |     ms |
|                                  50th percentile service time | desc_sort_timestamp |    218.369 |     1968.11 |  1749.74 |     ms |
|                                  90th percentile service time | desc_sort_timestamp |    244.182 |      2447.2 |  2203.02 |     ms |
|                                  99th percentile service time | desc_sort_timestamp |    313.176 |     2950.85 |  2637.67 |     ms |
|                                 100th percentile service time | desc_sort_timestamp |    332.924 |     2959.38 |  2626.45 |     ms |
|                                                    error rate | desc_sort_timestamp |          0 |           0 |        0 |      % |
|                                                Min Throughput |  asc_sort_timestamp |   0.801824 |    0.800855 | -0.00097 |  ops/s |
|                                             Median Throughput |  asc_sort_timestamp |   0.802595 |    0.801104 | -0.00149 |  ops/s |
|                                                Max Throughput |  asc_sort_timestamp |   0.803282 |    0.801351 | -0.00193 |  ops/s |
|                                       50th percentile latency |  asc_sort_timestamp |    220.761 |     824.098 |  603.336 |     ms |
|                                       90th percentile latency |  asc_sort_timestamp |    251.741 |     853.984 |  602.243 |     ms |
|                                       99th percentile latency |  asc_sort_timestamp |    368.761 |     893.943 |  525.182 |     ms |
|                                      100th percentile latency |  asc_sort_timestamp |    431.042 |      908.85 |  477.808 |     ms |
|                                  50th percentile service time |  asc_sort_timestamp |    218.547 |     820.757 |  602.211 |     ms |
|                                  90th percentile service time |  asc_sort_timestamp |    249.578 |     849.886 |  600.308 |     ms |
|                                  99th percentile service time |  asc_sort_timestamp |    366.317 |     888.894 |  522.577 |     ms |
|                                 100th percentile service time |  asc_sort_timestamp |    430.952 |     908.401 |   477.45 |     ms |
|                                                    error rate |  asc_sort_timestamp |          0 |           0 |        0 |      % |
```

So roughly 10x faster for the descending sort and 2-3x faster in the ascending case. Note
that I indexed the http_logs with a single client in order to simulate real time-based indices
where document are indexed in their timestamp order.

Relates #37043

* Remove nested collector in docs response

As we don't use cancellableCollector anymore, it should be removed from
the expected docs response.

* Use collector manager for search when necessary (#45829)

When we optimize sort, we sort segments by their min/max value.
As a collector expects to have segments in order,
we can not use a single collector for sorted segments.
Thus for such a case, we use collectorManager,
where for every segment a dedicated collector will be created.

* Use shared TopFieldCollector manager

Use shared TopFieldCollector manager for sort optimization.
This collector manager is able to exchange minimum competitive
score between collectors

* Correct calculation of avg value to avoid overflow

* Optimize calculating if index has duplicate data
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement :Search/Search Search-related issues that do not fall into other categories
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants