-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixing SortField comparison to use equals instead of reference equality #6901
Conversation
Gradle Check (Jenkins) Run Completed with:
|
@@ -636,7 +636,7 @@ private static Sort createSort(TopFieldDocs[] topFieldDocs) { | |||
*/ | |||
private static boolean isSortWideningRequired(TopFieldDocs[] topFieldDocs, int sortFieldindex) { | |||
for (int i = 0; i < topFieldDocs.length - 1; i++) { | |||
if (topFieldDocs[i].fields[sortFieldindex] != topFieldDocs[i + 1].fields[sortFieldindex]) { | |||
if (!topFieldDocs[i].fields[sortFieldindex].equals(topFieldDocs[i + 1].fields[sortFieldindex])) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gashutos mind taking a look? I found this one out while adding tests for multi index search queries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I remember I did test this with int & long and it worked fine.
The unit tests too are working fine, really is this the issue ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not exactly this one, but assume that index1 and index2 have same counter
field with type long
- in this case we will do unnecessary unwinding although the sort fields are the same. The test cases I added just to capture the ability to do sort across different types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh got it, so correctness wise it was still giving right results ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, no visible effect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea, that I got.
but ./gradlew check is having integ test failure for comparator type mismtach. That should work though..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh got it, you mean we don't need this test 260_sort_mixed.yml
? I haven't found any rest spec tests for mixed indices search ... I thought it is good to have one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea, its good to have. I was just curious why it is failing :)
you mean we don't need this test 260_sort_mixed.yml
No no, we need it, thank you for adding it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@reta I think this is mixed cluster scenario which is failing in ./gradlew check
.
Our old versions dont support mixing results between LONG/DOUBLE. We are specifying sortType Long for Long NumericType here, and SortType Double for Double numeric type. So it will return results in Long for those shards of index with Long mapping, while Double for another case.
The new changes we did as part of widening sort type handles it correctly but that will be in only latest version.
May be we can change mapping types from Long to Double
to Int to Long
or Float to Double
to having them work in mixed cluster scenario.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new changes we did as part of widening sort type handles it correctly but that will be in only latest version.
Correct, so I was adding the version check here:
- skip:
version: " - 2.6.99"
reason: relies on numeric sort optimization that landed in 2.7.0 only
So the test should be skipped for anything below 2.7.0 (we backported this change to 2.x so it should work for 3.0.0 and 2.7.0) but for some reasons, the test is run on versions it is not supposed to ... I am curious to find out why ....
Running the test on 2.x branch manually passed for me, looking into it ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing this. @reta
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Codecov Report
📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more @@ Coverage Diff @@
## main #6901 +/- ##
============================================
- Coverage 70.72% 70.72% -0.01%
+ Complexity 59259 59246 -13
============================================
Files 4812 4812
Lines 283764 283750 -14
Branches 40918 40915 -3
============================================
- Hits 200700 200689 -11
+ Misses 66597 66523 -74
- Partials 16467 16538 +71
... and 466 files with indirect coverage changes Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
@@ -0,0 +1,65 @@ | |||
"search across indices with mixed long and double numeric types": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gashutos so I found out why test is flipping, this is very interesting.
The test only works on single node clusters (or, to generalize it, when search results come from single node without serialization being involved). The reason for that is that SortedNumericSortField
s are deserialized by Apache Lucene as SortField
s and that is how they come to SearchPhaseController
. As you may guessed already, the sort optimization does not trigger anymore because instanceof SortedNumericSortField
is not met (those are just SortField
instances).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is concerning @reta , The optimization will still get applied but while merging results, if it has different SortFiled.Type (INT & LONG from two different shards), it will fail the request -
Let me see what could be done here....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it will fail the request, which is essentially the "expected" behavior as per current implementation. So basically without the optimization - the test always fails, with optimization - passed on single node clusters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But current implementation works for Int & Long indices. Since we used to widen sortType to Long for Int field too during shard level sort. That will break with this, I wasnt knowing serilization will trigger issue here hence didnt test on multinode scenario....
tagging @nknize too..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But current implementation works for Int & Long indices.
No, it does not (Lucene is typecasting), here is the prove:
1> Caused by: java.lang.ClassCastException: class java.lang.Long cannot be cast to class java.lang.Integer (java.lang.Long and java.lang.Integer are in module java.base of loader 'bootstrap')
1> at java.base/java.lang.Integer.compareTo(Integer.java:71)
1> at org.apache.lucene.search.FieldComparator.compareValues(FieldComparator.java:115)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gashutos just to may be clarify, this is not the regression caused by optimization changes, it existed before and still exist in Elasticsearch 7.x release line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@reta I verified that this is regression. To reproduce this, do below steps.
- Ingest http_logs workload from OSB to multi node cluster (I created with 5 nodes)
- Create an index with exactly same field mapping except change
size
field aslong
fromint
- Re-index any of the http_logs workload index to newly created index on 2nd step.
- Sort by asc/desc order on size field with logs-* as an index.
- Check this behaviour without code changes in Enable numeric sort optimization support for all numeric types #6424 (it works fine, while with Enable numeric sort optimization support for all numeric types #6424 it breaks).
I raised a PR to fix this behaviour, #6926 lets get that in before this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 test case have been updated
Gradle Check (Jenkins) Run Completed with:
|
Signed-off-by: Andriy Redko <[email protected]>
Gradle Check (Jenkins) Run Completed with:
|
@nknize LGTM? :-) |
…ty (#6901) Signed-off-by: Andriy Redko <[email protected]> (cherry picked from commit 55936ac) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…ty (#6901) (#6957) (cherry picked from commit 55936ac) Signed-off-by: Andriy Redko <[email protected]> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…ty (opensearch-project#6901) Signed-off-by: Andriy Redko <[email protected]> Signed-off-by: Valentin Mitrofanov <[email protected]>
Description
Fixing SortField comparison to use equals instead of reference equality
Issues Resolved
N/A
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.