Adding back [Time series based workload desc order optimization through reverse segment read (#7244)] with fixes #7967
Conversation
…rough reverse segment read (opensearch-project#7244)" (opensearch-project#7892)" This reverts commit bb26536. Signed-off-by: gashutos <[email protected]>
…also ASC order reverse should only consider in @timestamp field Signed-off-by: gashutos <[email protected]>
Please review only the second commit, 59285c02385bb13bf98f710c6e683b67a2af041c. The first commit in this PR only reverts the earlier revert of #7244. Many thanks to Hailong-am for catching this and to @andrross for helping with the revert to unblock the 2.8 release.
Signed-off-by: gashutos <[email protected]>
Gradle Check (Jenkins) Run Completed with:
|
Codecov Report
@@             Coverage Diff              @@
##               main    #7967      +/-   ##
============================================
+ Coverage     70.87%   71.40%   +0.53%
- Complexity    56504    56940     +436
============================================
  Files          4719     4719
  Lines        267408   267446      +38
  Branches      39196    39210      +14
============================================
+ Hits         189521   190970    +1449
+ Misses        61861    60657    -1204
+ Partials      16026    15819     -207
|
@gashutos would be great if we could somehow add test case(s) for that, what do you think? |
Agreed. @gashutos Can you add an integration or unit test to ensure we're not still hitting the early terminate behavior? |
…termination Signed-off-by: gashutos <[email protected]>
Oops, I forgot to add the file to the commit. Just added it now. |
Gradle Check (Jenkins) Run Completed with:
|
I was trying a lot about adding unit test to cover |
The backport to
To backport manually, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-7967-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 5c3225692dcea1eddbb3e76ae19f47de5ea23a96
# Push it to GitHub
git push --set-upstream origin backport/backport-7967-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x
Then, create a pull request where the |
@gashutos would you mind creating a manual backport to |
…gh reverse segment read (opensearch-project#7244)] with fixes (opensearch-project#7967)
* Revert "Revert "Time series based workload desc order optimization through reverse segment read (opensearch-project#7244)" (opensearch-project#7892)"
  This reverts commit bb26536.
  Signed-off-by: gashutos <[email protected]>
* Enable time series optimization only if it is not IndexSorted index, also ASC order reverse should only consider in @timestamp field
  Signed-off-by: gashutos <[email protected]>
* Modifying CHANGELOG
  Signed-off-by: gashutos <[email protected]>
* Adding integ test for scroll API where sort by _doc is getting early termination
  Signed-off-by: gashutos <[email protected]>
---------
Signed-off-by: gashutos <[email protected]>
…gh reverse segment read (#7244)] with fixes (#7967) (#8037) Signed-off-by: gashutos <[email protected]>
…gh reverse segment read (opensearch-project#7244)] with fixes (opensearch-project#7967) (opensearch-project#8037) Signed-off-by: gashutos <[email protected]>
Description
This PR adds back PR #7244, which was reverted because of a bug it introduced in re-index (issue #7878).
The bug: a re-index operation on a time-series workload re-indexed only the documents in the last segment and skipped all other segments, which caused data loss.
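To make the failure mode concrete, here is a minimal toy model in Java. It does not use Lucene: a "segment" is just a list of doc IDs, and the `collect` helper and its `reverse` and `earlyTerminate` flags are hypothetical stand-ins for a search that visits segments newest-first and stops after the first segment it sees. Under those assumptions, only the last segment is ever read.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model only: a "segment" is a list of doc IDs, not a Lucene leaf reader.
class ReverseReadDataLossDemo {

    /**
     * Visits segments in the given (or reversed) order and collects doc IDs.
     * When earlyTerminate is true, collection stops after the first visited
     * segment, mimicking a collector that assumes that segment is sufficient.
     */
    static List<Integer> collect(List<List<Integer>> segments, boolean reverse, boolean earlyTerminate) {
        List<List<Integer>> order = new ArrayList<>(segments);
        if (reverse) {
            Collections.reverse(order); // newest segment is visited first
        }
        List<Integer> hits = new ArrayList<>();
        for (List<Integer> segment : order) {
            hits.addAll(segment);
            if (earlyTerminate) {
                break; // remaining segments are never read
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<List<Integer>> segments = List.of(List.of(0, 1, 2), List.of(3, 4), List.of(5, 6, 7));
        // Reversed order plus early termination: only the last segment survives.
        System.out.println(collect(segments, true, true));   // prints [5, 6, 7]
        // No reversal, no early termination: every document is seen.
        System.out.println(collect(segments, false, false)); // prints [0, 1, 2, 3, 4, 5, 6, 7]
    }
}
```

In this model a scroll that is supposed to walk every document would hand back three of the eight, which matches the "only the last segment" symptom reported in the issue.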
The root cause is that sort queries terminate early if the Lucene index is index-sorted or if the sort field is _doc. As per the TopFieldCollector code, in the case of an IndexSort or a _doc field sort, it early-terminates and searches only the very first segment. With the reverse segment read from #7244, the first segment visited is the last one, so every other segment was skipped. The ReIndex plugin issues a scroll query sorted on the _doc field in asc order, which is why the problem showed up in Reindex.
In this PR, I am adding a check so that the optimization is not enabled when the sort is not on the @timestamp field, or when the index is IndexSorted.
Related Issues
Resolves #7878
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.