Reindex resolve indices early #49850
Resolve indices before starting to reindex. This ensures that the list of indices does not change on failover (TBD). The one exception is aliases, which we still need to access through the alias.

In addition, indices resolved from an index pattern are sorted by creation date; otherwise the listed order is preserved. This ensures that once we reindex one index at a time, we get reasonable time locality for time-based indices.

The resolved list of indices will also be used to search one index (or index group) at a time, improving search performance (since we use sort) and allowing more fine-grained checkpointing and progress tracking (TBD).

Relates elastic#42612
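The ordering rule described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual `resolveIndexPatterns` implementation: the class `IndexResolutionSketch`, the method `resolve`, the `Map` of index names to creation timestamps, and the alias `Set` are all hypothetical stand-ins for cluster metadata. Dropping repeated names and supporting only `*` wildcards are simplifying assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical sketch: names and data structures are illustrative only.
class IndexResolutionSketch {

    /**
     * Resolves index expressions to a fixed list of names.
     * - Aliases are kept as-is, since they must still be accessed through the alias.
     * - Wildcard patterns expand to matching indices sorted by creation date.
     * - Everything else keeps the order in which it was listed.
     */
    static List<String> resolve(List<String> expressions,
                                Map<String, Long> indexCreationDates,
                                Set<String> aliases) {
        // LinkedHashSet preserves insertion order and drops repeats (an assumption of this sketch).
        Set<String> resolved = new LinkedHashSet<>();
        for (String expression : expressions) {
            if (aliases.contains(expression) || !expression.contains("*")) {
                resolved.add(expression);
                continue;
            }
            Pattern pattern = toPattern(expression);
            indexCreationDates.entrySet().stream()
                .filter(e -> pattern.matcher(e.getKey()).matches())
                // Oldest first; the name is a tie-breaker so the result is deterministic.
                .sorted(Map.Entry.<String, Long>comparingByValue()
                    .thenComparing(Map.Entry::getKey))
                .forEach(e -> resolved.add(e.getKey()));
        }
        return new ArrayList<>(resolved);
    }

    /** Converts a simple '*' wildcard expression into a regular expression. */
    static Pattern toPattern(String wildcard) {
        String regex = Arrays.stream(wildcard.split("\\*", -1))
            .map(Pattern::quote)
            .collect(Collectors.joining(".*"));
        return Pattern.compile(regex);
    }

    public static void main(String[] args) {
        Map<String, Long> indices = Map.of("logs-1", 300L, "logs-2", 100L, "logs-3", 200L, "other", 50L);
        List<String> result = resolve(List.of("other", "logs-*", "my-alias"), indices, Set.of("my-alias"));
        System.out.println(result); // prints [other, logs-2, logs-3, logs-1, my-alias]
    }
}
```

Here `other` and `my-alias` keep their listed positions, while the indices matched by `logs-*` are emitted oldest first, which is what gives time locality when indices are processed one at a time.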
Pinging @elastic/es-distributed (:Distributed/Reindex)
ci/1 test failure fixed here: #49855
One question mostly
@@ -87,7 +103,7 @@ protected void doExecute(Task task, StartReindexTaskAction.Request request, Acti

 // In the current implementation, we only need to store task results if we do not wait for completion
 boolean storeTaskResult = request.getWaitForCompletion() == false;
-ReindexTaskParams job = new ReindexTaskParams(storeTaskResult, included);
+ReindexTaskParams job = new ReindexTaskParams(storeTaskResult, resolveIndexPatterns(request.getReindexRequest()), included);
Do we want to be putting this in the cluster state? I don't have a sense for how large this gets, but I assume it could go in the index?
Thanks, this is clearly wrong, fixed in 33d8fd4.
Marking this as WIP; I intend to close it later, based on our discussion the other day. The conclusion there was that we should try to improve this inside search rather than improve reindex performance by processing one index at a time.