Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reindex rethrottle persistent task #51599

Conversation

henningandersen
Copy link
Contributor

This adds support for rethrottling resilient/persistent reindex
through updating the .reindex index and notifying the task. This
ensures that the new throttle value sticks on failovers while
also ensuring that the task wakes up immediately if it had a very
low throttle value.

Related to #42612

This is the replacement for #50854.

Tests on the transport layer are missing, but I wanted to ensure
we agree on approach first and then tackle adding those
in a follow-up.

This adds support for rethrottling resilient/persistent reindex
through updating the .reindex index and notifying the task. This
ensures that the new throttle value sticks on failovers while
also ensuring that the task wakes up immediately if it had a very
low throttle value.

Related to elastic#42612
@henningandersen henningandersen added >non-issue :Distributed Indexing/Reindex Issues relating to reindex that are not caused by issues further down labels Jan 29, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (:Distributed/Reindex)

Copy link
Contributor

@Tim-Brooks Tim-Brooks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this has a bug currently.

BulkByScrollParallelizationHelper#sendSubRequests will use the requests per second parameter in request.forSlice which means it will use the original value (not updated) on a failover.

I think you are attempting to avoid mutating the request inside the index (and storing an additional request per second field) to retain the original request? Is that of value? We could store like a rethrottle audit field and mutate the request. So there is like a list field showing the submitted throttle values (and maybe time) and update the request to whatever the current value is.

private final ActionListener<ReindexTaskStateDoc> finishedListener;
private final Runnable onCheckpointAssignmentConflict;
private Consumer<Float> requestsPerSecondObserver;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this? I feel like adding the "poll" interaction model when we are doing all the working to implement the "push" component makes the interaction with the state updater and task more complicated that required.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I agree, removed in b6096a9

@Tim-Brooks
Copy link
Contributor

You could also mutate the request immediately when putting it out of the index and still have a dedicated field for the requests per second.

Modify reindex request instead of passing in requestsPerSecond
Remove rethrottle on .reindex version conflict.
@henningandersen
Copy link
Contributor Author

I think this has a bug currently.

That was kind of on purpose, since slicing is not supported for resilient reindex anyway.

I changed it to modify the request instead, that is a bit nicer.

Copy link
Contributor

@Tim-Brooks Tim-Brooks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mostly looks good to me. I don't see anything that needs to change. Let me just take one last look tomorrow when I have a bit more time. Also, I think it needs to be updated now that I have merged the persistent task stuff.

@henningandersen
Copy link
Contributor Author

Test failure reported here #51926, so:
@elasticmachine run elasticsearch-ci/1

Copy link
Contributor

@Tim-Brooks Tim-Brooks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@henningandersen henningandersen merged commit 5c66735 into elastic:reindex_v2 Feb 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Indexing/Reindex Issues relating to reindex that are not caused by issues further down >non-issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants