Disable recovery monitor before recovery start #93551

DaveCTurner
Contributor

We do nontrivial amounts of work before we start a peer recovery, particularly recovering from the local translog up to its global checkpoint. Today the recovery monitor is running during this time, and will (repeatedly) fail the recovery if it takes more than 30 minutes to complete. With this commit we disable the recovery monitor until this local process has completed.

Closes #93542
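
To illustrate the failure mode described above, here is a minimal sketch of an inactivity monitor that can be switched off while slow local work runs. It is not the Elasticsearch implementation: the class `InactivityMonitorSketch` and all of its method names are hypothetical, and the only detail taken from the description is the 30-minute timeout.

```java
import java.time.Duration;

// Hypothetical sketch; these are NOT Elasticsearch's actual classes or method names.
final class InactivityMonitorSketch {
    private final long timeoutNanos;
    private long lastAccessNanos;
    private boolean enabled = true;

    InactivityMonitorSketch(Duration timeout, long nowNanos) {
        this.timeoutNanos = timeout.toNanos();
        this.lastAccessNanos = nowNanos;
    }

    // Called whenever the recovery makes observable progress.
    synchronized void recordActivity(long nowNanos) {
        lastAccessNanos = nowNanos;
    }

    // Called before long-running local work, such as replaying the local translog.
    synchronized void disable() {
        enabled = false;
    }

    // Re-enabling resets the clock so time spent disabled is not counted as inactivity.
    synchronized void enable(long nowNanos) {
        enabled = true;
        lastAccessNanos = nowNanos;
    }

    // Periodic check: true means the recovery should be failed for inactivity.
    synchronized boolean shouldFail(long nowNanos) {
        return enabled && nowNanos - lastAccessNanos > timeoutNanos;
    }
}
```

With a 30-minute timeout, a check made 31 minutes after the last recorded activity reports a failure unless `disable()` was called first, which is the behaviour this change aims for during the local phase.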

@DaveCTurner added the >bug, :Distributed Indexing/Recovery, and v8.7.0 labels on Feb 7, 2023
@DaveCTurner requested a review from pxsalehi on February 7, 2023 11:19
@elasticsearchmachine added the Team:Distributed (Obsolete) label on Feb 7, 2023
@elasticsearchmachine
Collaborator

Hi @DaveCTurner, I've created a changelog YAML for you.

@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@DaveCTurner
Contributor Author

Same as #93543 except we must now allow for the recovery monitor blocks to overlap.
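
One common way to let such blocks overlap is to count them rather than toggle a single flag, so the monitor only resumes once every block has been released. The sketch below assumes that approach; `MonitorBlocksSketch`, `Block`, `disableMonitor` and `isMonitorEnabled` are hypothetical names for illustration and are not taken from the merged commit.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of counted, overlapping monitor blocks; not the code from this PR.
final class MonitorBlocksSketch {

    // A handle that re-enables the monitor when closed; close() throws no checked exception.
    interface Block extends AutoCloseable {
        @Override
        void close();
    }

    // Number of blocks currently held; the monitor is active only when this is zero.
    private final AtomicInteger blocks = new AtomicInteger();

    Block disableMonitor() {
        blocks.incrementAndGet();
        AtomicBoolean released = new AtomicBoolean();
        return () -> {
            // Guard against double release so one block cannot cancel another.
            if (released.compareAndSet(false, true)) {
                blocks.decrementAndGet();
            }
        };
    }

    boolean isMonitorEnabled() {
        return blocks.get() == 0;
    }
}
```

Because `Block` extends `AutoCloseable`, a caller could hold one in a try-with-resources statement around the local work. A plain boolean flag would re-enable the monitor as soon as the first of two overlapping blocks was released, which is the situation the counter avoids.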

@DaveCTurner merged commit b8c9dc9 into elastic:main on Feb 7, 2023
@DaveCTurner deleted the 2023-02-07-disable-recovery-monitor-before-recovery-start-redux branch on February 7, 2023 12:19
Development

Successfully merging this pull request may close these issues.

Peer recovery may time out during recoverLocallyUpToGlobalCheckpoint
3 participants