-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Segment Replication] Update peer recovery logic for segment replication #5344
[Segment Replication] Update peer recovery logic for segment replication #5344
Conversation
1e127d0
to
4af83a1
Compare
This comment was marked as outdated.
This comment was marked as outdated.
4af83a1
to
b413298
Compare
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
53fee4e
to
1414c09
Compare
Gradle Check (Jenkins) Run Completed with:
|
|
...er/src/internalClusterTest/java/org/opensearch/indices/replication/SegmentReplicationIT.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/index/shard/IndexShard.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/index/shard/IndexShard.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/indices/recovery/RecoveryTarget.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/index/shard/IndexShard.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/indices/recovery/LocalStorePeerRecoverySourceHandler.java
Outdated
Show resolved
Hide resolved
if (didRefresh | ||
&& shard.state() != IndexShardState.CLOSED | ||
&& shard.getReplicationTracker().isPrimaryMode() | ||
&& shard.isBlockInternalCheckPointRefresh() == false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is primary mode not sufficient for publishing?
Lets assume the condition is correct, how do we ensure that the publisher stops lets say in usecase where once publish starts and then this condition becomes false? The primary term validation is present only for TransportReplicationActions. Is it possible that a particular target node could be receiving segment files from old primary as well a new primary given the afterRefresh method is async. If this understanding is incorrect, pls do help with the concerned code that handles this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ashking94 for raising this point. Today., SegRepTargetService uses events (current shard promoted as primary & shard close) to cancel ongoing replication. When received a new checkpoint, SegRepTargetService also cancels on-going replication happening on a lower primary term (older primary). But, I have observed that during primary relocation, the primary term bump does not happen and thus needs handling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thats correct primary term bump does not happen during peer primary recoveries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it can be handled in clusterChanged
event inside SegmentReplicationSourceService. I will start with a test first. Added one integ test here with other minor cosmetic changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
server/src/main/java/org/opensearch/index/shard/IndexShard.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/indices/recovery/RecoveryTarget.java
Outdated
Show resolved
Hide resolved
1414c09
to
d7f1217
Compare
Signed-off-by: Suraj Singh <[email protected]>
Signed-off-by: Suraj Singh <[email protected]>
…tService Signed-off-by: Suraj Singh <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Signed-off-by: Suraj Singh <[email protected]>
1edd581
to
68d16e2
Compare
This comment was marked as outdated.
This comment was marked as outdated.
Signed-off-by: Suraj Singh <[email protected]>
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
@Bukhtawar @ashking94 : Please review the latest changes. |
@Bukhtawar : Please feel free to leave any comment in case you have any concerns. |
@dreamer-89 are you planning to backport this PR to 2.x? |
Yes @ashking94. I am planning to backport this along with the fix. Right now, I am testing the fix on actual cluster with setup you mentioned in #5848 |
It is probably best to stay on top of the backports (even if for this case it means backporting a muted integration test). When you don't backport a change that touches the same files as a subsequent change that is to be backported, then the cherry-pick will fail and it will be harder for everyone. #5944 is a backport of a change that touched the new integration test introduced in this PR. I had to resolve a conflict on the cherry-pick to exclude the new file, but it also means when you do backport you'll need to manually apply the subsequent changes to that integration test that happened on main. |
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-5344-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 d56965c59636b18450a48378668f6e2f0f0529e9
# Push it to GitHub
git push --set-upstream origin backport/backport-5344-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x Then, create a pull request where the |
Thank you @andrross for the comment and apologies for inconvenience. Backport was initially delayed to avoid the change in 2.5.0. I am backporting this change now. |
…ion (opensearch-project#5344) * [Segment Replication] Update peer recovery logic for segment replication Signed-off-by: Suraj Singh <[email protected]> * Add integration test for indexing during relocation Signed-off-by: Suraj Singh <[email protected]> * Address review comments Signed-off-by: Suraj Singh <[email protected]> * Spotless check apply fixes & one failing unit test Signed-off-by: Suraj Singh <[email protected]> * Delay force segment replication till relocation handoff Signed-off-by: Suraj Singh <[email protected]> * Spotless fix Signed-off-by: Suraj Singh <[email protected]> * Unit test fix Signed-off-by: Suraj Singh <[email protected]> * Address review comment, move force segrep sync handler to SegRepTargetService Signed-off-by: Suraj Singh <[email protected]> * Resolve conflicts and enabled tests Signed-off-by: Suraj Singh <[email protected]> * Spotless fix Signed-off-by: Suraj Singh <[email protected]> Signed-off-by: Suraj Singh <[email protected]>
…ion (#5344) (#5959) * [Segment Replication] Update peer recovery logic for segment replication Signed-off-by: Suraj Singh <[email protected]> * Add integration test for indexing during relocation Signed-off-by: Suraj Singh <[email protected]> * Address review comments Signed-off-by: Suraj Singh <[email protected]> * Spotless check apply fixes & one failing unit test Signed-off-by: Suraj Singh <[email protected]> * Delay force segment replication till relocation handoff Signed-off-by: Suraj Singh <[email protected]> * Spotless fix Signed-off-by: Suraj Singh <[email protected]> * Unit test fix Signed-off-by: Suraj Singh <[email protected]> * Address review comment, move force segrep sync handler to SegRepTargetService Signed-off-by: Suraj Singh <[email protected]> * Resolve conflicts and enabled tests Signed-off-by: Suraj Singh <[email protected]> * Spotless fix Signed-off-by: Suraj Singh <[email protected]> Signed-off-by: Suraj Singh <[email protected]> Signed-off-by: Suraj Singh <[email protected]>
…ion (#5344) (#5959) * [Segment Replication] Update peer recovery logic for segment replication Signed-off-by: Suraj Singh <[email protected]> * Add integration test for indexing during relocation Signed-off-by: Suraj Singh <[email protected]> * Address review comments Signed-off-by: Suraj Singh <[email protected]> * Spotless check apply fixes & one failing unit test Signed-off-by: Suraj Singh <[email protected]> * Delay force segment replication till relocation handoff Signed-off-by: Suraj Singh <[email protected]> * Spotless fix Signed-off-by: Suraj Singh <[email protected]> * Unit test fix Signed-off-by: Suraj Singh <[email protected]> * Address review comment, move force segrep sync handler to SegRepTargetService Signed-off-by: Suraj Singh <[email protected]> * Resolve conflicts and enabled tests Signed-off-by: Suraj Singh <[email protected]> * Spotless fix Signed-off-by: Suraj Singh <[email protected]> Signed-off-by: Suraj Singh <[email protected]> Signed-off-by: Suraj Singh <[email protected]>
…ion (opensearch-project#5344) * [Segment Replication] Update peer recovery logic for segment replication Signed-off-by: Suraj Singh <[email protected]> * Add integration test for indexing during relocation Signed-off-by: Suraj Singh <[email protected]> * Address review comments Signed-off-by: Suraj Singh <[email protected]> * Spotless check apply fixes & one failing unit test Signed-off-by: Suraj Singh <[email protected]> * Delay force segment replication till relocation handoff Signed-off-by: Suraj Singh <[email protected]> * Spotless fix Signed-off-by: Suraj Singh <[email protected]> * Unit test fix Signed-off-by: Suraj Singh <[email protected]> * Address review comment, move force segrep sync handler to SegRepTargetService Signed-off-by: Suraj Singh <[email protected]> * Resolve conflicts and enabled tests Signed-off-by: Suraj Singh <[email protected]> * Spotless fix Signed-off-by: Suraj Singh <[email protected]> Signed-off-by: Suraj Singh <[email protected]>
Signed-off-by: Suraj Singh [email protected]
Description
Update the peer recovery logic to work with segment replication. The existing primary relocation is broken because of segment files conflicts. This happens because of target (new primary) is started with InternalEngine writes its own version of segment files which later on conflicts with file copies on replicas (copied from older primary); when this target is promoted as primary.
This change patches the recovery flow by first recovering the primary target with
NRTReplicationEngine
engine (to prevent target from generating segment files). Post phase 1 file copy operation, the older primary is blocked from performing segment file copy events to replicas (prevent segment file copy to replicas from auto refreshes). Post phase 2 before finalizeRecovery and handoff, the target (new primary) is forced to refresh segment files from older primary (to catch up with other replicas) followed by switching engine to InternalEngine.Issues Resolved
#5242
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.