-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Renewing retention lease with global-ckp + 1 , as we want operations … #206
Conversation
@@ -268,7 +268,7 @@ class ShardReplicationTask(id: Long, type: String, action: String, description: | |||
|
|||
//renew retention lease with global checkpoint so that any shard that picks up shard replication task has data until then. | |||
try { | |||
retentionLeaseHelper.renewRetentionLease(leaderShardId, indexShard.lastSyncedGlobalCheckpoint, followerShardId) | |||
retentionLeaseHelper.renewRetentionLease(leaderShardId, indexShard.lastSyncedGlobalCheckpoint + 1, followerShardId) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you describe the rationale in description and also leave a comment why it is +1?
Also can you share details of testing? Is there a functional test we can create that would fail with earlier code and it passes now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i have used existing UTs and did a sanity check with index having million of documents in bootstrap and post bootstrap as well. It is difficult to write a functional test that would fail with earlier code as the issue is not reproducible every time. Besides that, I have not been able to repro this locally .
…from that seq number Signed-off-by: Gaurav Bafna <[email protected]>
If we're going with this, I think we can remove the error suppression based on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Can you please backport to 7.10 and 1.0 branches as well? |
…from that seq number (opensearch-project#206) Signed-off-by: Gaurav Bafna <[email protected]>
…from that seq number (opensearch-project#206) Signed-off-by: Gaurav Bafna <[email protected]>
…from that seq number (#206) (#211) Signed-off-by: Gaurav Bafna <[email protected]>
…from that seq number (#206) (#210) Signed-off-by: Gaurav Bafna <[email protected]>
have backported to both the branches. |
…from that seq number
Signed-off-by: Gaurav Bafna [email protected]
Description
Retention leases preserve the operations including and starting from the
retainingSequenceNumber
we specify when we take the lease .Hence for follower side, once the global checkpoint moves to
x
, we should renew the lease fromx+1
as retainingSequenceNumber .The current code tries to take lease from
x
, which is not necessary . This can also become problematic when the earlier retention lease taken by bootstrap isx+1
which is observed in issueIssues Resolved
#205
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.