Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: failure while transferring lease away when exiting joint config can lead to range unavailability #83687

Closed
arulajmani opened this issue Jul 1, 2022 · 2 comments · Fixed by #83686
Assignees
Labels
branch-release-22.1 Used to mark GA and release blockers, technical advisories, and bugs for 22.1 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-kv KV Team

Comments

@arulajmani
Copy link
Collaborator

arulajmani commented Jul 1, 2022

Describe the problem

@nvanbenschoten, @andreimatei, @shralex, @knz and I were looking into a closed loop workload that had been stalled for around 10 minutes. We correlated the stall with an uptick in backoffs due to NLHE errors in the dist sender.

Screen Shot 2022-06-30 at 9 06 37 PM

Digging a bit deeper in the logs for one of the nodes, it turned out all these errors pertained to a single range r54 and all NLHE errors indicated that n16 was the lease holder. Interestingly, n16s replica was of type VOTER_DEMOTING. We also see n4 has a VOTER_INCOMING replica, so the range in question is undergoing a lateral lease transfer using a joint config.

...
E220630 23:39:00.154796 2779303 kv/kvclient/kvcoord/dist_sender.go:2268 ⋮ [n1,client=10.11.28.172:35106,user=root] 109206 application error: ‹[NotLeaseHolderError] lease held by different store; r54: replica (n4,s4):17VOTER_INCOMING not lease holder; current lease is repl=(n16,s16):16VOTER_INCOMING seq=8 start=1656629768.388660032,0 epo=1 pro=1656629767.668815515,0›: "sql txn" meta={id=3e7719c8 key=‹/Table/106/1/-9207304179336853575/0› pri=0.00128420 epo=0 ts=1656629806.010494656,0 min=1656629806.010494656,0 seq=0} lock=true stat=PENDING rts=1656629806.010494656,0 wto=false gul=1656629806.510494656,0
I220630 23:39:00.154970 2779303 kv/kvclient/kvcoord/dist_sender.go:1659 ⋮ [n1,client=10.11.28.172:35106,user=root] 109207 evicting range desc ‹desc:r54:/Table/106{-/1/-9131597190716917760} [(n9,s9):13, (n5,s5):2, (n16,s16):16VOTER_DEMOTING_LEARNER, (n20,s20):8NON_VOTER, (n31,s31):14NON_VOTER, (n38,s38):11NON_VOTER, (n4,s4):17VOTER_INCOMING, next=18, gen=58] lease:<nil> spec desc: <nil>› after ‹failed to send RPC: sending to all replicas failed; last error: [NotLeaseHolderError] lease held by different store; r54: replica (n4,s4):17VOTER_INCOMING not lease holder; current lease is repl=(n16,s16):16VOTER_INCOMING seq=8 start=1656629768.388660032,0 epo=1 pro=1656629767.668815515,0›
E220630 23:39:00.178641 2778510 kv/kvclient/kvcoord/dist_sender.go:2268 ⋮ [n1] 109211 application error: ‹[NotLeaseHolderError] lease held by different store; r54: replica (n9,s9):13 not lease holder; current lease is repl=(n16,s16):16VOTER_INCOMING seq=8 start=1656629768.388660032,0 epo=1 pro=1656629767.668815515,0›: "sql txn" meta={id=fc837e32 key=‹/Table/106/1/-9139405485398707370/0› pri=0.00621197 epo=0 ts=1656629805.744869361,0 min=1656629805.744869361,0 seq=1501} lock=true stat=STAGING rts=1656629805.744869361,0 wto=false gul=1656629806.244869361,0 int=500 ifw=1499
E220630 23:39:00.181837 2778510 kv/kvclient/kvcoord/dist_sender.go:2157 ⋮ [n1] 109214 ‹[NotLeaseHolderError] lease held by different store; r54: replica (n5,s5):2 not lease holder; current lease is repl=(n16,s16):16VOTER_INCOMING seq=8 start=1656629768.388660032,0 epo=1 pro=1656629767.668815515,0›: "sql txn" meta={id=fc837e32 key=‹/Table/106/1/-9139405485398707370/0› pri=0.00621197 epo=0 ts=1656629805.744869361,0 min=1656629805.744869361,0 seq=1501} lock=true stat=STAGING rts=1656629805.744869361,0 wto=false gul=1656629806.244869361,0 int=500 ifw=1499
E220630 23:39:00.409114 2778510 kv/kvclient/kvcoord/dist_sender.go:2268 ⋮ [n1] 109359 application error: ‹[NotLeaseHolderError] lease held by different store; r54: replica (n5,s5):2 not lease holder; current lease is repl=(n16,s16):16VOTER_INCOMING seq=8 start=1656629768.388660032,0 epo=1 pro=1656629767.668815515,0›: "sql txn" meta={id=fc837e32 key=‹/Table/106/1/-9139405485398707370/0› pri=0.00621197 epo=0 ts=1656629805.744869361,0 min=1656629805.744869361,0 seq=1501} lock=true stat=STAGING rts=1656629805.744869361,0 wto=false gul=1656629806.244869361,0 int=500 ifw=1499
I220630 23:39:00.409345 2778510 kv/kvclient/kvcoord/dist_sender.go:1659 ⋮ [n1] 109360 evicting range desc ‹desc:r54:/Table/106{-/1/-9131597190716917760} [(n9,s9):13, (n5,s5):2, (n16,s16):16VOTER_DEMOTING_LEARNER, (n20,s20):8NON_VOTER, (n31,s31):14NON_VOTER, (n38,s38):11NON_VOTER, (n4,s4):17VOTER_INCOMING, next=18, gen=58] lease:repl=(n16,s16):16VOTER_INCOMING seq=8 start=1656629768.388660032,0 epo=1 pro=1656629767.668815515,0 spec desc: <nil>› after ‹failed to send RPC: sending to all replicas failed; last error: routing information detected to be stale: [NotLeaseHolderError] lease held by different store; r54: replica (n5,s5):2 not lease holder; current lease is repl=(n16,s16):16VOTER_INCOMING seq=8 start=1656629768.388660032,0 epo=1 pro=1656629767.668815515,0›

Looking at n16s logs next, we find log lines of the form:

I220630 23:45:43.994460 5373681 kv/kvserver/replica_range_lease.go:1329 ⋮ [n16,s16,r54/16:‹/Table/106{-/1/-913…}›] 241 request range lease (attempt #2440)
I220630 23:45:43.994472 5373167 kv/kvserver/replica_range_lease.go:1329 ⋮ [n16,s16,r54/16:‹/Table/106{-/1/-913…}›] 242 request range lease (attempt #2886)
I220630 23:45:43.994483 3345317 kv/kvserver/replica_range_lease.go:1329 ⋮ [n16,s16,r54/16:‹/Table/106{-/1/-913…}›] 243 request range lease (attempt #3554)
I220630 23:45:43.994497 5373402 kv/kvserver/replica_range_lease.go:1329 ⋮ [n16,s16,r54/16:‹/Table/106{-/1/-913…}›] 244 request range lease (attempt #2681)
I220630 23:45:43.994509 5363466 kv/kvserver/replica_range_lease.go:1329 ⋮ [n16,s16,r54/16:‹/Table/106{-/1/-913…}›] 245 request range lease (attempt #1342)
I220630 23:45:43.994526 5373472 kv/kvserver/replica_range_lease.go:1329 ⋮ [n16,s16,r54/16:‹/Table/106{-/1/-913…}›] 246 request range lease (attempt #2615)
I220630 23:45:43.994543 5374278 kv/kvserver/replica_range_lease.go:1329 ⋮ [n16,s16,r54/16:‹/Table/106{-/1/-913…}›] 247 request range lease (attempt #1956)
I220630 23:45:43.994560 5374712 kv/kvserver/replica_range_lease.go:1329 ⋮ [n16,s16,r54/16:‹/Table/106{-/1/-913…}›] 248 request range lease (attempt #1608)
I220630 23:45:43.994572 5372414 kv/kvserver/replica_range_lease.go:1329 ⋮ [n16,s16,r54/16:‹/Table/106{-/1/-913…}›] 249 request range lease (attempt #3484)
I220630 23:45:43.994589 5376030 kv/kvserver/replica_range_lease.go:1329 ⋮ [n16,s16,r54/16:‹/Table/106{-/1/-913…}›] 250 request 

We also noticed from the ranges status page that n16's status is marked as PROSCRIBED. This checks out, given it appears as the leaseholder to all other nodes in the cluster yet it is repeatedly trying to acquire the lease for r54. It is repeatedly failing to acquire the lease however, which seems to be running into this check which only allows VOTER_FULL and VOTER_INCOMING replicas to acquire a lease.

// CheckCanReceiveLease checks whether `wouldbeLeaseholder` can receive a lease.
// Returns an error if the respective replica is not eligible.
//
// An error is also returned is the replica is not part of `replDescs`.
func CheckCanReceiveLease(wouldbeLeaseholder ReplicaDescriptor, replDescs ReplicaSet) error {

The proposed solution here would be to as simple as relaxing the check about to allow a VOTER_DEMOTING replica to acquire the lease if it is part of a joint configuration with another VOTER_INCOMING replica.

Jira issue: CRDB-17207

@arulajmani arulajmani added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Jul 1, 2022
@blathers-crl blathers-crl bot added the T-kv KV Team label Jul 1, 2022
andreimatei added a commit to andreimatei/cockroach that referenced this issue Jul 5, 2022
Before evaluating a TransferLeaseRequest, we revoke the existing lease.
This is needed so that further reads are not permitted before the new
lease takes effect without being captured in the timestamp cache summary
carried by the transfer.

This patch makes it so that we restore the original lease if the
evaluation of the transfer fails(*), as there doesn't seem to be any
reason to force the replica to acquire a new lease in that case. This
patch is strictly about evaluation failing, and not about failures or
ambiguity during application of the lease transfer Raft command.

(*) It's not common for the evaluation of a transfer to fail. We see
something going wrong with a transfer (either with the evaluation or
with the proposal) in cockroachdb#83687. Evaluation fails on inconsistent next
leases, which I think can happen under transfer races.

Release note: None
@celiala celiala added release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. branch-release-21.1 labels Jul 6, 2022
shralex added a commit to shralex/cockroach that referenced this issue Jul 7, 2022
…oming voter

Fixes cockroachdb#83687.

It is possible that we've entered a joint config where the leaseholder is being
replaced with an incoming voter. We try to transfer the lease away, but this fails.
In this case, we need the demoted voter to re-aquire the lease, as it might be an epoch
based lease, that doesn't expire. Otherwise we might be stuck without anyone else getting
the lease (the original leaseholder will be in the PROSCRIBED state, repeatedly trying to
re-aquire the lease, but will fail since its a VOTER_DEMOTING_LEARNER).

Release note (bug fix): Fixes a critical bug (cockroachdb#83687) introduced in 22.1.0 where
failure to transfer a lease in the joint config may result in range unavailability. The fix
allows the original leaseholder to re-aquire the lease so that lease transfer can be retried.
@celiala celiala added branch-release-22.1 Used to mark GA and release blockers, technical advisories, and bugs for 22.1 and removed branch-release-21.1 labels Jul 7, 2022
@nvanbenschoten
Copy link
Member

I think this issue may be exacerbated by 034611b. That commit adds a new case where lease transfers can fail after the outgoing lease has been revoked (placed in a PROSCRIBED state). We see in #83261 that we now hit this case more often. I suspect that this is why the lease ends up in a PROSCRIBED state in this issue.

That doesn't mean that we shouldn't urgently fix this issue, but it may be more rare on v22.1 (which does not have 034611b) than we originally thought.

craig bot pushed a commit that referenced this issue Jul 8, 2022
83686: kvserver: allow voter demoting to get a lease, in case there's an incoming voter r=shralex a=shralex

Fixes #83687.

It is possible that we've entered a joint config where the leaseholder is being
replaced with an incoming voter. We try to transfer the lease away, but this fails.
In this case, we need the demoted voter to re-aquire the lease, as it might be an epoch
based lease, that doesn't expire. Otherwise we might be stuck without anyone else getting
the lease (the original leaseholder will be in the PROSCRIBED state, repeatedly trying to 
re-aquire the lease, but will fail since its a VOTER_DEMOTING_LEARNER).

Release note (bug fix): Fixes a critical bug (#83687) introduced in 22.1.0 where
failure to transfer a lease in the joint config may result in range unavailability. The fix
allows the original leaseholder to re-aquire the lease so that lease transfer can be retried.

Co-authored-by: shralex <[email protected]>
@craig craig bot closed this as completed in 139dc42 Jul 8, 2022
@arulajmani
Copy link
Collaborator Author

I think this issue may be exacerbated by 034611b.

Anecdotally, I started seeing this only after a rebase and once I saw it the first time I kept on hitting it quite frequently.

nvanbenschoten pushed a commit to nvanbenschoten/cockroach that referenced this issue Jul 9, 2022
…oming voter

Fixes cockroachdb#83687.

It is possible that we've entered a joint config where the leaseholder is being
replaced with an incoming voter. We try to transfer the lease away, but this fails.
In this case, we need the demoted voter to re-aquire the lease, as it might be an epoch
based lease, that doesn't expire. Otherwise we might be stuck without anyone else getting
the lease (the original leaseholder will be in the PROSCRIBED state, repeatedly trying to
re-aquire the lease, but will fail since its a VOTER_DEMOTING_LEARNER).

Release note (bug fix): Fixes a critical bug (cockroachdb#83687) introduced in 22.1.0 where
failure to transfer a lease in the joint config may result in range unavailability. The fix
allows the original leaseholder to re-aquire the lease so that lease transfer can be retried.
shralex added a commit to shralex/cockroach that referenced this issue Oct 7, 2022
This PR restricts the case when a VOTER_DEMOTING_LEARNER can
aquire the lease in a joint configuration to only the case where
it was the last leaseholder. Since it is being removed, we only
want it to get the lease if no other replica can aquire it,
the scenario described in cockroachdb#83687

This fix solves a potential starvation scenario where a VOTER_DEMOTING_LEARNER keeps
transferring the lease to the VOTER_INCOMING, succeeding, but then re-acquiring
because the VOTER_INCOMING is dead and the lease expires. In this case, we would
want another replica to pick up the lease, which would allow us to exit the joint configuration.

This PR also removes the leaseHolderRemovalAllowed parameter of CheckCanReceiveLease,
since it is always true since 22.2.

Release note (bug fix): stability fix.
shralex added a commit to shralex/cockroach that referenced this issue Oct 7, 2022
This PR restricts the case when a VOTER_DEMOTING_LEARNER can
aquire the lease in a joint configuration to only the case where
it was the last leaseholder. Since it is being removed, we only
want it to get the lease if no other replica can aquire it,
the scenario described in cockroachdb#83687

This fix solves a potential starvation scenario where a VOTER_DEMOTING_LEARNER keeps
transferring the lease to the VOTER_INCOMING, succeeding, but then re-acquiring
because the VOTER_INCOMING is dead and the lease expires. In this case, we would
want another replica to pick up the lease, which would allow us to exit the joint configuration.

This PR also removes the leaseHolderRemovalAllowed parameter of CheckCanReceiveLease,
since it is always true since 22.2.

Release note (bug fix): narrows down the conditions under which a VOTER_DEMOTING_LEARNER
can acquire the lease in a joint configuration to a) there has to be an VOTER_INCOMING
in the configuration and b) the VOTER_DEMOTING_LEARNER was the last leaseholder. This
prevents it from acquiring the lease unless it is the only one that can acquire it,
because transferring the lease away is necessary before exiting the joint config (without
the fix the system can be stuck in a joint configuration in some rare situations).

Fixes: cockroachdb#88667
See also cockroachdb#89340
shralex added a commit to shralex/cockroach that referenced this issue Oct 7, 2022
This PR restricts the case when a VOTER_DEMOTING_LEARNER can
aquire the lease in a joint configuration to only the case where
it was the last leaseholder. Since it is being removed, we only
want it to get the lease if no other replica can aquire it,
the scenario described in cockroachdb#83687

This fix solves a potential starvation scenario where a VOTER_DEMOTING_LEARNER keeps
transferring the lease to the VOTER_INCOMING, succeeding, but then re-acquiring
because the VOTER_INCOMING is dead and the lease expires. In this case, we would
want another replica to pick up the lease, which would allow us to exit the joint configuration.

This PR also removes the leaseHolderRemovalAllowed parameter of CheckCanReceiveLease,
since it is always true since 22.2.

Release note (bug fix): narrows down the conditions under which a VOTER_DEMOTING_LEARNER
can acquire the lease in a joint configuration to a) there has to be an VOTER_INCOMING
in the configuration and b) the VOTER_DEMOTING_LEARNER was the last leaseholder. This
prevents it from acquiring the lease unless it is the only one that can acquire it,
because transferring the lease away is necessary before exiting the joint config (without
the fix the system can be stuck in a joint configuration in some rare situations).

Fixes: cockroachdb#88667
See also cockroachdb#89340
craig bot pushed a commit that referenced this issue Oct 8, 2022
89564: kvserver: don't allow VOTER_DEMOTING to acquire lease after transfer r=shralex a=shralex

This PR restricts the case when a VOTER_DEMOTING_LEARNER can aquire the lease in a joint configuration to only the case where it was the last leaseholder. Since it is being removed, we only want it to get the lease if no other replica can aquire it, the scenario described in #83687

This fix solves a potential starvation scenario where a VOTER_DEMOTING_LEARNER keeps transferring the lease to the VOTER_INCOMING, succeeding, but then re-acquiring because the VOTER_INCOMING is dead and the lease expires. In this case, we would want another replica to pick up the lease, which would allow us to exit the joint configuration.

This PR also removes the leaseHolderRemovalAllowed parameter of CheckCanReceiveLease,
since it is always true since 22.2.

Release note (bug fix): narrows down the conditions under which a VOTER_DEMOTING_LEARNER can acquire the lease in a joint configuration to a) there has to be an VOTER_INCOMING in the configuration and b) the VOTER_DEMOTING_LEARNER was the last leaseholder. This prevents it from acquiring the lease unless it is the only one that can acquire it, because transferring the lease away is necessary before exiting the joint config (without the fix the system can be stuck in a joint configuration in some rare situations).

Fixes: #88667
See also #89340

Co-authored-by: shralex <[email protected]>
blathers-crl bot pushed a commit that referenced this issue Oct 8, 2022
This PR restricts the case when a VOTER_DEMOTING_LEARNER can
aquire the lease in a joint configuration to only the case where
it was the last leaseholder. Since it is being removed, we only
want it to get the lease if no other replica can aquire it,
the scenario described in #83687

This fix solves a potential starvation scenario where a VOTER_DEMOTING_LEARNER keeps
transferring the lease to the VOTER_INCOMING, succeeding, but then re-acquiring
because the VOTER_INCOMING is dead and the lease expires. In this case, we would
want another replica to pick up the lease, which would allow us to exit the joint configuration.

This PR also removes the leaseHolderRemovalAllowed parameter of CheckCanReceiveLease,
since it is always true since 22.2.

Release note (bug fix): narrows down the conditions under which a VOTER_DEMOTING_LEARNER
can acquire the lease in a joint configuration to a) there has to be an VOTER_INCOMING
in the configuration and b) the VOTER_DEMOTING_LEARNER was the last leaseholder. This
prevents it from acquiring the lease unless it is the only one that can acquire it,
because transferring the lease away is necessary before exiting the joint config (without
the fix the system can be stuck in a joint configuration in some rare situations).

Fixes: #88667
See also #89340
blathers-crl bot pushed a commit that referenced this issue Oct 8, 2022
This PR restricts the case when a VOTER_DEMOTING_LEARNER can
aquire the lease in a joint configuration to only the case where
it was the last leaseholder. Since it is being removed, we only
want it to get the lease if no other replica can aquire it,
the scenario described in #83687

This fix solves a potential starvation scenario where a VOTER_DEMOTING_LEARNER keeps
transferring the lease to the VOTER_INCOMING, succeeding, but then re-acquiring
because the VOTER_INCOMING is dead and the lease expires. In this case, we would
want another replica to pick up the lease, which would allow us to exit the joint configuration.

This PR also removes the leaseHolderRemovalAllowed parameter of CheckCanReceiveLease,
since it is always true since 22.2.

Release note (bug fix): narrows down the conditions under which a VOTER_DEMOTING_LEARNER
can acquire the lease in a joint configuration to a) there has to be an VOTER_INCOMING
in the configuration and b) the VOTER_DEMOTING_LEARNER was the last leaseholder. This
prevents it from acquiring the lease unless it is the only one that can acquire it,
because transferring the lease away is necessary before exiting the joint config (without
the fix the system can be stuck in a joint configuration in some rare situations).

Fixes: #88667
See also #89340
shralex added a commit to shralex/cockroach that referenced this issue Oct 8, 2022
This PR restricts the case when a VOTER_DEMOTING_LEARNER can
aquire the lease in a joint configuration to only the case where
it was the last leaseholder. Since it is being removed, we only
want it to get the lease if no other replica can aquire it,
the scenario described in cockroachdb#83687

This fix solves a potential starvation scenario where a VOTER_DEMOTING_LEARNER keeps
transferring the lease to the VOTER_INCOMING, succeeding, but then re-acquiring
because the VOTER_INCOMING is dead and the lease expires. In this case, we would
want another replica to pick up the lease, which would allow us to exit the joint configuration.

Release note (bug fix): narrows down the conditions under which a VOTER_DEMOTING_LEARNER
can acquire the lease in a joint configuration to a) there has to be an VOTER_INCOMING
in the configuration and b) the VOTER_DEMOTING_LEARNER was the last leaseholder. This
prevents it from acquiring the lease unless it is the only one that can acquire it,
because transferring the lease away is necessary before exiting the joint config (without
the fix the system can be stuck in a joint configuration in some rare situations).

Release justification: solves a serious bug.

Fixes: cockroachdb#88667
See also cockroachdb#89340
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-release-22.1 Used to mark GA and release blockers, technical advisories, and bugs for 22.1 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-kv KV Team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants