
storage: Possible race between RequestLease and ChangeReplicas #15385

Open
bdarnell opened this issue Apr 26, 2017 · 11 comments

Labels
A-kv-replication Relating to Raft, consensus, and coordination. branch-release-20.1 Used to mark GA and release blockers, technical advisories, and bugs for 20.1 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
Milestone

Comments

@bdarnell
Contributor

bdarnell commented Apr 26, 2017

As discussed in #15355, TransferLease is now sequenced with respect to concurrent replica changes by the command queue. RequestLease has no such protection because it is evaluated on followers (unlike all other commands). It is instead guarded by the RaftCommand.ProposerLease field, which ensures that the replica requesting the lease has an up-to-date view of the current lease. This allows for a race in which a replica is removed from the range immediately after it has attempted to take the lease. (This race is difficult to hit in practice because the range must be healthy to execute the ChangeReplicas transaction, while the current lease holder must be, or appear to be, unhealthy for a follower to attempt to grab the lease.) When this occurs, we hit the log.Fatal that exists to prevent ranges from getting stuck with a lease on a non-member store.

The simplest fix I see is to add a counter to roachpb.Lease which is incremented on every replica change, so that the ProposerLease will be seen as outdated when a RequestLease crosses over a rebalance.
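
To make the idea concrete, here is a minimal sketch under assumed, simplified types (the real roachpb.Lease and RaftCommand look different): the proposer attaches its view of the lease to every command, and a hypothetical ChangeSeq counter, bumped by every replica change, makes that view stale once a rebalance has applied.

package main

import "fmt"

type lease struct {
    StoreID   int32 // store holding the lease
    ChangeSeq int64 // hypothetical counter incremented on every ChangeReplicas
}

type raftCommand struct {
    ProposerLease lease // the proposer's view of the lease at proposal time
}

// checkForcedErr mimics the below-raft vetting: a command proposed under a
// stale view of the lease is rejected with a forced error instead of applying.
func checkForcedErr(cmd raftCommand, applied lease) error {
    if cmd.ProposerLease != applied {
        return fmt.Errorf("proposed under stale lease %+v, current %+v",
            cmd.ProposerLease, applied)
    }
    return nil
}

func main() {
    // A RequestLease proposed just before a rebalance carries ChangeSeq 6;
    // by apply time the replica change has bumped the counter to 7, so the
    // request is rejected rather than granting a lease to a removed replica.
    applied := lease{StoreID: 1, ChangeSeq: 7}
    stale := raftCommand{ProposerLease: lease{StoreID: 1, ChangeSeq: 6}}
    fmt.Println(checkForcedErr(stale, applied))
}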

Jira issue: CRDB-44008

@bdarnell
Contributor Author

bdarnell commented May 2, 2017

Since the only instance of this we've seen was accompanied by weird gossip behavior (now fixed), I'm pushing this to 1.1.

@bdarnell bdarnell modified the milestones: 1.1, 1.0 May 2, 2017
@tbg tbg assigned tbg and unassigned bdarnell May 3, 2017
@tbg
Member

tbg commented May 3, 2017

@bdarnell, work-stealing this from you.

@tbg
Member

tbg commented May 4, 2017

I wonder what would happen when a replica winds up removing itself from the
range. My understanding is that we try to prevent it as much as we can, but we
don't actually guarantee it won't happen. Am I missing a mechanism that makes
this an impossible scenario? If it did happen, wouldn't an epoch-based lease
remain active forever? After all, the node is likely heartbeating its entry.

Unless I messed up, this implies having to make sure that a lease holder cannot
remove itself, on top of making sure there can't be a stray RequestLease
after the fact (for which @bdarnell already had a suggestion). Thoughts?
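
To illustrate the worry, here is a sketch under assumed, simplified names (not the actual liveness code): an epoch-based lease stays live as long as the holder's liveness epoch matches and has not expired, with no reference to range membership, so a removed-but-alive holder would keep it indefinitely.

package main

import (
    "fmt"
    "time"
)

type liveness struct {
    Epoch      int64
    Expiration time.Time
}

type epochLease struct {
    NodeID int32
    Epoch  int64
}

// leaseIsLive is the only condition checked: matching epoch and unexpired
// liveness. Range membership never enters the picture, so removing the
// holder from the range does not, by itself, invalidate the lease.
func leaseIsLive(l epochLease, lv liveness, now time.Time) bool {
    return l.Epoch == lv.Epoch && now.Before(lv.Expiration)
}

func main() {
    now := time.Now()
    l := epochLease{NodeID: 3, Epoch: 5}
    lv := liveness{Epoch: 5, Expiration: now.Add(9 * time.Second)}
    // Prints true even if node 3 is no longer a member of the range.
    fmt.Println(leaseIsLive(l, lv, now))
}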

@petermattis
Collaborator

I wonder what would happen when a replica winds up removing itself from the range.

I thought we disallowed removing the leaseholder replica. I know for certain replicateQueue won't do this. But looking just now I don't see any other protection. Seems worthwhile to add.

@a-robinson
Contributor

What we currently have is this pretty hacky check in the proposal code path:
https://github.com/cockroachdb/cockroach/blob/master/pkg/storage/replica.go#L2608

@a-robinson
Contributor

(but you're very right about the risk of a non-replica holding the lease -- node liveness will keep the lease alive forever and it has been the source of a few different headaches while you were gone)

@tbg
Member

tbg commented May 5, 2017

@petermattis @a-robinson sorry, should've been a little more clear - I'm aware of both the proposal code and the replicate queue trying not to have this happen, but I didn't see us explicitly preventing it anywhere. But now I see (or at least think) that the propose check does prevent it, not make it less likely. For example, if replica 1 is trying to remove replica 2:

  • (rpl1).Send takes the final EndTransaction, proposes it.
  • the lease changes hands from rpl1 to rpl2.
  • the EndTransaction proposal bubbles up in the retry loop around tryExecuteWriteBatch (this can happen in various ways)
  • it is retried, eventually ending up at rpl2 for proposal. But the propose check catches precisely that.

Since a proposal can only commit under the lease that proposed it, and since only a replica holding the lease proposes under that lease, we should be good. But the whole thing is pretty subtle. I'll make sure to add some commentary.
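
A minimal sketch of that propose-time guard, with assumed names (the real check is the one in replica.go linked above): a ChangeReplicas that would remove the proposing replica itself is refused, so it can never commit under that replica's own lease.

package main

import "fmt"

type changeReplicasTrigger struct {
    RemovedReplicaID int32
}

type replica struct {
    ReplicaID int32
}

func checkProposal(r replica, crt *changeReplicasTrigger) error {
    if crt != nil && crt.RemovedReplicaID == r.ReplicaID {
        // Refuse to propose our own removal; the caller retries elsewhere,
        // e.g. on a replica that is not being removed.
        return fmt.Errorf("replica %d refusing to propose its own removal", r.ReplicaID)
    }
    return nil
}

func main() {
    // The retried EndTransaction from the example above ends up at rpl2,
    // which is the replica being removed, and is caught here.
    rpl2 := replica{ReplicaID: 2}
    fmt.Println(checkProposal(rpl2, &changeReplicasTrigger{RemovedReplicaID: 2}))
}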

@tbg
Member

tbg commented May 5, 2017

As for the rest, I agree with @bdarnell's suggestion.

tbg added a commit to tbg/cockroach that referenced this issue May 8, 2017
Make sure that a lease with an intended lease holder that has since been
removed catches a forcedError.

See cockroachdb#15385.
tbg added a commit to tbg/cockroach that referenced this issue May 18, 2017
Make sure that a lease with an intended lease holder that has since been
removed catches a forcedError.

See cockroachdb#15385.
@tbg
Member

tbg commented May 22, 2017

Closed in #15754.

@tbg tbg closed this as completed May 22, 2017
@nvanbenschoten nvanbenschoten added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-kv-replication Relating to Raft, consensus, and coordination. labels Oct 17, 2024

blathers-crl bot commented Oct 17, 2024

Hi @nvanbenschoten, please add branch-* labels to identify which branch(es) this C-bug affects.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@nvanbenschoten
Member

The fix for this issue was to check, when applying a lease command after it has been serialized under raft, whether the requested leaseholder replica is part of the current RangeDescriptor, and to reject the command if not:

if isLeaseRequest {
    // Lease commands are ignored by the counter (and their MaxLeaseIndex is ignored). This
    // makes sense since lease commands are proposed by anyone, so we can't expect a coherent
    // MaxLeaseIndex. Also, lease proposals are often replayed, so not making them update the
    // counter makes sense from a testing perspective.
    //
    // However, leases get special vetting to make sure we don't give one to a replica that was
    // since removed (see #15385 and a comment in redirectOnOrAcquireLease).
    if _, ok := replicaState.Desc.GetReplicaDescriptor(requestedLease.Replica.StoreID); !ok {
        return ForcedErrResult{
            LeaseIndex:  leaseIndex,
            Rejection:   ProposalRejectionPermanent,
            ForcedError: kvpb.NewError(&kvpb.LeaseRejectedError{
                Existing:  *replicaState.Lease,
                Requested: requestedLease,
                Message:   "replica not part of range",
            }),
        }
    }
}

Since that fix was introduced, we've added additional constraints on who in a range can hold a lease. Notably, LEARNER and NON_VOTER replicas cannot hold the lease. This is checked above raft (twice) and encoded into the roachpb.CheckCanReceiveLease function.
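
For reference, a rough sketch of what that eligibility check amounts to, under simplified assumed types (the real roachpb.CheckCanReceiveLease has a different signature and covers more replica states): only voters, and incoming voters, may receive the lease.

package main

import "fmt"

type replicaType int

const (
    voterFull replicaType = iota
    voterIncoming
    learner
    nonVoter
)

type replicaDescriptor struct {
    StoreID int32
    Type    replicaType
}

func checkCanReceiveLease(target int32, desc []replicaDescriptor) error {
    for _, rd := range desc {
        if rd.StoreID != target {
            continue
        }
        switch rd.Type {
        case voterFull, voterIncoming:
            return nil // eligible to hold the lease
        default:
            return fmt.Errorf("store %d has replica type %d and cannot hold the lease", target, rd.Type)
        }
    }
    return fmt.Errorf("store %d is not in the range descriptor", target)
}

func main() {
    desc := []replicaDescriptor{{StoreID: 1, Type: voterFull}, {StoreID: 2, Type: learner}}
    // Rejected: a LEARNER cannot hold the lease.
    fmt.Println(checkCanReceiveLease(2, desc))
}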

It would appear that these additional constraints did not make their way to the below-raft check back when we began using learner replicas in v19.2.0, permitting a new variant of this race where a RequestLease lands after the target leaseholder has been demoted to a learner.

To fix this, we need to replace the GetReplicaDescriptor(requestedLease) check with a CheckCanReceiveLease(requestedLease) check in CheckForcedErr. We need to be careful when doing this, because below-raft logic needs to remain deterministic across versions. We'll want to attach a flag to the corresponding Raft proposal to identify pre-fix and post-fix lease requests, and change behavior in CheckForcedErr accordingly.
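
A sketch of the shape this could take, reusing the simplified types from the sketch above and treating the flag name as an assumption rather than the final implementation:

// checkLeaseBelowRaft sketches the proposed version gating: pre-fix lease
// requests keep the old membership-only check, while post-fix requests,
// identified by a flag attached to the Raft proposal, get the full
// eligibility check. Keeping both paths preserves determinism across
// versions in a mixed-version cluster.
func checkLeaseBelowRaft(fullCheck bool, target int32, desc []replicaDescriptor) error {
    if !fullCheck {
        // Old behavior: any replica present in the descriptor is accepted.
        for _, rd := range desc {
            if rd.StoreID == target {
                return nil
            }
        }
        return fmt.Errorf("replica not part of range")
    }
    // New behavior: membership plus eligibility (no learners or non-voters).
    return checkCanReceiveLease(target, desc)
}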

@nvanbenschoten nvanbenschoten added the branch-release-20.1 Used to mark GA and release blockers, technical advisories, and bugs for 20.1 label Oct 17, 2024
@tbg tbg removed their assignment Nov 5, 2024