Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: drained nodes can re-acquire + retain leases #74691

Open
irfansharif opened this issue Jan 11, 2022 · 6 comments
Open

kvserver: drained nodes can re-acquire + retain leases #74691

irfansharif opened this issue Jan 11, 2022 · 6 comments
Labels
A-kv-decom-rolling-restart Decommission and Rolling Restarts A-kv-distribution Relating to rebalancing and leasing. A-server-start-drain Pertains to server startup and shutdown sequences C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-server-and-security DB Server & Security

Comments

@irfansharif
Copy link
Contributor

irfansharif commented Jan 11, 2022

Describe the problem

In a recent customer escalation we observed a cluster with v21.1.3 binaries where a node that was drained, when re-drained appeared to have re-acquired range leases. The re-drain attempt should've been a no-op; the node was not restarted in the interim.

$ crdb node drain --host :26256 --cluster-name=<redacted>
node is draining... remaining: 129
node is draining... remaining: 0 (complete)

$ crdb node drain --host :26256 --cluster-name=<redacted>
node is draining... remaining: 2
node is draining... remaining: 1
node is draining... remaining: 0 (complete)

Should this be possible?

To Reproduce

Haven't tried reproducing.

Expected behavior

Drained nodes to not acquire additional leases.


+cc @knz for routing/triage.

Jira issue: CRDB-12212

@irfansharif irfansharif added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-server-start-drain Pertains to server startup and shutdown sequences labels Jan 11, 2022
@blathers-crl blathers-crl bot added the T-server-and-security DB Server & Security label Jan 11, 2022
@jtsiros
Copy link

jtsiros commented Feb 18, 2022

We'll likely pick this up during stability and investigate.

@aayushshah15
Copy link
Contributor

Related conversation: https://cockroachlabs.slack.com/archives/C0KB9Q03D/p1648756314828899. At least one of the causes for this is the fact that AdminScatter called with the RandomizeLeases option (as IMPORTs do) will freely transfer leases to draining nodes.

aayushshah15 added a commit to aayushshah15/cockroach that referenced this issue Apr 3, 2022
This commit is a minor refactor of the `Allocator.TransferLeaseTarget` logic in
order to make it more readable and, to abstract out a new exported `Allocator`
method called `ValidLeaseTargets()`.

The contract of `ValidLeaseTargets()` is as follows:
```
// ValidLeaseTargets returns a set of candidate stores that are suitable to be
// transferred a lease for the given range.
//
// - It excludes stores that are dead, or marked draining or suspect.
// - If the range has lease_preferences, and there are any non-draining,
// non-suspect nodes that match those preferences, it excludes stores that don't
// match those preferences.
// - It excludes replicas that may need snapshots. If replica calling this
// method is not the Raft leader (meaning that it doesn't know whether follower
// replicas need a snapshot or not), produces no results.

```

Previously, there were multiple places where we were performing the logic
that's encapsulated by `ValidLeaseTargets()`, which was a potential source of
bugs. This is an attempt to unify this logic in one place that's relatively
well-tested. This commit is only a refactor, and does not attempt to change any
behavior. As such, no existing tests have been changed, with the exception of a
subtest inside `TestAllocatorTransferLeaseTargetDraining`. See the comment over
that subtest to understand why the behavior change made by this patch is
desirable.

The next commit in this PR uses this method to fix (at least part of) cockroachdb#74691.

Release note: none
aayushshah15 added a commit to aayushshah15/cockroach that referenced this issue Apr 3, 2022
This commit is a minor refactor of the `Allocator.TransferLeaseTarget` logic in
order to make it more readable and, to abstract out a new exported `Allocator`
method called `ValidLeaseTargets()`.

The contract of `ValidLeaseTargets()` is as follows:
```
// ValidLeaseTargets returns a set of candidate stores that are suitable to be
// transferred a lease for the given range.
//
// - It excludes stores that are dead, or marked draining or suspect.
// - If the range has lease_preferences, and there are any non-draining,
// non-suspect nodes that match those preferences, it excludes stores that don't
// match those preferences.
// - It excludes replicas that may need snapshots. If replica calling this
// method is not the Raft leader (meaning that it doesn't know whether follower
// replicas need a snapshot or not), produces no results.

```

Previously, there were multiple places where we were performing the logic
that's encapsulated by `ValidLeaseTargets()`, which was a potential source of
bugs. This is an attempt to unify this logic in one place that's relatively
well-tested. This commit is only a refactor, and does not attempt to change any
behavior. As such, no existing tests have been changed, with the exception of a
subtest inside `TestAllocatorTransferLeaseTargetDraining`. See the comment over
that subtest to understand why the behavior change made by this patch is
desirable.

The next commit in this PR uses this method to fix (at least part of) cockroachdb#74691.

Release note: none
aayushshah15 added a commit to aayushshah15/cockroach that referenced this issue Apr 6, 2022
This commit is a minor refactor of the `Allocator.TransferLeaseTarget` logic in
order to make it more readable and, to abstract out a new exported `Allocator`
method called `ValidLeaseTargets()`.

The contract of `ValidLeaseTargets()` is as follows:
```
// ValidLeaseTargets returns a set of candidate stores that are suitable to be
// transferred a lease for the given range.
//
// - It excludes stores that are dead, or marked draining or suspect.
// - If the range has lease_preferences, and there are any non-draining,
// non-suspect nodes that match those preferences, it excludes stores that don't
// match those preferences.
// - It excludes replicas that may need snapshots. If replica calling this
// method is not the Raft leader (meaning that it doesn't know whether follower
// replicas need a snapshot or not), produces no results.

```

Previously, there were multiple places where we were performing the logic
that's encapsulated by `ValidLeaseTargets()`, which was a potential source of
bugs. This is an attempt to unify this logic in one place that's relatively
well-tested. This commit is only a refactor, and does not attempt to change any
behavior. As such, no existing tests have been changed, with the exception of a
subtest inside `TestAllocatorTransferLeaseTargetDraining`. See the comment over
that subtest to understand why the behavior change made by this patch is
desirable.

The next commit in this PR uses this method to fix (at least part of) cockroachdb#74691.

Release note: none
aayushshah15 added a commit to aayushshah15/cockroach that referenced this issue Apr 6, 2022
This commit is a minor refactor of the `Allocator.TransferLeaseTarget` logic in
order to make it more readable and, to abstract out a new exported `Allocator`
method called `ValidLeaseTargets()`.

The contract of `ValidLeaseTargets()` is as follows:
```
// ValidLeaseTargets returns a set of candidate stores that are suitable to be
// transferred a lease for the given range.
//
// - It excludes stores that are dead, or marked draining or suspect.
// - If the range has lease_preferences, and there are any non-draining,
// non-suspect nodes that match those preferences, it excludes stores that don't
// match those preferences.
// - It excludes replicas that may need snapshots. If replica calling this
// method is not the Raft leader (meaning that it doesn't know whether follower
// replicas need a snapshot or not), produces no results.

```

Previously, there were multiple places where we were performing the logic
that's encapsulated by `ValidLeaseTargets()`, which was a potential source of
bugs. This is an attempt to unify this logic in one place that's relatively
well-tested. This commit is only a refactor, and does not attempt to change any
behavior. As such, no existing tests have been changed, with the exception of a
subtest inside `TestAllocatorTransferLeaseTargetDraining`. See the comment over
that subtest to understand why the behavior change made by this patch is
desirable.

The next commit in this PR uses this method to fix (at least part of) cockroachdb#74691.

Release note: none
craig bot pushed a commit that referenced this issue Apr 7, 2022
79295: kvserver: don't transfer leases to draining nodes during scatters r=aayushshah15 a=aayushshah15

**kvserver: introduce Allocator.ValidLeaseTargets()**

This commit is a minor refactor of the `Allocator.TransferLeaseTarget` logic in
order to make it more readable and, to abstract out a new exported `Allocator`
method called `ValidLeaseTargets()`.

The contract of `ValidLeaseTargets()` is as follows:
```
// ValidLeaseTargets returns a set of candidate stores that are suitable to be
// transferred a lease for the given range.
//
// - It excludes stores that are dead, or marked draining or suspect.
// - If the range has lease_preferences, and there are any non-draining,
// non-suspect nodes that match those preferences, it excludes stores that don't
// match those preferences.
// - It excludes replicas that may need snapshots. If replica calling this
// method is not the Raft leader (meaning that it doesn't know whether follower
// replicas need a snapshot or not), produces no results.

```

Previously, there were multiple places where we were performing the logic
that's encapsulated by `ValidLeaseTargets()`, which was a potential source of
bugs. This is an attempt to unify this logic in one place that's relatively
well-tested. This commit is only a refactor, and does not attempt to change any
behavior. As such, no existing tests have been changed, with the exception of a
subtest inside `TestAllocatorTransferLeaseTargetDraining`. See the comment over
that subtest to understand why the behavior change made by this patch is
desirable.

The next commit in this PR uses this method to fix (at least part of) #74691.

Release note: none

**kvserver: don't transfer leases to draining nodes during scatters**

Previously, `AdminScatter` called with the `RandomizeLeases` option could
potentially transfer leases to nodes marked draining.  This commit leverages
the refactor from the last commit to fix this bug by first filtering the set of
candidates down to a set of valid candidates that meet lease preferences and
are not marked suspect or draining.

Relates to and fixes a part of #74691.

Release note (bug fix): Fixes a bug where draining / drained nodes could
re-acquire leases during an import or an index backfill.

Co-authored-by: Aayush Shah <[email protected]>
aayushshah15 added a commit to aayushshah15/cockroach that referenced this issue Apr 30, 2022
This commit is a minor refactor of the `Allocator.TransferLeaseTarget` logic in
order to make it more readable and, to abstract out a new exported `Allocator`
method called `ValidLeaseTargets()`.

The contract of `ValidLeaseTargets()` is as follows:
```
// ValidLeaseTargets returns a set of candidate stores that are suitable to be
// transferred a lease for the given range.
//
// - It excludes stores that are dead, or marked draining or suspect.
// - If the range has lease_preferences, and there are any non-draining,
// non-suspect nodes that match those preferences, it excludes stores that don't
// match those preferences.
// - It excludes replicas that may need snapshots. If replica calling this
// method is not the Raft leader (meaning that it doesn't know whether follower
// replicas need a snapshot or not), produces no results.

```

Previously, there were multiple places where we were performing the logic
that's encapsulated by `ValidLeaseTargets()`, which was a potential source of
bugs. This is an attempt to unify this logic in one place that's relatively
well-tested. This commit is only a refactor, and does not attempt to change any
behavior. As such, no existing tests have been changed, with the exception of a
subtest inside `TestAllocatorTransferLeaseTargetDraining`. See the comment over
that subtest to understand why the behavior change made by this patch is
desirable.

The next commit in this PR uses this method to fix (at least part of) cockroachdb#74691.

Release note: none
aayushshah15 added a commit to aayushshah15/cockroach that referenced this issue Jun 10, 2022
This commit is a minor refactor of the `Allocator.TransferLeaseTarget` logic in
order to make it more readable and, to abstract out a new exported `Allocator`
method called `ValidLeaseTargets()`.

The contract of `ValidLeaseTargets()` is as follows:
```
// ValidLeaseTargets returns a set of candidate stores that are suitable to be
// transferred a lease for the given range.
//
// - It excludes stores that are dead, or marked draining or suspect.
// - If the range has lease_preferences, and there are any non-draining,
// non-suspect nodes that match those preferences, it excludes stores that don't
// match those preferences.
// - It excludes replicas that may need snapshots. If replica calling this
// method is not the Raft leader (meaning that it doesn't know whether follower
// replicas need a snapshot or not), produces no results.

```

Previously, there were multiple places where we were performing the logic
that's encapsulated by `ValidLeaseTargets()`, which was a potential source of
bugs. This is an attempt to unify this logic in one place that's relatively
well-tested. This commit is only a refactor, and does not attempt to change any
behavior. As such, no existing tests have been changed, with the exception of a
subtest inside `TestAllocatorTransferLeaseTargetDraining`. See the comment over
that subtest to understand why the behavior change made by this patch is
desirable.

The next commit in this PR uses this method to fix (at least part of) cockroachdb#74691.

Release note: none
@irfansharif
Copy link
Contributor Author

Saw this again elsewhere when trying to work through implications of allowing rebalancing snapshots to drained nodes (#87969); was hitting this code path:

// If we're draining, we'd rather not take any new leases (since we're also
// trying to move leases away elsewhere). But if we're the leader, we don't
// really have a choice and we take the lease - there might not be any other
// replica available to take this lease (perhaps they're all draining).
if r.store.IsDraining() {
// NB: Replicas that are not the Raft leader will not take leases anyway
// (see the check inside propBuf.FlushLockedWithRaftGroup()), so we don't
// really need any special behavior for draining nodes here. This check
// serves mostly as a means to get more granular logging and as a defensive
// precaution.
if r.raftBasicStatusRLocked().RaftState != raft.StateLeader {
log.VEventf(ctx, 2, "refusing to take the lease because we're draining")
return r.mu.pendingLeaseRequest.newResolvedHandle(
roachpb.NewError(
newNotLeaseHolderError(
roachpb.Lease{}, r.store.StoreID(), r.mu.state.Desc,
"refusing to take the lease; node is draining",
),
),
)
}
log.Info(ctx, "trying to take the lease while we're draining since we're the raft leader")
}

@irfansharif irfansharif changed the title server: drained nodes can re-acquire leases kvserver: drained nodes can re-acquire leases Sep 15, 2022
@irfansharif irfansharif added A-kv-distribution Relating to rebalancing and leasing. A-kv-decom-rolling-restart Decommission and Rolling Restarts labels Sep 15, 2022
@blathers-crl blathers-crl bot added the T-kv KV Team label Sep 15, 2022
@irfansharif
Copy link
Contributor Author

I guess this issue has been true as of #55624. What I guess is surprising to me though is that there's no active mechanism to take the leases away from raft leaders on drained nodes when there are valid replicas elsewhere. This is easily rectified in practice with another cockroach node drain but this to me is somewhat surprising.

@irfansharif irfansharif changed the title kvserver: drained nodes can re-acquire leases kvserver: drained nodes can re-acquire + retain leases Sep 15, 2022
@irfansharif
Copy link
Contributor Author

I used the following repro steps (-ish):

$ roachprod create local -n 5
$ roachprod put local ./cockroach # built using https://github.com/cockroachdb/cockroach/pull/87969
$ roachprod start local --racks 5
$ roachprod sql local:1 -- -e "alter range liveness configure zone using num_replicas = 3";
... repeat for range meta, range default, range system, database system
... wait for initial upreplication/lease balancing
$ roachprod run local:1 './cockroach node drain 4 --insecure'
$ roachprod sql local:1 -- -e "alter range liveness configure zone using constraints = '{"+rack=3": 1}';"
... repeat for range meta, range default, range system, database system
... observe replica transfers into n4, as a result of the changes in https://github.com/cockroachdb/cockroach/pull/87969
... also observe some lease transfers into n4 despite the drained state; the leases stay there despite other targets being available

aayushshah15 added a commit to aayushshah15/cockroach that referenced this issue Sep 26, 2022
This commit is a minor refactor of the `Allocator.TransferLeaseTarget` logic in
order to make it more readable and, to abstract out a new exported `Allocator`
method called `ValidLeaseTargets()`.

The contract of `ValidLeaseTargets()` is as follows:
```
// ValidLeaseTargets returns a set of candidate stores that are suitable to be
// transferred a lease for the given range.
//
// - It excludes stores that are dead, or marked draining or suspect.
// - If the range has lease_preferences, and there are any non-draining,
// non-suspect nodes that match those preferences, it excludes stores that don't
// match those preferences.
// - It excludes replicas that may need snapshots. If replica calling this
// method is not the Raft leader (meaning that it doesn't know whether follower
// replicas need a snapshot or not), produces no results.

```

Previously, there were multiple places where we were performing the logic
that's encapsulated by `ValidLeaseTargets()`, which was a potential source of
bugs. This is an attempt to unify this logic in one place that's relatively
well-tested. This commit is only a refactor, and does not attempt to change any
behavior. As such, no existing tests have been changed, with the exception of a
subtest inside `TestAllocatorTransferLeaseTargetDraining`. See the comment over
that subtest to understand why the behavior change made by this patch is
desirable.

The next commit in this PR uses this method to fix (at least part of) cockroachdb#74691.

Release note: none
aayushshah15 added a commit to aayushshah15/cockroach that referenced this issue Sep 28, 2022
This commit is a minor refactor of the `Allocator.TransferLeaseTarget` logic in
order to make it more readable and, to abstract out a new exported `Allocator`
method called `ValidLeaseTargets()`.

The contract of `ValidLeaseTargets()` is as follows:
```
// ValidLeaseTargets returns a set of candidate stores that are suitable to be
// transferred a lease for the given range.
//
// - It excludes stores that are dead, or marked draining or suspect.
// - If the range has lease_preferences, and there are any non-draining,
// non-suspect nodes that match those preferences, it excludes stores that don't
// match those preferences.
// - It excludes replicas that may need snapshots. If replica calling this
// method is not the Raft leader (meaning that it doesn't know whether follower
// replicas need a snapshot or not), produces no results.

```

Previously, there were multiple places where we were performing the logic
that's encapsulated by `ValidLeaseTargets()`, which was a potential source of
bugs. This is an attempt to unify this logic in one place that's relatively
well-tested. This commit is only a refactor, and does not attempt to change any
behavior. As such, no existing tests have been changed, with the exception of a
subtest inside `TestAllocatorTransferLeaseTargetDraining`. See the comment over
that subtest to understand why the behavior change made by this patch is
desirable.

The next commit in this PR uses this method to fix (at least part of) cockroachdb#74691.

Release note: none
Copy link

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
10 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-decom-rolling-restart Decommission and Rolling Restarts A-kv-distribution Relating to rebalancing and leasing. A-server-start-drain Pertains to server startup and shutdown sequences C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-server-and-security DB Server & Security
Projects
None yet
Development

No branches or pull requests

3 participants