Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv/raft: force campaigning on leader removal should respect fortification #129796

Open
nvanbenschoten opened this issue Aug 28, 2024 · 0 comments
Assignees
Labels
A-kv-replication Relating to Raft, consensus, and coordination. A-leader-leases Related to the introduction of leader leases C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-kv KV Team

Comments

@nvanbenschoten
Copy link
Member

nvanbenschoten commented Aug 28, 2024

Extracted from #125254.

The remaining use of manual campaigning is in forceCampaignLocked. However, this use is special in that it uses a "forced" campaign through MsgTimeoutNow. It does so because it's trying to handle the case where the raft leader has stepped down or been removed from a range.

In practice, this may be ok today because we will only call forceCampaignLocked as the leaseholder-not-leader. Still, if a leader is fortified, we probably want to handle it a different way. Since we only demote leaders to voters and never remove them immediately, there may be a better way to handle this inside of raft, around this logic:

cockroach/pkg/raft/raft.go

Lines 2176 to 2191 in e243814

if (pr == nil || r.isLearner) && r.state == StateLeader {
// This node is leader and was removed or demoted, step down if requested.
//
// We prevent demotions at the time writing but hypothetically we handle
// them the same way as removing the leader.
//
// TODO(tbg): ask follower with largest Match to TimeoutNow (to avoid
// interruption). This might still drop some proposals but it's better than
// nothing.
if r.stepDownOnRemoval {
// NB: Similar to the CheckQuorum step down case, we must remember our
// prior stint as leader, lest we regress the QSE.
r.becomeFollower(r.Term, r.lead)
}
return cs
}

Jira issue: CRDB-41715

Epic CRDB-37522

@nvanbenschoten nvanbenschoten added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-kv-replication Relating to Raft, consensus, and coordination. T-kv KV Team A-leader-leases Related to the introduction of leader leases labels Aug 28, 2024
@nvanbenschoten nvanbenschoten self-assigned this Aug 28, 2024
@nvanbenschoten nvanbenschoten changed the title kv/raft: handle fortified leader step down kv/raft: force campaigning on leader removal should respect fortification Sep 11, 2024
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Sep 13, 2024
Informs cockroachdb#129796.

This commit is a natural continuation of 7f37eaa (from 5 years ago). The commit
made it so that the replicateQueue would never issue a ChangeReplicas request
that removed a voter replica. Instead, the replicate queue would always demote
the voter replica to a learner replica first.

This commit makes it so that the ChangeReplicasTrigger code does not even know
how to remove a voter replica, it only knows how to demote them. This prevents
abuse and strengthens the guarantee that we will never perform such unsafe
configuration changes.

Release note: None
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Sep 13, 2024
Informs cockroachdb#129796.

This commit makes it so that the confChangeImpl code will never create a
raftpb.ConfChange(,V2) with a direct voter removal. Instead, it will always
demote the voter replica to a learner replica first.

Release note: None
craig bot pushed a commit that referenced this issue Sep 18, 2024
130660: kv: don't remove voter replicas in ChangeReplicasTrigger r=nvanbenschoten a=nvanbenschoten

Informs #129796.

This commit is a natural continuation of 7f37eaa (from 5 years ago). The commit made it so that the replicateQueue would never issue a ChangeReplicas request that removed a voter replica. Instead, the replicate queue would always demote the voter replica to a learner replica first.

This commit makes it so that the ChangeReplicasTrigger code does not even know how to remove a voter replica, it only knows how to demote them. This prevents abuse and strengthens the guarantee that we will never perform such unsafe configuration changes.

Release note: None

Co-authored-by: Nathan VanBenschoten <[email protected]>
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Sep 26, 2024
Informs cockroachdb#129796.
Informs cockroachdb#125355.

This commit is a natural continuation of 7f37eaa (from 5 years ago). The commit
made it so that the replicateQueue would never issue a ChangeReplicas request
that removed a voter replica. Instead, the replicate queue would always demote
the voter replica to a learner replica first.

This commit makes it so that raft will no longer allow direct voter removal in a
proposed configuration change. Instead, it requires the voter (who may be the
leader) to first be demoted to a learner. This prevents abuse and strengthens
the guarantee that we will never perform such unsafe configuration changes.

Release note: None
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Sep 26, 2024
Informs cockroachdb#129796.
Informs cockroachdb#125355.

Perform the same assertion as we have in pkg/raft, but for cases where
DisableConfChangeValidation is set to true.

Release note: None
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Oct 7, 2024
Informs cockroachdb#129796.
Informs cockroachdb#125355.

This commit is a natural continuation of 7f37eaa (from 5 years ago). The commit
made it so that the replicateQueue would never issue a ChangeReplicas request
that removed a voter replica. Instead, the replicate queue would always demote
the voter replica to a learner replica first.

This commit makes it so that raft will no longer allow direct voter removal in a
proposed configuration change. Instead, it requires the voter (who may be the
leader) to first be demoted to a learner. This prevents abuse and strengthens
the guarantee that we will never perform such unsafe configuration changes.

Release note: None
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Oct 7, 2024
Informs cockroachdb#129796.
Informs cockroachdb#125355.

Perform the same assertion as we have in pkg/raft, but for cases where
DisableConfChangeValidation is set to true.

Release note: None
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Oct 8, 2024
Informs cockroachdb#129796.
Informs cockroachdb#125355.

This commit is a natural continuation of 7f37eaa (from 5 years ago). The commit
made it so that the replicateQueue would never issue a ChangeReplicas request
that removed a voter replica. Instead, the replicate queue would always demote
the voter replica to a learner replica first.

This commit makes it so that raft will no longer allow direct voter removal in a
proposed configuration change. Instead, it requires the voter (who may be the
leader) to first be demoted to a learner. This prevents abuse and strengthens
the guarantee that we will never perform such unsafe configuration changes.

Release note: None
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Oct 8, 2024
Informs cockroachdb#129796.
Informs cockroachdb#125355.

Perform the same assertion as we have in pkg/raft, but for cases where
DisableConfChangeValidation is set to true.

Release note: None
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Oct 8, 2024
Informs cockroachdb#129796.
Informs cockroachdb#125355.

This commit is a natural continuation of 7f37eaa (from 5 years ago). The commit
made it so that the replicateQueue would never issue a ChangeReplicas request
that removed a voter replica. Instead, the replicate queue would always demote
the voter replica to a learner replica first.

This commit makes it so that raft will no longer allow direct voter removal in a
proposed configuration change. Instead, it requires the voter (who may be the
leader) to first be demoted to a learner. This prevents abuse and strengthens
the guarantee that we will never perform such unsafe configuration changes.

Release note: None
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Oct 8, 2024
Informs cockroachdb#129796.
Informs cockroachdb#125355.

Perform the same assertion as we have in pkg/raft, but for cases where
DisableConfChangeValidation is set to true.

Release note: None
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Oct 14, 2024
Informs cockroachdb#129796.
Informs cockroachdb#125355.

This commit is a natural continuation of 7f37eaa (from 5 years ago). The commit
made it so that the replicateQueue would never issue a ChangeReplicas request
that removed a voter replica. Instead, the replicate queue would always demote
the voter replica to a learner replica first.

This commit makes it so that raft will no longer allow direct voter removal in a
proposed configuration change. Instead, it requires the voter (who may be the
leader) to first be demoted to a learner. This prevents abuse and strengthens
the guarantee that we will never perform such unsafe configuration changes.

Release note: None
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Oct 14, 2024
Informs cockroachdb#129796.
Informs cockroachdb#125355.

Perform the same assertion as we have in pkg/raft, but for cases where
DisableConfChangeValidation is set to true.

Release note: None
craig bot pushed a commit that referenced this issue Oct 15, 2024
131445: raft: forbid direct voter removal without demotion r=nvanbenschoten a=nvanbenschoten

Informs #129796.
Informs #125355.

This commit is a natural continuation of 7f37eaa (from 5 years ago). The commit made it so that the replicateQueue would never issue a ChangeReplicas request that removed a voter replica. Instead, the replicate queue would always demote the voter replica to a learner replica first.

This commit makes it so that raft will no longer allow direct voter removal in a proposed configuration change. Instead, it requires the voter (who may be the leader) to first be demoted to a learner. This prevents abuse and strengthens the guarantee that we will never perform such unsafe configuration changes.

Release note: None

132349: server: Fix NPE in spanStatsFanOut r=kyle-a-wong a=kyle-a-wong

a NPE was surfaced when doing a spanstats request on an unfinalized (mixed version) cluster.

To fix, defensive checks are added to defend against potential nil responses and references to non-existant map entries for a request span stat

Fixes: #132130
Release note (bug fix): Fixes a bug where a span stats request on a mixed version cluster resulted in an NPE

132574: tablemetadatacache: fix TestTableMetadataUpdateJobProgressAndMetrics r=kyle-a-wong a=kyle-a-wong

This test has a high degree of variability in run time, likely due to the generation for so many tables. To fix, new tables
are no longer generated in the test. Instead, the batch size for the table metadata update job has is lowered to effectively trigger the same behavior

Epic: [CRDB-37558](https://cockroachlabs.atlassian.net/browse/CRDB-37558)
Release note: None

Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Kyle Wong <[email protected]>
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Oct 23, 2024
Informs cockroachdb#129796.

This commit addresses a TODO to transfer raft leadership away when the leader
steps down to a learner through the application of a configuration change. The
leader now ask the follower with the largest Match to campaign immediately with
a TimeoutNow to avoid interruption.

This comes as a safer, easier to understand alternative to "force campaigning"
on the leaseholder after applying a configuration change which has demoted the
leader to a learner. The force campaign approach is more complex, is a layering
violation, may not work if the leaseholder isn't told about the larger applied
index before the leader steps down, and poses problems for fortification and
leader leases.

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-replication Relating to Raft, consensus, and coordination. A-leader-leases Related to the introduction of leader leases C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-kv KV Team
Projects
None yet
Development

No branches or pull requests

1 participant