-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
raft: add no-op MsgFortifyLeader and MsgFortifyLeaderResp #128426
Conversation
96f7423
to
6b1ac6e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r1, 2 of 2 files at r2, 14 of 16 files at r3, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani)
pkg/kv/kvserver/raft.go
line 29 at r3 (raw file):
// maxRaftMsgType is the maximum value in the raft.MessageType enum. const maxRaftMsgType = raftpb.MsgFortifyLeaderResp
Based on the test failures, it looks like we'll need to do something with RaftRcvdMessages
. Either we assign fake metrics or we just address #124498 immediately. It's probably easiest to just add the real metrics right now, like we did in the prototype.
pkg/kv/kvserver/replica_store_liveness.go
line 45 at r3 (raw file):
return 0, false } if r.store.storeLiveness == nil {
Should we also have this case in SupportFrom
?
pkg/raft/raft.go
line 676 at r3 (raw file):
} // sendFortify sends a fortification RPC to the supplied follower.
nit: s/supplied follower/given peer/
to mirror the language on sendHeartbeat
.
pkg/raft/raft.go
line 708 at r3 (raw file):
including the leader itself
Are there other cases where a leader r.send
s a message to itself? There are certainly cases where a leader Step
s a message to itself, but I'm not sure how this will behave.
EDIT: never mind, I see the TODO above. When we get to that, we should make sure whatever we're doing is consistent with how other broadcasts work, like MsgApp
.
pkg/raft/raft.go
line 1649 at r3 (raw file):
r.becomeLeader() r.bcastAppend() r.bcastFortify()
It shouldn't matter, but consider calling bcastFortify
before bcastAppend
so that the fortify messages go out ahead of the append messages. Conceptually, you can think of fortifying as part of "completing the leader election", while broadcasting the empty entry is part of "beginning log replication", so it feels consistent to switch the order.
pkg/raft/raft.go
line 1658 at r3 (raw file):
case pb.MsgTimeoutNow: r.logger.Debugf("%x [term %d state %v] ignored MsgTimeoutNow from %x", r.id, r.Term, r.state, m.From) case pb.MsgFortifyLeader:
nit: no strong feelings, but in this function and in stepFollower
, there seem to be two classes of messages, those that come from the leader and those that don't. You can tell by the // always m.Term == r.Term
comments. What do you think about moving this right below MsgSnap
in both functions?
pkg/raft/raft.go
line 2125 at r3 (raw file):
// runtimeAssert panics with the supplied message if the condition does not hold // true. func runtimeAssert(condition bool, msg string) {
Out of curiosity, why the "runtime" prefix?
We could move this to util.go
, next to assertConfStatesEquivalent
and follow it's lead about accepting a Logger
.
pkg/raft/util.go
line 42 at r2 (raw file):
pb.MsgStorageApply: true, pb.MsgStorageApplyResp: true, pb.MsgForgetLeader: true,
We step MsgForgetLeader
in forgetLeaderLocked
. Isn't this going to break that by returning a ErrStepLocalMsg
?
pkg/raft/raftpb/raft.proto
line 65 at r3 (raw file):
MsgStorageApplyResp = 22; MsgForgetLeader = 23; MsgFortifyLeader = 24;
nit: want to align the numbers with all of the other messages?
Fixes an oversight when these two fields were added. Epic: none Release note: None
This was missed when MsgForgetLeader was introduced. Fix it. Epic: none Release note: None
6b1ac6e
to
99e36b1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RFAL
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/kv/kvserver/raft.go
line 29 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Based on the test failures, it looks like we'll need to do something with
RaftRcvdMessages
. Either we assign fake metrics or we just address #124498 immediately. It's probably easiest to just add the real metrics right now, like we did in the prototype.
Done.
pkg/raft/raft.go
line 708 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
including the leader itself
Are there other cases where a leader
r.send
s a message to itself? There are certainly cases where a leaderStep
s a message to itself, but I'm not sure how this will behave.EDIT: never mind, I see the TODO above. When we get to that, we should make sure whatever we're doing is consistent with how other broadcasts work, like
MsgApp
.
We definitely don't want to r.send
the message to ourselves here. I was imaging we'd do something similar to the prototype, where we handle this special case in sendFortify
above. I like that a bit more instead of adding the special case here (like bcastHeartbeat
and bcastAppend
), but we no strong feelings -- we can talk about it when we get to the next PR.
pkg/raft/raft.go
line 1649 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
It shouldn't matter, but consider calling
bcastFortify
beforebcastAppend
so that the fortify messages go out ahead of the append messages. Conceptually, you can think of fortifying as part of "completing the leader election", while broadcasting the empty entry is part of "beginning log replication", so it feels consistent to switch the order.
Done.
pkg/raft/raft.go
line 1658 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: no strong feelings, but in this function and in
stepFollower
, there seem to be two classes of messages, those that come from the leader and those that don't. You can tell by the// always m.Term == r.Term
comments. What do you think about moving this right belowMsgSnap
in both functions?
Agreed. Done.
pkg/raft/raft.go
line 2125 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Out of curiosity, why the "runtime" prefix?
We could move this to
util.go
, next toassertConfStatesEquivalent
and follow it's lead about accepting aLogger
.
It's because some of the tests import a package called assert
("github.com/stretchr/testify/assert"). I've moved it to util
and called it Assert
. The public bit is unfortunate, but I think it's better than calling it runtimeAssert
or assertCondition
.
pkg/raft/util.go
line 42 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
We step
MsgForgetLeader
inforgetLeaderLocked
. Isn't this going to break that by returning aErrStepLocalMsg
?
Good catch. I misunderstood what "local" meant in this context; fixed.
pkg/raft/raftpb/raft.proto
line 65 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: want to align the numbers with all of the other messages?
Done.
pkg/kv/kvserver/replica_store_liveness.go
line 45 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should we also have this case in
SupportFrom
?
We should, but I'm actually going to revert it for now -- it doesn't belong in this commit. I initially had it when I was making use of StoreLiveness
in the implementation of handleMsgFortify
; we're stubbing it out now, so we don't need this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be good to go after one more round.
Reviewed 16 of 16 files at r4, 1 of 1 files at r5, 17 of 17 files at r6, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani)
pkg/raft/raft.go
line 676 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit:
s/supplied follower/given peer/
to mirror the language onsendHeartbeat
.
No love for this suggestion? My thinking is that by using different language, readers now need to ask whether the wording difference is meaningful or not. In this case it's not, so might as well make their life easier.
I guess that's especially true because we accept to == r.id
, so the peer may not even be a follower.
pkg/raft/raft.go
line 708 at r3 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
We definitely don't want to
r.send
the message to ourselves here. I was imaging we'd do something similar to the prototype, where we handle this special case insendFortify
above. I like that a bit more instead of adding the special case here (likebcastHeartbeat
andbcastAppend
), but we no strong feelings -- we can talk about it when we get to the next PR.
Sounds good. We can decide later.
pkg/raft/raft.go
line 1649 at r3 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
Done.
I think you'll need to regen some of the datadriven tests after this change.
pkg/raft/raft.go
line 2125 at r3 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
It's because some of the tests import a package called
assert
("github.com/stretchr/testify/assert"). I've moved it toutil
and called itAssert
. The public bit is unfortunate, but I think it's better than calling itruntimeAssert
orassertCondition
.
I don't think we want exported Assert
because then that becomes part of the exported interface of pkg/raft
. If assert
doesn't work, then I'm +1 on going back to runtimeAssert
or assertTrue
.
pkg/raft/util_test.go
line 99 at r6 (raw file):
{pb.MsgStorageApply, true}, {pb.MsgStorageApplyResp, true}, {pb.MsgFortifyLeader, false},
Did we lose the test case for MsgForgetLeader
in the third commit?
99e36b1
to
429519c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/raft/raft.go
line 676 at r3 (raw file):
Ofc there's love for your suggestions, so much so that I decided to address comments on the last round of reviews twice (ran into some git nonsense). I had this marked as resolved, so ended up missing it when I came back to addressing things a second time around.
I guess that's especially true because we accept
to == r.id
, so the peer may not even be a follower.
+1.
pkg/raft/raft.go
line 1649 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I think you'll need to regen some of the datadriven tests after this change.
Done.
pkg/raft/raft.go
line 2125 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I don't think we want exported
Assert
because then that becomes part of the exported interface ofpkg/raft
. Ifassert
doesn't work, then I'm +1 on going back toruntimeAssert
orassertTrue
.
Switched to assertTrue
.
pkg/raft/util_test.go
line 99 at r6 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Did we lose the test case for
MsgForgetLeader
in the third commit?
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 12 of 12 files at r7, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @arulajmani)
429519c
to
a7aa6d7
Compare
CI failures look unrelated. TFTR! bors r=nvanbenschoten |
Build failed: |
This commit adds two new messages to raft -- MsgFortifyLeader and MsgFortifyLeaderResp. A candidate attempts to fortify its leadership term after winning an election. It does so by broadcasting a MsgFortifyLeader to all followers. Fow now, the broadcast is a no-op; we'll meaningfully implement it in an upcoming patch. Informs cockroachdb#125261 Release note: None
We need these in preperation for actually sending out these messages in an upcoming patch. Closes cockroachdb#124498 Release note: None
a7aa6d7
to
bbca286
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks! bors r=nvanbenschoten |
With this patch, raft leaders broadcast a MsgFortifyLeader to all peers after being successfully elected as leader. We added most of the structure for this in cockroachdb#128426. However, we never actually sent messages to fortify leadership because doing so needed to be predicated on a cluster version check. This patch simply adds the cluster version check and updates tests. We're not handling MsgFortifyLeader or MsgFortifyLeaderResp just yet. That'll happen in a subsequent patch. Informs cockroachdb#125261 Release note: None
128647: raft: start sending MsgFortify requests on becoming leader r=nvanbenschoten a=arulajmani With this patch, raft leaders broadcast a MsgFortifyLeader to all peers after being successfully elected as leader. We added most of the structure for this in #128426. However, we never actually sent messages to fortify leadership because doing so needed to be predicated on a cluster version check. This patch simply adds the cluster version check and updates tests. We're not handling MsgFortifyLeader or MsgFortifyLeaderResp just yet. That'll happen in a subsequent patch. Informs #125261 Release note: None Co-authored-by: Arul Ajmani <[email protected]>
See individual commits
This commit adds two new messages to raft -- MsgFortifyLeader and
MsgFortifyLeaderResp. A candidate attempts to fortify its leadership
term after winning an election. It does so by broadcasting a
MsgFortifyLeader to all followers.
Currently, the handling of MsgFortify is a no-op; requests are trivially
rejected. In subsequent patches, we'll hook into store liveness and
correctly respond.
Informs #125261
Release note: None