stepWait not used in ProposeConfChange #43
Comments
fixes: etcd-io#43 Signed-off-by: Bogdan Kanivets <[email protected]>
OP wonders about Besides I will talk only about cluster where
It appears that the set of predicates could hold for
The need is somewhat simple: people want to have
I understand that the protocol and the logic behind it imply that there is no time-frame guarantee on the commit or failure of proposals. It's OKAY for something to be in progress, and to have to wait an indeterminate amount of time for it to progress, as long as further progress is expected. What's not okay is for something to have definitely stopped progressing, where failure is certain (absence of further progress is certain), while no mechanism is provided to detect or be notified of that failure.
What feels unnatural is this: when a "proposal requester" (e.g. a gRPC caller wishing to have some data committed to the Raft log) DIRECTLY asked the leader, and that leader has learned the fate of the ongoing proposal, specifically that it failed, there is zero documentation on how to detect that.
Note: I'm saying "a / the leader" because if the leader is split-brained from the majority and doesn't know it yet, it's okay for it to keep the "proposal requester" on hold. It would still consider itself "a leader" while really being only an "incapacitated leader", whereas an "operative leader with majority" has been elected on the other side of the split. Of course the incapacitated leader wouldn't know that before a timeout occurs.
I think that OP's
stepWait was implemented to fail fast in the Propose method, but it was never used in the ProposeConfChange method.
A question was raised about it on the original PR.
Also, it was causing a timeout in the ExampleCluster_memberAddAsLearner test (see comment).
This isn't a critical issue, because in production ProposeConfChange isn't usually called consecutively. But it seems to me that stepWait was missed for the ProposeConfChange method.