-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Block rpc handling until state store is caught up #5911
Conversation
Here, we ensure that when leader only responds to RPC calls when state store is up to date. At leadership transition or launch with restored state, the server local store might not be caught up with latest raft logs and may return a stale read. The solution here is to have an RPC consistency read gate, enabled when `establishLeadership` completes before we respond to RPC calls. `establishLeadership` is gated by a `raft.Barrier` which ensures that all prior raft logs have been applied. Conversely, the gate is disabled when leadership is lost. This is very much inspired by https://github.com/hashicorp/consul/pull/3154/files
@@ -289,6 +289,8 @@ func (s *Server) establishLeadership(stopCh chan struct{}) error { | |||
return err | |||
} | |||
|
|||
s.setConsistentReadReady() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we need to wait till the end of this method? This method is called after the barrier write, so any reason not to toggle this at the beginning of this method rather than wait for all the other housekeeping that happens in this method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was being defensive for fear of an RPC handling accessing/mutating brokers before they are enabled. I'd hope that establishing leadership is a quick call and happens rarely such that optimizing to find exact ideal place to set consistent readiness isn't worth it. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah agreed that its a minor optimization, lets leave it as is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had one question, overall approach LGTM
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
Here, we ensure that when leader only responds to RPC calls when state
store is up to date. At leadership transition or launch with restored
state, the server local store might not be caught up with latest raft
logs and may return a stale read.
The solution here is to have an RPC consistency read gate, enabled when
establishLeadership
completes before we respond to RPC calls.establishLeadership
is gated by araft.Barrier
which ensures thatall prior raft logs have been applied.
Conversely, the gate is disabled when leadership is lost.
This fixes the underlying issue in #5906 .
This is very much inspired by https://github.com/hashicorp/consul/pull/3154/files