Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Block rpc handling until state store is caught up #5911

Merged
merged 1 commit into from
Aug 20, 2019

Conversation

notnoop
Copy link
Contributor

@notnoop notnoop commented Jul 2, 2019

Here, we ensure that when leader only responds to RPC calls when state
store is up to date. At leadership transition or launch with restored
state, the server local store might not be caught up with latest raft
logs and may return a stale read.

The solution here is to have an RPC consistency read gate, enabled when
establishLeadership completes before we respond to RPC calls.
establishLeadership is gated by a raft.Barrier which ensures that
all prior raft logs have been applied.

Conversely, the gate is disabled when leadership is lost.

This fixes the underlying issue in #5906 .

This is very much inspired by https://github.com/hashicorp/consul/pull/3154/files

Here, we ensure that when leader only responds to RPC calls when state
store is up to date.  At leadership transition or launch with restored
state, the server local store might not be caught up with latest raft
logs and may return a stale read.

The solution here is to have an RPC consistency read gate, enabled when
`establishLeadership` completes before we respond to RPC calls.
`establishLeadership` is gated by a `raft.Barrier` which ensures that
all prior raft logs have been applied.

Conversely, the gate is disabled when leadership is lost.

This is very much inspired by https://github.com/hashicorp/consul/pull/3154/files
@notnoop notnoop added the 0.10.0 label Jul 2, 2019
@notnoop notnoop requested a review from preetapan July 2, 2019 10:39
@preetapan preetapan added this to the 0.10.0 milestone Jul 24, 2019
@preetapan preetapan removed the 0.10.0 label Jul 25, 2019
@@ -289,6 +289,8 @@ func (s *Server) establishLeadership(stopCh chan struct{}) error {
return err
}

s.setConsistentReadReady()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we need to wait till the end of this method? This method is called after the barrier write, so any reason not to toggle this at the beginning of this method rather than wait for all the other housekeeping that happens in this method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was being defensive for fear of an RPC handling accessing/mutating brokers before they are enabled. I'd hope that establishing leadership is a quick call and happens rarely such that optimizing to find exact ideal place to set consistent readiness isn't worth it. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah agreed that its a minor optimization, lets leave it as is

Copy link
Contributor

@preetapan preetapan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had one question, overall approach LGTM

@github-actions
Copy link

github-actions bot commented Feb 4, 2023

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 4, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants