stability: real-world testing of proposal quota #8659

Closed
petermattis opened this issue Aug 18, 2016 · 14 comments

@petermattis
Collaborator

petermattis commented Aug 18, 2016

From discussion spawned out of #8639, we need to introduce a flow control mechanism for admitting write operations to replicas. If writes are being applied to a replica sufficiently fast, the raft log might be growing faster than we can generate and apply a snapshot. If that situation arises we'll get a loop of continuous snapshot generation and application which is a drain on the system (and, in effect, throttles all writes). Adjusting the Raft log truncation heuristics (again) is not sufficient as applying a sufficiently large chunk of Raft log entries is slower than using a snapshot.

One idea for a flow control mechanism is to throttle incoming write operations based on the size of the Raft log. A small Raft log indicates that the replicas are all keeping up. As the Raft log grows closer to its target max size (currently the replica size) we would want to throttle writes. I haven't thought of a specific heuristic to use, but am thinking we'd want something that incorporated the excess "raft log capacity" (the delta between the current raft log size and its target max size).
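To make the "excess raft log capacity" idea concrete, here is a minimal Go sketch. The comment above does not settle on a heuristic, so everything here is an assumption for illustration: the function name `throttleDelay`, the half-capacity threshold, and the linear ramp are hypothetical and not taken from the CockroachDB code.

```go
package main

import (
	"fmt"
	"time"
)

// throttleDelay returns how long to hold back an incoming write given the
// current Raft log size and its target max size (both in bytes).
//
// Hypothetical heuristic (not from the issue): no delay while at most half
// of the capacity is used, then a delay that grows linearly up to maxDelay
// as the excess capacity shrinks to zero.
func throttleDelay(logSize, targetMax int64, maxDelay time.Duration) time.Duration {
	if targetMax <= 0 {
		return 0 // no target configured, nothing to throttle against
	}
	threshold := targetMax / 2
	switch {
	case logSize <= threshold:
		return 0
	case logSize >= targetMax:
		return maxDelay
	}
	// Fraction of the throttled region (threshold..targetMax) already used.
	used := float64(logSize-threshold) / float64(targetMax-threshold)
	return time.Duration(used * float64(maxDelay))
}

func main() {
	const targetMax = 64 << 20 // assumed 64 MiB target, for illustration only
	for _, size := range []int64{8 << 20, 40 << 20, 56 << 20, 64 << 20} {
		delay := throttleDelay(size, targetMax, 100*time.Millisecond)
		fmt.Printf("log=%d MiB delay=%v\n", size>>20, delay)
	}
}
```

A linear ramp is only one possible shape; the point is that the delay is driven by the remaining capacity rather than by the absolute log size.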

Cc @cockroachdb/stability

@petermattis
Collaborator Author

I have a proof of concept for flow control, but it leaves a lot to be desired. The gist of the approach is to maintain a proposal quota pool that we "acquire" quota from before submitting a proposal and "release" quota back to when all of the live replicas have indicated they have committed the proposal. In addition to being fragile, this approach introduces performance blips when a follower node goes down. Without this flow control, when a follower goes down the range is unaffected. With the flow control turned on, when a follower goes down we quickly run out of quota and block new operations on the range until the node is marked dead. I haven't even plumbed through the liveness stuff, but already I'm not happy with this approach.
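The acquire/release pattern described above can be sketched in a few lines of Go. This is not the proof of concept itself: the `quotaPool` type and its methods are hypothetical names, and a buffered channel is just one simple way to model a fixed pool.

```go
package main

import "fmt"

// quotaPool is a hypothetical fixed-size pool of proposal quota, modeled as
// a buffered channel that holds one token per unit of quota.
type quotaPool struct {
	tokens chan struct{}
}

// newQuotaPool creates a pool that starts out full.
func newQuotaPool(size int) *quotaPool {
	p := &quotaPool{tokens: make(chan struct{}, size)}
	for i := 0; i < size; i++ {
		p.tokens <- struct{}{}
	}
	return p
}

// acquire blocks until a unit of quota is available.
func (p *quotaPool) acquire() { <-p.tokens }

// tryAcquire takes a unit of quota if one is available, without blocking.
func (p *quotaPool) tryAcquire() bool {
	select {
	case <-p.tokens:
		return true
	default:
		return false
	}
}

// release returns one unit of quota; releases beyond the pool size are dropped.
func (p *quotaPool) release() {
	select {
	case p.tokens <- struct{}{}:
	default:
	}
}

func main() {
	pool := newQuotaPool(2)
	pool.acquire() // first proposal
	pool.acquire() // second proposal
	fmt.Println("third proposal admitted:", pool.tryAcquire())
	pool.release() // a proposal was committed by all live replicas
	fmt.Println("after release:", pool.tryAcquire())
}
```

The sketch also shows the failure mode mentioned above: if `release` stops being called because a follower is down, `acquire` blocks every new proposal once the pool is drained.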

@petermattis
Collaborator Author

Cc @spencerkimball

@spencerkimball
Member

spencerkimball commented Feb 27, 2017 via email

@petermattis
Collaborator Author

> Let's start a new issue for the flow control.

Eh? That's what this issue is (or intended to be).

@petermattis
Collaborator Author

A different approach would be something more along the lines of what was mentioned in the original message: use a rate.Limiter and dynamically adjust the rate based on the untruncated size of the Raft log. I foresee difficulties in achieving a stable rate limit with this approach.

@spencerkimball
Member

Yeah, I tried the rate.Limiter without getting any good results the last time I looked at this problem (when we were trying to handle slow recoveries pre-delete-range support). The biggest issue with it is setting the natural rate, which is definitely not obvious.

@petermattis
Collaborator Author

I was thinking of adjusting the rate limit in raftLogQueue. That is, we would lower the rate limit as the un-truncated size of the Raft log grows and raise it when the Raft log is truncated. Might have to match this up with a moving average of the actual achieved rate over the previous few seconds.
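One way to picture this adjustment is the Go sketch below. It is only an illustration under stated assumptions: `ewma` and `nextLimit` are hypothetical helpers, the 0.5x to 1.5x scaling is an arbitrary choice, and the result would still have to be fed into whatever limiter is actually used (for example a `rate.Limiter`), which is not shown.

```go
package main

import "fmt"

// ewma folds a new sample of the achieved write rate into a moving average.
// alpha in (0, 1] controls how quickly old samples are forgotten.
func ewma(prev, sample, alpha float64) float64 {
	return alpha*sample + (1-alpha)*prev
}

// nextLimit derives a new rate limit (ops/sec) from the recently achieved
// rate and the untruncated Raft log size.
//
// Hypothetical policy: scale the achieved rate by a factor between 0.5 (log
// at or above its target max size) and 1.5 (log empty), never dropping
// below minRate.
func nextLimit(achieved float64, logSize, targetMax int64, minRate float64) float64 {
	free := 1.0 // fraction of the log capacity that is still unused
	if targetMax > 0 {
		free = 1 - float64(logSize)/float64(targetMax)
	}
	if free < 0 {
		free = 0
	}
	limit := achieved * (0.5 + free)
	if limit < minRate {
		limit = minRate
	}
	return limit
}

func main() {
	avg := 1000.0 // assumed starting average, ops/sec
	for _, sample := range []float64{1200, 900, 1100} {
		avg = ewma(avg, sample, 0.3)
	}
	const targetMax = 64 << 20 // assumed target, for illustration only
	fmt.Printf("log nearly empty: %.0f ops/sec\n", nextLimit(avg, 1<<20, targetMax, 10))
	fmt.Printf("log nearly full:  %.0f ops/sec\n", nextLimit(avg, 60<<20, targetMax, 10))
}
```

Anchoring the limit to the achieved rate sidesteps the question of a "natural rate" raised earlier in the thread, but whether it converges to a stable limit in practice is exactly the open concern.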

petermattis added a commit to petermattis/cockroach that referenced this issue Mar 1, 2017
The leader maintains a pool of "proposal quota". Before proposing a Raft
command, we acquire 1 unit of proposal quota. When all of the active
followers have committed an entry, that unit of proposal quota is
returned to the pool. The proposal quota pool size is hard coded to 1000
which allows fairly deep pipelining of Raft commands. Non-leaders are
given an infinite quota pool because there is no mechanism for them to
return quota to their pool.

We only consider "active" followers when determining if a unit of quota
should be returned to the pool. An active follower is one we've received
any type of message from in the past 2 seconds.

See cockroachdb#8659
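The "active follower" rule in the commit message can be sketched as follows. This is a hedged illustration, not the code from the referenced commit: the `follower` struct, its field names, and `releasableIndex` are hypothetical, and returning `ok == false` when no follower is active is an assumption about a case the message does not cover.

```go
package main

import (
	"fmt"
	"time"
)

// activeWindow mirrors the 2-second window from the commit message.
const activeWindow = 2 * time.Second

// follower is a hypothetical view of what the leader knows about a follower.
type follower struct {
	ackedIndex uint64        // highest log index the follower has acknowledged
	sinceLast  time.Duration // time since any message arrived from it
}

// releasableIndex returns the highest log index reached by every active
// follower; quota for entries at or below it could go back to the pool.
// ok is false when no follower is active.
func releasableIndex(followers []follower) (index uint64, ok bool) {
	for _, f := range followers {
		if f.sinceLast > activeWindow {
			continue // inactive followers do not hold back quota
		}
		if !ok || f.ackedIndex < index {
			index, ok = f.ackedIndex, true
		}
	}
	return index, ok
}

func main() {
	followers := []follower{
		{ackedIndex: 120, sinceLast: 100 * time.Millisecond},
		{ackedIndex: 95, sinceLast: 500 * time.Millisecond},
		{ackedIndex: 40, sinceLast: 10 * time.Second}, // silent, so ignored
	}
	index, ok := releasableIndex(followers)
	fmt.Println("release quota up to index:", index, "ok:", ok)
}
```

Ignoring silent followers is what keeps a downed node from draining the pool for longer than the activity window.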
petermattis added a commit to petermattis/cockroach that referenced this issue Mar 2, 2017
The leader maintains a pool of "proposal quota". Before proposing a Raft
command, we acquire 1 unit of proposal quota. When all of the active
followers have committed an entry, that unit of proposal quota is
returned to the pool. The proposal quota pool size is hard coded to 1000
which allows fairly deep pipelining of Raft commands.

We only consider "active" followers when determining if a unit of quota
should be returned to the pool. An active follower is one we've received
any type of message from in the past 2 seconds.

See cockroachdb#8659
petermattis added a commit to petermattis/cockroach that referenced this issue Mar 6, 2017
petermattis added a commit to petermattis/cockroach that referenced this issue Mar 6, 2017
petermattis added a commit to petermattis/cockroach that referenced this issue Mar 27, 2017
@spencerkimball
Member

@petermattis ping.

@petermattis
Collaborator Author

I'll clean up the existing PR this week.

@petermattis
Collaborator Author

I never got this done due to other work taking precedence. Given that we don't have a known workload that would benefit from this, I'm moving it to Later with the expectation that we'll address it very early in the next release cycle.

@petermattis petermattis added this to the Later milestone Apr 13, 2017
@petermattis petermattis removed this from the 1.0 milestone Apr 13, 2017
@petermattis petermattis modified the milestones: 1.1, Later Apr 24, 2017
@petermattis
Collaborator Author

@irfansharif Is there anything left to do here given #15802?

@tbg
Member

tbg commented Jun 6, 2017

For one, we have zero integration testing.

@irfansharif
Contributor

No, but I want to run some more end-to-end tests demonstrating the behavior; I was going to use this as an umbrella issue for that.

@petermattis petermattis removed their assignment Jun 6, 2017
@tbg tbg changed the title from "stability: need flow control mechanism for throttling replica operations" to "stability: real-world testing of proposal quota" Jul 10, 2017
@petermattis petermattis modified the milestones: 1.1, 1.2 Sep 26, 2017
@bdarnell
Contributor

bdarnell commented Feb 8, 2018

We've done more testing of this feature since it shipped in 1.1; I don't think there's anything else in the works for this issue.

@bdarnell bdarnell closed this as completed Feb 8, 2018
5 participants