stability: real-world testing of proposal quota #8659
Comments
I have a proof of concept for flow control, but it leaves a lot to be desired. The gist of the approach is to maintain a proposal quota pool that we "acquire" quota from before submitting a proposal and "release" quota back to when all of the live replicas have indicated they have committed the proposal. In addition to being fragile, this approach introduces performance blips when a follower node goes down. Without this flow control, when a follower goes down the range is unaffected. With the flow control turned on, when a follower goes down we quickly run out of quota and block new operations on the range until the node is marked dead. I haven't even plumbed through the liveness stuff, but already I'm not happy with this approach.
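The acquire/release cycle described above can be sketched with a buffered channel acting as the pool. This is a hypothetical illustration, not the code from the actual PR; `quotaPool`, `acquire`, and `release` are invented names:

```go
package main

import "fmt"

// quotaPool sketches the proposal quota pool described above: a buffered
// channel holds the available units. acquire blocks when the pool is
// exhausted; release returns a unit once all live replicas have indicated
// they committed the proposal.
type quotaPool struct {
	quota chan struct{}
}

func newQuotaPool(size int) *quotaPool {
	qp := &quotaPool{quota: make(chan struct{}, size)}
	for i := 0; i < size; i++ {
		qp.quota <- struct{}{}
	}
	return qp
}

// acquire takes one unit of quota before submitting a proposal.
func (qp *quotaPool) acquire() { <-qp.quota }

// release returns one unit after the proposal is committed everywhere.
func (qp *quotaPool) release() { qp.quota <- struct{}{} }

func main() {
	qp := newQuotaPool(2)
	qp.acquire()
	qp.acquire()
	fmt.Println(len(qp.quota)) // 0: pool exhausted, the next acquire would block
	qp.release()
	fmt.Println(len(qp.quota)) // 1
}
```

This also shows the failure mode described above: if a follower stops committing, `release` is never called, the channel drains, and new proposals block on `acquire` until the follower is declared dead or removed from consideration.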
Let's start a new issue for the flow control.
…On Mon, Feb 27, 2017 at 12:34 PM Peter Mattis ***@***.***> wrote:
Cc @spencerkimball <https://github.com/spencerkimball>
Eh? That's what this issue is (or intended to be).
A different approach would be something more along the lines of what was mentioned in the original message: use a |
Yeah, I tried the |
I was thinking of adjusting the rate limit in |
The leader maintains a pool of "proposal quota". Before proposing a Raft command, we acquire 1 unit of proposal quota. When all of the active followers have committed an entry, that unit of proposal quota is returned to the pool. The proposal quota pool size is hard coded to 1000 which allows fairly deep pipelining of Raft commands. Non-leaders are given an infinite quota pool because there is no mechanism for them to return quota to their pool. We only consider "active" followers when determining if a unit of quota should be returned to the pool. An active follower is one we've received any type of message from in the past 2 seconds. See cockroachdb#8659
@petermattis ping.
I'll clean up the existing PR this week.
I never got this done due to other work taking precedence. Given that we don't have a known workload that would benefit from this, I'm moving to Later with the expectation that we'll address very early in the next release cycle.
@irfansharif Is there anything left to do here given #15802?
For one, we have zero integration testing.
No, but I want to run some more end-to-end tests demonstrating the behavior; I was going to use this as an umbrella issue for that.
We've done more testing of this feature since it shipped in 1.1; I don't think there's anything else in the works for this issue.
From discussion spawned out of #8639, we need to introduce a flow control mechanism for admitting write operations to replicas. If writes are being applied to a replica sufficiently fast, the raft log might be growing faster than we can generate and apply a snapshot. If that situation arises we'll get a loop of continuous snapshot generation and application which is a drain on the system (and, in effect, throttles all writes). Adjusting the Raft log truncation heuristics (again) is not sufficient as applying a sufficiently large chunk of Raft log entries is slower than using a snapshot.
One idea for a flow control mechanism is to throttle incoming write operations based on the size of the Raft log. A small Raft log indicates that the replicas are all keeping up. As the Raft log grows closer to its target max size (currently the replica size) we would want to throttle writes. I haven't thought of a specific heuristic to use, but am thinking we'd want something that incorporated the excess "raft log capacity" (the delta between the current raft log size and its target max size).
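One way to read that heuristic, as a rough sketch: scale a per-proposal delay by how much of the Raft log's target max size is already consumed. The linear ramp and the `maxDelayMillis` cap are assumptions for illustration, not a settled design:

```go
package main

import "fmt"

// proposalDelayMillis sketches the heuristic floated above. A small Raft
// log means followers are keeping up (no delay); as the log approaches
// its target max size, the delay grows toward maxDelayMillis. The linear
// ramp is an assumption; the actual heuristic was left open in the issue.
func proposalDelayMillis(logSize, targetMaxSize, maxDelayMillis int64) int64 {
	if logSize <= 0 || targetMaxSize <= 0 {
		return 0
	}
	frac := float64(logSize) / float64(targetMaxSize)
	if frac >= 1 {
		// Log at or beyond its target max size: full throttle.
		return maxDelayMillis
	}
	return int64(frac * float64(maxDelayMillis))
}

func main() {
	// Log at half its target max size: delay is half the maximum.
	fmt.Println(proposalDelayMillis(32<<20, 64<<20, 100)) // 50
	// Log past its target max size: clamp at the maximum delay.
	fmt.Println(proposalDelayMillis(80<<20, 64<<20, 100)) // 100
}
```

The "excess raft log capacity" mentioned above is just `targetMaxSize - logSize` here; a heuristic keyed directly off that delta would be an equivalent formulation.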
Cc @cockroachdb/stability