stability: real-world testing of proposal quota #8659

petermattis · 2016-08-18T17:58:42Z

From discussion spawned out of #8639, we need to introduce a flow control mechanism for admitting write operations to replicas. If writes are being applied to a replica sufficiently fast, the raft log might be growing faster than we can generate and apply a snapshot. If that situation arises we'll get a loop of continuous snapshot generation and application which is a drain on the system (and, in effect, throttles all writes). Adjusting the Raft log truncation heuristics (again) is not sufficient as applying a sufficiently large chunk of Raft log entries is slower than using a snapshot.

One idea for a flow control mechanism is to throttle incoming write operations based on the size of the Raft log. A small Raft log indicates that the replicas are all keeping up. As the Raft log grows closer to its target max size (currently the replica size) we would want to throttle writes. I haven't thought of a specific heuristic to use, but am thinking we'd want something that incorporated the excess "raft log capacity" (the delta between the current raft log size and its target max size).
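To make the "excess raft log capacity" idea concrete, here is a minimal sketch of one possible heuristic. The function name `throttleFraction`, the "start throttling once half the capacity is used" threshold, and the linear ramp are all assumptions made for illustration; the thread does not settle on a specific heuristic.

```go
package main

import "fmt"

// throttleFraction maps the Raft log size to a throttling intensity in [0, 1].
// Hypothetical heuristic (not from the thread): no throttling while at least
// half of the target max size is still free, then a linear ramp up to full
// throttling as the excess capacity shrinks to zero.
func throttleFraction(logSize, targetMax int64) float64 {
	if targetMax <= 0 {
		return 0 // no target configured; nothing to throttle against
	}
	excess := targetMax - logSize // remaining "raft log capacity"
	if excess <= 0 {
		return 1
	}
	half := targetMax / 2
	if excess >= half {
		return 0
	}
	return float64(half-excess) / float64(half)
}

func main() {
	const targetMax = 64 << 20 // assumed 64 MiB target, for illustration only
	for _, size := range []int64{0, 32 << 20, 48 << 20, 64 << 20} {
		fmt.Printf("log=%d MiB throttle=%.2f\n", size>>20, throttleFraction(size, targetMax))
	}
}
```

A caller could turn the returned fraction into a per-write delay or a reduced admission rate; which of those works better is exactly the open question in this issue.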

Cc @cockroachdb/stability

petermattis · 2017-02-27T20:34:43Z

I have a proof of concept for flow control, but it leaves a lot to be desired. The gist of the approach is to maintain a proposal quota pool that we "acquire" quota from before submitting a proposal and "release" quota back to when all of the live replicas have indicated they have committed the proposal. In addition to being fragile, this approach introduces performance blips when a follower node goes down. Without this flow control, when a follower goes down the range is unaffected. With the flow control turned on, when a follower goes down we quickly run out of quota and block new operations on the range until the node is marked dead. I haven't even plumbed through the liveness stuff, but already I'm not happy with this approach.
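The acquire/release pattern described above can be sketched as a counting semaphore. This is not the proof-of-concept code; `quotaPool`, `newQuotaPool`, `acquire`, `tryAcquire`, `release`, and `available` are hypothetical names, and the buffered-channel design is just one simple way to get the blocking behavior.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// quotaPool is a hypothetical counting semaphore backed by a buffered channel.
type quotaPool struct {
	tokens chan struct{}
}

func newQuotaPool(size int) *quotaPool {
	p := &quotaPool{tokens: make(chan struct{}, size)}
	for i := 0; i < size; i++ {
		p.tokens <- struct{}{}
	}
	return p
}

// acquire blocks until one unit of quota is available or ctx is done.
func (p *quotaPool) acquire(ctx context.Context) error {
	select {
	case <-p.tokens:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// tryAcquire takes one unit of quota without blocking.
func (p *quotaPool) tryAcquire() bool {
	select {
	case <-p.tokens:
		return true
	default:
		return false
	}
}

// release returns one unit of quota, e.g. once all live replicas have
// committed the proposal. Releases beyond the pool size are dropped.
func (p *quotaPool) release() {
	select {
	case p.tokens <- struct{}{}:
	default:
	}
}

func (p *quotaPool) available() int { return len(p.tokens) }

func main() {
	p := newQuotaPool(2)
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()

	fmt.Println(p.acquire(ctx), p.acquire(ctx)) // quota is available for both
	// Nothing releases quota here, which mimics a follower that is down:
	// the third proposal blocks until the context expires.
	fmt.Println(p.acquire(ctx))
	p.release()
	fmt.Println(p.available())
}
```

The third `acquire` in `main` shows the performance blip in miniature: once the pool is drained and no replica acknowledgements arrive, new proposals wait until something else (here a timeout, in the real system a liveness decision) intervenes.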

petermattis · 2017-02-27T20:34:52Z

Cc @spencerkimball

spencerkimball · 2017-02-27T20:38:54Z

Let's start a new issue for the flow control.

petermattis · 2017-02-27T20:40:59Z

> Let's start a new issue for the flow control.

Eh? That's what this issue is (or intended to be).

petermattis · 2017-02-27T21:26:35Z

A different approach would be something more along the lines of what was mentioned in the original message: use a rate.Limiter and dynamically adjust the rate based on the untruncated size of the Raft log. I foresee difficulties in achieving a stable rate limit with this approach.

spencerkimball · 2017-02-27T21:32:51Z

Yeah, I tried the rate.Limiter without getting any good results the last time I looked at this problem (when we were trying to handle slow recoveries pre-delete-range support). The biggest issue with it is setting the natural rate, which is definitely not obvious.

petermattis · 2017-02-27T21:38:48Z

I was thinking of adjusting the rate limit in raftLogQueue. That is, we would lower the rate limit as the un-truncated size of the Raft log grows and raise it when the Raft log is truncated. Might have to match this up with a moving average of the actual achieved rate over the previous few seconds.
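A rough sketch of that idea follows, under several assumptions that are not in the thread: the `rateController` type, the exponentially weighted moving average, and the rule "allow up to twice the achieved rate when the log is empty, hold the achieved rate when it is half full, fall to a floor when it is full" are all invented for illustration. The computed value could be applied with `SetLimit` on a `rate.Limiter` from `golang.org/x/time/rate`; that call is left out so the sketch stays standard-library only.

```go
package main

import "fmt"

// rateController is a hypothetical helper that derives a write rate limit
// from the untruncated Raft log size and a moving average of achieved rate.
type rateController struct {
	minRate  float64 // floor, so writes never stall completely
	maxRate  float64 // ceiling on the limit
	alpha    float64 // EWMA weight given to the newest observation
	achieved float64 // moving average of the achieved rate (ops/sec)
}

// observe folds a newly measured achieved rate into the moving average.
func (c *rateController) observe(rate float64) {
	c.achieved = c.alpha*rate + (1-c.alpha)*c.achieved
}

func clamp(v, lo, hi float64) float64 {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

// limit lowers the rate as the log grows and raises it after truncation.
func (c *rateController) limit(logSize, targetMax int64) float64 {
	if targetMax <= 0 {
		return c.maxRate
	}
	free := clamp(1-float64(logSize)/float64(targetMax), 0, 1)
	return clamp(2*free*c.achieved, c.minRate, c.maxRate)
}

func main() {
	c := &rateController{minRate: 10, maxRate: 1000, alpha: 0.5, achieved: 100}
	for _, size := range []int64{0, 50, 100} {
		fmt.Printf("logSize=%d limit=%.0f ops/sec\n", size, c.limit(size, 100))
	}
}
```

Whether a controller like this settles on a stable rate is the difficulty raised earlier in the thread; the sketch only shows the shape of the computation and makes no claim that it converges.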

The leader maintains a pool of "proposal quota". Before proposing a Raft command, we acquire 1 unit of proposal quota. When all of the active followers have committed an entry, that unit of proposal quota is returned to the pool. The proposal quota pool size is hard coded to 1000 which allows fairly deep pipelining of Raft commands. Non-leaders are given an infinite quota pool because there is no mechanism for them to return quota to their pool. We only consider "active" followers when determining if a unit of quota should be returned to the pool. An active follower is one we've received any type of message from in the past 2 seconds. See cockroachdb#8659
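The "active follower" rule can be sketched as follows. This is an illustrative reading of the description above, not the code from the referenced change: the `follower` type, its fields, and `releasableIndex` are hypothetical names, and returning `ok == false` when no follower is active is an assumption about a case the description does not cover.

```go
package main

import (
	"fmt"
	"time"
)

// follower is a hypothetical view of what the leader tracks per follower.
type follower struct {
	committed uint64        // highest log index this follower has committed
	idle      time.Duration // time since the leader last received any message from it
}

// releasableIndex returns the highest log index that every active follower
// has committed. Quota held by entries at or below that index can be
// returned to the pool. Followers idle for activeWindow or longer are ignored.
// ok is false when no follower is active.
func releasableIndex(followers []follower, activeWindow time.Duration) (idx uint64, ok bool) {
	for _, f := range followers {
		if f.idle >= activeWindow {
			continue // inactive: does not hold back quota
		}
		if !ok || f.committed < idx {
			idx = f.committed
		}
		ok = true
	}
	return idx, ok
}

func main() {
	followers := []follower{
		{committed: 10, idle: 100 * time.Millisecond},
		{committed: 7, idle: time.Second},
		{committed: 2, idle: 5 * time.Second}, // silent for too long; ignored
	}
	idx, ok := releasableIndex(followers, 2*time.Second)
	fmt.Println(idx, ok)
}
```

Ignoring the silent follower is what keeps a down node from pinning quota indefinitely: in the example, the follower stuck at index 2 no longer limits how much quota is returned.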

spencerkimball · 2017-04-02T19:08:14Z

@petermattis ping.

petermattis · 2017-04-02T19:19:21Z

I'll clean up the existing PR this week.

petermattis · 2017-04-13T13:37:00Z

I never got this done due to other work taking precedence. Given that we don't have a known workload that would benefit from this, I'm moving it to Later with the expectation that we'll address it very early in the next release cycle.

petermattis · 2017-06-06T14:37:59Z

@irfansharif Is there anything left to do here given #15802?

tbg · 2017-06-06T14:43:14Z

For one, we have zero integration testing.

irfansharif · 2017-06-06T14:43:48Z

No, but I want to run some more end-to-end tests demonstrating the behavior; I was going to use this as an umbrella issue for that.

bdarnell · 2018-02-08T19:25:13Z

We've done more testing of this feature since it shipped in 1.1; I don't think there's anything else in the works for this issue.

petermattis added this to the Q3 milestone Aug 18, 2016

petermattis self-assigned this Aug 18, 2016

petermattis assigned spencerkimball and unassigned petermattis Sep 15, 2016

petermattis mentioned this issue Jan 9, 2017

stability: tail of under-replicated ranges lasts too long #11984

Closed

petermattis mentioned this issue Feb 24, 2017

stability: investigate indigo issues #13687

Closed

petermattis assigned petermattis and unassigned spencerkimball Feb 24, 2017

petermattis modified the milestones: 1.0, Q3 Feb 24, 2017

petermattis added this to the Later milestone Apr 13, 2017

petermattis removed this from the 1.0 milestone Apr 13, 2017

petermattis modified the milestones: 1.1, Later Apr 24, 2017

petermattis assigned irfansharif May 1, 2017

petermattis removed their assignment Jun 6, 2017

tbg changed the title ~~stability: need flow control mechanism for throttling replica operations~~ stability: real-world testing of proposal quota Jul 10, 2017

petermattis unassigned irfansharif Sep 26, 2017

petermattis modified the milestones: 1.1, 1.2 Sep 26, 2017

bdarnell closed this as completed Feb 8, 2018
