[wip] kvcoord: improve truncate() and next() #68564

jordanlewis · 2021-08-06T21:39:36Z

Touches #68536.

Previously, the way the DistSender invoked truncate(), next(), and
prev() was quadratic: each invocation of these functions required
iterating over the entire list of requests in a batch request, to either
find a start key or to find all requests that intersect with a given
range.

However, we can do better: we can process the batches sequentially and
keep track of which parts of the requests' spans we've "chopped off" and
used already. By doing this, we can skip doing extra work for those
already-used span portions.

The loop in this case will remain quadratic, but it's kind of amortized,
since we reduce the set of things to check on every iteration.

There could be a better way of doing this still (Yahor suggested an
interval tree) but I'm not quite sure, and it's complicated.

Release note: None

Previously, the way the DistSender invoked truncate(), next(), and prev() was quadratic: each invocation of these functions required iterating over the entire list of requests in a batch request, to either find a start key or to find all requests that intersect with a given range. However, we can do better: we can process the batches sequentially and keep track of which parts of the requests' spans we've "chopped off" and used already. By doing this, we can skip doing extra work for those already-used span portions. The loop in this case will remain quadratic, but it's kind of amortized, since we reduce the set of things to check on every iteration. There could be a better way of doing this still (Yahor suggested an interval tree) but I'm not quite sure, and it's complicated. Release note: None

cockroach-teamcity · 2021-08-06T21:39:45Z

This change is

jordanlewis · 2021-08-06T21:41:36Z

Note: this is not working yet, it is definitely still buggy.

Also, it's just one idea to fix the issue and may not be optimal. I think an approach like this would solve much of the wasted CPU, yet it doesn't truly reduce the complexity of the algorithm in a way that would be ideal, since we're not sorting the requests, or using a tree, etc - we're just marking ones that have been consumed and skipping work that has already been known to be used up.

Finally, we had to remove some concurrency - it used to be the case that truncate itself could be run in the async task, saving some serial work. But doing this makes it very hard to coordinate the work so that ranges can skip iterating through things that have been done already. So, we've moved out the truncate into the serial component for now.

yuzefovich · 2022-06-15T04:54:41Z

I have a solution that sorts the requests in #82865, so I think this PR should be closed now.

yuzefovich · 2022-07-14T20:25:10Z

Overtaken by #82865.

jordanlewis requested review from yuzefovich and nvanbenschoten August 6, 2021 21:39

jordanlewis requested a review from a team as a code owner August 6, 2021 21:39

tbg removed the request for review from a team September 15, 2021 08:19

jordanlewis marked this pull request as draft October 8, 2021 22:56

nvanbenschoten requested a review from irfansharif October 20, 2021 00:06

irfansharif removed their request for review June 20, 2022 20:54

yuzefovich closed this Jul 14, 2022

jordanlewis deleted the fix-truncate-and-next branch July 17, 2022 05:27

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[wip] kvcoord: improve truncate() and next() #68564

[wip] kvcoord: improve truncate() and next() #68564

jordanlewis commented Aug 6, 2021

cockroach-teamcity commented Aug 6, 2021

jordanlewis commented Aug 6, 2021

yuzefovich commented Jun 15, 2022

yuzefovich commented Jul 14, 2022

[wip] kvcoord: improve truncate() and next() #68564

[wip] kvcoord: improve truncate() and next() #68564

Conversation

jordanlewis commented Aug 6, 2021

cockroach-teamcity commented Aug 6, 2021

jordanlewis commented Aug 6, 2021

yuzefovich commented Jun 15, 2022

yuzefovich commented Jul 14, 2022