Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv: batch ranged intent resolution #46952

Merged

Conversation

nvanbenschoten
Copy link
Member

Fixes #46752.
Resolves the recent perf regression on TPC-C.

This commit follows in the footsteps of #34803 and introduces batching for ranged intent resolution, where previously only point intent resolution was batched. As we found in #46752, this is more important than it has been in the past, because implicit SELECT FOR UPDATE acquires a ranged lock on each row that it updates instead of a single-key lock.

The change addresses this by adding a third request batcher to IntentResolver. ResolveIntent requests and ResolveIntentRange requests are batched separately, which is necessary for the use of MaxSpanRequestKeys to work properly (in fact, to be accepted by DistSender at all).

To accommodate the ranged nature of ResolveIntentRange request, the notion of pagination is introduced into RequestBatcher. A MaxKeysPerBatchReq option is added to the configuration of a RequestBatcher, which corresponds to the MaxSpanRequestKeys value set on each BatchRequest header. The RequestBatcher is then taught about request pagination and how to work with partial responses. The semantics at play here are clarified in an earlier commit in the PR.

Release justification: important fix to avoid a performance regression when implicit SELECT FOR UPDATE is enabled. All commits except the last are testing-only. The last commit is subtle but small and well-tested.

@andreimatei: I assigned you because I think you know the most about MaxSpanRequestKeys. I'm mostly interested to get your input on the "rationalize Header.MaxSpanRequestKeys" commit (testing + comments only).

@nvanbenschoten nvanbenschoten added the release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. label Apr 3, 2020
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm_strong: nice

Reviewed 1 of 1 files at r1, 3 of 3 files at r2, 3 of 3 files at r3, 3 of 3 files at r4, 4 of 4 files at r5, 4 of 4 files at r6.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andreimatei and @nvanbenschoten)


pkg/internal/client/requestbatcher/batcher_test.go, line 32 at r6 (raw file):

type batchResp struct {
	// TODO(ajwerner): we never actually test that this result is what we expect

Fair.


pkg/internal/client/requestbatcher/batcher_test.go, line 434 at r6 (raw file):

}

// TestMaxKeysPerBatchReq exercises the RequestBatcher when it is configured to

This test is nice.

Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r5 LGTM. God's work.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)


pkg/roachpb/api.proto, line 1833 at r5 (raw file):

  // Overlapping requests
  //
  // The spans accessed by the requests are allowed overlap. However, if any

allowed to overlap


pkg/roachpb/api.proto, line 1837 at r5 (raw file):

  // responses in the corresponding BatchResponse. If no requests overlap, then
  // only up to one request will return a partial result. Additionally, if two
  // requests touch the same key, it is double counted towards the key limit.

wanna put some words here about how the distsender split factors in?


pkg/roachpb/api.proto, line 1843 at r5 (raw file):

  // The spans accessed by requests do not need to be in sorted order. However,
  // if the requests are not in sorted order (e.g. increasing key order for
  // Scans and other forward requests, decreasing key order for ReverseScans),

wanna say that mixing forward and reverse scans is not permitted when the limit is used?
Also, do you know why it isn't permitted?


pkg/roachpb/api.proto, line 1854 at r5 (raw file):

  // If a limit is provided, the batch can contain only the following range
  // request types:
  // - ScanRequest

If this is true, please put a nod to it at the beginning of the comment where you say that this param applies to range requests. Or cut "ranged" from that sentence.

Before this change, `var br *roachpb.BatchResponse` was being forced
onto the heap:
```
$ goescape . | grep moved
./batcher.go:262:6: moved to heap: br
```

By moving the closures into the RunWorker function, we avoid the
allocation:
```
$ goescape . | grep moved | wc -l
0
```

Release justification: probably none. I'll sit on this if it's alone.
Previously, the only way to access this functionality outside of
the roachpb package was through the BatchResponse.Combine method.
There doesn't seem to be a strong reason for this encapsulation,
given that Response is an exported part of the roachpb interface.
This has long-since been replaced by `t.Helper()`.
This uses of this have long-since been replaceable by `t.Helper()`.
…uests

This commit improves the contract around MaxSpanRequestKeys and its interaction
with overlapping and unsorted requests. Instead of saying that only sorted,
non-overlapping requests were allowed in a batch with a limit (which was not
being respected by users of the KV API), we now discuss what callers should
expect if they cannot provide one or both of these guarantees.

The commit then improves the testing around this area.
@nvanbenschoten nvanbenschoten force-pushed the nvanbenschoten/resRangeIntent branch from 801f5c1 to c1d34fe Compare April 3, 2020 23:24
Copy link
Member Author

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!

bors r+

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andreimatei)


pkg/roachpb/api.proto, line 1833 at r5 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

allowed to overlap

Done.


pkg/roachpb/api.proto, line 1837 at r5 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

wanna put some words here about how the distsender split factors in?

Done.


pkg/roachpb/api.proto, line 1843 at r5 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

wanna say that mixing forward and reverse scans is not permitted when the limit is used?
Also, do you know why it isn't permitted?

I'm guessing they're not permitted because it's hard to interpret their results, but essentially that's because mixing them causes the request to be unorderable. I added a bit about this.


pkg/roachpb/api.proto, line 1854 at r5 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

If this is true, please put a nod to it at the beginning of the comment where you say that this param applies to range requests. Or cut "ranged" from that sentence.

Done.

@craig
Copy link
Contributor

craig bot commented Apr 3, 2020

Build failed

Fixes cockroachdb#46752.
Resolves the recent perf regression on TPC-C.

This commit follows in the footsteps of cockroachdb#34803 and introduces batching for
ranged intent resolution, where previously only point intent resolution was
batched. As we found in cockroachdb#46752, this is more important than it has been in
the past, because implicit SELECT FOR UPDATE acquires a ranged lock on each
row that it updates instead of a single-key lock.

The change addresses this by adding a third request batcher to IntentResolver.
ResolveIntent requests and ResolveIntentRange requests are batched separately,
which is necessary for the use of MaxSpanRequestKeys to work properly (in fact,
to be accepted by DistSender at all).

To accommodate the ranged nature of ResolveIntentRange request, the notion of
pagination is introduced into RequestBatcher. A MaxKeysPerBatchReq option is
added to the configuration of a RequestBatcher, which corresponds to the
MaxSpanRequestKeys value set on each BatchRequest header. The RequestBatcher is
then taught about request pagination and how to work with partial responses. See
the previous commit for clarification about the semantics at play here.

Release justification: important fix to avoid a performance regression when
implicit SELECT FOR UPDATE is enabled.
@nvanbenschoten nvanbenschoten force-pushed the nvanbenschoten/resRangeIntent branch from c1d34fe to cf11645 Compare April 4, 2020 01:06
@nvanbenschoten
Copy link
Member Author

Made a "fundemental" misspelling of the word fundamental.

bors r+

@craig
Copy link
Contributor

craig bot commented Apr 4, 2020

Build failed (retrying...)

@nvanbenschoten
Copy link
Member Author

bors r+

@craig
Copy link
Contributor

craig bot commented Apr 4, 2020

Build succeeded

@craig craig bot merged commit 82fec00 into cockroachdb:master Apr 4, 2020
@nvanbenschoten nvanbenschoten deleted the nvanbenschoten/resRangeIntent branch April 4, 2020 15:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kv: ranged intent resolution is not batched, regresses performance on TPC-C
4 participants