kvserver: record ScanStats once per BatchRequest #97442

yuzefovich · 2023-02-22T04:49:11Z

Previously, we would record a kvpb.ScanStats object into the trace for
each evaluated Get, Scan, and ReverseScan command. This was suboptimal
for two reasons:

- this required an allocation of that kvpb.ScanStats object
- this required propagating all of these separate objects via the
  tracing infrastructure which might make it so that the tracing limits
  are reached resulting in some objects being dropped.

This commit instead tracks the ScanStats at the BatchRequest level, so
we only need to record a single object per BatchRequest. This reduces
granularity, but that is still sufficient for SQL's needs, since SQL
simply aggregates all kvpb.ScanStats from a single SQL processor into
one object. As a result, the tpch_concurrency metric averaged over 20
runs increased from 76.75 to 84.75.
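
To illustrate the difference between the two recording strategies, here is a minimal sketch. It does not use the real CockroachDB types: `scanStats`, its `Seeks` and `Steps` fields, and `recorder` are hypothetical stand-ins for kvpb.ScanStats and a tracing span, chosen only to show that the number of recorded objects drops from one per command to one per batch while the totals stay the same.

```go
package main

import "fmt"

// scanStats is a hypothetical, trimmed-down stand-in for kvpb.ScanStats.
// The field names are illustrative assumptions, not the real proto fields.
type scanStats struct {
	Seeks uint64
	Steps uint64
}

// add folds the counters of one evaluated command into the receiver.
func (s *scanStats) add(o scanStats) {
	s.Seeks += o.Seeks
	s.Steps += o.Steps
}

// recorder is a hypothetical stand-in for a tracing span. It only stores
// what was recorded so that the two strategies can be compared.
type recorder struct {
	recorded []scanStats
}

func (r *recorder) record(s scanStats) {
	r.recorded = append(r.recorded, s)
}

// recordPerCommand mirrors the old behavior: one object per Get, Scan, or
// ReverseScan command.
func recordPerCommand(r *recorder, perCommand []scanStats) {
	for _, s := range perCommand {
		r.record(s)
	}
}

// recordPerBatch mirrors the new behavior: accumulate across all commands
// of the batch and record a single object at the end.
func recordPerBatch(r *recorder, perCommand []scanStats) {
	var total scanStats
	for _, s := range perCommand {
		total.add(s)
	}
	r.record(total)
}

func main() {
	cmds := []scanStats{{Seeks: 1, Steps: 10}, {Seeks: 2, Steps: 20}, {Seeks: 3, Steps: 30}}

	var before, after recorder
	recordPerCommand(&before, cmds)
	recordPerBatch(&after, cmds)

	fmt.Printf("per command: %d objects\n", len(before.recorded))
	fmt.Printf("per batch:   %d object, totals %+v\n", len(after.recorded), after.recorded[0])
}
```

Since the SQL side sums the stats anyway, the single aggregated object carries the same totals the consumer would have computed from the separate ones.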

Additionally, this commit makes it so that we track the number of Gets,
Scans, and ReverseScans actually evaluated as part of the BatchResponse.
This information is plumbed through a couple of protos but is not
exposed in any SQL Observability virtual tables. Still, having it in
the protos means this information is included in the trace.

Informs: #64906.
Fixes: #71351.

Release note: None

blathers-crl · 2023-02-22T04:49:15Z

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

DrewKimball

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola and @yuzefovich)

pkg/kv/kvserver/replica_evaluate.go line 250 at r1 (raw file):

				// record the scan stats for all batches, even if there are no
				// Gets nor Scans? This will avoid the need to figure out the
				// value of foundGetOrScan.

The RecordStructured method doesn't own its argument - maybe we could keep a ScanStats struct somewhere further up the stack (Replica maybe) and pass it into evaluateBatch? I think we can be confident that it would be cheap enough to unconditionally record in that case. What do you think?

yuzefovich

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @sumeerbhola)

pkg/kv/kvserver/replica_evaluate.go line 250 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

The RecordStructured method doesn't own its argument - maybe we could keep a ScanStats struct somewhere further up the stack (Replica maybe) and pass it into evaluateBatch? I think we can be confident that it would be cheap enough to unconditionally record in that case. What do you think?

Are you suggesting to effectively reduce allocations further? It doesn't seem worth it to me - a single kvpb.ScanStats allocation per BatchRequest seems reasonable. My comment here is about whether we should remove the conditional on foundGetOrScan and just call sp.RecordStructured(scanStats) in all cases, even for write-only batches (for which the object will contain all zeros). That shouldn't be a performance concern either, I'm thinking more about this being confusing - i.e. a write-only BatchRequest records the ScanStats into the trace (perhaps with all zeros it wouldn't be confusing). Thoughts?

DrewKimball

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola and @yuzefovich)

pkg/kv/kvserver/replica_evaluate.go line 250 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Are you suggesting to effectively reduce allocations further? It doesn't seem worth it to me - a single kvpb.ScanStats allocation per BatchRequest seems reasonable. My comment here is about whether we should remove the conditional on foundGetOrScan and just call sp.RecordStructured(scanStats) in all cases, even for write-only batches (for which the object will contain all zeros). That shouldn't be a performance concern either, I'm thinking more about this being confusing - i.e. a write-only BatchRequest records the ScanStats into the trace (perhaps with all zeros it wouldn't be confusing). Thoughts?

If we aren't concerned with allocating the struct in those cases, I'd be in favor of reducing the code complexity. Since we're now aggregating over multiple requests, what do you think about tracking the number of scan requests and get requests in ScanStats? This could be useful when viewing traces, and could be used to avoid recording anything when there are no gets or scans.
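
A rough sketch of how such counters could double as the "should we record" check, replacing a separate foundGetOrScan flag. Everything here is hypothetical: `requestKind`, `batchScanStats`, its field names, and `evaluate` are illustrative stand-ins and not the actual kvserver or kvpb definitions.

```go
package main

import "fmt"

// requestKind is a hypothetical enum of request types in a batch.
type requestKind int

const (
	kindGet requestKind = iota
	kindScan
	kindReverseScan
	kindPut
)

// batchScanStats is a hypothetical batch-level stats object that also
// counts how many read requests were evaluated.
type batchScanStats struct {
	NumGets         uint64
	NumScans        uint64
	NumReverseScans uint64
}

// observe bumps the counter that matches the evaluated request, if any.
// Write requests such as kindPut leave the counters untouched.
func (s *batchScanStats) observe(k requestKind) {
	switch k {
	case kindGet:
		s.NumGets++
	case kindScan:
		s.NumScans++
	case kindReverseScan:
		s.NumReverseScans++
	}
}

// shouldRecord reports whether the batch evaluated at least one read
// request, so write-only batches record nothing into the trace.
func (s batchScanStats) shouldRecord() bool {
	return s.NumGets+s.NumScans+s.NumReverseScans > 0
}

// evaluate walks the batch and returns the accumulated stats together
// with the decision on whether to record them.
func evaluate(batch []requestKind) (batchScanStats, bool) {
	var s batchScanStats
	for _, k := range batch {
		s.observe(k)
	}
	return s, s.shouldRecord()
}

func main() {
	for _, batch := range [][]requestKind{
		{kindPut, kindPut},
		{kindGet, kindScan, kindScan, kindPut},
	} {
		stats, record := evaluate(batch)
		fmt.Printf("stats=%+v record=%t\n", stats, record)
	}
}
```

With this shape the counters serve both purposes mentioned above: they show up in the recorded object for anyone reading the trace, and a zero sum is the signal to skip recording.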

yuzefovich

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @sumeerbhola)

pkg/kv/kvserver/replica_evaluate.go line 250 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

If we aren't concerned with allocating the struct in those cases, I'd be in favor of reducing the code complexity. Since we're now aggregating over multiple requests, what do you think about tracking the number of scan requests and get requests in ScanStats? This could be useful when viewing traces, and could be used to avoid recording anything when there are no gets or scans.

Thanks, I like this idea, done.

DrewKimball

Reviewed 4 of 8 files at r1, 8 of 8 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @sumeerbhola)

yuzefovich · 2023-02-23T22:16:47Z

@cockroachdb/storage @cockroachdb/kv-prs does someone want to give this a look? The changes seem pretty straightforward, so I don't think it's strictly necessary.

sumeerbhola

Looks ok. But do we have any test coverage (I realize this lack of coverage probably predates your change, so sorry to put you on the spot)? If not, can you add a test that verifies the trace output?

Reviewed 1 of 8 files at r1, 2 of 8 files at r2.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball)

yuzefovich · 2023-02-24T22:15:16Z

We do have a test for this: TestExplainMVCCSteps in sql/explain_test.go.

TFTRs!

bors r+

craig · 2023-02-24T23:45:48Z

Build succeeded:

Bazel Essential CI (Cockroach)

blathers-crl · 2023-02-24T23:45:55Z

Encountered an error creating backports. Some common things that can go wrong:

- The backport branch might have already existed.
- There was a merge conflict.
- The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.

error creating merge commit from 1154be5 to blathers/backport-release-22.2-97442: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.2.x failed. See errors above.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

yuzefovich force-pushed the scan-stats branch 2 times, most recently from fba2dd5 to 3ea67ec Compare February 22, 2023 17:24

yuzefovich changed the title ~~WIP on recording a single ScanStats object per batch~~ kvserver: record ScanStats once per BatchRequest Feb 22, 2023

yuzefovich marked this pull request as ready for review February 22, 2023 17:26

yuzefovich requested review from a team as code owners February 22, 2023 17:26

yuzefovich requested review from sumeerbhola and DrewKimball February 22, 2023 17:26

yuzefovich added the backport-22.2.x label Feb 22, 2023

DrewKimball reviewed Feb 22, 2023

yuzefovich commented Feb 22, 2023

DrewKimball reviewed Feb 22, 2023

yuzefovich force-pushed the scan-stats branch from 3ea67ec to 1154be5 Compare February 22, 2023 22:05

yuzefovich requested a review from a team February 22, 2023 22:05

yuzefovich requested a review from a team as a code owner February 22, 2023 22:05

yuzefovich commented Feb 22, 2023

DrewKimball approved these changes Feb 22, 2023

sumeerbhola reviewed Feb 24, 2023

craig bot merged commit 647bd0b into cockroachdb:master Feb 24, 2023

yuzefovich deleted the scan-stats branch February 24, 2023 23:46

yuzefovich mentioned this pull request Feb 25, 2023

release-22.2: kvserver: record ScanStats once per BatchRequest #97665

Merged

yuzefovich mentioned this pull request Jun 2, 2023

release-23.1: sql: improve how tracing of processors is done on the gateway node #100534

Merged
