Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: record ScanStats once per BatchRequest #97442

Merged
merged 1 commit into from
Feb 24, 2023

Conversation

yuzefovich
Copy link
Member

@yuzefovich yuzefovich commented Feb 22, 2023

Previously, we would record a kvpb.ScanStats object into the trace for
each evaluated Get, Scan, and ReverseScan command. This was suboptimal
for two reasons:

  • this required an allocation of that kvpb.ScanStats object
  • this required propagating all of these separate objects via the
    tracing infrastructure which might make it so that the tracing limits
    are reached resulting in some objects being dropped.

This commit, instead, changes the ScanStats to be tracked at the
BatchRequest level, thus, we only need to record a single object per
BatchRequest. This results in reduced granularity, but that is still
sufficient for the SQL needs which simply aggregates all
kvpb.ScanStats from a single SQL processor into one object. As
a result, the tpch_concurrency metric averaged over 20 runs increased
from 76.75 to 84.75.

Additionally, this commit makes it so that we track the number of Gets,
Scans, and ReverseScans actually evaluated as part of the BatchResponse.
This information is plumbed through a couple of protos but is not
exposed in any SQL Observability virtual tables. Still, due to having it
in the protos will include this information into the trace.

Informs: #64906.
Fixes: #71351.

Release note: None

@blathers-crl
Copy link

blathers-crl bot commented Feb 22, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@yuzefovich yuzefovich force-pushed the scan-stats branch 2 times, most recently from fba2dd5 to 3ea67ec Compare February 22, 2023 17:24
@yuzefovich yuzefovich changed the title WIP on recording a single ScanStats object per batch kvserver: record ScanStats once per BatchRequest Feb 22, 2023
@yuzefovich yuzefovich marked this pull request as ready for review February 22, 2023 17:26
@yuzefovich yuzefovich requested review from a team as code owners February 22, 2023 17:26
Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola and @yuzefovich)


pkg/kv/kvserver/replica_evaluate.go line 250 at r1 (raw file):

				// record the scan stats for all batches, even if there are no
				// Gets nor Scans? This will avoid the need to figure out the
				// value of foundGetOrScan.

The RecordStructured method doesn't own it's argument - maybe we could keep a ScanStats struct somewhere further up the stack (Replica maybe) and pass it into evaluateBatch? I think we can be confident that it would be cheap enough to unconditionally record in that case. What do you think?

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @sumeerbhola)


pkg/kv/kvserver/replica_evaluate.go line 250 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

The RecordStructured method doesn't own it's argument - maybe we could keep a ScanStats struct somewhere further up the stack (Replica maybe) and pass it into evaluateBatch? I think we can be confident that it would be cheap enough to unconditionally record in that case. What do you think?

Are you suggesting to effectively reduce allocations further? It doesn't seem worth it to me - a single kvpb.ScanStats allocation per BatchRequest seems reasonable. My comment here is about whether we should remove the conditional on foundGetOrScan and just call sp.RecordStructured(scanStats) in all cases, even for write-only batches (for which the object will contain all zeros). That shouldn't be a performance concern either, I'm thinking more about this being confusing - i.e. a write-only BatchRequest records the ScanStats into the trace (perhaps with all zeros it wouldn't be confusing). Thoughts?

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola and @yuzefovich)


pkg/kv/kvserver/replica_evaluate.go line 250 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Are you suggesting to effectively reduce allocations further? It doesn't seem worth it to me - a single kvpb.ScanStats allocation per BatchRequest seems reasonable. My comment here is about whether we should remove the conditional on foundGetOrScan and just call sp.RecordStructured(scanStats) in all cases, even for write-only batches (for which the object will contain all zeros). That shouldn't be a performance concern either, I'm thinking more about this being confusing - i.e. a write-only BatchRequest records the ScanStats into the trace (perhaps with all zeros it wouldn't be confusing). Thoughts?

If we aren't concerned with allocating the struct in those cases, I'd be in favor of reducing the code complexity. Since we're now aggregating over multiple requests, what do you think about tracking the number of scan requests and get requests in ScanStats? This could be useful when viewing traces, and could be used to avoid recording anything when there are no gets or scans.

Previously, we would record a `kvpb.ScanStats` object into the trace for
each evaluated Get, Scan, and ReverseScan command. This was suboptimal
for two reasons:
- this required an allocation of that `kvpb.ScanStats` object
- this required propagating all of these separate objects via the
tracing infrastructure which might make it so that the tracing limits
are reached resulting in some objects being dropped.

This commit, instead, changes the ScanStats to be tracked at the
BatchRequest level, thus, we only need to record a single object per
BatchRequest. This results in reduced granularity, but that is still
sufficient for the SQL needs which simply aggregates all
`kvpb.ScanStats` from a single SQL processor into one object. As
a result, the tpch_concurrency metric averaged over 20 runs increased
from 76.75 to 84.75.

Additionally, this commit makes it so that we track the number of Gets,
Scans, and ReverseScans actually evaluated as part of the BatchResponse.
This information is plumbed through a couple of protos but is not
exposed in any SQL Observability virtual tables. Still, due to having it
in the protos will include this information into the trace.

Release note: None
@yuzefovich yuzefovich requested a review from a team February 22, 2023 22:05
@yuzefovich yuzefovich requested a review from a team as a code owner February 22, 2023 22:05
Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @sumeerbhola)


pkg/kv/kvserver/replica_evaluate.go line 250 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

If we aren't concerned with allocating the struct in those cases, I'd be in favor of reducing the code complexity. Since we're now aggregating over multiple requests, what do you think about tracking the number of scan requests and get requests in ScanStats? This could be useful when viewing traces, and could be used to avoid recording anything when there are no gets or scans.

Thanks, I like this idea, done.

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 4 of 8 files at r1, 8 of 8 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @sumeerbhola)

@yuzefovich
Copy link
Member Author

@cockroachdb/storage @cockroachdb/kv-prs does some want to give this a look? The changes seem pretty straightforward, so I don't think it's strictly necessary.

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks ok. But do we have any test coverage (I realize this lack of coverage probably predates your change, so sorry to put you on the spot)? If not can you add a test that verifies the trace output?

Reviewed 1 of 8 files at r1, 2 of 8 files at r2.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball)

@yuzefovich
Copy link
Member Author

We do have a test for this TestExplainMVCCSteps in sql/explain_test.go.

TFTRs!

bors r+

@craig
Copy link
Contributor

craig bot commented Feb 24, 2023

Build succeeded:

@craig craig bot merged commit 647bd0b into cockroachdb:master Feb 24, 2023
@blathers-crl
Copy link

blathers-crl bot commented Feb 24, 2023

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 1154be5 to blathers/backport-release-22.2-97442: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.2.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: make the execution stats collection have less impact on the memory usage
4 participants