kvserver: Fix performance regression due to new call to collectSpansRead #91462

KaiSun314 · 2022-11-08T04:56:53Z

When we incorporated response data into the load-based splitter, we called collectSpansRead, which is allocation-heavy and computationally expensive, resulting in a performance regression.

To address this, this patch performs 3 optimizations:

1. Remove the call to collectSpansRead; instead, add a custom function to iterate over the batch of requests / responses and calculate the true spans.
2. Instead of constructing a *spanset.SpanSet and finding the union of spans (which uses O(batch_size) memory), we directly compute the union of spans while iterating over the batch, resulting in only O(1) memory used.
3. Lazily compute the union of true spans only when it is truly needed, i.e. we are under heavy load (e.g. >2500QPS) and a load-based splitter has been initialized.
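To illustrate the second optimization, here is a minimal sketch of computing a union span in a single pass while keeping only a running minimum start key and maximum end key. The `Span` type and the `unionSpan` function are hypothetical stand-ins for illustration, not the types or helpers used in this patch.

```go
package main

import (
	"bytes"
	"fmt"
)

// Span is a hypothetical stand-in for a key span [Key, EndKey).
type Span struct {
	Key, EndKey []byte
}

// unionSpan returns the smallest span covering every span in the batch.
// It tracks only the running minimum start key and maximum end key, so the
// extra memory used does not grow with the batch size.
func unionSpan(spans []Span) (Span, bool) {
	var out Span
	found := false
	for _, s := range spans {
		if !found {
			out, found = s, true
			continue
		}
		if bytes.Compare(s.Key, out.Key) < 0 {
			out.Key = s.Key
		}
		if bytes.Compare(s.EndKey, out.EndKey) > 0 {
			out.EndKey = s.EndKey
		}
	}
	return out, found
}

func main() {
	batch := []Span{
		{Key: []byte("c"), EndKey: []byte("f")},
		{Key: []byte("a"), EndKey: []byte("b")},
		{Key: []byte("d"), EndKey: []byte("k")},
	}
	u, ok := unionSpan(batch)
	fmt.Printf("%s %s %v\n", u.Key, u.EndKey, ok) // prints "a k true"
}
```

Collecting every span into a set first and merging afterwards would give the same result, but would hold one entry per request in memory for the duration of the merge.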

Cherry-picking this commit onto the commit right before we incorporated response data in the load-based splitter (068845f) and running

~/benchdiff/benchdiff --old=068845ff72315f8b64f0e930c17c48f078203bc4 --new=abf61ce75c47e16bc39ed0e714f2e46f1d97eb7c --count=20 --post-checkout='./dev generate go' --run='KV/././rows=1$$' ./pkg/sql/tests

the output is:

Release note: None

cockroach-teamcity · 2022-11-08T04:57:00Z

This change is

kvoli · 2022-11-08T15:43:57Z

Could you add the bench diff for comparison?

KaiSun314 · 2022-11-08T22:03:27Z

Could you add the bench diff for comparison?

Added

kvoli

Some nits / questions - the results look good.

Reviewed 3 of 4 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @KaiSun314)

pkg/kv/kvserver/replica_send.go line 404 at r1 (raw file):

	defer func() {
		// Handle load-based splitting, if necessary.
		if br != nil {

Is there a reason you switched to checking br rather than the error? Are there cases where we would return both a pErr and a br here?

pkg/kv/kvserver/replica_split_load.go line 51 at r1 (raw file):

// getResponseBoundarySpan computes the union span of the true spans that were
// iterated over (using the request span and the response's resumeSpan).

nit: drop the parens.
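For readers unfamiliar with resume spans, the idea in the quoted comment can be sketched as follows. This is an illustration under stated assumptions, not the body of getResponseBoundarySpan: the `Span` type, the `trueSpan` function, and the `reverse` flag are hypothetical, and the sketch assumes that a forward scan which stops early leaves a resume span ending at the request's end key, while a reverse scan leaves one starting at the request's start key.

```go
package main

import "fmt"

// Span is a hypothetical stand-in for a key span [Key, EndKey).
type Span struct {
	Key, EndKey string
}

// trueSpan returns the part of the request span that was actually iterated
// over, given an optional resume span describing the part left unread.
// Assumption: forward scans resume at [resumeKey, req.EndKey) and reverse
// scans resume at [req.Key, resumeKey).
func trueSpan(req Span, resume *Span, reverse bool) Span {
	if resume == nil {
		// No resume span: the whole request span was iterated over.
		return req
	}
	if reverse {
		return Span{Key: resume.EndKey, EndKey: req.EndKey}
	}
	return Span{Key: req.Key, EndKey: resume.Key}
}

func main() {
	req := Span{Key: "a", EndKey: "z"}
	fmt.Println(trueSpan(req, nil, false))                           // {a z}
	fmt.Println(trueSpan(req, &Span{Key: "m", EndKey: "z"}, false)) // {a m}
	fmt.Println(trueSpan(req, &Span{Key: "a", EndKey: "m"}, true))  // {m z}
}
```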

KaiSun314

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @kvoli)

pkg/kv/kvserver/replica_send.go line 404 at r1 (raw file):

Previously, kvoli (Austen) wrote…

Is there a reason you switched to checking br rather than the error? Are there cases where we would return both a pErr and a br here?

It seems that executeBatchWithConcurrencyRetries returns either a non-nil br and nil pErr, or a nil br and a non-nil pErr.

To be safe though, I added both checks.
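A minimal sketch of a guard that checks both values, combined with the lazy span computation described in the PR summary, might look like this. The names `batchResponse`, `maybeRecordSpan`, `splitterActive`, and `computeSpan` are hypothetical and do not come from replica_send.go.

```go
package main

import (
	"errors"
	"fmt"
)

// batchResponse is a hypothetical stand-in for the batch response type.
type batchResponse struct{}

// maybeRecordSpan invokes computeSpan only when the request succeeded
// (non-nil response and nil error) and the splitter wants samples.
// Passing the span computation as a closure keeps it off the hot path.
func maybeRecordSpan(
	br *batchResponse, err error, splitterActive bool, computeSpan func() string,
) (string, bool) {
	if br == nil || err != nil {
		return "", false
	}
	if !splitterActive {
		return "", false
	}
	return computeSpan(), true
}

func main() {
	calls := 0
	compute := func() string { calls++; return "[a, m)" }

	maybeRecordSpan(nil, errors.New("boom"), true, compute) // failed request: skipped
	maybeRecordSpan(&batchResponse{}, nil, false, compute)  // splitter idle: skipped
	span, ok := maybeRecordSpan(&batchResponse{}, nil, true, compute)
	fmt.Println(span, ok, calls) // prints "[a, m) true 1"
}
```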

KaiSun314

Thank you for the review Austen!

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @kvoli)

kvoli

Reviewed 2 of 2 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @KaiSun314)

pkg/kv/kvserver/replica_split_load.go line 77 at r3 (raw file):

		}

		// TODO(kaisun): There are a few situations where the request did not

Could we add a tracking issue for this as well? Just to note that it is known behavior and was also present in the previous request-based splitter.

KaiSun314

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @kvoli)

pkg/kv/kvserver/replica_split_load.go line 77 at r3 (raw file):

Previously, kvoli (Austen) wrote…

Could we add a tracking issue for this as well? Just to note that it is known behavior and was also present in the previous request-based splitter.

Done.

tbg · 2022-11-13T22:26:26Z

Drive-by comment: the release note seems much too technical. Release notes are consumed by the docs team, who will prepare them for consumption by customers. It's unclear to me what they could conceivably make of the release note at hand. Also, I'm not sure a release note is even necessary: the perf regression never made it into a release, right?

KaiSun314 · 2022-11-14T20:38:22Z

Ah true, good point Tobi, thanks! I have changed the release note to None.

When we incorporated the use of response data in the load-based splitter, we called collectSpansRead, which is allocation heavy and computationally expensive, resulting in a performance regression.

To address this, this patch performs 3 optimizations:

1. Remove the call to collectSpansRead; instead, add a custom function to iterate over the batch of requests / responses and calculate the true spans
2. Instead of constructing a *spanset.SpanSet and finding the union of spans (which uses O(batch_size) memory), we directly compute the union of spans while iterating over the batch resulting in only O(1) memory used
3. Lazily compute the union of true spans only when it is truly needed i.e. we are under heavy load (e.g. >2500QPS) and a load-based splitter has been initialized

Release note: None

kvoli

Thanks for updating the test structure.

Reviewed 3 of 3 files at r6, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @KaiSun314)

KaiSun314 · 2022-11-22T22:35:56Z

Thank you so much for the review!

bors r+

craig · 2022-11-22T23:02:05Z

Build failed (retrying...):

Bazel Essential CI (Cockroach)

craig · 2022-11-23T01:53:12Z

Build succeeded:

Bazel Essential CI (Cockroach)

KaiSun314 requested a review from a team as a code owner November 8, 2022 04:56

KaiSun314 force-pushed the use-response-data-more-efficient branch from 6023a92 to 0ed1846 Compare November 8, 2022 05:12

KaiSun314 requested a review from kvoli November 8, 2022 15:08

kvoli approved these changes Nov 9, 2022


KaiSun314 force-pushed the use-response-data-more-efficient branch from 0ed1846 to 68b0f01 Compare November 9, 2022 22:34

KaiSun314 commented Nov 9, 2022


KaiSun314 force-pushed the use-response-data-more-efficient branch from 68b0f01 to 343085f Compare November 10, 2022 17:12

kvoli approved these changes Nov 10, 2022


KaiSun314 force-pushed the use-response-data-more-efficient branch from 343085f to 80078ec Compare November 11, 2022 03:45

KaiSun314 commented Nov 11, 2022


KaiSun314 force-pushed the use-response-data-more-efficient branch from 80078ec to 9589632 Compare November 14, 2022 20:35

nvanbenschoten mentioned this pull request Nov 14, 2022

kvserver: Treat requests that were not evaluated specially in the span passed to the load-based splitter #91723

Closed

KaiSun314 force-pushed the use-response-data-more-efficient branch from 9589632 to abb7aa5 Compare November 14, 2022 22:43

KaiSun314 force-pushed the use-response-data-more-efficient branch from abb7aa5 to ca76c28 Compare November 15, 2022 00:57

kvoli approved these changes Nov 22, 2022


nvanbenschoten approved these changes Nov 22, 2022


craig bot merged commit 0d9669a into cockroachdb:master Nov 23, 2022

kvoli mentioned this pull request Jun 7, 2023

release-22.2: kvserver: avoid load based splits in middle of SQL row #104563

Merged
