Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

split: Redesign the load-based splitter to be consistent with new rebalancing signals. #93838

Merged
merged 1 commit into from
Jan 3, 2023

Conversation

KaiSun314
Copy link
Contributor

@KaiSun314 KaiSun314 commented Dec 17, 2022

Fixes: #90574

In the current load splitter, we find the split key that best balances the QPS of the left and right sides. As a result, each request is unweighted, since one request contributes one to the QPS. In particular, the load splitter does not differentiate between what kinds of requests they are, how heavy the request is, and what resources these requests consume, which can result in scenarios where QPS is balanced but one side has a lot more work due to a few heavy requests. Moreover, the current load splitter treats requests that contain a split key as “contained”. Optimizing for QPS, contained requests are bad since splitting at a point in a contained request will not help lower the QPS of either side. However, optimizing for other signals like CPU, splitting at a point in a contained request is great as each side will get part of the work of processing that request. This motivates a redesign of the load splitter, one that enables recording weighted requests and considers contained requests in the weight balancing for splitting.

In this PR, we redesign the load-based splitter with the following interface:

  1. Record a point key “start” or span “[start, end)” with a weight “w” at a specific time “ts”, where “w” is some measure of load recorded for a span e.g. Record(ts, start, w) or Record(ts, [start, end), w)
  2. Find a split key such that the load (i.e. total weight) on the resulting split ranges would be as equal as possible according to the recorded loads above e.g. Key()

To make the current load-based splitter (Finder) weighted, we make the following modifications:

  1. Instead of using reservoir sampling, we use weighted reservoir sampling (a simplified version of A-Chao)
  2. Remove the contained counter
  3. Increment the left and right counters by the weight of the request rather than just 1
  4. Treat a weighted range request ([start, end), w) into two weighted point requests (start, w/2) and (end, w/2)

For more details, see (internal)
https://docs.google.com/document/d/1bdSxucz-xFzwnxL3fFXNZsRc9Vsht0oO0XuZrz5Iw84/edit#bookmark=id.xjc41tm3jx3x.

Release note (ops change): The load-based splitter has been redesigned to be more consistent with CPU-based rebalancing rather than QPS-based rebalancing to improve range splits.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@KaiSun314 KaiSun314 force-pushed the weighted-finder branch 7 times, most recently from 6fede2b to e57c3ce Compare December 17, 2022 04:56
@KaiSun314 KaiSun314 marked this pull request as ready for review December 17, 2022 05:17
@KaiSun314 KaiSun314 requested a review from a team as a code owner December 17, 2022 05:17
@KaiSun314 KaiSun314 requested a review from kvoli December 17, 2022 05:23
Copy link
Collaborator

@kvoli kvoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great stuff.

I left a few comments, these are mostly w.r.t readability.

The algorithm itself looks good.

Reviewed 11 of 11 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @KaiSun314)


pkg/kv/kvserver/split/decider.go line 104 at r1 (raw file):

	loadSplitterMetrics *LoadSplitterMetrics,
) {
	if randSource == nil {

What calls this with nil? Could you add a comment here explaining or update the callers to always pass a source.


pkg/kv/kvserver/split/decider.go line 155 at r1 (raw file):

		if d.mu.lastQPS >= d.qpsThreshold() {
			if d.mu.splitFinder == nil {
				d.mu.splitFinder = NewWeightedFinder(now, d.randSource)

The old finder code is still in this commit. What plans do you have for it? This instantiates the weighted finder so does that mean the old code is now unused?

We could either:

  1. Completely remove the old code and tests.
  2. Mark the old code as deprecated e.g. DecprecatedFinder and deprecated_finder.go, then add a cluster setting EnableDeprecatedLBSplitFinder that defaults to false.

pkg/kv/kvserver/split/finder.go line 183 at r1 (raw file):

}

// NoSplitKeyCauseLogMsg returns a log message containing all of this

nit: consider rewording this comment, specifically all and this.


pkg/kv/kvserver/split/weighted_finder.go line 21 at r1 (raw file):

	"github.com/cockroachdb/cockroach/pkg/roachpb"
)

nit: Consider adding a block comment above that links the A-Chao paper or Wikipedia.


pkg/kv/kvserver/split/weighted_finder.go line 29 at r1 (raw file):

}

type RandSource interface {

nit: Add comments to these exported structs.


pkg/kv/kvserver/split/weighted_finder.go line 42 at r1 (raw file):

}

func NewWeightedFinder(startTime time.Time, randSource RandSource) *WeightedFinder {

Add a comment for this fn.


pkg/kv/kvserver/split/weighted_finder.go line 56 at r1 (raw file):

func (f *WeightedFinder) record(key roachpb.Key, weight float64) {
	if f == nil {

When can this condition hit?


pkg/kv/kvserver/split/weighted_finder.go line 68 at r1 (raw file):

	} else if f.randSource.Float64() > splitKeySampleSize*weight/f.totalWeight {
		for i := range f.samples {
			if comp := key.Compare(f.samples[i].key); comp < 0 {

nit: This is an important piece of code. The comment is good, could you also add a small diagram similar to the RFC for the cases to illustrate < and >=.


pkg/kv/kvserver/split/weighted_finder.go line 110 at r1 (raw file):

			continue
		}
		balanceScore := math.Abs(s.left-s.right) / (s.left + s.right)

nit: Likewise to above, this is one of the most important parts of the code. Could you comment an example.


pkg/kv/kvserver/split/load_based_splitter_test.go line 335 at r1 (raw file):

}

func TestCompareFinders(t *testing.T) {

You could change this test to be an example, rather than a unit test - given that there is no assertion atm.

Example_Finder()

see: https://go.dev/blog/examples


pkg/kv/kvserver/split/load_based_splitter_test.go line 342 at r1 (raw file):

	runTestMultipleSettings(t, []settings{
		{
			desc:                    "Weighted Finder",

Add a more descriptive desc e.g. weighted/start=uniform/length=uniform


pkg/kv/kvserver/split/weighted_finder_test.go line 51 at r1 (raw file):

// TestSplitWeightedFinderKey verifies the Key() method correctly
// finds an appropriate split point for the range.
func TestSplitWeightedFinderKey(t *testing.T) {

Are the tests in this file very similar to the unweighted split tests?

I think adding some weights here is necessary for testing - or add granular assertions to the TestCompareFinders.

@KaiSun314 KaiSun314 requested a review from a team as a code owner December 22, 2022 14:50
@KaiSun314 KaiSun314 force-pushed the weighted-finder branch 4 times, most recently from abc261b to 1195672 Compare December 22, 2022 18:22
Copy link
Contributor Author

@KaiSun314 KaiSun314 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @kvoli)


pkg/kv/kvserver/split/decider.go line 104 at r1 (raw file):

Previously, kvoli (Austen) wrote…

What calls this with nil? Could you add a comment here explaining or update the callers to always pass a source.

Done.


pkg/kv/kvserver/split/decider.go line 155 at r1 (raw file):

Previously, kvoli (Austen) wrote…

The old finder code is still in this commit. What plans do you have for it? This instantiates the weighted finder so does that mean the old code is now unused?

We could either:

  1. Completely remove the old code and tests.
  2. Mark the old code as deprecated e.g. DecprecatedFinder and deprecated_finder.go, then add a cluster setting EnableDeprecatedLBSplitFinder that defaults to false.

Done. Marked the old code as deprecated and added a cluster setting that defaults to true (discussed offline)


pkg/kv/kvserver/split/weighted_finder.go line 42 at r1 (raw file):

Previously, kvoli (Austen) wrote…

Add a comment for this fn.

Done.


pkg/kv/kvserver/split/weighted_finder.go line 56 at r1 (raw file):

Previously, kvoli (Austen) wrote…

When can this condition hit?

This condition is probably not hit (we check if the finder is nil in decider.go), but it is there as a safety check I think? This check is in the original finder code, so I also kept the check for the weighted finder


pkg/kv/kvserver/split/load_based_splitter_test.go line 335 at r1 (raw file):

Previously, kvoli (Austen) wrote…

You could change this test to be an example, rather than a unit test - given that there is no assertion atm.

Example_Finder()

see: https://go.dev/blog/examples

Done.


pkg/kv/kvserver/split/load_based_splitter_test.go line 342 at r1 (raw file):

Previously, kvoli (Austen) wrote…

Add a more descriptive desc e.g. weighted/start=uniform/length=uniform

Done.


pkg/kv/kvserver/split/weighted_finder_test.go line 51 at r1 (raw file):

Previously, kvoli (Austen) wrote…

Are the tests in this file very similar to the unweighted split tests?

I think adding some weights here is necessary for testing - or add granular assertions to the TestCompareFinders.

The tests are similar to the unweighted split tests, but there are some differences that test the differences between the finder and weighted finder

E.g. in TestSplitWeightedFinderKey, we only check the balance score (left and right counter difference).
In TestSplitWeightedFinderRecorder, we record a ranged span with weight 1, which should be split up into a point request with weight 0.5 for the start key and a point request with weight 0.5 for the end key. I think this should test the weights?

@blathers-crl
Copy link

blathers-crl bot commented Jan 2, 2023

Thank you for updating your pull request.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added the O-community Originated from the community label Jan 2, 2023
@KaiSun314 KaiSun314 force-pushed the weighted-finder branch 4 times, most recently from eaeadbb to 1c1cfd7 Compare January 2, 2023 19:28
@KaiSun314 KaiSun314 requested a review from kvoli January 2, 2023 20:39
…alancing signals.

Fixes: cockroachdb#90574

In the current load splitter, we find the split key that best balances
the QPS of the left and right sides. As a result, each request is
unweighted, since one request contributes one to the QPS. In particular,
the load splitter does not differentiate between what kinds of requests
they are, how heavy the request is, and what resources these requests
consume, which can result in scenarios where QPS is balanced but one
side has a lot more work due to a few heavy requests. Moreover, the
current load splitter treats requests that contain a split key as
“contained”. Optimizing for QPS, contained requests are bad since
splitting at a point in a contained request will not help lower the QPS
of either side. However, optimizing for other signals like CPU,
splitting at a point in a contained request is great as each side will
get part of the work of processing that request. This motivates a
redesign of the load splitter, one that enables recording weighted
requests and considers contained requests in the weight balancing for
splitting.

In this PR, we redesign the load-based splitter with the following
interface:
1. Record a point key “start” or span “[start, end)” with a weight “w”
at a specific time “ts”, where “w” is some measure of load recorded for
a span e.g. Record(ts, start, w) or Record(ts, [start, end), w)
2. Find a split key such that the load (i.e. total weight) on the
resulting split ranges would be as equal as possible according to the
recorded loads above e.g. Key()

To make the current load-based splitter (Finder) weighted, we make the
following modifications:
1. Instead of using reservoir sampling, we use weighted reservoir
sampling (a simplified version of A-Chao)
2. Remove the contained counter
3. Increment the left and right counters by the weight of the request
rather than just 1
4. Treat a weighted range request ([start, end), w) into two weighted
point requests (start, w/2) and (end, w/2)

For more details, see the design doc:
https://docs.google.com/document/d/1bdSxucz-xFzwnxL3fFXNZsRc9Vsht0oO0XuZrz5Iw84/edit#bookmark=id.xjc41tm3jx3x.

Release note (ops change): The load-based splitter has been redesigned
to be more consistent with CPU-based rebalancing rather than QPS-based
rebalancing to improve range splits. This redesign is disabled by
default currently.
Copy link
Collaborator

@kvoli kvoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Updated some naming in the patch.

Reviewed 1 of 12 files at r2, 2 of 3 files at r3, 10 of 10 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @KaiSun314)


pkg/kv/kvserver/split/decider.go line 67 at r4 (raw file):

}

var enableDeprecatedLBSplitFinder = settings.RegisterBoolSetting(

Going back on what I said earlier - changing the name here to be something more specific than deprecated is better. I'll update.


pkg/kv/kvserver/split/decider.go line 68 at r4 (raw file):

var enableDeprecatedLBSplitFinder = settings.RegisterBoolSetting(
	settings.TenantWritable,

This should be system writable only here, rather than tenant.


pkg/kv/kvserver/split/weighted_finder.go line 100 at r4 (raw file):

			// f.samples[i].Key).
			//
			// Suppose we record the following weighted keys:

This is really great!

@cockroachdb cockroachdb deleted a comment from craig bot Jan 3, 2023
@kvoli kvoli self-requested a review January 3, 2023 18:06
Copy link
Collaborator

@kvoli kvoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 7 of 7 files at r5, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @KaiSun314)

@kvoli
Copy link
Collaborator

kvoli commented Jan 3, 2023

bors r+

@KaiSun314
Copy link
Contributor Author

Thank you so much for the review and for updating the finder name, Austen!

@craig
Copy link
Contributor

craig bot commented Jan 3, 2023

Build succeeded:

@craig craig bot merged commit f259e66 into cockroachdb:master Jan 3, 2023
kvoli added a commit to kvoli/cockroach that referenced this pull request Jan 5, 2023
In cockroachdb#93838 we started initializing a new seeded rand for use in the load
based splitter's reservoir sampling algorithm. Previously, the splitter
was initialized using the global rand.

`rand.Source` heap allocates on init approximately 4kb. When initialized
per-replica, this is problematic as nodes scale to large replica counts.

This patch replaces initializing a new random source per-replica with
using the global rand instead.

resolves: cockroachdb#94752
resolves: cockroachdb#94737

Release note: None
craig bot pushed a commit that referenced this pull request Jan 5, 2023
94644: tenantcapabilities: introduce TenantCapabilities proto r=ecwall a=arulajmani

This commit adds the skeletal structure for a `TenantCapabilities` proto,
which is intended to encapsulate capabilities for a specific tenant.
Capabilities are intended to be stored in the `system.tenants` table,
in its Info column. To that end, we modify `TenantInfo` to contain
capabilities. However, actually populating this field through SQL is
left to a future commit.

Future commits will also add the infrastructure required to check a
tenant's requests against its capabilities for "privileged" operations.
For now, I've only accounted for the `CanAdminSplit` capability -- this
will likely expand to a fuller set as we introduce other capabilities
in the system.

References #94643

Epic: CRDB-18503
Release note: None

94754: split: init replica lb splitter with global rand r=erikgrinaker a=kvoli

In #93838 we started initializing a new seeded rand for use in the load
based splitter's reservoir sampling algorithm. Previously, the splitter
was initialized using the global rand.

`rand.Source` heap allocates on init approximately 4kb. When initialized
per-replica, this is problematic as nodes scale to large replica counts.

This patch replaces initializing a new random source per-replica with
using the global rand instead.

This removes the heap allocation. This is shown below running 
`kv/splits/nodes=3/quiesce=true` (not at identical replica counts in the test,
 the proportions are the important part).

before:

![image](https://user-images.githubusercontent.com/39606633/210825416-a940904f-47c9-4271-9692-11b40be927f4.png)

after:

![image](https://user-images.githubusercontent.com/39606633/210825742-fc62ea6c-2125-44fe-ac07-984c22986089.png)

resolves: #94752
resolves: #94737

Release note: None

Co-authored-by: Arul Ajmani <[email protected]>
Co-authored-by: Austen McClernon <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-community Originated from the community
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kvserver: redesign load-based splitter to be consistent with new rebalancing signals
3 participants