split: Redesign the load-based splitter to be consistent with new rebalancing signals. #93838

KaiSun314 · 2022-12-17T00:29:07Z

The current load splitter finds the split key that best balances the QPS of the left and right sides. Every request is therefore unweighted: each one contributes exactly one to the QPS. The load splitter does not differentiate requests by kind, by how heavy they are, or by which resources they consume, so QPS can be balanced while one side does far more work because of a few heavy requests. Moreover, the current load splitter treats a request whose span contains the split key as “contained”. When optimizing for QPS, contained requests are bad, since splitting at a point inside a contained request does not lower the QPS of either side. When optimizing for other signals such as CPU, however, splitting at a point inside a contained request is great, as each side gets part of the work of processing that request. This motivates a redesign of the load splitter that records weighted requests and includes contained requests in the weight balancing for splitting.

In this PR, we redesign the load-based splitter with the following interface:

Record a point key “start” or a span “[start, end)” with a weight “w” at a specific time “ts”, where “w” is some measure of the load recorded for that key or span, e.g. Record(ts, start, w) or Record(ts, [start, end), w).
Find a split key such that the load (i.e. total weight) on the two resulting ranges is as equal as possible according to the loads recorded above, e.g. Key().
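
To make the Record/Key contract concrete, here is a minimal Go sketch. It is not the code in this PR: the names LoadSplitter, exactSplitter and demoExactKey are hypothetical, keys are plain strings instead of roachpb.Key, and the implementation is a naive exact baseline that remembers every key rather than sampling. It also assumes a span's weight is attributed half to each endpoint.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"time"
)

// LoadSplitter is a hypothetical name for the interface described above.
// Keys are plain strings here; the real code uses roachpb.Key.
type LoadSplitter interface {
	// Record notes weight w at time ts for the point key start (end == "")
	// or for the span [start, end).
	Record(ts time.Time, start, end string, w float64)
	// Key returns the split key that best balances the recorded weight,
	// or "" if there is no usable candidate.
	Key() string
}

// exactSplitter is a naive baseline that remembers every recorded key.
type exactSplitter struct {
	weights map[string]float64
}

func newExactSplitter() *exactSplitter {
	return &exactSplitter{weights: make(map[string]float64)}
}

func (s *exactSplitter) Record(_ time.Time, start, end string, w float64) {
	if end == "" {
		s.weights[start] += w
		return
	}
	// Assumption: attribute half of a span's weight to each endpoint.
	s.weights[start] += w / 2
	s.weights[end] += w / 2
}

func (s *exactSplitter) Key() string {
	keys := make([]string, 0, len(s.weights))
	for k := range s.weights {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	total := 0.0
	for _, k := range keys {
		total += s.weights[k]
	}
	best, bestDiff, left := "", math.Inf(1), 0.0
	for i, k := range keys {
		// Splitting at k puts keys < k on the left and keys >= k on the right.
		// The smallest key is skipped because its left side would be empty.
		if diff := math.Abs(left - (total - left)); i > 0 && diff < bestDiff {
			best, bestDiff = k, diff
		}
		left += s.weights[k]
	}
	return best
}

// demoExactKey records one heavy request, one lighter request and one span.
func demoExactKey() string {
	var s LoadSplitter = newExactSplitter()
	now := time.Now()
	s.Record(now, "a", "", 10) // one heavy point request
	s.Record(now, "b", "", 4)
	s.Record(now, "c", "e", 6) // span [c, e): 3 at "c" and 3 at "e"
	return s.Key()
}

func main() {
	// Splitting at "b" leaves weight 10 on the left and 10 on the right.
	fmt.Println(demoExactKey())
}
```

The single heavy request at "a" pulls the split point to "b", whereas counting requests without weights would treat all of them equally.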

To make the current load-based splitter (Finder) weighted, we make the following modifications:

Instead of using reservoir sampling, we use weighted reservoir sampling (a simplified version of A-Chao)
Remove the contained counter
Increment the left and right counters by the weight of the request rather than just 1
Split a weighted range request ([start, end), w) into two weighted point requests (start, w/2) and (end, w/2)
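
The four modifications above can be sketched together as follows. This is an illustration under stated assumptions, not the PR's WeightedFinder: the type and function names are hypothetical, keys are strings, the reservoir size is a constructor argument assumed to be positive, and the randSource interface assumes an Intn method alongside Float64. The inclusion probability size*w/totalWeight follows the A-Chao scheme; a key that is not sampled only bumps the left or right counter of each existing sample.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// randSource lists the two calls this sketch needs; Intn is an assumption.
type randSource interface {
	Float64() float64
	Intn(n int) int
}

var _ randSource = (*rand.Rand)(nil) // *rand.Rand satisfies the interface

type sample struct {
	key         string
	left, right float64 // weight recorded left of / at-or-right of key
}

type weightedFinder struct {
	samples     []sample
	size        int // reservoir size, assumed > 0
	totalWeight float64
	rnd         randSource
}

func newWeightedFinder(size int, rnd randSource) *weightedFinder {
	return &weightedFinder{samples: make([]sample, 0, size), size: size, rnd: rnd}
}

func (f *weightedFinder) record(key string, w float64) {
	f.totalWeight += w
	if len(f.samples) < f.size { // reservoir not full yet: always keep the key
		f.samples = append(f.samples, sample{key: key})
		return
	}
	if f.rnd.Float64() > float64(f.size)*w/f.totalWeight {
		// Not sampled: add the weight to one counter of every candidate.
		for i := range f.samples {
			if key < f.samples[i].key {
				f.samples[i].left += w
			} else {
				f.samples[i].right += w
			}
		}
		return
	}
	// Sampled: replace a random candidate, resetting its counters.
	f.samples[f.rnd.Intn(f.size)] = sample{key: key}
}

// Record turns a span into two point requests carrying half the weight each.
func (f *weightedFinder) Record(start, end string, w float64) {
	if end == "" {
		f.record(start, w)
		return
	}
	f.record(start, w/2)
	f.record(end, w/2)
}

// Key returns the candidate with the lowest balance score, or "" if none.
func (f *weightedFinder) Key() string {
	best, bestScore := "", math.Inf(1)
	for _, s := range f.samples {
		if s.left+s.right == 0 {
			continue
		}
		if score := math.Abs(s.left-s.right) / (s.left + s.right); score < bestScore {
			best, bestScore = s.key, score
		}
	}
	return best
}

// fixedSource is a deterministic stand-in used only for the demo.
type fixedSource float64

func (s fixedSource) Float64() float64 { return float64(s) }
func (fixedSource) Intn(int) int       { return 0 }

func demoWeightedKey() string {
	f := newWeightedFinder(2, fixedSource(0.99)) // 0.99: never replace below
	f.Record("c", "", 1)                         // fills the reservoir
	f.Record("f", "", 1)
	f.Record("a", "", 1)
	f.Record("d", "g", 2) // counted as ("d", 1) and ("g", 1)
	f.Record("e", "", 1)
	f.Record("b", "", 1)
	return f.Key()
}

func main() {
	// Candidate "c" ends with left=2, right=3; candidate "f" with left=4, right=1.
	fmt.Println(demoWeightedKey())
}
```

Injecting the random source keeps the demo deterministic; passing rand.New(rand.NewSource(seed)) instead gives the randomized behaviour.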

For more details, see (internal)
https://docs.google.com/document/d/1bdSxucz-xFzwnxL3fFXNZsRc9Vsht0oO0XuZrz5Iw84/edit#bookmark=id.xjc41tm3jx3x.

Release note (ops change): The load-based splitter has been redesigned to be more consistent with CPU-based rebalancing rather than QPS-based rebalancing to improve range splits.

kvoli

Great stuff.

I left a few comments, mostly w.r.t. readability.

The algorithm itself looks good.

Reviewed 11 of 11 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @KaiSun314)

pkg/kv/kvserver/split/decider.go line 104 at r1 (raw file):

	loadSplitterMetrics *LoadSplitterMetrics,
) {
	if randSource == nil {

What calls this with nil? Could you add a comment here explaining, or update the callers to always pass a source?

pkg/kv/kvserver/split/decider.go line 155 at r1 (raw file):

		if d.mu.lastQPS >= d.qpsThreshold() {
			if d.mu.splitFinder == nil {
				d.mu.splitFinder = NewWeightedFinder(now, d.randSource)

The old finder code is still in this commit. What plans do you have for it? This instantiates the weighted finder so does that mean the old code is now unused?

We could either:

Completely remove the old code and tests.
Mark the old code as deprecated, e.g. DeprecatedFinder and deprecated_finder.go, then add a cluster setting EnableDeprecatedLBSplitFinder that defaults to false.

pkg/kv/kvserver/split/finder.go line 183 at r1 (raw file):

}

// NoSplitKeyCauseLogMsg returns a log message containing all of this

nit: consider rewording this comment, specifically the words "all" and "this".

pkg/kv/kvserver/split/weighted_finder.go line 21 at r1 (raw file):

	"github.com/cockroachdb/cockroach/pkg/roachpb"
)

nit: Consider adding a block comment above that links the A-Chao paper or Wikipedia.

pkg/kv/kvserver/split/weighted_finder.go line 29 at r1 (raw file):

}

type RandSource interface {

nit: Add comments to these exported structs.

pkg/kv/kvserver/split/weighted_finder.go line 42 at r1 (raw file):

}

func NewWeightedFinder(startTime time.Time, randSource RandSource) *WeightedFinder {

Add a comment for this fn.

pkg/kv/kvserver/split/weighted_finder.go line 56 at r1 (raw file):

func (f *WeightedFinder) record(key roachpb.Key, weight float64) {
	if f == nil {

When can this condition hit?

pkg/kv/kvserver/split/weighted_finder.go line 68 at r1 (raw file):

	} else if f.randSource.Float64() > splitKeySampleSize*weight/f.totalWeight {
		for i := range f.samples {
			if comp := key.Compare(f.samples[i].key); comp < 0 {

nit: This is an important piece of code. The comment is good, could you also add a small diagram similar to the RFC for the cases to illustrate < and >=.
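
One way to illustrate the two cases is a small worked example. The sketch below is hypothetical (bump, counters and demoCounters are not names from the PR) and uses bytes.Compare on raw byte slices in place of roachpb.Key.Compare. With candidate split key "k", a recorded key that sorts before "k" counts toward the left side, and a key equal to or after "k" counts toward the right side.

```go
package main

import (
	"bytes"
	"fmt"
)

// counters holds the weight seen on each side of one candidate split key.
type counters struct{ left, right float64 }

// bump adds weight to the left counter when key < candidate, and to the
// right counter when key >= candidate (a split at candidate puts the
// candidate key itself in the right-hand range).
func bump(c *counters, candidate, key []byte, weight float64) {
	if bytes.Compare(key, candidate) < 0 {
		c.left += weight
	} else {
		c.right += weight
	}
}

// demoCounters records three weighted keys against candidate "k".
func demoCounters() (left, right float64) {
	var c counters
	candidate := []byte("k")
	bump(&c, candidate, []byte("i"), 3) // "i" <  "k": left  becomes 3
	bump(&c, candidate, []byte("k"), 5) // "k" == "k": right becomes 5
	bump(&c, candidate, []byte("m"), 2) // "m" >  "k": right becomes 7
	return c.left, c.right
}

func main() {
	left, right := demoCounters()
	fmt.Printf("left=%v right=%v\n", left, right)
}
```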

pkg/kv/kvserver/split/weighted_finder.go line 110 at r1 (raw file):

			continue
		}
		balanceScore := math.Abs(s.left-s.right) / (s.left + s.right)

nit: As above, this is one of the most important parts of the code. Could you add an example in a comment?
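
A worked example of the score could look like the sketch below. The formula is the one quoted from line 110; the helper name balanceScore and the sample values are illustrative, and the caller is assumed to skip candidates whose left and right counters are both zero. A score of 0 means a perfectly even split and a score of 1 means all recorded weight falls on one side.

```go
package main

import (
	"fmt"
	"math"
)

// balanceScore is |left - right| / (left + right). It assumes
// left + right > 0; the caller must skip empty candidates.
func balanceScore(left, right float64) float64 {
	return math.Abs(left-right) / (left + right)
}

func main() {
	fmt.Printf("%.2f\n", balanceScore(50, 50))  // even split: 0.00
	fmt.Printf("%.2f\n", balanceScore(75, 25))  // 50/100:     0.50
	fmt.Printf("%.2f\n", balanceScore(100, 0)) // one-sided:  1.00
}
```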

pkg/kv/kvserver/split/load_based_splitter_test.go line 335 at r1 (raw file):

}

func TestCompareFinders(t *testing.T) {

You could change this test to be an example, rather than a unit test - given that there is no assertion atm.

Example_Finder()

see: https://go.dev/blog/examples
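
For reference, a testable example has roughly the shape sketched below. The helper halveRangeWeight and the example name are hypothetical and unrelated to TestCompareFinders; the point is only the mechanism. When the function lives in a file ending in _test.go, go test runs it and compares what it prints against the trailing Output comment. The main function is there only so the sketch can also be run directly.

```go
package main

import "fmt"

// halveRangeWeight is a hypothetical helper: it splits a span's weight
// evenly between its start key and its end key.
func halveRangeWeight(w float64) (startW, endW float64) {
	return w / 2, w / 2
}

// Example_halveRangeWeight is picked up by `go test` when this file is
// named *_test.go; the Output comment is the assertion.
func Example_halveRangeWeight() {
	startW, endW := halveRangeWeight(4)
	fmt.Printf("start=%v end=%v\n", startW, endW)
	// Output:
	// start=2 end=2
}

func main() {
	Example_halveRangeWeight()
}
```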

pkg/kv/kvserver/split/load_based_splitter_test.go line 342 at r1 (raw file):

	runTestMultipleSettings(t, []settings{
		{
			desc:                    "Weighted Finder",

Add a more descriptive desc e.g. weighted/start=uniform/length=uniform

pkg/kv/kvserver/split/weighted_finder_test.go line 51 at r1 (raw file):

// TestSplitWeightedFinderKey verifies the Key() method correctly
// finds an appropriate split point for the range.
func TestSplitWeightedFinderKey(t *testing.T) {

Are the tests in this file very similar to the unweighted split tests?

I think adding some weights here is necessary for testing - or add granular assertions to the TestCompareFinders.

KaiSun314

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @kvoli)

pkg/kv/kvserver/split/decider.go line 104 at r1 (raw file):