Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: add ALTER RANGE x SPLIT syntax #77972

Closed
wants to merge 1 commit into from

Conversation

lunevalex
Copy link
Collaborator

Fixes #55116

Extend the ALTER RANGE syntax to support splitting a range.
ALTER RANGE x SPLIT will split a range at the best available
load based key. The new command will not perform a split directly
but rather advise the load based split decider to ignore the QPS
threshold and perform a load based split based on the lastest
available information. The decider has a 10s timeout built in between
deciding to split and determing the split key, so the actual split will
occur approximately 10s after the commnad is run.

To support all of this we changed AdminSplitRequest and added a new parameter
LoadBased. When set this will trigger a load based split on the range
through the split_queue.

Release note(sql change): Enhance the ALTER RANGE syntax to
allow an operator to manually perform a load based split of
a range.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @lunevalex)


pkg/sql/parser/sql.y, line 2120 at r1 (raw file):

    }
  }
| ALTER RANGE SPLIT FOR select_stmt WITH EXPIRATION a_expr

I recommend putting the FOR select_stmt at the end in every case. Every time we want grammar with a keyword on the right side of a SELECT clause at a top-level statement, we're closing the door to syntax extensions within SELECT itself.

@lunevalex lunevalex marked this pull request as ready for review June 13, 2022 21:18
@lunevalex lunevalex requested review from a team as code owners June 13, 2022 21:18
@lunevalex lunevalex requested a review from a team June 13, 2022 21:18
@lunevalex lunevalex requested review from a team as code owners June 13, 2022 21:18
@lunevalex lunevalex force-pushed the alex/load-split branch 3 times, most recently from e2203d5 to 91930ed Compare June 14, 2022 17:39
Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 32 files at r1, 28 of 38 files at r2, 1 of 3 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz, @lunevalex, and @nvanbenschoten)


pkg/cli/cliflags/flags.go line 1249 at r3 (raw file):

	DemoWorkloadMaxQPS = FlagInfo{
		Name:        "workload-max-qps",
		Description: "The maximum QPS when a workload is  running.",

This looks accidental.


pkg/kv/db.go line 604 at r3 (raw file):

"as determined by the range stats"

This doesn't sound correct. We're not using the range stats to compute the split key, we're using a 10s snapshot of observed query load.


pkg/kv/db.go line 613 at r3 (raw file):

//
func (db *DB) AdminLoadBasedSplit(
	ctx context.Context, startKey interface{}, expirationTime hlc.Timestamp,

Let's add a sentence to the comment above about what startKey means.


pkg/kv/kvserver/replica_command.go line 338 at r3 (raw file):

	// traffic to this range.
	if args.LoadBased {
		r.loadBasedSplitter.ForceSplitDecision()

Are there cases where the split may not immediately go into effect? Or may fail? Should we be waiting on the split instead of immediately returning to the client so that we can return the result? If we want return a real response, we could use r.store.splitQueue.MaybeAddCallback to listen to the outcome of the next range split attempt, like we do when backpressuring writes on range splits.


pkg/kv/kvserver/split/decider.go line 98 at r3 (raw file):

// ForceSplitDecision forces the Decider to enable tracking of individual keys
// to split on, which will result in a split happening once enough
// data is collected. .

nit: double "."

@aayushshah15 aayushshah15 self-assigned this Aug 15, 2022
@lunevalex lunevalex requested a review from a team as a code owner September 27, 2022 20:40
@lunevalex
Copy link
Collaborator Author

@nvanbenschoten @aayushshah15 I finally got a few free moments, so I fixed up the cosmetic comments and rebased this to the latest master.

This leaves one open question on the UX of calling AdminSplitLoad and whether it should return immediately or wait for the split to be applied. In all cases it will take at least 10s to apply the split to the range, for that to occur the range has to also experience some traffic so that the system could measure QPS on the range. The way this works right now is that a range is selected for a load split, we then activate the machinery to measure QPS per key to pick a good key to split on, this is done for 10s. Once the data is collected the split is executed. I specifically did not add any waiting to the replica_command path because this flow is non-deterministic. This is why the UX for ALTER RANGE X SPLIT is that it says split requested should happen in X seconds, rather than have the UI hang until it detects a split, which may not happen or happen past the timeout expiration. Happy to change that but would like some guidance on what UX we want here.

Fixes cockroachdb#55116

Extend the ALTER RANGE syntax to support splitting a range.
ALTER RANGE x SPLIT will split a range at the best available
load based key. The new command will not perform a split directly
but rather advise the load based split decider to ignore the QPS
threshold and perform a load based split based on the lastest
available information. The decider has a 10s timeout built in between
deciding to split and determing the split key, so the actual split will
occur approximately 10s after the commnad is run.

To support all of this we changed AdminSplitRequest and added a new parameter
LoadBased. When set this will trigger a load based split on the range
through the split_queue.

Release note(sql change): Enhance the ALTER RANGE syntax to
allow an operator to manually perform a load based split of
a range.
@rafiss rafiss removed the request for review from a team April 26, 2023 20:06
@lunevalex lunevalex closed this Jan 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

debug: add capability to "force" a load-based split
5 participants