kvserver: redesign load-based splitter to be consistent with new rebalancing signals #90574
Closed
Tracked by #90582
Labels
A-kv-distribution: Relating to rebalancing and leasing.
C-enhancement: Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv: KV Team
Comments
KaiSun314 added the C-enhancement and A-kv-distribution labels on Oct 24, 2022
irfansharif changed the title from "Redesign Load-Based Splitter Consistent With New Rebalancing Signals" to "kvserver: redesign load-based splitter consistent with new rebalancing signals" on Oct 28, 2022
irfansharif changed the title from "kvserver: redesign load-based splitter consistent with new rebalancing signals" to "kvserver: redesign load-based splitter to be consistent with new rebalancing signals" on Oct 28, 2022
craig bot pushed a commit that referenced this issue on Nov 14, 2022
91636: split: Add a testing framework to benchmark load splitter approaches r=kvoli a=KaiSun314

Informs: #90574

Here, we create a testing framework to benchmark load-based splitters, to enable experimenting with and comparing different designs and settings.

This testing framework takes as input:
1. Generator settings to generate the requests for the load splitter to record:
   - Start key generator type (zipfian or uniform) and iMax
   - Span length generator type (zipfian or uniform) and iMax
   - Weight generator type (zipfian or uniform) and iMax
2. Range request percent (percent of range requests [startKey, endKey) as opposed to point requests with just a start key)
3. Load-based splitter constructor
4. Random seed

This testing framework performs the following work:
1. Generates the requests for the load splitter to record
2. Calculates the optimal split key and its left / right weights
3. Constructs the load splitter and invokes the load splitter's Record function on all the generated requests
4. Invokes the load splitter's Key function to get the split key and calculates its left / right weights
5. Maintains the times it took to execute the load splitter's Record and Key functions

The framework also supports repeating the test multiple times with different random numbers and calculating the average / max percentage differences of left / right weights and execution times, as well as running tests with different settings and printing the test results for each setting.

Example of testing framework output:

![Load-Based Splitter Testing Framework Sample Output](https://user-images.githubusercontent.com/28762332/201250317-7bf71fd7-0ef2-44d9-807a-45c66aab79ea.png)

Release note: None

Co-authored-by: Kai Sun <[email protected]>
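As a rough, hypothetical sketch of the harness shape described above (not the actual code from that PR; the names `weightedRequest`, `loadSplitter`, and `runTrial` are invented for illustration), a single trial might look like this in Go:

```go
package splitbench

import (
	"fmt"
	"math/rand"
	"time"
)

// weightedRequest is a point request (span == 1) or a range request covering
// [start, start+span), carrying some measure of load as its weight.
type weightedRequest struct {
	start, span int
	weight      float64
}

// loadSplitter is the surface being benchmarked: record load, then ask for a
// split key.
type loadSplitter interface {
	Record(ts time.Time, start, span int, weight float64)
	Key() int
}

// runTrial generates weighted requests, feeds them to a splitter built by the
// given constructor, and reports the split key and the left/right weight it
// induces, plus how long recording took.
func runTrial(seed int64, n int, rangeReqFraction float64, newSplitter func() loadSplitter) {
	rng := rand.New(rand.NewSource(seed))
	startKeys := rand.NewZipf(rng, 1.1, 1, 10000) // zipfian start-key generator, iMax = 10000
	reqs := make([]weightedRequest, n)
	for i := range reqs {
		r := weightedRequest{start: int(startKeys.Uint64()), span: 1, weight: 1 + rng.Float64()*9}
		if rng.Float64() < rangeReqFraction {
			r.span = 1 + rng.Intn(100) // turn it into a range request
		}
		reqs[i] = r
	}

	s := newSplitter()
	recordStart := time.Now()
	for _, r := range reqs {
		s.Record(time.Now(), r.start, r.span, r.weight)
	}
	recordDur := time.Since(recordStart)

	key := s.Key()
	var left, right float64
	for _, r := range reqs {
		// Simplification: attribute a request's whole weight to the side of
		// its start key.
		if r.start < key {
			left += r.weight
		} else {
			right += r.weight
		}
	}
	fmt.Printf("split=%d left=%.1f right=%.1f record=%s\n", key, left, right, recordDur)
}
```

Driving such a trial across a grid of generator settings and seeds, and aggregating the left/right weight differences and timings, yields the kind of comparison table shown in the sample output above.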
KaiSun314 added a commit to KaiSun314/cockroach that referenced this issue on Dec 17, 2022
split: Redesign the load-based splitter to be consistent with new rebalancing signals

Fixes: cockroachdb#90574

In the current load splitter, we find the split key that best balances the QPS of the left and right sides. As a result, each request is unweighted, since one request contributes one to the QPS. In particular, the load splitter does not differentiate between what kinds of requests they are, how heavy each request is, and what resources these requests consume, which can result in scenarios where QPS is balanced but one side has a lot more work due to a few heavy requests.

Moreover, the current load splitter treats requests that contain a split key as "contained". When optimizing for QPS, contained requests are bad, since splitting at a point inside a contained request will not help lower the QPS of either side. However, when optimizing for other signals like CPU, splitting at a point inside a contained request is great, as each side will get part of the work of processing that request.

This motivates a redesign of the load splitter, one that enables recording weighted requests and considers contained requests in the weight balancing for splitting.

In this PR, we redesign the load-based splitter with the following interface:
1. Record a point key "start" or span "[start, end)" with a weight "w" at a specific time "ts", where "w" is some measure of load recorded for a span, e.g. Record(ts, start, w) or Record(ts, [start, end), w)
2. Find a split key such that the load (i.e. total weight) on the resulting split ranges would be as equal as possible according to the recorded loads above, e.g. Key()

To make the current load-based splitter (Finder) weighted, we make the following modifications:
1. Instead of using reservoir sampling, we use weighted reservoir sampling (a simplified version of A-Chao)
2. Remove the contained counter
3. Increment the left and right counters by the weight of the request rather than just 1
4. Split a weighted range request ([start, end), w) into two weighted point requests (start, w/2) and (end, w/2)

For more details, see the design doc: https://docs.google.com/document/d/1bdSxucz-xFzwnxL3fFXNZsRc9Vsht0oO0XuZrz5Iw84/edit#bookmark=id.xjc41tm3jx3x.

Release note (ops change): The load-based splitter has been redesigned to be more consistent with CPU-based rebalancing rather than QPS-based rebalancing to improve range splits.
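To make the four modifications above concrete, here is a hedged, self-contained Go sketch of a weighted Finder. It is illustrative only (integer keys, invented names), not the actual pkg/kv/kvserver/split implementation:

```go
package split

import (
	"math"
	"math/rand"
	"time"
)

const numSamples = 20

// sample is a candidate split key plus the weight observed strictly to its
// left and right since it was sampled.
type sample struct {
	key         int
	left, right float64
}

// weightedFinder is a sketch of the redesigned load-based splitter.
type weightedFinder struct {
	samples     []sample
	totalWeight float64
	rng         *rand.Rand
}

// Record registers load of weight w over [start, end); point requests pass
// end == start. A range request is treated as two half-weight point requests
// at its endpoints (modification 4).
func (f *weightedFinder) Record(_ time.Time, start, end int, w float64) {
	if end > start {
		f.recordPoint(start, w/2)
		f.recordPoint(end, w/2)
		return
	}
	f.recordPoint(start, w)
}

func (f *weightedFinder) recordPoint(key int, w float64) {
	f.totalWeight += w
	// Weighted reservoir sampling in the spirit of a simplified A-Chao
	// (modification 1): a new key displaces a random existing sample with
	// probability proportional to its weight.
	if len(f.samples) < numSamples {
		f.samples = append(f.samples, sample{key: key})
	} else if f.rng.Float64() < w*numSamples/f.totalWeight {
		f.samples[f.rng.Intn(numSamples)] = sample{key: key}
	}
	// Every retained sample tallies the request's weight to its left or right
	// (modification 3); there is no contained counter (modification 2).
	for i := range f.samples {
		switch {
		case key < f.samples[i].key:
			f.samples[i].left += w
		case key > f.samples[i].key:
			f.samples[i].right += w
		}
	}
}

// Key returns the sampled key whose left and right weights are most balanced,
// i.e. the split point that would divide the recorded load most evenly.
func (f *weightedFinder) Key() int {
	bestKey, bestDiff := 0, math.Inf(1)
	for _, s := range f.samples {
		if d := math.Abs(s.left - s.right); d < bestDiff {
			bestKey, bestDiff = s.key, d
		}
	}
	return bestKey
}
```

Replacing per-request counts with weights is what lets the same Finder serve CPU-based splitting: a single expensive request now moves the balance as much as many cheap ones.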
craig bot pushed a commit that referenced this issue on Jan 3, 2023
91593: kvserver: add load vec r=nvanbenschoten,andrewbaptist a=kvoli

Previously, disparate types were used when comparing `StoreCapacity`, `XThreshold` and `RangeUsageInfo`. This patch introduces a uniform intermediate type `Load`, which is used to perform arithmetic and comparisons between types representing load. The purpose of this change is to decouple changes to inputs, enabling existing dimensions to be modified and new ones to be added with less code modification.

The only load dimension added in this patch is `Queries`, which is then used in place of `QueriesPerSecond` within the rebalancing logic. Additionally, `RangeUsageInfo` is uniformly passed around in place of any specific calls to the underlying tracking data structure `ReplicaStats`.

Part of #91152

Release note: None

93838: split: Redesign the load-based splitter to be consistent with new rebalancing signals. r=kvoli a=KaiSun314

Fixes: #90574

For more details, see (internal) https://docs.google.com/document/d/1bdSxucz-xFzwnxL3fFXNZsRc9Vsht0oO0XuZrz5Iw84/edit#bookmark=id.xjc41tm3jx3x.

Release note (ops change): The load-based splitter has been redesigned to be more consistent with CPU-based rebalancing rather than QPS-based rebalancing to improve range splits.

Co-authored-by: Austen McClernon <[email protected]>
Co-authored-by: Kai Sun <[email protected]>
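As a loose illustration of the "load vec" idea in #91593 (not the real allocator API; `Dimension` and `Vector` here are invented names), a uniform intermediate type might look like this:

```go
package load

// Dimension identifies one tracked load signal.
type Dimension int

const (
	// Queries is the only dimension wired up in #91593, replacing direct use
	// of QueriesPerSecond in the rebalancing logic.
	Queries Dimension = iota
	// CPU is shown purely to illustrate how a new dimension would slot in.
	CPU
	nDimensions
)

// Vector carries one value per dimension so that store capacities, thresholds,
// and range usage can all be compared through the same type.
type Vector [nDimensions]float64

// Add returns the element-wise sum of two load vectors.
func (v Vector) Add(o Vector) Vector {
	for i := range v {
		v[i] += o[i]
	}
	return v
}

// Sub returns the element-wise difference of two load vectors.
func (v Vector) Sub(o Vector) Vector {
	for i := range v {
		v[i] -= o[i]
	}
	return v
}

// Exceeds reports whether any dimension of v is above the corresponding
// threshold, roughly how a rebalancer would ask "is this store overfull on
// any signal?".
func (v Vector) Exceeds(threshold Vector) bool {
	for i := range v {
		if v[i] > threshold[i] {
			return true
		}
	}
	return false
}
```

The payoff is that adding a dimension such as CPU becomes mostly a matter of extending the enum and the conversions into the vector type, rather than threading a new scalar through every comparison site.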
Problem:
The current load-based splitter is tuned to determine a split point that would evenly divide the QPS of a range if it were split. However, it is desirable to experiment with new rebalancing signals, and these signals should be symmetric between the rebalancer and the load-based splitter, to avoid the inconsistency where a range doesn't split yet cannot be rebalanced because it is too large. Taking CPU as an example: the load-based splitter does not support finding a split key when the magnitude of each recorded span is not 1. A simple workaround is to re-record the span against the decider once per increment of magnitude, but due to the sampling method this would not be fairly weighted when finding a split key.
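To make that last point concrete, here is a hypothetical Go fragment (neither signature below is the literal CockroachDB API) contrasting the unweighted workaround with the weighted recording the redesign should support:

```go
package splitproblem

import "time"

// span is a simplified key span [start, end).
type span struct{ start, end string }

// unweightedRecorder models the current decider: each Record call counts as 1.
type unweightedRecorder interface {
	Record(ts time.Time, s span)
}

// weightedRecorder models what the redesigned splitter should support.
type weightedRecorder interface {
	Record(ts time.Time, s span, weight float64)
}

// recordCPU shows the difference: replaying a span once per unit of load both
// skews the reservoir sample and scales poorly, while a weighted Record states
// the magnitude directly.
func recordCPU(u unweightedRecorder, w weightedRecorder, s span, cpuUnits int) {
	for i := 0; i < cpuUnits; i++ {
		u.Record(time.Now(), s) // naive workaround with the existing interface
	}
	w.Record(time.Now(), s, float64(cpuUnits)) // desired weighted observation
}
```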
Solution should support:
Jira issue: CRDB-20841
Epic: CRDB-20845