-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvserver: support different load dimensions in allocation code and cleanup #91152
Closed
Tracked by
#90582
Labels
A-kv-distribution
Relating to rebalancing and leasing.
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv
KV Team
Milestone
Comments
kvoli
added
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
A-kv-distribution
Relating to rebalancing and leasing.
labels
Nov 2, 2022
13 tasks
kvoli
changed the title
kvserver: support different load dimensions in allocation code
kvserver: support different load dimensions in allocation code and cleanup
Nov 2, 2022
kvoli
added a commit
to kvoli/cockroach
that referenced
this issue
Nov 9, 2022
Previously, disparate types were used when comparing `StoreCapacity`, `XThreshold` and `RangeUsageInfo`. This patch introduces a uniform intermediate type `Load`, which is used to perform arithmetic and comparison between types representing load. The purpose of this change is to decouple changes to the underlying inputs in these components to enable modifying existing load dimensions and adding new ones with less code modification. Additionally, `RangeUsageInfo` is uniformly passed around in place of any specific calls to the underlying accounting `ReplicaStats`, to a tight dependency. Part of cockroachdb#91152 Release note: None
craig bot
pushed a commit
that referenced
this issue
Nov 9, 2022
90604: ui: correct tooltip message for circuit breaker events chart r=koorosh a=koorosh This patch corrects the message on circuit breaker events chart to highlight that the chart displays the total number of events per aggregated interval of time instead of "per second". Release note: None Epic: none 91158: kvserver: misc cleanup of allocation code r=aayushshah15 a=kvoli This patch removes the deprecated method `rangeRebalanceThreshold` which was used for mixed version compatibility between 21.2 and 22.1. This patch adds a metric `kv.allocator.load_based_lease_transfers.follow_the_workload` which tracks the number of times the allocator returned a range lease transfer target based based on access locality. Previously there was no insight into how often this condition was hit. Part of #91152 Co-authored-by: Andrii Vorobiov <[email protected]> Co-authored-by: Austen McClernon <[email protected]>
kvoli
added a commit
to kvoli/cockroach
that referenced
this issue
Dec 21, 2022
Previously, disparate types were used when comparing `StoreCapacity`, `XThreshold` and `RangeUsageInfo`. This patch introduces a uniform intermediate type `Load`, which is used to perform arithmetic and comparison between types representing load. The purpose of this change is to decouple changes to inputs, enable modifying existing dimensions and adding new ones with less code modification. The only load dimension added in this patch is `Queries`, which is then used in place of `QueriesPerSecond` within the rebalancing logic. Additionally, `RangeUsageInfo` is uniformly passed around in place of any specific calls to the underlying tracking datastructure `ReplicaStats`. Part of cockroachdb#91152 Release note: None
kvoli
added a commit
to kvoli/cockroach
that referenced
this issue
Dec 27, 2022
Previously, disparate types were used when comparing `StoreCapacity`, `XThreshold` and `RangeUsageInfo`. This patch introduces a uniform intermediate type `Load`, which is used to perform arithmetic and comparison between types representing load. The purpose of this change is to decouple changes to inputs, enable modifying existing dimensions and adding new ones with less code modification. The only load dimension added in this patch is `Queries`, which is then used in place of `QueriesPerSecond` within the rebalancing logic. Additionally, `RangeUsageInfo` is uniformly passed around in place of any specific calls to the underlying tracking datastructure `ReplicaStats`. Part of cockroachdb#91152 Release note: None
kvoli
added a commit
to kvoli/cockroach
that referenced
this issue
Dec 29, 2022
Previously, disparate types were used when comparing `StoreCapacity`, `XThreshold` and `RangeUsageInfo`. This patch introduces a uniform intermediate type `Load`, which is used to perform arithmetic and comparison between types representing load. The purpose of this change is to decouple changes to inputs, enable modifying existing dimensions and adding new ones with less code modification. The only load dimension added in this patch is `Queries`, which is then used in place of `QueriesPerSecond` within the rebalancing logic. Additionally, `RangeUsageInfo` is uniformly passed around in place of any specific calls to the underlying tracking datastructure `ReplicaStats`. Part of cockroachdb#91152 Release note: None
kvoli
added a commit
to kvoli/cockroach
that referenced
this issue
Dec 29, 2022
Previously, disparate types were used when comparing `StoreCapacity`, `XThreshold` and `RangeUsageInfo`. This patch introduces a uniform intermediate type `Load`, which is used to perform arithmetic and comparison between types representing load. The purpose of this change is to decouple changes to inputs, enable modifying existing dimensions and adding new ones with less code modification. The only load dimension added in this patch is `Queries`, which is then used in place of `QueriesPerSecond` within the rebalancing logic. Additionally, `RangeUsageInfo` is uniformly passed around in place of any specific calls to the underlying tracking datastructure `ReplicaStats`. Part of cockroachdb#91152 Release note: None
kvoli
added a commit
to kvoli/cockroach
that referenced
this issue
Jan 3, 2023
Previously, disparate types were used when comparing `StoreCapacity`, `XThreshold` and `RangeUsageInfo`. This patch introduces a uniform intermediate type `Load`, which is used to perform arithmetic and comparison between types representing load. The purpose of this change is to decouple changes to inputs, enable modifying existing dimensions and adding new ones with less code modification. The only load dimension added in this patch is `Queries`, which is then used in place of `QueriesPerSecond` within the rebalancing logic. Additionally, `RangeUsageInfo` is uniformly passed around in place of any specific calls to the underlying tracking datastructure `ReplicaStats`. Part of cockroachdb#91152 Release note: None
craig bot
pushed a commit
that referenced
this issue
Jan 3, 2023
91593: kvserver: add load vec r=nvanbenschoten,andrewbaptist a=kvoli Previously, disparate types were used when comparing `StoreCapacity`, `XThreshold` and `RangeUsageInfo`. This patch introduces a uniform intermediate type `Load`, which is used to perform arithmetic and comparison between types representing load. The purpose of this change is to decouple changes to inputs, enable modifying existing dimensions and adding new ones with less code modification. The only load dimension added in this patch is `Queries`, which is then used in place of `QueriesPerSecond` within the rebalancing logic. Additionally, `RangeUsageInfo` is uniformly passed around in place of any specific calls to the underlying tracking datastructure `ReplicaStats`. Part of #91152 Release note: None 93838: split: Redesign the load-based splitter to be consistent with new rebalancing signals. r=kvoli a=KaiSun314 Fixes: #90574 In the current load splitter, we find the split key that best balances the QPS of the left and right sides. As a result, each request is unweighted, since one request contributes one to the QPS. In particular, the load splitter does not differentiate between what kinds of requests they are, how heavy the request is, and what resources these requests consume, which can result in scenarios where QPS is balanced but one side has a lot more work due to a few heavy requests. Moreover, the current load splitter treats requests that contain a split key as “contained”. Optimizing for QPS, contained requests are bad since splitting at a point in a contained request will not help lower the QPS of either side. However, optimizing for other signals like CPU, splitting at a point in a contained request is great as each side will get part of the work of processing that request. This motivates a redesign of the load splitter, one that enables recording weighted requests and considers contained requests in the weight balancing for splitting. In this PR, we redesign the load-based splitter with the following interface: 1. Record a point key “start” or span “[start, end)” with a weight “w” at a specific time “ts”, where “w” is some measure of load recorded for a span e.g. Record(ts, start, w) or Record(ts, [start, end), w) 2. Find a split key such that the load (i.e. total weight) on the resulting split ranges would be as equal as possible according to the recorded loads above e.g. Key() To make the current load-based splitter (Finder) weighted, we make the following modifications: 1. Instead of using reservoir sampling, we use weighted reservoir sampling (a simplified version of A-Chao) 2. Remove the contained counter 3. Increment the left and right counters by the weight of the request rather than just 1 4. Treat a weighted range request ([start, end), w) into two weighted point requests (start, w/2) and (end, w/2) For more details, see (internal) https://docs.google.com/document/d/1bdSxucz-xFzwnxL3fFXNZsRc9Vsht0oO0XuZrz5Iw84/edit#bookmark=id.xjc41tm3jx3x. Release note (ops change): The load-based splitter has been redesigned to be more consistent with CPU-based rebalancing rather than QPS-based rebalancing to improve range splits. Co-authored-by: Austen McClernon <[email protected]> Co-authored-by: Kai Sun <[email protected]>
kvoli
added a commit
to kvoli/cockroach
that referenced
this issue
Jan 3, 2023
Previously, disparate types were used when comparing `StoreCapacity`, `XThreshold` and `RangeUsageInfo`. This patch introduces a uniform intermediate type `Load`, which is used to perform arithmetic and comparison between types representing load. The purpose of this change is to decouple changes to inputs, enable modifying existing dimensions and adding new ones with less code modification. The only load dimension added in this patch is `Queries`, which is then used in place of `QueriesPerSecond` within the rebalancing logic. Additionally, `RangeUsageInfo` is uniformly passed around in place of any specific calls to the underlying tracking datastructure `ReplicaStats`. Part of cockroachdb#91152 Release note: None
completed on kvoli@eb764b9 |
kvoli
added a commit
to kvoli/cockroach
that referenced
this issue
Jan 12, 2023
Previously, only bytes written at the leaseholder were accounted for on a replica. Now, follower replicas will record the bytes written. Informs: cockroachdb#91152 Release note: None
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
A-kv-distribution
Relating to rebalancing and leasing.
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv
KV Team
The allocation code currently hardcodes many references to QPS which makes it difficult to swap out the dimension, without duplicating significant amounts of code.
This is unfortunate as the control mechanism for converging a single load dimension across stores is effective and not limited to just QPS.
The solution is to refactor several sections of the allocation code to support registering and using a different load dimension.
The solution should also involve clearing up interactions between the components in the following areas:
pkg/kv/kvserver/allocator/*
pkg/kv/kvserver/replicate_queue.go
pkg/kv/kvserver/store_rebalancer.go
pkg/kv/kvserver/replica_rankings.go
pkg/kv/kvserver/replicastats/*
pkg/kv/kvserver/replica_load.go
pkg/kv/kvserver/replica_metrics.go
pkg/kv/kvserver/store.go.Capacity()
pkg/kv/kvserver/store.go.updateReplicationGuages()
pkg/kv/kvserver/store.go.recordNewPerSecondStats()
pkg/kv/kvserver/store.go.GossipStore()
Jira issue: CRDB-21126
Epic CRDB-20845
The text was updated successfully, but these errors were encountered: