Avoid rebalancing from many sources into a single target node/store #82759
Labels
A-kv-distribution
Relating to rebalancing and leasing.
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
This came up in a conversation with @nvanbenschoten, proposed by @kvoli.
Problem
When a new node is added and the rebalance activity is high, many stores may choose to rebalance ranges into that newly added node. For example, when decommissioning a node and adding a new one: #79560.
This thundering herd problem can slow down decommissioning.
Proposed Solution
Stores should be aware of the number of pending rebalance snapshots coming into other stores, so that a store with a long queue will not be considered as a target for rebalancing. We can gossip the queue length of incoming rebalance snapshot requests to all stores.
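To make the first proposal concrete, here is a minimal sketch of target selection that skips stores with a long incoming snapshot queue. It is not the actual allocator code: the `candidate` type, the `PendingSnapshots` field, the `maxPending` threshold, and `pickTarget` are all hypothetical names, and the sketch assumes the queue length has already been received via gossip.

```go
package main

import "fmt"

// StoreID identifies a store. Hypothetical type for this sketch.
type StoreID int

// candidate is a hypothetical gossiped view of a potential rebalance target.
type candidate struct {
	ID               StoreID
	RangeCount       int // how full the store is
	PendingSnapshots int // assumed gossiped length of its incoming rebalance snapshot queue
}

// pickTarget returns the candidate with the fewest ranges among those whose
// pending snapshot queue is at most maxPending. It reports false when every
// candidate is over the threshold (or the list is empty).
func pickTarget(cands []candidate, maxPending int) (StoreID, bool) {
	var best candidate
	found := false
	for _, c := range cands {
		if c.PendingSnapshots > maxPending {
			continue // long queue: do not pile more snapshots onto this store
		}
		if !found || c.RangeCount < best.RangeCount {
			best, found = c, true
		}
	}
	return best.ID, found
}

func main() {
	cands := []candidate{
		{ID: 1, RangeCount: 900, PendingSnapshots: 0},
		{ID: 2, RangeCount: 850, PendingSnapshots: 1},
		{ID: 3, RangeCount: 10, PendingSnapshots: 12}, // newly added store, already swamped
	}
	id, ok := pickTarget(cands, 4)
	fmt.Println(id, ok)
}
```

With these made-up numbers, store 3 is the emptiest but is skipped because its queue exceeds the threshold, so the next-best store is chosen. Once the queue on store 3 drains and the smaller length is gossiped, it becomes eligible again.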
Alternatively, we can pick a "good enough" target instead of the best target, for example by picking a random (or round-robin) store to rebalance to out of the valid stores. The concern here is that it might take a long time to fill the newly added node (though we did not test that), and it is undesirable to leave a node underutilized for too long (days instead of hours?).
Jira issue: CRDB-16645