Commit
89217: kvserver: Use response data in the load-based splitter r=KaiSun314 a=KaiSun314

Fixes #87279

We investigated why running YCSB Workload E results in a single hot range. We observed that range queries of the form SELECT * FROM table WHERE pkey >= A LIMIT B result in all request spans having the same end key, similar to [A, range_end), rather than end keys that take the specified LIMIT into account. Since the majority of request spans have the same end key, the load splitter algorithm cannot find a split key that avoids too many contained requests while also balancing the requests to its left and right.

A proposed solution is to use the response span rather than the request span, since the response span more accurately reflects the keys that the request actually iterated over. We use the request span together with the response's resume span to derive the key span that the request actually iterated over.

Using response data (the resume span) rather than just the request span experimentally allows the load-based splitter to find a split key under range query workloads (YCSB Workload E, KV workload with spans).

Ops/sec for the YCSB-E workload with and without this change, for various numbers of nodes (3 / 5) and CPUs (8 / 32): https://docs.google.com/spreadsheets/d/1OcvRUkXORiGpr-f7cMAiuv9DW7qQZgconfqE4UbfQ2c/edit?usp=sharing

Release note (ops change): The load-based splitter now uses response data rather than just the request span, which gives it more accurate information about the keys iterated over and enables it to find a suitable split key under heavy range query workloads.

Co-authored-by: Kai Sun <[email protected]>
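
For illustration only, the derivation described above (request span plus resume span yields the span actually iterated over) might be sketched as follows. This is a minimal sketch, not the code in this change: `Span` and `TrueSpan` are hypothetical names, keys are plain strings rather than the repository's key types, and the forward/reverse handling assumes that a resume span always shares one boundary with the request span.

```go
package main

import "fmt"

// Span is a hypothetical, simplified key span [Key, EndKey).
// It stands in for the real span type, which is not shown here.
type Span struct {
	Key, EndKey string
}

// TrueSpan returns the part of the request span that was actually
// iterated over, given the response's resume span (nil if the request
// ran to completion).
//
// Assumption: a forward scan that stops early resumes at
// [next, req.EndKey), and a reverse scan resumes at [req.Key, next).
func TrueSpan(req Span, resume *Span) Span {
	if resume == nil {
		// No resume span: the whole request span was processed.
		return req
	}
	if resume.EndKey == req.EndKey {
		// Forward scan: keys in [req.Key, resume.Key) were visited.
		return Span{Key: req.Key, EndKey: resume.Key}
	}
	if resume.Key == req.Key {
		// Reverse scan: keys in [resume.EndKey, req.EndKey) were visited.
		return Span{Key: resume.EndKey, EndKey: req.EndKey}
	}
	// Unexpected shape: fall back to the request span.
	return req
}

func main() {
	// A LIMIT-ed scan over [a, z) that stopped before key f.
	req := Span{Key: "a", EndKey: "z"}
	resume := Span{Key: "f", EndKey: "z"}
	fmt.Printf("%+v\n", TrueSpan(req, &resume))
}
```

With spans derived this way, requests that share the end key range_end but stop at different keys no longer look identical to the splitter, which is the effect the message above describes.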