Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: add minimum to cpu split threshold #98107

Closed
kvoli opened this issue Mar 6, 2023 · 1 comment · Fixed by #98250
Closed

kvserver: add minimum to cpu split threshold #98107

kvoli opened this issue Mar 6, 2023 · 1 comment · Fixed by #98250
Assignees
Labels
A-kv-distribution Relating to rebalancing and leasing. branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) GA-blocker T-kv KV Team

Comments

@kvoli
Copy link
Collaborator

kvoli commented Mar 6, 2023

In #96128 we introduced CPU load based range splitting. to control when to split, kv.range_split.load_cpu_threshold was added.

Setting this value too low will result in excessive, destabilizing splitting, given some baseline foreground workload.

This issue is to add a minimum to this setting.

The current default is 500ms. A minimum of 50ms would be reasonable.

Jira issue: CRDB-25080

@kvoli kvoli added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-kv-distribution Relating to rebalancing and leasing. GA-blocker labels Mar 6, 2023
@kvoli kvoli self-assigned this Mar 6, 2023
@blathers-crl blathers-crl bot added the T-kv KV Team label Mar 6, 2023
@blathers-crl
Copy link

blathers-crl bot commented Mar 6, 2023

Hi @kvoli, please add branch-* labels to identify which branch(es) this release-blocker affects.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@kvoli kvoli added the branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 label Mar 6, 2023
craig bot pushed a commit that referenced this issue Mar 9, 2023
97717: multitenant: add AdminUnsplitRequest capability r=knz a=ecwall

Fixes #97716

1) Add a new `tenantcapabilitiespb.CanAdminUnsplit` capability.
2) Check capability in `Authorizer.HasCapabilityForBatch`.
3) Remove `ExecutorConfig.RequireSystemTenant` call from
   `execFactory.ConstructAlterTableUnsplit`,
   `execFactory.ConstructAlterTableUnsplitAll`.

Release note: None


98250: kvserver: add minimum cpu lb split threshold r=andrewbaptist a=kvoli

Previously, `kv.range_split.load_cpu_threshold` had no minimum setting value. It is undesirable to allow users to set this setting to low as excessive splitting may occur.

`kv.range_split.load_cpu_threshold` now has a minimum setting value of `10ms`.

See #96869 for additional context on the threshold.

Resolves: #98107

Release note (ops change): `kv.range_split.load_cpu_threshold` now has a minimum setting value of `10ms`.

98270: dashboards: add replica cpu to repl dashboard r=xinhaoz a=kvoli

In #96127 we added the option to load balance replica CPU instead of QPS across
stores in a cluster. It is desirable to view the signal being controlled for
rebalancing in the replication dashboard, similar to QPS.

This pr adds the `rebalancing.cpunanospersecond` metric to the replication
metrics dashboard.

The avg QPS graph on the replication graph previously described the metric as
"Exponentially weighted average", however this is not true.

This pr updates the description to just be "moving average" which is accurate.
Note that follow the workload does use an exponentially weighted value, however
the metric in the dashboard is not the same.

This pr also updates the graph header to include Replica in the title: "Average
Replica Queries per Node". QPS is specific to replicas. This is already
mentioned in the description.

Resolves: #98109


98289: storage: mvccExportToWriter should always return buffered range keys r=adityamaru a=stevendanna

In #96691, we changed the return type of mvccExportToWriter such that it now indicates when a CPU limit has been reached. As part of that change, when the CPU limit was reached, we would immedately `return` rather than `break`ing out of the loop. This introduced a bug, since the code after the loop that the `break` was taking us to is important. Namely, we may have previously buffered range keys that we need to write into our response still. By replacing the break with a return, these range keys were lost.

The end-user impact of this is that a BACKUP that _ought_ to have included range keys (such as a backup of a table with a rolled back IMPORT) would not include those range keys and thus would end up resurecting deleted keys upon restore.

This PR brings back the `break` and adds a test that covers exporting with range keys under CPU exhaustion.

This bug only ever existed on master.

Informs #97694

Epic: none

Release note: None

98329: sql: fix iteration conditions in crdb_internal.scan r=ajwerner a=stevendanna

Rather than using the Next() key of the last key in the response when iterating, we should use the resume span. The previous code could result in a failure in the rare case that the end key of our scan exactly matched the successor key of the very last key in the iteration.

Epic: none

Release note: None

Co-authored-by: Evan Wall <[email protected]>
Co-authored-by: Austen McClernon <[email protected]>
Co-authored-by: Steven Danna <[email protected]>
@craig craig bot closed this as completed in 903747d Mar 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-distribution Relating to rebalancing and leasing. branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) GA-blocker T-kv KV Team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant