-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
admission: goschedstats sampling period is too long #111125
Labels
A-admission-control
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
Comments
sumeerbhola
added
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
A-admission-control
labels
Sep 22, 2023
craig bot
pushed a commit
that referenced
this issue
Sep 29, 2023
111441: goschedstats: reduce underloadedRunnablePerProcThreshold r=aadityasondhi a=sumeerbhola By reducing this threshold, we more often sample the stats at 1ms intervals, which is desirable for admission control slot adjustment. Fixes #111125 Epic: none Release note: None 111521: roachprod: fix createTenantCertBundle script r=andy-kimball a=herkolategan The `createTenantCertBundle` function has a script in it that has erroneous "+" characters which results in an error when trying to start a secure cluster. This change removes those characters. Epic: None Release Note: None Co-authored-by: sumeerbhola <[email protected]> Co-authored-by: Herko Lategan <[email protected]>
craig bot
pushed a commit
that referenced
this issue
Sep 29, 2023
109638: sql: use the correct locking strength for FOR SHARE clause r=michae2 a=arulajmani Previously, FOR SHARE and FOR KEY SHARE would use non-locking KV scans. Now that the lock table supports shared locks, we can use lock.Shared as the locking strength for KV scans. This patch does that, and in doing so, wires up SHARED locks end to end. By default, we turn of this functionality for serializable transactions. Instead, it's gated behind a session setting called `enable_shared_locking_for_serializable`. Informs #91545 Release note (sql change): SELECT FOR SHARE and SELECT FOR UPDATE previously did not acquire any locks. Users issuing these statements would expect them to acquire shared locks (multiple readers allowed, no writers though). This patch switches over the behavior to acquire such read locks. For serializable transactions, we default to the old behaviour, unless the `enable_shared_locking_for_serializable` session setting is set to true. We'll probably switch this behavior in the future, but for now, given shared locks are a preview feature, we gate things behind a session setting. 111441: goschedstats: reduce underloadedRunnablePerProcThreshold r=aadityasondhi a=sumeerbhola By reducing this threshold, we more often sample the stats at 1ms intervals, which is desirable for admission control slot adjustment. Fixes #111125 Epic: none Release note: None 111466: ui: update default timescale to 1h r=maryliag a=maryliag On the Metrics page, the default value was 10min, which was too small. This commit updates it to 1h to match the most selected value on the SQL Activity. Fixes #96479 Release note: None 111521: roachprod: fix createTenantCertBundle script r=andy-kimball a=herkolategan The `createTenantCertBundle` function has a script in it that has erroneous "+" characters which results in an error when trying to start a secure cluster. This change removes those characters. Epic: None Release Note: None 111526: kv/spanset: permit writes to lock table by shared lock latching configuration r=nvanbenschoten a=nvanbenschoten Fixes #111409. Fixes #111492. This commit updates addLockTableSpans to account for the way that shared locking requests declare their latches. Even though they declare a "read" latch, they do so at max timestamp, which gives them sufficient isolation to write to the lock table without having to declare a write latch and be serialized with other shared lock acquisitions. Release note: None Co-authored-by: Arul Ajmani <[email protected]> Co-authored-by: sumeerbhola <[email protected]> Co-authored-by: maryliag <[email protected]> Co-authored-by: Herko Lategan <[email protected]> Co-authored-by: Nathan VanBenschoten <[email protected]>
THardy98
pushed a commit
to THardy98/cockroach
that referenced
this issue
Oct 6, 2023
By reducing this threshold, we more often sample the stats at 1ms intervals, which is desirable for admission control slot adjustment. Fixes cockroachdb#111125 Epic: none Release note: None
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
A-admission-control
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
goschedstats uses a 250ms sampling interval when the CPU utilization is "low"
cockroach/pkg/util/goschedstats/runnable.go
Lines 216 to 220 in ed75b08
This "low" is based on 1 runnable goroutine per proc
https://github.com/cockroachdb/cockroach/blob/master/pkg/util/goschedstats/runnable.go#L72
This threshold is too high.
Here is a node running at close to 80% cpu utilization but is mostly doing sampling at a 250ms interval, due to which the slot adjustment is not fast enough, and we see slot exhaustion.
Jira issue: CRDB-31779
The text was updated successfully, but these errors were encountered: