-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
admission,kvadmission: use tenant cpu consumption for inter-tenant fa… #108364
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Did you happen to run the admission_control_multitenant_fairness tests with these changes? Would be interesting to see this CPU-time-based fair sharing in play. Would be more interesting with mixed workloads (kv + tpcc).
@@ -383,7 +390,8 @@ func (n *controllerImpl) AdmitKVWork( | |||
func (n *controllerImpl) AdmittedKVWorkDone(ah Handle, writeBytes *StoreWriteBytes) { | |||
n.elasticCPUGrantCoordinator.ElasticCPUWorkQueue.AdmittedWorkDone(ah.elasticCPUWorkHandle) | |||
if ah.callAdmittedWorkDoneOnKVAdmissionQ { | |||
n.kvAdmissionQ.AdmittedWorkDone(ah.tenantID) | |||
cpuTime := grunning.Time() - ah.cpuStart |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use grunning.Difference()
instead to paper over #95529.
@@ -105,25 +105,103 @@ closed epoch: 0 tenantHeap len: 2 top tenant: 53 | |||
tenant-id: 53 used: 0, w: 1, fifo: -128 waiting work heap: [0: pri: normal-pri, ct: 3, epoch: 0, qt: 100] | |||
tenant-id: 71 used: 1, w: 1, fifo: -128 waiting work heap: [0: pri: low-pri, ct: 4, epoch: 0, qt: 100] | |||
|
|||
# The system tenant work is done and consumed 10 cpu. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"10 cpu" reads oddly, perhaps "10 cpu nanos" instead? Ditto further below.
@@ -63,7 +63,7 @@ handle: 50ms | |||
admitted-work-done running=10ms allotted=50ms | |||
---- | |||
granter: return-grant=40ms | |||
work-queue: | |||
work-queue: adjustTenantUsed: tenantID=system additionalUsed=-40000000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[minor nit] Could we format this in the time.Duration units like we do for the other parameters? (running=
, allotted=
, return-grant=
). Perhaps avoid the snake casing too, formatting using something like:
work-queue: adjust-used: tenant=system [-/+]40ms
…irness Previously, we were using the instantaneous slots consumed, since that code predated the grunning instrumentation for cpu consumption. The reset logic for tenantInfo.used is now the same for WorkQueues that use slots and tokens. Additionally, there was a bug in WorkQueue.adjustTenantTokens in that it forgot to fix the heap -- this is fixed and tested. Fixes cockroachdb#91533 Epic: none Release note: None
970caa0
to
8e307e0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
Did you happen to run the admission_control_multitenant_fairness tests with these changes? Would be interesting to see this CPU-time-based fair sharing in play. Would be more interesting with mixed workloads (kv + tpcc).
I didn't. I've added a TODO to alter that test to have different sized work in the two workloads.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif, @renatolabs, and @smg260)
pkg/kv/kvserver/kvadmission/kvadmission.go
line 393 at r1 (raw file):
Previously, irfansharif (irfan sharif) wrote…
Use
grunning.Difference()
instead to paper over #95529.
I forgot about that -- thanks for the reminded. I've changed this to 1 nano instead.
pkg/util/admission/testdata/elastic_cpu_work_queue
line 66 at r1 (raw file):
Previously, irfansharif (irfan sharif) wrote…
[minor nit] Could we format this in the time.Duration units like we do for the other parameters? (
running=
,allotted=
,return-grant=
). Perhaps avoid the snake casing too, formatting using something like:work-queue: adjust-used: tenant=system [-/+]40ms
Done
pkg/util/admission/testdata/work_queue
line 108 at r1 (raw file):
Previously, irfansharif (irfan sharif) wrote…
"10 cpu" reads oddly, perhaps "10 cpu nanos" instead? Ditto further below.
Done
bors r=irfansharif |
Build succeeded: |
…irness
Previously, we were using the instantaneous slots consumed, since that code predated the grunning instrumentation for cpu consumption. The reset logic for tenantInfo.used is now the same for WorkQueues that use slots and tokens. Additionally, there was a bug in WorkQueue.adjustTenantTokens in that it forgot to fix the heap -- this is fixed and tested.
Fixes #91533
Epic: none
Release note: None