admission: improve inter-tenant fair sharing of CPU #91533
Labels
A-admission-control
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
Admission control does CPU allocation via (dynamically estimated) slots. There are reasons for why slots are used instead of tokens (say cpu-nanos/s), that are discussed in https://docs.google.com/document/d/16RISoB3zOORf24k0fQPDJ4tTCisBVh-PkCZEit4iwV8/edit#heading=h.amw25se2ob5m and https://docs.google.com/document/d/16RISoB3zOORf24k0fQPDJ4tTCisBVh-PkCZEit4iwV8/edit#bookmark=id.479u56nbiz7.
When this code was written, we did not have the golang runtime changes to measure the running time of each goroutine. So slots are used for two purposes:
tenantInfo.used
is used for ordering in the heap, a form of weighted fair-sharing.The latter is lacking history and does not have good fair-sharing properties. Consider a scenario with 2 slots and 10 active tenants (each tenant having multiple enqueued work), none of which have yet received any allocation. We will allocate to 2 arbitrary tenants, say T0, T1. Then when T0 completes its work, only T1 has 1 used slot, so this free slot will not go to T1, but could go again to T0. Even if it does not go to T0 immediately, the next decision could again give a slot to T0. The problem is that we have forgotten the allocation given to T0 as soon as T0 returned the slot. We don't want a long history, since fair-sharing algorithms (especially the real-time ones) ought to provide latency isolation, but some history is important. When used with tokens, WorkQueue resets this history every 1s. This 1s reset is crude and could be improved, but it suffices for now (for those interested in an intro to fair queuing algorithms see https://intronetworks.cs.luc.edu/current/html/fairqueuing.html#fair-queuing).
In practice, we don't suffer a lot from this problem because we often have 100+ slots and only a handful of active tenants, but with wider adoption of multi-tenant CockroachDB this will not necessarily be the case. We are also considering work to reduce the number of slots while achieving full CPU utilization, which will make it more likely that the number of active tenants exceed the number of slots.
An easy fix to this is to continue to allocate slots, but change
tenantInfo.used
to account for the cpu-nanos of running time of that tenant. Then the heap ordering will be based on preferring the tenant with the smallest running time.Jira issue: CRDB-21308
The text was updated successfully, but these errors were encountered: