forked from cockroachdb/cockroach
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
admission: support non-blocking {Store,}WorkQueue.Admit()
Part of cockroachdb#95563. For end-to-end flow control of replicated writes, we want to enable below-raft admission control through the following API on kvadmission.Controller: // AdmitRaftEntry informs admission control of a raft log entry being // written to storage (for the given tenant, the specific range, and // on the named store). AdmitRaftEntry( context.Context, roachpb.TenantID, roachpb.StoreID, roachpb.RangeID, raftpb.Entry. ) This serves as the integration point for log entries received below raft right as they're being written to stable storage. It's a non-blocking interface since we're below-raft and in the raft.Ready() loop. What it effectively does is enqueues a "virtual" work item in the underlying StoreWorkQueue mediating all store IO. This virtual work item is what later gets dequeued once the IO granter informs the work queue of newly available IO tokens. When enqueueing the virtual work item, we still update the StoreWorkQueue's physically-accounted-for bytes since the actual write is not deferred, and timely statistic updates improves accuracy for the underlying linear models that map between accounted-for writes and observed L0 growth (using it to inform IO token generation rates). For each of the arguments above: - The roachpb.TenantID is plumbed to find the right tenant heap to queue it under (for inter-tenant isolation). - The roachpb.StoreID to find the right store work queue on multi-store nodes. We'll also use the StoreID when informing the origin node of log entries being admitted[^1]. - We pass in the roachpb.RangeID on behalf of which work is being admitted. This, along side the raftpb.Entry.{Term,Index} for the replicated write uniquely identifies where the write is to end up. We use these identifiers to return flow tokens on the origin node[^1][^2]. - For standard work queue ordering, our work item needs to include the CreateTime and AdmissionPriority, details that are passed down using dedicated raft log entry encodings[^3][^4] as part of the raftpb.Entry parameter above. - Since the raftpb.Entry encodes within it its origin node[^4], it will be used post-admission to dispatch flow tokens to the right node. This integration is left to future PRs. We use the above to populate the following fields on a per-(replicated write)work basis: // ReplicatedWorkInfo groups everything needed to admit replicated // writes, done so asynchronously below-raft as part of replication // admission control. type ReplicatedWorkInfo struct { RangeID roachpb.RangeID Origin roachpb.NodeID LogPosition LogPosition Ingested bool } Since admission is happening below-raft where the size of the write is known, we no longer need per-work estimates for upfront IO token deductions. Since admission is asynchronous, we also don't use the AdmittedWorkDone interface which was to make token adjustments (without blocking) given the upfront estimates. We still want to intercept the exact point when some write work gets admitted in order to inform the origin node so it can release flow tokens. We do so through the following interface satisfied by the StoreWorkQueue: // onAdmittedReplicatedWork is used to intercept the // point-of-admission for replicated writes. type onAdmittedReplicatedWork interface { admittedReplicatedWork( tenantID roachpb.TenantID, pri admissionpb.WorkPriority, rwi ReplicatedWorkInfo, requestedTokens int64, ) } [^1]: See kvflowcontrolpb.AdmittedRaftLogEntries introduced in cockroachdb#95637. [^2]: See kvflowcontrol.Handle.{ReturnTokensUpto,DeductTokensFor} introduced in cockroachdb#95637. Token deductions and returns are tied to raft log positions. [^3]: See raftlog.EntryEncoding{Standard,Sideloaded}WithAC introduced in cockroachdb#95748. [^4]: See kvflowcontrolpb.RaftAdmissionMeta introduced in cockroachdb#95637. message RaftAdmissionMeta { int32 admission_priority = ...; int64 admission_create_time = ...; int32 admission_origin_node = ...; } Release note: None
- Loading branch information
1 parent
c705d32
commit a8747c9
Showing
38 changed files
with
1,861 additions
and
175 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.