kvserver: refactor replicate queue to enable allocator code reuse #94114
d45aa72
to
fdce47e
Compare
pkg/kv/kvserver/replicate_queue.go
// In case of an add action, no replicas are removed and -1 is returned, and if no
// candidates for replacement can be found during a replace action, the returned
// nothingToDo flag will be set to true.
// TODO(sarkesian): If possible, move this logic into the allocator.
We should certainly try and move this into the `allocatorimpl` pkg.
fdce47e
to
c62636b
Compare
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @kvoli)
pkg/kv/kvserver/replicate_queue.go
line 1178 at r1 (raw file):
Previously, kvoli (Austen) wrote…
We should certainly try and move this into the `allocatorimpl` pkg.
Done.
Also moved the "avoid fragile quorum" check into the allocator as well.
Looks good. There is one test failing that I would look into:
TestPromoteNonVoterInAddVoter
There is fragile logic which decides to promote a non-voter on removal.
Reviewed 1 of 2 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks)
pkg/kv/kvserver/replicate_queue.go
line 1303 at r2 (raw file):
) (op AllocationOp, _ error) {
	effects := effectBuilder{}
	conf := repl.SpanConfig()
This should be passed in to be consistent with the other commit that makes desc/conf consistent.
f6ceb6a
to
f35e4fc
Compare
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @kvoli)
pkg/kv/kvserver/replicate_queue.go
line 1303 at r2 (raw file):
Previously, kvoli (Austen) wrote…
This should be passed in to be consistent with the other commit that makes desc/conf consistent.
Some work will be required to rebase that change on top of all the other changes to the replicate queue/allocator, so I'll wait until these are merged to do that.
Reviewed 2 of 2 files at r5, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @AlexTalks)
pkg/kv/kvserver/replicate_queue.go
line 1178 at r1 (raw file):
Previously, AlexTalks (Alex Sarkesian) wrote…
Done.
Also moved the "avoid fragile quorum" check into the allocator as well.
Great!
pkg/kv/kvserver/replicate_queue.go
line 1303 at r2 (raw file):
Previously, AlexTalks (Alex Sarkesian) wrote…
Some work will be required to rebase that change on top of all the other changes to the replicate queue/allocator, so I'll wait until these are merged to do that.
ack
pkg/kv/kvserver/replicate_queue.go
line 1038 at r5 (raw file):
action, voterReplicas, nonVoterReplicas, liveVoterReplicas, deadVoterReplicas,
Just curious - was this the source of the test failures previously?
pkg/kv/kvserver/allocator/allocatorimpl/allocator.go
line 136 at r5 (raw file):
)

func (a AllocatorAction) Add() bool {
nit: does the linter enforce export function commenting? These are pretty self explanatory, just wondering why that got removed.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @kvoli)
pkg/kv/kvserver/replicate_queue.go
line 1303 at r2 (raw file):
Previously, kvoli (Austen) wrote…
ack
Done.
pkg/kv/kvserver/replicate_queue.go
line 1038 at r5 (raw file):
Previously, kvoli (Austen) wrote…
Just curious - was this the source of the test failures previously?
One of them, unfortunately.
pkg/kv/kvserver/allocator/allocatorimpl/allocator.go
line 136 at r5 (raw file):
Previously, kvoli (Austen) wrote…
nit: does the linter enforce export function commenting? These are pretty self explanatory, just wondering why that got removed.
I don't think so, can add if needed.
This change refactors parts of the replicate queue's `PlanOneChange(..)` and `addOrRemove{Non}Voters(..)` functions to reusable helper functions that simplify usage of the allocator and deduplicate repeated code paths. The change also adds convenience methods to the `AllocatorAction` enum, to move certain determinations (such as if a computed allocator action is a remove or a replace) closer to the allocator type it is based on. These changes move more of the logic needed to use the allocator into the `allocatorimpl` package itself, enabling usage of the allocator outside of the replicate queue. Part of cockroachdb#91571. Release note: None
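The convenience methods described above can be pictured with a minimal sketch. This is not the code from this PR: only the `AllocatorAction` type and its `Add()` method appear in the review thread, while the constant names and the `Remove()` and `Replace()` methods below are assumptions made for illustration.

```go
package main

import "fmt"

// AllocatorAction is a simplified, hypothetical stand-in for the enum
// discussed in this PR. The constant names are illustrative only.
type AllocatorAction int

const (
	NoopAction AllocatorAction = iota
	AddVoterAction
	AddNonVoterAction
	RemoveVoterAction
	RemoveNonVoterAction
	ReplaceDeadVoterAction
	ReplaceDeadNonVoterAction
)

// Add reports whether the action adds a replica without removing one.
func (a AllocatorAction) Add() bool {
	return a == AddVoterAction || a == AddNonVoterAction
}

// Remove reports whether the action removes a replica without adding one.
// (Assumed method name.)
func (a AllocatorAction) Remove() bool {
	return a == RemoveVoterAction || a == RemoveNonVoterAction
}

// Replace reports whether the action swaps one replica for another.
// (Assumed method name.)
func (a AllocatorAction) Replace() bool {
	return a == ReplaceDeadVoterAction || a == ReplaceDeadNonVoterAction
}

func main() {
	actions := []AllocatorAction{NoopAction, AddVoterAction, RemoveNonVoterAction, ReplaceDeadVoterAction}
	for _, a := range actions {
		fmt.Printf("action=%d add=%t remove=%t replace=%t\n", a, a.Add(), a.Remove(), a.Replace())
	}
}
```

Keeping these predicates on the enum means callers outside the replicate queue do not need their own switch statements over action values.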
While previously we checked that we can avoid a fragile quorum during the addition of new voters in the replicate queue, this change moves the check logic into the allocator code itself, allowing it to be reused by other users of the allocator. This enables us to perform this check when evaluating decommission viability, or anywhere else that uses the allocator for new voter replicas. Part of cockroachdb#91570. Release note: None
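As a rough illustration of what a "fragile quorum" check might look like, here is a sketch under stated assumptions. It assumes the check rejects adding a voter when the resulting voter count would be even and still below the desired count, unless at least one further candidate store exists to continue up-replicating. The function name, parameters, and this exact rule are hypothetical and are not taken from the PR.

```go
package main

import "fmt"

// quorum returns the number of voters required for a majority of n.
func quorum(n int) int { return n/2 + 1 }

// avoidsFragileQuorum is a hypothetical helper. It reports whether adding one
// voter is safe, given the current voter count, the desired voter count, and
// the number of candidate stores remaining after the one about to be used.
func avoidsFragileQuorum(currentVoters, neededVoters, remainingCandidates int) bool {
	newCount := currentVoters + 1
	// An odd count, or a count that already meets the target, is not flagged.
	if newCount%2 != 0 || newCount >= neededVoters {
		return true
	}
	// Even and still short of the target: require a further candidate so the
	// range does not get stuck at an even voter count.
	return remainingCandidates > 0
}

func main() {
	// With 2 voters the quorum is 2, so no single failure can be tolerated,
	// which is no better than 1 voter.
	for n := 1; n <= 3; n++ {
		fmt.Printf("voters=%d quorum=%d tolerated failures=%d\n", n, quorum(n), n-quorum(n))
	}
	fmt.Println(avoidsFragileQuorum(1, 3, 0))
	fmt.Println(avoidsFragileQuorum(1, 3, 1))
}
```

Expressing the check as a pure function of counts is one way to make it callable from places other than the replicate queue, such as the decommission viability evaluation mentioned above.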
f35e4fc
to
91c2a6b
Compare
bors r+
Build failed (retrying...):
Build succeeded:
This change exposes support via a store for checking the allocator action and upreplication target (if applicable) for any range descriptor. The range does not need to have a replica on the given store, nor is it required to evaluate given the current state of the cluster (i.e. the store's configured StorePool), as a store pool override can be provided in order to evaluate possible future states. Depends on cockroachdb#94114. Part of cockroachdb#91570. Release note: None
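The store pool override idea can be sketched as follows. Everything here is hypothetical: the `StoreLiveness` interface, the `mapPool` type, the `checkRange` function, and the string results are illustrative stand-ins and do not reflect the actual `StorePool` or store APIs.

```go
package main

import "fmt"

// StoreLiveness is a hypothetical, minimal view of a store pool: it only
// answers whether a given store is considered usable.
type StoreLiveness interface {
	IsLive(storeID int) bool
}

// mapPool is a trivial StoreLiveness backed by a map.
type mapPool map[int]bool

func (p mapPool) IsLive(id int) bool { return p[id] }

// checkRange returns a coarse action for a range with the given voter stores.
// If override is non-nil it is consulted instead of pool, which allows
// evaluating a possible future cluster state.
func checkRange(voters []int, needed int, pool, override StoreLiveness) string {
	if override != nil {
		pool = override
	}
	live := 0
	for _, id := range voters {
		if pool.IsLive(id) {
			live++
		}
	}
	switch {
	case live < len(voters):
		return "replace"
	case live < needed:
		return "add"
	default:
		return "noop"
	}
}

func main() {
	current := mapPool{1: true, 2: true, 3: true}
	// Hypothetical future state in which store 3 is no longer usable.
	future := mapPool{1: true, 2: true, 3: false}

	voters := []int{1, 2, 3}
	fmt.Println(checkRange(voters, 3, current, nil))
	fmt.Println(checkRange(voters, 3, current, future))
}
```

Passing the override as a separate argument leaves the store's configured pool untouched, so the same call can be repeated against several candidate future states.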
94024: kvserver: support checking allocator action and target by range r=AlexTalks a=AlexTalks This change exposes support via a store for checking the allocator action and upreplication target (if applicable) for any range descriptor. The range does not need to have a replica on the given store, nor is it required to evaluate given the current state of the cluster (i.e. the store's configured StorePool), as a store pool override can be provided in order to evaluate possible future states. Depends on #94114. Part of #91570. Release note: None Co-authored-by: Alex Sarkesian <[email protected]>
This change refactors parts of the replicate queue's `PlanOneChange(..)` and `addOrRemove{Non}Voters(..)` functions into reusable helper functions that simplify usage of the allocator and deduplicate repeated code paths. The change also adds convenience methods to the `AllocatorAction` enum, to move certain determinations (such as whether a computed allocator action is a remove or a replace) closer to the allocator type they are based on. These changes make it easier to use the allocator outside of the replicate queue, and allow some of this logic to move into the allocator itself in the future.
Depends on #91941.
Epic: none
Release note: None