Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server: implement decommission pre-check api #93950

Merged
merged 1 commit into from
Jan 28, 2023

Conversation

AlexTalks
Copy link
Contributor

This change implements the DecommissionPreCheck RPC on the Admin
service, using the support for evaluating node decommission readiness by
checking each range introduced in #93758. In checking node decommission
readiness, only nodes that have a valid, non-DECOMMISSIONED liveness
status are checked, and ranges with replicas on the checked nodes that
encounter errors in attempting to allocate replacement replicas are
reported in the response. Ranges that have replicas on multiple checked
nodes have their errors reported for each nodeID in the request list.

Depends on #93758, #90222.

Epic: CRDB-20924

Release note: None

@AlexTalks AlexTalks requested review from a team as code owners December 19, 2022 23:11
@AlexTalks AlexTalks requested a review from a team December 19, 2022 23:11
@AlexTalks AlexTalks requested a review from a team as a code owner December 19, 2022 23:11
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Collaborator

@dhartunian dhartunian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just looked at final commit. feel free to squash with #90222 for easier merge.

Reviewed 2 of 2 files at r1, 2 of 2 files at r2, 5 of 5 files at r3, 2 of 2 files at r4, 4 of 4 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks)


pkg/server/admin.go line 2661 at r6 (raw file):

// DecommissionPreCheck runs checks and returns the DecommissionPreCheckResponse
// for the given nodes.
func (s *systemAdminServer) DecommissionPreCheck(

Can you add a test?


pkg/server/admin.go line 2663 at r6 (raw file):

func (s *systemAdminServer) DecommissionPreCheck(
	ctx context.Context, req *serverpb.DecommissionPreCheckRequest,
) (*serverpb.DecommissionPreCheckResponse, error) {

add ctx = s.AnnotateCtx(ctx)

do we need any permission checks here? I notice decom operation itself doesn't seem to have any...so perhaps it's left as a separate issue.

@AlexTalks AlexTalks force-pushed the dpf_impl_api branch 2 times, most recently from fcabdc2 to 60acd56 Compare January 7, 2023 10:30
Copy link
Collaborator

@kvoli kvoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the last two commits.

It would be good to add a couple tests for the main RPC

Reviewed 15 of 15 files at r7, 4 of 5 files at r8, 6 of 6 files at r9, 1 of 3 files at r10.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks and @dhartunian)


pkg/server/admin.go line 2639 at r10 (raw file):

// DecommissionPreCheck runs checks and returns the DecommissionPreCheckResponse
// for the given nodes.
func (s *systemAdminServer) DecommissionPreCheck(

Are you waiting to add a client test for this fn? It would be good to add something in this commit.

@AlexTalks AlexTalks force-pushed the dpf_impl_api branch 2 times, most recently from afdc2e9 to f44f661 Compare January 21, 2023 00:44
@blathers-crl

This comment was marked as outdated.

Copy link
Collaborator

@kvoli kvoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 16 of 16 files at r11, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @AlexTalks)

This change implements the `DecommissionPreCheck` RPC on the `Admin`
service, using the support for evaluating node decommission readiness by
checking each range introduced in cockroachdb#93758. In checking node decommission
readiness, only nodes that have a valid, non-`DECOMMISSIONED` liveness
status are checked, and ranges with replicas on the checked nodes that
encounter errors in attempting to allocate replacement replicas are
reported in the response. Ranges that have replicas on multiple checked
nodes have their errors reported for each nodeID in the request list.

Depends on cockroachdb#93758, cockroachdb#90222.

Epic: CRDB-20924

Release note: None
@AlexTalks
Copy link
Contributor Author

bors r+

@craig
Copy link
Contributor

craig bot commented Jan 28, 2023

Build succeeded:

@craig craig bot merged commit 78fe59d into cockroachdb:master Jan 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants