Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: tighten and de-fatal replicaGC checks #102248

Merged
merged 1 commit into from
Jul 5, 2023

Conversation

tbg
Copy link
Member

@tbg tbg commented Apr 25, 2023

Now we return most failed checks as errors and report errors to sentry (but
without crashing).

Closes #94813.

Epic: None
Release note: None

@blathers-crl
Copy link

blathers-crl bot commented Apr 25, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@tbg tbg requested a review from pav-kv April 25, 2023 15:08
@tbg tbg marked this pull request as ready for review July 3, 2023 11:37
@tbg tbg requested a review from a team as a code owner July 3, 2023 11:37
@tbg tbg requested review from erikgrinaker and removed request for pav-kv July 3, 2023 12:57
Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

pkg/kv/kvserver/replica_gc_queue.go Outdated Show resolved Hide resolved
Now we return most failed checks as errors and report errors to sentry (but
without crashing). `TestStoreRemoveReplicaDestroy` was updated with somewhat
better coverage of these error conditions, which is now

I also slightly rearranged the checks; it makes sense to me to check that the
store doesn't have a different Replica, rather than using a potentially old
Replica and failing the nextReplicaID check.

    old order                                     new order
    - assert replica is inited                    - assert replica is inited
    - assert opts are compatible with each other  - assert opts are compatible with each other
    - maybe return early                          - maybe return early
    - assert destroy status                       - assert destroy status
    - assert replica id < nextReplicaID ───┐  ┌─► - *assert same replica in store*
    - assert replica is inited (again)     └──┼─► - *assert replica id < nextReplicaID*
    - set destroy status                      │   - set destroy status
    - assert same replica in store────────────┘   - remainder
    - remainder

Closes cockroachdb#94813.

Epic: None
Release note: None
@tbg tbg force-pushed the replica-gc-checks branch from e4ed406 to 267099d Compare July 5, 2023 08:42
@tbg
Copy link
Member Author

tbg commented Jul 5, 2023

TFTR!

bors r=erikgrinaker

tbg added a commit to tbg/cockroach that referenced this pull request Jul 5, 2023
Noticed in cockroachdb#102248.

Epic: none
Release note: None
@craig
Copy link
Contributor

craig bot commented Jul 5, 2023

Build succeeded:

@craig craig bot merged commit 31e7a29 into cockroachdb:master Jul 5, 2023
craig bot pushed a commit that referenced this pull request Jul 5, 2023
106103: kvserver: fix off-by-one in FlushLockedWithRaftGroup r=erikgrinaker a=tbg

We need to skip the conf change itself. The result of this bug was that
proposals and entries wouldn't be lined up correctly if there was a conf change
in the batch.

Luckily, the impact of that is limited, since all the proposals are used for in
`proposeBatch` was to log an event to the context. Still, good to fix and improve
test coverage.

Epic: CRDB-25287
Release note: None

106141: logcrash: log Error, not Warning r=knz,erikgrinaker a=tbg

Noticed in #102248.

Epic: none
Release note: None


Co-authored-by: Tobias Grieger <[email protected]>
blathers-crl bot pushed a commit that referenced this pull request Jul 5, 2023
Noticed in #102248.

Epic: none
Release note: None
blathers-crl bot pushed a commit that referenced this pull request Jul 5, 2023
Noticed in #102248.

Epic: none
Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kvserver: v22.1.9: replica descriptor's ID has changed
3 participants