kvserver: replicate and split queue interactions can cause degraded availability #107520

Open

tbg opened this issue on Jul 25, 2023 · 1 comment

Labels: C-bug (Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.), T-kv (KV Team)

tbg commented Jul 25, 2023

Describe the problem

See the detailed analysis in #104588.

When many (say, size-based) splits happen alongside replica movement, we can end up in a situation where many replicas need a raft snapshot, and those snapshots trickle in very slowly, essentially due to "keyspace contention" between "stale" replicas (which haven't caught up across all splits) and "new" snapshots (which reflect all splits).

Fundamentally, this is because splitting a range for which a follower needs a snapshot results in two ranges for which a follower needs a snapshot, but the snapshot for the right-hand side can only go through once the snapshot for the left-hand side has been applied. This interdependence between snapshots is not visible to the raft snapshot queue, so the backlog is processed very slowly, especially once it has ballooned to hundreds of pending snapshots, which can easily happen with enough splits. Additionally, if the snapshots involved are large, there are further pathologies, such as the lease changing hands while snapshots are still in flight[1], resulting in wasted work.
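
To make the dependency concrete, here is a minimal, purely illustrative Go model (not actual kvserver code; the `pendingSnap` type and the scan order are invented for the example): a chain of post-split ranges whose followers all need snapshots, processed by a queue that is blind to the left-to-right dependency. In the worst case the queue needs one full pass per chain link, so the backlog drains in O(n) passes rather than one.

```go
// Illustrative model only: why a backlog of post-split raft snapshots drains
// slowly. Each right-hand side produced by a split can only apply its snapshot
// once the left-hand side covering the adjacent keyspace has caught up, so the
// backlog forms a dependency chain.
package main

import "fmt"

// pendingSnap models a replica that needs a raft snapshot. dependsOn points at
// the left-hand neighbor that must apply its snapshot first.
type pendingSnap struct {
	rangeID   int
	dependsOn *pendingSnap
	applied   bool
}

func main() {
	const splits = 8

	// Splitting a range whose follower needs a snapshot yields a chain of
	// ranges that all need snapshots, each gated on its left-hand neighbor.
	var chain []*pendingSnap
	var prev *pendingSnap
	for i := 0; i < splits; i++ {
		s := &pendingSnap{rangeID: i + 1, dependsOn: prev}
		chain = append(chain, s)
		prev = s
	}

	passes := 0
	for remaining := len(chain); remaining > 0; {
		passes++
		// The queue scans its backlog with no notion of keyspace adjacency;
		// here it happens to visit the newest (right-most) ranges first, so
		// each pass only makes progress on the one snapshot whose left-hand
		// neighbor has already caught up.
		for i := len(chain) - 1; i >= 0; i-- {
			s := chain[i]
			if !s.applied && (s.dependsOn == nil || s.dependsOn.applied) {
				s.applied = true
				remaining--
			}
		}
	}
	fmt.Printf("%d pending snapshots drained in %d queue passes\n", splits, passes)
}
```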

To Reproduce

There is no known way to reliably trigger this. These kinds of issues have kept us busy for a long time[2]; usually, stressing a test that suitably combines rebalancing and splits, while verifying that no raft snapshots occur, is enough to surface these interactions (one way such a check could look is sketched below).
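
For illustration, a "no raft snaps" check of the kind mentioned above could look roughly like the sketch below: after running the splits-plus-rebalancing workload, scrape each node's Prometheus endpoint (/_status/vars) and fail if the raft snapshot queue reports any successfully processed snapshots. The metric name used here is an assumption for the sake of the example and must be checked against the metric catalog of the version under test.

```go
// Rough sketch of a "no raft snapshots occurred" assertion for a test harness.
package main

import (
	"bufio"
	"fmt"
	"log"
	"net/http"
	"strconv"
	"strings"
)

// Assumed name of the counter for snapshots processed by the raft snapshot
// queue; verify against the actual metric names before relying on this.
const raftSnapMetric = "queue_raftsnapshot_process_success"

// raftSnapCount scrapes a node's Prometheus endpoint and returns the value of
// raftSnapMetric, or 0 if the metric is not present.
func raftSnapCount(nodeHTTPAddr string) (float64, error) {
	resp, err := http.Get(fmt.Sprintf("http://%s/_status/vars", nodeHTTPAddr))
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()

	sc := bufio.NewScanner(resp.Body)
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, raftSnapMetric) {
			fields := strings.Fields(line)
			return strconv.ParseFloat(fields[len(fields)-1], 64)
		}
	}
	return 0, sc.Err()
}

func main() {
	// Run the splits+rebalancing workload first, then assert on each node.
	for _, addr := range []string{"localhost:8080"} {
		n, err := raftSnapCount(addr)
		if err != nil {
			log.Fatal(err)
		}
		if n > 0 {
			log.Fatalf("node %s: %v raft snapshots observed; expected none", addr, n)
		}
	}
	fmt.Println("no raft snapshots observed")
}
```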

Expected behavior

The goal should be that the only raft snapshots we ever see are caused by log truncation (in which case we may ask whether the log truncation heuristics could be improved, but that is outside the scope of this issue).

Jira issue: CRDB-30091

Epic: CRDB-39952

Footnotes

[1] This happens a few times in #104588 (roachtest: splits/largerange/size=32GiB,nodes=6 failed [raft snaps; needs #106813]), but I'm not sure why.

[2] See https://cockroachlabs.atlassian.net/wiki/spaces/CORE/pages/64749670/Raft+Snapshots+and+why+you+see+them+when+you+oughtn+t (internal).

tbg added the C-bug and T-kv-replication labels on Jul 25, 2023

blathers-crl bot commented Jul 25, 2023

cc @cockroachdb/replication
