Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backport-2.1: one week of merge PRs #28994

Conversation

benesch
Copy link
Contributor

@benesch benesch commented Aug 23, 2018

Backport:

Please see individual PRs for details.

/cc @cockroachdb/release


benesch added 17 commits August 22, 2018 23:42
This TODO suggested removing maybeWatchForMerge by teaching the LHS
replica to reach into the RHS replica after a merge committed to unblock
requests. It failed to consider that we'd need to do the same if the
merge aborted. We don't presently have an abort trigger, so this is
excessively difficult. Simply discard the TODO; the status quo is fine.

Release note: None
Store.SplitRange was still using the old term "range" to refer to
replicas. Also use "left" and "right" instead of "orig" and "new" for
consistency with Store.MergeRange.

Release note: None
It is important that the in-memory copy of a replica's range descriptor
exactly match the on-disk copy. Previously, during splits/merges, we
would reconstruct the updates to the in-memory descriptor piecemeal.
This is dangerous, especially in mixed version clusters, where nodes can
disagree about what updates to the descriptor to perform. Notably,
splits only update the generation counter in v2.1.

Instead of trying to reconstruct the updates piecemeal during a split
or merge, simply adopt the updated descriptor from the split/merge
trigger wholesale.

We'd ideally adjust replica changes to operate the same way.
Unfortunately the updated descriptor is not currently included in the
ChangeReplicasTrigger, so the migration is rather involved.

Release note: None
Merges will explode if used in a mixed-version cluster. Gate them behind
a cluster setting.

For extra safety, also gate increments of the new generation counter in
the range descriptor behind the same cluster setting. It's not clear
exactly what would go wrong, if anything, if mixed-version clusters
incremented the generation counter, but better to be extra cautious.

Release note: None
When applying a merge, teach the leaseholder of the LHS range to update
its timestamp cache for the keyspace previously owned by the RHS range
appropriately.

Release note: None
TestStoreRangeMergeDuringShutdown shuts down the multiTestContext when
it applies a lease for the RHS. In rare cases, the lease application can
get replayed, which previously caused the multiTestContext to get
shutdown twice, which panics. Add additional state to prevent this case.

Fix cockroachdb#28894.

Release note: None
GetSnapshotForMerge no longer fetches a snapshot of the RHS of a merge.
Rename the method Subsume to better reflect its purpose: to freeze a
range before it is subsumed by its left-hand neighbor.

Release note: None
The rules for generating the settings HTML table got broken at some
point. Fix them by using target-specific variables properly: they only
apply to prerequisites of the declaring target, and are only resolved
within recipes.

Release note: None
TestStoreRangeMergeTimestampCacheCausality could time out if the merge
transaction retried, which occured about one out of every two hundred
runs, because it would try to send multiple values over a channel whose
buffer had room for only one value. Deflake the test by remembering in a
variable the value from the last merge transaction to execute rather
than using channels.

Release note: None
During a merge, it is possible for the RHS lease to change hands, e.g.,
when the original leaseholder dies and another member of the range
acquires the lease. In this case, the new leaseholder is responsible for
checking for a deletion intent on its local range descriptor; if it
discovers such an intent, a merge is in progress and the leaseholder is
required to block all traffic unless it can prove that the merge
aborted.

The previous implementation of this scheme had a small window in which
the new leaseholder had installed a valid lease but had not yet
installed a mergeComplete channel to block all traffic. This race was
never seen in practice, but it could, in theory, lead to a
serializability violation.

Reorder the flow post-lease acquisition so that checking for an
in-progress merge occurs before the new lease is installed.

Release note: None
Now that merges do not include a snapshot of the RHS data in the merge
trigger, we no longer need a setting limiting the size of the RHS of a
merge.

Release note: None
Merges are relatively expensive. Set the merge queue interval to one
second so we avoid processing too many merges at once. Introduce a
cluster setting to allow users/tests to adjust the merge queue interval
if they so choose.

Fix cockroachdb#27769.

Release note: None
The retry loop in AdminSplit can span many seconds. In that time, the
replica may lose its lease, or the range might be merged away entirely.
In either of those cases, the split can never succeed, and so the retry
loop needs to give up.

The loop was properly exiting if it noticed it lost its lease, but a
range can get merged away without losing its lease. The final lease on
that range remains valid until the liveness epoch it is tied to expires.
Teach the loop to notice that condition too by checking
Replica.IsDestroyed on every turn of the loop.

Release note: None
Teach TestSystemZoneConfigs to install zone configs via SQL, rather than
the hacky testing override system, which interacts poorly with the
forthcoming on-by-default merge queue.

Release note: None
Guarantee session consistency for SET CLUSTER SETTING. That is, a
session that executes a SET CLUSTER SETTING in a transaction is
guaranteed to use that new value of the cluster setting after the
transaction commits. (Unless, of course, there are concurrent updates to
the setting.)

Release note: None
Turn off the merge queue in all tests that need it. The actual default
will be changed in a separate PR so that this commit can be safely
backported to release-2.1.

Release note: None
Splitting while the merge queue is enabled is almost certainly a user
mistake. Add a best-effort check to prevent users from splitting while
the merge queue is enabled. Users can override the check and request a
split anyway by twiddling a new session variable,
experimental_force_split_at.

We have plans to eventually make the splits created by SPLIT AT
"sticky", so that the merge queue does not immediately merge them away,
but not in time for 2.1.

Release note: None
@benesch benesch requested review from tbg, nvanbenschoten and a team August 23, 2018 03:46
@benesch benesch requested a review from a team as a code owner August 23, 2018 03:46
@benesch benesch requested review from a team August 23, 2018 03:46
@benesch benesch requested a review from a team as a code owner August 23, 2018 03:46
@benesch benesch requested review from a team August 23, 2018 03:46
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@benesch
Copy link
Contributor Author

benesch commented Aug 27, 2018

Superseded by #29082.

@benesch benesch closed this Aug 27, 2018
@benesch benesch deleted the backport2.1-28885-28865-28793-28902-28910-28884-28928-28894-28961 branch August 27, 2018 03:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants