cluster: remove a series of 2.1+ cluster versions #38446
Conversation
Force-pushed from 19d3e4b to 0f4bd70
Reviewed 3 of 3 files at r1, 3 of 3 files at r2, 5 of 5 files at r3, 5 of 5 files at r4, 6 of 6 files at r5, 4 of 4 files at r6, 3 of 3 files at r7, 8 of 8 files at r8, 3 of 3 files at r9.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/storage/replica.go, line 1458 at r8 (raw file):
DeprecatedNow:   b.Header.Timestamp,
PushType:        roachpb.PUSH_ABORT,
InclusivePushTo: true,
Ah, this is subtle.
pkg/storage/batcheval/cmd_get.go, line 78 at r5 (raw file):
// active which proves that all nodes are running version 19.2.
//
// TODO(nvanbenschoten): Remove in 20.1. This serves only as documentation
I don't understand why you can't remove this now. I do understand why VersionPushTxnToInclusive needs to stay around, but this?
Force-pushed from 0f4bd70 to c6cf7f1
I had it commented that we could also remove `InclusivePushTo` in 19.2, but I now think that was incorrect. We still need to set it to true to prevent 19.1 nodes that have yet to hear that `VersionPushTxnToInclusive` is active from thinking that the Replica sending the PushTxnRequest is a 2.1 node (which sent exclusive push-to timestamps).
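For anyone following along, here's a minimal Go sketch of the compatibility concern above. The types and the one-tick adjustment are simplified stand-ins for the real roachpb machinery, not the actual CockroachDB code, assuming 2.1 recipients treated the push-to timestamp as exclusive:

```go
package main

import "fmt"

type pushTxnRequest struct {
	pushTo          int64 // simplified timestamp
	inclusivePushTo bool  // 2.1 senders never set this; 19.1+ senders do
}

// effectivePushTo models how a 19.1 node that hasn't heard that
// VersionPushTxnToInclusive is active interprets an incoming push: without
// the flag it assumes a 2.1 sender and compensates for exclusive semantics.
func effectivePushTo(req pushTxnRequest) int64 {
	if req.inclusivePushTo {
		return req.pushTo
	}
	return req.pushTo + 1 // assumed 2.1 compensation; illustrative only
}

func main() {
	// A 19.2 sender that dropped the flag would be pushed one tick too far:
	fmt.Println(effectivePushTo(pushTxnRequest{pushTo: 10})) // 11
	// Keeping InclusivePushTo set preserves the intended timestamp:
	fmt.Println(effectivePushTo(pushTxnRequest{pushTo: 10, inclusivePushTo: true})) // 10
}
```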
Force-pushed from c6cf7f1 to 7400cc4
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @tbg)
pkg/storage/replica.go, line 1458 at r8 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Ah, this is subtle.
Yeah, it's unfortunate.
pkg/storage/batcheval/cmd_get.go, line 78 at r5 (raw file):
Previously, tbg (Tobias Grieger) wrote…
I don't understand why you can't remove this now. I do understand why VersionPushTxnToInclusive needs to stay around, but this?
I'm leaving this around as documentation to explain that because we checked a cluster version to determine whether we should ignore sequence numbers in 19.1, 19.2 nodes can't blindly expect all nodes in their cluster to respect sequence numbers when reading. It's possible that a 19.2 node joins the cluster before all 19.1 nodes realize that the cluster version has been upgraded to 19.1.X. I reworded the comment slightly.
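To make the migration hazard concrete, here is a hedged sketch of the gating pattern under discussion; `clusterSettings`, `isActive`, and the option struct are simplified stand-ins for the real settings machinery, not CockroachDB's actual API:

```go
package main

import "fmt"

type clusterSettings struct{ active map[string]bool }

func (cs clusterSettings) isActive(version string) bool {
	return cs.active[version]
}

// mvccGetOptions mirrors the idea behind the IgnoreSequence option: until a
// node has heard that VersionSequencedReads is active, its reads ignore
// sequence numbers.
type mvccGetOptions struct{ ignoreSequence bool }

func optionsForGet(cs clusterSettings) mvccGetOptions {
	// The subtle part: a 19.2 node can join before a 19.1 peer hears the
	// version bump, so two replicas may evaluate the same Get with
	// ignoreSequence set differently during the migration window.
	return mvccGetOptions{ignoreSequence: !cs.isActive("VersionSequencedReads")}
}

func main() {
	stale := clusterSettings{active: map[string]bool{}}
	upgraded := clusterSettings{active: map[string]bool{"VersionSequencedReads": true}}
	fmt.Println(optionsForGet(stale))    // {true}
	fmt.Println(optionsForGet(upgraded)) // {false}
}
```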
Reviewed 13 of 13 files at r10, 4 of 4 files at r11, 3 of 3 files at r12, 8 of 8 files at r13, 3 of 3 files at r14.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten)
pkg/storage/batcheval/cmd_get.go, line 78 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I'm leaving this around as documentation to explain that because we checked a cluster version to determine whether we should ignore sequence numbers in 19.1, 19.2 nodes can't blindly expect all nodes in their cluster to respect sequence numbers when reading. It's possible that a 19.2 node joins the cluster before all 19.1 nodes realize that the cluster version has been upgraded to 19.1.X. I reworded the comment slightly.
Feels like I'm missing something still. What confuses me is that I don't understand who you are documenting this for. After removing this and by extension IgnoreSequence, your code works the same; it's using the new behavior now (right?). From your comment, my best guess now is that this documentation is just in an awkward place. In this code, it looks very much like "19.2 nodes are blindly relying on correct sequence number behavior". According to your comment, there should be places where they have to be more careful. Where is that, and why is the comment here and not there?
(It doesn't help that MVCCGetOptions.IgnoreIntent doesn't have any documentation; not sure that it's worth fixing now that it's basically obsolete.)
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @knz)
pkg/storage/batcheval/cmd_get.go, line 78 at r5 (raw file):
After removing this and by extension IgnoreSequence, your code works the same; it's using the new behavior now (right?)
Yes, this is the same as deleting all of the code. The plan was to do that in 20.1.
Where is that and why is the comment here and not there?
I'm leaving this primarily for myself and @knz because we're going to need to be careful about this when fixing the Halloween problem. The point of the comment is to say that until 20.1, we need to be careful that not all nodes in the cluster will always consider sequence numbers when performing a scan, even though the codebase that we'll be programming against will. Leaving the code in until it is no longer relevant is a way to ensure that we don't completely forget about this migration period.
I don't see where a better place to put it would be. Moving it to Transaction.Sequence is likely to get lost. Leaving the code like this ensures that readers can understand what to expect from replicas before and after the old cluster setting is active.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
bors r=tbg
38446: cluster: remove a series of 2.1+ cluster versions r=tbg a=nvanbenschoten

I wanted to remove `VersionLazyTxnRecord` and `VersionPushTxnToInclusive`, but while there I removed all of the following:

- VersionCascadingZoneConfigs
- VersionLoadSplits
- VersionExportStorageWorkload
- VersionLazyTxnRecord
- VersionSequencedReads
- VersionCreateStats
- VersionDirectImport
- VersionPushTxnToInclusive

Co-authored-by: Nathan VanBenschoten <[email protected]>
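For context, a hedged sketch of what removing one of these gates means at a call site; the names are illustrative stand-ins (with `VersionLoadSplits` as the example), not the real CockroachDB code:

```go
package main

import "fmt"

type settings struct{ active map[string]bool }

func (s settings) isActive(v string) bool { return s.active[v] }

// Before removal: the new behavior is gated until the whole cluster has
// upgraded past the version that introduced it.
func splitDecisionBefore(s settings) string {
	if s.isActive("VersionLoadSplits") {
		return "split on load"
	}
	return "split on size only"
}

// After removal: every supported binary understands load-based splitting, so
// the gate and the VersionLoadSplits constant can be deleted outright.
func splitDecisionAfter() string { return "split on load" }

func main() {
	fmt.Println(splitDecisionBefore(settings{active: map[string]bool{}})) // split on size only
	fmt.Println(splitDecisionAfter())                                     // split on load
}
```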
Build succeeded
43227: roachpb: remove PushTxnRequest.InclusivePushTo r=nvanbenschoten a=nvanbenschoten

This completes the migration started in #35297, which was partially finalized in #38446.

Release note: None

43316: storagebase: take RangeDesc by reference in ContainsKey(Range) r=nvanbenschoten a=nvanbenschoten

All callers of this had a *RangeDescriptor, so there's no reason to pass in a large RangeDescriptor struct.

Release note: None

43563: storage/intentresolver: don't capture loop iteration vars in async task r=nvanbenschoten a=nvanbenschoten

It's unclear if we've ever seen issues from this, but I intend to backport the fix to v19.2, v19.1, and v2.1. I believe the worst thing that could have happened is that a batch that observed multiple intents or pushed multiple txns would only end up cleaning up a single one of these. It would then run into some of these intents again when it tried to re-evaluate, forcing it to push again. This subverts the parallelism that we were trying to achieve here, but would never cause a stall.

Release note (bug fix): Ensure that all intents or transactions that a batch observes are asynchronously cleaned up.

Co-authored-by: Nathan VanBenschoten <[email protected]>
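The #43563 fix is the classic Go loop-variable capture pitfall. A minimal sketch of the pattern and its fix (illustrative only, not the actual intentresolver code; before Go 1.22, `range` reused one variable across iterations):

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	intents := []string{"a", "b", "c"}
	var wg sync.WaitGroup

	for _, intent := range intents {
		// Buggy version: launching go func() { cleanup(intent) }() directly
		// could observe only the final element, since every closure shared
		// the same loop variable.
		intent := intent // fix: rebind per iteration so each task gets its own copy
		wg.Add(1)
		go func() {
			defer wg.Done()
			fmt.Println("cleaning up intent", intent)
		}()
	}
	wg.Wait()
}
```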