-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage: ensure that PushTxn request's timestamp cache updates are safe #35297
storage: ensure that PushTxn request's timestamp cache updates are safe #35297
Conversation
c4ad28d
to
ea078dd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very nice. I share your opinion that the reasoning in #1102 does not apply any more.
Reviewed 13 of 13 files at r1, 10 of 10 files at r2, 2 of 2 files at r3.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/storage/replica_test.go, line 3282 at r2 (raw file):
pushee := newTransaction("pushee", key, 1, tc.Clock()) pushReq := pushTxnArgs(pusher, pushee, roachpb.PUSH_ABORT) // pushReq.Now = tc.Clock().Now()
Remove this line.
Before this change, pushers specified the timestamp before which they would like to push a transaction (i.e. they passed their own timestamp). This change reworks this so that pushers pass the exact timestamp they would like to push the transaction to. There are a few reasons to do this: - it avoids confusion and makes the contract clearer - it makes PushTxnRequest more general - it makes it more clear how PushTo should relate to Now This third reason is the most important. It will be exploited in the next commit. Release note: None
This commit deprecates PushTxnRequest.Now and gives its responsibility to the batch header timestamp. The biggest reason to do this is so that PushTxn requests properly update their receiver's clock. This is critical because a PushTxn request can result in a timestamp cache entry to be created with a value up to this time, so for safety, we need to ensure that the leaseholder updates its clock at least to this time _before_ evaluating the request. Otherwise, a lease transfer could miss the request's effect on the timestamp cache and result in a lost push/abort. The comment on PushTxnRequest.Now mentioned that the header timestamp couldn't be used because the header's timestamp "does not necessarily advance with the node clock across retries and hence cannot detect abandoned transactions." This dates back all the way to cockroachdb#1102. I haven't been able to piece together what kind of retries this is referring to, but I'm almost positive that they don't still apply. Release note: None
This new assertion would fire if bug like the one hypothesized in the following comment exist: cockroachdb#22315 (comment) Release note: None
ea078dd
to
3164bf0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @tbg)
pkg/storage/replica_test.go, line 3282 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Remove this line.
Done.
35297: storage: ensure that PushTxn request's timestamp cache updates are safe r=nvanbenschoten a=nvanbenschoten This PR's overall goal is to ensure that the timestamp cache updates performed by `PushTxn` requests are always safe. It does this in a series of steps: 1. it makes its `PushTo` argument inclusive, making it easier to use and easier to assert against (e.g. `req.PushTo <= req.Timestamp`). 2. it removes the `Now` argument and begins using the batch's `Timestamp` field instead. This field is used to update the receiving store's clock _before_ the request is evaluated and before the lease transfer check is performed. 3. it adds an assertion that all timestamp cache updates are safe given their local clock. This will also catch hypothesized bugs like #22315 (comment). I'm planning on getting this in to 19.1. Co-authored-by: Nathan VanBenschoten <[email protected]>
Build succeeded |
This commit adds a check that QueryTxn requests set sufficiently high timestamps on their request header. This is in the spirit of cockroachdb#35297. Even though QueryTxn will never change the timestamp cache, I think there is a very rare hazard if it doesn't provide a timestamp at least as large as the transaction's timestamp in its batch header. If it doesn't do this, it seems possible that it could evaluate concurrently with a lease transfer (the lease transfer would have to start right after). The timestamp cache could then rotate pages a number of times until its low water mark is above the lease transfer timestamp (which is crazy as this would need to take at least 10s), and the QueryTxn could then consider a transaction ABORTED while evaluating because of the timestamp cache low water mark. The new leaseholder would have a lower timestamp cache low water mark and the transaction might still be able to create its transaction record. I don't think we would have ever seen this in the wild because of the 10s stall and the repeat timestamp cache rotations rotations, but this seems like a good change to make anyway. Release note: None
This commit adds a check that QueryTxn requests set sufficiently high timestamps on their request header. This is in the spirit of cockroachdb#35297. Even though QueryTxn will never change the timestamp cache, I think there is a very rare hazard if it doesn't provide a timestamp at least as large as the transaction's timestamp in its batch header. If it doesn't do this, it seems possible that it could evaluate concurrently with a lease transfer (the lease transfer would have to start right after). The timestamp cache could then rotate pages a number of times until its low water mark is above the lease transfer timestamp (which is crazy as this would need to take at least 10s), and the QueryTxn could then consider a transaction ABORTED while evaluating because of the timestamp cache low water mark. The new leaseholder would have a lower timestamp cache low water mark and the transaction might still be able to create its transaction record. I don't think we would have ever seen this in the wild because of the 10s stall and the repeat timestamp cache rotations rotations, but this seems like a good change to make anyway. Release note: None
36307: storage: require QueryTxn requests set a sufficient timestamp r=nvanbenschoten a=nvanbenschoten This commit adds a check that QueryTxn requests set sufficiently high timestamps on their request header. This is in the spirit of #35297. Even though QueryTxn will never change the timestamp cache, I think there is a very rare hazard if it doesn't provide a timestamp at least as large as the transaction's timestamp in its batch header. If it doesn't do this, it seems possible that it could evaluate concurrently with a lease transfer (the lease transfer would have to start right after). The timestamp cache could then rotate pages a number of times until its low water mark is above the lease transfer timestamp (which is crazy as this would need to take at least 10s), and the QueryTxn could then consider a transaction ABORTED while evaluating because of the timestamp cache low water mark. The new leaseholder would have a lower timestamp cache low water mark and the transaction might still be able to create its transaction record. I don't think we would have ever seen this in the wild because of the 10s stall and the repeat timestamp cache rotations necessary, but this seems like a good change to make anyway. Release note: None Co-authored-by: Nathan VanBenschoten <[email protected]>
This completes the migration started in cockroachdb#35297, which was partially finalized in cockroachdb#38446. Release note: None
43227: roachpb: remove PushTxnRequest.InclusivePushTo r=nvanbenschoten a=nvanbenschoten This completes the migration started in #35297, which was partially finalized in #38446. Release note: None 43316: storagebase: take RangeDesc by reference in ContainsKey(Range) r=nvanbenschoten a=nvanbenschoten All callers of this had a *RangeDescriptor, so there's no reason to pass in a large RangeDescriptor struct. Release note: None 43563: storage/intentresolver: don't capture loop iteration vars in async task r=nvanbenschoten a=nvanbenschoten It's unclear if we've ever seen issues from this, but I intend to backport the fix to v19.2, v19.1, and v2.1. I believe the worst thing that could have happened is that a batch that observed multiple intents or pushed multiple txns would only end up cleaning up a single one of these. It would then run into some of these intents again when it tried to re-evaluate, forcing it to push again. This subverts the parallelism that we were trying to achieve here, but would never cause a stall. Release note (bug fix): Ensure that all intents or transactions that a batch observes are asynchronously cleaned up. Co-authored-by: Nathan VanBenschoten <[email protected]>
This PR's overall goal is to ensure that the timestamp cache updates performed by
PushTxn
requests are always safe. It does this in a series of steps:PushTo
argument inclusive, making it easier to use and easier to assert against (e.g.req.PushTo <= req.Timestamp
).Now
argument and begins using the batch'sTimestamp
field instead. This field is used to update the receiving store's clock before the request is evaluated and before the lease transfer check is performed.I'm planning on getting this in to 19.1.