Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: ensure that PushTxn request's timestamp cache updates are safe #35297

Merged
merged 3 commits into from
Mar 2, 2019

Conversation

nvanbenschoten
Copy link
Member

This PR's overall goal is to ensure that the timestamp cache updates performed by PushTxn requests are always safe. It does this in a series of steps:

  1. it makes its PushTo argument inclusive, making it easier to use and easier to assert against (e.g. req.PushTo <= req.Timestamp).
  2. it removes the Now argument and begins using the batch's Timestamp field instead. This field is used to update the receiving store's clock before the request is evaluated and before the lease transfer check is performed.
  3. it adds an assertion that all timestamp cache updates are safe given their local clock. This will also catch hypothesized bugs like storage: minimize retries with 1PC commits #22315 (comment).

I'm planning on getting this in to 19.1.

@nvanbenschoten nvanbenschoten requested review from tbg and a team March 1, 2019 03:05
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@nvanbenschoten nvanbenschoten force-pushed the nvanbenschoten/pushTxnTs branch 3 times, most recently from c4ad28d to ea078dd Compare March 1, 2019 07:37
Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: very nice. I share your opinion that the reasoning in #1102 does not apply any more.

Reviewed 13 of 13 files at r1, 10 of 10 files at r2, 2 of 2 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)


pkg/storage/replica_test.go, line 3282 at r2 (raw file):

	pushee := newTransaction("pushee", key, 1, tc.Clock())
	pushReq := pushTxnArgs(pusher, pushee, roachpb.PUSH_ABORT)
	// pushReq.Now = tc.Clock().Now()

Remove this line.

Before this change, pushers specified the timestamp before which they would like
to push a transaction (i.e. they passed their own timestamp). This change
reworks this so that pushers pass the exact timestamp they would like to push
the transaction to.

There are a few reasons to do this:
- it avoids confusion and makes the contract clearer
- it makes PushTxnRequest more general
- it makes it more clear how PushTo should relate to Now

This third reason is the most important. It will be exploited in the next
commit.

Release note: None
This commit deprecates PushTxnRequest.Now and gives its responsibility
to the batch header timestamp. The biggest reason to do this is so that
PushTxn requests properly update their receiver's clock. This is critical
because a PushTxn request can result in a timestamp cache entry to be
created with a value up to this time, so for safety, we need to ensure
that the leaseholder updates its clock at least to this time _before_
evaluating the request. Otherwise, a lease transfer could miss the
request's effect on the timestamp cache and result in a lost push/abort.

The comment on PushTxnRequest.Now mentioned that the header timestamp
couldn't be used because the header's timestamp "does not necessarily
advance with the node clock across retries and hence cannot detect
abandoned transactions." This dates back all the way to cockroachdb#1102. I haven't
been able to piece together what kind of retries this is referring to,
but I'm almost positive that they don't still apply.

Release note: None
This new assertion would fire if bug like the one hypothesized in
the following comment exist:
cockroachdb#22315 (comment)

Release note: None
@nvanbenschoten nvanbenschoten force-pushed the nvanbenschoten/pushTxnTs branch from ea078dd to 3164bf0 Compare March 2, 2019 01:14
Copy link
Member Author

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @tbg)


pkg/storage/replica_test.go, line 3282 at r2 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Remove this line.

Done.

craig bot pushed a commit that referenced this pull request Mar 2, 2019
35297: storage: ensure that PushTxn request's timestamp cache updates are safe r=nvanbenschoten a=nvanbenschoten

This PR's overall goal is to ensure that the timestamp cache updates performed by `PushTxn` requests are always safe. It does this in a series of steps:
1. it makes its `PushTo` argument inclusive, making it easier to use and easier to assert against (e.g. `req.PushTo <= req.Timestamp`).
2. it removes the `Now` argument and begins using the batch's `Timestamp` field instead. This field is used to update the receiving store's clock _before_ the request is evaluated and before the lease transfer check is performed.
3. it adds an assertion that all timestamp cache updates are safe given their local clock. This will also catch hypothesized bugs like #22315 (comment).

I'm planning on getting this in to 19.1.

Co-authored-by: Nathan VanBenschoten <[email protected]>
@craig
Copy link
Contributor

craig bot commented Mar 2, 2019

Build succeeded

@craig craig bot merged commit 3164bf0 into cockroachdb:master Mar 2, 2019
@nvanbenschoten nvanbenschoten deleted the nvanbenschoten/pushTxnTs branch March 6, 2019 03:53
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request Mar 28, 2019
This commit adds a check that QueryTxn requests set sufficiently high
timestamps on their request header. This is in the spirit of cockroachdb#35297.

Even though QueryTxn will never change the timestamp cache, I think
there is a very rare hazard if it doesn't provide a timestamp at
least as large as the transaction's timestamp in its batch header.
If it doesn't do this, it seems possible that it could evaluate
concurrently with a lease transfer (the lease transfer would have
to start right after). The timestamp cache could then rotate pages
a number of times until its low water mark is above the lease
transfer timestamp (which is crazy as this would need to take at
least 10s), and the QueryTxn could then consider a transaction ABORTED
while evaluating because of the timestamp cache low water mark. The
new leaseholder would have a lower timestamp cache low water mark
and the transaction might still be able to create its transaction
record.

I don't think we would have ever seen this in the wild because of
the 10s stall and the repeat timestamp cache rotations rotations,
but this seems like a good change to make anyway.

Release note: None
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request May 1, 2019
This commit adds a check that QueryTxn requests set sufficiently high
timestamps on their request header. This is in the spirit of cockroachdb#35297.

Even though QueryTxn will never change the timestamp cache, I think
there is a very rare hazard if it doesn't provide a timestamp at
least as large as the transaction's timestamp in its batch header.
If it doesn't do this, it seems possible that it could evaluate
concurrently with a lease transfer (the lease transfer would have
to start right after). The timestamp cache could then rotate pages
a number of times until its low water mark is above the lease
transfer timestamp (which is crazy as this would need to take at
least 10s), and the QueryTxn could then consider a transaction ABORTED
while evaluating because of the timestamp cache low water mark. The
new leaseholder would have a lower timestamp cache low water mark
and the transaction might still be able to create its transaction
record.

I don't think we would have ever seen this in the wild because of
the 10s stall and the repeat timestamp cache rotations rotations,
but this seems like a good change to make anyway.

Release note: None
craig bot pushed a commit that referenced this pull request May 1, 2019
36307: storage: require QueryTxn requests set a sufficient timestamp r=nvanbenschoten a=nvanbenschoten

This commit adds a check that QueryTxn requests set sufficiently high
timestamps on their request header. This is in the spirit of #35297.

Even though QueryTxn will never change the timestamp cache, I think
there is a very rare hazard if it doesn't provide a timestamp at
least as large as the transaction's timestamp in its batch header.
If it doesn't do this, it seems possible that it could evaluate
concurrently with a lease transfer (the lease transfer would have
to start right after). The timestamp cache could then rotate pages
a number of times until its low water mark is above the lease
transfer timestamp (which is crazy as this would need to take at
least 10s), and the QueryTxn could then consider a transaction ABORTED
while evaluating because of the timestamp cache low water mark. The
new leaseholder would have a lower timestamp cache low water mark
and the transaction might still be able to create its transaction
record.

I don't think we would have ever seen this in the wild because of
the 10s stall and the repeat timestamp cache rotations necessary,
but this seems like a good change to make anyway.

Release note: None

Co-authored-by: Nathan VanBenschoten <[email protected]>
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request Dec 27, 2019
This completes the migration started in cockroachdb#35297, which was partially
finalized in cockroachdb#38446.

Release note: None
craig bot pushed a commit that referenced this pull request Dec 27, 2019
43227: roachpb: remove PushTxnRequest.InclusivePushTo r=nvanbenschoten a=nvanbenschoten

This completes the migration started in #35297, which was partially finalized in #38446.

Release note: None

43316: storagebase: take RangeDesc by reference in ContainsKey(Range) r=nvanbenschoten a=nvanbenschoten

All callers of this had a *RangeDescriptor, so there's no reason to pass in a large RangeDescriptor struct.

Release note: None

43563: storage/intentresolver: don't capture loop iteration vars in async task r=nvanbenschoten a=nvanbenschoten

It's unclear if we've ever seen issues from this, but I intend to backport the
fix to v19.2, v19.1, and v2.1. I believe the worst thing that could have
happened is that a batch that observed multiple intents or pushed multiple txns
would only end up cleaning up a single one of these. It would then run into some
of these intents again when it tried to re-evaluate, forcing it to push again.
This subverts the parallelism that we were trying to achieve here, but would
never cause a stall.

Release note (bug fix): Ensure that all intents or transactions that a
batch observes are asynchronously cleaned up.

Co-authored-by: Nathan VanBenschoten <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants