Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv: use leaseholder's observed timestamp even for follower reads #56679

Merged

Conversation

nvanbenschoten
Copy link
Member

This commit addresses a longstanding TODO to rationalize the use of
observed timestamps during follower reads.

Before this change, we used to use observed timestamps pulled from the
follower node to limit a transaction's uncertainty interval, but this
was incorrect. An observed timestamp pulled from the follower node's
clock has no meaning for the purpose of reducing the transaction's
uncertainty interval. This is because there is no guarantee that at the
time of acquiring the observed timestamp from the follower node, the
leaseholder hadn't already served writes at higher timestamps than the
follower node's clock reflected.

However, if the transaction performing a follower read happens to have
an observed timestamp from the current leaseholder, this timestamp can
be used to reduce the transaction's uncertainty interval. Even though
the read is being served from a different replica in the range, the
observed timestamp still places a bound on the values in the range that
may have been written before the transaction began.

In the past, this was mostly innocuous because AOST txns don't have an
uncertainty interval and the follower read duration was so large that
very few "present time" transactions would ever perform follower reads.
Now that the follower read duration is lower and more and more present
time transactions will perform follower reads, this is more critical to
get right.

The change also reworks the observedts package's docs a little bit to
take advantage of section headers:
localhost_8080_pkg_github com_cockroachdb_cockroach_pkg_kv_kvserver_observedts_

@nvanbenschoten nvanbenschoten requested a review from tbg November 13, 2020 23:29
@cockroach-teamcity
Copy link
Member

This change is Reviewable

This commit addresses a longstanding TODO to rationalize the use of
observed timestamps during follower reads.

Before this change, we used to use observed timestamps pulled from the
follower node to limit a transaction's uncertainty interval, but this
was incorrect. An observed timestamp pulled from the follower node's
clock has no meaning for the purpose of reducing the transaction's
uncertainty interval. This is because there is no guarantee that at the
time of acquiring the observed timestamp from the follower node, the
leaseholder hadn't already served writes at higher timestamps than the
follower node's clock reflected.

However, if the transaction performing a follower read happens to have
an observed timestamp from the current leaseholder, this timestamp can
be used to reduce the transaction's uncertainty interval. Even though
the read is being served from a different replica in the range, the
observed timestamp still places a bound on the values in the range that
may have been written before the transaction began.

In the past, this was mostly innocuous because AOST txns don't have an
uncertainty interval and the follower read duration was so large that
very few "present time" transactions would ever perform follower reads.
Now that the follower read duration is lower and more and more present
time transactions will perform follower reads, this is more critical to
get right.

The change also reworks the observedts package's docs a little bit to
take advantage of section headers.
@nvanbenschoten nvanbenschoten force-pushed the nvanbenschoten/observedTSFollower branch from de181c6 to d7d777b Compare November 14, 2020 21:10
Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 9 of 9 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating on the docs format, this is starting to look like a solid pattern.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@nvanbenschoten
Copy link
Member Author

bors r+

@craig
Copy link
Contributor

craig bot commented Nov 16, 2020

Build failed:

@nvanbenschoten
Copy link
Member Author

--- done: testdata/logic_test/hash_join_dist with config 5node: 2 tests, 1 failures
    logic.go:2908: 
        testdata/logic_test/hash_join_dist:9: error while processing
    logic.go:2908: pq: cannot up-replicate to s3; missing gossiped StoreDescriptor

Doesn't look related to this change. I stressed the test for 15 minutes and did not hit this issue, though I did eventually hit #56358 (comment). We don't have any other reports of it, so maybe a very rare flake?

bors r+

@craig
Copy link
Contributor

craig bot commented Nov 16, 2020

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Nov 16, 2020

Build succeeded:

@craig craig bot merged commit 859be50 into cockroachdb:master Nov 16, 2020
@nvanbenschoten nvanbenschoten deleted the nvanbenschoten/observedTSFollower branch November 23, 2020 02:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants