kv: use leaseholder's observed timestamp even for follower reads #56679
This commit addresses a longstanding TODO to rationalize the use of observed timestamps during follower reads.

Before this change, we used observed timestamps pulled from the follower node to limit a transaction's uncertainty interval, but this was incorrect. An observed timestamp pulled from the follower node's clock has no meaning for the purpose of reducing the transaction's uncertainty interval. This is because there is no guarantee that, at the time the observed timestamp was acquired from the follower node, the leaseholder hadn't already served writes at higher timestamps than the follower node's clock reflected.

However, if the transaction performing a follower read happens to have an observed timestamp from the current leaseholder, this timestamp can be used to reduce the transaction's uncertainty interval. Even though the read is being served from a different replica in the range, the observed timestamp still places a bound on the values in the range that may have been written before the transaction began.

In the past, this was mostly innocuous because AOST txns don't have an uncertainty interval and the follower read duration was so large that very few "present time" transactions would ever perform follower reads. Now that the follower read duration is lower and more and more present-time transactions will perform follower reads, this is more critical to get right.

The change also reworks the observedts package's docs a little bit to take advantage of section headers.
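The rule described above (consult the observed timestamp of the range's leaseholder, never that of the follower serving the read) can be illustrated with a minimal Go sketch. This is not the code from the PR or the observedts package's API: `NodeID`, `Timestamp`, `Txn`, and `uncertaintyLimit` are hypothetical, simplified stand-ins (plain integers instead of hybrid logical clock timestamps), and other details of the real implementation are left out.

```go
package main

import "fmt"

// NodeID and Timestamp are simplified, hypothetical stand-ins used only
// for this illustration.
type NodeID int
type Timestamp int64

// Txn carries only the fields relevant to the uncertainty interval.
type Txn struct {
	ReadTimestamp Timestamp
	// GlobalUncertaintyLimit is the default upper bound of the uncertainty
	// interval (assumed here to be ReadTimestamp + max clock offset).
	GlobalUncertaintyLimit Timestamp
	// ObservedTimestamps maps a node to the clock reading the transaction
	// observed on that node.
	ObservedTimestamps map[NodeID]Timestamp
}

// uncertaintyLimit returns the upper bound of the transaction's uncertainty
// interval for a read on a range whose lease is held by leaseholder.
// The node that actually serves the read is deliberately not a parameter:
// a follower's clock says nothing about writes the leaseholder has served.
func uncertaintyLimit(txn Txn, leaseholder NodeID) Timestamp {
	limit := txn.GlobalUncertaintyLimit
	if obs, ok := txn.ObservedTimestamps[leaseholder]; ok && obs < limit {
		limit = obs
	}
	return limit
}

func main() {
	txn := Txn{
		ReadTimestamp:          100,
		GlobalUncertaintyLimit: 150,
		ObservedTimestamps: map[NodeID]Timestamp{
			1: 110, // observed on the leaseholder
			2: 102, // observed on a follower
		},
	}
	// The read may be served by node 2, but only node 1's observation
	// is used to shrink the interval.
	fmt.Println(uncertaintyLimit(txn, 1))
}
```

In this sketch, a transaction with no observation from the leaseholder simply falls back to the global limit, even if it has an observation from the follower that serves the read.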
de181c6 to d7d777b
Reviewed 9 of 9 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained
Thanks for iterating on the docs format, this is starting to look like a solid pattern.
bors r+
Build failed:
Doesn't look related to this change. I stressed the test for 15 minutes and did not hit this issue, though I did eventually hit #56358 (comment). We don't have any other reports of it, so maybe a very rare flake?
bors r+
Build failed (retrying...):
Build succeeded: