-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage: avoid stale read due to observed timestamp after lease change #23749
Milestone
Comments
spencerkimball
added a commit
to spencerkimball/cockroach
that referenced
this issue
Mar 21, 2018
This change relocates the logic for using observed timestamps to affect a transaction's `MaxTimestamp` field in order to avoid uncertainty restarts. This was previously done in `storage.Stores.Send`, but is now moved to `storage.Replica.executeReadOnlyBatch` and `storage.Replica.tryExecuteWriteBatch` in order to use information from the replica's lease to avoiding failing to read a value written in absolute time before the txn started, but at a higher timestamp than the timestamp observed at the node which now owns the replica lease. Fixes cockroachdb#23749 Release note: None
spencerkimball
added a commit
to spencerkimball/cockroach
that referenced
this issue
Mar 22, 2018
This change relocates the logic for using observed timestamps to affect a transaction's `MaxTimestamp` field in order to avoid uncertainty restarts. This was previously done in `storage.Stores.Send`, but is now moved to `storage.Replica.executeReadOnlyBatch` and `storage.Replica.tryExecuteWriteBatch` in order to use information from the replica's lease to avoiding failing to read a value written in absolute time before the txn started, but at a higher timestamp than the timestamp observed at the node which now owns the replica lease. Fixes cockroachdb#23749 Release note: None
nvanbenschoten
added a commit
to nvanbenschoten/cockroach
that referenced
this issue
Apr 5, 2019
…holders This PR is an initial reaction to cockroachdb#36431 and to `limitTxnMaxTimestamp` being broken for follower reads (PR for that coming soon). To start, we were missing any real documentation on what makes the observed timestamp mechanism safe and what invariants need to hold in order for a clock reading on a node to imply an absence of causality on any values later read at higher timestamp on that node. The PR addresses this by discussing the invariants necessary for observed timestamps to be used to bound a transaction's uncertainty window. In doing so, the PR also attempts to address a second issue, which is that the documentation conflated a node serving reads and writes with leaseholders serving reads and writes. At the time that this was all written, these were equivalent. Now that we have follower reads, the omission of a discussion about leaseholders made this all hard to think about (e.g. "whose observed timestamp do we care about when performing a follower read?"). The PR addresses this by adjusting the wording around nodes and leaseholders. This would have come in handy in reasoning about cockroachdb#23749. Release note: None
nvanbenschoten
added a commit
to nvanbenschoten/cockroach
that referenced
this issue
Apr 9, 2019
…holders This PR is an initial reaction to cockroachdb#36431 and to `limitTxnMaxTimestamp` being broken for follower reads (PR for that coming soon). To start, we were missing any real documentation on what makes the observed timestamp mechanism safe and what invariants need to hold in order for a clock reading on a node to imply an absence of causality on any values later read at higher timestamp on that node. The PR addresses this by discussing the invariants necessary for observed timestamps to be used to bound a transaction's uncertainty window. In doing so, the PR also attempts to address a second issue, which is that the documentation conflated a node serving reads and writes with leaseholders serving reads and writes. At the time that this was all written, these were equivalent. Now that we have follower reads, the omission of a discussion about leaseholders made this all hard to think about (e.g. "whose observed timestamp do we care about when performing a follower read?"). The PR addresses this by adjusting the wording around nodes and leaseholders. This would have come in handy in reasoning about cockroachdb#23749. Release note: None
craig bot
pushed a commit
that referenced
this issue
Apr 10, 2019
36554: roachpb: formalize observed timestamp preconditions, talk about leaseholders r=nvanbenschoten a=nvanbenschoten This PR is an initial reaction to #36431 and to `limitTxnMaxTimestamp` being broken for follower reads (PR for that coming soon). To start, we were missing any real documentation on what makes the observed timestamp mechanism safe and what invariants need to hold in order for a clock reading on a node to imply an absence of causality on any values later read at higher timestamp on that node. The PR addresses this by discussing the invariants necessary for observed timestamps to be used to bound a transaction's uncertainty window. In doing so, the PR also attempts to address a second issue, which is that the documentation conflated a node serving reads and writes with leaseholders serving reads and writes. At the time that this was all written, these were equivalent. Now that we have follower reads, the omission of a discussion about leaseholders made this all hard to think about (e.g. "whose observed timestamp do we care about when performing a follower read?"). The PR addresses this by adjusting the wording around nodes and leaseholders. This would have come in handy in reasoning about #23749. Release note: None Co-authored-by: Nathan VanBenschoten <[email protected]>
craig bot
pushed a commit
that referenced
this issue
Apr 10, 2019
36554: roachpb: formalize observed timestamp preconditions, talk about leaseholders r=nvanbenschoten a=nvanbenschoten This PR is an initial reaction to #36431 and to `limitTxnMaxTimestamp` being broken for follower reads (PR for that coming soon). To start, we were missing any real documentation on what makes the observed timestamp mechanism safe and what invariants need to hold in order for a clock reading on a node to imply an absence of causality on any values later read at higher timestamp on that node. The PR addresses this by discussing the invariants necessary for observed timestamps to be used to bound a transaction's uncertainty window. In doing so, the PR also attempts to address a second issue, which is that the documentation conflated a node serving reads and writes with leaseholders serving reads and writes. At the time that this was all written, these were equivalent. Now that we have follower reads, the omission of a discussion about leaseholders made this all hard to think about (e.g. "whose observed timestamp do we care about when performing a follower read?"). The PR addresses this by adjusting the wording around nodes and leaseholders. This would have come in handy in reasoning about #23749. Release note: None Co-authored-by: Nathan VanBenschoten <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Discovered in #21056 (follower reads).
The fundamental problem is that when we pick up a clock read on a node that we visit, we use this as a guarantee that the node has not served any writes at higher timestamps that we could be missing (as in, that happened causally before us). But this guarantee is broken if the node later becomes leaseholder of some range after we first visited the node (where the data was in fact causally there before our visit, but the node wasn't in charge of it and thus unaware at the time).
For example,
@spencerkimball suggests:
The text was updated successfully, but these errors were encountered: