Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: tryReproposeWithNewLeaseIndex may violate closed timestamp #42821

Closed
tbg opened this issue Nov 27, 2019 · 0 comments · Fixed by #42939
Closed

storage: tryReproposeWithNewLeaseIndex may violate closed timestamp #42821

tbg opened this issue Nov 27, 2019 · 0 comments · Fixed by #42939
Assignees
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.

Comments

@tbg
Copy link
Member

tbg commented Nov 27, 2019

Consider proposing a write and this code path:

ch, abandon, maxLeaseIndex, pErr := r.evalAndPropose(ctx, lease, ba, spans, ec.move())
if pErr != nil {
if maxLeaseIndex != 0 {
log.Fatalf(
ctx, "unexpected max lease index %d assigned to failed proposal: %s, error %s",
maxLeaseIndex, ba, pErr,
)
}
return nil, pErr
}
// A max lease index of zero is returned when no proposal was made or a lease was proposed.
// In both cases, we don't need to communicate a MLAI. Furthermore, for lease proposals we
// cannot communicate under the lease's epoch. Instead the code calls EmitMLAI explicitly
// as a side effect of stepping up as leaseholder.
if maxLeaseIndex != 0 {
untrack(ctx, ctpb.Epoch(lease.Epoch), r.RangeID, ctpb.LAI(maxLeaseIndex))
}

Let's say the maxLeaseIndex assigned here is 100 and the timestamp of the command is 1000. The closed timestamp tracker now knows 100 as the MLAI of a command that has an effect at ts=1000.

The proposal now hits this code path

case proposalIllegalLeaseIndex:
// If we failed to apply at the right lease index, try again with a
// new one. This is important for pipelined writes, since they don't
// have a client watching to retry, so a failure to eventually apply
// the proposal would be a user-visible error.
pErr = r.tryReproposeWithNewLeaseIndex(ctx, cmd)
if pErr != nil {

and gets reproposed with a new lease index, let's say 200:

// Some tests check for this log message in the trace.
log.VEventf(ctx, 2, "retry: proposalIllegalLeaseIndex")
maxLeaseIndex, pErr := r.propose(ctx, p)
if pErr != nil {
log.Warningf(ctx, "failed to repropose with new lease index: %s", pErr)
return pErr

From the perspective of closed timestamps, this is problematic. A CT update might have been sent out stating that "if you've caught up to lease index 100, you've seen all the proposals that will have an effect on timestamp 1000". But it turns out that in this example, such a command will pop out of raft at lease index 200.

It seems that this can cause a violation of the invariant that any closed out MVCC snapshot is immutable.

As for fixing this, the "natural" attempt is to go through the tracker again. However, if we find that the tracker pushes the proposal (i.e. the original timestamp has closed out, as it has in this example), we need someone to reevaluate it.

It's possible that we would be better off removing this one-off below-raft-reproposal path completely and instead opting for having the client reevaluate the command unconditionally. The one-off path was originally introduced to accommodate pipelined writes (where the client returns early and isn't around to reevaluate the command) but this doesn't mean we can't reevaluate the batch, just that we have to create a goroutine to do so. This approach is less performant, but this is a rare case where simplicity is more important than performance.

@tbg tbg added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Nov 27, 2019
tbg added a commit to tbg/cockroach that referenced this issue Dec 4, 2019
Prior to this commit, tryReproposeWithNewLeaseIndex would not consult
with the closed timestamp tracker. This meant that a proposal may end up
with a new lease index at a timestamp that consumers of closed timestamp
updates had already considered immutable. In other words, this broke
closed timestamp guarantees and would by extension also have the
potential to break CDC's.

This path is fairly rare and it isn't one likely to cause problems
in practice, though it certainly could have.

See cockroachdb#42821 for details.

Fixes cockroachdb#42821

Release note (bug fix): fix a bug which could lead to follower reads or
CDC updates that did not reflect the full set of data at the timestamp.
This bug was never observed in practice and should be rare to cause
issues, one of the necessary ingredients being an aggressive closed
timestamp interval.
craig bot pushed a commit that referenced this issue Dec 13, 2019
42939: storage: respect closed timestamp in tryReproposeWithNewLeaseIndex r=nvanbenschoten a=tbg

Prior to this commit, tryReproposeWithNewLeaseIndex would not consult
with the closed timestamp tracker. This meant that a proposal may end up
with a new lease index at a timestamp that consumers of closed timestamp
updates had already considered immutable. In other words, this broke
closed timestamp guarantees and would by extension also have the
potential to break CDC's.

This path is fairly rare and it isn't one likely to cause problems
in practice, though it certainly could have.

See #42821 for details.

Fixes #42821

Release note (bug fix): fix a bug which could lead to follower reads or
CDC updates that did not reflect the full set of data at the timestamp.
This bug was never observed in practice and should be rare to cause
issues, one of the necessary ingredients being an aggressive closed
timestamp interval.

Co-authored-by: Tobias Schottdorf <[email protected]>
@craig craig bot closed this as completed in 20600dc Dec 13, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants