Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: provide a way to reset ContainsEstimates via stats recomputation #37120

Closed
tbg opened this issue Apr 25, 2019 · 3 comments · Fixed by #37583
Closed

storage: provide a way to reset ContainsEstimates via stats recomputation #37120

tbg opened this issue Apr 25, 2019 · 3 comments · Fixed by #37583
Assignees
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) E-intermediate Intermediate complexity, needs a contributor with 3-6 months of past contribution experience.

Comments

@tbg
Copy link
Member

tbg commented Apr 25, 2019

See #36907 for context. Time series merges and AddSSTable requests can "taint" the range stats by introducing the ContainsEstimates flag. The consistency checker is supposed to notice this and will trigger a stats recomputation. However, it doesn't have a good way to actually reset the ContainsEstimates flag, even after stats are correct. This is because the recomputation relies on the commutativity of stats deltas, but resetting a boolean does not commute with setting that boolean. In other words, if the recomputation attempted to reset the boolean, it couldn't make sure that some other command simultaneously introduces an estimate that is then not reflected in the final, untainted, mvccstats.

A simple fix is to upgrade the MVCCStats.ContainsEstimates field from a bool to a counter that is incremented with each Raft command containing estimates. The stats recomputation would emit the negative count observed on the stats contained in the snapshots it's using to compute a stats delta. In practice, this will almost always lead to ContainsEstimates==0, except when an estimate raced in, in which case it will be counted correctly. ContainsEstimates should never become negative since that could only happen with two concurrent recomputations, and I believe we prevent those already (or they could mangle the stats).

Labeling as E-intermediate because there's a bit of a migration involved.

@tbg tbg added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) E-intermediate Intermediate complexity, needs a contributor with 3-6 months of past contribution experience. labels Apr 25, 2019
@tbg
Copy link
Member Author

tbg commented Apr 25, 2019

cc @knz because we talked about this

@tbg
Copy link
Member Author

tbg commented May 1, 2019

Pointers to understand/tackle this:

The stats sit on the Replica, here (and in a persisted key that's kept in sync with the in-memory object)

// GetMVCCStats returns a copy of the MVCC stats object for this range.
// This accessor is thread-safe, but provides no guarantees about its
// synchronization with any concurrent writes.
func (r *Replica) GetMVCCStats() enginepb.MVCCStats {
r.mu.RLock()
defer r.mu.RUnlock()
return *r.mu.state.Stats
}

Whenever a Raft command applies, we add a corresponding stats update:

// Set the range applied state, which includes the last applied raft and
// lease index along with the mvcc stats, all in one key.
if err := r.raftMu.stateLoader.SetRangeAppliedState(ctx, writer,
raftAppliedIndex, leaseAppliedIndex, &ms); err != nil {
return storagepb.ReplicatedEvalResult{}, errors.Wrap(err, "unable to set range applied state")
}

This happens on every Replica (i.e. three times with a default 3x replicated range), but it's deterministic, so all replicas should have identical stats (at any given position). If this isn't the case it's bad, but this isn't something we worry about in this issue.

Some Raft commands set the ContainsEstimates field to true. These are commands that don't want to compute the exact stats because it's too expensive for them. Instead, they give some approximation which is added (identically, on each replica, like any stats updated). Once a range has applied such an update, its ContainsEstimates stays true eternally until it is merged away (which for most ranges, is never).

The consistency checker is a queue that periodically computes a checksum across the replicas and makes sure it matches (at the same log position across the replicas).

resp, pErr := repl.CheckConsistency(ctx, req)

As a side effect of that (since it has to look at all the data anyway), it can recompute the "true" MVCCStats of the range, and compare it to what the range claims the MVCCStats are. Absent a bug, these stats can only disagree if ContainsEstimates==true on the range.

If a delta is found, the consistency checker initiates a RecomputeStatsRequest:

log.Infof(ctx, "triggering stats recomputation to resolve delta of %+v", results[0].Response.Delta)
req := roachpb.RecomputeStatsRequest{
RequestHeader: roachpb.RequestHeader{Key: startKey},
}

This request computes the current stats (it's kind of questionable that it scans the range again, but let's leave that alone) and adds the delta to them which is then applied to the stats through a raft command:

actualMS, err := rditer.ComputeStatsForRange(desc, snap, cArgs.Header.Timestamp.WallTime)
if err != nil {
return result.Result{}, err
}
currentStats, err := MakeStateLoader(cArgs.EvalCtx).LoadMVCCStats(ctx, snap)
if err != nil {
return result.Result{}, err
}
delta := actualMS
delta.Subtract(currentStats)

Now you'd hope that this would also reset ContainsEstimates to false. But this isn't the case because we can't guarantee that there isn't a request that wants to introduce an estimate that applies just before our "reset" of that flag. If that happened, we'd reset the flag even though there is still an estimate. We don't want that.

The solution is to turn the boolean into something that commutes, i.e. integer addition. Instead of ContainsEstimates bool we have ContainsEstimates int64. Each command introducing an estimate would add 1 to it. In the RecomputeStats evaluation, we would emit a delta of -currentStats.ContainsEstimates. When that applies, usually we have ContainsEstimates==0, i.e. there aren't any. If a command slips in ahead of us, we'll add up with a 1. Everything just works.

Steps to do this:

  • add a cluster version, see
    // To add a version:
    // - Add it at the end of this block.
    // - Add it at the end of the `Versions` block below.
    // - For major or minor versions, bump BinaryMinimumSupportedVersion. For
    // example, if introducing the `1.4` release, bump it from `1.2` to `1.3`.
  • change the ContainsEstimates field from a bool to an int64 (double check the proto spec that this is safe, i.e. that it'll happily decode the 1 as true and 0 as false), update the code accordingly (i.e. false corresponds to 0, everything else corresponds to 1 but we'll only see 0 and 1 at this stage).
  • a gotcha is that all the commands that introduce ContainsEstimates = true (in pkg/storage/batcheval) need to check the current stats, and only emit a ContainsEstimates==true if it isn't already true (otherwise we'll start generating numbers larger than 1 before we know that all nodes can handle that -- an old node would decode such numbers as true and would diverge from the rest, because true corresponds to 1). Once the cluster version is active however, always emit 1.
  • use the cluster version in RecomputeStats: when it's active, we know all nodes know about the integers, and we can emit a delta of -ContainsEstimates. Otherwise, we do what today's code does, i.e. we don't touch ContainsEstimates.

Note that this migration doesn't cover all the bases. There's a case in which the result will be a divergence of the stats. As sort of a pop quiz I'll leave open how that can happen. I have a way out as well. (Updates will be required to the list above).

@giorgosp
Copy link
Contributor

giorgosp commented May 7, 2019

Just a ping that I am working on this although I haven't pinged for several days.
I have wrapped my head around the ContainsEstimates concept and most of my effort was in tracing the related code paths.

On a low level note, I guess the new behavior for AddSSTable should also be implemented for time series merges as well (in updateStatsOnMerge during MVCCMerge).

giorgosp added a commit to giorgosp/cockroach that referenced this issue May 19, 2019
This migration allows ContainsEstimates to be reset after a stats
recomputation (by returning a -ContainsEstimates delta) without
worrying about a race condition. Another command may add 1 to it
flag and it will still be stored as valid.

Resolves cockroachdb#37120

Release note: None
giorgosp added a commit to giorgosp/cockroach that referenced this issue May 27, 2019
This migration allows ContainsEstimates to be reset after a stats
recomputation (by returning a -ContainsEstimates delta) without
worrying about a race condition. Another command may add 1 to it
flag and it will still be stored as valid.

Resolves cockroachdb#37120

Release note: None
giorgosp added a commit to giorgosp/cockroach that referenced this issue Jun 3, 2019
This migration allows ContainsEstimates to be reset after a stats
recomputation (by returning a -ContainsEstimates delta) without
worrying about a race condition. Another command may add 1 to it
flag and it will still be stored as valid.

Resolves cockroachdb#37120

Release note: None
giorgosp added a commit to giorgosp/cockroach that referenced this issue Jun 5, 2019
This migration allows ContainsEstimates to be reset after a stats
recomputation (by returning a -ContainsEstimates delta) without
worrying about a race condition. Another command may add 1 to it
flag and it will still be stored as valid.

Resolves cockroachdb#37120

Release note: None
giorgosp added a commit to giorgosp/cockroach that referenced this issue Jun 6, 2019
This migration allows ContainsEstimates to be reset after a stats
recomputation (by returning a -ContainsEstimates delta) without
worrying about a race condition. Another command may add 1 to it
flag and it will still be stored as valid.

Resolves cockroachdb#37120

Release note: None
giorgosp added a commit to giorgosp/cockroach that referenced this issue Jun 19, 2019
This migration allows ContainsEstimates to be reset after a stats
recomputation (by returning a -ContainsEstimates delta) without
worrying about a race condition. Another command may add 1 to it
flag and it will still be stored as valid.

Resolves cockroachdb#37120

Release note: None
giorgosp added a commit to giorgosp/cockroach that referenced this issue Jun 20, 2019
This migration allows ContainsEstimates to be reset after a stats
recomputation (by returning a -ContainsEstimates delta) without
worrying about a race condition. Another command may add 1 to it
flag and it will still be stored as valid.

Resolves cockroachdb#37120

Release note: None
giorgosp added a commit to giorgosp/cockroach that referenced this issue Jul 1, 2019
This migration allows ContainsEstimates to be reset after a stats
recomputation (by returning a -ContainsEstimates delta) without
worrying about a race condition. Another command may add 1 to it
flag and it will still be stored as valid.

Resolves cockroachdb#37120

Release note: None
giorgosp added a commit to giorgosp/cockroach that referenced this issue Jul 8, 2019
This migration allows ContainsEstimates to be reset after a stats
recomputation (by returning a -ContainsEstimates delta) without
worrying about a race condition. Another command may add 1 to it
flag and it will still be stored as valid.

Resolves cockroachdb#37120

Release note: None
giorgosp added a commit to giorgosp/cockroach that referenced this issue Jul 9, 2019
This migration allows ContainsEstimates to be reset after a stats
recomputation (by returning a -ContainsEstimates delta) without
worrying about a race condition. Another command may add 1 to it
flag and it will still be stored as valid.

Resolves cockroachdb#37120

Release note: None
giorgosp added a commit to giorgosp/cockroach that referenced this issue Jul 11, 2019
This migration allows ContainsEstimates to be reset after a stats
recomputation (by returning a -ContainsEstimates delta) without
worrying about a race condition. Another command may add 1 to it
flag and it will still be stored as valid.

Resolves cockroachdb#37120

Release note: None
giorgosp added a commit to giorgosp/cockroach that referenced this issue Jul 19, 2019
This migration makes ContainsEstimates a counter so that the
consistency checker can reset it (by returning a -ContainsEstimates)
delta) without racing with another command that introduces new
estimate stats.

Resolves cockroachdb#37120

Release note: None
giorgosp added a commit to giorgosp/cockroach that referenced this issue Jul 26, 2019
This migration makes ContainsEstimates a counter so that the
consistency checker can reset it (by returning a -ContainsEstimates)
delta) without racing with another command that introduces new
estimate stats.

Resolves cockroachdb#37120

Release note: None
giorgosp added a commit to giorgosp/cockroach that referenced this issue Aug 26, 2019
This migration makes ContainsEstimates a counter so that the
consistency checker can reset it (by returning a -ContainsEstimates)
delta) without racing with another command that introduces new
estimate stats.

Resolves cockroachdb#37120

Release note: None
giorgosp added a commit to giorgosp/cockroach that referenced this issue Sep 16, 2019
This migration makes ContainsEstimates a counter so that the
consistency checker can reset it (by returning a -ContainsEstimates)
delta) without racing with another command that introduces new
estimate stats.

Resolves cockroachdb#37120

Release note: None
giorgosp added a commit to giorgosp/cockroach that referenced this issue Oct 30, 2019
This migration makes ContainsEstimates a counter so that the
consistency checker can reset it (by returning a -ContainsEstimates)
delta) without racing with another command that introduces new
estimate stats.

Resolves cockroachdb#37120

Release note: None
tbg pushed a commit to giorgosp/cockroach that referenced this issue Nov 4, 2019
This migration makes ContainsEstimates a counter so that the
consistency checker can reset it (by returning a -ContainsEstimates)
delta) without racing with another command that introduces new
estimate stats.

Resolves cockroachdb#37120

Release note: None
craig bot pushed a commit that referenced this issue Nov 6, 2019
37583: storage: Migrate MVCCStats.contains_estimates from bool to int64 r=tbg a=giorgosp

This migration allows ContainsEstimates to be reset after a stats
recomputation (by returning a -ContainsEstimates delta) without
worrying about a race condition. Another command may add 1 to it
flag and it will still be stored as valid.

Resolves #37120

Release note: None

42129: colexec: fix AND and OR projections in some cases r=yuzefovich a=yuzefovich

Previously, the original batch length was not respected when the
selection vector is present. This resulted in, for example, query 19 of
TPCH benchmark to return an error. This is now fixed.

I have troubles coming up with a reduced reproduction though.

I also checked that on release-19.2 branch the query is executed
correctly with vectorized, so it must be the switch to flat bytes that
triggers the problem.

Release note: None

42172: colexec: fix sorted distinct with nulls behavior r=yuzefovich a=yuzefovich

Previously, sorted distinct when the nulls might be present would always
get the value at the index without paying attention whether that value
is actually now. This is incorrect behavior because it is undefined in
some cases (like when getting from flat bytes). Now this is fixed.

Fixes: #42055.

Release note: None

Co-authored-by: George Papadrosou <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
@craig craig bot closed this as completed in 35bdbf0 Nov 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) E-intermediate Intermediate complexity, needs a contributor with 3-6 months of past contribution experience.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants