Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: Migrate MVCCStats.contains_estimates from bool to int64 #37583

Merged
merged 1 commit into from
Nov 6, 2019

Conversation

giorgosp
Copy link
Contributor

This migration allows ContainsEstimates to be reset after a stats
recomputation (by returning a -ContainsEstimates delta) without
worrying about a race condition. Another command may add 1 to it
flag and it will still be stored as valid.

Resolves #37120

Release note: None

@giorgosp giorgosp requested a review from a team as a code owner May 19, 2019 09:36
@giorgosp giorgosp requested review from a team May 19, 2019 09:36
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@giorgosp giorgosp force-pushed the migrate-containsestimates-int64 branch 4 times, most recently from 0a22a46 to 8def60c Compare May 19, 2019 16:42
@giorgosp giorgosp requested a review from a team May 19, 2019 16:42
@tbg tbg self-requested a review May 19, 2019 19:03
Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good so far! I think there's still some confusion about these computations but that's understandable, this stuff melts my brain every time I look at it.

Here's what I'm thinking about the migration. Naively (and this won't be enough) commands (in batcheval, i.e. AddSSTable) would do something like:

cArgs.Stats.ContainsEstimates++

Now here's still a problem: consider some nodes are running the old version still and we propose a command that introduces ContainsEstimates for the first time. It looks good, because the new nodes will compute 1 + 0 = 1 and the old nodes will see a protobuf-varint encoded 1 on the wire, which they decode to ContainsEstimates == true (see here)

But now we run a second command with an estimate, so the new nodes update ContainsEstimates = 1+1 = 2 but the old node still decodes 2 as true and true encodes the same as 1 (in protobuf parlance). So we're screwed because now the old nodes persist different bytes for their stats as the new nodes. We can't use anything but 0and1(with the semantics that1` never gets removed) as long as old nodes are around.

This poses an awkward problem since we need to uphold this also when we can't check the cluster version (i.e. below Raft).

I think the way out is to never use ContainsEstimates == 1 except as a signal that we're still using the legacy behavior. We can then adapt the logic that adds ContainsEstimates so that we have

func (ms *MVCCStats) Subtract(oms MVCCStats) { // similar for Add
  if ms.ContainsEstimates == 1 || oms.ContainsEstimates == 1 {
    ms.ContainsEstimates = 1
  } else {
    ms.ContainsEstimates -= oms.ContainsEstimates
  }
}

and wherever we generate a delta we would use

if !r.ClusterSettings().Version.IsActive(cluster.VersionContainsEstimatesCounter) {
  cArgs.Stats.ContainsEstimates = 1
} else {
  cArgs.Stats.ContainsEstimates += 2
}

Let's walk through an example to see how this changes things and run three AddSSTable commands in a mixed cluster. All of them will set cArgs.Stats.ContainsEstimates = 1 and thanks to ms.{Add,Subtract} above and the fact that nobody will ever use any ContainsEstimates which are not 0 or 1 (since old nodes are around), this will behave (and encode) exactly like the old nodes encoded false and true. Of course there's an exception, which is that MVCCStatsDelta uses sint64 and there, 1 encodes like a regular varint(2), so when in MVCCStatsDelta we need to map 1 -> -1 (i.e. ContainsEstimates == true corresponds to ContainsEstimates == -1). That is just to say that the MVCCStatsDelta and the various other structs of the same goal (which we use for more efficient encodings in some places) need to have extra code in the wrappers:

// ToStats converts the receiver to an MVCCStats.
func (ms *MVCCStatsDelta) ToStats() MVCCStats {
return MVCCStats(*ms)
}
// ToStatsDelta converts the receiver to an MVCCStatsDelta.
func (ms *MVCCStats) ToStatsDelta() MVCCStatsDelta {
return MVCCStatsDelta(*ms)
}
// ToStats converts the receiver to an MVCCStats.
func (ms *MVCCPersistentStats) ToStats() MVCCStats {
return MVCCStats(*ms)
}
// ToPersistentStats converts the receiver to an MVCCPersistentStats.
func (ms *MVCCStats) ToPersistentStats() MVCCPersistentStats {
return MVCCPersistentStats(*ms)
}

Now when the cluster version flips, nodes will start to emit even numbers. Due to our semantics (see Subtract above) as long as the range still has ContainsEstimates == 1, we stick with 1. But once the consistency checker adjusts the stats under the new version (i.e. resets to zero), future updates will actually use the counter semantics, as we wanted.

^- this is all pretty out in the weeds (especially the different proto encodings), so it's totally fine by me (and a valid starting point) to forget about the migration for now and to get the code in good order first. We can then give the migration focused attention. The migration is unfortunately pretty critical since replicas will diverge if we get it wrong.

Reviewed 28 of 28 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @giorgosp)


pkg/storage/client_split_test.go, line 934 at r1 (raw file):

	}

	// TODO(giorgosp): Why msLeft.ContainsEstimates is 193 and msRight.ContainsEstimates is 0??

Each writeRandomTimeSeriesDataToRange write adds an estimate. You split at the midpoint, which recomputes the stats on the left and right, so you should end up with no estimates on either side. I've commented on the code in the split trigger, it's not correct and this test catches the problem.


pkg/storage/replica_consistency.go, line 186 at r1 (raw file):

		}

		if delta.ContainsEstimates == 0 && testingFatalOnStatsMismatch {

Make this <= 0 (and mention in the comment that it's never supposed to go negative, but who knows)


pkg/storage/replica_raft.go, line 2284 at r1 (raw file):

	// If we were running the new VersionContainsEstimatesCounter cluster version,
	// the consistency checker will be able to reset the stats itself.
	if !r.ClusterSettings().Version.IsActive(cluster.VersionContainsEstimatesCounter) {

It's usually not possible to use IsActive safely below Raft (i.e. the code that applies Raft commands to the state machine, i.e. here). The reason is that the cluster version is communicated asynchronously, so nodes will receive it at different times, and will just randomly start using it at whatever log position the update comes in. This can lead to one node running this code and another not, for the same command, thus updating the in-memory state in different ways. I think you want to leave this code in as it was before but change the split code so that it generates a ms that has ContainsEstimates set to a negative value that cancels that of the left-hand side (but only if the version is active). The result will be that when the new version is active, the addition results in zero but it's going to be overwritten by zero once more. If the version isn't active yet, we overwrite with zero and that's ok too.


pkg/storage/batcheval/cmd_add_sstable.go, line 99 at r1 (raw file):

	// remove the ContainsEstimates flag) after they are done ingesting files by
	// sending an explicit recompute.
	// TODO(giorgosp): Should I check if it's 1 already or it always comes as 0?

In practice it will be 0 but you can't know that. In theory the AddSSTableRequest could be in a batch with other requests that cause estimates. Just write stats.ContainsEstimates++ (here and everywhere else). Note the need for a version gate as mentioned in the main commit msg.


pkg/storage/batcheval/cmd_end_transaction.go, line 1049 at r1 (raw file):

	recStats := rec.GetMVCCStats()
	leftDeltaMS.Subtract(recStats)    // subtract pre-split absolute stats
	leftDeltaMS.ContainsEstimates = 0 // if there were any, recomputation removed them

If the left hand side of the split has, say, ContainsEstimates = 10, then leftDeltaMS.Subtract(recStats) should result in leftDeltaMS.ContainsEstimates = -10 which downstream of Raft would give a sum of zero, as desired. (You need to check the cluster version before you emit a non-1 delta). You shouldn't need to touch ContainsEstimates manually here (except for migration stuff).

But note that the resulting leftDeltaMS will mess up rightDeltaMS below if we're not careful. So what you want to do is rightDeltaMS.ContainsEstimates = 0 after subtracting from rightDeltaMS.


pkg/storage/engine/enginepb/mvcc.go, line 146 at r1 (raw file):

	// If either stats object contains estimates, their difference does too.
	// TODO(giorgosp): This is problematic, i.e. 1 - 1 = 0, which means the stats
	// are exact. Should we add them or just keep one of them?

See the comment about the migration on the main thread.


pkg/storage/engine/enginepb/mvcc.proto, line 113 at r1 (raw file):

  // where complete accuracy is required, and instead should be recomputed
  // when necessary.
  optional int64 contains_estimates = 14 [(gogoproto.nullable) = false];

This needs a big comment about the migration when we've figured it out.

Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @giorgosp)


pkg/storage/engine/enginepb/mvcc3.proto, line 111 at r1 (raw file):

  option (gogoproto.equal) = true;

  sint64 contains_estimates = 14;

I emailed you this already, but for posterity: this can be int64 (it's basically always zero, and if it's not, it's basically always positive, and only negative in very rare cases such as splits and stats adjustments).

Using int64 has the big advantage that all of this "mapping 1 to -1" problem goes away, i.e. ContainsEstimates == 1 should correspond to ContainsEstimates == true without problems.

@giorgosp giorgosp force-pushed the migrate-containsestimates-int64 branch from 8def60c to 9cb57ff Compare May 27, 2019 19:03
Copy link
Contributor Author

@giorgosp giorgosp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the thorough reply Tobias! I have made some small changes

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @giorgosp and @tbg)


pkg/storage/client_split_test.go, line 934 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Each writeRandomTimeSeriesDataToRange write adds an estimate. You split at the midpoint, which recomputes the stats on the left and right, so you should end up with no estimates on either side. I've commented on the code in the split trigger, it's not correct and this test catches the problem.

Thanks!


pkg/storage/replica_consistency.go, line 186 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Make this <= 0 (and mention in the comment that it's never supposed to go negative, but who knows)

Done.


pkg/storage/replica_raft.go, line 2284 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

It's usually not possible to use IsActive safely below Raft (i.e. the code that applies Raft commands to the state machine, i.e. here). The reason is that the cluster version is communicated asynchronously, so nodes will receive it at different times, and will just randomly start using it at whatever log position the update comes in. This can lead to one node running this code and another not, for the same command, thus updating the in-memory state in different ways. I think you want to leave this code in as it was before but change the split code so that it generates a ms that has ContainsEstimates set to a negative value that cancels that of the left-hand side (but only if the version is active). The result will be that when the new version is active, the addition results in zero but it's going to be overwritten by zero once more. If the version isn't active yet, we overwrite with zero and that's ok too.

I see, thanks!


pkg/storage/batcheval/cmd_add_sstable.go, line 99 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

In practice it will be 0 but you can't know that. In theory the AddSSTableRequest could be in a batch with other requests that cause estimates. Just write stats.ContainsEstimates++ (here and everywhere else). Note the need for a version gate as mentioned in the main commit msg.

Is this what you meant?


pkg/storage/engine/enginepb/mvcc.go, line 146 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

See the comment about the migration on the main thread.

Done.


pkg/storage/engine/enginepb/mvcc3.proto, line 111 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

I emailed you this already, but for posterity: this can be int64 (it's basically always zero, and if it's not, it's basically always positive, and only negative in very rare cases such as splits and stats adjustments).

Using int64 has the big advantage that all of this "mapping 1 to -1" problem goes away, i.e. ContainsEstimates == 1 should correspond to ContainsEstimates == true without problems.

Done.

@giorgosp giorgosp force-pushed the migrate-containsestimates-int64 branch 5 times, most recently from 5d0691b to 6606331 Compare June 3, 2019 18:46
@giorgosp giorgosp force-pushed the migrate-containsestimates-int64 branch 2 times, most recently from 1f07f87 to 7de5cd2 Compare June 6, 2019 07:06
Copy link
Contributor Author

@giorgosp giorgosp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tbg, made some small changes to have a meaninful base to proceed.

  • ContainsEstimates is int64 instead of sint64 as you suggested to not add unnecessary complexity.
  • AddSSTable command sets it to 1 if it is running the old version. Otherwise it adds 2 to it. Is this what you meant Should we add some specific test?
  • TestRaftBelowProtos fails, my understanding is that it checks the size of the stats for each case? Can I just replace the expected number with the actual?
  • Should I change MVCCMerge in order to ContainsEstimates = 2 when the new version is active instead of = 1 that updateStatsOnMerge does ?
  • Also cleaned up the change from RecomputeStats, it was meaningless. I have to implement returning -ContainsEstimates and see what breaks but after we emit ContainsEstimates correctly.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @giorgosp and @tbg)


pkg/storage/batcheval/cmd_end_transaction.go, line 1049 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

If the left hand side of the split has, say, ContainsEstimates = 10, then leftDeltaMS.Subtract(recStats) should result in leftDeltaMS.ContainsEstimates = -10 which downstream of Raft would give a sum of zero, as desired. (You need to check the cluster version before you emit a non-1 delta). You shouldn't need to touch ContainsEstimates manually here (except for migration stuff).

But note that the resulting leftDeltaMS will mess up rightDeltaMS below if we're not careful. So what you want to do is rightDeltaMS.ContainsEstimates = 0 after subtracting from rightDeltaMS.

I don't really understand what is going on here. I have to read up on this. However, seems like rightDeltaMS.ContainsEstimates could be true if bothDeltaMS.ContainsEstimates comes as true. So maybe we shouldn't always return rightDeltaMS.ContainsEstimates as 0?

Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @giorgosp and @tbg)


pkg/cli/debug.go, line 627 at r5 (raw file):

					failed = true
				} else {
					ms.ContainsEstimates = 1

why set this to one? Looks like you'd just leave it alone.


pkg/storage/replica_raft.go, line 2285 at r5 (raw file):
Add a TODO here that we can remove this at some point in the future

// TODO(tbg): splits reset ContainsEstimates as of 19.2, so this can be removed at some point in the future.


pkg/storage/batcheval/cmd_add_sstable.go, line 102 at r5 (raw file):

	//
	// If we are running the old ContainsEstimates version (boolean), we just set it to 1.
	// Otherwise we make it bigger than one to indicate that we are running the newer version.

You're going to have this in a few places, so I'd say add an explanation of these tricky semantics on cluster.VersionContainsEstimatesCounter and then in every other place like here, add something like

_ = cluster.VersionContainsEstimatesCounter // see for info on ContainsEstimates migration

This allows navigating quickly to the explanation without duplicating the explanation in various places.

I think we should also explore centralizing this behavior. Let's say we want to avoid forcing each command that introduces an estimate to do this dance. We should be able to put code into the replica proposal path that does this check only once and massages the output to 1 if the cluster version is not active yet. I think that might be preferable.


pkg/storage/batcheval/cmd_end_transaction.go, line 1049 at r1 (raw file):

Previously, giorgosp (George Papadrosou) wrote…

I don't really understand what is going on here. I have to read up on this. However, seems like rightDeltaMS.ContainsEstimates could be true if bothDeltaMS.ContainsEstimates comes as true. So maybe we shouldn't always return rightDeltaMS.ContainsEstimates as 0?

The high level overview is this: we want to split the replica and so we need to split the stats. We want to recompute as little as possible because for performance reasons. If the range has no estimates, things are "easy" because we can just recompute one half and use math for the other half. We're also writing new data into the ranges in this method, so we have to track their contributions. If the initial range had estimates, we need to recompute both initially.

a few hours later... just a heads up that I have a review open on this but I got my brain melted a bit by these split computations -- I'm going to send a PR in the next couple of days that should clarify these computations a bunch. I'll ping you on the PR. Will review the rest later but generally looks good, I left some comments above this line.

@tbg
Copy link
Member

tbg commented Jun 6, 2019

Here's the PR: #38084

tbg added a commit to tbg/cockroach that referenced this pull request Jun 6, 2019
Pulling out the stats-related computations into a helper taylored
exactly to what happens during a split provides a convenient venue for
lengthy rumination on the finer points of said stats computations, while
decluttering the split trigger's main method.

This was motivated by cockroachdb#37583, which needs to adjust some of this code.
Now that it's (hopefully) more accessible, it should be more straight-
forward to figure out what exactly to do.

Release note: None
@giorgosp giorgosp force-pushed the migrate-containsestimates-int64 branch from 7de5cd2 to 51728de Compare June 7, 2019 17:05
Copy link
Contributor Author

@giorgosp giorgosp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I ll rebase after it gets merged

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @giorgosp and @tbg)


pkg/cli/debug.go, line 627 at r5 (raw file):

Previously, tbg (Tobias Grieger) wrote…

why set this to one? Looks like you'd just leave it alone.

It was true on master and I accidentally removed it in a commit. https://github.com/cockroachdb/cockroach/blob/master/pkg/cli/debug.go#L624


pkg/storage/replica_raft.go, line 2285 at r5 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Add a TODO here that we can remove this at some point in the future

// TODO(tbg): splits reset ContainsEstimates as of 19.2, so this can be removed at some point in the future.

Done.

tbg added a commit to tbg/cockroach that referenced this pull request Jun 7, 2019
Pulling out the stats-related computations into a helper taylored
exactly to what happens during a split provides a convenient venue for
lengthy rumination on the finer points of said stats computations, while
decluttering the split trigger's main method.

This was motivated by cockroachdb#37583, which needs to adjust some of this code.
Now that it's (hopefully) more accessible, it should be more straight-
forward to figure out what exactly to do.

Release note: None
@giorgosp giorgosp force-pushed the migrate-containsestimates-int64 branch 2 times, most recently from 9c71899 to 1dcd2a4 Compare September 8, 2019 22:31
@giorgosp giorgosp force-pushed the migrate-containsestimates-int64 branch 2 times, most recently from 8b8be0c to d393a68 Compare September 16, 2019 07:14
Copy link
Contributor Author

@giorgosp giorgosp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tbg, @nvanbenschoten, I have rebased this PR and added some tests for the ContainsEstimates handling during command proposal and application.

I have also added a change to batcheval.RecomputeStats but I am not sure if just checking for the cluster version is enough. Should we also check the loaded mvcc stats and if ContainsEstimates=1, keep it 1 in the returned delta?
Actually, I am not sure if the RecomputeStats command goes through Raft so that ContainsEstimates is handled as in the other commands and we won't have to do something special.

I also see that commits need a "Release justification" message, not sure what to write in this (and if this PR will make it to 19.2 anyway).

Thanks!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten and @tbg)


pkg/storage/replica_proposal.go, line 728 at r13 (raw file):

Previously, tbg (Tobias Grieger) wrote…

// Encode that this command (and any that follow) uses regular arithmetic for ContainsEstimates
// by making sure ContainsEstimates is > 1.
// This will be interpreted during command application.

Done.


pkg/storage/replica_proposal.go, line 732 at r13 (raw file):

Previously, tbg (Tobias Grieger) wrote…

// This range may still need to have its commands processed by nodes which treat ContainsEstimates
// as a bool, so clamp it to {0,1}. This enables use of bool semantics in command application.

Done.

Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @giorgosp! You're right that this PR will not make 19.2; we're in the release stabilization period and won't merge anything that can wait until the 20.1 cycle (ie. in a few weeks or so). For the same reason we also won't be able to give this PR more attention right now; there are some fires that need to be put out before we release. That's unfortunate (I'd love for you to be able to get this off your plate) but realistically it won't happen. I'll come back to this PR when the new cycle starts, and will get it in. If you're available to help out then great, otherwise I'm just going to amend your PR - pretty sure not much is left to be done, but I'd need to re-review before the merge.

Reviewed 21 of 21 files at r14.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten and @tbg)

@tbg tbg self-requested a review September 23, 2019 09:54
Copy link
Contributor Author

@giorgosp giorgosp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem! Looking forward to contribute to the next cycle

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten and @tbg)

@giorgosp giorgosp force-pushed the migrate-containsestimates-int64 branch 2 times, most recently from 1af0fe7 to c2d2fa5 Compare October 30, 2019 23:18
Copy link
Contributor Author

@giorgosp giorgosp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tbg ping, rebased this PR :)

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten and @tbg)


pkg/storage/batcheval/cmd_recompute_stats.go, line 103 at r15 (raw file):

		// means some extra engine churn.
		cArgs.Stats.Add(delta)
		if !cArgs.EvalCtx.ClusterSettings().Version.IsActive(cluster.VersionContainsEstimatesCounter) {

Does this change make sense? I could also add a unit test for it.

@tbg tbg force-pushed the migrate-containsestimates-int64 branch 3 times, most recently from 7d01b41 to c0da358 Compare November 4, 2019 14:58
Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I made a few changes (to your fork), please make sure you don't accidentally push over them; git fetch --all && git reset --hard @{upstream} should get you to the right place:

  1. grab a missing lock in the test (CI was red because of this)
  2. don't truncate ContainsEstimates to {0,1} when it's negative. Instead, when it's negative and the version isn't active, refuse the command (as a safety check: the only caller emitting negative estimates is RecomputeStats, and that does the check already)
  3. update an existing test to catch the problem fixed in 2 (the problem was that the recomputation would be applied with the factor of 2x).
  4. rebase (with some fallout, but nothing serious)
  5. make sure the split recomputation actually works, by making sure the legacy if split != nil reset path is only taken when there isn't a delta in the incoming split (and making the split trigger emit a wrong delta manually once)

I'll give this another pass tomorrow, but it's ready to merge and I'll do so when I've looked again.

Reviewed 12 of 21 files at r15, 10 of 10 files at r16.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @giorgosp)


pkg/storage/batcheval/cmd_recompute_stats.go, line 103 at r15 (raw file):

Previously, giorgosp (George Papadrosou) wrote…

Does this change make sense? I could also add a unit test for it.

This preserves the old behavior, so 👍

@tbg tbg force-pushed the migrate-containsestimates-int64 branch 2 times, most recently from 64038e2 to b11f067 Compare November 5, 2019 08:59
Copy link
Contributor Author

@giorgosp giorgosp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok @tbg thanks for jumping in! I went through the changes. I won't manage to have a unit test for RecomputeStats ready in the next couple of days, but I think the negative check you added for ContainsEstimates is a safety net, right?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten and @tbg)

@tbg tbg force-pushed the migrate-containsestimates-int64 branch from b11f067 to 9e63ec8 Compare November 6, 2019 13:02
@giorgosp giorgosp requested a review from a team as a code owner November 6, 2019 13:02
This migration makes ContainsEstimates a counter so that the
consistency checker can reset it (by returning a -ContainsEstimates)
delta) without racing with another command that introduces new
estimate stats.

Resolves cockroachdb#37120

Release note: None
@tbg tbg force-pushed the migrate-containsestimates-int64 branch from 9e63ec8 to 35bdbf0 Compare November 6, 2019 13:22
@tbg
Copy link
Member

tbg commented Nov 6, 2019

bors r=tbg

@craig
Copy link
Contributor

craig bot commented Nov 6, 2019

Build failed

@tbg
Copy link
Member

tbg commented Nov 6, 2019

uitest failed during lint. Is this a known flake?

[15:45:57]
[Test web UI] HeadlessChrome 78.0.3904 (Linux 0.0.0): Executed 130 of 320 SUCCESS (0 secs / 3.073 secs)
[15:45:57]
[Test web UI]  [1A [2KHeadlessChrome 78.0.3904 (Linux 0.0.0) Query Management Saga integration tests AUTO_REFRESH/STOP_AUTO_REFRESH action sets inRunning flag on reducer when query is running. FAILED
[15:45:57]
[Test web UI] 	AssertionError: expected 2 to equal 1
[15:45:57]
[Test web UI] 	    at src/redux/queryManager/saga.spec.ts:20980:34
[15:45:57]
[Test web UI] HeadlessChrome 78.0.3904 (Linux 0.0.0): Executed 131 of 320 (1 FAILED) (0 secs / 3.092 secs)

@tbg
Copy link
Member

tbg commented Nov 6, 2019

bors r=tbg

craig bot pushed a commit that referenced this pull request Nov 6, 2019
37583: storage: Migrate MVCCStats.contains_estimates from bool to int64 r=tbg a=giorgosp

This migration allows ContainsEstimates to be reset after a stats
recomputation (by returning a -ContainsEstimates delta) without
worrying about a race condition. Another command may add 1 to it
flag and it will still be stored as valid.

Resolves #37120

Release note: None

42129: colexec: fix AND and OR projections in some cases r=yuzefovich a=yuzefovich

Previously, the original batch length was not respected when the
selection vector is present. This resulted in, for example, query 19 of
TPCH benchmark to return an error. This is now fixed.

I have troubles coming up with a reduced reproduction though.

I also checked that on release-19.2 branch the query is executed
correctly with vectorized, so it must be the switch to flat bytes that
triggers the problem.

Release note: None

42172: colexec: fix sorted distinct with nulls behavior r=yuzefovich a=yuzefovich

Previously, sorted distinct when the nulls might be present would always
get the value at the index without paying attention whether that value
is actually now. This is incorrect behavior because it is undefined in
some cases (like when getting from flat bytes). Now this is fixed.

Fixes: #42055.

Release note: None

Co-authored-by: George Papadrosou <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
@craig
Copy link
Contributor

craig bot commented Nov 6, 2019

Build succeeded

@craig craig bot merged commit 35bdbf0 into cockroachdb:master Nov 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

storage: provide a way to reset ContainsEstimates via stats recomputation
4 participants