release-22.2: stats: fix buckets for INT2 and INT4 #88214

Merged
merged 1 commit into from
Sep 21, 2022

Conversation

blathers-crl[bot]

@blathers-crl blathers-crl bot commented Sep 20, 2022

Backport 1/1 commits from #88083 on behalf of @yuzefovich.

/cc @cockroachdb/release


Previously, when we needed to create "outer" histogram buckets for INT2 and INT4 columns (which happens when the minimum and maximum values in the column weren't sampled but still contributed to the distinct count), we would use bounds that exceeded the supported range of those types. This could lead to incorrect estimates when those "outer" buckets were later used during costing, and it meant that such histograms had to be manually edited before they could be injected. This is now fixed by handling these two types separately.
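
To illustrate the idea of keeping outer bucket bounds inside the column type's range, here is a minimal sketch in Go. It is not the code from this PR: the helper names `intRangeForWidth` and `clampOuterBound` are hypothetical, and the sketch assumes the column width is available as a bit count (16 for INT2, 32 for INT4, anything else treated as INT8).

```go
package main

import "math"

// intRangeForWidth returns the smallest and largest values representable by
// an integer column of the given bit width. Hypothetical helper: widths other
// than 16 and 32 are treated as 64-bit here.
func intRangeForWidth(width int32) (lo, hi int64) {
	switch width {
	case 16:
		return math.MinInt16, math.MaxInt16
	case 32:
		return math.MinInt32, math.MaxInt32
	default:
		return math.MinInt64, math.MaxInt64
	}
}

// clampOuterBound restricts a candidate outer bucket bound to the range
// supported by the column type, so that an INT2 or INT4 histogram never
// carries a bound that the type itself cannot hold.
func clampOuterBound(v int64, width int32) int64 {
	lo, hi := intRangeForWidth(width)
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}
```

Under these assumptions, a candidate upper bound of 40000 on an INT2 column would be clamped to 32767, while the same value on an INT4 or INT8 column would pass through unchanged.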

Fixes: #76887.

Release note: None


Release justification: bug fix.

@blathers-crl blathers-crl bot requested a review from a team as a code owner September 20, 2022 03:47
@blathers-crl blathers-crl bot force-pushed the blathers/backport-release-22.2-88083 branch from c38fdeb to 9edd5be Compare September 20, 2022 03:47
@blathers-crl
Author

blathers-crl bot commented Sep 20, 2022

Thanks for opening a backport.

Please check the backport criteria before merging:

  • Patches should only be created for serious issues or test-only changes.
  • Patches should not break backwards-compatibility.
  • Patches should change as little code as possible.
  • Patches should not change on-disk formats or node communication protocols.
  • Patches should not add new functionality.
  • Patches must not add, edit, or otherwise modify cluster versions; or add version gates.
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
  • There is a high priority need for the functionality that cannot wait until the next release and is difficult to address in another way.
  • The new functionality is additive-only and only runs for clusters which have specifically “opted in” to it (e.g. by a cluster setting).
  • New code is protected by a conditional check that is trivial to verify and ensures that it only runs for opt-in clusters.
  • The PM and TL on the team that owns the changed code have signed off that the change obeys the above rules.

Add a brief release justification to the body of your PR to justify this backport.

Some other things to consider:

  • What did we do to ensure that a user that doesn’t know & care about this backport, has no idea that it happened?
  • Will this work in a cluster of mixed patch versions? Did we test that?
  • If a user upgrades a patch version, uses this feature, and then downgrades, what happens?

@blathers-crl blathers-crl bot added blathers-backport This is a backport that Blathers created automatically. O-robot Originated from a bot. labels Sep 20, 2022
@cockroach-teamcity
Member

This change is Reviewable

@yuzefovich
Member

Looks like TestAdjustCounts/random fails under race, so I'll need to take a look at that first, before merging this.

Collaborator

@michae2 michae2 left a comment

:lgtm:

Reviewed 4 of 10 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @rytaft and @yuzefovich)

@yuzefovich yuzefovich force-pushed the blathers/backport-release-22.2-88083 branch from 9edd5be to 2e0f4e9 Compare September 20, 2022 20:18
@yuzefovich
Member

Squashed #88261 into this commit, will wait for @rytaft to approve as EM.

Collaborator

@rytaft rytaft left a comment

:lgtm:

Reviewed 9 of 10 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @blathers-crl[bot] and @yuzefovich)


pkg/sql/stats/histogram.go line 365 at r2 (raw file):

		switch t.Width() {
		case 16:
			if i == math.MinInt16 {

What if i is less than MinInt16? Can that ever happen? Same question below for 32 and for MaxInt16 / MaxInt32 also

(sorry I didn't review the PR on master -- guessing this is fine, but just wanted to check...)

Member

@yuzefovich yuzefovich left a comment

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @rytaft)


pkg/sql/stats/histogram.go line 365 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

What if i is less than MinInt16? Can that ever happen? Same question below for 32 and for MaxInt16 / MaxInt32 also

(sorry I didn't review the PR on master -- guessing this is fine, but just wanted to check...)

Hm, this shouldn't really happen because it would imply that we sampled such a value which would mean that we didn't perform the out of bounds check on the write path. However, I'm not confident that we actually do those checks in all places, so it'd be worth being conservative here. I'll open up a PR for this - thanks!
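
A conservative version of the check discussed here could compare with an inequality instead of equality, so that a sampled value already outside the type's range is treated the same as one sitting exactly on the boundary. The following is only a sketch of that idea, not the follow-up change itself; `hasRoomBelow` and `hasRoomAbove` are hypothetical names, and the width is assumed to be a bit count as in the `t.Width()` switch quoted above.

```go
package main

import "math"

// hasRoomBelow reports whether the column type can hold a value strictly
// smaller than i, the smallest sampled value. Using > (rather than != against
// the minimum) means an out-of-range i also yields false.
func hasRoomBelow(i int64, width int32) bool {
	switch width {
	case 16:
		return i > math.MinInt16
	case 32:
		return i > math.MinInt32
	default:
		return i > math.MinInt64
	}
}

// hasRoomAbove is the mirror image for the largest sampled value.
func hasRoomAbove(i int64, width int32) bool {
	switch width {
	case 16:
		return i < math.MaxInt16
	case 32:
		return i < math.MaxInt32
	default:
		return i < math.MaxInt64
	}
}
```

With this shape, a lower outer bucket would only be considered when `hasRoomBelow` returns true, which covers both `i == math.MinInt16` and the unexpected `i < math.MinInt16` case raised in the review.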

Collaborator

@rytaft rytaft left a comment

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @yuzefovich)


pkg/sql/stats/histogram.go line 365 at r2 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Hm, this shouldn't really happen because it would imply that we sampled such a value which would mean that we didn't perform the out of bounds check on the write path. However, I'm not confident that we actually do those checks in all places, so it'd be worth being conservative here. I'll open up a PR for this - thanks!

Thanks!

Previously, when we needed to create "outer" histogram buckets for INT2
and INT4 columns (which happens when the minimum and maximum values in
the column weren't sampled but still contributed to the distinct count),
we would use bounds that exceeded the supported range of those types.
This could lead to incorrect estimates when those "outer" buckets were
later used during costing, and it meant that such histograms had to be
manually edited before they could be injected. This is now fixed by
handling these two types separately.

Release note: None
@yuzefovich yuzefovich force-pushed the blathers/backport-release-22.2-88083 branch from 2e0f4e9 to 9da6f98 Compare September 20, 2022 23:09
@yuzefovich
Member

Squashed #88300 into this commit.

@yuzefovich yuzefovich merged commit ac55929 into release-22.2 Sep 21, 2022
@yuzefovich yuzefovich deleted the blathers/backport-release-22.2-88083 branch September 21, 2022 00:03
Labels
blathers-backport This is a backport that Blathers created automatically. O-robot Originated from a bot.

4 participants