Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: improve ColStatsMap so it's harder to misuse #62531

Open
rytaft opened this issue Mar 24, 2021 · 3 comments
Open

opt: improve ColStatsMap so it's harder to misuse #62531

rytaft opened this issue Mar 24, 2021 · 3 comments
Labels
A-sql-optimizer SQL logical planning and optimizations. C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-queries SQL Queries Team

Comments

@rytaft
Copy link
Collaborator

rytaft commented Mar 24, 2021

There have been at least two cases of subtle bugs (#62289 and #37953) caused by misuse of ColStatsMap.Lookup(), ColStatsMap.Add(), or ColStatsMap.Get(). All of these functions return pointers to a ColumnStatistic, but as the function comment says in each case:

// The returned *ColumnStatistic is only valid until this ColStatsMap is
// updated via another call to Add() or RemoveIntersecting(). At that
// point, the address of the statistic may have changed, so it must be
// fetched again using Lookup() or Get().

Although this comment is useful, there is nothing enforcing the requirement that variables containing column statistics fetched from these functions are no longer used after a call to Add() or RemoveIntersecting(). It's easy to forget this requirement when updating the code in statistics_builder.go.

We should improve this interface so that this requirement is enforced in some way, or change ColStatsMap to remove the requirement (for example, by changing the other field in ColStatsMap to store a []*ColumnStatistic instead of a []ColumnStatistic).

Jira issue: CRDB-2774

@rytaft rytaft added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. A-sql-optimizer SQL logical planning and optimizations. labels Mar 24, 2021
@rytaft
Copy link
Collaborator Author

rytaft commented Mar 24, 2021

Another suggestion from #37983 (review):

Are you worried that we'll continue having this kind of bug in the future, given the subtlety of it? I wonder if we should change ColStatsMap and colStatVal to directly reference *ColumnStatistic rather than keeping a pos into the initial array and other slice. This would require us to allocate ColumnStatistic objects on the heap once we got past the initial 3, but that's an uncommon case, so probably wouldn't have measurable impact on perf.

@mgartner
Copy link
Collaborator

mgartner commented Mar 24, 2021

And another suggestion from #62507 (review):

At the least, maybe we could add a crdb_test modifier to the function (Add) that makes it always update the address of the ColumnStatistic. I'm thinking that would have caught this bug earlier on because many columns wouldn't have been required to produce it.

@jlinder jlinder added the T-sql-queries SQL Queries Team label Jun 16, 2021
@github-actions
Copy link

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
10 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

@github-project-automation github-project-automation bot moved this to Triage in SQL Queries Aug 30, 2023
@mgartner mgartner moved this from Triage to New Backlog in SQL Queries Aug 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-optimizer SQL logical planning and optimizations. C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-queries SQL Queries Team
Projects
Status: Backlog
Development

No branches or pull requests

3 participants