-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql/stats: add cluster setting to disable generation of stats forecasts #86932
Conversation
Note that the first commit is from #86834 and can be ignored. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 15 of 15 files at r1, 6 of 6 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @msirek)
TFTR! bors r=rytaft |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice change. I have some comments. You may wish to hold off on the merge.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @michae2 and @rytaft)
-- commits
line 15 at r1:
nit: maybe mention that both the session setting and cluster setting must be true for stats forecasts to be used, so it's clear in docs.
pkg/sql/opt/memo/memo.go
line 192 at r1 (raw file):
reorderJoinsLimit: int(evalCtx.SessionData().ReorderJoinsLimit), zigzagJoinEnabled: evalCtx.SessionData().ZigzagJoinEnabled, useForecasts: evalCtx.SessionData().OptimizerUseForecasts,
Since the stats in the stats cache now depends on both the setting of evalCtx.SessionData().OptimizerUseForecasts
and cluster settingUseStatisticsForecasts
, it looks like this line should be:
useForecasts: evalCtx.SessionData().OptimizerUseForecasts && forecastAllowed(clusterSettings)
pkg/sql/opt/memo/memo.go
line 328 at r1 (raw file):
if m.reorderJoinsLimit != int(evalCtx.SessionData().ReorderJoinsLimit) || m.zigzagJoinEnabled != evalCtx.SessionData().ZigzagJoinEnabled || m.useForecasts != evalCtx.SessionData().OptimizerUseForecasts ||
Similar to the above comment, if the cluster setting changes, we want the query plan to be regenerated instead of using the cached plan, so it should be something like:
m.useForecasts != (evalCtx.SessionData().OptimizerUseForecasts && forecastAllowed(clusterSettings) ||
pkg/sql/stats/stats_cache.go
line 294 at r2 (raw file):
// forecastAllowed returns true if statistics forecasting is allowed for the // given table. func forecastAllowed(table catalog.TableDescriptor, clusterSettings *cluster.Settings) bool {
Remove parameter table
. It is unused.
Build failed (retrying...): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, thank you for taking a look! I'll hold off while we review.
bors r-
I think some of the things you point out will make more sense when the next PR, adding a per-table setting to enable or disable forecasts, is opened.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @msirek and @rytaft)
Previously, msirek (Mark Sirek) wrote…
nit: maybe mention that both the session setting and cluster setting must be true for stats forecasts to be used, so it's clear in docs.
Good point!
pkg/sql/opt/memo/memo.go
line 192 at r1 (raw file):
Previously, msirek (Mark Sirek) wrote…
Since the stats in the stats cache now depends on both the setting of
evalCtx.SessionData().OptimizerUseForecasts
and cluster settingUseStatisticsForecasts
, it looks like this line should be:useForecasts: evalCtx.SessionData().OptimizerUseForecasts && forecastAllowed(clusterSettings)
This useForecasts
is similar to useHistograms
and useMultiColStats
: it represents whether we want this particular query to use forecasts, regardless of whether they are available or not. The same is true of useHistograms
and useMultiColStats
.
In the next PR, forecastAllowed
will be a per-table function, so it won't make sense to call here (in a per-query setting).
pkg/sql/opt/memo/memo.go
line 328 at r1 (raw file):
Previously, msirek (Mark Sirek) wrote…
Similar to the above comment, if the cluster setting changes, we want the query plan to be regenerated instead of using the cached plan, so it should be something like:
m.useForecasts != (evalCtx.SessionData().OptimizerUseForecasts && forecastAllowed(clusterSettings) ||
Invalidation when table statistics change is already handled by the call to m.Metadata().CheckDependencies
below. We only need to invalidate the query plan when statistics actually change (which might not necessarily happen when enabling or disabling forecasts) and this is handled by the call to optTable.Equals
inside CheckDependencies
.
pkg/sql/stats/stats_cache.go
line 294 at r2 (raw file):
Previously, msirek (Mark Sirek) wrote…
Remove parameter
table
. It is unused.
table
will be used by the next PR, adding a per-table setting to fix #86353 (which will be opened shortly). I included it here to make sure I was setting everything up to also work with a per-table setting.
Canceled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 7 of 7 files at r3.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @michae2 and @msirek)
pkg/sql/opt/memo/memo.go
line 192 at r1 (raw file):
Previously, michae2 (Michael Erickson) wrote…
This
useForecasts
is similar touseHistograms
anduseMultiColStats
: it represents whether we want this particular query to use forecasts, regardless of whether they are available or not. The same is true ofuseHistograms
anduseMultiColStats
.In the next PR,
forecastAllowed
will be a per-table function, so it won't make sense to call here (in a per-query setting).
I see. Thanks for the explanation.
pkg/sql/opt/memo/memo.go
line 328 at r1 (raw file):
Previously, michae2 (Michael Erickson) wrote…
Invalidation when table statistics change is already handled by the call to
m.Metadata().CheckDependencies
below. We only need to invalidate the query plan when statistics actually change (which might not necessarily happen when enabling or disabling forecasts) and this is handled by the call tooptTable.Equals
insideCheckDependencies
.
OK, I see. It looks like this following code would handle it:
cockroach/pkg/sql/opt_catalog.go
Lines 1084 to 1092 in 1540ea8
// Verify the stats are identical. | |
if len(ot.stats) != len(otherTable.stats) { | |
return false | |
} | |
for i := range ot.stats { | |
if !ot.stats[i].equals(&otherTable.stats[i]) { | |
return false | |
} | |
} |
When the cluster setting is changed,
dataSourceForTable
would get a different length list of table stats.
pkg/sql/stats/stats_cache.go
line 294 at r2 (raw file):
Previously, michae2 (Michael Erickson) wrote…
table
will be used by the next PR, adding a per-table setting to fix #86353 (which will be opened shortly). I included it here to make sure I was setting everything up to also work with a per-table setting.
OK
Add a new cluster setting `sql.stats.forecasts.enabled` that can be used to disable generation of statistics forecasts in the stats cache. Changing this setting will cause the stats cache to gradually evict and reload all cached statistics as they are requested. Fixes: cockroachdb#86350 Release justification: Low-risk update to new functionality. Release note (sql change): Add a new cluster setting `sql.stats.forecasts.enabled` which controls whether statistics forecasts are generated by default for all tables. (Note that this is different from the session setting `optimizer_use_forecasts` which controls whether statistics forecasts are used when optimizing the current query. If forecasting is disabled, then even if `optimizer_use_forecasts` is true for a given query it won't have any forecasts to use.)
TFTRs! bors r=rytaft,msirek |
Add a new storage parameter (a.k.a. table setting) which can be used to override cluster setting `sql.stats.forecasts.enabled`. This builds on work done earlier in cockroachdb#79025 and cockroachdb#86932 so we don't have to do much, just wire it all together. Fixes: cockroachdb#86353 Release justification: Low-risk update to new functionality. Release note (sql change): Add a new table setting `sql_stats_forecasts_enabled` which controls whether statistics forecasts are generated for a specific table. When set, this overrides cluster setting `sql.stats.forecasts.enabled`.
Build failed: |
Trying again. bors r=rytaft,msirek |
Add a new storage parameter (a.k.a. table setting) which can be used to override cluster setting `sql.stats.forecasts.enabled`. This builds on work done earlier in cockroachdb#79025 and cockroachdb#86932 so we don't have to do much, just wire it all together. Fixes: cockroachdb#86353 Release justification: Low-risk update to new functionality. Release note (sql change): Add a new table setting `sql_stats_forecasts_enabled` which controls whether statistics forecasts are generated for a specific table. When set, this overrides cluster setting `sql.stats.forecasts.enabled`.
Build succeeded: |
86986: sql/stats: add table setting to disable generation of stats forecasts r=rytaft,msirek a=michae2 Add a new storage parameter (a.k.a. table setting) which can be used to override cluster setting `sql.stats.forecasts.enabled`. This builds on work done earlier in #79025 and #86932 so we don't have to do much, just wire it all together. Fixes: #86353 Release justification: Low-risk update to new functionality. Release note (sql change): Add a new table setting `sql_stats_forecasts_enabled` which controls whether statistics forecasts are generated for a specific table. When set, this overrides cluster setting `sql.stats.forecasts.enabled`. Co-authored-by: Michael Erickson <[email protected]>
Add a new cluster setting
sql.stats.forecasts.enabled
that can beused to disable generation of statistics forecasts in the stats cache.
Changing this setting will cause the stats cache to gradually evict and
reload all cached statistics as they are requested.
Fixes: #86350
Release justification: Low-risk update to new functionality.
Release note (sql change): Add a new cluster setting
sql.stats.forecasts.enabled
which controls whether statisticsforecasts are generated by default for all tables. (Note that this is
different from the session setting
optimizer_use_forecasts
whichcontrols whether statistics forecasts are used when optimizing the
current query. If forecasting is disabled, then even if
optimizer_use_forecasts
is true for a given query it won't have anyforecasts to use.)