Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql/stats: add cluster setting to disable generation of stats forecasts #86932

Merged
merged 1 commit into from
Aug 28, 2022

Conversation

michae2
Copy link
Collaborator

@michae2 michae2 commented Aug 26, 2022

Add a new cluster setting sql.stats.forecasts.enabled that can be
used to disable generation of statistics forecasts in the stats cache.
Changing this setting will cause the stats cache to gradually evict and
reload all cached statistics as they are requested.

Fixes: #86350

Release justification: Low-risk update to new functionality.

Release note (sql change): Add a new cluster setting
sql.stats.forecasts.enabled which controls whether statistics
forecasts are generated by default for all tables. (Note that this is
different from the session setting optimizer_use_forecasts which
controls whether statistics forecasts are used when optimizing the
current query. If forecasting is disabled, then even if
optimizer_use_forecasts is true for a given query it won't have any
forecasts to use.)

@michae2 michae2 requested review from rytaft, msirek and a team August 26, 2022 06:23
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@michae2
Copy link
Collaborator Author

michae2 commented Aug 26, 2022

Note that the first commit is from #86834 and can be ignored.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 15 of 15 files at r1, 6 of 6 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @msirek)

@michae2
Copy link
Collaborator Author

michae2 commented Aug 26, 2022

TFTR!

bors r=rytaft

Copy link
Contributor

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice change. I have some comments. You may wish to hold off on the merge.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @michae2 and @rytaft)


-- commits line 15 at r1:
nit: maybe mention that both the session setting and cluster setting must be true for stats forecasts to be used, so it's clear in docs.


pkg/sql/opt/memo/memo.go line 192 at r1 (raw file):

		reorderJoinsLimit:                      int(evalCtx.SessionData().ReorderJoinsLimit),
		zigzagJoinEnabled:                      evalCtx.SessionData().ZigzagJoinEnabled,
		useForecasts:                           evalCtx.SessionData().OptimizerUseForecasts,

Since the stats in the stats cache now depends on both the setting of evalCtx.SessionData().OptimizerUseForecasts and cluster settingUseStatisticsForecasts, it looks like this line should be:

useForecasts:                           evalCtx.SessionData().OptimizerUseForecasts && forecastAllowed(clusterSettings)

pkg/sql/opt/memo/memo.go line 328 at r1 (raw file):

	if m.reorderJoinsLimit != int(evalCtx.SessionData().ReorderJoinsLimit) ||
		m.zigzagJoinEnabled != evalCtx.SessionData().ZigzagJoinEnabled ||
		m.useForecasts != evalCtx.SessionData().OptimizerUseForecasts ||

Similar to the above comment, if the cluster setting changes, we want the query plan to be regenerated instead of using the cached plan, so it should be something like:

m.useForecasts != (evalCtx.SessionData().OptimizerUseForecasts && forecastAllowed(clusterSettings) ||

pkg/sql/stats/stats_cache.go line 294 at r2 (raw file):

// forecastAllowed returns true if statistics forecasting is allowed for the
// given table.
func forecastAllowed(table catalog.TableDescriptor, clusterSettings *cluster.Settings) bool {

Remove parameter table. It is unused.

@craig
Copy link
Contributor

craig bot commented Aug 26, 2022

Build failed (retrying...):

Copy link
Collaborator Author

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, thank you for taking a look! I'll hold off while we review.

bors r-

I think some of the things you point out will make more sense when the next PR, adding a per-table setting to enable or disable forecasts, is opened.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @msirek and @rytaft)


-- commits line 15 at r1:

Previously, msirek (Mark Sirek) wrote…

nit: maybe mention that both the session setting and cluster setting must be true for stats forecasts to be used, so it's clear in docs.

Good point!


pkg/sql/opt/memo/memo.go line 192 at r1 (raw file):

Previously, msirek (Mark Sirek) wrote…

Since the stats in the stats cache now depends on both the setting of evalCtx.SessionData().OptimizerUseForecasts and cluster settingUseStatisticsForecasts, it looks like this line should be:

useForecasts:                           evalCtx.SessionData().OptimizerUseForecasts && forecastAllowed(clusterSettings)

This useForecasts is similar to useHistograms and useMultiColStats: it represents whether we want this particular query to use forecasts, regardless of whether they are available or not. The same is true of useHistograms and useMultiColStats.

In the next PR, forecastAllowed will be a per-table function, so it won't make sense to call here (in a per-query setting).


pkg/sql/opt/memo/memo.go line 328 at r1 (raw file):

Previously, msirek (Mark Sirek) wrote…

Similar to the above comment, if the cluster setting changes, we want the query plan to be regenerated instead of using the cached plan, so it should be something like:

m.useForecasts != (evalCtx.SessionData().OptimizerUseForecasts && forecastAllowed(clusterSettings) ||

Invalidation when table statistics change is already handled by the call to m.Metadata().CheckDependencies below. We only need to invalidate the query plan when statistics actually change (which might not necessarily happen when enabling or disabling forecasts) and this is handled by the call to optTable.Equals inside CheckDependencies.


pkg/sql/stats/stats_cache.go line 294 at r2 (raw file):

Previously, msirek (Mark Sirek) wrote…

Remove parameter table. It is unused.

table will be used by the next PR, adding a per-table setting to fix #86353 (which will be opened shortly). I included it here to make sure I was setting everything up to also work with a per-table setting.

@craig
Copy link
Contributor

craig bot commented Aug 26, 2022

Canceled.

Copy link
Contributor

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 7 of 7 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @michae2 and @msirek)


pkg/sql/opt/memo/memo.go line 192 at r1 (raw file):

Previously, michae2 (Michael Erickson) wrote…

This useForecasts is similar to useHistograms and useMultiColStats: it represents whether we want this particular query to use forecasts, regardless of whether they are available or not. The same is true of useHistograms and useMultiColStats.

In the next PR, forecastAllowed will be a per-table function, so it won't make sense to call here (in a per-query setting).

I see. Thanks for the explanation.


pkg/sql/opt/memo/memo.go line 328 at r1 (raw file):

Previously, michae2 (Michael Erickson) wrote…

Invalidation when table statistics change is already handled by the call to m.Metadata().CheckDependencies below. We only need to invalidate the query plan when statistics actually change (which might not necessarily happen when enabling or disabling forecasts) and this is handled by the call to optTable.Equals inside CheckDependencies.

OK, I see. It looks like this following code would handle it:

// Verify the stats are identical.
if len(ot.stats) != len(otherTable.stats) {
return false
}
for i := range ot.stats {
if !ot.stats[i].equals(&otherTable.stats[i]) {
return false
}
}

When the cluster setting is changed, dataSourceForTable would get a different length list of table stats.


pkg/sql/stats/stats_cache.go line 294 at r2 (raw file):

Previously, michae2 (Michael Erickson) wrote…

table will be used by the next PR, adding a per-table setting to fix #86353 (which will be opened shortly). I included it here to make sure I was setting everything up to also work with a per-table setting.

OK

Add a new cluster setting `sql.stats.forecasts.enabled` that can be
used to disable generation of statistics forecasts in the stats cache.
Changing this setting will cause the stats cache to gradually evict and
reload all cached statistics as they are requested.

Fixes: cockroachdb#86350

Release justification: Low-risk update to new functionality.

Release note (sql change): Add a new cluster setting
`sql.stats.forecasts.enabled` which controls whether statistics
forecasts are generated by default for all tables. (Note that this is
different from the session setting `optimizer_use_forecasts` which
controls whether statistics forecasts are used when optimizing the
current query. If forecasting is disabled, then even if
`optimizer_use_forecasts` is true for a given query it won't have any
forecasts to use.)
@michae2
Copy link
Collaborator Author

michae2 commented Aug 27, 2022

TFTRs!

bors r=rytaft,msirek

michae2 added a commit to michae2/cockroach that referenced this pull request Aug 27, 2022
Add a new storage parameter (a.k.a. table setting) which can be used to
override cluster setting `sql.stats.forecasts.enabled`. This builds on
work done earlier in cockroachdb#79025 and cockroachdb#86932 so we don't have to do much, just
wire it all together.

Fixes: cockroachdb#86353

Release justification: Low-risk update to new functionality.

Release note (sql change): Add a new table setting
`sql_stats_forecasts_enabled` which controls whether statistics
forecasts are generated for a specific table. When set, this overrides
cluster setting `sql.stats.forecasts.enabled`.
@craig
Copy link
Contributor

craig bot commented Aug 27, 2022

Build failed:

@michae2
Copy link
Collaborator Author

michae2 commented Aug 28, 2022

Trying again.

bors r=rytaft,msirek

michae2 added a commit to michae2/cockroach that referenced this pull request Aug 28, 2022
Add a new storage parameter (a.k.a. table setting) which can be used to
override cluster setting `sql.stats.forecasts.enabled`. This builds on
work done earlier in cockroachdb#79025 and cockroachdb#86932 so we don't have to do much, just
wire it all together.

Fixes: cockroachdb#86353

Release justification: Low-risk update to new functionality.

Release note (sql change): Add a new table setting
`sql_stats_forecasts_enabled` which controls whether statistics
forecasts are generated for a specific table. When set, this overrides
cluster setting `sql.stats.forecasts.enabled`.
@craig
Copy link
Contributor

craig bot commented Aug 28, 2022

Build succeeded:

@craig craig bot merged commit e71c128 into cockroachdb:master Aug 28, 2022
craig bot pushed a commit that referenced this pull request Aug 28, 2022
86986: sql/stats: add table setting to disable generation of stats forecasts r=rytaft,msirek a=michae2

Add a new storage parameter (a.k.a. table setting) which can be used to
override cluster setting `sql.stats.forecasts.enabled`. This builds on
work done earlier in #79025 and #86932 so we don't have to do much, just
wire it all together.

Fixes: #86353

Release justification: Low-risk update to new functionality.

Release note (sql change): Add a new table setting
`sql_stats_forecasts_enabled` which controls whether statistics
forecasts are generated for a specific table. When set, this overrides
cluster setting `sql.stats.forecasts.enabled`.

Co-authored-by: Michael Erickson <[email protected]>
@michae2 michae2 deleted the forecastvar branch August 28, 2022 16:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql/opt: add variables to disable statistics forecasting
4 participants