sql: turn off plan sampling by default #88343

Merged 1 commit on Sep 21, 2022


maryliag
Contributor

Previously, we sampled plans for fingerprints and saved them to the statement_statistics tables. Now that we save the plan hash and plan gist (which allow us to decode back to the plan), we no longer use the sampled plan anywhere. Since sampling is a heavy operation, we are turning it off by default by changing the default value of sql.metrics.statement_details.plan_collection.enabled to false.
If we don't receive feedback asking to turn it back on, we can remove this sampling entirely.

Partially addresses #77944

Release note (sql change): Change the default value of sql.metrics.statement_details.plan_collection.enabled to false, since we no longer use this information anywhere.

@maryliag maryliag requested a review from a team September 21, 2022 13:45
@cockroach-teamcity
Member

This change is Reviewable

Member

@xinhaoz xinhaoz left a comment

Reviewed 1 of 3 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @xinhaoz)

@maryliag maryliag requested a review from a team September 21, 2022 14:11
@maryliag maryliag requested a review from a team as a code owner September 21, 2022 15:14
@maryliag maryliag force-pushed the sampled-plan-off branch 2 times, most recently from 14ce760 to c1adb2d Compare September 21, 2022 16:30
Member

@yuzefovich yuzefovich left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @maryliag, @xinhaoz, and @yuzefovich)


pkg/ccl/telemetryccl/telemetry_logging_test.go line 59 at r2 (raw file):

	// all statements are captured.
	sqlDB.Exec(t, `SET CLUSTER SETTING sql.telemetry.query_sampling.enabled = true;`)
	sqlDB.Exec(t, `SET CLUSTER SETTING sql.metrics.statement_details.plan_collection.enabled = true;`)

Instead of doing this, let's apply the following diff to this commit:

diff --git a/pkg/ccl/telemetryccl/telemetry_logging_test.go b/pkg/ccl/telemetryccl/telemetry_logging_test.go
index 9bc549d629..3e4422b3de 100644
--- a/pkg/ccl/telemetryccl/telemetry_logging_test.go
+++ b/pkg/ccl/telemetryccl/telemetry_logging_test.go
@@ -53,6 +53,9 @@ func TestTelemetryLogRegions(t *testing.T) {
        sqlDB.Exec(t, `ALTER TABLE three_regions SPLIT AT SELECT generate_series(1, 3)`)
        sqlDB.Exec(t, "ALTER TABLE three_regions EXPERIMENTAL_RELOCATE VALUES (ARRAY[1], 1), (ARRAY[2], 2), (ARRAY[3], 3)")
 
+       // Enable the sampling of all statements so that execution statistics
+       // (including the regions information) is collected.
+       sqlDB.Exec(t, `SET CLUSTER SETTING sql.txn_stats.sample_rate = 1.0`)
        // Enable the telemetry logging and increase the sampling frequency so that
        // all statements are captured.
        sqlDB.Exec(t, `SET CLUSTER SETTING sql.telemetry.query_sampling.enabled = true;`)
diff --git a/pkg/sql/instrumentation.go b/pkg/sql/instrumentation.go
index 8224ea387c..f2a0ae0b87 100644
--- a/pkg/sql/instrumentation.go
+++ b/pkg/sql/instrumentation.go
@@ -257,11 +257,13 @@ func (ih *instrumentationHelper) Setup(
        ih.savePlanForStats =
                statsCollector.ShouldSaveLogicalPlanDesc(fingerprint, implicitTxn, p.SessionData().Database)
 
-       if ih.ShouldBuildExplainPlan() {
-               // Populate traceMetadata early in case we short-circuit the execution
-               // before reaching the bottom of this method.
-               ih.traceMetadata = make(execNodeTraceMetadata)
-       }
+       defer func() {
+               if ih.ShouldBuildExplainPlan() {
+                       // Populate traceMetadata at the end once we have all properties of
+                       // the helper setup.
+                       ih.traceMetadata = make(execNodeTraceMetadata)
+               }
+       }()
 
        if sp := tracing.SpanFromContext(ctx); sp != nil {
                if sp.IsVerbose() {
@@ -307,9 +309,6 @@ func (ih *instrumentationHelper) Setup(
        }
 
        ih.collectExecStats = true
-       if ih.traceMetadata == nil {
-               ih.traceMetadata = make(execNodeTraceMetadata)
-       }
        ih.evalCtx = p.EvalContext()
        newCtx, ih.sp = tracing.EnsureChildSpan(ctx, cfg.AmbientCtx.Tracer, "traced statement", tracing.WithRecording(tracingpb.RecordingVerbose))
        ih.shouldFinishSpan = true
@@ -439,7 +438,8 @@ func (ih *instrumentationHelper) ShouldUseJobForCreateStats() bool {
 // ShouldBuildExplainPlan returns true if we should build an explain plan and
 // call RecordExplainPlan.
 func (ih *instrumentationHelper) ShouldBuildExplainPlan() bool {
-       return ih.collectBundle || ih.savePlanForStats || ih.outputMode == explainAnalyzePlanOutput ||
+       return ih.collectBundle || ih.collectExecStats || ih.savePlanForStats ||
+               ih.outputMode == explainAnalyzePlanOutput ||
                ih.outputMode == explainAnalyzeDistSQLOutput
 }
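
The ordering concern behind the `defer` in the diff above can be illustrated with a minimal, self-contained sketch. The `helper` type, its `map[string]int` metadata field, and the `setupEarly`/`setupDeferred` methods are hypothetical stand-ins, not the actual `instrumentationHelper` code. In particular, `setupEarly` is not the pre-change `Setup` (which had a separate nil check after setting `collectExecStats`); it only shows what a single early check would miss once `collectExecStats` is part of the predicate.

```go
package main

import "fmt"

// helper is a hypothetical, stripped-down stand-in for instrumentationHelper.
type helper struct {
	collectBundle    bool
	collectExecStats bool
	savePlanForStats bool
	traceMetadata    map[string]int // stand-in for execNodeTraceMetadata
}

// shouldBuildExplainPlan mirrors the shape of the updated predicate from the
// diff, leaving out the output-mode checks for brevity.
func (h *helper) shouldBuildExplainPlan() bool {
	return h.collectBundle || h.collectExecStats || h.savePlanForStats
}

// setupEarly evaluates the predicate before collectExecStats is decided, so
// a statement that is only sampled for execution stats gets no metadata.
func (h *helper) setupEarly(sampled bool) {
	if h.shouldBuildExplainPlan() {
		h.traceMetadata = make(map[string]int)
	}
	if !sampled {
		return
	}
	h.collectExecStats = true
}

// setupDeferred evaluates the predicate in a deferred function, which runs
// on every return path after all flags have been set.
func (h *helper) setupDeferred(sampled bool) {
	defer func() {
		if h.shouldBuildExplainPlan() {
			h.traceMetadata = make(map[string]int)
		}
	}()
	if !sampled {
		return
	}
	h.collectExecStats = true
}

func main() {
	early, deferred := &helper{}, &helper{}
	early.setupEarly(true)
	deferred.setupDeferred(true)
	fmt.Println("early check allocated:", early.traceMetadata != nil)
	fmt.Println("deferred check allocated:", deferred.traceMetadata != nil)
}
```

Because a deferred function runs when the enclosing function returns, the check also covers early returns without having to repeat the allocation at each exit point.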

Contributor Author

@maryliag maryliag left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @xinhaoz and @yuzefovich)


pkg/ccl/telemetryccl/telemetry_logging_test.go line 59 at r2 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

done

Member

@yuzefovich yuzefovich left a comment

:lgtm:

Reviewed 3 of 3 files at r1, 3 of 4 files at r2, 2 of 2 files at r3, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @maryliag)

@maryliag maryliag removed request for a team September 21, 2022 20:28
@maryliag
Contributor Author

TFTR!
bors r+

@craig
Contributor

craig bot commented Sep 21, 2022

Build succeeded:

@craig craig bot merged commit c04fae2 into cockroachdb:master Sep 21, 2022
@blathers-crl

blathers-crl bot commented Sep 21, 2022

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 0a0895f to blathers/backport-release-22.1-88343: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.1.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@maryliag maryliag deleted the sampled-plan-off branch September 21, 2022 23:32
maryliag added a commit to maryliag/cockroach that referenced this pull request Sep 29, 2022
This value was disabled by default in cockroachdb#88343,
which brought visibility to a bug where statements are not properly
traced the first time they should be. Since
we might not be able to finish the proper fix before the
backport cut, I'm turning it back on; a following commit
can turn it back off and add the fix for the problem.

Release note (sql change): Turn the default value of
`sql.metrics.statement_details.plan_collection.enabled` to `true`
craig bot pushed a commit that referenced this pull request Sep 29, 2022
89020: sql: turn plan sampling back on by default r=maryliag a=maryliag

This value was disabled by default in #88343,
which brought visibility to a bug where statements are not properly traced the first time they should be. Since we might not be able to finish the proper fix before the backport cut, I'm turning it back on; a following commit can turn it back off and add the fix for the problem.

Release note (sql change): Turn the default value of `sql.metrics.statement_details.plan_collection.enabled` to `true`

Co-authored-by: Marylia Gutierrez <[email protected]>
craig bot pushed a commit that referenced this pull request Oct 13, 2022
89782: roachtest: limit FingerprintValidator's memory usage internally r=miretskiy,srosenberg a=renatolabs

In #89332, we started optionally validating changefeed semantics in the `FingerprintValidator` by making sure that unseen updates are never observed if a `resolved` message is received. In order to keep bounds on the validator's memory usage, the `cdc/mixed-versions` test was setting a maximum number of operations at the workload level. However, a consequence of that change is that it creates the possibility for the workload to finish before the roachtest has received all the `resolved` events it expects (for instance, if draining takes longer than usual for some reason; a variety of other non-determinism is also at play).

To deal with this possibility, we instead enforce a maximum number of previously seen events at the validator's level. For the `cdc/mixed-versions` test, we set a maximum of 100,000 previously seen events being stored (20MB memory footprint). This should be equivalent to roughly 50,000 bank transfers, and give enough time for the test to finish successfully.

Release note: None

Epic: None.

89847: sql: turn plan sampling back off by default r=j82w a=j82w

The value was originally disabled by default in #88343. It was re-enabled by default in #89020 because of a bug, which was fixed in #89418. This PR disables it by default again now that the bug has been fixed.

Part Of #89847

Release note (sql change): Turn the default value of sql.metrics.statement_details.plan_collection.enabled to false.

Co-authored-by: Renato Costa <[email protected]>
Co-authored-by: j82w <[email protected]>