sql: turn off plan sampling by default #88343
Reviewed 1 of 3 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @xinhaoz)
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @maryliag, @xinhaoz, and @yuzefovich)
pkg/ccl/telemetryccl/telemetry_logging_test.go
line 59 at r2 (raw file):
// all statements are captured.
sqlDB.Exec(t, `SET CLUSTER SETTING sql.telemetry.query_sampling.enabled = true;`)
sqlDB.Exec(t, `SET CLUSTER SETTING sql.metrics.statement_details.plan_collection.enabled = true;`)
Instead of doing this, let's apply the following diff to this commit:
diff --git a/pkg/ccl/telemetryccl/telemetry_logging_test.go b/pkg/ccl/telemetryccl/telemetry_logging_test.go
index 9bc549d629..3e4422b3de 100644
--- a/pkg/ccl/telemetryccl/telemetry_logging_test.go
+++ b/pkg/ccl/telemetryccl/telemetry_logging_test.go
@@ -53,6 +53,9 @@ func TestTelemetryLogRegions(t *testing.T) {
sqlDB.Exec(t, `ALTER TABLE three_regions SPLIT AT SELECT generate_series(1, 3)`)
sqlDB.Exec(t, "ALTER TABLE three_regions EXPERIMENTAL_RELOCATE VALUES (ARRAY[1], 1), (ARRAY[2], 2), (ARRAY[3], 3)")
+ // Enable the sampling of all statements so that execution statistics
+ // (including the regions information) is collected.
+ sqlDB.Exec(t, `SET CLUSTER SETTING sql.txn_stats.sample_rate = 1.0`)
// Enable the telemetry logging and increase the sampling frequency so that
// all statements are captured.
sqlDB.Exec(t, `SET CLUSTER SETTING sql.telemetry.query_sampling.enabled = true;`)
diff --git a/pkg/sql/instrumentation.go b/pkg/sql/instrumentation.go
index 8224ea387c..f2a0ae0b87 100644
--- a/pkg/sql/instrumentation.go
+++ b/pkg/sql/instrumentation.go
@@ -257,11 +257,13 @@ func (ih *instrumentationHelper) Setup(
ih.savePlanForStats =
statsCollector.ShouldSaveLogicalPlanDesc(fingerprint, implicitTxn, p.SessionData().Database)
- if ih.ShouldBuildExplainPlan() {
- // Populate traceMetadata early in case we short-circuit the execution
- // before reaching the bottom of this method.
- ih.traceMetadata = make(execNodeTraceMetadata)
- }
+ defer func() {
+ if ih.ShouldBuildExplainPlan() {
+ // Populate traceMetadata at the end once we have all properties of
+ // the helper setup.
+ ih.traceMetadata = make(execNodeTraceMetadata)
+ }
+ }()
if sp := tracing.SpanFromContext(ctx); sp != nil {
if sp.IsVerbose() {
@@ -307,9 +309,6 @@ func (ih *instrumentationHelper) Setup(
}
ih.collectExecStats = true
- if ih.traceMetadata == nil {
- ih.traceMetadata = make(execNodeTraceMetadata)
- }
ih.evalCtx = p.EvalContext()
newCtx, ih.sp = tracing.EnsureChildSpan(ctx, cfg.AmbientCtx.Tracer, "traced statement", tracing.WithRecording(tracingpb.RecordingVerbose))
ih.shouldFinishSpan = true
@@ -439,7 +438,8 @@ func (ih *instrumentationHelper) ShouldUseJobForCreateStats() bool {
// ShouldBuildExplainPlan returns true if we should build an explain plan and
// call RecordExplainPlan.
func (ih *instrumentationHelper) ShouldBuildExplainPlan() bool {
- return ih.collectBundle || ih.savePlanForStats || ih.outputMode == explainAnalyzePlanOutput ||
+ return ih.collectBundle || ih.collectExecStats || ih.savePlanForStats ||
+ ih.outputMode == explainAnalyzePlanOutput ||
ih.outputMode == explainAnalyzeDistSQLOutput
}
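The ordering issue this diff addresses can be illustrated with a small standalone sketch. Everything here is a simplified, hypothetical stand-in (the `helper` type, its fields, and both setup methods are not the real `instrumentationHelper`); in particular, `setupEarlyOnly` is not the original code, which had a second allocation site next to `collectExecStats = true`. The sketch only shows why a single deferred check observes flags that are set later in the method, while a single early check does not.

```go
package main

import "fmt"

// helper is a hypothetical, simplified stand-in for instrumentationHelper.
type helper struct {
	collectBundle    bool
	collectExecStats bool
	savePlanForStats bool
	traceMetadata    map[string]int
}

// shouldBuildExplainPlan follows the updated condition in the diff,
// which now also considers collectExecStats.
func (h *helper) shouldBuildExplainPlan() bool {
	return h.collectBundle || h.collectExecStats || h.savePlanForStats
}

// setupEarlyOnly checks the condition once, before collectExecStats is
// decided, so a sampled statement ends up without trace metadata.
func (h *helper) setupEarlyOnly(sampled bool) {
	if h.shouldBuildExplainPlan() {
		h.traceMetadata = make(map[string]int)
	}
	if !sampled {
		return
	}
	h.collectExecStats = true
}

// setupDeferred runs the same check on every return path, after all
// flags have their final values.
func (h *helper) setupDeferred(sampled bool) {
	defer func() {
		if h.shouldBuildExplainPlan() {
			h.traceMetadata = make(map[string]int)
		}
	}()
	if !sampled {
		return
	}
	h.collectExecStats = true
}

func main() {
	early, deferred := &helper{}, &helper{}
	early.setupEarlyOnly(true)
	deferred.setupDeferred(true)
	fmt.Println(early.traceMetadata != nil, deferred.traceMetadata != nil)
}
```

With the deferred check, the separate `if ih.traceMetadata == nil` allocation removed in the third hunk is no longer needed, since the deferred function covers both the early-return and the fall-through paths.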
Previously, we were sampling plans for fingerprints and saving them to the statement_statistics tables. Now that we save the plan hash and plan gist (which allow us to decode back to the plan), we no longer use the sampled plan anywhere. Since this is a heavy operation, we are turning it off by default by changing the default value of `sql.metrics.statement_details.plan_collection.enabled` to `false`. If we don't receive feedback about turning it back on, we can remove this sampling entirely.
Partially addresses cockroachdb#77944
Release note (sql change): Change the default value of `sql.metrics.statement_details.plan_collection.enabled` to false, since we no longer use this information anywhere.
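For anyone who still wants sampled plans after this change, the setting can be overridden explicitly. The following is a minimal sketch, assuming a connection that exposes an `Exec` method with the same shape as `*sql.DB` from `database/sql`; the `execer` interface and both function names are hypothetical and exist only for illustration.

```go
package main

import (
	"database/sql"
	"fmt"
)

// execer is a hypothetical interface satisfied by *sql.DB.
type execer interface {
	Exec(query string, args ...any) (sql.Result, error)
}

// planCollectionSetting is the cluster setting named in the commit message.
const planCollectionSetting = "sql.metrics.statement_details.plan_collection.enabled"

// planCollectionStmt builds the SET CLUSTER SETTING statement for the
// requested value.
func planCollectionStmt(enabled bool) string {
	return fmt.Sprintf("SET CLUSTER SETTING %s = %t", planCollectionSetting, enabled)
}

// setPlanCollection applies the setting through the given connection.
func setPlanCollection(db execer, enabled bool) error {
	_, err := db.Exec(planCollectionStmt(enabled))
	return err
}

func main() {
	// Print the statement that would re-enable plan sampling.
	fmt.Println(planCollectionStmt(true))
}
```

This is the same statement the test originally issued through `sqlDB.Exec` before the review suggested raising `sql.txn_stats.sample_rate` instead.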
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @xinhaoz and @yuzefovich)
pkg/ccl/telemetryccl/telemetry_logging_test.go
line 59 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Instead of doing this, let's apply the following diff to this commit:
done
Reviewed 3 of 3 files at r1, 3 of 4 files at r2, 2 of 2 files at r3, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @maryliag)
TFTR!
Build succeeded:
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool.
error creating merge commit from 0a0895f to blathers/backport-release-22.1-88343: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []
you may need to manually resolve merge conflicts with the backport tool.
Backport to branch 22.1.x failed. See errors above.
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.
This value was disabled by default in cockroachdb#88343, but that brought visibility to a bug where statements were not being properly traced the first time they should be. Since we might not be able to finish the proper fix before the cut for backports, I'm turning it back on; a following commit can turn it back off and add the fix for the problem.
Release note (sql change): Turn the default value of `sql.metrics.statement_details.plan_collection.enabled` to `true`
89020: sql: turn plan sampling back on by default r=maryliag a=maryliag
This value was disabled by default in #88343, but that brought visibility to a bug where statements were not being properly traced the first time they should be. Since we might not be able to finish the proper fix before the cut for backports, I'm turning it back on; a following commit can turn it back off and add the fix for the problem.
Release note (sql change): Turn the default value of `sql.metrics.statement_details.plan_collection.enabled` to `true`
Co-authored-by: Marylia Gutierrez <[email protected]>
89782: roachtest: limit FingerprintValidator's memory usage internally r=miretskiy,srosenberg a=renatolabs
In #89332, we started optionally validating changefeed semantics in the `FingerprintValidator` by making sure that unseen updates are never observed if a `resolved` message is received. In order to keep bounds on the validator's memory usage, the `cdc/mixed-versions` test was setting a maximum number of operations at the workload level. However, a consequence of that change is that it creates the possibility for the workload to finish before the roachtest has received all the `resolved` events it expects (for instance, if draining takes longer than usual for some reason; a variety of other non-determinism is also at play). To deal with this possibility, we instead enforce a maximum number of previously seen events at the validator's level. For the `cdc/mixed-versions` test, we set a maximum of 100,000 previously seen events being stored (20MB memory footprint). This should be equivalent to roughly 50,000 bank transfers, and give enough time for the test to finish successfully.
Release note: None
Epic: None.
89847: sql: turn plan sampling back off by default r=j82w a=j82w
The value was originally disabled by default in: #88343. The value was enabled by default: #89020 because of a bug that was fixed in #89418. This PR is disabling it by default again now that the bug has been fixed.
Part Of #89847
Release note (sql change): Turn the default value of sql.metrics.statement_details.plan_collection.enabled to false.
Co-authored-by: Renato Costa <[email protected]>
Co-authored-by: j82w <[email protected]>