Skip to content

Commit

Permalink
Browse files Browse the repository at this point in the history
113542: roachtest: mark new npgsql tests as flaky r=rafiss a=rafiss

These tests started flaking due to #108414. We will mark them as flaky and investigate later.

fixes #112686

Release note: None

113550: server/profiler: mark ActiveQueryDumpsEnabled as SystemVisible r=yuzefovich a=yuzefovich

This setting was miscategorized initially - the active query profiler runs in both storage and virtual clusters, so it should be at least SystemVisible, but maybe even ApplicationLevel.

Addresses: #113170.
Epic: None

Release note: None

113553: cli: create new tracing span in mt-start-sql r=yuzefovich a=yuzefovich

This commit fixes a possible misuse of the tracing infrastructure with `mt start-sql` command. In particular, previously it was possible for a tracing span created (and finished) in `runStartInternal` to be used by a couple of things in `makeTenantSQLServerArgs` (main one being `rpc.Context`) that hold on to the context after the corresponding span has been finished. This is disallowed (since we reuse spans after `Finish` has been called on them) and was recently discovered. This problem is now fixed by deriving a child span only in `mt start-sql` code path that is never finished (since those users can be active throughout the whole lifetime of the tenant server).

The bug is pretty minor (I think only serverless is affected), so there is no release note.

Fixes: #113170.

Release note: None

Co-authored-by: Rafi Shamim <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
  • Loading branch information
3 people committed Nov 1, 2023
4 parents b22b3f1 + 7af8852 + a2e92c2 + e720de1 commit 63e0b12
Show file tree
Hide file tree
Showing 5 changed files with 21 additions and 2 deletions.
1 change: 1 addition & 0 deletions docs/generated/settings/settings-for-tenants.txt
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@ cloudstorage.timeout duration 10m0s the timeout for import/export storage operat
cluster.auto_upgrade.enabled boolean true disable automatic cluster version upgrade until reset application
cluster.organization string organization name system-visible
cluster.preserve_downgrade_option string disable (automatic or manual) cluster version upgrade from the specified version until reset application
diagnostics.active_query_dumps.enabled boolean true experimental: enable dumping of anonymized active queries to disk when node is under memory pressure system-visible
diagnostics.forced_sql_stat_reset.interval duration 2h0m0s interval after which the reported SQL Stats are reset even if not collected by telemetry reporter. It has a max value of 24H. application
diagnostics.reporting.enabled boolean true enable reporting diagnostic metrics to cockroach labs application
diagnostics.reporting.interval duration 1h0m0s interval at which diagnostics data should be reported application
Expand Down
2 changes: 1 addition & 1 deletion docs/generated/settings/settings.html
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@
<tr><td><div id="setting-cluster-auto-upgrade-enabled" class="anchored"><code>cluster.auto_upgrade.enabled</code></div></td><td>boolean</td><td><code>true</code></td><td>disable automatic cluster version upgrade until reset</td><td>Serverless/Dedicated/Self-Hosted</td></tr>
<tr><td><div id="setting-cluster-organization" class="anchored"><code>cluster.organization</code></div></td><td>string</td><td><code></code></td><td>organization name</td><td>Serverless/Dedicated/Self-Hosted (read-only)</td></tr>
<tr><td><div id="setting-cluster-preserve-downgrade-option" class="anchored"><code>cluster.preserve_downgrade_option</code></div></td><td>string</td><td><code></code></td><td>disable (automatic or manual) cluster version upgrade from the specified version until reset</td><td>Serverless/Dedicated/Self-Hosted</td></tr>
<tr><td><div id="setting-diagnostics-active-query-dumps-enabled" class="anchored"><code>diagnostics.active_query_dumps.enabled</code></div></td><td>boolean</td><td><code>true</code></td><td>experimental: enable dumping of anonymized active queries to disk when node is under memory pressure</td><td>Dedicated/Self-Hosted</td></tr>
<tr><td><div id="setting-diagnostics-active-query-dumps-enabled" class="anchored"><code>diagnostics.active_query_dumps.enabled</code></div></td><td>boolean</td><td><code>true</code></td><td>experimental: enable dumping of anonymized active queries to disk when node is under memory pressure</td><td>Serverless/Dedicated/Self-Hosted (read-only)</td></tr>
<tr><td><div id="setting-diagnostics-forced-sql-stat-reset-interval" class="anchored"><code>diagnostics.forced_sql_stat_reset.interval</code></div></td><td>duration</td><td><code>2h0m0s</code></td><td>interval after which the reported SQL Stats are reset even if not collected by telemetry reporter. It has a max value of 24H.</td><td>Serverless/Dedicated/Self-Hosted</td></tr>
<tr><td><div id="setting-diagnostics-reporting-enabled" class="anchored"><code>diagnostics.reporting.enabled</code></div></td><td>boolean</td><td><code>true</code></td><td>enable reporting diagnostic metrics to cockroach labs</td><td>Serverless/Dedicated/Self-Hosted</td></tr>
<tr><td><div id="setting-diagnostics-reporting-interval" class="anchored"><code>diagnostics.reporting.interval</code></div></td><td>duration</td><td><code>1h0m0s</code></td><td>interval at which diagnostics data should be reported</td><td>Serverless/Dedicated/Self-Hosted</td></tr>
Expand Down
12 changes: 12 additions & 0 deletions pkg/cli/mt_start_sql.go
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ import (
"github.com/cockroachdb/cockroach/pkg/server"
"github.com/cockroachdb/cockroach/pkg/server/serverctl"
"github.com/cockroachdb/cockroach/pkg/util/stop"
"github.com/cockroachdb/cockroach/pkg/util/tracing"
"github.com/cockroachdb/redact"
"github.com/spf13/cobra"
)
Expand Down Expand Up @@ -63,6 +64,17 @@ func runStartSQL(cmd *cobra.Command, args []string) error {
}

newServerFn := func(ctx context.Context, serverCfg server.Config, stopper *stop.Stopper) (serverctl.ServerStartupInterface, error) {
// The context passed into NewSeparateProcessTenantServer will be
// captured in makeTenantSQLServerArgs by a few users that will hold on
// to it. Those users can log something into the current tracing span
// (created in runStartInternal) after the span has been Finish()'ed
// which is disallowed. To go around this issue, in this code path we
// create a new tracing span that is never finished since those users
// stay alive until the tenant server is up.
// TODO(yuzefovich): consider changing NewSeparateProcessTenantServer to
// not take in a context and - akin to NewServer - to construct its own
// from context.Background.
ctx, _ = tracing.ChildSpan(ctx, "mt-start-sql" /* opName */)
// Beware of not writing simply 'return server.NewServer()'. This is
// because it would cause the serverStartupInterface reference to
// always be non-nil, even if NewServer returns a nil pointer (and
Expand Down
6 changes: 6 additions & 0 deletions pkg/cmd/roachtest/tests/npgsql_blocklist.go
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,7 @@ var npgsqlBlocklist = blocklist{
`Npgsql.Tests.CommandTests(Multiplexing).Multiple_statements_with_dependencies`: "unknown",
`Npgsql.Tests.CommandTests(Multiplexing).Non_standards_conforming_strings`: "unknown",
`Npgsql.Tests.CommandTests(Multiplexing).Positional_parameters_are_not_supported_with_legacy_batching`: "unknown",
`Npgsql.Tests.CommandTests(Multiplexing).Reuse_command_with_different_parameter_placeholder_types`: "flaky",
`Npgsql.Tests.CommandTests(NonMultiplexing).Cursor_statement`: "unknown",
`Npgsql.Tests.CommandTests(NonMultiplexing).ExecuteNonQuery_Throws_PostgresException(False)`: "unknown",
`Npgsql.Tests.CommandTests(NonMultiplexing).ExecuteNonQuery_Throws_PostgresException(True)`: "unknown",
Expand Down Expand Up @@ -716,9 +717,12 @@ var npgsqlBlocklist = blocklist{
var npgsqlIgnoreList = blocklist{
`Npgsql.Tests.CommandTests(Multiplexing).Cursor_move_RecordsAffected `: "flaky",
`Npgsql.Tests.CommandTests(Multiplexing).QueryNonQuery`: "flaky",
`Npgsql.Tests.CommandTests(Multiplexing).Same_command_different_param_instances`: "flaky",
`Npgsql.Tests.CommandTests(Multiplexing).Same_command_different_param_values`: "flaky",
`Npgsql.Tests.CommandTests(Multiplexing).SingleNonQuery`: "flaky",
`Npgsql.Tests.CommandTests(Multiplexing).SingleQuery`: "flaky",
`Npgsql.Tests.CommandTests(Multiplexing).Statement_mapped_output_parameters(Default)`: "flaky",
`Npgsql.Tests.CommandTests(Multiplexing).TableDirect`: "flaky",
`Npgsql.Tests.CommandTests(Multiplexing).Use_across_connection_change(NotPrepared)`: "flaky",
`Npgsql.Tests.CommandTests(NonMultiplexing).Cached_command_clears_parameters_placeholder_type`: "flaky",
`Npgsql.Tests.CommandTests(NonMultiplexing).CloseConnection_with_exception`: "flaky",
Expand Down Expand Up @@ -771,4 +775,6 @@ var npgsqlIgnoreList = blocklist{
`Npgsql.Tests.TaskTimeoutAndCancellationTest.DelayedFaultedTaskCancellation("TimeoutOnly")`: "flaky",
`Npgsql.Tests.TransactionTests(Multiplexing).Failed_transaction_on_close_with_custom_timeout`: "flaky",
`Npgsql.Tests.TransactionTests(NonMultiplexing).CommitAsync(Prepared)`: "flaky",
`Npgsql.Tests.TransactionTests(NonMultiplexing).Rollback(Prepared)`: "flaky",
`Npgsql.Tests.TransactionTests(NonMultiplexing).RollbackAsync(NotPrepared)`: "flaky",
}
2 changes: 1 addition & 1 deletion pkg/server/profiler/cluster_settings.go
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@ import "github.com/cockroachdb/cockroach/pkg/settings"
// Note: this feature only works for nodes running on unix hosts with cgroups
// enabled.
var ActiveQueryDumpsEnabled = settings.RegisterBoolSetting(
settings.SystemOnly,
settings.SystemVisible,
"diagnostics.active_query_dumps.enabled",
"experimental: enable dumping of anonymized active queries to disk when node is under memory pressure",
true,
Expand Down

0 comments on commit 63e0b12

Please sign in to comment.