Enrich the existing query log format for cloud telemetry #71328

Open · 9 of 11 tasks
kevin-v-ngo opened this issue Oct 8, 2021 · 3 comments · Fixed by #83027
Labels
C-enhancement (Solution expected to add code/behavior + preserve backward-compat; pg compat issues are exception) · T-observability

Comments

kevin-v-ngo commented Oct 8, 2021

This builds on top of an earlier issue to capture additional workload insights and details in our telemetry pipeline. The previous format for query events is outlined in a separate spreadsheet.

Additional information to capture (a sketch of an enriched event follows the list):

  • HasIndexJoin (we have HasFullTableScan today) (not available)
    • index_join_count field
  • Contention (wait time) (not available)
  • Statement Fingerprint ID (measuring uniqueness of a workload) (not available)
  • Database ID (not available)
  • Database name
  • Session ID
  • Transaction ID (measure how large/conversational transactions are)
  • Statement ID (Avoid having to filter based on the query text)
  • Plan from the plan gist (determine the complexity and how optimal the workload is)
  • Whether the query touched all cluster nodes (related issue) (not available)
  • Index recommendation (not available)
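For illustration, here is a minimal Go sketch of what an enriched sampled-query event could carry once the items above land. The field names are assumptions drawn from this list, not the actual CockroachDB event schema.

```go
package main

import "fmt"

// Hypothetical shape of the enriched sampled-query event; the field names
// are illustrative assumptions, not the real CockroachDB log schema.
type enrichedSampledQuery struct {
	StatementFingerprintID uint64 // measures uniqueness of a workload
	DatabaseID             uint32
	DatabaseName           string
	SessionID              string
	TransactionID          string // how large/conversational transactions are
	StatementID            string // avoids filtering on the query text
	PlanGist               string // base64-encoded compact logical plan
	IndexJoinCount         int64  // complements today's HasFullTableScan
	ContentionNanos        int64  // wait time spent on contention
	TouchedAllNodes        bool   // whether the query touched all cluster nodes
	IndexRecommendations   []string
}

func main() {
	fmt.Printf("%+v\n", enrichedSampledQuery{DatabaseName: "defaultdb"})
}
```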

Jira issue: CRDB-10514

Epic CRDB-13538

kevin-v-ngo added the C-enhancement label Oct 8, 2021
kevin-v-ngo changed the title from “Enrich cloud telemetry by capturing workload metrics and logs” to “Enrich the existing query log format for cloud telemetry” Nov 29, 2021

vy-ton commented Jun 6, 2022

From raw notes on how to track query plan regression telemetry, Queries would like to see the plan gist added to telemetry logging and backported to 21.2. Any idea if this could fit into the June milestone?

@maryliag you mentioned in-flight work to track query timings per fingerprint. Do you know what the data format looks like?

THardy98 pushed a commit to THardy98/cockroach that referenced this issue Jun 17, 2022
Partially resolves: cockroachdb#71328

This change adds a plan gist field to the sampled query telemetry log.
The plan gist is written as a base64 encoded string.

Release note (sql change): The sampled query telemetry log now includes
a plan gist field. The plan gist field provides a compact representation
of a logical plan for the sampled query; the field is written as a
base64-encoded string.
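Because the gist is emitted as a base64-encoded string, a downstream consumer of the JSON log can recover the raw gist bytes directly. A small sketch follows; the `PlanGist` JSON key and the sample value are assumptions for illustration.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"log"
)

func main() {
	// One sampled-query log entry; the "PlanGist" key and value are
	// made up for illustration.
	entry := []byte(`{"PlanGist":"AgHQAQIAHwAAAAMGBA=="}`)

	var ev struct {
		PlanGist string `json:"PlanGist"`
	}
	if err := json.Unmarshal(entry, &ev); err != nil {
		log.Fatal(err)
	}

	// The field is a base64-encoded string; decode it to get the raw
	// gist bytes that plan-gist tooling can interpret.
	raw, err := base64.StdEncoding.DecodeString(ev.PlanGist)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%d raw gist bytes\n", len(raw))
}
```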
THardy98 pushed a commit with the same message to THardy98/cockroach that referenced this issue Jun 28, 2022
craig bot pushed a commit that referenced this issue Jun 29, 2022
…83526 #83533 #83548 #83559 #83576 #83588

57838: execinfra: remove MetadataTest* processors r=yuzefovich a=yuzefovich

This commit removes `MetadataTestSender` and `MetadataTestReceiver`
processors since they no longer provide much value. I believe they were
introduced when we added a `ProducerMetadata` as a return parameter to
`Next` method in order to ensure that at least some artificial metadata
is propagated correctly throughout the whole flow.

The main goal of this commit is the removal of `fakedist-metadata` and
`5node-metadata` logic test configs in order to speed up the CI time.

The justification for removal of these processors without putting anything
in their place is that these processors are not that useful - the only
thing they can test is that *some* metadata is propagated through the
row-based flows. Note that they don't test whether all necessary
metadata is emitted (for example, whether `LeafTxnFinalState` is propagated). We've
been using the vectorized engine as the default for a while now, and these
processors don't get planned with the vectorized flows. Thus, it seems
silly to have a logic test config like `fakedist-metadata` that is part
of the default configs.

Addresses: #57268.

Release note: None

78104: kvserver: include MVCC range keys in replica consistency checks r=aliher1911 a=erikgrinaker

This patch adds handling of MVCC range keys in replica consistency
checks. These are iterated over as part of `MVCCStats` calculations and
hashed similarly to point keys.

Range keys will only exist after the version gate
`ExperimentalMVCCRangeTombstones` has been enabled, so a separate
version gate is not necessary.

Release note: None
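As a toy illustration of the idea (hypothetical and heavily simplified; the real checker hashes full MVCC metadata), range keys are folded into the same running hash as point keys:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

type pointKV struct{ key, value string }
type rangeKey struct{ start, end, value string }

// checksum folds point keys and range keys into one consistency hash;
// range keys are hashed similarly to point keys.
func checksum(points []pointKV, ranges []rangeKey) uint64 {
	h := fnv.New64a()
	for _, kv := range points {
		h.Write([]byte(kv.key))
		h.Write([]byte(kv.value))
	}
	for _, rk := range ranges {
		h.Write([]byte(rk.start))
		h.Write([]byte(rk.end))
		h.Write([]byte(rk.value))
	}
	return h.Sum64()
}

func main() {
	fmt.Println(checksum([]pointKV{{"a", "1"}}, []rangeKey{{"b", "d", ""}}))
}
```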

81880: changefeedccl: fix changefeed telemetry resolved r=samiskin a=samiskin

Resolves #81599

Our internal-use changefeed create / failed telemetry events would
incorrectly record "yes" for any non-no value of resolved, rather than
using the value that was passed in.  This change resolves that, emitting
yes for "resolved" and the value itself for "resolved=<value>".

Release note: None

82686: sql: support ttl_expiration_expression for row-level TTL r=otan a=ecwall

refs #76916

Release note (sql change): Allow `CREATE TABLE ... WITH (ttl_expiration_expression='...')`.
Allow `ALTER TABLE ... SET (ttl_expiration_expression='...')` and
`ALTER TABLE ... RESET (ttl_expiration_expression)`. ttl_expiration_expression accepts
an expression that returns a timestamp to support custom TTL calculation.

82885: streamingccl: minor error style cleanups r=miretskiy,HonoreDB a=stevendanna

Capitalized error messages are rare in the code base, so I've made
these more consistent with the rest of the code.

Release note: None

83004: roachprod: add prometheus/grafana monitoring r=irfansharif a=msbutler

Previously, only roachtests could spin up prom/grafana servers that lasted the
lifetime of the roachtest. This PR introduces new roachprod cmds that allow
a roachprod user to easily spin up/down their own prom/grafana instances. The PR
also hooks up roachtests that rely on prom/grafana into this new infrastructure.

Release note: none

83027: sql: add plan gist to sampled query telemetry log r=THardy98 a=THardy98

Partially resolves: #71328

(Same plan gist commit message and release note as the Jun 17 commit above.)

83445: roachtest: skip decommissionBench/nodes=8/cpu=16/warehouses=3000 r=erikgrinaker a=tbg

#82870

Release note: None


83505: streamingccl: deflake TestPartitionedStreamReplicationClient r=miretskiy a=stevendanna

Previously, this test would fail occasionally with:

```
    partitioned_stream_client_test.go:192:
                Error Trace:    partitioned_stream_client_test.go:192
                Error:          Target error should be in err chain:
                                expected: "context canceled"
                                in chain: "pq: query execution canceled"
                Test:           TestPartitionedStreamReplicationClient
```

This is a result of the lib/pq library asynchronously sending a
CancelRequest when it notices the context cancellation. The cancel
request may result in the query ending before the local context
cancellation produces an error.

We now count this query cancellation error as an acceptable response
since the goal of the test is to assert that we respond to context
cancellation.

Fixes #76919

Release note: None

83526: backupccl: create tree.SystemUsers, a new DescriptorCoverage enum r=adityamaru a=msbutler

Previously, during planning and execution, RESTORE SYSTEM USERS was identified by
a `jobDetails` field. This refactor now identifies this flavor of
restore with a new DescriptorCoverage enum value, `tree.SystemUsers`.

This refactor eases the logic around exposing extra processing steps for
flavors of backup/restore that target different sets of descriptors.

Release note: None

83533: amazon: add custom retryer to retry on `read: connection reset` r=miretskiy,stevendanna a=adityamaru

This change implements a custom retryer that we use when
initializing a new s3 client for interaction with the external
storage sink. This change was motivated by the increased number
of backup job failures we were observing with a `read: connection reset`
error being thrown by s3.

A read connection reset error is thrown when the SDK is unable to read
the response of an underlying API request due to a connection reset. The
DefaultRetryer in the AWS SDK does not treat this error as a retryable error
since the SDK does not have knowledge about the idempotence of the request,
and whether it is safe to retry -
aws/aws-sdk-go#2926 (comment).
In CRDB all operations with s3 (read, write, list) are considered idempotent,
and so we can treat the read connection reset error as retryable too.

Release note (bug fix): Retry s3 operations when they error out with a
read connection reset error instead of failing the top level job.
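For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of a custom retryer with aws-sdk-go v1. It is not CRDB's exact implementation, and the retry count is an arbitrary example.

```go
package main

import (
	"strings"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/client"
	"github.com/aws/aws-sdk-go/aws/request"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
)

// connResetRetryer embeds the SDK's DefaultRetryer and additionally treats
// "read: connection reset" as retryable, which is safe when every operation
// issued against the sink (read, write, list) is idempotent.
type connResetRetryer struct {
	client.DefaultRetryer
}

func (d connResetRetryer) ShouldRetry(r *request.Request) bool {
	if r.Error != nil && strings.Contains(r.Error.Error(), "read: connection reset") {
		return true
	}
	return d.DefaultRetryer.ShouldRetry(r)
}

func main() {
	cfg := request.WithRetryer(aws.NewConfig(), connResetRetryer{
		DefaultRetryer: client.DefaultRetryer{NumMaxRetries: 5},
	})
	sess := session.Must(session.NewSession(cfg))
	_ = s3.New(sess) // this client now retries connection-reset read errors
}
```

Embedding `client.DefaultRetryer` keeps the SDK's backoff behavior; only the retryability decision is widened.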

83548: changefeedccl: Support more stable functions. r=miretskiy a=miretskiy

Add support for additional stable functions to CDC expressions.

Fixes #83466

Release note: None

83559: sql: ensure that the plan is closed in apply joins in some error cases r=yuzefovich a=yuzefovich

Previously, it was possible for the apply join's plan to be left
unclosed when an error is encountered during the physical planning of
the main query, and this has now been fixed. We do so by explicitly
closing the plan in such a scenario.

Fixes: #82705.
Fixes: #83368.

Release note: None

83576: streamingccl: re-enabled TestRandomClientGeneration r=miretskiy a=stevendanna

TestRandomClientGeneration was skipped in #61292 as a flake. However,
in the time since then, other changes in this code broke this test
more completely. Re-enabling the test required a few unrelated
changes:

- The stream ingestion processor required a fully formed job to be
  able to poll the cutover time. Now, test code can set a
  cutoverProvider that doesn't depend on a full job record.

- The request interception depended on an explicit client being
  set, but this test was instead passing the processor a randgen URI. Now we
  pass the client explicitly and also update the test code to make it
  clear that the stream URI isn't actually used for anything.

- The code was attempting to validate the number of rows using SQL. I
  haven't dug into how this was working in the past. But as we are
  connecting to the host tenant and the keys are being ingested to a
  guest tenant, we would need a connection to the guest tenant to
  validate the table data. I've simply removed this assertion since I
  don't think it was testing very much compared to the KV level
  assertions also used in the test.

- The test code assumed that the partitions were keyed based on the
  subscription token rather than the subscription ID.

It isn't clear what the original source of the flakiness was.
However, the test has run a few hundred times under stress without
issue.

Alternatively, we could just delete this test.

Fixes #61287

Release note: None

83588: tree: DROP ROLE should not implement CCLOnlyStatement r=rafiss a=rafiss

This was moved out of CCL licensing a few releases ago.

Release note: None

Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Erik Grinaker <[email protected]>
Co-authored-by: Shiranka Miskin <[email protected]>
Co-authored-by: Evan Wall <[email protected]>
Co-authored-by: Steven Danna <[email protected]>
Co-authored-by: Michael Butler <[email protected]>
Co-authored-by: Thomas Hardy <[email protected]>
Co-authored-by: Tobias Grieger <[email protected]>
Co-authored-by: Aditya Maru <[email protected]>
Co-authored-by: Yevgeniy Miretskiy <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>
craig bot closed this as completed in 39a562c Jun 29, 2022
THardy98 reopened this Jun 29, 2022
THardy98 commented Jun 29, 2022

Re-opening; the PR that tagged this only partially resolves it.

blathers-crl bot pushed two commits that referenced this issue Jun 30, 2022, each with the same plan gist commit message as above.
THardy98 pushed a commit to THardy98/cockroach that referenced this issue Jul 20, 2022
log

Addresses: cockroachdb#71328

This change adds contention time (measured in nanoseconds) to the
`SampledQuery` telemetry log.

To accommodate this change, we needed to collect query-level statistics
earlier. Previously, query-level statistics were fetched when we called
`Finish` under the `instrumentationHelper`, however this occurred after
we had already emitted our query execution logs. Now, we collect
query-level stats in `dispatchToExecutionEngine` after we've executed
the query. As a tradeoff to collecting query-level stats earlier, we
miss out on a few events that could be added to the trace afterwards
(i.e. events that occur after `dispatchToExecutionEngine` but before the
`instrumentationHelper` `Finish`). These events would no longer be
included in the statement bundle.

Release note (sql change): Add `ContentionTime` field to the
`SampledQuery` telemetry log. Query-level statistics are collected
earlier to facilitate adding contention time to query execution
logs. The earlier collection of query-level statistics also results in
an omission of a few tracing events from the statement bundle.
THardy98 pushed the same commit to THardy98/cockroach twice more on Jul 20 and again on Jul 21, Jul 29, Jul 30, and Aug 3, 2022, each time with the identical commit message.
THardy98 pushed a commit to THardy98/cockroach that referenced this issue Aug 4, 2022
… log

Addresses: cockroachdb#71328

This change adds contention time (measured in nanoseconds) to the
`SampledQuery` telemetry log.

To accommodate this change, we needed to collect query-level statistics
earlier. Previously, query-level statistics were fetched when we called
`Finish` under the `instrumentationHelper`, however this occurred after
we had already emitted our query execution logs. Now, we collect
query-level stats in `dispatchToExecutionEngine` after we've executed
the query.

As a tradeoff to collecting query-level stats earlier, we need to fetch
the trace twice:
- once when populating query-level stats (trace is required to compute
  query-level stats) at `populateQueryLevelStats` in
`dispatchToExecutionEngine` after query execution
- once during the instrumentation helper's `Finish` (as we do currently)

This allows us to collect query-level stats earlier without omitting any
tracing events we record currently. This approach is safer, with the
additional overhead of fetching the trace twice only occurring at the
tracing sampling rate of 1-2%, which is fairly conservative. The concern
with only fetching the trace at query-level stats population was the
omission of a number of events that occur in
`commitSQLTransactionInternal` (or any execution paths that don't lead
to `dispatchToExecutionEngine`).

Release note (sql change): Add `ContentionTime` field to the
`SampledQuery` telemetry log. Query-level statistics are collected
earlier to facilitate adding contention time to query execution
logs. The earlier collection of query-level statistics requires the
additional overhead of fetching the query's trace twice (instead of
previously once).
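Schematically, the ordering change looks like the sketch below. All names are hypothetical stand-ins for the CRDB internals mentioned above; the point is only the order of operations and the two trace fetches.

```go
package main

import "fmt"

// Hypothetical stand-ins for the CRDB internals named above, for
// illustration only.
type trace []string
type queryLevelStats struct{ contentionNanos int64 }

func getTrace() trace { return trace{"event-1", "event-2"} }

// populateQueryLevelStats mirrors the idea that the trace is required to
// compute query-level stats such as contention time; here it is faked.
func populateQueryLevelStats(t trace) queryLevelStats {
	return queryLevelStats{contentionNanos: int64(len(t))}
}

// dispatchToExecutionEngine mirrors the new ordering: stats are populated
// right after execution (first trace fetch) so the telemetry log can
// include them.
func dispatchToExecutionEngine() {
	// ... query execution happens here ...
	stats := populateQueryLevelStats(getTrace()) // first trace fetch
	fmt.Println("emit SampledQuery log, ContentionTime =", stats.contentionNanos)
}

// finish mirrors instrumentationHelper.Finish: the trace is fetched a
// second time so the statement bundle still includes late events.
func finish() {
	_ = getTrace() // second trace fetch
}

func main() {
	dispatchToExecutionEngine()
	finish()
}
```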
THardy98 pushed the same commit to THardy98/cockroach again on Aug 8 and Aug 9, 2022, with the identical commit message.
craig bot pushed a commit that referenced this issue Aug 9, 2022
84514: sql/schemachanger: implemented ALTER PRIMARY KEY for vanilla case  r=postamar a=Xiang-Gu

The PR implements `ALTER PRIMARY KEY` under the declarative schema
changer framework that handles the simplest, "vanilla" case like
```
CREATE TABLE t (i INT PRIMARY KEY, j INT NOT NULL)
ALTER TABLE t ALTER PRIMARY KEY USING COLUMNS (j)
```

This is the first of a series of PRs; follow-up PRs will expand its
capabilities to handle more complex cases, including:
  1. Allow the requested new primary key to be hash-sharded;
  2. Consider the case where altering the primary key requires us to
     modify existing secondary indexes (see the legacy schema changer
     for the cases where we should rewrite);
  3. Consider the case where the old primary index is on the implicitly
     created `rowid` column, in which case we also need to drop that
     column;
  4. Consider partitioning and locality (I'm not sure what they are,
     and why they play a role in `ALTER PRIMARY KEY`, but I've seen
     them in the old schema changer, so I assume we ought to do
     something about them too here);
  5. Support `ALTER PRIMARY KEY` with concurrent schema change
     statements, e.g.
     ```ALTER TABLE t ADD COLUMN k INT NOT NULL DEFAULT 30, ALTER PRIMARY KEY USING COLUMNS (j);```

related: #83932
Release note: None

84718: sql: populate query-level stats earlier & add contention to telemetry log r=THardy98 a=THardy98

Addresses: #71328

(Same commit message as the Aug 4 commit above: adds `ContentionTime` to the `SampledQuery` telemetry log by collecting query-level stats right after execution, at the cost of fetching the trace twice.)

85280: sql, server: add new system privileges for observability r=Santamaura a=Santamaura

This patch introduces two new system privileges,
VIEWDEBUG and VIEWCLUSTERMETADATA. VIEWDEBUG
will now be used to gate taking traces and viewing
debug endpoints. VIEWCLUSTERMETADATA will now be
used to gate the node and range reports.

Resolves #83844, #83856, #83857, #83858, #83861

Release note (sql change): add VIEWDEBUG and
VIEWCLUSTERMETADATA system privileges.

85458: changefeedccl: add retries to sinkless changefeeds r=jayshrivastava a=jayshrivastava

This change updates core/sinkless changefeeds to run in a retry loop, allowing for changefeed restarts in case of transient errors or declarative schema changes. 

See commit notes for more details.

Fixes #85008
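As a rough sketch of the retry-loop shape described above (generic Go, not the actual changefeed code):

```go
package main

import (
	"errors"
	"fmt"
)

var errTransient = errors.New("transient error")

// runFeed stands in for one run of a core/sinkless changefeed; here it
// fails transiently on the first two attempts and then completes.
func runFeed(attempt int) error {
	if attempt < 2 {
		return fmt.Errorf("attempt %d: %w", attempt, errTransient)
	}
	return nil
}

func main() {
	for attempt := 0; ; attempt++ {
		err := runFeed(attempt)
		if err == nil {
			return // feed finished (or was cancelled) cleanly
		}
		if !errors.Is(err, errTransient) {
			fmt.Println("permanent failure:", err)
			return
		}
		fmt.Println("restarting after:", err)
	}
}
```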

85819: kv: use max timestamp during below-Raft scan to gossip liveness r=nvanbenschoten a=nvanbenschoten

Related to https://github.com/cockroachlabs/support/issues/1573.

This commit switches `MaybeGossipNodeLivenessRaftMuLocked` to evaluate its scan
at the maximum timestamp instead of at the local node's HLC time. This ensures
that we gossip the most recent liveness record, regardless of what timestamp it
is written at.

85822: colbuilder: fall back to row-by-row processor wrapping for many renders r=yuzefovich a=yuzefovich

**colbuilder: add a microbenchmark for running many render expressions**

This commit adds a microbenchmark of queries with many render
expressions. It'll be used in the following commit to tune when we fall
back to wrapping a row-by-row processor to handle those renders.

Release note: None

**colbuilder: fall back to row-by-row processor wrapping for many renders**

This commit introduces a mechanism to handle render expressions by
wrapping a row-by-row processor into the vectorized flow when:
1. the estimated number of rows to go through the renders is relatively
   small, and
2. the number of renders is relatively high.

The idea is that the vectorized projection operators have higher
overhead when many of them are planned AND there is not enough data to
amortize the overhead, so to improve the performance in those cases
we'll use the row-by-row noop processor. Both of the thresholds are
controlled by cluster settings and the defaults were chosen based on
a representative microbenchmark.

It's worth pointing out that we only have the estimated row count for
the scan operators, so the change has limited applicability.

```
RenderPlanning/rows=1/renders=1-24           407µs ± 2%     408µs ± 2%     ~     (p=0.684 n=10+10)
RenderPlanning/rows=1/renders=8-24           516µs ± 1%     537µs ± 1%   +4.05%  (p=0.000 n=10+10)
RenderPlanning/rows=1/renders=32-24          832µs ± 1%     811µs ± 1%   -2.59%  (p=0.000 n=10+10)
RenderPlanning/rows=1/renders=64-24         1.22ms ± 0%    1.14ms ± 1%   -6.62%  (p=0.000 n=9+10)
RenderPlanning/rows=1/renders=128-24        2.02ms ± 0%    1.80ms ± 1%  -11.18%  (p=0.000 n=8+9)
RenderPlanning/rows=1/renders=512-24        7.75ms ± 1%    5.75ms ± 1%  -25.77%  (p=0.000 n=10+9)
RenderPlanning/rows=1/renders=4096-24        160ms ± 1%      62ms ± 1%  -61.51%  (p=0.000 n=10+9)
RenderPlanning/rows=4/renders=1-24           438µs ± 2%     438µs ± 1%     ~     (p=0.853 n=10+10)
RenderPlanning/rows=4/renders=8-24           603µs ± 1%     633µs ± 2%   +5.06%  (p=0.000 n=10+10)
RenderPlanning/rows=4/renders=32-24         1.08ms ± 1%    1.08ms ± 1%     ~     (p=0.105 n=10+10)
RenderPlanning/rows=4/renders=64-24         1.72ms ± 0%    1.62ms ± 0%   -5.83%  (p=0.000 n=9+9)
RenderPlanning/rows=4/renders=128-24        3.01ms ± 1%    2.75ms ± 1%   -8.78%  (p=0.000 n=10+10)
RenderPlanning/rows=4/renders=512-24        11.6ms ± 1%     9.6ms ± 2%  -17.58%  (p=0.000 n=10+10)
RenderPlanning/rows=4/renders=4096-24        192ms ± 2%      91ms ± 2%  -52.58%  (p=0.000 n=10+10)
RenderPlanning/rows=16/renders=1-24          494µs ± 1%     499µs ± 1%   +1.03%  (p=0.006 n=10+8)
RenderPlanning/rows=16/renders=8-24          855µs ± 1%     901µs ± 1%   +5.37%  (p=0.000 n=10+10)
RenderPlanning/rows=16/renders=32-24        2.03ms ± 1%    2.04ms ± 1%     ~     (p=0.190 n=10+10)
RenderPlanning/rows=16/renders=64-24        3.58ms ± 1%    3.42ms ± 1%   -4.56%  (p=0.000 n=10+9)
RenderPlanning/rows=16/renders=128-24       6.74ms ± 1%    6.31ms ± 1%   -6.37%  (p=0.000 n=10+10)
RenderPlanning/rows=16/renders=512-24       26.9ms ± 1%    24.7ms ± 1%   -8.24%  (p=0.000 n=9+10)
RenderPlanning/rows=16/renders=4096-24       329ms ± 2%     218ms ± 2%  -33.66%  (p=0.000 n=10+10)
RenderPlanning/rows=64/renders=1-24          666µs ± 1%     659µs ± 2%   -1.07%  (p=0.007 n=10+10)
RenderPlanning/rows=64/renders=8-24         1.79ms ± 1%    1.84ms ± 1%   +3.01%  (p=0.000 n=10+10)
RenderPlanning/rows=64/renders=32-24        5.53ms ± 1%    5.79ms ± 2%   +4.74%  (p=0.000 n=10+10)
RenderPlanning/rows=64/renders=64-24        10.8ms ± 1%    11.0ms ± 1%   +1.87%  (p=0.000 n=10+9)
RenderPlanning/rows=64/renders=128-24       21.2ms ± 1%    21.7ms ± 1%   +2.71%  (p=0.000 n=10+10)
RenderPlanning/rows=64/renders=512-24       83.6ms ± 0%    84.9ms ± 0%   +1.47%  (p=0.000 n=10+7)
RenderPlanning/rows=64/renders=4096-24       824ms ± 1%     751ms ± 2%   -8.88%  (p=0.000 n=10+10)
RenderPlanning/rows=128/renders=1-24         853µs ± 1%     851µs ± 1%     ~     (p=0.481 n=10+10)
RenderPlanning/rows=128/renders=8-24        2.98ms ± 1%    3.11ms ± 1%   +4.32%  (p=0.000 n=10+10)
RenderPlanning/rows=128/renders=32-24       10.4ms ± 1%    10.9ms ± 1%   +5.44%  (p=0.000 n=10+10)
RenderPlanning/rows=128/renders=64-24       20.1ms ± 1%    21.3ms ± 1%   +5.99%  (p=0.000 n=10+10)
RenderPlanning/rows=128/renders=128-24      39.7ms ± 1%    42.1ms ± 2%   +5.98%  (p=0.000 n=10+10)
RenderPlanning/rows=128/renders=512-24       160ms ± 1%     168ms ± 2%   +5.13%  (p=0.000 n=9+10)
RenderPlanning/rows=128/renders=4096-24      1.44s ± 1%     1.48s ± 2%   +3.15%  (p=0.000 n=9+10)
RenderPlanning/rows=256/renders=1-24        1.22ms ± 1%    1.21ms ± 1%   -1.01%  (p=0.000 n=10+10)
RenderPlanning/rows=256/renders=8-24        5.22ms ± 0%    5.19ms ± 1%   -0.54%  (p=0.011 n=8+9)
RenderPlanning/rows=256/renders=32-24       19.9ms ± 1%    20.0ms ± 1%     ~     (p=0.182 n=9+10)
RenderPlanning/rows=256/renders=64-24       39.0ms ± 0%    38.9ms ± 0%   -0.33%  (p=0.023 n=10+10)
RenderPlanning/rows=256/renders=128-24      76.8ms ± 1%    76.7ms ± 1%     ~     (p=0.739 n=10+10)
RenderPlanning/rows=256/renders=512-24       316ms ± 1%     319ms ± 1%   +1.15%  (p=0.001 n=9+10)
RenderPlanning/rows=256/renders=4096-24      2.75s ± 1%     2.73s ± 1%   -0.64%  (p=0.002 n=8+9)
```
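A minimal sketch of the fallback decision described above; the threshold values and the helper name are invented for illustration, since the actual defaults live in cluster settings:

```go
package main

import "fmt"

// Assumed thresholds for illustration; the real values are cluster settings
// whose defaults were tuned on the microbenchmark above.
const (
	smallRowCountThreshold = 64  // "relatively small" estimated row count
	manyRendersThreshold   = 512 // "relatively high" number of renders
)

// wrapRendersRowByRow reports whether to plan a row-by-row noop processor
// for the render expressions instead of one vectorized projection operator
// per render. The row estimate is only available for scans, which limits
// the applicability of the heuristic.
func wrapRendersRowByRow(estimatedRows, numRenders int, haveEstimate bool) bool {
	return haveEstimate &&
		estimatedRows <= smallRowCountThreshold &&
		numRenders >= manyRendersThreshold
}

func main() {
	fmt.Println(wrapRendersRowByRow(4, 4096, true)) // true: fall back to row-by-row
	fmt.Println(wrapRendersRowByRow(256, 8, true))  // false: keep vectorized operators
}
```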

Fixes: #85632.

Release note: None

Co-authored-by: Xiang Gu <[email protected]>
Co-authored-by: Thomas Hardy <[email protected]>
Co-authored-by: Santamaura <[email protected]>
Co-authored-by: Jayant Shrivastava <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
THardy98 reopened this Oct 11, 2022

Re-opening due to a couple of fields not being added yet.
