Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracing should always log first time statement is executed #89185

Closed
j82w opened this issue Oct 3, 2022 · 0 comments · Fixed by #89418
Closed

Tracing should always log first time statement is executed #89185

j82w opened this issue Oct 3, 2022 · 0 comments · Fixed by #89418
Assignees
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)

Comments

@j82w
Copy link
Contributor

j82w commented Oct 3, 2022

This PR turned off plan sampling. I side effect of this being disabled is the first time a statement is executed it is no longer being traced. This causes the contention event shows up in transaction_contention_events table to be empty most of the time. This was verified by running this to revert the flag SET CLUSTER SETTING sql.metrics.statement_details.plan_collection.enabled = true; and the contention event shows up in transaction_contention_events table.

The PR to change the default was reverted, because of the impact. #89020

The logic needs to be fixed so it always traces the first time a statement is executed, so it does not depend on the above feature flag. The issue seems to be in instrumentation.go.

Original PR always traced it: 125f6e6
Refactored here:
https://github.com/cockroachdb/cockroach/commit/06f6874992c8358fef4baada3bec4c5bda359037#diff-a885af4d7f01dd0f13e3ab3[…]59f573ae9c8647aa60e218b0ba24

Jira issue: CRDB-20165

@j82w j82w added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-observability labels Oct 3, 2022
@j82w j82w self-assigned this Oct 3, 2022
craig bot pushed a commit that referenced this issue Oct 10, 2022
88539: sql: add telemetry for statistics forecast usage r=rytaft a=michae2

Add a few fields to the sampled_query telemetry events that will help us
measure how useful table statistics forecasting is in practice.

Fixes: #86356

Release note (ops change): Add five new fields to the sampled_query
telemetry events:
- `ScanCount`: Number of scans in the query plan.
- `ScanWithStatsCount`: Number of scans using statistics (including
  forecasted statistics) in the query plan.
- `ScanWithStatsForecastCount`: Number of scans using forecasted
  statistics in the query plan.
- `TotalScanRowsWithoutForecastsEstimate`: Total number of rows read by
  all scans in the query, as estimated by the optimizer without using
  forecasts.
- `NanosSinceStatsForecasted`: The greatest quantity of nanoseconds that
  have passed since the forecast time (or until the forecast time, if it
  is in the future, in which case it will be negative) for any table
  with forecasted stats scanned by this query.

89418: sqlstats: always enable tracing first time fingerprint is seen r=j82w a=j82w

Fixes: #89185

The first time a fingerprint is seen tracing should be enabled. This currently is broken if
sql.metrics.statement_details.plan_collection.enabled is set to false. This can cause crdb_internal.transaction_contention_events to be empty because tracing was never enabled to the contention event was never recorded.

To properly fix this a new value needs to be returned on ShouldSample to tell if it is the first time a fingerprint is seen. This will remove the dependency on plan_collection feature switch.

Release justification: Bug fixes and low-risk updates to new functionality.

Release note (bug fix): Always enable tracing the frist time a fingerprint is seen.

89502: kvserver: do not report diff in consistency checks r=erikgrinaker,tbg a=pavelkalinnikov

This commit removes reporting of the diff between replicas in case of consistency check failures. With the increased range sizes the previous approach has become infeasible.

One possible way to inspect an inconsistency after this change is:
1. Run `cockroach debug range-data` tool to extract the range data from each replica's checkpoint.
2. Use standard OS tools like `diff` to analyse them.

In the meantime, we are researching the UX of this alternative approach, and seeing if there can be a better tooling support.

Part of #21128
Epic: none

Release note (sql change): The `crdb_internal.check_consistency` function now does not include the diff between inconsistent replicas, should they occur. If an inconsistency occurs, the storage engine checkpoints should be inspected. This change is made because previously the range size limit has been increased from 64 MiB to O(GiB), so inlining diffs in consistency checks does not scale.

89529: changefeedccl: update parallel consumer metrics r=jayshrivastava a=jayshrivastava

Previously, both changefeed.nprocs_flush_nanos and changefeed.nprocs_consume_event_nanos were counters that monotonically increased. This was not that useful when determining the average time it takes to consume or flush an event. Changing them to a histogram fixes this issue and allows for percentile values like p90, p99.

This change also updates changefeed.nprocs_in_flight_count to sample values when incrementing inFlight variable. Previously, it was showing up at 0 in the UI. This change makes it show the actual value.

Fixes #89654

Release note: None
Epic: none

89660: tree: use FastIntSet during typechecking r=jordanlewis a=jordanlewis

Previously, the typecheck phase used several slices of ordinals into lists. This is a perfect use case for FastIntSet, because the ordinals tend to be quite small. This commit switches to use FastIntSet instead.

```
name          old time/op    new time/op    delta
TypeCheck-10    3.68µs ± 3%    3.52µs ± 2%   -4.52%  (p=0.000 n=9+10)

name          old alloc/op   new alloc/op   delta
TypeCheck-10      744B ± 0%      576B ± 0%  -22.58%  (p=0.000 n=10+10)

name          old allocs/op  new allocs/op  delta
TypeCheck-10      32.0 ± 0%      18.0 ± 0%  -43.75%  (p=0.000 n=10+10)
```
Issue: None
Epic: None
Release note: None

89662: scop, scdep: Rename `IndexValidator` to `Validator` r=Xiang-Gu a=Xiang-Gu

I've recently done work to enable adding/dropping check constraints in the declarative
schema changer. It is a big PR with many commits. I think it's nicer to separate them
further into multiple PRs. 

This is the first PR in that effort, which is merely renaming and should be easy to review:
commit 1: ValidateCheckConstraint now uses ConstraintID, instead of constraint name;

commit 2: Rename `IndexValidator` to `Validator`.
We previously had a file under scdep called `index_validator.go` where we implement
logic for validating an index. Now that we are going to validate a check constraint, we 
renamed them so they will also validate check constraints.

Informs #89665
Release note: None

Co-authored-by: Michael Erickson <[email protected]>
Co-authored-by: j82w <[email protected]>
Co-authored-by: Pavel Kalinnikov <[email protected]>
Co-authored-by: Jayant Shrivastava <[email protected]>
Co-authored-by: Jordan Lewis <[email protected]>
Co-authored-by: Xiang Gu <[email protected]>
@craig craig bot closed this as completed in 9edf2c8 Oct 10, 2022
blathers-crl bot pushed a commit that referenced this issue Oct 10, 2022
Fixes: #89185

The first time a fingerprint is seen tracing should be enabled.
This currently is broken if
sql.metrics.statement_details.plan_collection.enabled is set to false.
This can cause crdb_internal.transaction_contention_events to be
empty because tracing was never enabled so the contention event was
never recorded.

To properly fix this a new value needs to be returned on ShouldSample
to tell if it is the first time a fingerprint is seen. This will remove
the dependency on plan_collection feature switch.

Release justification: Bug fixes and low-risk updates to new
functionality.

Release note: none
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant