-
Notifications
You must be signed in to change notification settings - Fork 3.9k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
83020: stmtdiagnostics: support continuous bundle collection r=irfansharif a=irfansharif ..until expiry. Informs #82896 (more specifically this is a short-term alternative to the part pertaining to continuous tail capture). The issue has more background, but we repeat some below for posterity. It's desirable to draw from a set of tail execution traces collected over time when investigating tail latencies. #82750 introduced a probabilistic mechanism to capture a single tail event for a individual stmts with bounded overhead (determined by the sampling probability, trading off how long until a single capture is obtained). This PR introduces a `sql.stmt_diagnostics.collect_continuously_until_expired` to collect captures continuously over some period of time for aggregate analysis. Longer term we'd want: - Controls over the maximum number of captures we'd want stored over some period of time; - Eviction of older bundles, assuming they're less relevant, making room for newer captures. To safeguard against misuse (in this current form we should only use it for experiments or escalations under controlled environments), we only act on this setting provided the diagnostics request has an expiration timestamp and a specified probability, crude measures to prevent unbounded growth. --- To get some idea of how this can be used, consider the kinds of experiments we're running as part of #75066. Specifically we have a reproduction where we can observe spikes in latencies for foreground traffic in the presence of concurrent backups (incremental/full). In an experiment with incremental backups running every 10m, with full backups running every 35m (`RECURRING '*/10 * * * *' FULL BACKUP '35 * * * *'`), we observe latency spikes during overlap periods. With this cluster setting we were able to set up trace captures over a 10h window to get a set of representative outlier traces to investigate further. > SELECT crdb_internal.request_statement_bundle( 'INSERT INTO new_order(no_o_id, ...)', -- stmt fingerprint 0.05, -- 5% sampling probability '30ms'::INTERVAL, -- 30ms target (p99.9) '10h'::INTERVAL -- capture window ); > WITH histogram AS (SELECT extract('minute', collected_at) AS minute, count(*) FROM system.statement_diagnostics GROUP BY minute) SELECT minute, repeat('*', (30 * count/(max(count) OVER ()))::INT8) AS freq FROM histogram ORDER BY count DESC LIMIT 10; minute | freq ---------+--------------------------------- 36 | ****************************** 38 | ********************* 35 | ********************* 00 | ********************* 37 | ******************** 30 | ******************** 40 | ***************** 20 | ************** 10 | ************* 50 | *********** (10 rows) We see that we captured just the set of bundles/traces we were interested in. Release note: None 86591: kvserver: sync checksum computation with long poll r=erikgrinaker a=pavelkalinnikov Previously, the checksum computation would run until completion unconditionally (unless the collection request comes before it). This is not the best spend of the limited pool capacity, because the result of this computation may never be requested. After this commit, the checksum computation task is synchronized with the checksum collection request. Both wait at most 5 seconds until the other party has joined. Once joined, the computation starts, otherwise skips. If any party abandons the request, then the `replicaChecksum` record is preserved in the state, and is scheduled for a GC later. This is to help the other party to fail fast, instead of waiting, if it arrives late. This change also removes the no longer needed concurrency limit for the tasks, because tasks are canceled reliably and will not pile up. Fixes #77432 Release note (performance improvement): consistency checks are now properly cancelled on timeout, preventing them from piling up. 88768: ci: add MacOS ARM CI config r=jlinder a=rail Previously, MacOS64 ARM64 platform was added, but CI wouldn't run it. This PR adds a CI platform to build MacOS ARM64 binaries. Release note: None Co-authored-by: irfan sharif <[email protected]> Co-authored-by: Pavel Kalinnikov <[email protected]> Co-authored-by: Rail Aliiev <[email protected]>
- Loading branch information
Showing
11 changed files
with
335 additions
and
124 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
#!/usr/bin/env bash | ||
|
||
set -euo pipefail | ||
|
||
dir="$(dirname $(dirname $(dirname $(dirname $(dirname "${0}")))))" | ||
|
||
source "$dir/teamcity-support.sh" # For $root | ||
source "$dir/teamcity-bazel-support.sh" # For run_bazel | ||
|
||
tc_start_block "Run Bazel build" | ||
run_bazel build/teamcity/cockroach/ci/builds/build_impl.sh crossmacosarm | ||
tc_end_block "Run Bazel build" |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.