outliers: move processing off the hot path #81021

Closed
matthewtodd opened this issue May 5, 2022 · 0 comments · Fixed by #85350
Assignees
matthewtodd
Labels
A-sql-observability Related to observability of the SQL layer C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)

Comments

@matthewtodd
Contributor

matthewtodd commented May 5, 2022

Epic: CRDB-13544

Jira issue: CRDB-15395

@matthewtodd matthewtodd added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-sql-observability Related to observability of the SQL layer T-sql-observability labels May 5, 2022
@matthewtodd matthewtodd self-assigned this May 5, 2022
@jlinder jlinder added sync-me and removed sync-me labels May 20, 2022
craig bot pushed a commit that referenced this issue Jul 5, 2022
82473: outliers: latency quantile detector r=matthewtodd a=matthewtodd

Closes #79451.

Note that this detector comes disabled by default. To enable, use the
`sql.stats.outliers.experimental.latency_quantile_detection.enabled`
cluster setting.

This new detection "algorithm" is a simple, perhaps naïve, heuristic,
though it may be good enough to be useful: we consider a statement an
outlier if its execution latency is > p99 latency for its fingerprint,
so long as it's also greater than twice the median latency and greater
than an arbitrary user-defined threshold.[^1]
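
For illustration, a minimal Go sketch of that check; `latencySummary` and its fields are hypothetical stand-ins for the per-fingerprint quantile sketch, not the actual CockroachDB types:

```go
package outliers

import "time"

// latencySummary is a hypothetical stand-in for the per-fingerprint
// latency quantile sketch maintained by the detector.
type latencySummary struct {
	P50 time.Duration // median latency observed for this fingerprint
	P99 time.Duration // 99th-percentile latency observed for this fingerprint
}

// isOutlier applies the heuristic described above: flag an execution when
// its latency exceeds the fingerprint's p99, is more than twice the median,
// and is above the user-configured "interesting" threshold.
func isOutlier(latency time.Duration, s latencySummary, interestingThreshold time.Duration) bool {
	return latency > s.P99 &&
		latency > 2*s.P50 &&
		latency > interestingThreshold
}
```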

We expect the operation of this detector to be memory constrained. To
avoid OOM errors, we offer a cluster setting[^2] to cap the overall
memory used, along with metrics reporting the number of fingerprints
being tracked, memory usage, and "evictions," or fingerprint histories
being dropped due to memory pressure. If a high rate of evictions is
observed, it will be necessary to raise either one or both of the memory
limit and the "interesting" threshold settings to regain a steady state.
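
A rough model of the eviction behavior under memory pressure; the `registry` shape and the oldest-first policy below are assumptions for illustration, not the actual implementation:

```go
package outliers

// history is a hypothetical per-fingerprint latency history with an
// estimated memory footprint in bytes.
type history struct{ size int64 }

// registry is a simplified model of the fingerprint tracker: a map of
// histories plus an insertion-order queue consulted when evicting.
type registry struct {
	memoryLimit int64 // the configured cap
	memoryUsed  int64
	evictions   int64    // reported as a metric
	order       []uint64 // fingerprint IDs, oldest first
	histories   map[uint64]history
}

// evictUntilUnderLimit drops the oldest fingerprint histories until memory
// usage falls back under the cap, counting each drop as an "eviction".
func (r *registry) evictUntilUnderLimit() {
	for r.memoryUsed > r.memoryLimit && len(r.order) > 0 {
		oldest := r.order[0]
		r.order = r.order[1:]
		r.memoryUsed -= r.histories[oldest].size
		delete(r.histories, oldest)
		r.evictions++
	}
}
```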

Note also that this detector is not designed to be used concurrently by
multiple goroutines. Correctness currently relies on the global lock in
`outliers.Registry`, to be addressed next in #81021.

[^1]: This threshold is defined by a cluster setting,
`sql.stats.outliers.experimental.latency_quantile_detection.interesting_threshold`.
Without such a threshold, we would otherwise flag a statement that, say,
normally executes in 3ms taking 7ms; but 7ms is probably still fast
enough that there's nothing to be done.

[^2]: `sql.stats.outliers.experimental.latency_quantile_detection.memory_limit`

Release note: None

82741: obsservice: ui handler serves db console assets  r=sjbarag,abarganier a=dhartunian

This change enables the Observability Service to serve the DB Console UI
itself. Endpoints that CRDB can handle are proxied (`/_admin/`,
`/_status/`, `/ts/`), but the base URL returns a blank HTML page whose
JS assets load the DB Console.

In order to build with the UI code, the `--config=with_ui` flag must be
passed to Bazel.

This commit also adds a shortcut to the `dev` command to build
`obsservice`, which includes the `--config=with_ui` flag, just as we do
by default in `cockroach` builds.

Release note: None

83717: changefeedccl: infer WITH diff from changefeed expression r=[miretskiy] a=HonoreDB

Informs #83322. Specifically this does the "infer whether
or not to set diff" part of it.

This went through a more ambitious phase
(https://github.com/cockroachdb/cockroach/compare/master...HonoreDB:cockroach:cdc_expressions_infer_diff?expand=1)
where it also handled user-defined types, but I ended up
agreeing with wiser heads that this was adding too much complexity
for no known use case.

I also came really close to setting envelope=row as the default, but
envelope=row is not supported in all sinks, so again it felt too messy
to do without a little further conversation.

Release note (sql change): CREATE CHANGEFEED AS statements no longer need to include WITH DIFF when using cdc_prev().

83772: storage: remove experimental markers for MVCC range tombstones r=aliher1911 a=erikgrinaker

GC is still under development, but I think it's fine to remove the marker before GC works to let people start building with it.

---

This removes the experimental markers for MVCC range tombstones by
renaming functions, methods, and parameters to no longer include
`experimental`, and by removing the corresponding warning comments.

The changes are entirely mechanical.

Release note: None

83784: scexec: refactor schema changer job update r=postamar a=postamar

This PR rebases #83724 on top of a few commits which refactor the schema changer job update machinery.


Co-authored-by: Matthew Todd <[email protected]>
Co-authored-by: David Hartunian <[email protected]>
Co-authored-by: Aaron Zinger <[email protected]>
Co-authored-by: Erik Grinaker <[email protected]>
Co-authored-by: Marius Posta <[email protected]>
craig bot pushed a commit that referenced this issue Jul 6, 2022
83684: kvstreamer: reuse incomplete Get requests on resume batches r=yuzefovich a=yuzefovich

Previously, for all incomplete requests in a batch we'd allocate new Get
and Scan requests (since, due to a known issue #75452, the lifecycle of
the requests is not clearly defined at the moment, we're not allowed to
modify them). However, we can reuse the Get requests since they will
never be modified (i.e. they are either complete or incomplete, and,
unlike Scan requests, their start key is never shifted), so this commit
takes advantage of that observation.

Release note: None

83709: pkg/util/tracing: Add hidden tag group, make server responsible for sorting. r=benbardin a=benbardin

Release note: none

This moves all tags marked as "hidden" into a single tag group at the UI layer. This declutters the trace page a little bit and makes it easier to pick out more important information.

![Screen Shot 2022-07-01 at 12 37 07 PM](https://user-images.githubusercontent.com/261508/176937757-bf8ac920-9e28-4908-8de4-1fbc077fd2c7.png)


83834: outliers: extract a Registry interface. r=matthewtodd a=matthewtodd

This is a pure mechanical refactoring, preparing us for #81021, where
we'll move outlier processing off of the hot execution path. The idea is
that the outside world will continue to talk to us as a Registry, but
we'll now have a seam into which we can insert some asynchrony.
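
The seam looks roughly like the following sketch; the method set, argument types, and `lockingRegistry` name are illustrative, not the exact interface extracted here:

```go
package outliers

// Statement and Transaction stand in for the protocol buffer messages
// describing executions; their fields are omitted here.
type Statement struct{}
type Transaction struct{}

// Registry is what the rest of SQL execution talks to. Whether the
// observations are processed inline or handed off to a background
// goroutine becomes an implementation detail behind this interface.
type Registry interface {
	ObserveStatement(sessionID string, stmt *Statement)
	ObserveTransaction(sessionID string, txn *Transaction)
}

// lockingRegistry keeps today's behavior: it processes observations
// synchronously under a global mutex. An asynchronous implementation can
// later satisfy the same interface by buffering observations instead.
type lockingRegistry struct{}

func (r *lockingRegistry) ObserveStatement(sessionID string, stmt *Statement)   {}
func (r *lockingRegistry) ObserveTransaction(sessionID string, txn *Transaction) {}

var _ Registry = (*lockingRegistry)(nil)
```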

Release note: None

Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Ben Bardin <[email protected]>
Co-authored-by: Matthew Todd <[email protected]>
craig bot pushed a commit that referenced this issue Jul 7, 2022
83123: sql/stats: conversion of datums to and from quantile function values r=yuzefovich,rytaft,mgartner a=michae2

To predict histograms in statistics forecasts, we will use linear
regression over quantile functions. (Quantile functions are another
representation of histogram data, in a form more amenable to statistical
manipulation.)

The conversion of histograms to quantile functions will require
conversion of histogram bounds (datums) to quantile values (float64s).
And likewise, the inverse conversion from quantile functions back to
histograms will require the inverse conversion of float64 quantile
values back to datums. These conversions are a little different from our
usual SQL conversions in `eval.PerformCast`, so we add them to a new
quantile file in the `sql/stats` module.
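
For the integer case, the round trip is essentially a widening conversion one way and a rounded, clamped truncation back. A sketch under that assumption (the real package handles several types; apart from the `FromQuantileValue` idea, the names below are illustrative):

```go
package stats

import "math"

// toQuantileValue widens an INT histogram bound into the float64 space
// used by quantile functions (precision is lost above 2^53, which is
// acceptable for forecasting purposes).
func toQuantileValue(v int64) float64 {
	return float64(v)
}

// fromQuantileValue converts a float64 quantile value back into an INT
// bound, rounding and clamping so that regression over quantile functions
// cannot produce an out-of-range datum.
func fromQuantileValue(v float64) int64 {
	v = math.Round(v)
	if v >= math.MaxInt64 {
		return math.MaxInt64
	}
	if v <= math.MinInt64 {
		return math.MinInt64
	}
	return int64(v)
}
```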

This code was originally part of #77070 but has been pulled out to
simplify that PR. A few changes have been made:
- `histogramValue` has been renamed to `FromQuantileValue`.
- Support for `DECIMAL`, `TIME`, `TIMETZ`, and `INTERVAL` has been
  dropped. Clamping these types in `FromQuantileValue` was too complex
  for the first iteration of statistics forecasting. We expect the
  overwhelming majority of ascending keys to use `INT` or `TIMESTAMP`
  types.
- Bugs in `FLOAT4`, `TIMESTAMP` and `TIMESTAMPTZ` conversions have been
  fixed.
- We're now clamping timestamps to slightly tighter bounds to avoid the
  problems with infinite timestamps (see #41564).

Assists: #79872

Release note: None

83590: sql: fix and rename sql stats session transaction received time r=THardy98 a=THardy98

Resolves: #82894

Due to a change from #76792, implicit transactions can start before the
`SessionQueryReceived` session phase time is set by the sqlstats system.
In turn, this caused the `SessionTransactionReceived` (now renamed to
`SessionTransactionStarted`) session phase time to be recorded
incorrectly, causing extremely large transaction times on the UI. This
change fixes the mistake by using the actual transaction start time as
the `SessionTransactionStarted` session phase time, instead of
`SessionQueryReceived`.

Release note (bug fix): The `SessionTransactionReceived` session phase
time, now renamed to `SessionTransactionStarted`, is no longer recorded
incorrectly, so spuriously large transaction times no longer appear on
the UI.

83943: outliers: protocol buffer adjustments r=matthewtodd a=matthewtodd

This handful of commits slightly re-shapes the protocol buffer messages we use for tracking outliers, making some of the work in #81021 a little easier to express.

83961: cmd/dev: add generate execgen subcommand r=ajwerner a=yuzefovich

Release note: None

83975: streamingccl: unskip TestTenantStreaming r=miretskiy a=stevendanna

This makes a couple of changes aimed at unskipping
TestTenantStreaming:

- Fix a nil pointer access in our stream status verification
  function. We had changed the name of the key that this function was
  accessing. This NPE was hidden by another panic along the unclean
  shutdown path in the server.

- Lower various intervals so that this test doesn't take 90 seconds.

I've run this under stress for a few hundred iterations without error.

Release note: None

83979: streamingccl: small logging and tracing cleanups r=miretskiy a=stevendanna

- Add a tracing span to ingestion job cutover.
- Move a particularly noisy log message to VInfo(3).
- Prefer log.VInfo to `if log.V(n) {}` in cases where we aren't doing
  expensive argument construction.

Release note: None

Co-authored-by: Michael Erickson <[email protected]>
Co-authored-by: Thomas Hardy <[email protected]>
Co-authored-by: Matthew Todd <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Steven Danna <[email protected]>
surajr10 pushed a commit to surajr10/cockroach that referenced this issue Jul 12, 2022
craig bot pushed a commit that referenced this issue Jul 15, 2022
84406: sql: make setup of leaf txn for streamer more bullet-proof r=yuzefovich a=yuzefovich

This commit makes sure that we try to use the streamer API only if we
can actually create a non-nil `LeafTxn`. In some edge cases it appears
that all the previous checks were insufficient, and this commit should
take care of that. Note that I couldn't reproduce those edge cases
manually, so there is no regression test.

Fixes: #84239.

Release note: None

84455: outliers: prepare for asynchronous processing r=matthewtodd a=matthewtodd

Moving a few things around in advance of the real work of #81021.

84485: ui: remove link to stmt details on sessions details r=maryliag a=maryliag

Previously, when a statement was active, it would show
on the Sessions page with a link to view its details,
but since the statement was not yet saved/persisted,
clicking the link would crash the Statement Details
page. This commit removes that link.

Fixes #84462

Release note (ui change): Removal of `View Statement Details`
link inside the Sessions Details page.

84488: roachtest: lower descriptor lease duration in acceptance/version-upgrade r=srosenberg a=renatolabs

The `acceptance/version-upgrade` test uncovered a descriptor lease
leak that can lead to the test timing out due to waiting a full lease
duration (5 minutes by default), making it flaky.

Once the bug is fixed, we should be able to use the default duration
again.

Relates to #84382.

Release note: None.

Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Matthew Todd <[email protected]>
Co-authored-by: Marylia Gutierrez <[email protected]>
Co-authored-by: Renato Costa <[email protected]>
craig bot pushed a commit that referenced this issue Aug 11, 2022
63416: sql: emit point deletes during delete fastpath r=yuzefovich a=jordanlewis

Previously, the "deleteRange" SQL operator, which is meant to be a
fast-path for cases in which an entire range of keys can be deleted,
always did what it said: emitted DeleteRange KV operations. This
precludes a crucial optimization: sending point deletes when the list of
deleted keys is exactly known.

For example, a query like `DELETE FROM kv WHERE k = 10000` uses the
"fast path" delete, since it has a contiguous set of keys to delete, and
it doesn't need to know the values that were deleted. But, in this case,
the performance is actually worse if we use a DeleteRange KV operation
for various reasons (see #53939), because:

- ranged KV writes (DeleteRangeRequest) cannot be pipelined because an
  enumeration of the intents that they will leave cannot be known ahead
  of time. They must therefore perform evaluation and replication
  synchronously.
- ranged KV writes (DeleteRangeRequest) result in ranged intent
  resolution, which is less efficient (although this became less
  important since we re-enabled time-bound iterators).

The reason we couldn't previously emit point deletes in this case is
that SQL needs to know whether it deleted something or not. This means
we can't do a "blind put" of a deletion: we need to actually understand
whether there was something that we were "overwriting" with our delete.

This commit modifies the DeleteResponse to always return a boolean
indicating whether a key from the DeleteRequest was actually deleted.

Additionally, the deleteRange SQL operator detects when it can emit
single-key deletes, and does so.
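
The decision has roughly the following shape (a sketch with stand-in types; not the actual planner or KV API):

```go
package main

import "fmt"

// kvRequest is a simplified stand-in for the KV write requests involved.
type kvRequest struct {
	kind   string // "Delete" or "DeleteRange"
	key    string
	endKey string // set only for ranged deletes
}

// deleteSpan models a span the fast-path delete operator wants to delete;
// endKey is empty when the span is known to cover exactly one key.
type deleteSpan struct {
	key, endKey string
}

// requestsForFastPathDelete sketches the new decision: exact point keys
// become point Deletes (pipelineable, with point intent resolution, and
// the response's boolean tells SQL whether a row was actually deleted),
// while true ranges still become DeleteRange requests.
func requestsForFastPathDelete(spans []deleteSpan) []kvRequest {
	reqs := make([]kvRequest, 0, len(spans))
	for _, sp := range spans {
		if sp.endKey == "" {
			reqs = append(reqs, kvRequest{kind: "Delete", key: sp.key})
		} else {
			reqs = append(reqs, kvRequest{kind: "DeleteRange", key: sp.key, endKey: sp.endKey})
		}
	}
	return reqs
}

func main() {
	// DELETE FROM kv WHERE k = 10000 resolves to a single exact key,
	// so it now yields a point Delete rather than a DeleteRange.
	fmt.Println(requestsForFastPathDelete([]deleteSpan{{key: "/kv/10000"}}))
}
```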

Closes #53939.

Release note (performance improvement): point deletes in SQL are now
more efficient during concurrent workloads.

76233: kv: remove clock update on BatchResponse r=nvanbenschoten a=nvanbenschoten

Before this change, we were updating the local clock with each BatchResponse's WriteTimestamp. This was meant to handle cases where the batch request timestamp was forwarded during evaluation.

This was unnecessary for two reasons.

The first is that a BatchResponse can legitimately carry an operation timestamp that leads the local HLC clock on the leaseholder that evaluated the request. This has been true since #80706, which introduced the concept of a "local timestamp". This allowed us to remove the (broken) attempt at ensuring that the HLC on a leaseholder always leads the MVCC timestamp of all values in the leaseholder's keyspace (see the update to `pkg/kv/kvserver/uncertainty/doc.go` in that PR).

The second was that it was not even correct. The idea behind bumping the HLC on the response path was to ensure that if a batch request was forwarded to a newer timestamp during evaluation and then completed a write, that forwarded timestamp would be reflected in the leaseholder's HLC. However, this ignored the fact that any forwarded timestamp must have either come from an existing value in the range or from the leaseholder's clock. So if those didn't lead the clock, the derived timestamp wouldn't either. It also ignored the fact that the clock bump here was too late (post-latch release) and if it had actually been needed (it wasn't), it wouldn't have even ensured that the timestamp on any lease transfer led the maximum time of any response served by the outgoing leaseholder.

There are no mixed-version migration concerns of this change, because #80706 ensured that any future-time operation will still continue to use the synthetic bit until all nodes are running v22.2 or later.

85350: insights: ingester r=matthewtodd a=matthewtodd

Closes #81021.

Here we begin observing statements and transactions asynchronously, to
avoid slowing down the hot SQL execution path as much as possible.
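
A minimal sketch of that hand-off, assuming a buffered channel drained by a single background goroutine; the `Event` shape, `ingester` struct, and method names are illustrative rather than the actual `insights` package API:

```go
package insights

// Event is a stand-in for the statement and transaction observations
// handed off by the SQL execution path.
type Event struct {
	SessionID string
	// ... statement or transaction payload omitted ...
}

// ingester decouples observation from processing: the hot path only does a
// non-blocking channel send; a background goroutine does the real work.
type ingester struct {
	events  chan Event
	process func(Event) // the existing synchronous processing logic
}

func newIngester(process func(Event)) *ingester {
	return &ingester{events: make(chan Event, 1024), process: process}
}

// Observe runs on the hot path. If the buffer is full, the event is
// dropped rather than blocking query execution.
func (i *ingester) Observe(e Event) {
	select {
	case i.events <- e:
	default: // buffer full: shed load instead of slowing execution down
	}
}

// Start drains the buffer on its own goroutine until the channel is closed.
func (i *ingester) Start() {
	go func() {
		for e := range i.events {
			i.process(e)
		}
	}()
}
```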
    
Release note: None

85440: colmem: improve memory-limiting behavior of the accounting helpers r=yuzefovich a=yuzefovich

**colmem: introduce a helper method when no memory limit should be applied**

This commit is a pure mechanical change.

Release note: None

**colmem: move some logic of capacity-limiting into the accounting helper**

This commit moves the logic that was duplicated across each user of the
SetAccountingHelper into the helper itself. Clearly, this allows us to
de-duplicate some code, but it'll make it easier to refactor the code
which is done in the following commit.

Additionally, this commit makes a tiny change to make the resetting
behavior in the hash aggregator more precise.

Release note: None

**colmem: improve memory-limiting behavior of the accounting helpers**

This commit fixes an oversight in how we are allocating batches of the
"dynamic" capacity. We have two related ways for reallocating batches,
and both of them work by growing the capacity of the batch until the
memory limit is exceeded, and then the batch would be reused until the
end of the query execution. This is a reasonable heuristic under the
assumption that all tuples in the data stream are roughly equal in size,
but this might not be the case.

In particular, consider an example when 10k small rows of 1KiB are
followed by 10k large rows of 1MiB. According to our heuristic, we
happily grow the batch until 1024 in capacity, and then we do not shrink
the capacity of that batch, so once the large rows start appearing, we
put 1GiB worth of data into a single batch, significantly exceeding our
memory limit (usually 64MiB with the default `workmem` setting).

This commit introduces a new heuristic as follows:
- the first time a batch exceeds the memory limit, its capacity is memorized,
  and from now on that capacity will determine the upper bound on the
  capacities of the batches allocated through the helper;
- if at any point in time a batch exceeds the memory limit by at least a
  factor of two, then that batch is discarded, and the capacity will never
  exceed half of the capacity of the discarded batch;
- if the memory limit is not reached, then the behavior of the dynamic growth
  of the capacity provided by `Allocator.ResetMaybeReallocate` is still
  applicable (i.e. the capacities will grow exponentially until
  coldata.BatchSize()).
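
A simplified model of this heuristic, using plain integers in place of the actual allocator and `coldata.Batch` machinery (names here are illustrative):

```go
package colmem

// capacityHeuristic models the behavior described above: remember the
// capacity at which a batch first exceeded the memory limit and treat it
// as an upper bound; if a batch ever exceeds the limit by 2x, discard it
// and halve the bound; otherwise keep growing exponentially.
type capacityHeuristic struct {
	memoryLimit  int64
	maxBatchSize int // absolute ceiling, i.e. coldata.BatchSize()
	maxCapacity  int // 0 means no cap has been learned yet
}

// nextCapacity returns the capacity to use for the next batch given the
// current batch's capacity and its measured memory footprint in bytes.
func (h *capacityHeuristic) nextCapacity(curCapacity int, curFootprint int64) int {
	if curFootprint > 2*h.memoryLimit {
		// Over the limit by at least 2x: the batch is discarded and the
		// cap drops to half of its capacity.
		h.maxCapacity = curCapacity / 2
		if h.maxCapacity < 1 {
			h.maxCapacity = 1
		}
		return h.maxCapacity
	}
	if curFootprint > h.memoryLimit && h.maxCapacity == 0 {
		// First time over the limit: memorize this capacity as the cap.
		h.maxCapacity = curCapacity
	}
	// Otherwise: exponential growth, bounded by the learned cap and the
	// absolute batch-size ceiling.
	next := curCapacity * 2
	if h.maxCapacity > 0 && next > h.maxCapacity {
		next = h.maxCapacity
	}
	if next > h.maxBatchSize {
		next = h.maxBatchSize
	}
	return next
}
```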

Note that this heuristic does not have the ability to grow the maximum
capacity once it has been set, although it might make sense to do so (say,
if after shrinking the capacity, the next five times we see that the
batch is using less than half of the memory limit). This is a conscious
omission since I want this change to be backported, and never growing
seems like a safer choice. Thus, this improvement is left as a TODO.

Also, we still might create batches that are too large in memory
footprint in those places that don't use the SetAccountingHelper (e.g.
in the columnarizer) since we perform the memory limit check at the
batch granularity. However, this commit improves things there so that we
don't reuse such a batch and instead use half of its capacity on the
next iteration.

Fixes: #76464.

Release note (bug fix): CockroachDB now more precisely respects the
`distsql_workmem` setting which improves the stability of each node and
makes OOMs less likely.

**colmem: unexport Allocator.ResetMaybeReallocate**

This commit is a mechanical change to unexport
`Allocator.ResetMaybeReallocate` so that users are forced to use the
method of the same name from the helpers. This required splitting the
tests into two files.

Release note: None

85492: backupccl: remap all restored tables r=dt a=dt

This PR has a few changes, broken down into separate commits:
a) stop restoring tmp tables and remove the special-case code to synthesize their special schemas; these were previously restored only to be dropped so that restored jobs referencing them would not break, but we stopped restoring jobs.
b) synthesize type-change jobs during cluster restore; this goes with not restoring jobs.
c) fix some assumptions in tests/other code about what IDs restored tables have.
d) finally, always assign new IDs to all restored objects, even during cluster restore, removing the need to carefully move conflicting tables or other things around.

Commit-by-commit review recommended.


85930: jobs: make expiration use intended txn priority r=ajwerner a=rafiss

In aed014f these operations were supposed to be changed to use
MinUserPriority. However, they weren't using the appropriate txn, so it
didn't have the intended effect.

Release note: None

Co-authored-by: Jordan Lewis <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Matthew Todd <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: David Taylor <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>
craig bot closed this as completed in 1533b8e Aug 11, 2022