-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pkg/upgrade/upgrademanager/upgrademanager_test: TestAlreadyRunningJobsAreHandledProperly failed #86121
Labels
branch-master
Failures and bugs on the master branch.
C-test-failure
Broken test (automatically or manually discovered).
O-robot
Originated from a bot.
Milestone
Comments
cockroach-teamcity
added
branch-master
Failures and bugs on the master branch.
C-test-failure
Broken test (automatically or manually discovered).
O-robot
Originated from a bot.
labels
Aug 15, 2022
Failed under stress
cc @ajwerner |
pkg/upgrade/upgrademanager/upgrademanager_test.TestAlreadyRunningJobsAreHandledProperly failed with artifacts on master @ 32f7d263043cfb47f3ac409a8aaa7346e56b973b:
Parameters: |
craig bot
pushed a commit
that referenced
this issue
Aug 16, 2022
85772: ui/cluster-ui: fix polling in fingerprints pages r=xinhaoz a=xinhaoz Fixes: #85236 Previously, SQL statement and transaction stats would be refreshed every 5 minutes via sagas in CC and the cached api reducer in db-console. This method relied on a refresh data call that resided in `shouldComponentUpdate`, which was ignored by the respective polling managers when the time interval was not complete. This pattern was hacky as (unguarded calls in `shouldComponentUpdate` are typically avoided in React. Polling in these pages were removed with the introduciton of persisted stats, however we would like to reintroduce polling when the selected time interval is `Latest xx..' (i.e. not a custom interval). The removal of this polling introduced a bug in the CC fingerprint pages, as the saga effects for when the data was received would invalidate the data after the polling interval. Now that the data was never refreshed, the page would get stuck on the 'loading data' page. This commit reintroduces polling via a `setTimeout` in the page components, rather than through cached data reducer and sagasn for CC. Data in the fingerprints overview pages is now refreshed every 5 minutes for non-custom time ranges. The data invalidation in CC is also cleaned up such that a call to invalidate data only happens right before a request to fetch data (to signify new data is being loaded). Release note (bug fix): the statements and transaction fingerprint will no longer get stuck on the loading page in CC after 5 minutes idling on the page Release note (ui change): the statements and transaction fingerprint now refresh data every 5 minutes for non-custom time ranges Release justification: bug fix 86195: sql/schemachanger: implement ALTER TABLE ... ADD PRIMARY KEY r=ajwerner a=ajwerner This commit extends #86071 to support `ALTER TABLE ... ADD PRIMARY KEY` in implicit transactions in the case that the existing primary key uses the default `rowid` primary key. It adopts the ability of #86071 to drop the `rowid` column along the way. Release justification: minor improvement to new functionality needed to unblock DMS support. Release note (sql change): When running `ALTER TABLE ... ADD PRIMARY KEY` or `ALTER TABLE ... ADD CONSTRAINT ... PRIMARY KEY` in a single-statement, implicit transaction, when no primary key had been added to the table, the old `rowid` column which had been automatically created as the table's `PRIMARY KEY` will now be dropped. 86216: insights: enable the fixed-threshold detector r=matthewtodd a=matthewtodd This commit effectively enables the new "insights" subsystem, wherein we observe in-flight statement executions and gather information about statements that ran longer than a fixed threshold, with the intent of supporting developers and operators tuning their workloads. We are currently building a UI in DB console to surface these insights; in the meantime, they are also available in the `crdb_internal.cluster_execution_insights` table. A local 5-minute run of the kv95 workload with the `sql.insights.latency_threshold` cluster setting set to 0ms (disabled) and 100ms shows an unappreciable impact on execution latencies: | `latency_threshold` | ops(total) | ops/sec(cum) | avg(ms) | p50(ms) | p95(ms) | p99(ms) | pMax(ms) | |---------------------|------------|--------------|---------|---------|---------|---------|----------| | 0ms | 825873 | 2752.9 | 11.6 | 0.6 | 96.5 | 268.4 | 939.5 | | 100ms | 853175 | 2843.9 | 11.2 | 0.7 | 79.7 | 260.0 | 1208.0 | Note that we have another, hopefully more useful, heuristic-based "slow" detector in the works, but it will need more performance tuning before we can enable it by default. Release justification: Category 2: Bug fixes and low-risk updates to new functionality Release note (sql change): A new subsystem, "insights," has been enabled by default, gathering slow statement executions in the `crdb_internal.cluster_execution_insights` table along with possible reasons for the slowness: full scans / missing indexes, contention, plan changes, retries, etc. This system may be tuned by a handful of new cluster settings and monitored with a handful of new metrics, all in the `sql.insights` namespace. 86221: sql/gcjob/gcjobnotifier: fix race r=ajwerner a=ajwerner The callback which I had assumed could only be called once was being called multiple times concurrently. Fixes #86121 Release justification: minor change to new functionality to fix flakey tests Release note: None 86235: sqlsmith: do not attempt to drop system columns r=yuzefovich a=yuzefovich Previously, sqlsmith could attempt to drop a system column which would fail, now it will skip system columns for the drop command. Fixes: #86076. Release justification: test-only change. Release note: None 86258: logictest: attempt to deflake inflight_trace_spans r=ajwerner a=ajwerner Maybe there's some background tasks popping up. Fixes #85812 Release justification: testing only change Release note: None Co-authored-by: Xin Hao Zhang <[email protected]> Co-authored-by: Andrew Werner <[email protected]> Co-authored-by: Matthew Todd <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
branch-master
Failures and bugs on the master branch.
C-test-failure
Broken test (automatically or manually discovered).
O-robot
Originated from a bot.
pkg/upgrade/upgrademanager/upgrademanager_test.TestAlreadyRunningJobsAreHandledProperly failed with artifacts on master @ 6b7f0058325de8b8b321a08137d462c418c3047f:
Parameters:
TAGS=bazel,gss
Help
See also: How To Investigate a Go Test Failure (internal)
This test on roachdash | Improve this report!
Jira issue: CRDB-18586
The text was updated successfully, but these errors were encountered: