Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg/upgrade/upgrademanager/upgrademanager_test: TestAlreadyRunningJobsAreHandledProperly failed #86121

Closed
cockroach-teamcity opened this issue Aug 15, 2022 · 2 comments · Fixed by #86221
Labels
branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot.
Milestone

Comments

@cockroach-teamcity
Copy link
Member

cockroach-teamcity commented Aug 15, 2022

pkg/upgrade/upgrademanager/upgrademanager_test.TestAlreadyRunningJobsAreHandledProperly failed with artifacts on master @ 6b7f0058325de8b8b321a08137d462c418c3047f:

=== RUN   TestAlreadyRunningJobsAreHandledProperly
    test_log_scope.go:162: test logs captured to: /artifacts/tmp/_tmp/3b30e681b3a88fde1405cd1a84158740/logTestAlreadyRunningJobsAreHandledProperly2953702311
    test_log_scope.go:80: use -show-logs to present logs inline

Parameters: TAGS=bazel,gss

Help

See also: How To Investigate a Go Test Failure (internal)

This test on roachdash | Improve this report!

Jira issue: CRDB-18586

@cockroach-teamcity cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. labels Aug 15, 2022
@cockroach-teamcity cockroach-teamcity added this to the 22.2 milestone Aug 15, 2022
@yuzefovich
Copy link
Member

Failed under stress

=== RUN   TestAlreadyRunningJobsAreHandledProperly
    test_log_scope.go:162: test logs captured to: /artifacts/tmp/_tmp/3b30e681b3a88fde1405cd1a84158740/logTestAlreadyRunningJobsAreHandledProperly2953702311
    test_log_scope.go:80: use -show-logs to present logs inline
*
* INFO: Running test with the default test tenant. If you are only seeing a test case failure when this message appears, there may be a problem with your test case running within tenants.
*
==================
WARNING: DATA RACE
Read at 0x00c002daa24f by goroutine 193:
  github.com/cockroachdb/cockroach/pkg/sql/gcjob/gcjobnotifier.(*Notifier).run.func1()
      github.com/cockroachdb/cockroach/pkg/sql/gcjob/gcjobnotifier/notifier.go:151 +0xa7
  github.com/cockroachdb/cockroach/pkg/clusterversion.(*handleImpl).SetOnChange.func1()
      github.com/cockroachdb/cockroach/pkg/clusterversion/pkg/clusterversion/clusterversion.go:227 +0xee
  github.com/cockroachdb/cockroach/pkg/settings.(*Values).settingChanged()
      github.com/cockroachdb/cockroach/pkg/settings/pkg/settings/values.go:147 +0x178
  github.com/cockroachdb/cockroach/pkg/settings.(*Values).setGeneric()
      github.com/cockroachdb/cockroach/pkg/settings/pkg/settings/values.go:178 +0xce
  github.com/cockroachdb/cockroach/pkg/settings.(*VersionSetting).SetInternal()
      github.com/cockroachdb/cockroach/pkg/settings/pkg/settings/version.go:167 +0xd3
  github.com/cockroachdb/cockroach/pkg/clusterversion.(*handleImpl).SetActiveVersion()
      github.com/cockroachdb/cockroach/pkg/clusterversion/pkg/clusterversion/clusterversion.go:220 +0x3b8
  github.com/cockroachdb/cockroach/pkg/server.bumpClusterVersion()
      github.com/cockroachdb/cockroach/pkg/server/migration.go:146 +0x5a6
  github.com/cockroachdb/cockroach/pkg/server.(*migrationServer).BumpClusterVersion.func1()
      github.com/cockroachdb/cockroach/pkg/server/migration.go:101 +0x2ef
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunTaskWithErr()
      github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:341 +0x30b
  github.com/cockroachdb/cockroach/pkg/server.(*migrationServer).BumpClusterVersion()
      github.com/cockroachdb/cockroach/pkg/server/migration.go:96 +0x498
  github.com/cockroachdb/cockroach/pkg/server/serverpb._Migration_BumpClusterVersion_Handler.func1()
      github.com/cockroachdb/cockroach/pkg/server/serverpb/bazel-out/k8-dbg/bin/pkg/server/serverpb/serverpb_go_proto_/github.com/cockroachdb/cockroach/pkg/server/serverpb/migration.pb.go:593 +0x141
  github.com/cockroachdb/cockroach/pkg/util/tracing/grpcinterceptor.ServerInterceptor.func1()
      github.com/cockroachdb/cockroach/pkg/util/tracing/grpcinterceptor/grpc_interceptor.go:114 +0x838
  google.golang.org/grpc.chainUnaryInterceptors.func1.1()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:1117 +0x1fb
  github.com/cockroachdb/cockroach/pkg/rpc.NewServerEx.func3()
      github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:263 +0x1b4
  google.golang.org/grpc.chainUnaryInterceptors.func1.1()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:1120 +0x451
  github.com/cockroachdb/cockroach/pkg/rpc.kvAuth.unaryInterceptor()
      github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/auth.go:73 +0x728
  github.com/cockroachdb/cockroach/pkg/rpc.kvAuth.unaryInterceptor-fm()
      <autogenerated>:1 +0x12e
  google.golang.org/grpc.chainUnaryInterceptors.func1.1()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:1120 +0x451
  github.com/cockroachdb/cockroach/pkg/rpc.NewServerEx.func1.1()
      github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:232 +0xe4
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunTaskWithErr()
      github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:341 +0x30b
  github.com/cockroachdb/cockroach/pkg/rpc.NewServerEx.func1()
      github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:230 +0x2a7
  google.golang.org/grpc.chainUnaryInterceptors.func1.1()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:1120 +0x451
  google.golang.org/grpc.chainUnaryInterceptors.func1()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:1122 +0x3c9
  github.com/cockroachdb/cockroach/pkg/server/serverpb._Migration_BumpClusterVersion_Handler()
      github.com/cockroachdb/cockroach/pkg/server/serverpb/bazel-out/k8-dbg/bin/pkg/server/serverpb/serverpb_go_proto_/github.com/cockroachdb/cockroach/pkg/server/serverpb/migration.pb.go:595 +0x395
  google.golang.org/grpc.(*Server).processUnaryRPC()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:1283 +0x21c5
  google.golang.org/grpc.(*Server).handleStream()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:1620 +0xda6
  google.golang.org/grpc.(*Server).serveStreams.func1.2()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:922 +0x21e

Previous write at 0x00c002daa24f by goroutine 635:
  github.com/cockroachdb/cockroach/pkg/sql/gcjob/gcjobnotifier.(*Notifier).run.func1()
      github.com/cockroachdb/cockroach/pkg/sql/gcjob/gcjobnotifier/notifier.go:152 +0x155
  github.com/cockroachdb/cockroach/pkg/clusterversion.(*handleImpl).SetOnChange.func1()
      github.com/cockroachdb/cockroach/pkg/clusterversion/pkg/clusterversion/clusterversion.go:227 +0xee
  github.com/cockroachdb/cockroach/pkg/settings.(*Values).settingChanged()
      github.com/cockroachdb/cockroach/pkg/settings/pkg/settings/values.go:147 +0x178
  github.com/cockroachdb/cockroach/pkg/settings.(*Values).setGeneric()
      github.com/cockroachdb/cockroach/pkg/settings/pkg/settings/values.go:178 +0xce
  github.com/cockroachdb/cockroach/pkg/settings.(*VersionSetting).SetInternal()
      github.com/cockroachdb/cockroach/pkg/settings/pkg/settings/version.go:167 +0xd3
  github.com/cockroachdb/cockroach/pkg/clusterversion.(*handleImpl).SetActiveVersion()
      github.com/cockroachdb/cockroach/pkg/clusterversion/pkg/clusterversion/clusterversion.go:220 +0x3b8
  github.com/cockroachdb/cockroach/pkg/server.bumpClusterVersion()
      github.com/cockroachdb/cockroach/pkg/server/migration.go:146 +0x5a6
  github.com/cockroachdb/cockroach/pkg/server.(*migrationServer).BumpClusterVersion.func1()
      github.com/cockroachdb/cockroach/pkg/server/migration.go:101 +0x2ef
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunTaskWithErr()
      github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:341 +0x30b
  github.com/cockroachdb/cockroach/pkg/server.(*migrationServer).BumpClusterVersion()
      github.com/cockroachdb/cockroach/pkg/server/migration.go:96 +0x498
  github.com/cockroachdb/cockroach/pkg/server/serverpb._Migration_BumpClusterVersion_Handler.func1()
      github.com/cockroachdb/cockroach/pkg/server/serverpb/bazel-out/k8-dbg/bin/pkg/server/serverpb/serverpb_go_proto_/github.com/cockroachdb/cockroach/pkg/server/serverpb/migration.pb.go:593 +0x141
  github.com/cockroachdb/cockroach/pkg/util/tracing/grpcinterceptor.ServerInterceptor.func1()
      github.com/cockroachdb/cockroach/pkg/util/tracing/grpcinterceptor/grpc_interceptor.go:114 +0x838
  google.golang.org/grpc.chainUnaryInterceptors.func1.1()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:1117 +0x1fb
  github.com/cockroachdb/cockroach/pkg/rpc.NewServerEx.func3()
      github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:263 +0x1b4
  google.golang.org/grpc.chainUnaryInterceptors.func1.1()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:1120 +0x451
  github.com/cockroachdb/cockroach/pkg/rpc.kvAuth.unaryInterceptor()
      github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/auth.go:73 +0x728
  github.com/cockroachdb/cockroach/pkg/rpc.kvAuth.unaryInterceptor-fm()
      <autogenerated>:1 +0x12e
  google.golang.org/grpc.chainUnaryInterceptors.func1.1()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:1120 +0x451
  github.com/cockroachdb/cockroach/pkg/rpc.NewServerEx.func1.1()
      github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:232 +0xe4
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunTaskWithErr()
      github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:341 +0x30b
  github.com/cockroachdb/cockroach/pkg/rpc.NewServerEx.func1()
      github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:230 +0x2a7
  google.golang.org/grpc.chainUnaryInterceptors.func1.1()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:1120 +0x451
  google.golang.org/grpc.chainUnaryInterceptors.func1()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:1122 +0x3c9
  github.com/cockroachdb/cockroach/pkg/server/serverpb._Migration_BumpClusterVersion_Handler()
      github.com/cockroachdb/cockroach/pkg/server/serverpb/bazel-out/k8-dbg/bin/pkg/server/serverpb/serverpb_go_proto_/github.com/cockroachdb/cockroach/pkg/server/serverpb/migration.pb.go:595 +0x395
  google.golang.org/grpc.(*Server).processUnaryRPC()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:1283 +0x21c5
  google.golang.org/grpc.(*Server).handleStream()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:1620 +0xda6
  google.golang.org/grpc.(*Server).serveStreams.func1.2()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:922 +0x21e

Goroutine 193 (running) created at:
  google.golang.org/grpc.(*Server).serveStreams.func1()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:920 +0x746
  google.golang.org/grpc/internal/transport.(*http2Server).operateHeaders()
      google.golang.org/grpc/internal/transport/external/org_golang_google_grpc/internal/transport/http2_server.go:600 +0x4846
  google.golang.org/grpc/internal/transport.(*http2Server).HandleStreams()
      google.golang.org/grpc/internal/transport/external/org_golang_google_grpc/internal/transport/http2_server.go:645 +0xf4e
  google.golang.org/grpc.(*Server).serveStreams()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:906 +0x3e7
  google.golang.org/grpc.(*Server).handleRawConn.func1()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:848 +0x92

Goroutine 635 (running) created at:
  google.golang.org/grpc.(*Server).serveStreams.func1()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:920 +0x746
  google.golang.org/grpc/internal/transport.(*http2Server).operateHeaders()
      google.golang.org/grpc/internal/transport/external/org_golang_google_grpc/internal/transport/http2_server.go:600 +0x4846
  google.golang.org/grpc/internal/transport.(*http2Server).HandleStreams()
      google.golang.org/grpc/internal/transport/external/org_golang_google_grpc/internal/transport/http2_server.go:645 +0xf4e
  google.golang.org/grpc.(*Server).serveStreams()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:906 +0x3e7
  google.golang.org/grpc.(*Server).handleRawConn.func1()
      google.golang.org/grpc/external/org_golang_google_grpc/server.go:848 +0x92
==================
I220815 09:23:21.367305 1 (gostd) testmain.go:88  [-] 1  Test //pkg/upgrade/upgrademanager:upgrademanager_test exited with error code 66

cc @ajwerner

@cockroach-teamcity
Copy link
Member Author

pkg/upgrade/upgrademanager/upgrademanager_test.TestAlreadyRunningJobsAreHandledProperly failed with artifacts on master @ 32f7d263043cfb47f3ac409a8aaa7346e56b973b:

=== RUN   TestAlreadyRunningJobsAreHandledProperly
    test_log_scope.go:162: test logs captured to: /artifacts/tmp/_tmp/3b30e681b3a88fde1405cd1a84158740/logTestAlreadyRunningJobsAreHandledProperly4167522552
    test_log_scope.go:80: use -show-logs to present logs inline

Parameters: TAGS=bazel,gss

Help

See also: How To Investigate a Go Test Failure (internal)

This test on roachdash | Improve this report!

craig bot pushed a commit that referenced this issue Aug 16, 2022
85772: ui/cluster-ui: fix polling in fingerprints pages r=xinhaoz a=xinhaoz

Fixes: #85236

Previously, SQL statement and transaction stats would be refreshed
every 5 minutes via sagas in CC and the cached api reducer in
db-console. This method relied on a refresh data call that resided
in `shouldComponentUpdate`, which was ignored by the respective
polling managers when the time interval was not complete. This
pattern was hacky as (unguarded calls in `shouldComponentUpdate`
are typically avoided in React. Polling in these pages were removed
with the introduciton of persisted stats, however we would like to
reintroduce polling when the selected time interval is `Latest xx..'
(i.e. not a custom interval). The removal of this polling introduced
a bug in the CC fingerprint pages, as the saga effects for when the
data was received would invalidate the data after the polling interval.
Now that the data was never refreshed, the page would get stuck on
the 'loading data' page.

This commit reintroduces polling via a `setTimeout` in the page
components, rather than through cached data reducer and sagasn for CC.
Data in the fingerprints overview pages is  now refreshed every
5 minutes for non-custom time ranges. The data invalidation in
CC is also cleaned up such that a call to invalidate data only
happens right before a request to fetch data (to signify new data
is being loaded).

Release note (bug fix): the statements and transaction fingerprint
will no longer get stuck on the loading page in CC after 5 minutes
idling on the page

Release note (ui change): the statements and transaction fingerprint
now refresh data every 5 minutes for non-custom time ranges

Release justification: bug fix

86195: sql/schemachanger: implement ALTER TABLE ... ADD PRIMARY KEY r=ajwerner a=ajwerner

This commit extends #86071 to support `ALTER TABLE ... ADD PRIMARY KEY` in
implicit transactions in the case that the existing primary key uses the
default `rowid` primary key. It adopts the ability of #86071 to drop the
`rowid` column along the way.

Release justification: minor improvement to new functionality needed to unblock DMS support.

Release note (sql change): When running `ALTER TABLE ... ADD PRIMARY KEY`
or `ALTER TABLE ... ADD CONSTRAINT ... PRIMARY KEY` in a single-statement,
implicit transaction, when no primary key had been added to the table, the
old `rowid` column which had been automatically created as the table's
`PRIMARY KEY` will now be dropped.

86216: insights: enable the fixed-threshold detector r=matthewtodd a=matthewtodd

This commit effectively enables the new "insights" subsystem, wherein we
observe in-flight statement executions and gather information about
statements that ran longer than a fixed threshold, with the intent of
supporting developers and operators tuning their workloads.

We are currently building a UI in DB console to surface these insights;
in the meantime, they are also available in the
`crdb_internal.cluster_execution_insights` table.

A local 5-minute run of the kv95 workload with the
`sql.insights.latency_threshold` cluster setting set to 0ms (disabled)
and 100ms shows an unappreciable impact on execution latencies:

| `latency_threshold` | ops(total) | ops/sec(cum) | avg(ms) | p50(ms) | p95(ms) | p99(ms) | pMax(ms) |
|---------------------|------------|--------------|---------|---------|---------|---------|----------|
| 0ms                 | 825873     | 2752.9       | 11.6    | 0.6     | 96.5    | 268.4   | 939.5    |
| 100ms               | 853175     | 2843.9       | 11.2    | 0.7     | 79.7    | 260.0   | 1208.0   |

Note that we have another, hopefully more useful, heuristic-based "slow"
detector in the works, but it will need more performance tuning before
we can enable it by default.

Release justification: Category 2: Bug fixes and low-risk updates to new
functionality

Release note (sql change): A new subsystem, "insights," has been enabled
by default, gathering slow statement executions in the
`crdb_internal.cluster_execution_insights` table along with possible
reasons for the slowness: full scans / missing indexes, contention, plan
changes, retries, etc. This system may be tuned by a handful of new
cluster settings and monitored with a handful of new metrics, all in the
`sql.insights` namespace.

86221: sql/gcjob/gcjobnotifier: fix race r=ajwerner a=ajwerner

The callback which I had assumed could only be called once was being
called multiple times concurrently.

Fixes #86121

Release justification: minor change to new functionality to fix flakey tests

Release note: None

86235: sqlsmith: do not attempt to drop system columns r=yuzefovich a=yuzefovich

Previously, sqlsmith could attempt to drop a system column which
would fail, now it will skip system columns for the drop command.

Fixes: #86076.

Release justification: test-only change.

Release note: None

86258: logictest: attempt to deflake inflight_trace_spans r=ajwerner a=ajwerner

Maybe there's some background tasks popping up.

Fixes #85812

Release justification: testing only change

Release note: None

Co-authored-by: Xin Hao Zhang <[email protected]>
Co-authored-by: Andrew Werner <[email protected]>
Co-authored-by: Matthew Todd <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
@craig craig bot closed this as completed in 72f4bcc Aug 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants