roachtest: tpcc/mixed-headroom/multiple-upgrades/n5cpu16 failed #112840

Closed
cockroach-teamcity opened this issue Oct 22, 2023 · 1 comment
Labels: C-test-failure, O-roachtest, O-robot, release-blocker, T-testeng
Comments

@cockroach-teamcity
Member

cockroach-teamcity commented Oct 22, 2023

roachtest.tpcc/mixed-headroom/multiple-upgrades/n5cpu16 failed with artifacts on release-23.1.12-rc @ 6aefe827ba7e2b62ca066d732e4db8ce4e6c485e:

(versionupgrade.go:372).func1: timed out after 10m0s: expected n1 to be at cluster version 22.2, but is still at 22.1-16: not upgraded yet
(monitor.go:153).Wait: monitor failure: context canceled
test artifacts and logs in: /artifacts/tpcc/mixed-headroom/multiple-upgrades/n5cpu16/run_1
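
The two version strings in the timeout message can be compared mechanically. Below is a minimal sketch in Python, assuming cluster versions follow the `major.minor[-internal]` shape seen above (for example `22.1-16`) and that a missing internal component counts as 0. `parse_cluster_version` and `is_upgraded` are hypothetical helpers written for illustration; they are not part of roachtest.

```python
import re

# Assumed format: "22.1-16" -> (22, 1, 16); "22.2" -> (22, 2, 0).
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)(?:-(\d+))?$")


def parse_cluster_version(text):
    """Split a cluster version string into a (major, minor, internal) tuple."""
    m = _VERSION_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognized cluster version: {text!r}")
    major, minor, internal = m.groups()
    return (int(major), int(minor), int(internal or 0))


def is_upgraded(current, target):
    """Return True once `current` has reached `target`.

    Tuples compare element by element, so 22.1-16 < 22.1-18 < 22.2.
    """
    return parse_cluster_version(current) >= parse_cluster_version(target)
```

With the values from this failure, `is_upgraded("22.1-16", "22.2")` is false, which matches the "not upgraded yet" condition the test timed out on.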

Parameters: ROACHTEST_arch=amd64, ROACHTEST_cloud=gce, ROACHTEST_cpu=16, ROACHTEST_encrypted=false, ROACHTEST_fs=ext4, ROACHTEST_localSSD=true, ROACHTEST_ssd=0

Help

See: roachtest README

See: How To Investigate (internal)

See: Grafana

/cc @cockroachdb/test-eng

This test on roachdash | Improve this report!

Jira issue: CRDB-32633

@cockroach-teamcity cockroach-teamcity added branch-release-23.1.12-rc C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Oct 22, 2023
@cockroach-teamcity cockroach-teamcity added this to the 23.1 milestone Oct 22, 2023
@blathers-crl blathers-crl bot added the T-testeng TestEng Team label Oct 22, 2023
@renatolabs
Contributor

Upgrade from 22.1 to 22.2 got "stuck" on 22.1-18 (UpgradeSequenceToBeReferencedByID) for most of the upgrade duration, waiting for leases to expire:

I231022 19:07:49.670009 486451 sql/catalog/lease/lease.go:163 ⋮ [n4,job=‹MIGRATION id=910725182788567044›,upgrade=22.1-18] 767 waiting for 4 leases to expire: desc=[{‹replication_constraint_stats› 25 4}]
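
When triaging similar failures, lines like the one above can be pulled out of the node logs with a small script. This is a rough sketch in Python, not an existing tool. It assumes the message layout shown in this log line, and it assumes the three fields inside `desc=[{...}]` are the descriptor name, ID, and version. `parse_lease_wait` is a hypothetical helper name.

```python
import re

# Pattern modeled on the single log line quoted in this issue.
# The redaction markers (‹ ›) around the descriptor name are optional.
LEASE_WAIT_RE = re.compile(
    r"\[(?P<node>n\d+),.*?upgrade=(?P<upgrade>[\d.]+-\d+)\].*?"
    r"waiting for (?P<count>\d+) leases to expire: "
    r"desc=\[\{‹?(?P<name>[^›\s]+)›? (?P<id>\d+) (?P<version>\d+)\}\]"
)

SAMPLE = (
    "I231022 19:07:49.670009 486451 sql/catalog/lease/lease.go:163 ⋮ "
    "[n4,job=‹MIGRATION id=910725182788567044›,upgrade=22.1-18] 767 "
    "waiting for 4 leases to expire: desc=[{‹replication_constraint_stats› 25 4}]"
)


def parse_lease_wait(line):
    """Return the lease-wait details from a log line, or None if it does not match."""
    m = LEASE_WAIT_RE.search(line)
    if m is None:
        return None
    return {
        "node": m.group("node"),
        "upgrade": m.group("upgrade"),
        "leases": int(m.group("count")),
        "descriptor": m.group("name"),
        "descriptor_id": int(m.group("id")),       # assumed meaning
        "descriptor_version": int(m.group("version")),  # assumed meaning
    }
```

Calling `parse_lease_wait(SAMPLE)` yields the node (`n4`), the upgrade step (`22.1-18`), the number of outstanding leases (4), and the descriptor involved (`replication_constraint_stats`), which makes it easier to count how long a given step spent waiting.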

We have seen this type of issue before: see #92995 and #84382. The latter already tracks the underlying problem, and it's unclear whether any fix for it could be backported.

For the test itself, we could lower the lease duration for tpcc/mixed-headroom tests, but the test hasn't been flaky enough to justify that change yet (that could change in the future). For now, I'm closing the issue.

Status: Done
2 participants