roachtest: oom due to WorkloadKVConverter #68965
Comments
@irfansharif is it expected that this error would bubble up? It is expected that replicas could move around. |
Looks like it's occurring for Unwinding the error trace: Lines 598 to 600 in 2556ac6
Where cockroach/pkg/migration/migrationcluster/cluster.go Lines 144 to 145 in 9157c7b
Where the closure invoked is: cockroach/pkg/migration/migrationcluster/cluster.go Lines 178 to 180 in 9157c7b
Coming from (where the RangeNotFound error is generated): cockroach/pkg/migration/migrations/separated_intents.go Lines 442 to 444 in 9157c7b
I'm not actually sure at what level this error should be handled. The |
I'm also surprised by this. A cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go Lines 2044 to 2051 in 8418f43
We're not addressing the Migrate request to a specific range or anything, and we should be routing the Migrate request through a DistSender, so how is it bubbling all the way back up to here? |
Recent failures were fixed by #72432 |
roachtest.tpcc/mixed-headroom/n5cpu16 failed with artifacts on master @ 40f11fead0a0453969634f8ddb0502c1f78b2806:
|
roachtest.tpcc/mixed-headroom/n5cpu16 failed with artifacts on master @ b450fea83a7db1e06403b2563c13f38c9284b932:
|
roachtest.tpcc/mixed-headroom/n5cpu16 failed with artifacts on master @ 3b30a0e12f9a14b08ee8ad55b50299aca50c67a2:
|
roachtest.tpcc/mixed-headroom/n5cpu16 failed with artifacts on master @ 2c014c47c1a242f504f6d595bfd79c0edc20b90a:
|
roachtest.tpcc/mixed-headroom/n5cpu16 failed with artifacts on master @ e89328d92398a3e2d6487179845a51e7f1caa435:
|
Hmm, this isn't good, n2 got oom killed
@AlexTalks could you check if the heap profiles tell us anything? Note that this cluster was running 21.2 (not upgraded to master yet), so there wasn't a |
https://share.polarsignals.com/a059987/ inuse_space: We see that the hog here is |
cc @cockroachdb/bulk-io |
Changed the title so that future test failures aren't directed at Bulk I/O. This is the first time I've seen this OOM, and I don't think it will be a failure mode exclusive to this test. |
Haven't heard anyone complain about this lately, so going to close this old DR issue. Any new |
roachtest.tpcc/mixed-headroom/n5cpu16 failed with artifacts on master @ ee3efd6b1e24a3e1676778f5028fa0a35266f683:
Reproduce
See: roachtest README
See: CI job to stress roachtests
For the CI stress job, click the ellipsis (...) next to the Run button and fill in:
* Changes / Build branch: master
* Parameters / `env.TESTS`: `^tpcc/mixed-headroom/n5cpu16$`
* Parameters / `env.COUNT`: <number of runs>
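The `env.TESTS` value is anchored with `^` and `$`, so it should select only this one test rather than any test whose name merely contains the string. A minimal sketch of that behavior, assuming the filter is interpreted as a Go regular expression matched against the test name. `matchesFilter` is a hypothetical helper for illustration and is not roachtest's actual code; the second and third test names are made up.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesFilter reports whether a test name is selected by a filter regexp.
// Hypothetical helper for illustration only; assumes unanchored regexp
// matching, so any anchoring must come from the filter itself.
func matchesFilter(filter, name string) bool {
	return regexp.MustCompile(filter).MatchString(name)
}

func main() {
	const filter = `^tpcc/mixed-headroom/n5cpu16$`
	names := []string{
		"tpcc/mixed-headroom/n5cpu16",       // the test from this issue
		"tpcc/mixed-headroom/n5cpu16-extra", // made-up name, rejected by the trailing $
		"other/tpcc/mixed-headroom/n5cpu16", // made-up name, rejected by the leading ^
	}
	for _, name := range names {
		fmt.Printf("%-36s selected=%t\n", name, matchesFilter(filter, name))
	}
}
```

Without the anchors, the same pattern would also select the two made-up names, which would widen the stress run beyond the failing test.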
Same failure on other branches
This test on roachdash | Improve this report!
Jira issue: CRDB-9386