Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

github.com/cockroachdb/pebble/internal/metamorphic: TestMeta failed #3556

Closed
cockroach-teamcity opened this issue Apr 26, 2024 · 19 comments · Fixed by #3589
Closed

Comments

@cockroach-teamcity
Copy link
Member

github.com/cockroachdb/pebble/internal/metamorphic.TestMeta failed with artifacts on refs/heads/master @ 282f0c3b0396:

=== RUN   TestMeta
--- FAIL: TestMeta (10.60s)
=== RUN   TestMeta/compare
    --- FAIL: TestMeta/compare (0.20s)
=== RUN   TestMeta/compare/random-019
        --- FAIL: TestMeta/compare/random-019 (0.00s)
panic: runtime error: index out of range [3260] with length 3254 [recovered]
	panic: runtime error: index out of range [3260] with length 3254

goroutine 253 [running]:
testing.tRunner.func1.2({0xe036e0, 0xc00011a0a8})
	GOROOT/src/testing/testing.go:1631 +0x24a
testing.tRunner.func1()
	GOROOT/src/testing/testing.go:1634 +0x377
panic({0xe036e0?, 0xc00011a0a8?})
	GOROOT/src/runtime/panic.go:770 +0x132
github.com/cockroachdb/pebble/metamorphic.reorderHistory({0xc00016c000, 0xcb6, 0xc0000444c0?})
	github.com/cockroachdb/pebble/metamorphic/external/com_github_cockroachdb_pebble/metamorphic/history.go:209 +0x105
github.com/cockroachdb/pebble/metamorphic.RunAndCompare.func4.2(0xc000492000)
	github.com/cockroachdb/pebble/metamorphic/external/com_github_cockroachdb_pebble/metamorphic/meta.go:302 +0xec
testing.tRunner(0xc000492000, 0xc000484000)
	GOROOT/src/testing/testing.go:1689 +0xfb
created by testing.(*T).Run in goroutine 33
	GOROOT/src/testing/testing.go:1742 +0x390
Help

This test on roachdash | Improve this report!

@jbowens
Copy link
Collaborator

jbowens commented Apr 26, 2024

// 06:43:02.008 INFO: [JOB 563] ingesting: sstable created 000522
// 06:43:02.008 INFO: [JOB 563] ingesting: sstable created 000523
// 06:43:02.010 FATAL: checker failed with error: cannot read at offset 17179869180 and length 4 from block of length 24

Weird. 17179869180 as a 64-bit word is 0000000000000000000000000000001111111111111111111111111111111100.

In addition to the actual cause of the failure, it looks like there's an issue with the metamorphic test where the FATAL didn't cause the execution subtest to fail. It only failed once it went to compare and the number of operations didn't match.

@RaduBerinde
Copy link
Member

Thanks, this will be interesting. I remember noticing that the meta logger only logs on Fatal, I'll fix that too.

@RaduBerinde
Copy link
Member

Could not repro locally after ~3hrs.

@RaduBerinde
Copy link
Member

#3559 fixed the test to error during execution.

@cockroach-teamcity
Copy link
Member Author

github.com/cockroachdb/pebble/internal/metamorphic.TestMeta failed with artifacts on refs/heads/master @ ef5c43072862:

        snap13.Get("wlrtbdalwbfe") // [""] pebble: not found #1819
        snap13.Get("swvcs") // [""] pebble: not found #1820
        db1.LogData("s") // <nil> #1821
        snap13.Get("swvcs") // [""] pebble: not found #1822
        db1.RangeKeySet("lorlmrpl", "yjyxaj", "@18", "nczgyfhrcffaia") // <nil> #1823
        db1.Set("xqwb@11", "qjohcqmyogmjvsl") // <nil> #1824
        db1.RangeKeySet("jucl", "upypbssaajao", "@84", "rjc") // <nil> #1825
        // 08:15:42.693 INFO: [JOB 153] ingesting: sstable created 000241
        // 08:15:42.693 INFO: [JOB 153] ingesting: sstable created 000240
        // 08:15:42.693 INFO: [JOB 153] ingesting: sstable created 000242
        // 08:15:42.693 INFO: [JOB 154] flushing 1 memtable (738B) to L0
        // 08:15:42.693 INFO: [JOB 154] flushing: sstable created 000243
        // 08:15:42.694 INFO: [JOB 154] flushing: sstable created 000244
        // 08:15:42.694 INFO: [JOB 154] flushing: sstable created 000245
        // 08:15:42.694 INFO: [JOB 154] flushing: sstable created 000246
        // 08:15:42.694 INFO: [JOB 154] flushing: sstable created 000247
        // 08:15:42.695 INFO: [JOB 154] flushed 1 memtable (738B) to L0 [000243 000244 000245 000246 000247] (5.5KB), in 0.0s (0.0s total), output rate 8.0MB/s
        // 08:15:42.696 INFO: [JOB 153] ingested L0:000241 (773B), L0:000240 (776B), L6:000242 (774B)
        db1.IngestExternalFiles(external0, "xoyakdg" /* start */, "xoyakdgoo" /* end */, "" /* syntheticSuffix */, "xo" /* syntheticPrefix */, external0, "kfyakdg" /* start */, "kfyakdga" /* end */, "@87" /* syntheticSuffix */, "kf" /* syntheticPrefix */, external0, "zyakdg" /* start */, "zyakdgfd" /* end */, "" /* syntheticSuffix */, "z" /* syntheticPrefix */) // <nil> #1826
        batch12 = db1.NewBatch() #1827
        // 08:15:42.696 INFO: [JOB 156] ingesting: sstable created 000248
        // 08:15:42.698 INFO: [JOB 156] ingested L0:000248 (778B)
        db1.IngestExternalFiles(external0, "betyakdg" /* start */, "betyakdgvd" /* end */, "" /* syntheticSuffix */, "bet" /* syntheticPrefix */) // <nil> #1828
        db1.Set("yjyxaj@23", "jerblsojjoawuvwldvd") // <nil> #1829
        snap13.Get("wlrtbdalwbfe@10") // ["tgzmcajzagen"] <nil> #1830
        snap13.Get("uzig") // [""] pebble: not found #1831
        // 08:15:42.698 INFO: [JOB 157] ingesting: sstable created 000249
        // 08:15:42.699 INFO: [JOB 157] ingested L0:000249 (772B)
        db1.IngestExternalFiles(external0, "yakdg" /* start */, "yakdgukh" /* end */, "" /* syntheticSuffix */, "" /* syntheticPrefix */) // <nil> #1832
        // 08:15:42.699 INFO: [JOB 158] ingesting: sstable created 000250
        // 08:15:42.700 INFO: [JOB 158] ingested L0:000250 (771B)
        db1.IngestExternalFiles(external0, "yakdg" /* start */, "yakdgo" /* end */, "@88" /* syntheticSuffix */, "" /* syntheticPrefix */) // <nil> #1833
        // 08:15:42.701 INFO: [JOB 159] ingesting: sstable created 000251
        // 08:15:42.702 INFO: [JOB 159] ingested L0:000251 (774B)
        db1.IngestExternalFiles(external0, "oyakdg" /* start */, "oyakdguxi" /* end */, "" /* syntheticSuffix */, "o" /* syntheticPrefix */) // <nil> #1834
        db1.Ingest(batch12) // <nil> #1835
        // 08:15:42.702 INFO: [JOB 160] ingesting: sstable created 000252
        // 08:15:42.702 INFO: [JOB 160] ingesting: sstable created 000253
        // 08:15:42.703 INFO: [JOB 160] ingested L0:000252 (775B), L0:000253 (772B)
        db1.IngestExternalFiles(external0, "ldfkyakdg" /* start */, "ldfkyakdgbnb" /* end */, "@89" /* syntheticSuffix */, "ldfk" /* syntheticPrefix */, external0, "yakdg" /* start */, "yakdgz" /* end */, "" /* syntheticSuffix */, "" /* syntheticPrefix */) // <nil> #1836
        db1.Set("mjpzbrpk", "blhe") // <nil> #1837
        db1.Set("brege@1", "elpfdsqlsuyhnffx") // <nil> #1838
        db1.Set("tghwk@35", "aty") // <nil> #1839
        // 08:15:42.703 INFO: [JOB 161] ingesting: sstable created 000254
        // 08:15:42.713 FATAL: checker failed with error: cannot read at offset 17179869180 and length 4 from block of length 156
        // 08:15:42.713 INFO: [JOB 161] ingested L0:000254 (772B)
        db1.IngestExternalFiles(external0, "yakdg" /* start */, "yakdgyc" /* end */, "" /* syntheticSuffix */, "" /* syntheticPrefix */) // <nil> #1840
        
        To reduce:  go test ./internal/metamorphic -tags invariants -run 'TestMeta$' --run-dir /artifacts/meta/240427-081531.8041174144291/random-010 --try-to-reduce -v
        --- FAIL: TestMeta/execution/random-010 (0.56s)
Help

To reproduce, try:

go test -tags 'invariants' -exec 'stress -p 1' -timeout 0 -test.v -run 'TestMeta$' ./internal/metamorphic -seed 1714205731804912259 -ops "uniform:5000-10000"

This test on roachdash | Improve this report!

@jbowens
Copy link
Collaborator

jbowens commented Apr 27, 2024

I suspect this is somehow related to the refactor of the level checker and levelIter. The offset is 17179869180 again, trying to read from another small block—maybe a range deletion block?

RaduBerinde added a commit to RaduBerinde/pebble that referenced this issue Apr 27, 2024
We now panic in invariants mode. This might help debug cockroachdb#3556.
@RaduBerinde
Copy link
Member

A stack trace might help: #3562

@RaduBerinde
Copy link
Member

maybe a range deletion block

The error is generated in valueBlockReader.getBlockHandle.

It's possible we may be using a value after we are supposed to (after the iterator that produced it moved along).

@RaduBerinde
Copy link
Member

Trying to reproduce the run above: 240427-081531.8041174144291/random-010, no luck yet.

@RaduBerinde
Copy link
Member

17179869180 is 2^34 - 4, and 4 is the indexEntryLen. So blockNum must be 2^32-1, which is probably an uint32 underflow.

@cockroach-teamcity
Copy link
Member Author

github.com/cockroachdb/pebble/internal/metamorphic.TestMeta failed with artifacts on refs/heads/master @ ef5c43072862:

        // 05:01:32.825 INFO: [JOB 1071] validated table: 001183:[tgfxzvwv@5#1360,SET-tgfxzvwv@5#1360,SET]
        // 05:01:32.825 INFO: [JOB 1071] validated table: 001188:[btgfxzvwv@5#1363,SET-btgfxzvwv@5#1363,SET]
        // 05:01:32.825 INFO: [JOB 1071] validated table: 001189:[aoxj#0,SET-btgfxzvwv@91#0,SET]
        // 05:01:32.825 INFO: [JOB 1071] validated table: 001191:[yuntgfxzvwv@385#1364,SET-yuntgfxzvwv@385#1364,SET]
        // 05:01:32.825 INFO: [JOB 1071] validated table: 001192:[tgfxzvwv@386#1365,SET-tgfxzvwv@386#1365,SET]
        // 05:01:32.825 INFO: [JOB 1071] validated table: 001193:[nxftgfxzvwv@5#0,SET-nxftgfxzvwv@5#0,SET]
        // 05:01:32.825 INFO: [JOB 1071] validated table: 001195:[ntgfxzvwv@5#1367,SET-ntgfxzvwv@5#1367,SET]
        // 05:01:32.825 INFO: [JOB 1071] validated table: 001196:[ynfttgfxzvwv@5#1368,SET-ynfttgfxzvwv@5#1368,SET]
        // 05:01:32.826 INFO: [JOB 1072] flushing 1 memtable (307B) to L0
        // 05:01:32.826 INFO: [JOB 1072] flushing: sstable created 001201
        // 05:01:32.846 INFO: [JOB 1072] flushed 1 memtable (307B) to L0 [001201] (1.4KB), in 0.0s (0.0s total), output rate 68KB/s
        // 05:01:32.846 INFO: [JOB 1073] flushing 1 ingested table
        // 05:01:32.846 INFO: [JOB 1073] flushed 1 ingested flushable L0:001197 (863B) in 0.0s (0.0s total), output rate 7.1MB/s
        // 05:01:32.846 INFO: [JOB 1069] sstable deleted 001180
        // 05:01:32.846 INFO: [JOB 1072] WAL deleted 001176
        // 05:01:32.846 INFO: [JOB 1073] WAL deleted 001198
        // 05:01:32.846 INFO: [JOB 1068] ingested L0:001200 (859B)
        db1.IngestExternalFiles(external0, "tgfxzvwv" /* start */, "tgfxzvwvb" /* end */, "" /* syntheticSuffix */, "" /* syntheticPrefix */) // <nil> #6367
        iter64.SeekLT("tgfxzvwv@12", "") // [false] <nil> #6368
        // 05:01:32.847 INFO: [JOB 1074] ingesting: sstable created 001202
        // 05:01:32.847 INFO: [JOB 1074] ingested L0:001202 (859B)
        db1.IngestExternalFiles(external0, "tgfxzvwv" /* start */, "tgfxzvwvlbjo" /* end */, "" /* syntheticSuffix */, "" /* syntheticPrefix */) // <nil> #6369
        iter66.Last() // [false] <nil> #6370
        iter64.SetBounds("tgfxzvwv@148", "vvyoguj") // <nil> #6371
        iter64.SeekGE("tgfxzvwv@148", "") // [true,"tgfxzvwv@148",<no point>,["tgfxzvwv@148","uxbfhcisqr")=>{"@365"="uchiiy","@343"="ckdsrggxawislogbfcy"}*] <nil> #6372
        iter64.Next("") // [true,"tgfxzvwv@5","xjuivocorna",["tgfxzvwv@148","uxbfhcisqr")=>{"@365"="uchiiy","@343"="ckdsrggxawislogbfcy"}] <nil> #6373
        iter66.First() // [false] <nil> #6374
        iter66.SeekPrefixGE("sshfitrria@253") // [false] pebble: SeekPrefixGE supplied with key outside of upper bound #6375
        iter66.SeekGE("ndco@72", "") // [false] <nil> #6376
        iter66.Last() // [false] <nil> #6377
        iter64.SeekPrefixGE("wmuiyphwlnkm@213") // [false] pebble: SeekPrefixGE supplied with key outside of upper bound #6378
        iter65.First() // [true,"zsciwqjjg","oaxqivlcdtpblclr",["zsciwqjjg","zwtcfzfxefb")=>{"@25"="","@6"="mchbkqantrsrrua"}*] <nil> #6379
        iter66.Last() // [false] <nil> #6380
        iter67.Prev("") // [false] <nil> #6381
        iter64.Prev("") // [false] pebble: SeekPrefixGE supplied with key outside of upper bound #6382
        iter65.SeekLT("guob@306", "") // [false] <nil> #6383
        // 05:01:32.850 INFO: [JOB 1075] validated table: 001200:[tgfxzvwv@5#1372,SET-tgfxzvwv@5#1372,SET]
        // 05:01:32.850 INFO: [JOB 1075] validated table: 001202:[tgfxzvwv@5#1373,SET-tgfxzvwv@5#1373,SET]
        // 05:01:32.854 INFO: [JOB 1076] ingesting: sstable created 001203
        // 05:01:32.854 INFO: [JOB 1076] ingesting: sstable created 001205
        // 05:01:32.854 INFO: [JOB 1076] ingesting: sstable created 001204
        // 05:01:32.858 FATAL: checker failed with error: cannot read at offset 17179869180 and length 4 from block of length 80
        // 05:01:32.858 INFO: [JOB 1076] ingested L0:001203 (865B), L0:001205 (863B), L6:001204 (861B)
        db1.IngestExternalFiles(external0, "ngtgfxzvwv" /* start */, "ngtgfxzvwvavc" /* end */, "@388" /* syntheticSuffix */, "ng" /* syntheticPrefix */, external0, "xwtgfxzvwv" /* start */, "xwtgfxzvwvlbjo" /* end */, "" /* syntheticSuffix */, "xw" /* syntheticPrefix */, external0, "tgfxzvwv" /* start */, "tgfxzvwvj" /* end */, "@389" /* syntheticSuffix */, "" /* syntheticPrefix */) // <nil> #6384
        // 05:01:32.859 INFO: [JOB 1077] validated table: 001203:[ngtgfxzvwv@388#1374,SET-ngtgfxzvwv@388#1374,SET]
        // 05:01:32.859 INFO: [JOB 1077] validated table: 001205:[tgfxzvwv@389#1375,SET-tgfxzvwv@389#1375,SET]
        // 05:01:32.859 INFO: [JOB 1077] validated table: 001204:[xwtgfxzvwv@5#1376,SET-xwtgfxzvwv@5#1376,SET]
        
        To reduce:  go test ./internal/metamorphic -tags invariants -run 'TestMeta$' --run-dir /artifacts/meta/240428-050131.385107181975/random-025 --try-to-reduce -v
        --- FAIL: TestMeta/execution/random-025 (1.77s)
Help

To reproduce, try:

go test -tags 'invariants' -exec 'stress -p 1' -timeout 0 -test.v -run 'TestMeta$' ./internal/metamorphic -seed 1714280491385529514 -ops "uniform:5000-10000"

This test on roachdash | Improve this report!

RaduBerinde added a commit that referenced this issue Apr 28, 2024
We now panic in invariants mode. This might help debug #3556.
@cockroach-teamcity
Copy link
Member Author

github.com/cockroachdb/pebble/internal/metamorphic.TestMeta failed with artifacts on refs/heads/master @ 5d4c28502e75:

        // 04:11:58.974 INFO: [JOB 168] ingesting: sstable created 000542
        // 04:11:58.975 INFO: [JOB 169] WAL created 000543
        // 04:11:58.977 INFO: [JOB 168] ingested as flushable 000541 (751B), 000542 (763B)
        db1.IngestExternalFiles(external0, "aaap" /* start */, "borknxyyyyt" /* end */, "@155" /* syntheticSuffix */, "" /* syntheticPrefix */, external0, "borknxyyyyt" /* start */, "ridb" /* end */, "" /* syntheticSuffix */, "" /* syntheticPrefix */) // <nil> #4207
        // 04:11:58.977 INFO: [JOB 171] flushing 1 memtable (423B) to L0
        // 04:11:58.977 INFO: [JOB 171] flushing: sstable created 000545
        // 04:11:58.978 INFO: [JOB 166] sstable deleted 000491
        // 04:11:58.978 INFO: [JOB 166] sstable deleted 000493
        iter44.SeekPrefixGE("nsveaaozr@85") // [false] <nil> #4205
        iter40.SeekPrefixGE("iwjytfvso@57") // [false] pebble: SeekPrefixGE supplied with key outside of lower bound #4206
        iter44.SetBounds("rerj@24", "zvsr@4") // <nil> #4208
        // 04:11:58.979 INFO: [JOB 170] WAL created 000544
        // 04:11:58.981 INFO: [JOB 166] sstable deleted 000495
        iter44.SeekLT("zvsr@4", "") // [true,"ztdkaaw@16","uqlxjjjgkhlwybe",["zmkfcpj","zvsr")=>{"@6"="ntzxhrywdiv"}*] <nil> #4209
        db1.Set("vkrmbs@153", "vjfrenrsgbryjnvzas") // <nil> #4212
        iter37.Next("") // [false] pebble: SeekPrefixGE supplied with key outside of lower bound #4214
        iter44.Next("") // [false] <nil> #4210
        db1.Get("seevyc@66") // [""] pebble: not found #4215
        iter44.Next("") // [false] <nil> #4211
        // 04:11:58.981 INFO: [JOB 171] MANIFEST created 000546
        // 04:11:58.982 INFO: [JOB 171] flushed 1 memtable (423B) to L0 [000545] (1.4KB), in 0.0s (0.0s total), output rate 1.7MB/s
        db1.RangeKeySet("isvwv", "lxpn", "@147", "") // <nil> #4218
        iter37.SeekGE("tsbufistvtbx", "") // [false] <nil> #4220
        iter37.SeekPrefixGE("nzazygy") // [false] pebble: SeekPrefixGE supplied with key outside of lower bound #4222
        // 04:11:58.982 INFO: [JOB 171] WAL deleted 000524
        // 04:11:58.982 INFO: [JOB 171] MANIFEST deleted 000537
        iter44.SeekLT("aaap", "") // [false] <nil> #4213
        // 04:11:58.982 INFO: [JOB 172] flushing 2 ingested tables
        iter44.First() // [true,"uzmjapdks","vgverdpe",["uzmjapdks","vkrmbs")=>{"@111"="umd","@94"="ghguzocekv","@88"="n","@87"="arqskcawxyrtfcvovacv","@58"="ybmsypwhiincuvvx"}*] <nil> #4216
        iter40.SeekGE("isvwv@41", "wgybvifqy") // [valid,"lescdhhyaxb@66","andinstgafymedi",<no range>] <nil> #4217
        iter40.SeekPrefixGE("ketflyjs@14") // [false] pebble: SeekPrefixGE supplied with key outside of lower bound #4219
        iter44.Next("") // [true,"uzmjapdks@73","avvuvyhbcu",["uzmjapdks","vkrmbs")=>{"@111"="umd","@94"="ghguzocekv","@88"="n","@87"="arqskcawxyrtfcvovacv","@58"="ybmsypwhiincuvvx"}] <nil> #4221
        iter44.SeekGE("gfxhnwuukl@138", "") // [true,"uzmjapdks","vgverdpe",["uzmjapdks","vkrmbs")=>{"@111"="umd","@94"="ghguzocekv","@88"="n","@87"="arqskcawxyrtfcvovacv","@58"="ybmsypwhiincuvvx"}] <nil> #4224
        iter40.SeekGE("bbrekxu@12", "mmakibsxxkuw@71") // [valid,"lescdhhyaxb@66","andinstgafymedi",<no range>] <nil> #4225
        iter44.SeekLT("xhegfxsok@153", "ridb@4") // [valid,"xczlfkaaw@16","uqlxjjjgkhlwybe",["wgybvifqy","zmkfcpj")=>{"@58"="ybmsypwhiincuvvx"}*] <nil> #4226
        iter44.SeekLT("xhvbwzuvw@13", "") // [true,"xhvbwzuvw@15","hccuvtbsfemjafsvuco",["wgybvifqy","zmkfcpj")=>{"@58"="ybmsypwhiincuvvx"}] <nil> #4234
        iter40.SeekLT("xledp@10", "") // [true,"mtcovkaaw@92","uqlxjjjgkhlwybe",<no range>] <nil> #4235
        iter44.SeekLT("yjigi@124", "") // [true,"xledp@47","alhpaqbu",["wgybvifqy","zmkfcpj")=>{"@58"="ybmsypwhiincuvvx"}] <nil> #4240
        iter44.First() // [true,"uzmjapdks","vgverdpe",["uzmjapdks","vkrmbs")=>{"@111"="umd","@94"="ghguzocekv","@88"="n","@87"="arqskcawxyrtfcvovacv","@58"="ybmsypwhiincuvvx"}*] <nil> #4242
        iter40.SeekPrefixGE("uzmjapdks@4") // [false] pebble: SeekPrefixGE supplied with key outside of upper bound #4244
        db1.Set("nlnfjo@8", "gwvvjfiukl") // <nil> #4223
        iter37.SeekLT("afbndhfe", "") // [false] <nil> #4227
        // 04:11:58.991 INFO: [JOB 174] compacting(move) L0 [000545] (1.4KB) Score=100.00 + L2 [] (0B) Score=0.00; OverlappingRatio: Single 0.00, Multi 0.00
        db1.IngestExternalFiles(external0, "pvklohqdz" /* start */, "pvklqbqgplc" /* end */, "@156" /* syntheticSuffix */, "pvkl" /* syntheticPrefix */) // no-op #4228
        db1.IngestExternalFiles(external0, "vkrmbs" /* start */, "vxmigiavzuay" /* end */, "" /* syntheticSuffix */, "" /* syntheticPrefix */) // no-op #4229
        iter37.InternalNext() // <nil> #4230
        // 04:11:58.991 INFO: [JOB 172] MANIFEST created 000549
        
        To reduce:  go test ./internal/metamorphic -tags invariants -run 'TestMeta$' --run-dir /artifacts/meta/240429-041157.0241021847022/random-014 --try-to-reduce -v
        --- FAIL: TestMeta/execution/random-014 (1.15s)
Help

To reproduce, try:

go test -tags 'invariants' -exec 'stress -p 1' -timeout 0 -test.v -run 'TestMeta$' ./internal/metamorphic -seed 1714363917024013813 -ops "uniform:5000-10000"

This test on roachdash | Improve this report!

@jbowens
Copy link
Collaborator

jbowens commented Apr 29, 2024

Got a stack trace:

===== SEED =====
1714363917024013813
===== ERR =====
exit status 2
===== OUT =====
panic: index entry out of bounds: offset 17179869180 length 4 block length 132

goroutine 1786 [running]:
github.com/cockroachdb/pebble/internal/base.AssertionFailedf(...)
	github.com/cockroachdb/pebble/internal/base/external/com_github_cockroachdb_pebble/internal/base/error.go:38
github.com/cockroachdb/pebble/sstable.(*valueBlockReader).getBlockHandle(0x7ff31c001b51?, 0x82298e?)
	github.com/cockroachdb/pebble/sstable/external/com_github_cockroachdb_pebble/sstable/value_block.go:937 +0x313
github.com/cockroachdb/pebble/sstable.(*valueBlockReader).getValueInternal(0xc000d8a360, {0x7ff31c001b51?, 0x0?, 0xc0008fe788?}, 0x2)
	github.com/cockroachdb/pebble/sstable/external/com_github_cockroachdb_pebble/sstable/value_block.go:912 +0x13d
github.com/cockroachdb/pebble/sstable.(*valueBlockReader).Fetch(0xc000d8a360, {0x7ff31c001b51?, 0x2?, 0x0?}, 0x1c001b40?, {0x0?, 0x4?, 0xc0008fe870?})
	github.com/cockroachdb/pebble/sstable/external/com_github_cockroachdb_pebble/sstable/value_block.go:861 +0x371
github.com/cockroachdb/pebble/internal/base.(*LazyValue).fetchValue(0x8?, {0x0?, 0x4?, 0x8?})
	github.com/cockroachdb/pebble/internal/base/external/com_github_cockroachdb_pebble/internal/base/lazy_value.go:212 +0x67
github.com/cockroachdb/pebble/internal/base.(*LazyValue).Value(0xc000298310?, {0x0?, 0x1c90f?, 0xc000298380?})
	github.com/cockroachdb/pebble/internal/base/external/com_github_cockroachdb_pebble/internal/base/lazy_value.go:204 +0x1f
github.com/cockroachdb/pebble.(*simpleMergingIter).handleVisiblePoint(0xc0008ff518, 0xc000440300, 0xc0004400c0)
	github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/level_checker.go:226 +0x285
github.com/cockroachdb/pebble.(*simpleMergingIter).step(0xc0008ff518)
	github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/level_checker.go:133 +0xd5
github.com/cockroachdb/pebble.checkLevelsInternal(0xc000a783c0)
	github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/level_checker.go:659 +0xd25
github.com/cockroachdb/pebble.(*DB).CheckLevels(0xc000c22488, 0x0)
	github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/level_checker.go:577 +0x157
github.com/cockroachdb/pebble.DebugCheckLevels(...)
	github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/options.go:1121
github.com/cockroachdb/pebble/metamorphic.defaultOptions.func1(0xc000c22488)
	github.com/cockroachdb/pebble/metamorphic/external/com_github_cockroachdb_pebble/metamorphic/options.go:310 +0xe5
github.com/cockroachdb/pebble/metamorphic.(*Test).init.func1.1()
	github.com/cockroachdb/pebble/metamorphic/external/com_github_cockroachdb_pebble/metamorphic/test.go:104 +0x1b
github.com/cockroachdb/pebble/metamorphic.withRetries(0xc0008ff730, 0xeb28f0)
	github.com/cockroachdb/pebble/metamorphic/external/com_github_cockroachdb_pebble/metamorphic/retryable.go:29 +0x27
github.com/cockroachdb/pebble/metamorphic.(*Test).withRetries(...)
	github.com/cockroachdb/pebble/metamorphic/external/com_github_cockroachdb_pebble/metamorphic/test.go:249
github.com/cockroachdb/pebble/metamorphic.(*Test).init.func1(0xc000c22558?)
	github.com/cockroachdb/pebble/metamorphic/external/com_github_cockroachdb_pebble/metamorphic/test.go:104 +0x4d
github.com/cockroachdb/pebble.(*DB).updateReadStateLocked(0xc000c22488, 0xc000012348)
	github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/read_state.go:99 +0x196
github.com/cockroachdb/pebble.(*DB).flush1(0xc000c22488)
	github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/compaction.go:1602 +0xd36
github.com/cockroachdb/pebble.(*DB).flush.func1({0xf8d8c0?, 0xc00079fc50?})
	github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/compaction.go:1217 +0xee
runtime/pprof.Do({0xf8d7a8?, 0x1573520?}, {{0xc0001282c0?, 0xc000c22488?, 0xc000ad6000?}}, 0xc000ef77a8)
	GOROOT/src/runtime/pprof/runtime.go:51 +0x9d
github.com/cockroachdb/pebble.(*DB).flush(0xc000ef77b8?)
	github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/compaction.go:1210 +0x52
created by github.com/cockroachdb/pebble.(*DB).maybeScheduleFlush in goroutine 1779
	github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/compaction.go:1126 +0xdb

@RaduBerinde
Copy link
Member

decodeRemainingValueHandle doesn't check bounds, it's possible that's how we get a bogus value

@RaduBerinde
Copy link
Member

RaduBerinde commented Apr 29, 2024

A bunch of FFs would decode to the max uint32 value for the block num, and the invalidating iterator trashes values with FFs.

@cockroach-teamcity
Copy link
Member Author

github.com/cockroachdb/pebble/internal/metamorphic.TestMeta failed with artifacts on refs/heads/master @ 3a03bff01359:

        iter56.Last() // [false] <nil> #5753
        iter56.Prev("") // [false] <nil> #5754
        db1.Delete("emcgdnfhrh@24") // <nil> #5752
        db1.Set("rabdw@5", "ekhfwbqvyvdfikzev") // <nil> #5755
        db1.Merge("rkqfchse@116", "yokdbh") // <nil> #5756
        snap40.Get("nmojshlyiqv@75") // [""] pebble: not found #5757
        iter56.SetBounds("pzzhzcizw@32", "qhmhdkccm@32") // <nil> #5758
        iter56.SeekLT("qhmhdkccm@32", "") // [false] <nil> #5759
        iter56.Next("") // [false] <nil> #5760
        iter56.Prev("") // [false] <nil> #5762
        iter56.SetBounds("kbeetlkzq", "kbeetlkzq") // <nil> #5764
        iter56.SeekGE("kbeetlkzq", "") // [false] <nil> #5765
        iter56.Prev("") // [false] <nil> #5766
        iter56.Next("") // [false] <nil> #5767
        iter56.Last() // [false] <nil> #5770
        iter56.SeekLT("jnne@8", "cyblwdfcoofa@2") // [invalid] <nil> #5773
        iter56.Next("iwscglbval@7") // [invalid] <nil> #5777
        iter56.SeekGE("cgephctdqlw@8", "") // [false] <nil> #5781
        iter56.SetBounds("zzuhxmdao@125", "zzuhxmdao@125") // <nil> #5787
        iter56.SeekGE("zzuhxmdao@125", "") // [false] <nil> #5788
        iter56.Prev("") // [false] <nil> #5789
        iter56.Next("") // [false] <nil> #5790
        iter56.SeekGE("hfydhkzuh@3", "") // [false] <nil> #5791
        iter56.First() // [false] <nil> #5798
        iter56.SetBounds("kcnkl", "kcnkl") // <nil> #5809
        iter56.SeekLT("kcnkl", "") // [false] <nil> #5810
        iter56.Next("") // [false] <nil> #5811
        iter56.SeekLT("vbnkv@8", "") // [false] <nil> #5813
        iter56.SetBounds("zrcj", "zzuvnouykgp") // <nil> #5827
        iter56.SeekPrefixGE("jekglvprjfb") // [false] pebble: SeekPrefixGE supplied with key outside of lower bound #5836
        iter56.SetBounds("qslff@8", "qwmwizxws@4") // <nil> #5838
        iter56.SeekLT("qwmwizxws@4", "") // [false] <nil> #5839
        iter56.Next("") // [false] <nil> #5840
        iter56.SeekPrefixGE("gluov@7") // [false] pebble: SeekPrefixGE supplied with key outside of lower bound #5846
        iter56.SeekGE("jjmycast@6", "") // [false] <nil> #5847
        iter56.Prev("") // [false] <nil> #5849
        iter56.First() // [false] <nil> #5853
        iter56.First() // [false] <nil> #5855
        iter56.Prev("") // [false] <nil> #5856
        iter56.SeekPrefixGE("hevgjbpdm@5") // [false] pebble: SeekPrefixGE supplied with key outside of lower bound #5859
        iter56.Next("byglsmtwseyj") // [invalid] cannot use limit with prefix iteration #5862
        iter56.First() // [false] <nil> #5867
        db1.DeleteRange("jekglvprjfb", "khcrhgihpzf@47") // <nil> #5761
        db1.IngestExternalFiles(external0, "aocdjdzbqwe" /* start */, "aocdjdzcam" /* end */, "" /* syntheticSuffix */, "" /* syntheticPrefix */) // no-op #5763
        // 07:15:47.963 INFO: [JOB 430] ingesting: sstable created 000432
        // 07:15:47.963 INFO: [JOB 431] WAL created 000433
        // 07:15:47.963 INFO: [JOB 432] WAL created 000434
        
        To reduce:  go test ./internal/metamorphic -tags invariants -run 'TestMeta$' --run-dir /artifacts/meta/240430-071542.6773977009498/random-005 --try-to-reduce -v
        --- FAIL: TestMeta/execution/random-005 (0.92s)
Help

To reproduce, try:

go test -tags 'invariants' -exec 'stress -p 1' -timeout 0 -test.v -run 'TestMeta$' ./internal/metamorphic -seed 1714461342677962892 -ops "uniform:5000-10000"

This test on roachdash | Improve this report!

@jbowens
Copy link
Collaborator

jbowens commented Apr 30, 2024

It seems suspicious that the level checker clones the key but not the value:

		item.key = base.InternalKey{
			Trailer: l.iterKV.K.Trailer,
			UserKey: append(item.key.UserKey[:0], l.iterKV.K.UserKey...),
		}
		item.value = l.iterKV.V

but I'm not actually seeing the sequence where item.value is not iterKV.V and !item.KV.K.IsExclusiveSentinel() (which is a requirement to trying to read item.value)

@RaduBerinde
Copy link
Member

I can't find a sequence either.. we never Next a levelIter without updating the heap.. I'm trying to repro with MaybeWrapIfInvariants changed to always wrap, no luck yet.

@cockroach-teamcity
Copy link
Member Author

github.com/cockroachdb/pebble/internal/metamorphic.TestMeta failed with artifacts on refs/heads/master @ 8225791375fe:

        iter46.SetBounds("brcmzs@49", "cmjifhcfeghb@60") // <nil> #4719
        iter51.Last() // [true,"pbmsxbrcmzs@2","hun",<no range>] <nil> #4714
        iter47.SeekLT("arjcxfib@81", "") // [false] <nil> #4590
        iter51.Next("") // [false] <nil> #4717
        iter46.SeekLT("cmjifhcfeghb@60", "") // [true,"brvfjmmj",<no point>,["brvfjmmj","cmjifhcfeghb@60")=>{""="iedldakiwhrxyzl","@277"="mifxcmktznolyqnf","@185"="","@93"=""}*] <nil> #4720
        iter51.SetBounds("blts@231", "cmjifhcfeghb") // <nil> #4729
        iter46.Next("") // [false] <nil> #4721
        iter46.Next("") // [false] <nil> #4722
        iter51.SeekLT("cmjifhcfeghb", "") // [true,"brcmzscnhheeuu@268","wbyuiqjdz",<no range>] <nil> #4730
        iter46.Prev("") // [true,"brvfjmmj",<no point>,["brvfjmmj","cmjifhcfeghb@60")=>{""="iedldakiwhrxyzl","@277"="mifxcmktznolyqnf","@185"="","@93"=""}*] <nil> #4723
        iter52.SeekLT("rvnf@5", "ecgreya@74") // [valid,"ohxyt@187","",<no range>] <nil> #4724
        iter52.Prev("") // [true,"ofgoyiihkemm@283","linf",<no range>] <nil> #4725
        iter46.SeekLT("ucvyq@16", "") // [true,"brvfjmmj",<no point>,["brvfjmmj","cmjifhcfeghb@60")=>{""="iedldakiwhrxyzl","@277"="mifxcmktznolyqnf","@185"="","@93"=""}] <nil> #4735
        iter52.SeekPrefixGE("qxkeshh") // [false] pebble: SeekPrefixGE supplied with key outside of upper bound #4736
        iter52.SetBounds("ohxyt@141", "thmvm") // <nil> #4744
        iter46.SetBounds("brcmzs@49", "brcmzs@49") // <nil> #4745
        snap39.Get("laqabn") // [""] pebble: not found #4595
        iter47.Prev("") // [false] <nil> #4598
        iter47.SeekGE("uolg", "") // [false] <nil> #4602
        iter47.SeekGE("cmjifhcfeghb@49", "") // [true,"tbejmdic@257","zqrrwbacd",<no range>] <nil> #4610
        iter47.SeekLT("qhrrm@242", "") // [false] <nil> #4611
        iter47.SetBounds("ucvyq@162", "ucvyq@162") // <nil> #4617
        iter47.SeekLT("ucvyq@162", "") // [false] <nil> #4618
        iter47.Next("") // [false] <nil> #4619
        iter47.Prev("") // [false] <nil> #4620
        iter47.SeekGE("rvnf@70", "") // [false] <nil> #4623
        iter51.Next("") // [false] <nil> #4731
        iter51.Next("") // [false] <nil> #4732
        iter51.SeekLT("blts@75", "") // [false] <nil> #4743
        iter51.Last() // [true,"brcmzscnhheeuu@268","wbyuiqjdz",<no range>] <nil> #4751
        iter51.Prev("oiar@193") // [invalid] <nil> #4752
        iter51.First() // [true,"brcmzs","bl",<no range>] <nil> #4754
        iter46.SeekGE("brcmzs@49", "") // [false] <nil> #4746
        iter46.Prev("") // [false] <nil> #4747
        iter46.Next("") // [false] <nil> #4748
        iter52.SeekPrefixGE("laqabn@212") // [false] pebble: SeekPrefixGE supplied with key outside of lower bound #4750
        iter46.Next("") // [false] <nil> #4757
        iter52.SetBounds("ucvyq@12", "wtkfmfnlqpuj@327") // <nil> #4759
        iter52.SeekGE("ucvyq@12", "") // [true,"ucvyq@12","evugzxjzuvh",<no range>] <nil> #4760
        iter52.Next("") // [true,"uewki@267","hirlvqshe",<no range>] <nil> #4761
        // 07:23:33.531 INFO: [JOB 105] ingesting: sstable created 001049
        // 07:23:33.532 INFO: [JOB 106] WAL created 001050
        // 07:23:33.532 INFO: [JOB 107] flushing 1 memtable (215B) to L0
        // 07:23:33.532 INFO: [JOB 107] flushing: sstable created 001051
        // 07:23:33.533 INFO: [JOB 107] flushing: sstable created 001052
        // 07:23:33.535 INFO: [JOB 107] flushed 1 memtable (215B) to L0 [001051 001052] (2.0KB), in 0.0s (0.0s total), output rate 924KB/s
        // 07:23:33.537 INFO: [JOB 107] WAL deleted 001027
        
        To reduce:  go test ./internal/metamorphic -tags invariants -run 'TestMeta$' --run-dir /artifacts/meta/240503-072330.6991421749388/random-017 --try-to-reduce -v
        --- FAIL: TestMeta/execution/random-017 (1.87s)
Help

To reproduce, try:

go test -tags 'invariants' -exec 'stress -p 1' -timeout 0 -test.v -run 'TestMeta$' ./internal/metamorphic -seed 1714721010699391338 -ops "uniform:5000-10000"

This test on roachdash | Improve this report!

jbowens added a commit to jbowens/pebble that referenced this issue May 6, 2024
Recent refactors made it possible for the level checker to visit a point key
twice if the point key's levelIter interleaved a boundary sentinel key. The
corresponding level's simpleMergingIterItem's key and value were not updated,
leaving the old point key on the heap. When the point key remained at the top
of the heap, but the underlying levelIter had moved on, it was possible to
access freed memory associated with the old point key's value.

Fix cockroachdb#3556.
jbowens added a commit to jbowens/pebble that referenced this issue May 6, 2024
Recent refactors made it possible for the level checker to visit a point key
twice if the point key's levelIter interleaved a boundary sentinel key. The
corresponding level's simpleMergingIterItem's key and value were not updated,
leaving the old point key on the heap. When the point key remained at the top
of the heap, but the underlying levelIter had moved on, it was possible to
access freed memory associated with the old point key's value.

Fix cockroachdb#3556.
jbowens added a commit to jbowens/pebble that referenced this issue May 6, 2024
Recent refactors made it possible for the level checker to visit a point key
twice if the point key's levelIter interleaved a boundary sentinel key. The
corresponding level's simpleMergingIterItem's key and value were not updated,
leaving the old point key on the heap. When the point key remained at the top
of the heap, but the underlying levelIter had moved on, it was possible to
access freed memory associated with the old point key's value.

Fix cockroachdb#3556.
jbowens added a commit that referenced this issue May 6, 2024
Recent refactors made it possible for the level checker to visit a point key
twice if the point key's levelIter interleaved a boundary sentinel key. The
corresponding level's simpleMergingIterItem's key and value were not updated,
leaving the old point key on the heap. When the point key remained at the top
of the heap, but the underlying levelIter had moved on, it was possible to
access freed memory associated with the old point key's value.

Fix #3556.
@jbowens jbowens moved this to Done in [Deprecated] Storage Jun 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants