Encoding errors when using a dictionary using zstdmt #944

mestia · 2017-12-08T16:16:51Z

Dear zstd team,
There seem to be a problem with zstd decoding as described in this bug:
#883816
I see the same problem on my system.

When using a pre-made dictionary, zstdmt will generate corrupted files:

[guus@haplo]/dev/shm/corpus>zstd --rm -D ../dictionary *
[guus@haplo]/dev/shm/corpus>zstd --rm -D ../dictionary -d *
[guus@haplo]/dev/shm/corpus>zstdmt --rm -D ../dictionary *
[guus@haplo]/dev/shm/corpus>zstdmt --rm -D ../dictionary -d *
ar,S=18008914:2,.zst : Decoding error (36) : Corrupted block detected
ar,S=6386609:2,S.zst : Decoding error (36) : Corrupted block detected
ar,S=6382007:2,S.zst : Decoding error (36) : Corrupted block detected
[...]
[guus@haplo]/dev/shm/corpus>zstd --rm -D ../dictionary -d *.zst
ar,S=18008914:2,.zst : Decoding error (36) : Corrupted block detected
ar,S=6386609:2,S.zst : Decoding error (36) : Corrupted block detected
ar,S=6382007:2,S.zst : Decoding error (36) : Corrupted block detected
[...]

Not all files are corrupt, only 1% has problems.

thank you!

The text was updated successfully, but these errors were encountered:

Cyan4973 · 2017-12-08T16:48:39Z

We would need a reproduction case to investigate.
I tried this test setup using multiple test files, but none of them failed so far.
It likely requires specific files to show up.

gsliepen · 2017-12-09T12:11:50Z

I'd like to, but the files I tried it on is my personal mailbox, so you can imagine that I do not want to share that. I could not reproduce it on maildirs from public mailing lists such as debian-devel. However, I did notice that it is always the largest files that are corrupted. Out of ~1500 files, 1000 of them are between 1 and 10 kilobyte in size, then 400 of them are between 10 and 100 kilobyte in size, and then there's the rest which goes up to 25 megabytes in size. The debian-devel mailing list doesn't have such a distribution, the largest file there is only 90 kilobytes.

Cyan4973 · 2017-12-11T07:56:56Z

I've been testing again this setup with an extended test set containing large files (> 10 MB),
but none of them failed so far.
I'm afraid I'll need a reproduction case to be able to investigate.

Correction : I was testing under dev branch.
Switching to v1.3.2, like the debian test, I can observe the bug, specifically when using zstdmt compression with dictionary on large files.
It seems the bug is already fixed in dev branch, as I cannot reproduce it using same files and dictionary. It's fairly hard to determine which change exactly helps, but I would typically look in this direction.

I would recommend testing the dev branch on the test set where you observed the issue.

In the meantime, I will also put a warning notice on v1.3.2.

gsliepen · 2017-12-11T20:27:33Z

I've managed to create a testcase I feel comfortable in sharing, that reproduces the issue 100% of the time on my machine. SInce it's too large to attach to a GitHub issue, I've temporarily made it available here:

http://tinc-vpn.org/temp/zstd-testcase.tar.xz

Note, I get this issue on a CPU with 6 cores, 12 threads.

Cyan4973 · 2017-12-11T21:21:50Z

Thanks for the reproduction case @gsliepen.
I could confirm the diagnosis : v1.3.2 fails on a few files (about a dozen, all > 4 MB),
while dev branch version works just fine for all files.
Therefore, a spurious fix seems already integrated, and will be part of next release.

In the meantime, if you urgently need a quick fix, you could select one of these options :

if using cli : compress files one by one, since the issue only shows up when multiple files are compressed in the same command.
switch to the dev branch, since the bug is no longer triggered there
disable multi-threading, since it's only useful for large files (> several MB)
disable dictionary, since it's only effective for small files (< several KB), while its impact is negligible for large files
route messages to use either dictionary (when they are small) or multi-threading (when they are large), but not both at the same time.

terrelln · 2017-12-11T22:27:20Z

git bisect with this script says that it is fixed by commit fc8d293 from PR #891.

mestia · 2017-12-12T10:32:31Z

Thank you for looking on it and for the test case. As far as I see the fix from commit fc8d293 doesn't solve the issue. I still get "Corrupted block detected" messages with the test case provided by @gsliepen.

terrelln · 2017-12-12T19:18:52Z

Thanks for testing @mestia! We believe we figured out the issue, but hadn't posted here yet. Commit fc8d293 merely hides the issue when compressing a single file, but it remains when compressing multiple files.

When compressing with multiple threads and with a dictionary, we can confuse the window size, and use a larger window size for the second chunk than the first. Since the window size of the first chunk is what is written in the frame, the second chunk can have too-large offsets.

The dev branch has hidden this issue, but the root cause isn't solved yet. @Cyan4973 is looking into it.

If you have corrupted data that you need to recover you can interpret the frame header using the zstd format, change the window size to be larger, then you should be able to decompress the data successfully. You can verify you got the header right with zstd -lv file.zst. The compressed frame contains a checksum by default so you can be confident it is correct (you can check with zstd -l file.zst).

Cyan4973 · 2017-12-12T19:29:33Z

We are considering creating a new release of zstd later in the week, once the deeper root cause is fully investigated and fixed. This bug requires several conditions to be assembled in order to show up, and indeed, the current dev branch masks the issue by making it impossible to generate all these conditions together from the cli. But it doesn't mean the root cause is properly tackled.
More to come after investigation.

in zstreamtest.

This reverts commit 5098d1f.

this test fails on v1.3.2

Fix #944

gsliepen · 2017-12-14T19:03:51Z

On Thu, Dec 14, 2017 at 06:02:02PM +0000, Yann Collet wrote: Closed #944 via e28305f.

I can confirm that that fixed the issue. Thanks!

…

-- Met vriendelijke groet / with kind regards, Guus Sliepen <[email protected]>

Cyan4973 added the bug label Dec 11, 2017

Cyan4973 added a commit that referenced this issue Dec 12, 2017

Created a test case which reliably reproduces bug #944

5098d1f

in zstreamtest.

Cyan4973 added a commit that referenced this issue Dec 12, 2017

Revert "Created a test case which reliably reproduces bug #944"

8a104fd

This reverts commit 5098d1f.

Cyan4973 added a commit that referenced this issue Dec 13, 2017

added cli test veryfying bug #944

98ccc11

this test fails on v1.3.2

Cyan4973 mentioned this issue Dec 13, 2017

Fix #944 #947

Merged

Cyan4973 closed this as completed in e28305f Dec 14, 2017

Cyan4973 added a commit that referenced this issue Dec 14, 2017

Merge pull request #947 from facebook/fix944

c005df1

Fix #944

Cyan4973 mentioned this issue Dec 15, 2017

Release v1.3.3 #954

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Encoding errors when using a dictionary using zstdmt #944

Encoding errors when using a dictionary using zstdmt #944

mestia commented Dec 8, 2017

Cyan4973 commented Dec 8, 2017 •

edited

Loading

gsliepen commented Dec 9, 2017

Cyan4973 commented Dec 11, 2017 •

edited

Loading

gsliepen commented Dec 11, 2017

Cyan4973 commented Dec 11, 2017 •

edited

Loading

terrelln commented Dec 11, 2017

mestia commented Dec 12, 2017 •

edited

Loading

terrelln commented Dec 12, 2017

Cyan4973 commented Dec 12, 2017 •

edited

Loading

gsliepen commented Dec 14, 2017 via email

Encoding errors when using a dictionary using zstdmt #944

Encoding errors when using a dictionary using zstdmt #944

Comments

mestia commented Dec 8, 2017

Cyan4973 commented Dec 8, 2017 • edited Loading

gsliepen commented Dec 9, 2017

Cyan4973 commented Dec 11, 2017 • edited Loading

gsliepen commented Dec 11, 2017

Cyan4973 commented Dec 11, 2017 • edited Loading

terrelln commented Dec 11, 2017

mestia commented Dec 12, 2017 • edited Loading

terrelln commented Dec 12, 2017

Cyan4973 commented Dec 12, 2017 • edited Loading

gsliepen commented Dec 14, 2017 via email

Cyan4973 commented Dec 8, 2017 •

edited

Loading

Cyan4973 commented Dec 11, 2017 •

edited

Loading

Cyan4973 commented Dec 11, 2017 •

edited

Loading

mestia commented Dec 12, 2017 •

edited

Loading

Cyan4973 commented Dec 12, 2017 •

edited

Loading