Detecting truncated streams #1109

Closed
cemeyer opened this issue Apr 20, 2018 · 4 comments

@cemeyer
Contributor

cemeyer commented Apr 20, 2018

Hi,

zlib has a nice feature where the decompressor (inflate) returns Z_STREAM_END at the end of the stream. This works when zlib is used with or without the gzip framing (an "undocumented" feature accessed by passing a negative window size). The method is not zlib-specific and is a part of the DEFLATE standard: a special bit BFINAL is set on the final frame. (In zlib it is referred to as inflate_state::last.)

This is useful for determining if a DEFLATE stream has been truncated. Without the bit, it is impossible to determine whether a stream ending at a frame boundary is truncated or not. It would be nice if Zstd provided similar functionality. I may be mistaken, but as far as I can tell, Zstd does not tell the user via ZSTD_decompressStream() whether a stream was truncated (and the format may lack the metadata, although I have not investigated).
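
(Editor's illustration.) The end-of-stream check described above can be sketched with Python's standard `zlib` module, used here as a stand-in for the C API. The decompressor's `eof` attribute becomes true once the end of the compressed stream has been reached, which corresponds to inflate reporting Z_STREAM_END. The helper name `is_complete` is hypothetical, and the sketch assumes the whole input fits in memory.

```python
import zlib

def is_complete(data: bytes, wbits: int = zlib.MAX_WBITS) -> bool:
    """Return True if `data` contains a complete DEFLATE stream.

    wbits=zlib.MAX_WBITS expects the zlib wrapper; a negative value
    such as -15 selects raw DEFLATE with no header or trailer.
    """
    d = zlib.decompressobj(wbits)
    d.decompress(data)   # truncated input yields partial output, no exception
    return d.eof         # True only once the end of the stream was reached

payload = b"hello world " * 1000

# zlib-wrapped stream
wrapped = zlib.compress(payload)
print(is_complete(wrapped))                      # complete stream
print(is_complete(wrapped[:len(wrapped) // 2]))  # cut in half

# raw DEFLATE stream (negative window size)
c = zlib.compressobj(wbits=-15)
raw = c.compress(payload) + c.flush()
print(is_complete(raw, -15))
print(is_complete(raw[:len(raw) // 2], -15))
```

In both modes the full buffer is reported as complete and the halved buffer is not, so the check does not depend on the zlib wrapper.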

@terrelln
Contributor

terrelln commented Apr 20, 2018

ZSTD_decompressStream() returns 0 exactly when the frame is completely decoded.

@cemeyer
Contributor Author

cemeyer commented Apr 20, 2018

@terrelln Right. But a stream may have multiple frames, right? It returns 0 every time a frame is completely decoded — not just for the final frame. Maybe I misunderstand?

@terrelln
Contributor

A zlib stream corresponds to a zstd frame.

Generally a stream contains a single frame. A zstd frame ends when ZSTD_compressEnd() is called and its output is flushed. Inside the frame there can be multiple blocks, and ZSTD_decompressStream() will return 0 when the frame is completely decoded.

Just like zlib can have multiple streams back to back, zstd can have multiple frames back to back, but in each case the streams/frames are independent. In the case of multiple zlib streams concatenated together, zlib will return Z_STREAM_END more than once. Zstd does the exact same thing. Each time you end a frame with ZSTD_compressEnd(), ZSTD_decompressStream() will return 0 exactly once.
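
(Editor's illustration.) The zlib half of this comparison can be sketched with Python's standard `zlib` module; it does not exercise zstd itself. Each decompressor object handles one stream, sets `eof` when that stream ends, and leaves any following bytes in `unused_data`. The helper name `count_streams` is hypothetical, and the sketch assumes the input is held in memory and contains only zlib streams (trailing garbage would raise `zlib.error`).

```python
import zlib

def count_streams(data: bytes):
    """Decode back-to-back zlib streams.

    Returns (complete_streams, ended_cleanly). ended_cleanly is False
    when the input stops partway through a stream.
    """
    complete = 0
    while data:
        d = zlib.decompressobj()
        d.decompress(data)
        if not d.eof:
            return complete, False   # input ended inside a stream
        complete += 1                # one end-of-stream signal per stream
        data = d.unused_data         # bytes that follow the finished stream
    return complete, True

first = zlib.compress(b"first stream")
second = zlib.compress(b"second stream " * 50)

print(count_streams(first + second))        # two complete streams
print(count_streams(first + second[:-1]))   # second stream is cut short
print(count_streams(first))                 # one complete stream
```

Because the streams are independent, the last call cannot tell that a second stream was ever expected: an input cut exactly at a stream boundary looks complete. Detecting that case needs information from outside the streams, such as an expected count or length.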

@cemeyer
Contributor Author

cemeyer commented Apr 20, 2018

Ah, perfect! I was confused by the terminology gap. 😅 With the key "zlib stream = zstd frame; zlib frame = zstd block," your initial response makes sense. Sorry about the noise! 🤭

It looks like ZSTD_compressEnd() is invoked via ZSTD_endStream -> ZSTD_compressStream_generic in my use. Cool.

Thank you.
