This repository has been archived by the owner on Nov 21, 2022. It is now read-only.

lib/lzo: fix ambiguous encoding bug in lzo-rle
In some rare cases, for input data over 32 KB, lzo-rle could encode two
different inputs to the same compressed representation, so that
decompression is then ambiguous (i.e.  data may be corrupted - although
zram is not affected because it operates over 4 KB pages).

This modifies the compressor without changing the decompressor or the
bitstream format, such that:

- there is no change to how data produced by the old compressor is
  decompressed

- an old decompressor will correctly decode data from the updated
  compressor

- performance and compression ratio are not affected

- we avoid introducing a new bitstream format

In testing over 12.8M real-world files totalling 903 GB, three files were
affected by this bug.  I also constructed 37M semi-random 64 KB files
totalling 2.27 TB, and saw no affected files.  Finally I tested over files
constructed to contain each of the ~1024 possible bad input sequences; for
all of these cases, updated lzo-rle worked correctly.

There is no significant impact to performance or compression ratio.
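
The figure of roughly 1024 bad input sequences is consistent with a simple count over the condition documented in this change. The sketch below is only an assumption about where a number of that size could come from, since the commit message does not say how it was derived. It enumerates every distance in the 16..48 kB block-copy range that satisfies `(distance & 0x803f) == 0x803f` and pairs each one with the four lengths 261..264. The helper names are hypothetical.

```c
#include <stdint.h>

/* Documented condition on the distance: bit 15 set (H = 1) and the low
 * six bits all set. */
static int distance_is_risky(uint32_t distance)
{
    return (distance & 0x803f) == 0x803f;
}

/* Count (distance, length) pairs that meet the documented condition.
 * Distances run over 16384 + (H << 14) + D for a 14-bit D, that is
 * 0x4000..0xbfff. This includes 0xbfff, which version 1 reserves for
 * zero runs, so the result is an upper estimate. */
static unsigned count_risky_pairs(void)
{
    unsigned count = 0;

    for (uint32_t distance = 0x4000; distance <= 0xbfff; distance++) {
        if (!distance_is_risky(distance))
            continue;
        for (unsigned length = 261; length <= 264; length++)
            count++;
    }
    return count;
}
```

Bits 6..13 of the distance are free, which gives 256 matching distances; with four lengths each, `count_risky_pairs()` returns 1024.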

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Dave Rodgman <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Dave Rodgman <[email protected]>
Cc: Willy Tarreau <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Markus F.X.J. Oberhumer <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Nitin Gupta <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Stephen Rothwell <[email protected]>
daverodgman authored and sfrothwell committed May 28, 2020
1 parent 0c84de7 commit 76a76ac
Showing 2 changed files with 19 additions and 2 deletions.
8 changes: 6 additions & 2 deletions Documentation/lzo.txt
@@ -159,11 +159,15 @@ Byte sequences
   distance = 16384 + (H << 14) + D
   state = S (copy S literals after this block)
   End of stream is reached if distance == 16384
+  In version 1 only, to prevent ambiguity with the RLE case when
+  ((distance & 0x803f) == 0x803f) && (261 <= length <= 264), the
+  compressor must not emit block copies where distance and length
+  meet these conditions.

   In version 1 only, this instruction is also used to encode a run of
-  zeros if distance = 0xbfff, i.e. H = 1 and the D bits are all 1.
+  zeros if distance = 0xbfff, i.e. H = 1 and the D bits are all 1.
   In this case, it is followed by a fourth byte, X.
-  run length = ((X << 3) | (0 0 0 0 0 L L L)) + 4.
+  run length = ((X << 3) | (0 0 0 0 0 L L L)) + 4

   0 0 1 L L L L L (32..63)
   Copy of small block within 16kB distance (preferably less than 34B)
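
To make the documented collision concrete, the sketch below reads the same four bytes in two ways: as a block copy using the `0 0 0 1 H L L L` layout quoted in the documentation hunk above, and as a zero run using the fourth-byte rule. It rests on one assumption that the documentation does not state: that a decoder tests for the zero-run pattern on the two bytes immediately following the instruction byte, before it parses any length-extension bytes. The struct and function names are hypothetical and this is not the kernel's decompressor.

```c
#include <stddef.h>
#include <stdint.h>

struct m4_copy {
    uint32_t length;   /* bytes to copy */
    uint32_t distance; /* 16384 + (H << 14) + D */
    uint32_t state;    /* S: literals to copy after this block */
    size_t consumed;   /* input bytes used by the instruction */
};

/* Parse "0 0 0 1 H L L L" as a block copy. Returns 0 on success,
 * -1 if the input is not this instruction or is truncated. */
static int decode_m4_copy(const uint8_t *in, size_t n, struct m4_copy *out)
{
    size_t i = 1;
    uint32_t h, l, length, le16;

    if (n < 1 || (in[0] & 0xf0) != 0x10)
        return -1;
    h = (in[0] >> 3) & 1;
    l = in[0] & 7;

    if (l != 0) {
        length = 2 + l;
    } else {
        /* Extended length: 7 + zero_bytes * 255 + non_zero_byte. */
        uint32_t zero_bytes = 0;

        while (i < n && in[i] == 0) {
            zero_bytes++;
            i++;
        }
        if (i >= n)
            return -1;
        length = 2 + 7 + zero_bytes * 255 + in[i++];
    }

    if (n - i < 2)
        return -1;
    le16 = in[i] | ((uint32_t)in[i + 1] << 8);
    i += 2;

    out->length = length;
    out->distance = 16384 + (h << 14) + (le16 >> 2);
    out->state = le16 & 3;
    out->consumed = i;
    return 0;
}

/* Parse the same bytes as a zero run: H = 1, an LE16 whose D bits are
 * all 1 directly after the instruction byte (assumed check order),
 * then the fourth byte X. Returns 0 on a match, -1 otherwise. */
static int decode_zero_run(const uint8_t *in, size_t n, uint32_t *run_length)
{
    uint32_t le16;

    if (n < 4 || (in[0] & 0xf8) != 0x18)
        return -1;
    le16 = in[1] | ((uint32_t)in[2] << 8);
    if ((le16 & 0xfffc) != 0xfffc)
        return -1;
    *run_length = (((uint32_t)in[3] << 3) | (in[0] & 7)) + 4;
    return 0;
}
```

For the bytes `0x18 0xfc 0xff 0x00`, `decode_m4_copy` yields a copy of length 261 at distance 0x803f with S = 3, while `decode_zero_run` accepts the same bytes as a run of 4 zeros. In this sketch the match also requires S = 3; the documented condition does not mention S, so it is the more conservative of the two.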
13 changes: 13 additions & 0 deletions lib/lzo/lzo1x_compress.c
@@ -268,6 +268,19 @@ lzo1x_1_do_compress(const unsigned char *in, size_t in_len,
 				*op++ = (M4_MARKER | ((m_off >> 11) & 8)
 						| (m_len - 2));
 			else {
+				if (unlikely(((m_off & 0x403f) == 0x403f)
+						&& (m_len >= 261)
+						&& (m_len <= 264))
+						&& likely(bitstream_version)) {
+					// Under lzo-rle, block copies
+					// for 261 <= length <= 264 and
+					// (distance & 0x803f) == 0x803f
+					// can result in ambiguous
+					// output. Adjust length
+					// to 260 to prevent ambiguity.
+					ip -= m_len - 260;
+					m_len = 260;
+				}
 				m_len -= M4_MAX_LEN;
 				*op++ = (M4_MARKER | ((m_off >> 11) & 8));
 				while (unlikely(m_len > 255)) {
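
The length adjustment in the hunk above can be isolated into a small standalone function for experimentation. This is a minimal sketch, not kernel code: `clamp_ambiguous_copy` and its `rewind` out-parameter are hypothetical, and it assumes that `m_off` holds `distance - 0x4000` at this point in the compressor, which is what the H-bit extraction `(m_off >> 11) & 8` suggests. Under that assumption, the `0x403f` mask on `m_off` selects the same offsets as the `0x803f` mask on `distance` in the documentation.

```c
#include <stddef.h>

/* Hypothetical helper mirroring the check added to the compressor.
 * m_off is assumed to be distance - 0x4000. Returns the match length
 * to emit and stores in *rewind how many input bytes the caller should
 * give back (the diff does this with "ip -= m_len - 260"). */
static size_t clamp_ambiguous_copy(size_t m_off, size_t m_len,
                                   int bitstream_version, size_t *rewind)
{
    *rewind = 0;

    /* Only the lzo-rle bitstream (non-zero version) is affected. */
    if (bitstream_version &&
        (m_off & 0x403f) == 0x403f &&
        m_len >= 261 && m_len <= 264) {
        *rewind = m_len - 260;
        m_len = 260;
    }
    return m_len;
}
```

For example, `clamp_ambiguous_copy(0x403f, 262, 1, &rewind)` returns 260 and sets `rewind` to 2, so the two dropped bytes are left for the next match or literal run. Under the documented length formula, length 260 uses an extension byte of 251 (0xfb), just below the 252..255 (0xfc..0xff) range produced by lengths 261..264.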
