forked from exynos-linux-stable/starlte
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
zstd: Improve fast huffman decoding speed in linux kernel
gcc in the Linux kernel was not unrolling the inner loops of the Huffman decoder, which was destroying decoding performance. The compiler was generating crazy code with all sorts of branches. I suspect this is because of Spectre mitigations, but I'm not certain. Once the loops were manually unrolled, performance was restored.

Additionally, when gcc couldn't prove that the variable left shift in the 4X2 decode loop wasn't greater than 63, it inserted checks to verify it. To fix this, mask `entry.nbBits & 0x3F`, which allows gcc to elide this check. This is a no-op, because `entry.nbBits` is guaranteed to be less than 64.

Lastly, introduce the `HUF_DISABLE_FAST_DECODE` macro to disable the fast C loops for Issue facebook/zstd#3762. So if there is still a performance regression even after this change, users can opt out at compile time.
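The two code-level changes described above can be illustrated with a small sketch. This is not the zstd decoder: `demo_entry_t`, `demo_table`, `DEMO_DECODE_ONE`, and both `demo_decode4_*` functions are hypothetical names made up for this example, and the toy table encodes a three-symbol prefix code (`A` = `0`, `B` = `10`, `C` = `11`) indexed by the top two bits of the bit container. It only shows the shape of the idea: writing out each step by hand instead of relying on the compiler to unroll a loop, and masking the shift amount with `0x3F`, which changes nothing when `nbBits` is already below 64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical table entry for illustration; not zstd's real layout. */
typedef struct {
    uint8_t symbol;  /* decoded byte */
    uint8_t nbBits;  /* bits consumed by this symbol; always < 64 here */
} demo_entry_t;

/* Toy prefix code: A = "0", B = "10", C = "11", indexed by the top 2 bits. */
static const demo_entry_t demo_table[4] = {
    { 'A', 1 }, { 'A', 1 }, { 'B', 2 }, { 'C', 2 },
};

/* Baseline: a plain loop that the compiler may or may not unroll. */
static uint64_t demo_decode4_rolled(uint64_t bits, const demo_entry_t *table,
                                    uint8_t *out)
{
    for (int i = 0; i < 4; i++) {
        demo_entry_t e = table[bits >> 62];
        out[i] = e.symbol;
        bits <<= e.nbBits;
    }
    return bits;
}

/* One decode step. The "& 0x3F" is a no-op for nbBits < 64, but it makes
 * the valid shift range visible to the compiler. */
#define DEMO_DECODE_ONE(i)                      \
    do {                                        \
        demo_entry_t e = table[bits >> 62];     \
        out[(i)] = e.symbol;                    \
        bits <<= (e.nbBits & 0x3F);             \
    } while (0)

/* Manually unrolled: the four steps are written out explicitly. */
static uint64_t demo_decode4_unrolled(uint64_t bits, const demo_entry_t *table,
                                      uint8_t *out)
{
    DEMO_DECODE_ONE(0);
    DEMO_DECODE_ONE(1);
    DEMO_DECODE_ONE(2);
    DEMO_DECODE_ONE(3);
    return bits;
}

/* Checks that both variants agree on one input; returns 1 on success. */
static int demo_selfcheck(void)
{
    uint8_t a[4], b[4];
    uint64_t in = (uint64_t)0x16 << 58;  /* bit string 0 10 11 0 = "ABCA" */
    uint64_t ra = demo_decode4_rolled(in, demo_table, a);
    uint64_t rb = demo_decode4_unrolled(in, demo_table, b);
    assert(ra == rb);
    for (int i = 0; i < 4; i++)
        assert(a[i] == b[i]);
    return 1;
}
```

Calling `demo_selfcheck()` confirms that the rolled and unrolled variants produce the same output for the sample input. Whether the unrolled form is actually faster depends on the compiler and build flags, which is why the commit also provides `HUF_DISABLE_FAST_DECODE` as a compile-time opt-out.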
1 parent d33530b, commit 4cbf3d8
Showing 1 changed file with 105 additions and 66 deletions.