
Support zstd compression (port of Allan Jude's patch from FreeBSD) #8044

Closed

Conversation

BrainSlayer
Contributor

@BrainSlayer BrainSlayer commented Oct 19, 2018

This patch adds zstd compression support to ZFS.

Note:
This is a rework of the original pull request to fulfill the requirement of offering only a single patch. Unfortunately, it seems I have to spend a lot of work every day maintaining this pull request, since it quickly conflicts with the upstream master; it will therefore evolve quickly and will usually change every day.

Signed-off-by: Sebastian Gottschall [email protected]

Motivation and Context

Description

How Has This Been Tested?

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

@gmelikov gmelikov added the Status: Work in Progress Not yet ready for general review label Oct 19, 2018
@BrainSlayer
Contributor Author

@gmelikov It looks like the remaining test failures I have seen are common and unrelated to the patch, correct? I have seen the same results in other pull requests, so I want to keep the current version as is, if no other change requests are required.

@gmelikov
Member

@BrainSlayer Sometimes there are intermittently failing tests; you can check this page for them: http://build.zfsonlinux.org/known-issues.html

@BrainSlayer
Contributor Author

BrainSlayer commented Oct 19, 2018

@gmelikov: I just reviewed every assertion or failure in the logs and did not find any related bugs in this patch. It's just a question of whether my work is done for now or whether there is still an open task I have to fulfill. Personally, I ran a lot of tests, copying millions of smaller and bigger files to a storage and back, including content verification with zstd enabled.

@behlendorf
Contributor

behlendorf commented Oct 19, 2018

For reference, this PR is based on https://reviews.freebsd.org/D11124 from FreeBSD.

@BrainSlayer I should have been clearer. The requirement isn't strictly to keep the change to one commit; you should absolutely add follow-up commits to address bugs or review feedback. But since the CI will build each commit in the branch, and test the topmost, we'd prefer not to overwhelm the bots by avoiding branches with dozens or hundreds of small commits. Once reviewed and approved, we'll squash all the commits before final integration.

Based solely on the CI results, it does look like this PR is ready for a first round of review feedback. To help the reviewers, can you provide some additional information?

  • Exactly which commit is this version based on? I'm assuming https://reviews.freebsd.org/D11124?id=45344. We don't want to miss any follow-up bug fixes or other changes.
  • Please describe briefly where this patch needed to diverge from its FreeBSD counterpart and why.
  • Are there any known outstanding bugs or future work? The last I heard there's still the compressed L2ARC issue, as well as potential issues with no-op write and dedup.
  • Testing: can you elaborate on how this has been tested? It doesn't look like any of the existing ZFS Test Suite tests have been updated to use zstd, nor have new tests been added. We're going to need to add test coverage, and I'm sure @allanjude would welcome a patch which added this.

@behlendorf behlendorf added Status: Code Review Needed Ready for review and testing Type: Feature Feature request or new feature and removed Status: Work in Progress Not yet ready for general review labels Oct 19, 2018
@BrainSlayer
Contributor Author

Your assumption is correct. I picked the last version I found as the reference for my port, which can be found at https://reviews.freebsd.org/D11124?id=45344. The code is not much different from the original FreeBSD patch, but I needed to adjust it to fit into ZFS on Linux, which seems to use a very different codebase for ZFS.

For instance, the function arc_cksum_is_equal does not exist in ZFS on Linux, so I adjusted basically all occurrences of zio_compress_data in ZFS to fit the new API which was introduced with this patch.

In addition, I fixed issues I found while reviewing and running the ZFS Test Suite. ztest may write bogus compression algorithm values, which triggers assertions in ZFS with this patch applied, so I changed the code a little to handle such bogus values. These changes can still be seen in the incremental patch set, but I was asked to merge them into a single patch. (By the way, it wasn't you who explicitly requested this, but another person did; you just helped me handle it.)
I also fixed a major problem in the FreeBSD patch: it crashes if zstd is not used on a filesystem but another algorithm is. This was just a minor fix; the FreeBSD patch simply did not handle the compression attributes correctly.
This patch also supports all zstd compression levels, not just 1-19 (1-22 are available in total), so people may also play with the ultra compression settings. In addition, I used the latest zstd code, which has better performance and compression ratio than FreeBSD's in-kernel variant. The Linux in-kernel variant cannot be used, since it is too old and lacks required features. The zstd code also had to be adjusted, since it used reserved words for variables which collided with Linux macros. This is basically just renaming hell (all variables named current were renamed to c_current).

Outstanding bugs:
I have been running realistic filesystem load tests on memory-restricted devices over the last few days to uncover the usual leaks or problems. So far it looks good and I have found no data errors. I also ran it in combination with dedup, and from my point of view I see no conflict with deduplication just from the structure of how it works; it basically doesn't work much differently from lz4 or zlib in that respect, but feel free to give me a hint where I should look.
About the L2ARC problem: if I have seen it correctly, Allan already wrote a solution. This has also been noted on the FreeBSD patch site, so I don't know if this is still valid.
No-op writes should be filtered, since zero blocks are not compressed, just as is done for lz4 and other algorithms.

Future work may happen if I see potential for enhancements, like newer zstd versions with even better performance or compression, and of course I will track the progress on the FreeBSD side.

Testing: yes, I did not add any test suites, but there was a reason. All the other algorithms also have just basic tests or none at all. But I can adapt some of the original tests to use zstd explicitly instead of lz4; the perf test looks promising here. I can do this by tomorrow.

Contributor

@richardelling richardelling left a comment


please update get_compstring() in dbufstat.py

@BrainSlayer
Contributor Author

@richardelling Done, thanks.

Contributor

@behlendorf behlendorf left a comment


Thanks, I've given this a first look and included some feedback and questions.

Makefile.am (outdated, resolved)
include/sys/spa.h (outdated, resolved)
lib/libzpool/Makefile.am (outdated, resolved)
lib/libzpool/Makefile.am (outdated, resolved)
module/Makefile.in (outdated, resolved)
module/zfs/zstd.c (outdated, resolved)
module/zfs/zstd/common/compiler.h (outdated, resolved)
module/zfs/zstd/common/entropy_common.c (outdated, resolved)
module/zfs/zstd/freebsd/stddef.h (outdated, resolved)
module/zfs/zstd/zstd.h (outdated, resolved)
@BrainSlayer BrainSlayer force-pushed the master branch 5 times, most recently from 095b810 to cd01b13 on October 22, 2018 06:44
@BrainSlayer
Contributor Author

Since GitHub had technical issues tonight, most of the tests are broken, because the test scripts failed to download.

include/spl/sys/kmem_cache.h (outdated, resolved)
include/sys/arc_impl.h (outdated, resolved)
include/sys/zio_compress.h (outdated, resolved)
include/sys/zio_compress.h (outdated, resolved)
module/zstd/common/zstd_common.c (outdated, resolved)
module/zstd/zstd.c (outdated, resolved)
module/zstd/zstd.c (outdated, resolved)
module/zstd/zstd.c (outdated, resolved)
module/zstd/zstd.c (outdated, resolved)
module/zstd/zstd.c (outdated, resolved)
@BrainSlayer
Contributor Author

@behlendorf Are there any further change requests?

@behlendorf
Contributor

@BrainSlayer I'm not going to have time to give this a careful review until next week. But what would be very helpful is if you could include some performance/compression results comparing zstd to at least lz4.

@BrainSlayer
Contributor Author

@behlendorf I will do a realistic benchmark after I get my new test setup installed; I am waiting for some HDDs. But consider that Allan already posted such a comparison in his original BSD patch.

@BrainSlayer
Contributor Author

BrainSlayer commented Oct 27, 2018

@behlendorf I ran several tests today and repeated them multiple times, since the result was strange and too good to be true, but now I'm sure it's correct: zstd does indeed outperform lz4 in performance. I copied a single uncompressed 12 GB tar file from the RAM cache to a ZFS volume on an SSD drive, which took 52 seconds for lz4 but just 46 seconds for zstd. This was always reproducible (zlib took 1 minute and 48 seconds). I assume the lz4 compression in ZFS on Linux is at a disadvantage, since I force the zstd lib code to be compiled with -O3, unlike the standard lz4 implementation used by ZFS, which uses the kernel compile optimization.

So in any case, the result is really impressive. I also copied the contents of the tar file, which has about 750k files in it; same result, zstd was faster:
1:28 minutes with lz4 and 1:15 minutes with zstd. So, same behaviour.

So with the current implementation, zstd outperforms even lz4 in both performance and compression ratio. It may be that zstd scales better on high-performance CPUs for some reason, which could also be the cause of this result.

Since this tar file contains a lot of C source files, lz4 was also very effective at compressing: it has a ratio of 2.07, while zstd reaches a ratio of 2.61.

I also tested zstd with deduplication to verify your comment that there might be a problem with it, but there is not: deduplication works with zstd without any issue.

Here is the dump of one of these test runs:

zpool destroy lz4test
zpool create -f -m "/lz4test" lz4test /dev/sdc
zfs set compression=zstd lz4test
time cp testfile.tar /lz4test

real 0m45.093s
user 0m0.072s
sys 0m4.752s
zfs get all|grep compres
lz4test compressratio 2.61x -
lz4test compression zstd local
lz4test refcompressratio 2.61x -

zpool destroy lz4test
zpool create -f -m "/lz4test" lz4test /dev/sdc
zfs set compression=lz4 lz4test
time cp testfile.tar /lz4test

real 0m53.159s
user 0m0.176s
sys 0m9.396s

zfs get all|grep compres
lz4test compressratio 2.07x -
lz4test compression lz4 local
lz4test refcompressratio 2.07x

Ah, and by the way, the result without compression:
time cp testfile.tar /lz4test

real 1m3.586s
user 0m0.288s
sys 0m12.928s

And if I store incompressible files (a gzipped tar file), there is no measurable difference between lz4 and zstd; performance is identical.
So personally, I hope zstd gets merged, of course.
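For context, the wall-clock numbers above translate into effective write throughput with a little arithmetic. This is a sketch only; it assumes "12 GB" means 12 x 1024 MB and uses the `real` timings from the runs above:

```python
# Effective write throughput derived from the reported wall-clock times
# for copying the 12 GB test file from the RAM cache to the pool.
FILE_MB = 12 * 1024  # file size in MB, assuming 12 GB means 12 GiB

def throughput_mb_s(seconds):
    """Effective throughput in MB/s, rounded to the nearest integer."""
    return round(FILE_MB / seconds)

# 'real' times from the dumps above
timings = {"zstd": 45.1, "lz4": 53.2, "uncompressed": 63.6}
for algo, secs in timings.items():
    print(f"{algo:>12}: {throughput_mb_s(secs)} MB/s")
```

At roughly 272 MB/s for zstd versus 231 MB/s for lz4 (and 193 MB/s uncompressed), the ordering is consistent with the pool being write-bandwidth limited: a better compression ratio simply means fewer bytes hit the disk.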

@greg-hydrogen

@BrainSlayer By any chance, did you measure the time for copying files out of the lz4 and zstd volumes? One of the big value-adds for lz4 is that its decompression is so fast; I'm just wondering how zstd measures up. I know the standalone version can use as many CPU cores as directed, which really speeds things up over lz4.

@allanjude
Contributor

@greg-hydrogen Splitting up over CPU cores happens automatically in ZFS, since it compresses each record (128 KB by default) separately, and cores-1 of them concurrently.

Here are my slides from the ZFS User Conference: https://docs.google.com/presentation/d/1yDhE2CaTfx6i1fqol_YbjHWcJa6xKhVUo-R3RrLVUI4/edit?usp=sharing

Generally, zstd decompresses at around 800-900 MB/sec/core, whereas LZ4 is ~2500 MB/sec/core.
The zstd-fast modes are faster, eventually out-performing lz4 at level -50, but then the compression is not as good as LZ4's.
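The per-record parallelism @allanjude describes can be sketched outside ZFS. This is an illustration only: zlib stands in for the real compressors, and only the 128 KB record size is borrowed from the ZFS default; nothing here is ZFS code.

```python
import os
import zlib
from concurrent.futures import ThreadPoolExecutor

RECORD_SIZE = 128 * 1024  # ZFS default recordsize, 128 KB

def compress_records(data, workers=None):
    """Compress each record independently, cores-1 of them concurrently.

    Because every record is self-contained, the work parallelizes
    trivially -- the same shape of parallelism ZFS gets by compressing
    each record in a separate write ZIO. zlib.compress releases the
    GIL, so a thread pool gives real parallelism here.
    """
    if workers is None:
        workers = max(1, (os.cpu_count() or 2) - 1)
    records = [data[i:i + RECORD_SIZE] for i in range(0, len(data), RECORD_SIZE)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.compress, records))

data = b"compressible source text " * 50_000  # ~1.2 MB, highly repetitive
compressed = compress_records(data)
restored = b"".join(zlib.decompress(c) for c in compressed)
assert restored == data
```

The design point this illustrates: no compressor-level multithreading is needed, because the record boundaries already partition the work.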

@allanjude
Contributor

@behlendorf i was running several tests today and repeated them multiple times since the result was strange and too good to be true but now i'm sure its correct. zstd does indeed outperform lz4 in performance.

and if i store incompressible files (gziped tar file) there is no measureable difference between lz4 and zstd. performance is identical
so personally i hope zstd gets merged of course -

For some of these results, it can just be the fact that higher compression ratios mean you write less data, so it takes less time.

zstd tends to out-compress lz4 as low as level negative 3. It almost always comes at a higher cost, but if you have spare CPU cycles, that cost is likely acceptable. On the good side, even at high compression levels like +15, where zstd drops to a speed of 10MB/sec/core, decompression speed stays at nearly 1 GB/sec/core (on a 4ghz processor)

zstd is always a better option than gzip. With the new negative levels, it can even compete with LZ4 for speed.

@BrainSlayer
Contributor Author

@allanjude @behlendorf The reason why it outperforms lz4 is mainly the limited write speed at the physical layer in my case. Even though I used a PCIe SSD, this device seems to be limited to about 250 MB/s write speed for some reason. But I configured two of these cards as a ZFS stripe, and even then zstd was still faster. It was a 3 GHz 4-core Xeon E3-1220, by the way.

Here are the results for reading. In this configuration lz4 is slightly faster, but the difference is very small.

lz4:
echo 3 > /proc/sys/vm/drop_caches
time cp testfile.tar /dev/null
real 1m10.747s
user 0m0.172s
sys 0m8.768s

zstd:
echo 3 > /proc/sys/vm/drop_caches
time cp testfile.tar /dev/null

real 1m13.753s
user 0m0.124s
sys 0m11.064s

@BrainSlayer
Contributor Author

BrainSlayer commented Jun 16, 2019 via email

@BrainSlayer
Contributor Author

BrainSlayer commented Jun 16, 2019 via email

@c0d3z3r0
Contributor

i cannot take it out of the PR unless its merged to master.

You need to rebase the vmem PR, too. So focus on that now until @allanjude has updated his branches.

@c0d3z3r0
Contributor

... and you should squash your commits here and find an appropriate description for one or a few commits

@BrainSlayer
Contributor Author

BrainSlayer commented Jun 16, 2019 via email

@BrainSlayer
Contributor Author

BrainSlayer commented Jun 16, 2019 via email

@c0d3z3r0
Contributor

I just realized you have many commits where the commit message does not fit the content... In "vmem allocation pool (experimental)", for example, you change compression stuff which doesn't have anything to do with vmem...

Maybe it would be better to squash everything into exactly one commit... @behlendorf @allanjude, what do you say?

@BrainSlayer
Contributor Author

Due to the heavy upstream changes recently, I have now merged everything into a single new commit, and hopefully I resolved all the new conflicts correctly.

@@ -343,7 +343,7 @@ def get_compstring(c):
                 "ZIO_COMPRESS_GZIP_6",  "ZIO_COMPRESS_GZIP_7",
                 "ZIO_COMPRESS_GZIP_8",  "ZIO_COMPRESS_GZIP_9",
                 "ZIO_COMPRESS_ZLE",     "ZIO_COMPRESS_LZ4",
-                "ZIO_COMPRESS_FUNCTION"]
+                "ZIO_COMPRESS_ZSTD",    "ZIO_COMPRESS_FUNCTION"]
Contributor


Shouldn't new stuff get added at the end?

Contributor Author


No. ZIO_COMPRESS_FUNCTION is the terminating entry of the array and must stay at the end.

Contributor


oops. my fault....
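The sentinel convention discussed in this thread can be sketched in Python, in the spirit of get_compstring() in dbufstat.py. The names mirror the enum shown in the hunks here, but the index positions and fallback behaviour are illustrative assumptions, not a copy of the upstream code:

```python
# Name table for compression values, mirroring the shape of the
# get_compstring() list in dbufstat.py: new algorithms are inserted
# before the trailing ZIO_COMPRESS_FUNCTION sentinel, which stays last.
COMP_STRINGS = [
    "ZIO_COMPRESS_INHERIT", "ZIO_COMPRESS_ON",     "ZIO_COMPRESS_OFF",
    "ZIO_COMPRESS_LZJB",    "ZIO_COMPRESS_EMPTY",
    "ZIO_COMPRESS_GZIP_1",  "ZIO_COMPRESS_GZIP_2", "ZIO_COMPRESS_GZIP_3",
    "ZIO_COMPRESS_GZIP_4",  "ZIO_COMPRESS_GZIP_5", "ZIO_COMPRESS_GZIP_6",
    "ZIO_COMPRESS_GZIP_7",  "ZIO_COMPRESS_GZIP_8", "ZIO_COMPRESS_GZIP_9",
    "ZIO_COMPRESS_ZLE",     "ZIO_COMPRESS_LZ4",
    "ZIO_COMPRESS_ZSTD",        # new entry goes here ...
    "ZIO_COMPRESS_FUNCTION",    # ... the sentinel stays at the end
]

def get_compstring(c):
    """Return the symbolic name for a compression value.

    Out-of-range values (e.g. bogus values written by ztest) fall back
    to the raw number instead of raising, as a defensive default.
    """
    try:
        return COMP_STRINGS[c]
    except IndexError:
        return str(c)
```

Because existing on-disk data encodes the numeric enum values, inserting ZSTD before the sentinel (rather than mid-list) keeps every previously assigned value stable.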


 #define ZIO_COMPRESS_DEFAULT		ZIO_COMPRESS_OFF

 #define	BOOTFS_COMPRESS_VALID(compress)		\
 	((compress) == ZIO_COMPRESS_LZJB ||	\
 	(compress) == ZIO_COMPRESS_LZ4 ||	\
+	(compress) == ZIO_COMPRESS_ZSTD ||	\
Contributor


again, move new stuff to the end

Contributor Author


At least it is before ON and OFF. This was merged 1:1 from Allan's code; I will change it.

@@ -51,15 +51,75 @@ enum zio_compress {
 	ZIO_COMPRESS_GZIP_9,
 	ZIO_COMPRESS_ZLE,
 	ZIO_COMPRESS_LZ4,
+	ZIO_COMPRESS_ZSTD,
Contributor


once again, move zstd to the end

Contributor Author


Nope. ZIO_COMPRESS_FUNCTIONS is the terminator of the enum; everything after the last entry will be ignored, so after LZ4 is correct.

- * ZFS supports three different flavors of compression -- gzip, lzjb, and
- * zle. Compression occurs as part of the write pipeline and is performed
- * in the ZIO_STAGE_WRITE_BP_INIT stage.
+ * ZFS supports five different flavors of compression -- gzip, lzjb, lz4, zle,
Contributor


maybe such fixes should go to a separate PR

Contributor Author


Why? It's just documentation, and here zstd was added, so it's part of the patch.

Contributor


Because it's a fix which is not related to zstd. Let's see what the others say; maybe this is really over-particular ;)

Contributor Author


??? It just adds the line for zstd compression support, which is of course part of the patch. By the way, this documentation update was also taken from Allan's original BSD code.

Contributor Author


  • ZFS supports five different flavors of compression -- gzip, lzjb, lz4, zle,
  • and zstd. Compression occurs as part of the write pipeline and is
  • performed in the ZIO_STAGE_WRITE_BP_INIT stage.

So mentioning the zstd compression support is part of it.

@@ -12,11 +17,16 @@ AM_CFLAGS += $(NO_UNUSED_BUT_SET_VARIABLE)
 # Includes kernel code generate warnings for large stack frames
 AM_CFLAGS += $(FRAME_LARGER_THAN)

-AM_CFLAGS += -DLIB_ZPOOL_BUILD
+AM_CFLAGS += -DLIB_ZPOOL_BUILD -Wframe-larger-than=32768
Contributor


this will conflict with line 13/18

Contributor Author


It should be 2048; this is a requirement of the zstd code. It probably got messed up by your cleanup procedure. The early versions had 32768, but later I changed it to 2048, which was a safe value for all architectures I tested. I can remove it, but that may raise warnings depending on the architecture and compiler version. The default seems to be 4096 now, according to the configure script, so usually it's obsolete.

Contributor


Well, what I meant was that frame-larger-than will be set twice and thus overwritten, regardless of $(FRAME_LARGER_THAN).

Contributor Author


I have seen it. I have removed this line in the latest code now.

@BrainSlayer
Contributor Author

now github fucked up everything and closed the PR. man that kills me

@lnicola
Contributor

lnicola commented Jun 20, 2019

now github fucked up everything and closed the PR. man that kills me

Try git reflog when in need.

@c0d3z3r0
Contributor

now github fucked up everything and closed the PR. man that kills me

You pushed without your commit... that's why gh closed this PR

@BrainSlayer
Contributor Author

reflog shows a mess of changes. Are you able to help here? Otherwise I have to reopen the request as a new one.

@BrainSlayer
Contributor Author

My fork still has all the changes correct, so I don't know how to fix this here.

@lnicola
Contributor

lnicola commented Jun 20, 2019

I mentioned git reflog just in case you've lost your good version of the code. If you didn't, and can't make this work (I can see your commit in your fork), GitHub probably bugged out, and it's probably fine to open a new PR.

@BrainSlayer
Contributor Author

#8941

Labels
Status: Code Review Needed Ready for review and testing Type: Feature Feature request or new feature
10 participants