Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce ZSTD compression to ZFS (do not merge, testing branch for PR:10278) #10277

Closed
wants to merge 14 commits into from

Conversation

BrainSlayer
Copy link
Contributor

@BrainSlayer BrainSlayer commented May 1, 2020

This PR introduces ZSTD compression to ZFS.

this is a reopen of the original pull request, since the previous PR was closed by its owner which efforts of the last years

Motivation and Context

ZStandard is a modern, high performance, general compression algorithm originally developed by Yann Collet (the author of LZ4). It aims to provide similar or better compression levels to GZIP, but with much better performance. ZStandard provides a large selection of compression levels to allow a storage administrator to select the preferred performance/compression trade-off.

Description

ZSTD Feature
This PR adds the new compression types: zstd and zstd-fast, which is basically a faster "negative" level of zstd.

Available compression levels for zstd are zstd-1 through zstd-19. In addition, faster levels supported by zstd-fast are 1-10, 20-100 (in increments of 10), 500 and 1000. As zstd-fast is basically a "negative" level of zstd, it gets faster with every level in stark contrast with "normal" zstd which gets slower with every level.

Rather than the approach used when gzip was added, of using a separate compression type for each different level, this PR added a new hidden property, compression_level. All of the zstd levels use the single entry in the zio_compress enum, since all levels use the same decompression function.

However, in some cases, it was necessary to know what level a block was compressed within other parts of the code, so the compression level is stored on disk inline with the data, after the compression header (uncompressed size). The level is plumbed through arc_buf_hdr_t, objset_t, and zio_prop_t. There may still be additional places where this needs to be added.

This new property is hidden from the user, and the user simply does zfs set compress=zstd-10 dataset and zfs_ioctl.c splits this single value into the two values to be stored in the dataset properties. This logic has been extended to Channel Programs as well.

How Has This Been Tested?

The following testing has been performed:

Performance testing
Extra care has been taken to guarantee performance and reliability.

Compression performance using a custom development script by @Ornias.
Repeatability of the test is confirmed by @dan-and.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • My code follows the ZFS on Linux code style requirements.
  • I have updated the documentation accordingly.
  • I have read the contributing document.
  • I have added tests to cover my changes.
  • I have run the ZFS Test Suite with this change applied.
  • All commit messages are properly formatted and contain Signed-off-by.

@rlaager
Copy link
Member

rlaager commented May 1, 2020

Was the old PR farther along? If so, why not use that code in this PR and adjust from there?

@BrainSlayer
Copy link
Contributor Author

Was the old PR farther along? If so, why not use that code in this PR and adjust from there?

the old PR was not up to date and was not in sync with master and was based on my code anyway. this one here is up to date with all changes from the old PR. i was always constantly maintaining my own git based on all rework efforts

@BrainSlayer
Copy link
Contributor Author

but some things like the location of the zstd lib is different. i will review this and import it from the last PR

@c0d3z3r0
Copy link
Contributor

c0d3z3r0 commented May 1, 2020

@rlaager of course it was. but leadership has zero interest in it. read the latest comments. #9735

and it's a shame what @BrainSlayer is doing here. we had a clean patch series in #9735 with everything like copyright got right. you really should stop claiming this is all your work -.-

I doubt that this will ever get merged but if you really want to wait again 2 years or more, copy that old PR and start from there.

@BrainSlayer BrainSlayer force-pushed the master branch 5 times, most recently from 6fe7636 to 60fe768 Compare May 1, 2020 09:15
@PrivatePuffin

This comment has been minimized.

@BrainSlayer
Copy link
Contributor Author

@c0d3z3r0 dont tell me what todo or not. i said already that i'm reviewing all changes made in the previous PR and merging it step by step including copyright notices of course. but right now i'm focusing to get the source running again here including freebsd etc. since i'm activily using zfs with zstd since a long time i need to follow up here to ensure its working and stable and no i cannot directly pull the previous work since its broken and not compatible with upstream unlike my version which i maintained over the months. but my changes never got merged into the previous PR. so i start in the opposite direction and merge the required changes of the previous PR into my tree

@PrivatePuffin

This comment has been minimized.

@c0d3z3r0
Copy link
Contributor

c0d3z3r0 commented May 1, 2020

but my changes never got merged into the previous PR

This is not true. They are.

@c0d3z3r0
Copy link
Contributor

c0d3z3r0 commented May 1, 2020

Let's call it "leadership fail." lol

@c0d3z3r0
Copy link
Contributor

c0d3z3r0 commented May 1, 2020

@BrainSlayer you are not allowed to remove the signed-offs

@BrainSlayer
Copy link
Contributor Author

BrainSlayer commented May 1, 2020

you closed it, so you did not sign-off anything. you revoked your sign-off with your action and your comment is also pedantic. i just want get it in a running condition anymore. legal stuff is something other people can deal with it. i'm a developer

@c0d3z3r0
Copy link
Contributor

c0d3z3r0 commented May 1, 2020

you closed it, so you did not sign-off anything. you revoked your sign-off with your action

lol. no I did not revoke it.

@c0d3z3r0
Copy link
Contributor

c0d3z3r0 commented May 1, 2020

legal stuff is something other people can deal with it. i'm a developer

this again shows your ignorant and arrogant behaviour. have fun. this will never get merged

@PrivatePuffin

This comment has been minimized.

@BrainSlayer
Copy link
Contributor Author

@c0d3z3r0 you still havent got the point what this is all about. its about a feature i spended alot of time which got destroyed by a third party. i woke up this morning and was very pissed of. i can edit the commit message later today. but now i'm focusing to get it working on all platforms

@BrainSlayer
Copy link
Contributor Author

btw. the signe-off / co-author notices are fixed now.

@PrivatePuffin

This comment has been minimized.

@PrivatePuffin

This comment has been minimized.

@BrainSlayer
Copy link
Contributor Author

btw. the signe-off / co-author notices are fixed now.

No they are not.
The commit "fix copyright notices" is authored by me and @c0d3z3r0 only.

The signoffs and co-autor tags need to be on the commit incorporating work by said people. Not all on one commit. Thats why you should just pull the work where we left off, it took us days to get right.

edit
To give an impression:
I've gone over EVERY commit in all PR's we used code from (all 3 years) and added the signoffs and co-author tags on the right commit if it incorporated work from said person. That is how it should be done.

The fixing of the copyright headers are mostly my work (also going over all history to get right) and checked (+ few fixes himself) by @c0d3z3r0

that "may be" the case. the problem is i copied these noticed from the original PR which changed the copyright notices. so it was wrong from the beginning. also in the old PR

@PrivatePuffin

This comment has been minimized.

@PrivatePuffin
Copy link
Contributor

PrivatePuffin commented May 1, 2020

@BrainSlayer

that "may be" the case. the problem is i copied these noticed from the original PR which changed the copyright notices. so it was wrong from the beginning. also in the old PR

You had nothing to do with those notices, as I was telling you... I redid them ALL, I went over ALL files and ALL history... It took me days, which you know.

It's plain and utter Hubris to even think you did all of my work.

@BrainSlayer
Copy link
Contributor Author

i'm continueing the work. but i'm merging it also with the work i did recently. it keeps also track about the history of my commits for. i have checked out your PR ,but i do also review and changes and merge them manually and of course i use the original messages. i will change some other signed-off messages later once i get the bsd compile results here

@PrivatePuffin
Copy link
Contributor

@BrainSlayer Github told me you mentioned me somewhere in this repo, but I cant figure out where?
(maybe due to your rebase)

@BrainSlayer
Copy link
Contributor Author

@Ornias1993 not intentionally. i indeed just made a rebase

@BrainSlayer BrainSlayer force-pushed the master branch 3 times, most recently from 7bdc1f5 to eabd327 Compare August 18, 2020 08:09
BrainSlayer and others added 14 commits August 19, 2020 11:42
this is a squashed in progress version

Co-authored-by: Allan Jude [email protected]
Co-authored-by: Sebastian Gottschall [email protected]
Co-authored-by: Kjeld Schouten-Lebbing [email protected]
Co-authored-by: Michael Niewöhner [email protected]
Signed-off-by: Allan Jude [email protected]
Signed-off-by: Sebastian Gottschall [email protected]
Signed-off-by: Kjeld Schouten-Lebbing [email protected]
Signed-off-by: Michael Niewöhner [email protected]
Co-authored-by: Allan Jude [email protected]
Co-authored-by: Sebastian Gottschall [email protected]
Co-authored-by: Kjeld Schouten-Lebbing [email protected]
Co-authored-by: Michael Niewöhner [email protected]
Signed-off-by: Allan Jude [email protected]
Signed-off-by: Sebastian Gottschall [email protected]
Signed-off-by: Kjeld Schouten-Lebbing [email protected]
Signed-off-by: Michael Niewöhner [email protected]
the real size of the fieldis 32 bit instead of 16 bit
so using sizeof(levelcookie) is a bad idea

Signed-off-by: Sebastian Gottschall <[email protected]>
this might avoid bad handling of the header size in future and looks
also much better

Signed-off-by: Sebastian Gottschall <[email protected]>
Signed-off-by: Sebastian Gottschall <[email protected]>
Signed-off-by: Sebastian Gottschall <[email protected]>
Signed-off-by: Sebastian Gottschall <[email protected]>
Co-authored-by: Allan Jude <[email protected]>
Co-authored-by: Sebastian Gottschall <[email protected]>
Co-authored-by: Michael Niewöhner <[email protected]>
Signed-off-by: Allan Jude <[email protected]>
Signed-off-by: Sebastian Gottschall <[email protected]>
Signed-off-by: Michael Niewöhner <[email protected]>
Signed-off-by: Sebastian Gottschall <[email protected]>
Signed-off-by: Sebastian Gottschall <[email protected]>
Signed-off-by: Sebastian Gottschall <[email protected]>
Signed-off-by: Sebastian Gottschall <[email protected]>
Signed-off-by: Sebastian Gottschall <[email protected]>
misstakenly did it wrong while merging

Signed-off-by: Sebastian Gottschall <[email protected]>
c0d3z3r0 added a commit to c0d3z3r0/zfs that referenced this pull request Aug 19, 2020
This PR adds two new compression types, based on ZStandard:

- zstd: A basic ZStandard compression algorithm Available compression.
  Levels for zstd are zstd-1 through zstd-19, where the compression
  increases with every level, but speed decreases.

- zstd-fast: A faster version of the ZStandard compression algorithm
  zstd-fast is basically a "negative" level of zstd. The compression
  decreases with every level, but speed increases.

  Available compression levels for zstd-fast:
   - zstd-fast-1 through zstd-fast-10
   - zstd-fast-20 through zstd-fast-100 (in increments of 10)
   - zstd-fast-500 and zstd-fast-1000

For more information check the man page.

Implementation details:

Rather than treat each level of zstd as a different algorithm (as was
done historically with gzip), the block pointer `enum zio_compress`
value is simply zstd for all levels, including zstd-fast, since they all
use the same decompression function.

The compress= property (a 64bit unsigned integer) uses the lower 7 bits
to store the compression algorithm (matching the number of bits used in
a block pointer, as the 8th bit was borrowed for embedded block
pointers).  The upper bits are used to store the compression level.

It is necessary to be able to determine what compression level was used
when later reading a block back, so the concept used in LZ4, where the
first 32bits of the on-disk value are the size of the compressed data
(since the allocation is rounded up to the nearest ashift), was
extended, and we store the version of ZSTD and the level as well as the
compressed size. This value is returned when decompressing a block, so
that if the block needs to be recompressed (L2ARC, nop-write, etc), that
the same parameters will be used to result in the matching checksum.

All of the internal ZFS code ( `arc_buf_hdr_t`, `objset_t`,
`zio_prop_t`, etc.) uses the separated _compress and _complevel
variables.  Only the properties ZAP contains the combined/bit-shifted
value. The combined value is split when the compression_changed_cb()
callback is called, and sets both objset members (os_compress and
os_complevel).

The userspace tools all use the combined/bit-shifted value.

Additional notes:

zdb can now also decode the ZSTD compression header (flag -Z) and
inspect the size, version and compression level saved in that header.
For each record, if it is ZSTD compressed, the parameters of the decoded
compression header get printed.

ZSTD is included with all current tests and new tests are added
as-needed.

Per-dataset feature flags now get activated when the property is set.
If a compression algorithm requires a feature flag, zfs activates the
feature when the property is set, rather than waiting for the first
block to be born.  This is currently only used by zstd but can be
extended as needed.

Closes: openzfs#6247
Closes: openzfs#10277
Closes: openzfs#9024
Portions-Sponsored-By: The FreeBSD Foundation
Co-authored-by: Allan Jude <[email protected]>
Co-authored-by: Brian Behlendorf <[email protected]>
Co-authored-by: Sebastian Gottschall <[email protected]>
Co-authored-by: Kjeld Schouten-Lebbing <[email protected]>
Co-authored-by: Michael Niewöhner <[email protected]>
Signed-off-by: Allan Jude <[email protected]>
Signed-off-by: Allan Jude <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Sebastian Gottschall <[email protected]>
Signed-off-by: Kjeld Schouten-Lebbing <[email protected]>
Signed-off-by: Michael Niewöhner <[email protected]>
c0d3z3r0 added a commit to c0d3z3r0/zfs that referenced this pull request Aug 19, 2020
This PR adds two new compression types, based on ZStandard:

- zstd: A basic ZStandard compression algorithm Available compression.
  Levels for zstd are zstd-1 through zstd-19, where the compression
  increases with every level, but speed decreases.

- zstd-fast: A faster version of the ZStandard compression algorithm
  zstd-fast is basically a "negative" level of zstd. The compression
  decreases with every level, but speed increases.

  Available compression levels for zstd-fast:
   - zstd-fast-1 through zstd-fast-10
   - zstd-fast-20 through zstd-fast-100 (in increments of 10)
   - zstd-fast-500 and zstd-fast-1000

For more information check the man page.

Implementation details:

Rather than treat each level of zstd as a different algorithm (as was
done historically with gzip), the block pointer `enum zio_compress`
value is simply zstd for all levels, including zstd-fast, since they all
use the same decompression function.

The compress= property (a 64bit unsigned integer) uses the lower 7 bits
to store the compression algorithm (matching the number of bits used in
a block pointer, as the 8th bit was borrowed for embedded block
pointers).  The upper bits are used to store the compression level.

It is necessary to be able to determine what compression level was used
when later reading a block back, so the concept used in LZ4, where the
first 32bits of the on-disk value are the size of the compressed data
(since the allocation is rounded up to the nearest ashift), was
extended, and we store the version of ZSTD and the level as well as the
compressed size. This value is returned when decompressing a block, so
that if the block needs to be recompressed (L2ARC, nop-write, etc), that
the same parameters will be used to result in the matching checksum.

All of the internal ZFS code ( `arc_buf_hdr_t`, `objset_t`,
`zio_prop_t`, etc.) uses the separated _compress and _complevel
variables.  Only the properties ZAP contains the combined/bit-shifted
value. The combined value is split when the compression_changed_cb()
callback is called, and sets both objset members (os_compress and
os_complevel).

The userspace tools all use the combined/bit-shifted value.

Additional notes:

zdb can now also decode the ZSTD compression header (flag -Z) and
inspect the size, version and compression level saved in that header.
For each record, if it is ZSTD compressed, the parameters of the decoded
compression header get printed.

ZSTD is included with all current tests and new tests are added
as-needed.

Per-dataset feature flags now get activated when the property is set.
If a compression algorithm requires a feature flag, zfs activates the
feature when the property is set, rather than waiting for the first
block to be born.  This is currently only used by zstd but can be
extended as needed.

Closes: openzfs#6247
Closes: openzfs#10277
Closes: openzfs#9024
Portions-Sponsored-By: The FreeBSD Foundation
Co-authored-by: Allan Jude <[email protected]>
Co-authored-by: Brian Behlendorf <[email protected]>
Co-authored-by: Sebastian Gottschall <[email protected]>
Co-authored-by: Kjeld Schouten-Lebbing <[email protected]>
Co-authored-by: Michael Niewöhner <[email protected]>
Signed-off-by: Allan Jude <[email protected]>
Signed-off-by: Allan Jude <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Sebastian Gottschall <[email protected]>
Signed-off-by: Kjeld Schouten-Lebbing <[email protected]>
Signed-off-by: Michael Niewöhner <[email protected]>
jsai20 pushed a commit to jsai20/zfs that referenced this pull request Mar 30, 2021
This PR adds two new compression types, based on ZStandard:

- zstd: A basic ZStandard compression algorithm Available compression.
  Levels for zstd are zstd-1 through zstd-19, where the compression
  increases with every level, but speed decreases.

- zstd-fast: A faster version of the ZStandard compression algorithm
  zstd-fast is basically a "negative" level of zstd. The compression
  decreases with every level, but speed increases.

  Available compression levels for zstd-fast:
   - zstd-fast-1 through zstd-fast-10
   - zstd-fast-20 through zstd-fast-100 (in increments of 10)
   - zstd-fast-500 and zstd-fast-1000

For more information check the man page.

Implementation details:

Rather than treat each level of zstd as a different algorithm (as was
done historically with gzip), the block pointer `enum zio_compress`
value is simply zstd for all levels, including zstd-fast, since they all
use the same decompression function.

The compress= property (a 64bit unsigned integer) uses the lower 7 bits
to store the compression algorithm (matching the number of bits used in
a block pointer, as the 8th bit was borrowed for embedded block
pointers).  The upper bits are used to store the compression level.

It is necessary to be able to determine what compression level was used
when later reading a block back, so the concept used in LZ4, where the
first 32bits of the on-disk value are the size of the compressed data
(since the allocation is rounded up to the nearest ashift), was
extended, and we store the version of ZSTD and the level as well as the
compressed size. This value is returned when decompressing a block, so
that if the block needs to be recompressed (L2ARC, nop-write, etc), that
the same parameters will be used to result in the matching checksum.

All of the internal ZFS code ( `arc_buf_hdr_t`, `objset_t`,
`zio_prop_t`, etc.) uses the separated _compress and _complevel
variables.  Only the properties ZAP contains the combined/bit-shifted
value. The combined value is split when the compression_changed_cb()
callback is called, and sets both objset members (os_compress and
os_complevel).

The userspace tools all use the combined/bit-shifted value.

Additional notes:

zdb can now also decode the ZSTD compression header (flag -Z) and
inspect the size, version and compression level saved in that header.
For each record, if it is ZSTD compressed, the parameters of the decoded
compression header get printed.

ZSTD is included with all current tests and new tests are added
as-needed.

Per-dataset feature flags now get activated when the property is set.
If a compression algorithm requires a feature flag, zfs activates the
feature when the property is set, rather than waiting for the first
block to be born.  This is currently only used by zstd but can be
extended as needed.

Portions-Sponsored-By: The FreeBSD Foundation
Co-authored-by: Allan Jude <[email protected]>
Co-authored-by: Brian Behlendorf <[email protected]>
Co-authored-by: Sebastian Gottschall <[email protected]>
Co-authored-by: Kjeld Schouten-Lebbing <[email protected]>
Co-authored-by: Michael Niewöhner <[email protected]>
Signed-off-by: Allan Jude <[email protected]>
Signed-off-by: Allan Jude <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Sebastian Gottschall <[email protected]>
Signed-off-by: Kjeld Schouten-Lebbing <[email protected]>
Signed-off-by: Michael Niewöhner <[email protected]>
Closes openzfs#6247
Closes openzfs#9024
Closes openzfs#10277
Closes openzfs#10278
sempervictus pushed a commit to sempervictus/zfs that referenced this pull request May 31, 2021
This PR adds two new compression types, based on ZStandard:

- zstd: A basic ZStandard compression algorithm Available compression.
  Levels for zstd are zstd-1 through zstd-19, where the compression
  increases with every level, but speed decreases.

- zstd-fast: A faster version of the ZStandard compression algorithm
  zstd-fast is basically a "negative" level of zstd. The compression
  decreases with every level, but speed increases.

  Available compression levels for zstd-fast:
   - zstd-fast-1 through zstd-fast-10
   - zstd-fast-20 through zstd-fast-100 (in increments of 10)
   - zstd-fast-500 and zstd-fast-1000

For more information check the man page.

Implementation details:

Rather than treat each level of zstd as a different algorithm (as was
done historically with gzip), the block pointer `enum zio_compress`
value is simply zstd for all levels, including zstd-fast, since they all
use the same decompression function.

The compress= property (a 64bit unsigned integer) uses the lower 7 bits
to store the compression algorithm (matching the number of bits used in
a block pointer, as the 8th bit was borrowed for embedded block
pointers).  The upper bits are used to store the compression level.

It is necessary to be able to determine what compression level was used
when later reading a block back, so the concept used in LZ4, where the
first 32bits of the on-disk value are the size of the compressed data
(since the allocation is rounded up to the nearest ashift), was
extended, and we store the version of ZSTD and the level as well as the
compressed size. This value is returned when decompressing a block, so
that if the block needs to be recompressed (L2ARC, nop-write, etc), that
the same parameters will be used to result in the matching checksum.

All of the internal ZFS code ( `arc_buf_hdr_t`, `objset_t`,
`zio_prop_t`, etc.) uses the separated _compress and _complevel
variables.  Only the properties ZAP contains the combined/bit-shifted
value. The combined value is split when the compression_changed_cb()
callback is called, and sets both objset members (os_compress and
os_complevel).

The userspace tools all use the combined/bit-shifted value.

Additional notes:

zdb can now also decode the ZSTD compression header (flag -Z) and
inspect the size, version and compression level saved in that header.
For each record, if it is ZSTD compressed, the parameters of the decoded
compression header get printed.

ZSTD is included with all current tests and new tests are added
as-needed.

Per-dataset feature flags now get activated when the property is set.
If a compression algorithm requires a feature flag, zfs activates the
feature when the property is set, rather than waiting for the first
block to be born.  This is currently only used by zstd but can be
extended as needed.

Portions-Sponsored-By: The FreeBSD Foundation
Co-authored-by: Allan Jude <[email protected]>
Co-authored-by: Brian Behlendorf <[email protected]>
Co-authored-by: Sebastian Gottschall <[email protected]>
Co-authored-by: Kjeld Schouten-Lebbing <[email protected]>
Co-authored-by: Michael Niewöhner <[email protected]>
Signed-off-by: Allan Jude <[email protected]>
Signed-off-by: Allan Jude <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Sebastian Gottschall <[email protected]>
Signed-off-by: Kjeld Schouten-Lebbing <[email protected]>
Signed-off-by: Michael Niewöhner <[email protected]>
Closes openzfs#6247
Closes openzfs#9024
Closes openzfs#10277
Closes openzfs#10278
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants