Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R #4760

Merged
merged 4 commits into from
Oct 4, 2016

Conversation

tonyhutter
Copy link
Contributor

@tonyhutter tonyhutter commented Jun 15, 2016


OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R

Reviewed by: George Wilson [email protected]
Reviewed by: Prakash Surya [email protected]
Reviewed by: Saso Kiselkov [email protected]
Reviewed by: Richard Lowe [email protected]
Approved by: Garrett D'Amore [email protected]
Ported by: Tony Hutter [email protected]

OpenZFS-issue: https://www.illumos.org/issues/4185
OpenZFS-commit: openzfs/openzfs@45818ee

Porting Notes:
This code is ported on top of the Illumos Crypto Framework code:

https://github.com/zfsonlinux/zfs/pull/4329/commits/b5e030c8dbb9cd393d313571dee4756fbba8c22d

Porting changes include:

  • Copied full module/icp/include/sha2/sha2.h directly from illumos
  • Removed from module/icp/algs/sha2/sha2.c:
    #pragma inline(SHA256Init, SHA384Init, SHA512Init)
  • Added 'ctx' to lib/libzfs/libzfs_sendrecv.c:zio_checksum_SHA256() since
    it now takes in an extra parameter.
  • Added CTASSERT() to assert.h from for module/zfs/edonr_zfs.c
  • Added skein & edonr to libicp/Makefile.am
  • Added sha512.S. It was generated from sha512-x86_64.pl in Illumos.
  • Updated ztest.c with new fletcher_4_*() args; used NULL for new ctx argument.
  • In icp/algs/edonr/edonr_byteorder.h, Removed the#if defined(__linux) section
    to not #include the non-existent endian.h.
  • Fixup test files:
    • Rename <sys/varargs.h> -><varargs.h>, <strings.h> -> <string.h>,
    • Remove <note.h> and define NOTE() as NOP.
    • Define u_longlong_t
    • Rename "#!/usr/bin/ksh" -> "#!/bin/ksh -p"
    • Rename NULL to 0 in "no test vector" array entries to get around a
      compiler warning.
    • Remove "for isa in $($ISAINFO); do" stuff
    • Add/update Makefiles
    • Add some userspace headers like stdio.h/stdlib.h in places of
      sys/types.h.
  • EXPORT_SYMBOL *_Init()/*_Update()/*_Final()... routines in ICP modules.
  • Update scripts/zfs2zol-patch.sed
  • include <sys/sha2.h> in sha2_impl.h
  • Add sha2.h to include/sys/Makefile.am
  • Add skein and edonr dirs to icp Makefile
  • Add new checksums to zpool_get.cfg
  • Move checksum switch block from zfs_secpolicy_setprop() to zfs_check_settable()
  • Fix -Wuninitialized error in edonr_byteorder.h on PPC
  • Fix stack frame size errors on ARM32
    - Don't unroll loops in Skein on 32-bit to save stack space
    - Add memory barriers in sha2.c on 32-bit to save stack space
  • Add filetest_001_pos.ksh checksum sanity test
  • Add option to write psudorandom data in file_write utility

@behlendorf
Copy link
Contributor

Add the follow line to the commit message for the top most patch in this patch stack. The buildbot will then pull that SPL commit for testing.

Requires-spl: refs/pull/561/head

@tonyhutter tonyhutter changed the title OpenZFS - 4185 add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R Jun 16, 2016
@tonyhutter tonyhutter force-pushed the openzfs-4185 branch 5 times, most recently from 4c5d249 to 145c821 Compare June 28, 2016 00:28
@tonyhutter tonyhutter force-pushed the openzfs-4185 branch 6 times, most recently from 7654337 to b9f861f Compare July 11, 2016 18:09
@behlendorf
Copy link
Contributor

@tonyhutter looking good. Can you squash your various fixes and rebase this against the latest master.

@tonyhutter
Copy link
Contributor Author

@behlendorf will do. I also need to sanity test it on my PPC and ARM VMs as well.

@tonyhutter tonyhutter mentioned this pull request Jul 13, 2016
@behlendorf
Copy link
Contributor

@tonyhutter can you rebase and squash this in to a stack of patches which can be merged.

@tonyhutter
Copy link
Contributor Author

I'm waiting for the first patch of #4329 to get pulled first (I just marked it LGTM). It will be easier to rebase once that patch is in since my code depends on it.

@behlendorf
Copy link
Contributor

@tonyhutter ahh OK. Then let me get that merged right away and I'll add your signed off. It LGTM as well.

@behlendorf
Copy link
Contributor

OK, the icp module has been merged to master.

@tonyhutter tonyhutter force-pushed the openzfs-4185 branch 4 times, most recently from 366246d to 67400cc Compare July 22, 2016 17:22
@tonyhutter
Copy link
Contributor Author

I've re-based the patch on top of ICP, and have begun work on an additional, real-world test to satisfy my paranoia about the new checksum code.

@lundman
Copy link
Contributor

lundman commented Jul 26, 2016

Oh yes, there were some minor things that we pushed upstream, as well as, coming from upstream.
Check you also get;

openzfsonosx/zfs@bef06e1
openzfsonosx/zfs@b62a652

hope it helps.

@behlendorf
Copy link
Contributor

@lundman definitely should help, thanks for the heads up!

@tcaputi
Copy link
Contributor

tcaputi commented Aug 16, 2016

@tonyhutter

I can't remember 100% why I moved some of the headers (it was months ago when I started on this...) but I think part of the reason was so they could be easily #included in the automated tests (like sha2_test.c).

That makes sense. For encryption, I used only used the generic interfaces so i didn't hit this problem. Another solution there could be to simply add the icp/include/ path to the test Makefile.

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from the minor issues described in the review comments the major thing which needs to be addressed in the patch is described in this comment.

@tuxoko
Copy link
Contributor

tuxoko commented Sep 29, 2016

@tonyhutter
I don't think you should byteswap the checksum. If you look at the fletcher, the byteswap is for the data themself. The checksum value should always be native.

@tonyhutter
Copy link
Contributor Author

tonyhutter commented Sep 29, 2016

@tuxoko can you point to the code you're referring to? Are you talking about ld_swap* in edonr.c?

@tonyhutter
Copy link
Contributor Author

I've rebased the code and fixed the review comments:

  • Added *_fini() to edonr_mod/skein_mod.c, removed _info(), added *_init/*_fini() to illumos-crypto.c
  • Removed skein_digest_update_mblk()/skein_digest_final_mblk() from skein_mod.c
  • Switch back to "checksum = ZIO_CHECKSUM_OFF" instead of ZIO_CHECKSUM_NOPARITY in module/zfs/dmu.c
  • Removed "secret" from "Grab the secret checksum salt from the MOS" comment in spa.c
  • Fixed "poool" spelling mistake in libtest.shlib
  • Fixed edonr/skein compiler errors for crypto_provider_info_t, crypto_ops_t, crypto_kcf_provider_handle_t
  • Used &mod_cryptoops instead of &mod_miscops in skein/edonr for modlmisc assignment
  • Added skein/edonr_mod.o to module/icp/Makefile.in

Note that in order to get this code turned around quickly, I didn't do the the cmn_err() change (@behlendorf gave me the ok to skip it).

Also, I talked to @tcaputi offline about #4760 (comment) . He took another look at it, and doesn't think it's an issue after all.

@tuxoko
Copy link
Contributor

tuxoko commented Sep 29, 2016

@tonyhutter
I'm talking about zio_checksum_SHA512_byteswap, zio_checksum_skein_byteswap and zio_checksum_edonr_byteswap.

We should NOT do

    zcp->zc_word[0] = BSWAP_64(zcp->zc_word[0]);
    zcp->zc_word[1] = BSWAP_64(zcp->zc_word[1]);
    zcp->zc_word[2] = BSWAP_64(zcp->zc_word[2]);
    zcp->zc_word[3] = BSWAP_64(zcp->zc_word[3]);

And in fact, for byte based input hash like SHA (and I assume skein and edon-r are also byte based input), there should be not a byteswap version. You should be able to see this in the lack of byteswap version in the SHA256.

@tonyhutter
Copy link
Contributor Author

Ok, I see what you're saying. You're asking, why does sha512/skein/edonr use their "byteswap" function for their byteswap function pointer while sha256/fletcher use their "native" function for their byteswap function pointer:

/*
 * Information about each checksum function.
 */     
typedef struct zio_checksum_info {
        zio_checksum_t  *ci_func[2]; /* checksum function for each byteorder */
...


zio_checksum_info_t zio_checksum_table[ZIO_CHECKSUM_FUNCTIONS] = {
...
        {{zio_checksum_SHA256,          zio_checksum_SHA256},
            NULL, NULL, ZCHECKSUM_FLAG_METADATA | ZCHECKSUM_FLAG_DEDUP |
            ZCHECKSUM_FLAG_NOPWRITE, "sha256"},
...
        {{zio_checksum_SHA512_native,   zio_checksum_SHA512_byteswap},
            NULL, NULL, ZCHECKSUM_FLAG_METADATA | ZCHECKSUM_FLAG_DEDUP |
            ZCHECKSUM_FLAG_NOPWRITE, "sha512"},
...
        {{zio_checksum_skein_native,    zio_checksum_skein_byteswap},
            zio_checksum_skein_tmpl_init, zio_checksum_skein_tmpl_free,
            ZCHECKSUM_FLAG_METADATA | ZCHECKSUM_FLAG_DEDUP |
            ZCHECKSUM_FLAG_SALTED | ZCHECKSUM_FLAG_NOPWRITE, "skein"},

I don't know the answer. I can only tell you that this is taken directly from illumos. The native/byteswap functions seem to be called in two places:

1.zio_checksum_compute() always calls the "native" function directly (via ci->ci_func[0]).
2. zio_checksum_error_impl() may or may not call the "byteswap" function.

@tuxoko
Copy link
Contributor

tuxoko commented Sep 30, 2016

@tonyhutter
Like I said, if you look at fletcher_4_byteswap. It's doing bswap on input data not on zc_word. So what ever illumos is trying to do, it's not correct.

And the reason we don't byteswap zc_word should be obvious, because most of the time, zc_word is stored inside blkptr_t, meaning it will be swapped to native value when we read it into ARC. And in the case of ZIL blocks, as you can see in zio_checksum_error_impl(), the "in-block checksum" eck->zec_cksum is swapped to native accordingly before comparison.

@rlaager
Copy link
Member

rlaager commented Sep 30, 2016

From what I can see, this has already been merged on Illumos, so if there is a byte-swapping mistake, fixing it would be an on-disk format breakage.

@tuxoko
Copy link
Contributor

tuxoko commented Sep 30, 2016

@rlaager
We cannot leave this issue because it will prevent importing pool from foreign endian machine. And I believe there shouldn't have any on-disk effect of this issue because we only write native endian value.

@behlendorf
Copy link
Contributor

@skiselkov can you take a minute look at @tuxoko's byte swapping issue above. There's concern that this code @tonyhutter has ported directly from illumos us just wrong.

tonyhutter and others added 4 commits October 3, 2016 14:51
…, Edon-R

Reviewed by: George Wilson <[email protected]>
Reviewed by: Prakash Surya <[email protected]>
Reviewed by: Saso Kiselkov <[email protected]>
Reviewed by: Richard Lowe <[email protected]>
Approved by: Garrett D'Amore <[email protected]>
Ported by: Tony Hutter <[email protected]>

OpenZFS-issue: https://www.illumos.org/issues/4185
OpenZFS-commit: openzfs/openzfs@45818ee

Porting Notes:
This code is ported on top of the Illumos Crypto Framework code:

    openzfs@b5e030c

The list of porting changes includes:

- Copied module/icp/include/sha2/sha2.h directly from illumos

- Removed from module/icp/algs/sha2/sha2.c:
	#pragma inline(SHA256Init, SHA384Init, SHA512Init)

- Added 'ctx' to lib/libzfs/libzfs_sendrecv.c:zio_checksum_SHA256() since
  it now takes in an extra parameter.

- Added CTASSERT() to assert.h from for module/zfs/edonr_zfs.c

- Added skein & edonr to libicp/Makefile.am

- Added sha512.S.  It was generated from sha512-x86_64.pl in Illumos.

- Updated ztest.c with new fletcher_4_*() args; used NULL for new CTX argument.

- In icp/algs/edonr/edonr_byteorder.h, Removed the #if defined(__linux) section
  to not #include the non-existant endian.h.

- In skein_test.c, renane NULL to 0 in "no test vector" array entries to get
  around a compiler warning.

- Fixup test files:
	- Rename <sys/varargs.h> -> <varargs.h>, <strings.h> -> <string.h>,
	- Remove <note.h> and define NOTE() as NOP.
	- Define u_longlong_t
	- Rename "#!/usr/bin/ksh" -> "#!/bin/ksh -p"
	- Rename NULL to 0 in "no test vector" array entries to get around a
	  compiler warning.
	- Remove "for isa in $($ISAINFO); do" stuff
	- Add/update Makefiles
	- Add some userspace headers like stdio.h/stdlib.h in places of
	  sys/types.h.

- EXPORT_SYMBOL *_Init/*_Update/*_Final... routines in ICP modules.

- Update scripts/zfs2zol-patch.sed

- include <sys/sha2.h> in sha2_impl.h

- Add sha2.h to include/sys/Makefile.am

- Add skein and edonr dirs to icp Makefile

- Add new checksums to zpool_get.cfg

- Move checksum switch block from zfs_secpolicy_setprop() to
  zfs_check_settable()

- Fix -Wuninitialized error in edonr_byteorder.h on PPC

- Fix stack frame size errors on ARM32
  	- Don't unroll loops in Skein on 32-bit to save stack space
  	- Add memory barriers in sha2.c on 32-bit to save stack space

- Add filetest_001_pos.ksh checksum sanity test

- Add option to write psudorandom data in file_write utility
…ed in the dedup property value

Authored by: ilovezfs <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Richard Laager <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Ported-by: Tony Hutter <[email protected]>

zio_checksum_to_feature() expects a zio_checksum enum not a raw property
intval, so the new checksums weren't being detected when the
ZIO_CHECKSUM_VERIFY flag got in the way.

Given a pool without feature@sha512,

    zfs create -o dedup=sha512 naughty/fivetwelve_noverify_ds

would fail as expected since the raw intval would indeed be equal to
SPA_FEATURE_SHA512.

However,

    zfs create -o dedup=sha512,verify naughty/fivetwelve_verify_ds

would incorrectly succeed because ZIO_CHECKSUM_VERIFY would be in the
way, the raw intval would not be a member of the enum, and
zio_checksum_to_feature() would return SPA_FEATURE_NONE, with the result
that spa_feature_is_enabled() would never be called.

This was first detected with edonr, since in that case verify is
required.

This commit clears the ZIO_CHECKSUM_VERIFY flag before calling
zio_checksum_to_feature() using the ZIO_CHECKSUM_MASK and verifies in
zio_checksum_to_feature() that ZIO_CHECKSUM_MASK has been applied by the
caller to attempt to prevent the same bug from occurring again in the
future.

OpenZFS-issue: https://www.illumos.org/issues/6541
OpenZFS-commit: illumos/illumos-gate@971640e

Porting notes:
This code was originally from Illumos, but I actually ported it from:
openzfsonosx/zfs@bef06e1
… on extensible dataset

Authored by: ilovezfs <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Richard Laager <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Ported by: Tony Hutter <[email protected]>

In any pool without the extensible dataset feature flag already enabled,
creating a dataset with dedup set to use one of the new checksums would
result in the following panic as soon as any data was added:

panic[cpu0]/thread=ffffff0006761c40: feature_get_refcount(spa, feature,
&refcount) != 48 (0x30 != 0x30), file: ../../common/fs/zfs/zfeature.c
line 390

Inpsection showed that feature->fi_feature was 7, which is the value of
SPA_FEATURE_EXTENSIBLE_DATASET in the spa_feature enum.  This commit
adds extensible dataset as a dependency for the sha512, edonr, and skein
feature flags, which prevents the panic.

OpenZFS-issue: https://www.illumos.org/issues/6585
OpenZFS-commit: illumos/illumos-gate@892586e
Porting Notes:
This code was originally from Illumos, but I actually ported it from:
openzfsonosx/zfs@b62a652
As part of its tests, filetest_001_pos.ksh wipes the entire vdev to
create checksum errors.  This patch uses the setup/cleanup scripts from
the scrub_mirror test to create a custom 100MB pool, rather than
using the entire device size that is passed into zfs-tests.sh
(which defaults to 2GB).  This speeds up the buildbot tests, and also
makes it possible for someone to use real disks (say, 1TB) without the
test taking an insanely long amount of time.
@tuxoko
Copy link
Contributor

tuxoko commented Oct 3, 2016

@tonyhutter @behlendorf
Ok, so I take a closer look at this, and I think I kind of know what the illumos is trying to do.

Originally in SHA256, we calculate the checksum and encode it into 64bit numbers. So no matter which endian you are, the same block will calculate into the same 64bit numbers(zc_word[]). And these numbers will be written to disk in the native endianness. When we import a foreign endian pool, the checksum will swapped to native endian as part of blkptr_t, or explicitly in zio_checksum_error_impl. So we don't need a separate function to calculate a block from foreign pool.

Now what they are trying to do with new checksum is, instead of encoding into 64bit numbers, it encode into byte stream. So no matter which endian you are, the same block will calculate to the same byte stream and written that byte stream to disk, but the zc_word[] will be reversed between different endian. Now when we import a foreign endian pool, the byte stream will still get byte swapped as 64bit numbers as part of blkptr_t and zio_checksum_error_impl. So to verify the blocks on foreign pool, we need a separate function version to swap the checksum byte stream calculated from foreign pool.

So this patch as it is would probably work fine when cross importing. However, this also raise the question how dedup would work when we do crosss import. As far as I can tell, dedup use checksum as in 64bit numbers(zc_word[]) as key to search. Which means you will not be able to find the entries in the dedup table written by the foreign machine even if you are writing the exact same data block.

And also, this does have on disk effect, so I'm not sure how we should deal with this.

@behlendorf
Copy link
Contributor

So I'm not sure how we should deal with this.

We should merge this pull request as is to bring ZoL back in sync with OpenZFS.

Pools with these checksums already exist in the wild from illumos, so if there is an endianness issue lurking here which needs to be handled it can be tackled in another PR. In practice big endian systems are fairly rare these days though so it's doubtful to cause users real problems. Even modern ppc system and arm systems support little endian now.

As for the dedup issues you mentioned that's definitely something to address in a different PR.

Copy link
Contributor

@dpquigl dpquigl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed the changes while working to ABD enable them and I haven't come across anything that I think needs to be fixed before it can be merged.

@behlendorf behlendorf merged commit 5cc78dc into openzfs:master Oct 4, 2016
@behlendorf
Copy link
Contributor

Thanks everybody I've merged this to master.

tcaputi pushed a commit to tcaputi/zfs that referenced this pull request Oct 9, 2016
When openzfs#4760 was merged tests were added to ensure that the new checksums
were working preperly. However, some of the functionality for sha2
functions were not ported over, resulting in some Coverity defects and
code that would be unstable when needed in the future. This patch
simply ports over the missing code and fixes the defects in the
process.
behlendorf pushed a commit that referenced this pull request Oct 10, 2016
When #4760 was merged tests were added to ensure that the new checksums
were working properly. However, some of the functionality for sha2
functions were not ported over, resulting in some Coverity defects and
code that would be unstable when needed in the future. This patch
simply ports over the missing code and fixes the defects in the
process.

Reviewed by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Issue #4760 
Closes #5251
tcaputi pushed a commit to tcaputi/zfs that referenced this pull request Oct 11, 2016
The ICP requires destructors to for each crypto module that is added.
These do not necessarily exist in Illumos because they assume that
these modules can never be unloaded from the kernel. Some of this
cleanup code was missed when openzfs#4760 was merged, resulting in leaks.
This patch simply fixes that.
behlendorf pushed a commit that referenced this pull request Oct 12, 2016
The ICP requires destructors to for each crypto module that is added.
These do not necessarily exist in Illumos because they assume that
these modules can never be unloaded from the kernel. Some of this
cleanup code was missed when #4760 was merged, resulting in leaks.
This patch simply fixes that.

Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Issue #4760 
Closes #5265
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants