-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SHA2 reworking and API for iterating over multiple implementations #13741
Conversation
10c2baa
to
97c0f4c
Compare
I'll give these a spin on my various sundry machines later today or tomorrow. |
Of course, it's important to include the silliest part, the example benchmarks. Ryzen 5900X:
AArch64 VM on my Mac Mini M1:
Sadly,
So that bodes ill. (This is on my Debian 11 system with a -backports kernel and --enable-debug, but it works/fails the same way on the AArch64 VM too.) (I'm blaming ztest and not zdb because zdb from the unmodified tree fails the same way on the resulting vdev files.) edit 2: well, it fails...DIFFERENTLY with --enable-asan...
|
I think that |
I haven't dug into it, but based on the ASAN failure assertion, I was
wondering if the native encryption stuff was calling into the SHA2 API
somewhere in the ASM and broke from e.g. reordering the arguments.
edit:
```
$ ./tests/zfs-tests/tests/functional/hkdf/hkdf_test
TEST 0: HKDF failed with error code 5
Test failed.
```
So that seems to agree something is discontent in there...
…On Mon, Aug 8, 2022, 11:59 AM Tino Reichardt ***@***.***> wrote:
I think that zdb and ztest are okay and I am missing sth.... I will take
a closer look.
Because this PR replaces the whole SHA2 code, I would be happy if someone
from the OpenZFS team could say sth. for or against it... I will
invenstigate the failure, it makes sense.
—
Reply to this email directly, view it on GitHub
<#13741 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AABUI7MAOWY6FUZHL3IZTTLVYEVGDANCNFSM55W76FVQ>
.
You are receiving this because you commented.Message ID:
***@***.***>
|
module/icp/algs/sha2/sha2_generic.c
Outdated
memset(m + pos, 0, (size_t)(128 - pos)); | ||
mlen = cpu_to_be64(ctx->count[1]); | ||
memcpy(m + (128 - 8), &mlen, 64 / 8); | ||
kfpu_end(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Scratch that, reverse it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @rincebrain - this branch is just an RFC ... please do not test in deep, it's full of bugs, I need to find in the weekend ;-)
But the EdonR shrinking and the BLAKE3 tuneable fix are ready for a review.
The kfpu_begin()/end() are on my TODO for this branch here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't spend a bunch of time testing it, I just assumed it would at least run enough to try.
I'd advise marking the PR as a draft, and putting a more explicit disclaimer about it being known to not function correctly, if that's the case, otherwise some people who test open PRs if they seem interesting might try it and be distressed.
And the specific thing I was pointing out was that you had reversed kfpu_end
and kfpu_begin
in that one case, not that you were missing them more generally or should delete them.
I was waiting for there to be a few days when those two PRs didn't get an update to review them, as force-pushing over the entire branch makes it difficult to tell what's changed, and the last two times I started reviewing one of them, they got updated before I finished.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the generic methods the kfpu begin/end was not okay there... I will put it into the transform() function itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was waiting for there to be a few days when those two PRs didn't get an update to review them, as force-pushing over the entire branch makes it difficult to tell what's changed, and the last two times I started reviewing one of them, they got updated before I finished.
I am sorry for this. I rebased the Edon-R branch because some reference for this #13710 was requested by @gmelikov
And the next update to the sha2-rework will be a seperate commit - okay :)
03dac33
to
d36c47b
Compare
@rincebrain - the patch is ready for real testing now. Have you got some time for the EdonR thing? |
0021ef5
to
138d178
Compare
8e72060
to
1d40c3b
Compare
The skeleton file module/icp/include/generic_impl.c can be used for iterating over different implementations of algorithms. It is used by SHA256, SHA512 and BLAKE3 currently. The Solaris SHA2 implementation got replaced with a version which is based on public domain code of cppcrypto v0.10. These assembly files are taken from current openssl master: - sha256-x86_64.S: x64, SSSE3, AVX, AVX2, SHA-NI (x86_64) - sha512-x86_64.S: x64, AVX, AVX2 (x86_64) - sha256-armv7.S: ARMv7, NEON, ARMv8-CE (arm) - sha512-armv7.S: ARMv7, NEON (arm) - sha256-armv8.S: ARMv7, NEON, ARMv8-CE (aarch64) - sha512-armv8.S: ARMv7, ARMv8-CE (aarch64) - sha256-ppc.S: Generic PPC64 LE/BE (ppc64) - sha512-ppc.S: Generic PPC64 LE/BE (ppc64) - sha256-p8.S: Power8 ISA Version 2.07 LE/BE (ppc64) - sha512-p8.S: Power8 ISA Version 2.07 LE/BE (ppc64) Tested-by: Rich Ercolani <[email protected]> Tested-by: Sebastian Gottschall <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes openzfs#13741
This commit changes the BLAKE3 implementation handling and also the calls to it from the ztest command. Tested-by: Rich Ercolani <[email protected]> Tested-by: Sebastian Gottschall <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes openzfs#13741
- instead of ".section .rodata" we should use SECTION_STATIC Tested-by: Rich Ercolani <[email protected]> Tested-by: Sebastian Gottschall <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes openzfs#13741
Is there really some problem with the SHA2 files of this PR within the Windows Port? I used |
Give me a sec, just merging in and making sure its running.. it'll look something like: hrmm i need to pull out the kfpu_vars as a PR - so I can fix up in a separate PR |
Ah, I see - sorry for these. Maybe an extra PR just for these changes would be good - so they get in quickly. |
Ignore previous messages, they were wrong. The issue was actually that it uses the TF macro for most calls, to make wrappers then call into assembly;
Except for the first type:
which created a case where we do regular C calls to "wrappers" (which then do ASMABI calls out), except for If I change that function to also have a wrapper;
Then it is consistent. All wrappers are called in C, and all assembly are called with ASMABI. I presume that If that is the case, I can:
|
Also, the nested kfpu_begin() calls seem to have been cleaned up, so I can probably get rid of kfpu_vars |
Hello @lundman - yes, the What we need in general, is some new SPL code to get crypto functions from FreeBSD, MacOS and Windows supported. I am thinking about this ... and will create some PR as draft when there is sth. ready. Maybe you had also this idea. |
It appears that IBM POWER7 (Power ISA 2.06) has being mis-detected to use
|
@Low-power - can you modify the But I got fresh access to ppc machines here: https://osuosl.org/services/powerdev/ ... but I am not ready for testings there. Edit: the this macro checks for power8: Can you try this patch:
|
I would provide an official PR when you say the kernel oops is away. |
Thanks for the patch, I will try it after a fresh reboot on next Monday. |
I just tested it; the fix however is incomplete,
|
zfs_sha256_available(void) | ||
{ | ||
unsigned long ftr = ((get_ftr(ID_AA64ISAR0_EL1)) >> 12) & 0x3; | ||
return (ftr & 0x1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be different on non-Apple arm64, but my M2 reports:
<zfs`spl_cpuid_id_aa64isar0_el1 (spl-processor.c:172)> cpu_features0: 0x0221100110212120
<zfs`spl_cpuid_id_aa64isar0_el1 (spl-processor.c:172)> cpu_features0: AES PMULL SHA1 SHA256 SHA512 SHA3
Note that the SHA bits [12-15] (0xf) has the value 0x2
. Which would in the above "if" only say yes to SHA512, but not SHA256. I take it to mean that if SHA512 is enabled, then SHA256 is implied as as well. I would think if should
be if (ftr & 0xf)
being anything not-zero. (or 0x3 if preferred, but either bit)
Is this an Apple quirk?
It also happens with the AES bits [4-7]
aes & 3 ? "AES " : "",
aes & 2 ? "PMULL " : "",
If either bit is set, it has AES, if 0x2 is set, it also has PMULL.
OK, so changes to compile on macOS in shared files: openzfsonosx@256294a adding aesv8 |
We had three sha2.h headers in different places. The FreeBSD version, the Linux version and the generic solaris version. The only assembly used for acceleration was some old x86-64 openssl implementation for sha256 within the icp module. For FreeBSD the whole SHA2 files of FreeBSD were copied into OpenZFS, these files got removed also. Tested-by: Rich Ercolani <[email protected]> Tested-by: Sebastian Gottschall <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes openzfs#13741
Tested-by: Rich Ercolani <[email protected]> Tested-by: Sebastian Gottschall <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes openzfs#13741
These are added: - zfs_neon_available() for arm and aarch64 - zfs_sha256_available() for arm and aarch64 - zfs_sha512_available() for aarch64 - zfs_shani_available() for x86_64 Changes: - simd_powerpc.h: change license from CDDL to BSD Tested-by: Rich Ercolani <[email protected]> Tested-by: Sebastian Gottschall <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes openzfs#13741
These are added: - zfs_neon_available() for arm and aarch64 - zfs_sha256_available() for arm and aarch64 - zfs_sha512_available() for aarch64 - zfs_shani_available() for x86_64 Tested-by: Rich Ercolani <[email protected]> Tested-by: Sebastian Gottschall <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Co-Authored-By: Sebastian Gottschall <[email protected]> Closes openzfs#13741
These are added via HWCAP interface: - zfs_neon_available() for arm and aarch64 - zfs_sha256_available() for arm and aarch64 - zfs_sha512_available() for aarch64 This one via cpuid() call: - zfs_shani_available() for x86_64 Tested-by: Rich Ercolani <[email protected]> Tested-by: Sebastian Gottschall <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes openzfs#13741
The skeleton file module/icp/include/generic_impl.c can be used for iterating over different implementations of algorithms. It is used by SHA256, SHA512 and BLAKE3 currently. The Solaris SHA2 implementation got replaced with a version which is based on public domain code of cppcrypto v0.10. These assembly files are taken from current openssl master: - sha256-x86_64.S: x64, SSSE3, AVX, AVX2, SHA-NI (x86_64) - sha512-x86_64.S: x64, AVX, AVX2 (x86_64) - sha256-armv7.S: ARMv7, NEON, ARMv8-CE (arm) - sha512-armv7.S: ARMv7, NEON (arm) - sha256-armv8.S: ARMv7, NEON, ARMv8-CE (aarch64) - sha512-armv8.S: ARMv7, ARMv8-CE (aarch64) - sha256-ppc.S: Generic PPC64 LE/BE (ppc64) - sha512-ppc.S: Generic PPC64 LE/BE (ppc64) - sha256-p8.S: Power8 ISA Version 2.07 LE/BE (ppc64) - sha512-p8.S: Power8 ISA Version 2.07 LE/BE (ppc64) Tested-by: Rich Ercolani <[email protected]> Tested-by: Sebastian Gottschall <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes openzfs#13741
This commit changes the BLAKE3 implementation handling and also the calls to it from the ztest command. Tested-by: Rich Ercolani <[email protected]> Tested-by: Sebastian Gottschall <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes openzfs#13741
- instead of ".section .rodata" we should use SECTION_STATIC Tested-by: Rich Ercolani <[email protected]> Tested-by: Sebastian Gottschall <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes openzfs#13741
New Features - Block cloning (#13392) - Linux container support (#14070, #14097, #12263) - Scrub error log (#12812, #12355) - BLAKE3 checksums (#12918) - Corrective "zfs receive" - Vdev and zpool user properties Performance - Fully adaptive ARC (#14359) - SHA2 checksums (#13741) - Edon-R checksums (#13618) - Zstd early abort (#13244) - Prefetch improvements (#14603, #14516, #14402, #14243, #13452) - General optimization (#14121, #14123, #14039, #13680, #13613, #13606, #13576, #13553, #12789, #14925, #14948) Signed-off-by: Brian Behlendorf <[email protected]>
This RFC is about replacing the solaris SHA2 implementation with a public domain variant of SHA2.
Also a new generic implementation API is introduced and could be used for other algorithms like AES, Fletcher and so on...
Here are some benchmarks, which are done while loading the zfs module. By default the fastest impl. will be used.
TODO
The PR #13649 can use additional hardware acceleration via FreeBSD drivers, it should also go into OpenZFS, but not within this pull request.
Description
The skeleton file module/icp/include/generic_impl.c can be used for iterating over different implementations of algorithms. It is used by SHA256, SHA512 and BLAKE3 currently.
The Solaris SHA2 implementation got replaced with public domain code of [ccpcrypto 0.10 (https://sourceforge.net/projects/cppcrypto/files/cppcrypto-0.10-src.zip/download).
These assembly files are taken from current openssl master:
Signed-off-by: Tino Reichardt [email protected]
How Has This Been Tested?
Types of changes
Checklist:
Signed-off-by
.