Add SHA256 fast implementations #1
f4ad482 to 4b950c2
4b950c2 to 55ff508
55ff508 to 3c53081
A sha256_bench in /proc/spl/kstat/zfs/ for these, like fletcher_4_bench or vdev_raidz_bench, would be neat for comparing the performance of the different options. Say you want to see whether using only ssse3 saves you on power usage: you would also want to know whether the performance hit is likely to make it a nonstarter. Thanks for doing this!
3c53081 to c2a6d9a
c2a6d9a to 40ea8ee
I added the code for this, but still need to:
|
40ea8ee to 89e103b
I decided to drop the sha256 avx2 implementation. I somewhat understand what is happening, but I don't know how to fix it:
These are the problematic instructions:
loop1 is very different in the other implementations; I don't know why they use SRND here.
Just a friendly ping to @tcaputi and/or @AttilaFueloep, in case they are willing or interested to take a look at this PR. Integration into OpenZFS would be nice.
@tcaputi @AttilaFueloep @jumbi77 I could really use some help here: zfs/module/icp/algs/impl/impl.c Line 132 in 89e103b
I stopped after I couldn't figure out how to
I haven't tried building the branch, but when I do printk debugging in OpenZFS, usually it ends up being something like
|
89e103b to 14831a8
Thanks! I was able to fix the segfault and some other minor issues. In the VM on my laptop, I was able to get the following:
Observations:
(maybe sha256-avx2 would be a little faster, but I can't get it to work due to an assembly/compiler error)
VB added support for passing AESNI through in 5.0...how old a version
are you testing with?
- Rich
…On Sat, Nov 13, 2021 at 9:23 PM cybojanek ***@***.***> wrote:
Thanks!
I was able to fix the segfault and some other minor issues.
In the VM on my laptop, I was able to get the following:
NOTE: aes-ni is missing from the results, because virtualbox doesn't support passing it through...I have to still test in EC2
***@***.*** zfs]$ cat /proc/spl/kstat/zfs/sha256_bench
4 0 0x01 -1 0 6999258145906 7001015547267
implementation bytes/second
fastest 383045519
generic 286355735
x86_64 373738479
sha-avx 383045519
sha-ssse3 337590387
***@***.*** zfs]$ cat /proc/spl/kstat/zfs/sha512_bench
5 0 0x01 -1 0 6999274221510 7008276525174
implementation bytes/second
fastest 707075025
generic 445629113
x86_64 576709922
sha-avx 635035490
sha-avx2 707075025
sha-ssse3 558050472
Observations:
|
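For comparing runs, the bench output quoted above can be reduced to speedups relative to the generic implementation. This is a minimal sketch, not part of the PR: the function name `bench_speedup` is made up for illustration, and it assumes the kstat keeps the two-column `implementation bytes/second` layout shown in the quoted results.

```shell
# Print each implementation's throughput relative to "generic".
# Reads kstat bench text (same layout as the quoted output) on stdin.
# bench_speedup is a hypothetical helper name, not something the PR adds.
bench_speedup() {
    awk '
        # Data rows have exactly two fields and a purely numeric second
        # field; this skips the kstat header line and the column titles.
        NF == 2 && $2 ~ /^[0-9]+$/ {
            names[++n] = $1
            rate[$1] = $2
        }
        END {
            if (!("generic" in rate) || rate["generic"] == 0) {
                print "no generic baseline found" > "/dev/stderr"
                exit 1
            }
            for (i = 1; i <= n; i++)
                printf "%-12s %.2fx\n", names[i], rate[names[i]] / rate["generic"]
        }
    '
}

# Usage (on a machine with the module loaded):
#   bench_speedup < /proc/spl/kstat/zfs/sha256_bench
```

The `fastest` row is kept in the output on purpose, since it shows which implementation the benchmark picked.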
6.1.28-3 in archlinux |
Something seems marginally sad on my Virtualbox VM...
(I'm mostly pointing to the /proc path) Ah, you meant SHA, not AES, which makes sense. I can't find any documentation one way or another about VB passing them through, though the fact that I don't see them in my guest rather indicates ... (e: Debian 11/bullseye x86_64, if you wanted to try) edit again:
|
You'll enjoy this, I think.
As the youths say, "vroom". (From my Ryzen 7 PRO 5850U and an Ubuntu 20.04 VM running under Hyper-V, which does believe in sha_ni.) I also added these to my local copy to allow removing the module without a VERIFY failure:
|
14831a8 to 3f0059d
I also meant to ask what the compile issue you were having with the sha256 avx2 branch is. |
@rincebrain Thanks for the fini code! Do you have any pointers on how to add tests for cycling through the sha256 implementations? I'm unfamiliar with the testing code. |
Are you asking about how to invoke (or extend) ztest, or adding tests to the test suite for this, or all of the above?
For the former, I don't usually run it much, but I'd probably crib notes from how the GH runner does it, and look at things like where ztest sets fletcher4 to cycle, and add similar test functions there. (You should know before running it that, at least on the GitHub runner, it has a nontrivial failure rate even on tests that literally could not touch any code related to it, so it's probably best to treat any issues in ztest as a weak signal, as far as your code goes, at the moment.)
For the latter, I don't see anywhere that it uses cycle in there.
Sorry I missed this message earlier. You can try to fix it on this branch: https://github.com/cybojanek/zfs/tree/sha256_avx2
To keep track of diffs to the Linux source asm you can do:
wget -c "https://raw.githubusercontent.com/torvalds/linux/v5.14/arch/x86/crypto/sha256-avx2-asm.S"
diff -u sha256-avx2-asm.S module/icp/asm-x86_64/sha2/sha256_avx2.S
3f0059d to d2df809
- Add HAVE_SHA compiler define
- Add zfs_sha_available function
- Detect SHA in cpu feature bits
Signed-off-by: Jan Kasiak <[email protected]>
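Before selecting sha-ni it can help to confirm from user space that the (guest) CPU advertises the extension at all, which is the issue that came up with VirtualBox earlier in the thread. The sketch below is only a cross-check and is not how the PR's `zfs_sha_available` works; `has_cpu_flag` is a made-up helper name, and it assumes Linux, where the SHA extensions appear as `sha_ni` on the `flags` line of `/proc/cpuinfo`.

```shell
# Succeed if the first "flags" line of a cpuinfo file lists the given flag.
# $1: flag name, e.g. sha_ni
# $2: cpuinfo path (optional, defaults to /proc/cpuinfo)
# has_cpu_flag is a hypothetical helper, not part of the PR.
has_cpu_flag() {
    awk -v want="$1" '
        /^flags/ {
            # Compare whole words so that "sha" does not match "sha_ni".
            for (i = 1; i <= NF; i++)
                if ($i == want)
                    found = 1
            exit            # the first CPU is enough; jump to END
        }
        END { exit !found }
    ' "${2:-/proc/cpuinfo}"
}

# Usage:
#   has_cpu_flag sha_ni && echo "SHA extensions visible to this kernel"
```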
Some performance numbers using an EC2 m6i.xlarge instance:
echo x86_64 > /sys/module/icp/parameters/icp_sha256_impl
modprobe brd rd_nr=1 rd_size=$((12288 * 1024))
zpool create -f -o ashift=12 \
    -O acltype=posixacl \
    -O relatime=on \
    -O xattr=sa \
    -O dnodesize=legacy \
    -O normalization=formD \
    -O devices=off \
    -O compression=off \
    -O checksum=sha256 \
    zscratch /dev/ram0
dd if=/dev/urandom of=/zscratch/data.bin bs=1M count=12000 status=progress conv=fdatasync
zpool export zscratch
for X in generic x86_64 sha-avx sha-ssse3 sha-ni; do
    echo "$X" > /sys/module/icp/parameters/icp_sha256_impl
    sleep 1
    cat /sys/module/icp/parameters/icp_sha256_impl
    zpool import zscratch
    echo ""
    dd if=/zscratch/data.bin of=/dev/null bs=1M status=progress
    zpool export zscratch
done
I also did a similar thing with scrub: zpool export, change algorithm, zpool import, zpool scrub:
|
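The scrub variant described above (export, change algorithm, import, scrub) might look roughly like the following. This is a sketch under assumptions, not the script that was actually run: `scrub_each_impl` and the `RUN` dry-run switch are invented for illustration, and it assumes an OpenZFS version whose `zpool scrub` supports `-w` to wait for completion. The pool name and module parameter path are the ones used in the read benchmark.

```shell
# Scrub a pool once per SHA-256 implementation and report elapsed seconds.
# $1: pool name; remaining arguments: implementation names.
# Set RUN=echo to print the commands instead of executing them (dry run).
# scrub_each_impl and RUN are hypothetical names, not part of the PR.
scrub_each_impl() {
    pool=$1
    shift
    run=${RUN:-}
    param=/sys/module/icp/parameters/icp_sha256_impl
    for impl in "$@"; do
        # Start each pass from an exported pool, as in the read benchmark.
        $run zpool export "$pool"
        if [ -n "$run" ]; then
            echo "echo $impl > $param"
        else
            echo "$impl" > "$param"
        fi
        $run zpool import "$pool"
        start=$(date +%s)
        # -w blocks until the scrub has finished (assumed to be available).
        $run zpool scrub -w "$pool"
        echo "$impl: $(( $(date +%s) - start ))s"
    done
}

# Usage (as root, with the pool already created and imported):
#   scrub_each_impl zscratch generic x86_64 sha-avx sha-ssse3 sha-ni
```

Wall-clock seconds from `date` are coarse; for a pool of the size used above that should still be enough to separate the implementations.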
d2df809 to 948739b
Closing this in favor of openzfs#12549 |
Motivation and Context
Improves sha256 hash performance by using avx, avx2, ssse3, or sha CPU extensions.
Description
How Has This Been Tested?
NOT YET TESTED
Types of changes
Checklist:
Signed-off-by
.TODO