Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unused Edon-R variants #13618

Merged
merged 1 commit into from
Mar 14, 2023
Merged

Conversation

mcmilk
Copy link
Contributor

@mcmilk mcmilk commented Jul 2, 2022

This commit removes the edonr_byteorder.h file and all unused
variants of Edon-R.

Signed-off-by: Tino Reichardt [email protected]

Motivation and Context

Simplify the code and generate little speedups also ;-)

FreeBSD freebsd 13.1-RELEASE with clang 13.0.0:
implementation               1k      4k     16k     64k    256k      1m      4m     16m
edonr-generic-master       1982    2330    2435    2444    2468    2476    2130    2138
edonr-generic-newone       2202    2567    2679    2657    2618    2430    2373    2461

Debian Linux 5.10.0-16-amd64 with gcc version 10.2.1:
implementation               1k      4k     16k     64k    256k      1m      4m     16m
edonr-generic-master       1898    2495    2650    2748    2778    2700    2737    2739
edonr-generic-newone       2144    2750    2946    2987    2973    2805    2856    2978

The size of the compiled edonr.o with debugging went down from 574,128 to 111,672 bytes.

Description

How Has This Been Tested?

I have run the standard testing procedures on a test system... the CI seems also fine.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@mcmilk
Copy link
Contributor Author

mcmilk commented Jul 2, 2022

I think that ignoring these compiler warnigs may also a reason for memory corruption on i386 and maybe other small stack architectures. I will rework this patch the same way as done with blake3 by pre-allocating a per-cpu space.

@mcmilk mcmilk force-pushed the edonr-cleanup branch 7 times, most recently from 69a0e97 to b7b50eb Compare July 2, 2022 13:47
@rincebrain
Copy link
Contributor

Have you seen memory corruption on i386 with this, then?

I've seen plenty of issues with it panicking on i386, but none of them seemed to be from memory corruption so far...

@mcmilk
Copy link
Contributor Author

mcmilk commented Jul 2, 2022

Have you seen memory corruption on i386 with this, then?

I've seen plenty of issues with it panicking on i386, but none of them seemed to be from memory corruption so far...

I want it to fit into default stack size ... and remove the pragma then. Maybe this is some source of memory corruption, but I have not checked this in detail. The result for for overwriting stacks is mostly pure random.

When I get it into the stack space, the result is slower also :/

@rincebrain
Copy link
Contributor

Just say "I dislike pragmas", if that's why you're doing it.

@mcmilk mcmilk force-pushed the edonr-cleanup branch 5 times, most recently from 0d877c0 to 7360810 Compare July 2, 2022 20:31
@mcmilk
Copy link
Contributor Author

mcmilk commented Jul 2, 2022

Checkstyle is fine, the benchmarks show the same values.... and I could minimize the stack size also a bit.
The -Wframe-larger-than= pragma is needed for gcc: error: the frame size of 1356 bytes is larger than 1024 bytes
Compiling the same file with clang, the stack size is fine - I have no idea why.

@rincebrain
Copy link
Contributor

I'd look at precisely what flags the kernel is passing to compile with make V=1, but in particular, I don't think many versions of clang know what-Wframe-larger-than is. (I see some discussion from 2021 about adding various stack usage warning limits, but that's newer than a number of distros' compiler versions, so I wouldn't imagine the kernel would be happy to rely on that working yet.)

@mcmilk
Copy link
Contributor Author

mcmilk commented Jul 3, 2022

I'd look at precisely what flags the kernel is passing to compile with make V=1, but in particular, I don't think many versions of clang know what-Wframe-larger-than is. (I see some discussion from 2021 about adding various stack usage warning limits, but that's newer than a number of distros' compiler versions, so I wouldn't imagine the kernel would be happy to rely on that working yet.)

I am using gcc version 10.2.1 and clang version 11.0.1 currently. The -Wframe-larger-than is supported and I used -Wframe-larger-than=1 for getting infos about the stack usage of edonr.c (userspace, without ZFS headers).

Results:

# gcc 32bit seems stupid:
[mcmilk@latitude ~]$ gcc -c -Os -m32 -W -Wframe-larger-than=256 edonr.c
edonr.c: In function ‘Q512’:
edonr.c:217:1: warning: the frame size of 1328 bytes is larger than 256 bytes [-Wframe-larger-than=]
edonr.c: In function ‘EdonRHash’:
edonr.c:325:1: warning: the frame size of 400 bytes is larger than 256 bytes [-Wframe-larger-than=]

[mcmilk@latitude ~]$ gcc -c -m32 -O0 -W -Wframe-larger-than=256 edonr.c
edonr.c: In function ‘Q512’:
edonr.c:217:1: warning: the frame size of 2128 bytes is larger than 256 bytes [-Wframe-larger-than=]
edonr.c: In function ‘EdonRHash’:
edonr.c:325:1: warning: the frame size of 400 bytes is larger than 256 bytes [-Wframe-larger-than=]

[mcmilk@latitude ~]$ gcc -c -m32 -O1 -W -Wframe-larger-than=256 edonr.c
edonr.c: In function ‘Q512’:
edonr.c:217:1: warning: the frame size of 1304 bytes is larger than 256 bytes [-Wframe-larger-than=]
edonr.c: In function ‘EdonRHash’:
edonr.c:325:1: warning: the frame size of 400 bytes is larger than 256 bytes [-Wframe-larger-than=]

[mcmilk@latitude ~]$ gcc -c -m32 -O2 -W -Wframe-larger-than=256 edonr.c
edonr.c: In function ‘Q512’:
edonr.c:217:1: warning: the frame size of 1304 bytes is larger than 256 bytes [-Wframe-larger-than=]
edonr.c: In function ‘EdonRHash’:
edonr.c:325:1: warning: the frame size of 400 bytes is larger than 256 bytes [-Wframe-larger-than=]

[mcmilk@latitude ~]$ gcc -c -m32 -O3 -W -Wframe-larger-than=256 edonr.c
edonr.c: In function ‘Q512’:
edonr.c:217:1: warning: the frame size of 1304 bytes is larger than 256 bytes [-Wframe-larger-than=]
edonr.c: In function ‘EdonRHash’:
edonr.c:325:1: warning: the frame size of 432 bytes is larger than 256 bytes [-Wframe-larger-than=]

# gcc 64bit ok:
[mcmilk@latitude ~]$ gcc -c -Os -m64 -W -Wframe-larger-than=256 edonr.c
edonr.c: In function ‘EdonRHash’:
edonr.c:325:1: warning: the frame size of 400 bytes is larger than 256 bytes [-Wframe-larger-than=]

# clang, all ok:
[mcmilk@latitude ~]$ clang -c -Os -m32 -W -Wframe-larger-than=256 edonr.c
edonr.c:121:1: warning: stack frame size of 432 bytes in function 'Q512' [-Wframe-larger-than=]
edonr.c:318:1: warning: stack frame size of 412 bytes in function 'EdonRHash' [-Wframe-larger-than=]

[mcmilk@latitude ~]$ clang -c -Os -m64 -W -Wframe-larger-than=256 edonr.c
edonr.c:318:1: warning: stack frame size of 424 bytes in function 'EdonRHash' [-Wframe-larger-than=]

@rincebrain
Copy link
Contributor

rincebrain commented Jul 3, 2022

Neat, so it believes in it.

It seems like there's no difference in configuration for that flag between compilers, so my only guess would be if Clang is doing a better job at stack optimization...

I'm curious how gcc is behaving so badly at it, in your pasted example for -m32. Maybe clang is playing loose with what's required to be on the stack? Dunno. Might also be interesting to compare with -O2, since I believe that's what the kernel defaults to...

@mcmilk
Copy link
Contributor Author

mcmilk commented Jul 3, 2022

Neat, so it believes in it.

It seems like there's no difference in configuration for that flag between compilers, so my only guess would be if Clang is doing a better job at stack optimization...

I'm curious how gcc is behaving so badly at it, in your pasted example for -m32. Maybe clang is playing loose with what's required to be on the stack? Dunno. Might also be interesting to compare with -O2, since I believe that's what the kernel defaults to...

The comment above got an update...

Each call within Q512() generates some used bytes on the stack at x32 (for gcc) ... when I remove some rounds in the end ... the stack gets under 1024 ... who cares ;-)

@behlendorf behlendorf added Status: Code Review Needed Ready for review and testing labels Jul 5, 2022
@mcmilk
Copy link
Contributor Author

mcmilk commented Jul 11, 2022

@rincebrain - could you make a Code Review?
I haven't touched much code... mainly removing the unused variants and some re-using of variables by introducing these ones: uint64_t z1, z2, z3, z4, z5, z6, z7, z8;

@rincebrain
Copy link
Contributor

I'll try to get to actually looking it over in depth later this week, but when I tested it, it didn't demonstrate any obvious incorrectness on a 100 GB edonr dataset generated with the old code...

I'll also try and give it a spin on my BE systems just to be safe.

@mcmilk
Copy link
Contributor Author

mcmilk commented Jul 22, 2022

I rebased again, cause the unused define EDONR_VALID_HASHBITLEN was left...

@rincebrain
Copy link
Contributor

I'm not Brian, but I at least use EdonR when I want nopwrite. :)

I apparently have been bad at replying directly in here, but the smashing on sparc64 had nothing to do with this PR, to be clear, it just...does that now, I guess. :/

My concern, which I don't think I posted already, sorry, is that while I don't think any of the cleanup is particularly controversial and you could probably convince something to check and agree that the changes don't change the behavior for any codepaths that are hit, if you were really motivated in that way, I just get worried about any sort of large-scale changes to anything core like checksums or compression (...I say with my fingers in many branches of both pies), and I'm wondering how much the cleanup and performance delta is worth any risk of changing it, even marginal. (Normally I'd also be worried about ever syncing with upstream, but I think we can all agree that's never changing again.)

I also tend to optimize toward avoiding risk when giving advice outside of what I do to my own systems, though, so perhaps I'm being overcautious.

@behlendorf
Copy link
Contributor

behlendorf commented Oct 13, 2022

@mcmilk I'm sure edonr is used and nobody would complain about a speed up! But I do share @rincebrain's concerns above about this being a non-trivial change. Creating a testing tool is great, but can you also do some some manual testing if you haven't already. I'm thinking something along the lines of creating a pool on one platform, filling it with some data, then importing the pool on other platforms and scrub it to verify those implementations. It seems like that should catch anything seriously broken.

@mcmilk
Copy link
Contributor Author

mcmilk commented Oct 14, 2022

@mcmilk I'm sure edonr is used and nobody would complain about a speed up! But I do share @rincebrain's concerns above about this being a non-trivial change. Creating a testing tool is great, but can you also do some some manual testing if you haven't already. I'm thinking something along the lines of creating a pool on one platform, filling it with some data, then importing the pool on other platforms and scrub it to verify those implementations. It seems like that should catch anything seriously broken.

I only have x86_64 arch VMs with full root access. Sth. like Arm, Aarch64, PPC and so on is only userspace testing.
But I cold ask for a PPC64 VM @ https://openpower.ic.unicamp.br/minicloud/ ... maybe we could also get a buildbot VM there.

Has someone of you (@rincebrain @behlendorf) already tried to get a machine there?

My Raspberry Pi3 may also an option... I forgot. Will import with old zfs there... this should be a simple testcase... will do this in the weekend.

@rincebrain
Copy link
Contributor

I've not, but I've heard it's straightforward and have suggested adding a CI bot with one.

I do have sparc64 and aarch64 laying around my home, and usually have a ppc64le VM but the host is off for maintenance atm.

@behlendorf
Copy link
Contributor

Nor have I. Another option might be to update PR 6209, it adds offline zhack scrub command which runs entirely in user space.

@mcmilk
Copy link
Contributor Author

mcmilk commented Oct 19, 2022

Not there anymore: https://openpower.ic.unicamp.br/minicloud/ :/

@mcmilk mcmilk force-pushed the edonr-cleanup branch 3 times, most recently from fbfdf4e to bb53d13 Compare December 22, 2022 19:14
@mcmilk mcmilk force-pushed the edonr-cleanup branch 2 times, most recently from 71a5bf3 to fd0883f Compare February 27, 2023 18:00
@behlendorf
Copy link
Contributor

A fresh rebase on master now that #13741 has been merged would be nice.

@mcmilk
Copy link
Contributor Author

mcmilk commented Mar 3, 2023

CentOS 7 does this while installing:

Error: Package: python2-pyzfs-2.0.7-1.el7.noarch (@/python2-pyzfs-2.0.7-1.el7.noarch)
           Requires: libnvpair3 = 2.0.7
           Removing: libnvpair3-2.0.7-1.el7.x86_64 (@/libnvpair3-2.0.7-1.el7.x86_64)
               libnvpair3 = 2.0.7-1.el7
           Updated By: libnvpair3-2.1.99-1781_g4f4252c.el7.x86_64 (/libnvpair3-2.1.99-1781_g4f4252c.el7.x86_64)
               libnvpair3 = 2.1.99-1781_g4f4252c.el7
Error: Package: python2-pyzfs-2.0.7-1.el7.noarch (@/python2-pyzfs-2.0.7-1.el7.noarch)
           Requires: libzfs4 = 2.0.7
           Removing: libzfs4-2.0.7-1.el7.x86_64 (@/libzfs4-2.0.7-1.el7.x86_64)
               libzfs4 = 2.0.7-1.el7
           Obsoleted By: libzfs5-2.1.99-1781_g4f4252c.el7.x86_64 (/libzfs5-2.1.99-1781_g4f4252c.el7.x86_64)
              ~libzfs5 = 2.1.99-1781_g4f4252c.el7
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest

@behlendorf behlendorf self-requested a review March 7, 2023 01:21
This commit removes the edonr_byteorder.h file and all unused
variants of Edon-R.

Signed-off-by: Tino Reichardt <[email protected]>
@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Mar 14, 2023
@behlendorf behlendorf merged commit fe6a7b7 into openzfs:master Mar 14, 2023
@mcmilk mcmilk deleted the edonr-cleanup branch March 15, 2023 19:21
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Mar 17, 2023
This commit removes the edonr_byteorder.h file and all unused
variants of Edon-R.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tino Reichardt <[email protected]>
Closes openzfs#13618
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Mar 17, 2023
This commit removes the edonr_byteorder.h file and all unused
variants of Edon-R.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tino Reichardt <[email protected]>
Closes openzfs#13618
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Mar 17, 2023
This commit removes the edonr_byteorder.h file and all unused
variants of Edon-R.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tino Reichardt <[email protected]>
Closes openzfs#13618
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Mar 17, 2023
This commit removes the edonr_byteorder.h file and all unused
variants of Edon-R.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tino Reichardt <[email protected]>
Closes openzfs#13618
behlendorf added a commit that referenced this pull request Oct 13, 2023
New Features
- Block cloning (#13392)
- Linux container support (#14070, #14097, #12263)
- Scrub error log (#12812, #12355)
- BLAKE3 checksums (#12918)
- Corrective "zfs receive"
- Vdev and zpool user properties

Performance
- Fully adaptive ARC (#14359)
- SHA2 checksums (#13741)
- Edon-R checksums (#13618)
- Zstd early abort (#13244)
- Prefetch improvements (#14603, #14516, #14402, #14243, #13452)
- General optimization (#14121, #14123, #14039, #13680, #13613,
  #13606, #13576, #13553, #12789, #14925, #14948)

Signed-off-by: Brian Behlendorf <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants