-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
aarch64: Use proper guards for NEON instructions #11055
Conversation
IIRC this file is a copy from zstd upstream, it would be best to fix it there then. |
I've submitted this upstream facebook/zstd#2356 there's already a ZSTD_NO_INTRINSICS upstream that's not in openzfs that I could use in the interim. hash e975de289. |
The zstd code assumes that if you are on aarch64, you have NEON insturctions. This is not necessarily true. In a boot loader, where you might not have the VFP properly initialized, these instructions may not be available. It's also an error to include arm_neon.h when the NEON insturctions aren't enabled. Change the guards for using the NEON instructions from __aarch64__ to __ARM_NEON which is the standard symbol for knowing if they are available. __ARM_NEON is the proper symbol, defined in ARM C Language Extensions Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some sources suggest __ARM_NEON__, but that's the obsolete spelling from prior versions of the standard. Updated based on zstd pull request facebook/zstd#2356 Signed-off-by: Warner Losh <[email protected]>
facebook/zstd#2356 has landed... This matches what landed in zstd upstream if you wanted to include it as a 'hot fix' in advance of the next zstd update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. Thanks!
mmh editing a sourcecode which is generated by a script is not a proper solution |
IMHO, zstd.c shouldn't be generated. It's causing noticeable delays in the parallel build because the name sorts last and it's such a huge compilation unit that it defeats the large parallelism that's otherwise present. |
@bsdimp but it is. its a combined sourcecode made out of the single file. the script how to make it, is included in the std github repo. so its not generated on demand while building of course. its pregenerated out of the source package before commiting |
@BrainSlayer I'm saying that its use is unwise should be re-thought due to the problems it causes. It bloats binaries as well, I think, since OpenZFS + ZSTD just about doubles the size of the FreeBSD/aarch64 boot loader over the ZFS one that was in place before. It's an unwise practice that's beyond the scope of this PR. However, in this case I think it's fine to do this because this patch was already upstreamed to ZSTD before it landed here. It won't be lost in the next import. It's no different than cherry picking a change from upstream between imports. |
@bsdimp not necessarily. i dont know how the bootloader is compiled. but alot of funktions in zstd.c are usused. so in a typical single binary the compiler will optimize out unused function. the remaining zstd code will be about 150kb then (i already tested this). linked as kernel module the symbols remain of course. i'm not against the patch. its correct. its just a comment after i have seen the patch. so if its upstream in next version we dont have to care about post patching then. |
@BrainSlayer what you describe is extremely compute intensive and is at least part of the long compilation times. To get the optimization you describe, you really need a good, robust LRO, which isn't yet reliable. The down sides of this script are quite real, despite the theoretical workarounds. |
@bsdimp you dont need to use lto for this. lto goes a little bit deeper, but standard unused symbol stripping is done with the standard binutils linker without any special flags or performance loss if the target is not a shared library. it will not take longer |
The zstd code assumes that if you are on aarch64, you have NEON instructions. This is not necessarily true. In a boot loader, where you might not have the VFP properly initialized, these instructions may not be available. It's also an error to include arm_neon.h when the NEON insturctions aren't enabled. Change the guards for using the NEON instructions from __aarch64__ to __ARM_NEON which is the standard symbol for knowing if they are available. __ARM_NEON is the proper symbol, defined in ARM C Language Extensions Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some sources suggest __ARM_NEON__, but that's the obsolete spelling from prior versions of the standard. Updated based on zstd pull request facebook/zstd#2356 Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Warner Losh <[email protected]> Closes #11055
@BrainSlayer What you describe is impossible to do by default. Removing unused functions can only be done at the object file granularity by default. If you want to remove functions from within a single object file you need to compile with -ffunction-sections. Unless you have LTO of course. |
Also with the correct symbol visibilities you can make it so shared libraries (ie modules) get unused symbols removed too, since they only have to remain by default because they have global visibility. |
We don't have NEON available in the boot loader, so we have to disable it. OpenZFS included ZSTD which used the wrong symbol to bring in neon support. Change to use the code that's been submitted upstream as a pull request to both. __ARM_NEON is the proper symbol, defined in ARM C Language Extensions Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some sources suggest __ARM_NEON__, but that's the obsolete spelling from prior versions of the standard. OpenZFS Pull Request: openzfs/zfs#11055 ZSTD Pull Request: facebook/zstd#2356 git-svn-id: svn+ssh://svn.freebsd.org/base/head@367119 ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
We don't have NEON available in the boot loader, so we have to disable it. OpenZFS included ZSTD which used the wrong symbol to bring in neon support. Change to use the code that's been submitted upstream as a pull request to both. __ARM_NEON is the proper symbol, defined in ARM C Language Extensions Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some sources suggest __ARM_NEON__, but that's the obsolete spelling from prior versions of the standard. OpenZFS Pull Request: openzfs/zfs#11055 ZSTD Pull Request: facebook/zstd#2356
@jrtc27 this is incorrect. for removing unused functions you dont need lto. the linker will remove unused function by itself if not referenced by any other function. this has nothing todo with lto. this is simple dead code elimination. also function section is not required for this. it depends how the final object is linked. if its a shared library the functions will remain. but the bootloader is no shared library, so unused functions will not retain in the final linked executable. just try it by yourself. |
@BrainSlayer Please stop talking about stuff you clearly don't know about. Once you have an object file, all the functions have already been combined for that translation unit into a single .text section. There is physically no way to unpick that and remove parts of it. You can omit the entire file but not an individual function. This is the entire reason why -ffunction-sections exists as an option you can enable. As a toolchain developer regularly working on linkers I do not need "educating" on such matters. |
@jrtc27 i end the disussion here. i'm just in business for 27 years now so i clearly dont know what i'm talking about. maybe your linker works different than mine. but i have to admit that using function-sections is default anyway for most compiling operations, also at the linux kernel. and it should also be no barrier for you to use it for compiling a bootloader if you have troubles with managing your code or setting functions static for letting the compiler doing the job |
@BrainSlayer your insulting tone has no place here. You argued one aspect into the ground in a rude way ignoring my other points. Regardless of the technical merit, the violates at least the spirit of the code of conduct. Stop taking a condescending tone with people. |
@bsdimp did you notice that i got insulted first? |
We don't have NEON available in the boot loader, so we have to disable it. OpenZFS included ZSTD which used the wrong symbol to bring in neon support. Change to use the code that's been submitted upstream as a pull request to both. __ARM_NEON is the proper symbol, defined in ARM C Language Extensions Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some sources suggest __ARM_NEON__, but that's the obsolete spelling from prior versions of the standard. OpenZFS Pull Request: openzfs/zfs#11055 ZSTD Pull Request: facebook/zstd#2356
We don't have NEON available in the boot loader, so we have to disable it. OpenZFS included ZSTD which used the wrong symbol to bring in neon support. Change to use the code that's been submitted upstream as a pull request to both. __ARM_NEON is the proper symbol, defined in ARM C Language Extensions Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some sources suggest __ARM_NEON__, but that's the obsolete spelling from prior versions of the standard. OpenZFS Pull Request: openzfs/zfs#11055 ZSTD Pull Request: facebook/zstd#2356
We don't have NEON available in the boot loader, so we have to disable it. OpenZFS included ZSTD which used the wrong symbol to bring in neon support. Change to use the code that's been submitted upstream as a pull request to both. __ARM_NEON is the proper symbol, defined in ARM C Language Extensions Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some sources suggest __ARM_NEON__, but that's the obsolete spelling from prior versions of the standard. OpenZFS Pull Request: openzfs/zfs#11055 ZSTD Pull Request: facebook/zstd#2356 git-svn-id: svn+ssh://svn.freebsd.org/base/head@367119 ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
We don't have NEON available in the boot loader, so we have to disable it. OpenZFS included ZSTD which used the wrong symbol to bring in neon support. Change to use the code that's been submitted upstream as a pull request to both. __ARM_NEON is the proper symbol, defined in ARM C Language Extensions Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some sources suggest __ARM_NEON__, but that's the obsolete spelling from prior versions of the standard. OpenZFS Pull Request: openzfs/zfs#11055 ZSTD Pull Request: facebook/zstd#2356
We don't have NEON available in the boot loader, so we have to disable it. OpenZFS included ZSTD which used the wrong symbol to bring in neon support. Change to use the code that's been submitted upstream as a pull request to both. __ARM_NEON is the proper symbol, defined in ARM C Language Extensions Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some sources suggest __ARM_NEON__, but that's the obsolete spelling from prior versions of the standard. OpenZFS Pull Request: openzfs/zfs#11055 ZSTD Pull Request: facebook/zstd#2356 git-svn-id: https://svn.freebsd.org/base/head@367119 ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
We don't have NEON available in the boot loader, so we have to disable it. OpenZFS included ZSTD which used the wrong symbol to bring in neon support. Change to use the code that's been submitted upstream as a pull request to both. __ARM_NEON is the proper symbol, defined in ARM C Language Extensions Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some sources suggest __ARM_NEON__, but that's the obsolete spelling from prior versions of the standard. OpenZFS Pull Request: openzfs/zfs#11055 ZSTD Pull Request: facebook/zstd#2356
We don't have NEON available in the boot loader, so we have to disable it. OpenZFS included ZSTD which used the wrong symbol to bring in neon support. Change to use the code that's been submitted upstream as a pull request to both. __ARM_NEON is the proper symbol, defined in ARM C Language Extensions Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some sources suggest __ARM_NEON__, but that's the obsolete spelling from prior versions of the standard. OpenZFS Pull Request: openzfs/zfs#11055 ZSTD Pull Request: facebook/zstd#2356
@bsdimp While I not totally agree with the argument from Brainslayer, this file is synced with ZSTD upstream. But thats NOT your fault at all and Thanks for pushing this fix upstream! 👍 At the VERY least, any customisations to the library should be CAREFULLY documented and not just merged without carefull attention. Which is what happened here. We can all disagree on how ZSTD is implemented and I would even say we SHOULD do so more often. However: Pushing fixes in pre-build libraries without documentation is not the way to start a discussion about an implementation. edit |
To restate for clarity, this patch landed in upstream ZSTD as a prerequisite before it landed here. This was discussed and coordinated appropriately. |
The zstd code assumes that if you are on aarch64, you have NEON instructions. This is not necessarily true. In a boot loader, where you might not have the VFP properly initialized, these instructions may not be available. It's also an error to include arm_neon.h when the NEON insturctions aren't enabled. Change the guards for using the NEON instructions from __aarch64__ to __ARM_NEON which is the standard symbol for knowing if they are available. __ARM_NEON is the proper symbol, defined in ARM C Language Extensions Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some sources suggest __ARM_NEON__, but that's the obsolete spelling from prior versions of the standard. Updated based on zstd pull request facebook/zstd#2356 Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Warner Losh <[email protected]> Closes openzfs#11055
The zstd code assumes that if you are on aarch64, you have NEON instructions. This is not necessarily true. In a boot loader, where you might not have the VFP properly initialized, these instructions may not be available. It's also an error to include arm_neon.h when the NEON insturctions aren't enabled. Change the guards for using the NEON instructions from __aarch64__ to __ARM_NEON which is the standard symbol for knowing if they are available. __ARM_NEON is the proper symbol, defined in ARM C Language Extensions Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some sources suggest __ARM_NEON__, but that's the obsolete spelling from prior versions of the standard. Updated based on zstd pull request facebook/zstd#2356 Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Warner Losh <[email protected]> Closes openzfs#11055
The zstd code assumes that if you are on aarch64, you have NEON
insturctions. This is not necessarily true. In a boot loader, where
you might not have the VFP properly initialized, these instructions
may not be available. It's also an error to include arm_neon.h when
the NEON insturctions aren't enabled. Change the guards for using the
NEON instructions from aarch64 to __ARM_NEON which is the standard
symbol for knowing if they are available.
__ARM_NEON is the proper symbol, defined in ARM C Language Extensions
Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some
sources suggest ARM_NEON, but that's the obsolete spelling from
prior versions of the standard.
Signed-off-by: Warner Losh [email protected]
Motivation and Context
The FreeBSD boot loader runs before the FPU is initialized on the ARM64 platform. ZSTD assumes that if aarch64 is defined, NEON is available, but that's not entirely true. It's only available when enabled, and it's not in the early boot. The fix is simple: ARM defines a symbol to use instead.
Description
This changes 3 instances of the too vague aarch64 to the more proper __ARM_NEON to enable using ARM NEON extensions.
How Has This Been Tested?
Types of changes
Checklist:
Signed-off-by
.