Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aarch64: Use proper guards for NEON instructions #11055

Merged
merged 1 commit into from
Oct 14, 2020

Conversation

bsdimp
Copy link
Contributor

@bsdimp bsdimp commented Oct 13, 2020

The zstd code assumes that if you are on aarch64, you have NEON
insturctions. This is not necessarily true. In a boot loader, where
you might not have the VFP properly initialized, these instructions
may not be available. It's also an error to include arm_neon.h when
the NEON insturctions aren't enabled. Change the guards for using the
NEON instructions from aarch64 to __ARM_NEON which is the standard
symbol for knowing if they are available.

__ARM_NEON is the proper symbol, defined in ARM C Language Extensions
Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some
sources suggest ARM_NEON, but that's the obsolete spelling from
prior versions of the standard.

Signed-off-by: Warner Losh [email protected]

Motivation and Context

The FreeBSD boot loader runs before the FPU is initialized on the ARM64 platform. ZSTD assumes that if aarch64 is defined, NEON is available, but that's not entirely true. It's only available when enabled, and it's not in the early boot. The fix is simple: ARM defines a symbol to use instead.

Description

This changes 3 instances of the too vague aarch64 to the more proper __ARM_NEON to enable using ARM NEON extensions.

How Has This Been Tested?

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

@gmelikov
Copy link
Member

IIRC this file is a copy from zstd upstream, it would be best to fix it there then.

@bsdimp
Copy link
Contributor Author

bsdimp commented Oct 13, 2020

I've submitted this upstream facebook/zstd#2356

there's already a ZSTD_NO_INTRINSICS upstream that's not in openzfs that I could use in the interim. hash e975de289.

The zstd code assumes that if you are on aarch64, you have NEON
insturctions. This is not necessarily true. In a boot loader, where
you might not have the VFP properly initialized, these instructions
may not be available. It's also an error to include arm_neon.h when
the NEON insturctions aren't enabled. Change the guards for using the
NEON instructions from __aarch64__ to __ARM_NEON which is the standard
symbol for knowing if they are available.

__ARM_NEON is the proper symbol, defined in ARM C Language Extensions
Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some
sources suggest __ARM_NEON__, but that's the obsolete spelling from
prior versions of the standard.

Updated based on zstd pull request facebook/zstd#2356

Signed-off-by: Warner Losh <[email protected]>
@bsdimp
Copy link
Contributor Author

bsdimp commented Oct 13, 2020

facebook/zstd#2356 has landed...

This matches what landed in zstd upstream if you wanted to include it as a 'hot fix' in advance of the next zstd update.

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Thanks!

@behlendorf behlendorf added the Status: Accepted Ready to integrate (reviewed, tested) label Oct 13, 2020
@behlendorf behlendorf merged commit 6ba2e72 into openzfs:master Oct 14, 2020
@BrainSlayer
Copy link
Contributor

BrainSlayer commented Oct 15, 2020

mmh editing a sourcecode which is generated by a script is not a proper solution

@bsdimp
Copy link
Contributor Author

bsdimp commented Oct 15, 2020

mmh editing a sourcecode which is generated by a script is not a proper solution

IMHO, zstd.c shouldn't be generated. It's causing noticeable delays in the parallel build because the name sorts last and it's such a huge compilation unit that it defeats the large parallelism that's otherwise present.

@BrainSlayer
Copy link
Contributor

@bsdimp but it is. its a combined sourcecode made out of the single file. the script how to make it, is included in the std github repo. so its not generated on demand while building of course. its pregenerated out of the source package before commiting

@bsdimp
Copy link
Contributor Author

bsdimp commented Oct 15, 2020

@bsdimp but it is. its a combined sourcecode made out of the single file. the script how to make it, is included in the std github repo. so its not generated on demand while building of course. its pregenerated out of the source package before commiting

@BrainSlayer I'm saying that its use is unwise should be re-thought due to the problems it causes. It bloats binaries as well, I think, since OpenZFS + ZSTD just about doubles the size of the FreeBSD/aarch64 boot loader over the ZFS one that was in place before. It's an unwise practice that's beyond the scope of this PR.

However, in this case I think it's fine to do this because this patch was already upstreamed to ZSTD before it landed here. It won't be lost in the next import. It's no different than cherry picking a change from upstream between imports.

@BrainSlayer
Copy link
Contributor

@bsdimp not necessarily. i dont know how the bootloader is compiled. but alot of funktions in zstd.c are usused. so in a typical single binary the compiler will optimize out unused function. the remaining zstd code will be about 150kb then (i already tested this). linked as kernel module the symbols remain of course. i'm not against the patch. its correct. its just a comment after i have seen the patch. so if its upstream in next version we dont have to care about post patching then.

@bsdimp
Copy link
Contributor Author

bsdimp commented Oct 15, 2020

@BrainSlayer what you describe is extremely compute intensive and is at least part of the long compilation times. To get the optimization you describe, you really need a good, robust LRO, which isn't yet reliable. The down sides of this script are quite real, despite the theoretical workarounds.

@BrainSlayer
Copy link
Contributor

@bsdimp you dont need to use lto for this. lto goes a little bit deeper, but standard unused symbol stripping is done with the standard binutils linker without any special flags or performance loss if the target is not a shared library. it will not take longer

behlendorf pushed a commit that referenced this pull request Oct 16, 2020
The zstd code assumes that if you are on aarch64, you have NEON
instructions. This is not necessarily true. In a boot loader, where
you might not have the VFP properly initialized, these instructions
may not be available. It's also an error to include arm_neon.h when
the NEON insturctions aren't enabled. Change the guards for using the
NEON instructions from __aarch64__ to __ARM_NEON which is the standard
symbol for knowing if they are available.

__ARM_NEON is the proper symbol, defined in ARM C Language Extensions
Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some
sources suggest __ARM_NEON__, but that's the obsolete spelling from
prior versions of the standard.

Updated based on zstd pull request facebook/zstd#2356

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Warner Losh <[email protected]>
Closes #11055
@jrtc27
Copy link
Contributor

jrtc27 commented Oct 28, 2020

@BrainSlayer What you describe is impossible to do by default. Removing unused functions can only be done at the object file granularity by default. If you want to remove functions from within a single object file you need to compile with -ffunction-sections. Unless you have LTO of course.

@jrtc27
Copy link
Contributor

jrtc27 commented Oct 28, 2020

Also with the correct symbol visibilities you can make it so shared libraries (ie modules) get unused symbols removed too, since they only have to remain by default because they have global visibility.

uqs pushed a commit to freebsd/freebsd-src that referenced this pull request Oct 28, 2020
We don't have NEON available in the boot loader, so we have to disable
it. OpenZFS included ZSTD which used the wrong symbol to bring in neon
support. Change to use the code that's been submitted upstream as a
pull request to both.

__ARM_NEON is the proper symbol, defined in ARM C Language Extensions
Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some
sources suggest __ARM_NEON__, but that's the obsolete spelling from
prior versions of the standard.

OpenZFS Pull Request: openzfs/zfs#11055
ZSTD Pull Request: facebook/zstd#2356


git-svn-id: svn+ssh://svn.freebsd.org/base/head@367119 ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
uqs pushed a commit to freebsd/freebsd-src that referenced this pull request Oct 28, 2020
We don't have NEON available in the boot loader, so we have to disable
it. OpenZFS included ZSTD which used the wrong symbol to bring in neon
support. Change to use the code that's been submitted upstream as a
pull request to both.

__ARM_NEON is the proper symbol, defined in ARM C Language Extensions
Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some
sources suggest __ARM_NEON__, but that's the obsolete spelling from
prior versions of the standard.

OpenZFS Pull Request: openzfs/zfs#11055
ZSTD Pull Request: facebook/zstd#2356
@BrainSlayer
Copy link
Contributor

@jrtc27 this is incorrect. for removing unused functions you dont need lto. the linker will remove unused function by itself if not referenced by any other function. this has nothing todo with lto. this is simple dead code elimination. also function section is not required for this. it depends how the final object is linked. if its a shared library the functions will remain. but the bootloader is no shared library, so unused functions will not retain in the final linked executable. just try it by yourself.

@jrtc27
Copy link
Contributor

jrtc27 commented Oct 29, 2020

@jrtc27 this is incorrect. for removing unused functions you dont need lto. the linker will remove unused function by itself if not referenced by any other function. this has nothing todo with lto. this is simple dead code elimination. also function section is not required for this. it depends how the final object is linked. if its a shared library the functions will remain. but the bootloader is no shared library, so unused functions will not retain in the final linked executable. just try it by yourself.

@BrainSlayer Please stop talking about stuff you clearly don't know about. Once you have an object file, all the functions have already been combined for that translation unit into a single .text section. There is physically no way to unpick that and remove parts of it. You can omit the entire file but not an individual function. This is the entire reason why -ffunction-sections exists as an option you can enable. As a toolchain developer regularly working on linkers I do not need "educating" on such matters.

@BrainSlayer
Copy link
Contributor

@jrtc27 i end the disussion here. i'm just in business for 27 years now so i clearly dont know what i'm talking about. maybe your linker works different than mine. but i have to admit that using function-sections is default anyway for most compiling operations, also at the linux kernel. and it should also be no barrier for you to use it for compiling a bootloader if you have troubles with managing your code or setting functions static for letting the compiler doing the job

@bsdimp
Copy link
Contributor Author

bsdimp commented Oct 29, 2020

@BrainSlayer your insulting tone has no place here. You argued one aspect into the ground in a rude way ignoring my other points. Regardless of the technical merit, the violates at least the spirit of the code of conduct. Stop taking a condescending tone with people.

@BrainSlayer
Copy link
Contributor

@bsdimp did you notice that i got insulted first?

markjdb pushed a commit to markjdb/freebsd that referenced this pull request Oct 30, 2020
We don't have NEON available in the boot loader, so we have to disable
it. OpenZFS included ZSTD which used the wrong symbol to bring in neon
support. Change to use the code that's been submitted upstream as a
pull request to both.

__ARM_NEON is the proper symbol, defined in ARM C Language Extensions
Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some
sources suggest __ARM_NEON__, but that's the obsolete spelling from
prior versions of the standard.

OpenZFS Pull Request: openzfs/zfs#11055
ZSTD Pull Request: facebook/zstd#2356
zxombie pushed a commit to CTSRD-CHERI/freebsd-morello that referenced this pull request Nov 6, 2020
We don't have NEON available in the boot loader, so we have to disable
it. OpenZFS included ZSTD which used the wrong symbol to bring in neon
support. Change to use the code that's been submitted upstream as a
pull request to both.

__ARM_NEON is the proper symbol, defined in ARM C Language Extensions
Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some
sources suggest __ARM_NEON__, but that's the obsolete spelling from
prior versions of the standard.

OpenZFS Pull Request: openzfs/zfs#11055
ZSTD Pull Request: facebook/zstd#2356
bdrewery pushed a commit to bdrewery/freebsd that referenced this pull request Nov 7, 2020
We don't have NEON available in the boot loader, so we have to disable
it. OpenZFS included ZSTD which used the wrong symbol to bring in neon
support. Change to use the code that's been submitted upstream as a
pull request to both.

__ARM_NEON is the proper symbol, defined in ARM C Language Extensions
Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some
sources suggest __ARM_NEON__, but that's the obsolete spelling from
prior versions of the standard.

OpenZFS Pull Request: openzfs/zfs#11055
ZSTD Pull Request: facebook/zstd#2356


git-svn-id: svn+ssh://svn.freebsd.org/base/head@367119 ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
brooksdavis pushed a commit to CTSRD-CHERI/cheribsd that referenced this pull request Nov 24, 2020
We don't have NEON available in the boot loader, so we have to disable
it. OpenZFS included ZSTD which used the wrong symbol to bring in neon
support. Change to use the code that's been submitted upstream as a
pull request to both.

__ARM_NEON is the proper symbol, defined in ARM C Language Extensions
Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some
sources suggest __ARM_NEON__, but that's the obsolete spelling from
prior versions of the standard.

OpenZFS Pull Request: openzfs/zfs#11055
ZSTD Pull Request: facebook/zstd#2356
mat813 pushed a commit to mat813/freebsd that referenced this pull request Nov 28, 2020
We don't have NEON available in the boot loader, so we have to disable
it. OpenZFS included ZSTD which used the wrong symbol to bring in neon
support. Change to use the code that's been submitted upstream as a
pull request to both.

__ARM_NEON is the proper symbol, defined in ARM C Language Extensions
Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some
sources suggest __ARM_NEON__, but that's the obsolete spelling from
prior versions of the standard.

OpenZFS Pull Request: openzfs/zfs#11055
ZSTD Pull Request: facebook/zstd#2356


git-svn-id: https://svn.freebsd.org/base/head@367119 ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
qwattash pushed a commit to CTSRD-CHERI/cheribsd that referenced this pull request Dec 3, 2020
We don't have NEON available in the boot loader, so we have to disable
it. OpenZFS included ZSTD which used the wrong symbol to bring in neon
support. Change to use the code that's been submitted upstream as a
pull request to both.

__ARM_NEON is the proper symbol, defined in ARM C Language Extensions
Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some
sources suggest __ARM_NEON__, but that's the obsolete spelling from
prior versions of the standard.

OpenZFS Pull Request: openzfs/zfs#11055
ZSTD Pull Request: facebook/zstd#2356
markjdb pushed a commit to markjdb/freebsd that referenced this pull request Dec 10, 2020
We don't have NEON available in the boot loader, so we have to disable
it. OpenZFS included ZSTD which used the wrong symbol to bring in neon
support. Change to use the code that's been submitted upstream as a
pull request to both.

__ARM_NEON is the proper symbol, defined in ARM C Language Extensions
Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some
sources suggest __ARM_NEON__, but that's the obsolete spelling from
prior versions of the standard.

OpenZFS Pull Request: openzfs/zfs#11055
ZSTD Pull Request: facebook/zstd#2356
@PrivatePuffin
Copy link
Contributor

PrivatePuffin commented Dec 17, 2020

@bsdimp While I not totally agree with the argument from Brainslayer, this file is synced with ZSTD upstream.
It was never designed to be edited, to prevent issues we currently have with LZ4.

But thats NOT your fault at all and Thanks for pushing this fix upstream! 👍
I'm more amazed by how @behlendorf skipped over the comment from @gmelikov who already pointed this out and just merged it without understanding what was actually altered.

At the VERY least, any customisations to the library should be CAREFULLY documented and not just merged without carefull attention. Which is what happened here.

We can all disagree on how ZSTD is implemented and I would even say we SHOULD do so more often. However: Pushing fixes in pre-build libraries without documentation is not the way to start a discussion about an implementation.

edit
I did notice that we didn't actually document the fact the library shouldn't be altered or alterations should be documented at all. We hinted about it, but never actually said it. I'm adding it to my Update PR.

@ghost
Copy link

ghost commented Dec 17, 2020

To restate for clarity, this patch landed in upstream ZSTD as a prerequisite before it landed here. This was discussed and coordinated appropriately.

jsai20 pushed a commit to jsai20/zfs that referenced this pull request Mar 30, 2021
The zstd code assumes that if you are on aarch64, you have NEON
instructions. This is not necessarily true. In a boot loader, where
you might not have the VFP properly initialized, these instructions
may not be available. It's also an error to include arm_neon.h when
the NEON insturctions aren't enabled. Change the guards for using the
NEON instructions from __aarch64__ to __ARM_NEON which is the standard
symbol for knowing if they are available.

__ARM_NEON is the proper symbol, defined in ARM C Language Extensions
Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some
sources suggest __ARM_NEON__, but that's the obsolete spelling from
prior versions of the standard.

Updated based on zstd pull request facebook/zstd#2356

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Warner Losh <[email protected]>
Closes openzfs#11055
sempervictus pushed a commit to sempervictus/zfs that referenced this pull request May 31, 2021
The zstd code assumes that if you are on aarch64, you have NEON
instructions. This is not necessarily true. In a boot loader, where
you might not have the VFP properly initialized, these instructions
may not be available. It's also an error to include arm_neon.h when
the NEON insturctions aren't enabled. Change the guards for using the
NEON instructions from __aarch64__ to __ARM_NEON which is the standard
symbol for knowing if they are available.

__ARM_NEON is the proper symbol, defined in ARM C Language Extensions
Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some
sources suggest __ARM_NEON__, but that's the obsolete spelling from
prior versions of the standard.

Updated based on zstd pull request facebook/zstd#2356

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Warner Losh <[email protected]>
Closes openzfs#11055
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants