Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inline ASM requires target features on some targets #112709

Closed
tarcieri opened this issue Jun 16, 2023 · 16 comments
Closed

inline ASM requires target features on some targets #112709

tarcieri opened this issue Jun 16, 2023 · 16 comments
Labels
C-bug Category: This is a bug.

Comments

@tarcieri
Copy link
Contributor

tarcieri commented Jun 16, 2023

I have code that, for whatever reason, will compile on aarch64-apple-darwin but not on aarch64-unknown-linux-gnu. It's asking me to enable target_features on a function that's annotated inline(always) (which, in and of itself, does not seem to make sense).

A minimal reproduction is also elusive in that everything I've tried so far compiles. I'm wondering if this is some kind of bug related to the inliner.

To start with, here are the functions it's complaining about:

/// AES single round encryption.
#[inline(always)]
pub unsafe fn vaeseq_u8(mut data: uint8x16_t, key: uint8x16_t) -> uint8x16_t {
    asm!(
        "AESE {d:v}.16B, {k:v}.16B",
        d = inout(vreg) data,
        k = in(vreg) key,
        options(pure, nomem, nostack, preserves_flags)
    );
    data
}

/// AES inverse mix columns.
#[inline(always)]
pub unsafe fn vaesimcq_u8(mut data: uint8x16_t) -> uint8x16_t {
    asm!(
        "AESIMC {d:v}.16B, {d:v}.16B",
        d = inout(vreg) data,
        options(pure, nomem, nostack, preserves_flags)
    );
    data
}

These compile fine in and of themselves. In fact trying to put together a minimal repro I haven't been able to reproduce the compile error, even with quite a bit of the related code in place: https://godbolt.org/z/P7h8KsaMG

So it seems like some strange interactions in more complicated code, possibly related to inlining.

It's easily reproducible, but unfortunately the only repro I have to offer is the following commit of this PR: RustCrypto/block-ciphers@7818f35

The compile error is as follows (which I've reproduced locally): https://github.com/RustCrypto/block-ciphers/actions/runs/5283017067/jobs/9558698657

error: instruction requires: aes
  --> aes/src/armv8/intrinsics.rs:11:10
   |
11 |         "AESE {d:v}.16B, {k:v}.16B",
   |          ^
   |
note: instantiated into assembly here
  --> <inline asm>:1:2
   |
1  |     AESE v1.16B, v0.16B
   |     ^

error: instruction requires: aes
  --> aes/src/armv8/intrinsics.rs:47:10
   |
47 |         "AESIMC {d:v}.16B, {d:v}.16B",
   |          ^
   |
note: instantiated into assembly here
  --> <inline asm>:1:2
   |
1  |     AESIMC v0.16B, v0.16B
   |     ^

One more odd observation, the compile error can be "solved" by replacing #[inline(always)] with #[target_feature(enable = "aes")], as in here: RustCrypto/block-ciphers@10debe5

However, this solution seemed to cause massive performance degradation with spooky-action-at-a-distance impacts on seemingly unrelated code (although code that was using the same CPU instruction): RustCrypto/block-ciphers#365 (comment)

Meta

This reproduces on all versions of rustc I've tried, including the latest nighty. The code in question is targeting an MSRV of 1.61, where the error also occurs.

@tarcieri
Copy link
Contributor Author

tarcieri commented Jun 16, 2023

We did observe that setting -C target_cpu=ampere1 "fixes" the error as well, whereas the error occurs with -C target-cpu=generic.

Note that the code involved here is attempting to leverage runtime CPU feature detection (on aarch64, by querying the relevant OS-specific APIs) and so the relevant target_feature may not be explicitly enabled.

On aarch64-apple-darwin we don't perform that check as we assume all Apple hardware has the aes feature available (which seems to be best practice), so perhaps that's the source of the OS-specific discrepancy.

tarcieri added a commit to RustCrypto/block-ciphers that referenced this issue Jun 16, 2023
This seems to fix the build failures we were experiencing here:

rust-lang/rust#112709
tarcieri added a commit to RustCrypto/block-ciphers that referenced this issue Jun 16, 2023
This seems to fix the build failures we were experiencing here:

rust-lang/rust#112709
@taiki-e
Copy link
Member

taiki-e commented Jun 16, 2023

I guess you have to add #[target_feature(enable = "aes")] to expand_key (or sub_word) and inv_expanded_keys in aes/src/armv8/expand.rs. They call #[inline(always)] function that calls AES instructions, but is no marked #[target_feature(enable = "aes")].


Details:

First, aarch64-unknown-linux-gnu does not assume that AES instructions are always available, and attempting to use AES instructions in an inline assembly without enabling the aes target feature will result in an error. 1

However, this error is usually reported by LLVM after the inline assembly is actually passed to LLVM. For example, if there is an inline attribute, the function is not actually called, or if you are using cargo check instead of cargo build, this error may not yet appear: https://godbolt.org/z/6cdnhWTGc

In your case, I guess that the error occurred because the aes target feature was not enabled in the caller of the #[inline(always)] function that calls the AES instruction.

As for performance degradation, I would suggest checking to see if it still occurs when #[target_feature] and normal #[inline] are combined. (core::arch::aarch64's AES-related functions have them)

In any case, this does not seem to be a compiler bug.

Footnotes

  1. An error here is not a guaranteed behavior, but depends on the architecture and the behavior of the assembler. For example, the LLVM assembler currnelty accepts all instructions on powerpc regardless of target features, but not on AArch64. And, it is also not guaranteed that other assemblers or future LLVMs will accept all instructions on powerpc.

@tarcieri
Copy link
Contributor Author

@TAKI-E I actually came across that solution a few minutes before your response and it seems to be working: RustCrypto/block-ciphers@c7fe62e

I'm willing to accept this is a legitimate bug in the original code, but it would be nice if there were diagnostics which reported the call sites as the error, rather than the inner function. I'm not sure if that deserves a separate issue or where to go from here.

@taiki-e
Copy link
Member

taiki-e commented Jun 16, 2023

it would be nice if there were diagnostics which reported the call sites as the error, rather than the inner function

Hmm, the actual error is generated by LLVM, not rustc, so I'm not sure how realistic it would be to improve the diagnostics on rustc end.

Also, when the problem is only the lack of target features, it may be reasonable to indicate that the problem is in the caller not using #[target_feature], but otherwise we should probably indicate that the problem is in the assembly itself, just as it is now.

(BTW, LLVM intrinsics have similar issues, and stdarch has been switched from #[inline(always)] to #[inline] + #[target_feature]: rust-lang/stdarch#306)

@tarcieri
Copy link
Contributor Author

Interesting, I guess we should follow suit with stdarch.

Going to close this then.

@newpavlov
Copy link
Contributor

@taiki-e

aarch64-unknown-linux-gnu does not assume that AES instructions are always available, and attempting to use AES instructions in an inline assembly without enabling the aes target feature will result in an error.

I don't think it's documented anywhere in Rust resources? It would be nice to have a mention about it in the reference at the very least.

@RalfJung
Copy link
Member

RalfJung commented Jan 22, 2024

This seems surprising to me; shouldn't programmers be allowed to use "extra knowledge" about which instructions are available in an inline asm block, even if the compiler may not be aware that they are available here?

And indeed this should definitely at the very least be documented. But IMO, we'd ideally get LLVM to stop doing this.

The error does seem to also arise without inline(always) (as I would expect); inline(always) just adds some extra confusion because it can make the asm block appear in a different function than where it was written.

Cc @Amanieu @nikic

@RalfJung RalfJung reopened this Jan 22, 2024
@RalfJung RalfJung changed the title "error: instruction requires: aes" on aarch64-unknown-linux-gnu for inline ASM inside inline(always) fn "error: instruction requires: aes" on aarch64-unknown-linux-gnu for inline ASM Jan 22, 2024
@RalfJung RalfJung changed the title "error: instruction requires: aes" on aarch64-unknown-linux-gnu for inline ASM inline ASM requires target features on some targets Jan 22, 2024
@nikic
Copy link
Contributor

nikic commented Jan 22, 2024

@RalfJung You can use "extra knowledge" in inline asm, but then you also need to annotate it accordingly. You should be able to replace #[target_feature(enable = "aes")] here with .arch_extension aes inside the assembly.

@nikic
Copy link
Contributor

nikic commented Jan 22, 2024

I checked, and it does work fine: https://godbolt.org/z/1EY6MW7e5

@nikic nikic closed this as completed Jan 22, 2024
@RalfJung
Copy link
Member

But why is such an annotation required? The main point of target_feature is to tell the compiler which instructions it can use for codegen, but here we are doing the codegen ourselves so it shouldn't matter.

Also, there's still the bug that this isn't documented anywhere on https://doc.rust-lang.org/nightly/reference/inline-assembly.html.

@tarcieri
Copy link
Contributor Author

You should be able to replace #[target_feature(enable = "aes")] here with .arch_extension aes inside the assembly.

@nikic oh nice, does that eliminate the need for the corresponding target_feature to be stable? It's been a big blocker for us to not be able to use certain intrinsics from inline assembly on stable because their corresponding target features aren't stable.

For example we just ran into that trying to add VAES support to the aes crate: RustCrypto/block-ciphers#396 (comment)

@newpavlov
Copy link
Contributor

newpavlov commented Jan 22, 2024

I agree with @RalfJung, to me the need for target_feature annotations for asm! blocks is a very unpleasant surprise which breaks the mental model of "black box" asm blocks. Even worse, this requirement is inconsistent across supported targets as admitted by @taiki-e above. And as written by @tarcieri this requirement is an obstacle we have encountered in practice.

@nikic
Copy link
Contributor

nikic commented Jan 22, 2024

But why is such an annotation required? The main point of target_feature is to tell the compiler which instructions it can use for codegen, but here we are doing the codegen ourselves so it shouldn't matter.

I can only guess, but presumably it's to prevent accidental use of instructions that are not supported on the target.

To be clear, this behavior is not an LLVM invention, it's general assembler behavior for that target. You'll see exactly the same thing if you don't use the integrated assembler and use e.g. binutils instead: https://asm.godbolt.org/z/Mxa1YbnYd

@nikic oh nice, does that eliminate the need for the corresponding target_feature to be stable? It's been a big blocker for us to not be able to use certain intrinsics from inline assembly on stable because their corresponding target features aren't stable.

Yes, as this is part of the assembly string, it's not subject to rustc's stability checks.

@newpavlov
Copy link
Contributor

To be clear, this behavior is not an LLVM invention, it's general assembler behavior for that target.

What are potential drawbacks of enabling all available extensions automatically? I can understand why this feature was historically introduced (after all you need assembler plugins to know how to translate new instructions to binary code), but I am not sure if it's worth to require an explicit opt-in for it in Rust assembler.

@tarcieri
Copy link
Contributor Author

@nikic sorry for the sidebar, but is this error possible to resolve by providing an appropriate .arch_extension, or would this actually require stabilization of avx512_target_feature?

error: register class `zmm_reg` requires the `avx512f` target feature
 --> <source>:9:9
  |
9 |         data = in(zmm_reg) data,
  |         ^^^^^^^^^^^^^^^^^^^^^^^

@nikic
Copy link
Contributor

nikic commented Jan 22, 2024

@tarcieri That would require stabilization. It's enforced by rustc, not by LLVM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug.
Projects
None yet
Development

No branches or pull requests

5 participants