LLVM 17 breaks inlining callees with compatible target attributes #65205

kalcutter · 2023-09-02T09:31:30Z

Consider the following code:

#include <immintrin.h>

__attribute__((target("avx512bw")))
static __attribute__((always_inline)) __m512i MM512_MASK_ADD_EPI8(__m512i src,
                                                                  __mmask64 k,
                                                                  __m512i a,
                                                                  __m512i b) {
    __asm__("vpaddb\t{%3, %2, %0 %{%1%}" : "+v"(src) : "Yk"(k), "v"(a), "v"(b));
    return src;
}

__attribute__((target("avx512bw")))
__m512i F(__m512i src, __mmask64 k, __m512i a, __m512i b) {
    return MM512_MASK_ADD_EPI8(src, k, a, b);
}

__attribute__((target("avx512bw,avx512dq")))
__m512i G(__m512i src, __mmask64 k, __m512i a, __m512i b) {
    return MM512_MASK_ADD_EPI8(src, k, a, b);
}

__attribute__((target("avx512bw,avx512vl")))
__m512i H(__m512i src, __mmask64 k, __m512i a, __m512i b) {
    return MM512_MASK_ADD_EPI8(src, k, a, b);
}

When compiling with previous versions of clang (up to and including version 16), MM512_MASK_ADD_EPI8 is inlined into F, G, and H (as expected) . Testing with LLVM 17 RC yields a different result: only F allows inlining and horrible code is generated for G and H. I believe this regression is caused by d6f994a.

Please revert that change or apply the appropriate fix to X86TargetTransformInfo.cpp for the LLVM 17 release.

The text was updated successfully, but these errors were encountered:

tru · 2023-09-02T10:13:21Z

@kazutakahirata

llvmbot · 2023-09-02T10:13:58Z

@llvm/issue-subscribers-backend-x86

marcauberer · 2023-09-02T11:23:09Z

https://godbolt.org/z/rPWd1rhYK

nikic · 2023-09-14T14:33:36Z

I think we should revert this change. I've seen this cause issues on multiple instances already. always_inline overriding the target feature check was a feature, not a bug.

kazutakahirata · 2023-09-14T16:35:16Z

Sorry, I somehow missed this. Let me take a look.

kazutakahirata · 2023-09-14T20:22:21Z

It looks like X86TTIImpl::areInlineCompatible is rejecting the inlining opportunity because calls in the callee may become ABI-incompatible as a result of inlining. Eventually, we get to:

        // We don't know the target features of the callee,
        // assume it is incompatible.
        return false;

Now, the only call in the callee in this case is the inline asm, which shouldn't pose a problem in terms of the ABI compatibility. Disregarding inline asm like so around X86TargetTransformInfo.cpp:6063 fixes the problem:

  for (const Instruction &I : instructions(Callee)) {
    if (const auto *CB = dyn_cast<CallBase>(&I)) {
      if (CB->isInlineAsm())
        continue;

I'll turn this into a pull request with a test case.

nikic · 2023-09-14T20:36:48Z

That's a reasonable change, but this is only a specific issue. Others I've seen are calls being compatible in ways that LLVM does not understand (different target features that do not affect ABI) and the TTI hook not being implemented by all targets (IIRC PowerPC was one of the targets that doesn't, resulting in altivec inlining issues).

We should fix such cases to make general inlining more powerful, but IMHO the change to always_inline behavior is still questionable, at least without a comprehensive review of TTI hooks for accuracy.

kalcutter · 2023-09-14T21:48:25Z

I agree that the new behaviour is questionable. If I annotate a function as always_inline, I want the compiler to try, to the best of its ability, to inline the function. Also, I want it to be visible when inlining fails for whatever reason. That is basically what we had before. It would try and maybe fail later with an error if the backend couldn't handle it.

kazutakahirata · 2023-09-14T23:06:04Z

@nikic Are "others you've seen" are publicly available? If so, could you post pointers to some of those? (I'm aware of #65152 for the 32-bit ARM.)

One problem with this always_inline vs target issue is lack of clear specification and test cases. When I implemented https://reviews.llvm.org/D150396, I did quite a bit of archaeology in getAttributeBasedInliningDecision. I wasn't able to tell if it was intentional to proceed to inline in the presence of mismatching target attributes. No test in llvm/test broke when I changed the behavior. Maybe we should check in some of those test cases so that the compiler behavior stays consistent.

Skipping safety checks is concerning to me. In the best case, the compiler crashes (as seen in #62664), and the user would immediately know something is wrong (but not necessarily why something is wrong). In this particular case, the caller ends up with LLVM IR instructions/intrinsics that cannot be lowered to the target hardware instructions available for the specified target attributes, and the backend fails to select instructions.

Other cases could quietly lead to miscompilation. Ignoring an ABI compatibility like the one checked in X86TTIImpl::areInlineCompatible might be one such example.

I don't think if we should specify always_inline (in the presence of mismatching target attributes) as "inline despite potential miscompilation and/or instruction selection problems". That would be a bit too far and very unfriendly to users.

@kalcutter You might find -Rpass=inline helpful in examining inline decisions.

dzaima · 2023-09-15T12:22:43Z

Currently, a "normal" reason for failed inlining of a __attribute__((always_inline)) function produces a compile-time error: https://godbolt.org/z/cn1rssbd8 - thus, I would assume an error would be produced in similar cases, but that doesn't happen for the original code here, which is IMO quite unexpected/confusing. (also, -Rpass-missed=inline is the more useful thing for the missed inlines here)

I don't know what would be the best option here, but an always_inline function not being always inlined is quite weird at least.

edit side-note: looks like recursive always_inline functions also result in missed inlining without error: https://godbolt.org/z/jebKshhKf; GCC errors for this

kalcutter · 2023-09-15T13:25:56Z

IMO missed inlining of always_inline functions should be an error always. If there is no agreement on that, I think at the very least it should be a warning that can be caught with -Werror. Making people rely on -Rpass-missed=inline doesn't seem reasonable for always_inline. The foot gun is too big. I only noticed this regression by luck while hand inspecting the generated code.

nikic · 2023-09-19T09:51:55Z

@nikic Are "others you've seen" are publicly available? If so, could you post pointers to some of those? (I'm aware of #65152 for the 32-bit ARM.)

The other x86 issues is https://reviews.llvm.org/D157826. I wasn't able to find the powerpc issue in my mail archive quickly.

In any case, we have at least 4 different regressions caused by this change, and probably more we're not aware of (because "missing inlining" may easily go unnoticed), so we should err on the side of reverting.

kazutakahirata · 2023-09-19T23:02:52Z

Currently, a "normal" reason for failed inlining of a __attribute__((always_inline)) function produces a compile-time error: https://godbolt.org/z/cn1rssbd8 - thus, I would assume an error would be produced in similar cases, but that doesn't happen for the original code here, which is IMO quite unexpected/confusing. (also, -Rpass-missed=inline is the more useful thing for the missed inlines here)

Thanks for the gotbolt link. I wasn't aware of the language frontend capable of issuing error messages in certain cases. As far as the original code here goes, the backend, namely X86TTIImpl::areInlineCompatible, rejects the inlining opportunity because the callee contains a "call", namely inline asm, and the backend is afraid that the "call" could create an ABI issue in the presence of mismatching target attributes. In this particular case, I could teach the backend to not worry about the inline asm. Even then, the fact remains that the language frontend and the backend independently check inlinability, which do not necessarily agree with each other.

edit side-note: looks like recursive always_inline functions also result in missed inlining without error: https://godbolt.org/z/jebKshhKf; GCC errors for this

Thanks for this also. If I switch to GCC, I see:

<source>:2:43: error: inlining failed in call to 'always_inline' 'f': recursive inlining
    2 | static __attribute__((always_inline)) int f(int x) {

This reverts commit d6f994a. Several people have reported breakage resulting from this patch: - #65152 - #65205

kazutakahirata · 2023-09-21T19:19:12Z

I've flied an offshoot issue #67054, where the callee is declared is static inline instead of static __attribute__((always_inline)).

This reverts commit d6f994a. Several people have reported breakage resulting from this patch: - llvm#65152 - llvm#65205

nikic · 2023-09-26T13:27:33Z

/cherry-pick b4301df

llvmbot · 2023-09-26T13:35:09Z

/branch llvm/llvm-project-release-prs/issue65205

This reverts commit d6f994a. Several people have reported breakage resulting from this patch: - llvm/llvm-project#65152 - llvm/llvm-project#65205 (cherry picked from commit b4301df61fc77a9d54ac236bc88742a731285f1c)

llvmbot · 2023-09-26T13:42:47Z

/pull-request llvm/llvm-project-release-prs#708

This reverts commit d6f994a. Several people have reported breakage resulting from this patch: - llvm/llvm-project#65152 - llvm/llvm-project#65205 (cherry picked from commit b4301df61fc77a9d54ac236bc88742a731285f1c)

RalfJung · 2023-10-11T18:23:18Z

This revert is keeping a soundness issue alive. LLVM will turn working code into broken code currently, and it's hard for frontends to work around that problem. I wonder if the approach outlined here could be a path forward towards fixing this inliner bug?

github-actions bot added the new issue label Sep 2, 2023

tru added this to the LLVM 17.0.X Release milestone Sep 2, 2023

github-project-automation bot added this to LLVM Release Status Sep 2, 2023

github-project-automation bot moved this to Needs Triage in LLVM Release Status Sep 2, 2023

tru added backend:X86 llvm:codegen and removed new issue labels Sep 2, 2023

kalcutter changed the title ~~LLVM 17 breaks inlining callees with other target attributes~~ LLVM 17 breaks inlining callees with compatible target attributes Sep 4, 2023

nikic moved this from Needs Triage to Needs Fix in LLVM Release Status Sep 21, 2023

kazutakahirata added a commit that referenced this issue Sep 21, 2023

Revert "[InlineCost] Check for conflicting target attributes early"

b4301df

This reverts commit d6f994a. Several people have reported breakage resulting from this patch: - #65152 - #65205

kazutakahirata mentioned this issue Sep 21, 2023

X86 rejects inlining of target-feature-wise compatible functions #67054

Closed

alessandrod pushed a commit to aya-rs/llvm-project that referenced this issue Sep 26, 2023

Revert "[InlineCost] Check for conflicting target attributes early"

ea47988

This reverts commit d6f994a. Several people have reported breakage resulting from this patch: - llvm#65152 - llvm#65205

alessandrod mentioned this issue Sep 26, 2023

Cherry pick: Revert "[InlineCost] Check for conflicting target attributes early" rust-lang/llvm-project#153

Closed

llvmbot mentioned this issue Sep 26, 2023

PR for llvm/llvm-project#65205 llvm/llvm-project-release-prs#708

Merged

nikic moved this from Needs Fix to Needs Review in LLVM Release Status Sep 26, 2023

tru closed this as completed in llvm/llvm-project-release-prs#708 Sep 27, 2023

tru moved this from Needs Review to Done in LLVM Release Status Sep 27, 2023

yingfeng mentioned this issue Oct 24, 2023

LLVM 18 breaks inlining callees with compatible target attributes #70002

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LLVM 17 breaks inlining callees with compatible target attributes #65205

LLVM 17 breaks inlining callees with compatible target attributes #65205

kalcutter commented Sep 2, 2023 •

edited

Loading

tru commented Sep 2, 2023

llvmbot commented Sep 2, 2023

marcauberer commented Sep 2, 2023

nikic commented Sep 14, 2023

kazutakahirata commented Sep 14, 2023

kazutakahirata commented Sep 14, 2023 •

edited

Loading

nikic commented Sep 14, 2023

kalcutter commented Sep 14, 2023

kazutakahirata commented Sep 14, 2023

dzaima commented Sep 15, 2023 •

edited

Loading

kalcutter commented Sep 15, 2023

nikic commented Sep 19, 2023

kazutakahirata commented Sep 19, 2023

kazutakahirata commented Sep 21, 2023

nikic commented Sep 26, 2023

llvmbot commented Sep 26, 2023

llvmbot commented Sep 26, 2023

RalfJung commented Oct 11, 2023

LLVM 17 breaks inlining callees with compatible target attributes #65205

LLVM 17 breaks inlining callees with compatible target attributes #65205

Comments

kalcutter commented Sep 2, 2023 • edited Loading

tru commented Sep 2, 2023

llvmbot commented Sep 2, 2023

marcauberer commented Sep 2, 2023

nikic commented Sep 14, 2023

kazutakahirata commented Sep 14, 2023

kazutakahirata commented Sep 14, 2023 • edited Loading

nikic commented Sep 14, 2023

kalcutter commented Sep 14, 2023

kazutakahirata commented Sep 14, 2023

dzaima commented Sep 15, 2023 • edited Loading

kalcutter commented Sep 15, 2023

nikic commented Sep 19, 2023

kazutakahirata commented Sep 19, 2023

kazutakahirata commented Sep 21, 2023

nikic commented Sep 26, 2023

llvmbot commented Sep 26, 2023

llvmbot commented Sep 26, 2023

RalfJung commented Oct 11, 2023

kalcutter commented Sep 2, 2023 •

edited

Loading

kazutakahirata commented Sep 14, 2023 •

edited

Loading

dzaima commented Sep 15, 2023 •

edited

Loading