-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LLVM codegen for direct return of StaticArray(Tuple)
with mixed type broken for aarch64-darwin
#11021
Comments
Triggers assertion failures on LLVM 10 on M1 (using the universal test build):
will dig in more tonight EDIT: did some digging at lunch. Release mode gives a different (I think more interesting) result:
which looks like it might be related to https://bugs.llvm.org/show_bug.cgi?id=46996 ? will give LLVM 12 a try later and see if it's still broken there too |
Seems like it, considering this comment -> https://bugs.llvm.org/show_bug.cgi?id=46996#c13 |
Issue seems to be with the actual return value of the This compiles fine:
This does not:
Looks like a very specific codegen bug for StaticArrays of Tuples with mixed type members and of four or fewer elements because this compiles fine:
And this also works:
As does
Which seems like the homogenous aggregate optimizations in the AArch64 ABI implementation are probably responsible for this IR being generated given this line: crystal/src/llvm/abi/aarch64.cr Line 34 in fc0617f
By the sounds of the linked bug, the generated IR is valid IR, just not what LLVM expects and so it's causing the aforementioned issue |
@straight-shoota a more apt title for this issue might be "LLVM codegen for direct return of StaticArray of Tuple with mixed type broken for |
StaticArray#to_slice
broken on cross-compile for aarch64-darwin
StaticArray(Tuple)
with mixed type broken for aarch64-darwin
This is indeed the mentioned LLVM bug, and unfortunately it's not fixed until version 13 (just tested with nightly LLVM builds). The issue is with returning big (as in occupying more than 1 64-bit register) composite types. The ABI code is not involved since this is all internal functions, and that code path is activated when generating external/C compatible functions. A reduced (not sure it's minimal) LLVM IR which triggers the bug follows (compile with source_filename = "main_module"
target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
target triple = "aarch64-unknown-linux-gnu"
%Foo = type { i64, i32 }
define [1 x %Foo] @__crystal_main(i32 %argc, i8** %argv) {
%1 = alloca [1 x %Foo]
%2 = call [1 x %Foo] @bar()
store [1 x %Foo] %2, [1 x %Foo]* %1
%3 = load [1 x %Foo], [1 x %Foo]* %1
ret [1 x %Foo] %3
}
define internal [1 x %Foo] @bar() {
%1 = alloca [1 x %Foo]
%2 = load [1 x %Foo], [1 x %Foo]* %1
ret [1 x %Foo] %2
} Swapping the Other than waiting for LLVM13, a possible workaround would be to use some sort of indirect value return. For that, the codegen for calls and funs must be changed, or maybe it's possible/easier to perform a code transformation to add an |
We can try applying the patch to the LLVM11 brew formula as we did with the LLVM11 issue. |
Cross-compiling the above code with
crystal build --cross-compile --target aarch64-darwin
results in an invalid memory access in LLVM. So probablyThe call to
pp!
is important for reproduction because it callsto_slice
on the static array. Callingto_slice
directly on the static array does not trigger this error.It would be interesting to see if it reproduces natively, i.e. without
--cross-compile
. Codegen is a bit different when cross-compiling. Can someone try building this example on aarch64-darwin?Discovered while adding
#sort
toStaticArray
(https://github.com/straight-shoota/crystal/runs/3155348344; also present in https://github.com/crystal-lang/crystal/pull/10889/checks?check_run_id=3150235316).It reproduces with LLVM 11 and LLVM 10 and with Crystal 1.1 and Crystal 1.0.
Stacktrace (with LLVM 10.0.0)
Stacktrace for original code (with LLVM 10.0.0)
Stacktrace for original code (with LLVM 11.1.0)
/cc @maxfierke @bcardiff
The text was updated successfully, but these errors were encountered: