extern "C" functions don't generate the same IR definitions as clang on x86, causing problems with cross-language LTO #102174
Rust seems to just mark all struct arguments as byval:
While clang will pass certain aggregates directly: https://github.com/llvm/llvm-project/blob/1aaba40dcbe8fdc93d825d1f4e22edaa3e9aa5b1/clang/lib/CodeGen/TargetInfo.cpp#L1875-L1883

Another instance of @eddyb's favorite bug, #65111.

From your description, I don't really see how this used to work before the referenced LLVM change, as the ABIs are completely different -- I guess you just ended up interpreting the pointer address as the value contained in the struct and that happened to "work".
Before the change, the inlining wasn't happening so we ended up with a normal call, and the rust callee is just fine too because it compiled down to the proper ABI for the function. What screws everything is that inlining now happens. |
@Gankra did you run the abi checker for some i686 targets? |
I believe IIRC x86 32-bit ABIs don't have any registers available for arguments "by default" (and the proliferation of e.g. Any ABI that doesn't use Though this discussion is further complicated by:
From a quick check, it seems (older) 32-bit ABIs are most susceptible to this inefficiency: rust/compiler/rustc_target/src/abi/call/arm.rs Lines 71 to 73 in fe217c2
rust/compiler/rustc_target/src/abi/call/sparc.rs Lines 24 to 26 in fe217c2
rust/compiler/rustc_target/src/abi/call/mips.rs Lines 24 to 26 in fe217c2
Even a 64-bit one (I assume because it's still an old ABI):
MIPS64 is also affected, but only for Given sufficient ABI information, LLVM itself should be able to "relax" This is one of my gripes with LLVM and ABI: it forces the frontend to think about ABI distinctions like:
... but only some of the time, not all of the time, and that leads to both multiple ways to represent certain behaviors (like the one discussed here, with I recently wrote something related to this (Ctrl+F "ABI mapping" in #100698 (comment)), and I still believe it would be one of the nicer solutions overall: decouple "stack slots and SSA scalars/vectors" call dataflow (friendlier to IPO in general - e.g. inlining, but not only) from exporting a function with specific register/stack assignments. |
Thanks, this is the bit I was missing! The ABIs are incompatible at the LLVM IR level, but become compatible post-legalization. That does limit the issue to the cross-language LTO case only.
LLVM will generally try to relax
Yes, LLVM's ABI handling is a big pile of poo ... your suggestion does sound reasonable on the surface, but I'm not volunteering to implement it :P

Regardless of the specific IR representation, something that would help a lot is to extract Clang's ABI calculation into something reusable by other frontends (discussed a bit at https://discourse.llvm.org/t/rfc-targetinfo-library/64342, though I doubt something will come of that). While rustc couldn't use that directly (due to other codegen backends), at least it could be used to check that our computed ABI is correct.
Heh, something like that is why My long-term hopes with (the rough idea is that once you reduce a type and its layout to the set of possible ABI encodings that apply, it's a "resource calculus" from there on, like the "stateful" part of these algorithms ~always looks like "number of remaining registers per bank" or "stack offset/misalignment", and all the decisions are like "do I fit", i.e. "are there available resources for me to consume") But for |
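The "resource calculus" idea in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not how rustc_target or LLVM implement it: the names `Bank`, `Location`, `Resources` and `assign` are hypothetical, and the register counts are made up. The only state is "registers used per bank" plus the current stack offset, and every decision is a "do I fit" check.

```rust
// Hypothetical sketch: all names and register counts are illustrative assumptions.

#[derive(Clone, Copy, Debug, PartialEq)]
enum Bank {
    Int,
    Float,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Location {
    Reg { bank: Bank, index: u32 },
    Stack { offset: u32 },
}

/// The "stateful" part: registers consumed per bank and the next free stack offset.
struct Resources {
    int_regs: u32,
    int_used: u32,
    float_regs: u32,
    float_used: u32,
    stack_offset: u32,
}

impl Resources {
    fn new(int_regs: u32, float_regs: u32) -> Self {
        Resources { int_regs, int_used: 0, float_regs, float_used: 0, stack_offset: 0 }
    }

    /// Consume a register from `bank` if one is left, otherwise an aligned stack slot.
    fn assign(&mut self, bank: Bank, size: u32, align: u32) -> Location {
        assert!(align > 0, "alignment must be non-zero");
        let (used, total) = match bank {
            Bank::Int => (&mut self.int_used, self.int_regs),
            Bank::Float => (&mut self.float_used, self.float_regs),
        };
        if *used < total {
            let index = *used;
            *used += 1;
            return Location::Reg { bank, index };
        }
        // No register fits: round the stack offset up to the alignment and consume a slot.
        let offset = (self.stack_offset + align - 1) / align * align;
        self.stack_offset = offset + size;
        Location::Stack { offset }
    }
}

fn main() {
    // Assumed toy convention: two integer argument registers, no float registers.
    let mut res = Resources::new(2, 0);
    for _ in 0..3 {
        println!("{:?}", res.assign(Bank::Int, 4, 4));
    }
    println!("{:?}", res.assign(Bank::Float, 8, 8));
}
```

A real calling convention has many more rules (splitting aggregates, register pairs, shadow space), so this only shows the shape of the bookkeeping.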
WG-prioritization assigning priority (Zulip discussion). @rustbot label -I-prioritize +P-high regression-untriaged |
I have not, it's on my TODO list to figure out how platforms-that-aren't-host-but-can-run-on-host should be handled by abi-checker, since that's a lot easier than the full cross-compile case but also I'm not sure what problems come up if you build an x86_32 dylib and load it into an x64 process on the various platforms (esp when passing opaque pointers across the dylib boundary). edit: actually maybe this would be fine if I just make you rebuild the harness itself too... |
More broadly, I'm not sure abi-cafe would be able to catch this, since the ultimate ABI is correct; it's just that LLVM freaks out after cross-lang LTO. Maybe this is something abi-cafe should support checking? I'm not sure how hard it is to set up. LLVM has had other bugs like this before too, right? ISTR something wonky about inlining functions compiled for different "modes" like SIMD or Thumb?
Cranelift doesn't allow assigning exact registers. Cranelift is pretty much exactly on the same level as what rustc_target exposes. You tell it the calling convention (for assigning registers and stack locations) and a list of input and output |
@Gankra
The errors I get are:
and
|
Ok so I should just figure out a way to optionally flip on LTO for |
Discussed during T-compiler meeting (notes), namely this part:
@rustbot label -i-compiler-nominated |
I looked a bit closer into what clang is actually doing here. I referenced the wrong code above, the relevant bit is actually this: https://github.com/llvm/llvm-project/blob/1aaba40dcbe8fdc93d825d1f4e22edaa3e9aa5b1/clang/lib/CodeGen/TargetInfo.cpp#L1891-L1900

In words, what clang is doing is unpacking structs <= 128 bits where all parts are 32-bit or 64-bit primitives (pointer, integer, float, enum). Some samples: https://clang.godbolt.org/z/bMjYfxnjj

This is being done on the premise that it does not change the final ABI, while making it easier to optimize the resulting IR.

There's also this special gem hidden in there: https://github.com/llvm/llvm-project/blob/1aaba40dcbe8fdc93d825d1f4e22edaa3e9aa5b1/clang/lib/CodeGen/TargetInfo.cpp#L1487-L1492

That is, if it's C++ code it actually makes a difference whether you use
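The unpacking rule described in words above (struct of at most 128 bits, every part a 32-bit or 64-bit primitive) could be modelled roughly like this. This is a simplified sketch, not clang's actual code: `Field` and `can_expand` are hypothetical names, padding and nested aggregates are ignored, and treating an empty struct as "not expandable" is an assumption of this sketch.

```rust
// Hypothetical model of the rule as stated in the comment; not clang's implementation.

#[derive(Clone, Copy, Debug, PartialEq)]
enum Field {
    Int { bits: u32 },
    Float { bits: u32 },
    Pointer { bits: u32 },
    Enum { bits: u32 },
    /// Nested struct/array/union: out of scope for this sketch.
    Aggregate,
}

impl Field {
    /// Size in bits if the field is a primitive, `None` otherwise.
    fn primitive_bits(self) -> Option<u32> {
        match self {
            Field::Int { bits }
            | Field::Float { bits }
            | Field::Pointer { bits }
            | Field::Enum { bits } => Some(bits),
            Field::Aggregate => None,
        }
    }
}

/// True if every field is a 32-bit or 64-bit primitive and the total is <= 128 bits.
/// Assumption: padding is ignored and an empty field list is rejected.
fn can_expand(fields: &[Field]) -> bool {
    let mut total = 0u32;
    for field in fields {
        match field.primitive_bits() {
            Some(bits) if bits == 32 || bits == 64 => total += bits,
            _ => return false,
        }
    }
    total > 0 && total <= 128
}

fn main() {
    let one_int = [Field::Int { bits: 32 }];
    let two_doubles = [Field::Float { bits: 64 }, Field::Float { bits: 64 }];
    let with_char = [Field::Int { bits: 8 }, Field::Int { bits: 32 }];
    println!("{}", can_expand(&one_int));
    println!("{}", can_expand(&two_doubles));
    println!("{}", can_expand(&with_char));
}
```

A struct that passes this check would have its members passed as separate scalar arguments in the IR, which is where the mismatch with rustc's byval pointer argument comes from.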
And another extra peculiar case: If we combine this with the fastcall calling convention, we get this: https://clang.godbolt.org/z/MW797ehfj Now we have an additional inreg i32 undef argument on the call, before the i32 that actually passes the unpacked struct. |
Discussed during T-compiler P-high review. There are a number of shortcomings identified here. We should figure out how to best invest our effort to address them, even partially. I think the next step is to write an MCP proposing some kind of ABI validation (see #65111), perhaps via abi-cafe.
I thought that in C, structs are never passed in a register (they are not implicit transparent like our structs)? How come clang passes this as a plain Looking at the clang code
So... they get away with that because they are sure this I see no practical way for us to fix this alone. Clearly clang itself only considers itself to be beholden to the final ABI of the produced binary, and feels free to arbitrarily change the LLVM-IR-level ABI as long as the asm-level ABI stays the same. I doubt they make stability guarantees for the LLVM-IR-level ABI, so trying to match them seems hopeless. So IMO a systematic fix requires xlang LTO to be more careful, and in particular xlang inlining needs to be able to actually handle these kinds of ABI mismatches. |
Small structs are often passed in registers, but it depends on the calling convention. The x86_64 SystemV call conv, for example, passes many structs of up to 2 registers in size in registers, while the wasm C ABI (the official one, not the broken one wasm32-unknown-unknown uses) passes every struct that contains only a single scalar field by value.
After llvm/llvm-project@6c8adc5, inlining in cross-language LTO happens in cases where it didn't happen before, including cases where things go very bad (more details in https://bugzilla.mozilla.org/show_bug.cgi?id=1789779#c7)
It seems to boil down to LLVM not liking that rust defines its extern "C" functions in significantly different ways than clang does for the C/C++ code that calls it. For example:
The caller:
Rust defines the function as:
while the caller C code does this:
The equivalent C code:
defines the function as:
(Ironically, rustc transforms a non-extern "C" version of the function to the same declaration as clang's)
Arguably, there's an underlying LLVM bug not being able to handle this case, which /could/ be considered fine, but I'm not sure it's supposed to be.
It's worth noting that rustc does not use a byval for e.g. x86_64-unknown-linux-gnu.
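For anyone wanting to look at the IR themselves, a minimal stand-in for the kind of function discussed here might look like the following. It is not the code from the original report: `Wrapper` and `wrapper_get` are hypothetical names chosen for illustration, and a real cross-language export would also need an unmangled symbol name. Assuming the target's standard library is installed, compiling it with `rustc --emit=llvm-ir --target i686-unknown-linux-gnu` and again for `x86_64-unknown-linux-gnu` should let you compare how the struct argument is represented on each target.

```rust
// Hypothetical example; names are illustrative and not taken from the report.

/// A C-compatible struct wrapping a single 32-bit integer.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Wrapper {
    pub value: u32,
}

/// Takes the struct by value using the C calling convention.
pub extern "C" fn wrapper_get(w: Wrapper) -> u32 {
    w.value
}

fn main() {
    println!("{}", wrapper_get(Wrapper { value: 7 }));
}
```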
Cc: @nikic