Implement & use metadata on arguments in LLVM #50156

hanna-kruppe · 2018-04-22T13:13:36Z

We use metadata attached to LLVM instructions to tell LLVM passes about the range of possible integers that a value can take on. This allows additional optimizations. For example, when loading from a &NonZeroU32, the LLVM IR looks like this:

  %2 = load i32, i32* %1, align 4, !range !33

where !33 refers to a module-level entry like this (usually located near the end of the module source code):

!33 = !{i32 1, i32 0}

This tells LLVM that it's UB if the load returns anything that is not in the range ">= 1, < 0" (with wrap-around, i.e. it includes all values except 0). This allows x.get() != 0 (where x: &NonZeroU32) to be optimized to true, for example.

Unfortunately LLVM does not have a way to attach this information to function arguments, which leads to missed optimizations (e.g. #49572, #49420 (comment)). Adding this capability has been discussed before on llvm-dev and received positively, but was never actually implemented. If we want to use this in Rust, we're likely going to have to implement (& upstream) it outselves.

This issue tracks:

Implementing metadata-on-arguments in LLVM (mentoring instructions)
Upstreaming / backporting that LLVM patch
Emitting range metadata on arguments from rustc

cc @nox @eddyb

The text was updated successfully, but these errors were encountered:

nox · 2018-04-22T13:15:17Z

Nice! Thanks for filing this and being willing to mentor its implementation.

hanna-kruppe · 2018-04-22T13:31:15Z

cc #50157 which is much easier but strongly related and complementary.

hanna-kruppe · 2018-04-22T15:20:30Z

Mentoring instructions for the LLVM changes:

Currently metadata can be attached to instructions, and attributes can be attached to both function arguments and parameters, as well as the return value. For metadata-on-arguments we'll draw inspiration from the APIs for both. The implementation can be simpler than both of those, though you may still want to look through it. For that, and in general for browsing LLVM code, I recommend https://code.woboq.org/llvm/llvm/ which offers go-to-definition, go-to-declaration, find-all-uses, etc.

We don't need metadata on return values (you can already attach metadata to the call instruction, and this indeed already works for range metadata today). I also don't see a need for metadata on the values passed to a call (e.g. there also isn't range metadata on stores). So we restrict outselves to the formal parameters of the function definition -- confusingly called Argument in the LLVM code base. I think we should add an optional collection of metadata to the Function object (one per parameter), and both Function and Argument would gain APIs for adding, querying, and removing metadata from individual parameters.

The APIs would be similar to what Instruction offers for metadata, though pared down -- we don't need anything related to debug metadata, for example. However, like with attributes, there should probably be APIs on both Function (taking an ArgNo integer to identify the argument) and on Argument (delegating to Function and pass in their own ArgNo). Take a look at how it's done for attributes).

Metadata on instructions is implemented in the superclass Instruction and it's fairly complicated to avoid wasting storage for the majority of instructions that don't have metadata. The actual metadata is stored out-of-line in the LLVMContext and Instruction only stores one bit indicating whether there's any metadata. We don't need that, we can just have an optional pointer to an array with one MDAttachmentMap per parameter, mirroring the Argument *Arguments; member. See the Instruction methods for examples of how to use the MDAttachmentMap.

Once that is done, passes need to be updated to query the new metadata when analyzing Arguments. I haven't looked at this part in detail yet, but going through the existing uses of LLVMContext::MD_range will almost certainly suffice. But anyone who wants to dive into this should ping me so I can help or take over.

nox · 2018-05-06T09:39:02Z

@rkruppe Should the Function class have a new MDAttachmentMap * field and handle memory allocation of the array itself, or should that go in a separate type?

hanna-kruppe · 2018-05-06T09:52:29Z

I think in analogy with Function::Arguments it's fine to open-code that in Function.

eddyb · 2018-05-06T10:01:18Z

Would this change mean that attributes like nonnull would get replaced with metadata?
Actually, don't we have to use !nonnull for pointers and !range for integers?
So nonnull would become !nonnull (how? this should be backwards-compatible).

hanna-kruppe · 2018-05-06T10:14:18Z

Yes, I believe this would obsolete some attributes on formal parameters. For argument values at specific call sites you'd still use attributes, but I don't think callsite annotations are useful for nonnull specifically. So yeah, nonnull probably becomes obsolete with this change.

hanna-kruppe · 2018-05-06T10:23:23Z

Reviewing the list of attributes in the Language Reference, metadata-on-parameters could also obsolete dereferencable(<n>) and dereferencable_or_null(<n>) if corresponding metadata is added.

Other optimization hints that could conceivably become metadata are noalias, nocapture and returned, but these are very function-parameter-specific so turning them into metadata that can be applied to other values may not make sense. There's !noalias metadata but it's not obvious to me whether the noalias attribute can be exactly expressed with that metadata.

I ran into an issue where I did a method call on a nil interface and it resulted in a HardFault. Luckily I quickly realized what was going on so I could fix it, but I think undefined behavior is definitely the wrong behavior in this case. This commit therefore changes such calls to cause a nil panic instead of introducing undefined behavior. This does have a code size impact. It's relatively minor, much lower than I expected. When comparing the before and after of the drivers smoke tests (probably the most representative sample available), I found that most did not change at all and those that did change, normally not more than 100 bytes (16 or 32 byte changes are typical). Right now the pattern is the following: switch typecode { case 1: call method 1 case 2: call method 2 default: nil panic } I also tried the following (in the hope that it would be easier to optimize), but it didn't really result in a code size reduction: switch typecode { case 1: call method 1 case 2: call method 2 case 0: nil panic default: unreachable } Some code got smaller, while other code (the majority) got bigger. Maybe this can be improved once range[1] is finally allowed[2] on function parameters, but it'll probably take a while before that is implemented. [1]: https://llvm.org/docs/LangRef.html#range-metadata [2]: rust-lang/rust#50156

Add range attribute to scalar function results and arguments as LLVM 19 adds the range attribute this starts to use it for better optimization. hade been interesting to see a perf run with the rust-lang#127513 closes rust-lang#50156 cc rust-lang#49572 shall be fixed but not possible to see as there is asserts that already trigger the optimization. r? `@nikic`

Add range attribute to scalar function results and arguments as LLVM 19 adds the range attribute this starts to use it for better optimization. hade been interesting to see a perf run with the rust-lang#127513 closes rust-lang#50156 cc rust-lang#49572 shall be fixed but not possible to see as there is asserts that already trigger the optimization. try-job: i686-gnu

Add range attribute to scalar function results and arguments as LLVM 19 adds the range attribute this starts to use it for better optimization. hade been interesting to see a perf run with the rust-lang#127513 closes rust-lang#50156 cc rust-lang#49572 shall be fixed but not possible to see as there is asserts that already trigger the optimization.

Add range attribute to scalar function results and arguments as LLVM 19 adds the range attribute this starts to use it for better optimization. hade been interesting to see a perf run with the rust-lang/rust#127513 closes rust-lang/rust#50156 cc rust-lang/rust#49572 shall be fixed but not possible to see as there is asserts that already trigger the optimization.

hanna-kruppe mentioned this issue Apr 22, 2018

The NonZero types don't tell LLVM that they're non-zero on get #49572

Open

kennytm added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-enhancement Category: An issue proposing an enhancement or a PR with one. WG-llvm Working group: LLVM backend code generation labels Apr 24, 2018

hanna-kruppe mentioned this issue Oct 12, 2018

Tell LLVM about NonZero integer types #54995

Closed

jonas-schievink added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Mar 18, 2020

aykevl mentioned this issue May 27, 2020

Interface nil check tinygo-org/tinygo#1138

Merged

workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023

andjo403 mentioned this issue Jan 21, 2024

[IR] Add support for metadata attachments for Function Arguments llvm/llvm-project#78893

Closed

andjo403 mentioned this issue Jul 29, 2024

Add range attribute to scalar function results and arguments #128371

Merged

bors closed this as completed in e08b80c Aug 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement & use metadata on arguments in LLVM #50156

Implement & use metadata on arguments in LLVM #50156

hanna-kruppe commented Apr 22, 2018 •

edited

Loading

nox commented Apr 22, 2018

hanna-kruppe commented Apr 22, 2018

hanna-kruppe commented Apr 22, 2018 •

edited

Loading

nox commented May 6, 2018 •

edited

Loading

hanna-kruppe commented May 6, 2018

eddyb commented May 6, 2018

hanna-kruppe commented May 6, 2018

hanna-kruppe commented May 6, 2018

Implement & use metadata on arguments in LLVM #50156

Implement & use metadata on arguments in LLVM #50156

Comments

hanna-kruppe commented Apr 22, 2018 • edited Loading

nox commented Apr 22, 2018

hanna-kruppe commented Apr 22, 2018

hanna-kruppe commented Apr 22, 2018 • edited Loading

nox commented May 6, 2018 • edited Loading

hanna-kruppe commented May 6, 2018

eddyb commented May 6, 2018

hanna-kruppe commented May 6, 2018

hanna-kruppe commented May 6, 2018

hanna-kruppe commented Apr 22, 2018 •

edited

Loading

hanna-kruppe commented Apr 22, 2018 •

edited

Loading

nox commented May 6, 2018 •

edited

Loading