Enable mutable noalias for LLVM >= 12 #82834
@bors try @rust-timer queue
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf
⌛ Trying commit f8452a5159f665c77df589e44b37fa17fa508df0 with merge 3352d8ec2c800d702ecbbdc68f5502978773c4f3...
☀️ Try build successful - checks-actions
Queued 3352d8ec2c800d702ecbbdc68f5502978773c4f3 with parent 51748a8, future comparison URL.
Finished benchmarking try commit (3352d8ec2c800d702ecbbdc68f5502978773c4f3): comparison url. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up. @bors rollup=never |
When looking at incr-unchanged to exclude any slowdown in LLVM, there are many improvements of up to 1.4%, a few slowdowns of up to 0.6%, and a 1.8% regression in ctfe-stress. When looking at all benchmarks, there are big losses of up to 3.3%. For the style-servo-opt full run, for example, LLVM_module_codegen took 28.5% longer.
I don't think we should expect this to be compile-time-neutral; LLVM does have to do more work. It'd be helpful to have runtime performance benchmarks to compare these to. Ultimately, we may end up trading compile-time for runtime here.
I was looking at incr-unchanged for that. Those bail out before LLVM gets invoked.
@bjorn3 I meant "runtime performance" as in "performance of the compiled code". I'd expect the primary benefit of noalias to be better code generation, not faster code generation.
@joshtriplett I think that was the idea. If you enable noalias for the compiler, you expect the Rust part of it to get faster, so incr-unchanged can roughly tell you that.
Note that when looking at the performance results you can open the `-Zself-profile` outputs to see where the regressions end up being. For example: old vs new.
There are two areas of compile-time regressions here: one is in LLVM internals (which makes sense, as LLVM is now doing strictly more work optimizing things) and the other is in generating the IR (which also makes sense, as we're generating more IR).
This LGTM overall. r=me with or without @RalfJung's comment addressed.
It's probably worthwhile to check the perf results again with the full implementation. The previous perf run was with it unconditionally enabled, without the Unpin and version checks. @bors try @rust-timer queue
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf
⌛ Trying commit ee2eefeecbc5a8af4c5e81a05bffdad00912f3f8 with merge 989a6de73b924cf690879f45fab0a3889c267b43...
☀️ Try build successful - checks-actions
Queued 989a6de73b924cf690879f45fab0a3889c267b43 with parent eb95ace, future comparison URL.
Yes, |
Ah, perfect. Thank you, @bjorn3 💛 |
Would tracking the produced binary size in perf.rlo make sense as a cheaper proxy measure instead of doing full benchmarks? This assumes that LLVM optimizing more results in things being optimized away and thus in smaller code.
relevant issue rust-lang/rustc-perf#145 |
I'm not 100% sure that that would always be the case. LLVM optimizes by default for runtime performance, and many times that would entail a larger binary size. Inlining and loop unrolling are very common examples of this. Monomorphization as well. That said, maybe there's something related to the |
A correlation < 1 is expected, which is why I wrote "proxy measure". I just hope it's still big enough to be useful.
In light of rust-lang/rust#82834, we must ensure that the intrusive linked list pointers never get mutable-noalias optimizations (see also rust-lang/rust#63818). Adding a `PhantomPinned` to the `Links` struct ensures it will always be `!Unpin`, disabling mutable-noalias. Signed-off-by: Eliza Weisman <[email protected]>
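The `PhantomPinned` approach described in that commit message can be sketched roughly as follows. This is a minimal illustration, not the actual code from that commit: the field names `next`, `prev` and `_pin`, the methods `new` and `is_linked`, and the helper trait `AmbiguousIfUnpin` are all assumptions made for the example. Since `NonNull<T>` is itself `Unpin`, the struct would be `Unpin` without the marker field.

```rust
use std::marker::PhantomPinned;
use std::ptr::NonNull;

/// Hypothetical intrusive-list links; the field names are illustrative only.
pub struct Links<T> {
    next: Option<NonNull<T>>,
    prev: Option<NonNull<T>>,
    /// Zero-sized marker: makes `Links<T>` (and any struct containing it) `!Unpin`.
    _pin: PhantomPinned,
}

impl<T> Links<T> {
    /// Creates links that do not point at any neighbor yet.
    pub fn new() -> Self {
        Links { next: None, prev: None, _pin: PhantomPinned }
    }

    /// Returns true if either neighbor pointer is set.
    pub fn is_linked(&self) -> bool {
        self.next.is_some() || self.prev.is_some()
    }
}

// Compile-time check that a type does NOT implement `Unpin`.
// If the type were `Unpin`, both impls would apply and the `_` below
// could not be inferred, so compilation would fail.
trait AmbiguousIfUnpin<A> {
    fn check() {}
}
impl<T: ?Sized> AmbiguousIfUnpin<()> for T {}
impl<T: ?Sized + Unpin> AmbiguousIfUnpin<u8> for T {}

/// Compiles only while `Links<u32>` stays `!Unpin`.
pub fn assert_links_not_unpin() {
    <Links<u32> as AmbiguousIfUnpin<_>>::check();
}
```

Removing the `_pin` field should make `assert_links_not_unpin` fail to compile, which is one way to guard against the marker being dropped by accident.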
Add mutable-noalias to the release notes for 1.54 It was enabled in rust-lang#82834 and disabled in 1.53 by rust-lang#86036, but it was never disabled on (then) nightly, so it still landed in 1.54. This was mentioned on rust-lang#86696 but never made it into the release notes. r? `@XAMPPRocky` cc `@nikic`
We haven't seen any regressions upstream since it was last enabled again in March 2021: rust-lang/rust#82834. This results in a negligible increase in binary size of 24-56 kiB, depending on build configuration. Runtime perf only changed on x64 builds. 46 test cases got faster and 18 test cases got slower, with a couple of significant regressions in FIDL microbenchmarks. Overall the results look positive. Fixed: 76297 Change-Id: Id4a2b643e30e748e8d200f9d88c54ecc0ea02b2c Reviewed-on: https://fuchsia-review.googlesource.com/c/fuchsia/+/674307 Commit-Queue: Tyler Mandry <[email protected]> Reviewed-by: Dan Johnson <[email protected]>
Is there any plan to enable emitting LLVM |
I suspect that depends on deciding a final memory model for rust. |
I was under the impression that there are clear cases where we can guarantee no mutable aliasing around references? |
Is it plausible that you two aren't talking about the same thing? I'm not entirely sure of what context @archshift is talking about, but I think there are indeed some contexts in which we can guarantee that. Gui, do you think you could show us an example of what you mean? I get the feeling that the context you're talking about is more specific than what Bjorn is talking about when they mean that the memory model is needed to know the answer. |
An old closed PR is definitely not the right place for such a discussion though. :) Please take this to Zulip, IRLO, or a new issue. |
Enable mutable noalias by default on LLVM 12, as previously known miscompiles have been resolved. Now it's time to find the next one ;)
The `-Z mutable-noalias` option no longer has an explicit default and accepts `-Z mutable-noalias=yes` and `-Z mutable-noalias=no` to override the LLVM version based default behavior.
`noalias` is not emitted for types that are `!Unpin`, as a heuristic for self-referential structures (see Enable noalias annotations #54878 and Resolve unsound interaction between noalias and self-referential data (incl. generators, async fn) #63818).
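As a rough illustration of what the attribute permits, consider the sketch below. The function names `add_twice` and `add_twice_raw` are invented for this example and do not come from the PR. With `noalias` on the `&mut` parameter, the optimizer may assume `dst` and `src` never overlap, so it is allowed to load `*src` once and reuse the value. With raw pointers no such assumption holds, and the second read has to observe the first write. Whether a particular build actually performs this optimization depends on the compiler version, the LLVM version, and the flags in use.

```rust
/// Adds `*src` to `*dst` twice.
/// The borrow rules guarantee `dst` and `src` do not overlap, which is
/// the guarantee that a `noalias` annotation hands to the optimizer.
pub fn add_twice(dst: &mut i32, src: &i32) {
    *dst += *src;
    *dst += *src;
}

/// Same operation on raw pointers, which are allowed to alias.
///
/// # Safety
/// Both pointers must be valid for reads, and `dst` must be valid for writes.
pub unsafe fn add_twice_raw(dst: *mut i32, src: *const i32) {
    unsafe {
        *dst += *src;
        // If `dst` and `src` point at the same value, this read sees
        // the result of the first addition.
        *dst += *src;
    }
}
```

Calling `add_twice_raw(p, p)` on a value of 1 yields 4 (1 + 1, then 2 + 2), whereas `add_twice` cannot be called with overlapping arguments in safe code at all.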