Apple platforms: Disabled frame pointer elimination causes perf issues and is not in line with what clang does #86196
Comments
Indeed. Here's what clang does: for AArch64 it omits the frame pointer for leaf functions. Interestingly, looking at https://github.com/llvm/llvm-project/blob/c3cc14f87f78f8172b74175bbd2557cfb9384900/clang/lib/Driver/ToolChains/Clang.cpp#L517 you'd think the same would apply to ARM and Thumb on Darwin, but on a closer reading it appears that's not the case. It looks like Rust doesn't currently support non-leaf frame pointer omission, so that's going to need to be hooked up before this can be fixed. Can you elaborate on the massive performance issues? Do you have a benchmark where you noticed this?
@rustbot label A-codegen C-bug I-prioritize I-slow O-ios O-macos regression-from-stable-to-nightly T-compiler Not entirely sure "bug" and "regression" fit a deliberate change, but since it seems to have been based on a false assumption and to have caused performance issues, I'm applying them. If someone thinks they don't fit, feel free to undo.
I have to admit that I was writing a non-inlined partial load function using Neon intrinsics, which was heavily affected by this. So it might not be as bad in the general case. I will run some benchmarks.
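A micro-benchmark of this kind could be sketched roughly as follows. This is only an illustration, not the code that was actually measured in this thread: the names `idem` and `time_calls` and the iteration count are hypothetical, and the numbers it prints depend heavily on the machine, the target and the compiler version.

```rust
use std::hint::black_box;
use std::time::Instant;

// Hypothetical stand-in for a tiny leaf function: it returns its argument
// unchanged and calls nothing else. `#[inline(never)]` keeps the call in
// place so that the function prologue and epilogue are part of what is timed.
#[inline(never)]
fn idem(x: u64) -> u64 {
    x
}

// Calls `idem` `iters` times and returns (checksum, average nanoseconds per call).
fn time_calls(iters: u64) -> (u64, f64) {
    let start = Instant::now();
    let mut sum = 0u64;
    for i in 0..iters {
        // `black_box` hides the argument from the optimizer.
        sum = sum.wrapping_add(idem(black_box(i)));
    }
    let elapsed = start.elapsed();
    (black_box(sum), elapsed.as_nanos() as f64 / iters as f64)
}

fn main() {
    let (sum, ns) = time_calls(100_000_000);
    println!("checksum {sum}, {ns:.3} ns per call");
}
```

Building it once with a nightly that sets "frame-pointer"="all" and once with a toolchain that does not, both with optimizations enabled, should show whether the extra save and restore is measurable for such a small function.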
The time for an idem function call (returning the u64 passed in as an argument) on Apple M1 is:
The performance difference in the real-world applications I have measured (ripgrep, rustc) is negligible. The size of generated binaries is somewhat surprising: The So all in all while I still think that using
Assigning priority as discussed in the Zulip thread of the Prioritization Working Group. @rustbot label -I-prioritize +P-high
…henkov Add support for leaf function frame pointer elimination This PR adds ability for the target specifications to specify frame pointer emission type that's not just “always” or “whatever cg decides”. In particular there's a new mode that allows omission of the frame pointer for leaf functions (those that don't call any other functions). We then set this new mode for Aarch64-based Apple targets. Fixes rust-lang#86196
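To make the leaf / non-leaf distinction concrete, here is a minimal sketch. The function names are hypothetical and not taken from the PR; whether a frame pointer is actually kept for each function depends on the target and on the compiler version.

```rust
use std::hint::black_box;

// A leaf function: it calls no other function, so in the leaf-omitting
// mode it does not need to set up a frame record.
#[inline(never)]
pub fn leaf(x: u64) -> u64 {
    x.wrapping_mul(3).wrapping_add(1)
}

// A non-leaf function: it calls `leaf` twice and combines the results,
// so the return address has to be preserved across the calls.
#[inline(never)]
pub fn non_leaf(x: u64) -> u64 {
    leaf(x) + leaf(x + 1)
}

fn main() {
    // `black_box` keeps the argument opaque so the calls are not folded away.
    println!("{}", non_leaf(black_box(1)));
}
```

One way to inspect the result is to compile with `rustc -O --emit asm` and compare the prologues of `leaf` and `non_leaf`; adding `-C force-frame-pointers=yes` should show the always-keep behaviour for comparison.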
PR #85706 causes massive performance issues for non-inlined leaf functions, since the frame pointer and the link register are now always saved and restored at the beginning and end of a function, even if the function, e.g., just returns a constant. Contrary to the PR description, this is not what clang does on macOS AArch64. The clang default on macOS is "frame-pointer"="non-leaf" instead.
E.g. clang compiles
to this LLVM IR (with clang -c test.c -S -emit-llvm -O3)
which leads to this assembly:
Since the inclusion of the PR, Rust Nightly sets "frame-pointer"="all", which causes this assembly to be generated:
@jrmuizel: FYI