Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RISC-V] Do not force frame pointers #69890

Merged
merged 1 commit into from
Jun 17, 2020

Conversation

lenary
Copy link
Contributor

@lenary lenary commented Mar 10, 2020

We have been seeing some very inefficient code that went away when using
-Cforce-frame-pointers=no. For instance core::ptr::drop_in_place at
-Oz was compiled into a function which consisted entirely of saving
registers to the stack, then using the frame pointer to restore the same
registers (without any instructions between the prolog and epilog).

The RISC-V LLVM backend supports frame pointer elimination, so it makes
sense to allow this to happen when using Rust. It's not clear to me that
frame pointers have ever been required in the general case.

In #61675 it was pointed out that this made reassembling
stack traces easier, which is true, but there is a code generation
option for forcing frame pointers, and I feel the default should not be
to require frame pointers, given it demonstrably makes code size worse
(around 10% in some embedded applications).

The kinds of targets mentioned in #61675 are popular, but
should not dictate that code generation should be worse for all RISC-V
targets, especially as there is a way to use CFI information to
reconstruct the stack when the frame pointer is eliminated. It is also
a misconception that fp is always used for the frame pointer. fp is
an ABI name for x8 (aka s0), and if no frame pointer is required,
x8 may be used for other callee-saved values.


I am partly posting this to get feedback from @fintelia who introduced the change to require frame pointers, and @hanna-kruppe who had issues with the original PR. I would understand if we wanted to remove this setting on only a subset of RISC-V targets, but my preference would be to remove this setting everywhere.

There are more details on the code size savings seen in Tock here: tock/tock#1660

@rust-highfive
Copy link
Collaborator

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 10, 2020
@jonas-schievink jonas-schievink added the WG-embedded Working group: Embedded systems label Mar 10, 2020
@jonas-schievink
Copy link
Contributor

Counterpoint: The embedded ARM targets have done the opposite in order to fix backtraces #69248

@jonas-schievink
Copy link
Contributor

Have you tried changing the uwtable option instead? According to #69231 (comment) that might be a cheaper option than keeping full frame pointers, while still keeping backtraces intact.

@jonas-schievink jonas-schievink added the O-riscv Target: RISC-V architecture label Mar 10, 2020
@lenary
Copy link
Contributor Author

lenary commented Mar 10, 2020

Oh, god, I see part of the problem. You can't use a -C option to override how std is compiled, so if anyone wants to force frame pointers, they have to be forced in the target configs. This is really infuriating, to be honest.

Looking at why uwtable works, it seems that maybe the ARM issue comes from here: https://github.com/llvm/llvm-project/blob/e4230a9f6c518209cf0d9fdac1764dadd525b513/llvm/lib/Target/ARM/ARMFrameLowering.cpp#L98 In the RISC-V backend, we do not override this hook, and so the default implementation (which returns false) is always used. grepping the RISC-V backend for uwtable does not seem to match any target-specific code.

Looking at how this all interacts:

  • GDB looks for call frame information (CFI) first, and then tries to reconstruct the backtrace using the frame pointer.
    • If a function is nounwind (and not uwtable), then no CFI will be emitted for the function.
    • panic != unwind adds nounwind to all functions.
    • Therefore right now, no RISC-V target gets CFI information.
  • LR(ARM)/fp(RISC-V) is a callee-saved register (CSR). This is used for the frame pointer, if a frame pointer is requested, otherwise it is a general-purpose CSR.
    • if a function is both noreturn and nounwind (and not uwtable), then LLVM allows backends to skip restoring CSRs. panic is a noreturn function, and nounwind depending on the conditions above.
      • ARM definitely skips these restores. Skipping these restores trashes the frame pointer.
      • RISC-V does not skip these restores (yet).
    • The only place to force frame pointers for std is in these target definitions. Using -C is too late.

I will investigate adding uwtable to ensure that CFI is added, which can be used instead of the frame pointer. I'm not sure this is the best fix, but I think it's maybe the best we can do at the moment.

@hanna-kruppe
Copy link
Contributor

Longer-term, Cargo should gain the ability to recompile the standard library (currently unstable as -Zbuild-std) allowing projects to fully opt in or out of whatever it takes to get backtraces working. But if "CFI everywhere" indeed has lower costs (binary size, code size, performance) than "frame pointers everywhere" then that seems like the better choice for that both today and in the future.

Do we have a -C flag controlling uwtabe emission the same way -C force-frame-pointers controls frame pointers? If targets are going to start defaulting to it for backtrace quality reasons, it would be good to have an opt-out (modulo sysroot crates, as with frame pointers). And even if targets won't need to do that in the future because Cargo can rebuild the standard library, it would be helpful if projects had the choice to enable "CFI everywhere" without creating a new target definition.

@fintelia
Copy link
Contributor

fintelia commented Mar 10, 2020

GDB also isn't the only use case we need backtraces for. Executables also want to be able to generate backtraces for themselves, so they can print from inside a panic/abort handler, for instance. It should ideally be possible to do this even when in a nostd environment.

With frame pointers this logic only takes a handful of lines of code. With uwtable, well, none of the relevant libraries had been ported to RISC-V when I checked last year. Though maybe someone could share documentation on how to parse the format? Perhaps it isn't as bad as I'm imagining.

@lenary
Copy link
Contributor Author

lenary commented Mar 11, 2020

What happens today if we use -Zbuild-std and -C force-frame-pointers=no? Presumably this rebuilds core and std without frame pointers. So in the event of this patch landing, projects that require frame pointers (which I believe are in the minority) could presumably use -Zbuild-std -C force-frame-pointers=yes, without requiring that the default std distribution contains frame pointers everywhere.

Do we have a -C flag controlling uwtable emission

Not sure, I can look to add it in a separate PR. This might be a better solution rather than adding requires_uwtable: true to all these definitions.

@jonas-schievink
Copy link
Contributor

So in the event of this patch landing, projects that require frame pointers (which I believe are in the minority) could presumably use -Zbuild-std -C force-frame-pointers=yes, without requiring that the default std distribution contains frame pointers everywhere.

What I would really like to prevent here is breaking obtaining a backtrace using GDB after an embedded app has panicked (this traverses noreturn frames, which caused the issues on ARM, and here the problems seem to be caused by nounwind frames). As long as that still works, I'm fine with this patch.

If it doesn't, then I don't think we should land this patch. Backtraces should work by default, even on embedded targets (or rather, especially there). For the cases where code size is an issue, passing -Cforce-frame-pointers=off should not be a dealbreaker. For the remaining cases where frame pointers in libcore are still too costly, producing release builds on nightly via -Zbuild-std should work.

@lenary
Copy link
Contributor Author

lenary commented Mar 12, 2020

Sorry, I've been conflating two things: ELF Binaries have up to two sources of (slightly different) call frame information:

  • Binaries that need to do stack unwinding (for exceptions) have an .eh_frame section (and maybe an .eh_frame_hdr which is a fast lookup table to find the right .eh_frame call frame info entry). I think this is what uwtable (or not nounwind) attributes in LLVM IR translate to. Usually, C binaries do not have an .eh_frame section (C++ binaries usually do).
  • Binaries that you want to debug have a .debug_frame section, which can be used if there is no frame pointer. This is controlled with debug=true/debug=false in your cargo config (or -g with gcc-like c compilers).

GDB will parse either of these to construct frame information, it is not fussy (the requirement here is you use the ELF file generated by the compiler/linker, for debugging, not the stripped binary that is loaded onto the device. You will also need to ensure that your linker script preserves at least one of these sections for the elf file).

I doubt we need to add the uwtable attribute to get a correct GBD backtrace if debug=true.
If a user has chosen debug=false, then I'm not sure that Rust should be on the hook for ensuring the GDB backtrace is correct.

@bjorn3
Copy link
Member

bjorn3 commented Mar 12, 2020

If a user has chosen debug=false, then I'm not sure that Rust should be on the hook for ensuring the GDB backtrace is correct.

Panic backtraces should be correct though.

@hanna-kruppe
Copy link
Contributor

I doubt we need to add the uwtable attribute to get a correct GBD backtrace if debug=true.
If a user has chosen debug=false, then I'm not sure that Rust should be on the hook for ensuring the GDB backtrace is correct.

Interesting. As with frame pointers, user's choice of debuginfo does not currently affect the parts of the sysroot that are shipped as pre-compiled machine code (which crucially includes the panic machinery that's at the start of every backtrace). Leaving aside the still-unstable possibility of -Zbuild-std, I believe this means that even allowing users to opt into working backtraces requires the standard library to be built with one of:

  • panic=unwind (not supported or desired on many embedded targets)
  • uwtable
  • some debug info
  • forcing frame pointers

I expect that debug info, even just line tables, has more space overhead than uwtable or frame pointers, so that seems like the least appealing option (if all we care about is backtraces; of course debug info enables more than that).


While writing this up, I realized we could also specifically tweak how the standard library is built for distribution, instead of changing the target defaults. This would effectively mean moving the burden for users from "if you want better code size, use these options" to "if you want backtrace to work in release, use these options". I don't have a strong opinion on which is the better default but it's an option.

@nikomatsakis
Copy link
Contributor

So I don't really have an opinion here. I'd like to delegate this final decision to the wg-embedded group perhaps =) not sure who's the right owner for the RISC-V backend, basically. Any suggestions for who I ought to reassign to?

@Amanieu
Copy link
Member

Amanieu commented Mar 13, 2020

LLVM's actual logic for determining whether callee-saved registers (which includes the return address register) should be saved is actually target-independent.

The condition for not preserving CSRs is basically that a function must be noreturn + nounwind + !uwtable. Notably, this means that just enabling debuginfo generation is not enough: you must use uwtable if you want to ensure that backtraces are possible.

I personally think that this should be added as a third option to -C panic, called abort-with-backtrace. All it does is enable uwtable to guarantee that backtraces can be reliably generated. The reason for making it separate from -C panic=abort is that this comes at a slight code size increase (due to the need to preserve CSRs even for noreturn functions) and a binary size increase from the .eh_frame section.

@Amanieu
Copy link
Member

Amanieu commented Mar 13, 2020

On targets which do not support unwinding (e.g. thumb* & riscv*), the standard library would then be built with -C panic=abort-with-backtrace to ensure that reconstructing backtraces is always possible on these targets.

@bjorn3
Copy link
Member

bjorn3 commented Mar 13, 2020

What about -Cpanic=abort-without-backtrace instead? That would make it less confusing why backtraces don't work and it means that existing uses where people do want backtraces don't have to be updated.

@Disasm
Copy link
Contributor

Disasm commented Mar 18, 2020

Just to note: I checked debugging panics and diverging functions on a HiFive1 board (rv32imac) and it works great both with and without -Cforce-frame-pointers=no. I hope this resolves @jonas-schievink's doubts.

@fintelia
Copy link
Contributor

@Disasm Could you specify a bit more about what you did? Were you running with #![no_std], and if so what did your panic handler look like? Did you see a backtrace or just a line number in the resulting output?

@nikomatsakis
Copy link
Contributor

If we are going to be adding a new panic mode, that would be a different set of decision makers. I'm a bit wary of that outcome, although I do see @Amanieu's point (but -- if I'm not mistaken -- isn't that already available to people by setting the "force frame pointer" option manually?).

@Disasm
Copy link
Contributor

Disasm commented Mar 19, 2020

@fintelia Yes, it was a no_std firmware for the riscv32imac-unknown-none-elf target with a few #[inline(never)] diverging functions calling each other and optionally calling panic!().
panic-halt was used as a panic handler. I didn't see a panic message for obvious reasons, I only checked the output of gdb after the bt command. I saw a good finite backtrace.

@Dylan-DPC-zz
Copy link

r? @hanna-kruppe

@joelpalmer joelpalmer added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 14, 2020
@lenary lenary force-pushed the lenary/riscv-frame-pointers branch from 6438cc7 to 654f414 Compare May 30, 2020 14:18
@lenary
Copy link
Contributor Author

lenary commented May 30, 2020

Hopefully this change now addresses @hanna-kruppe's suggestion.

@rust-highfive
Copy link
Collaborator

The job mingw-check of your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
##[section]Starting: Linux mingw-check
##[section]Starting: Initialize job
Agent name: 'Azure Pipelines 28'
Agent machine name: 'fv-az578'
Current agent version: '2.169.1'
##[group]Operating System
16.04.6
LTS
LTS
##[endgroup]
##[group]Virtual Environment
Environment: ubuntu-16.04
Version: 20200517.1
Included Software: https://github.com/actions/virtual-environments/blob/ubuntu16/20200517.1/images/linux/Ubuntu1604-README.md
##[endgroup]
Agent running as: 'vsts'
Prepare build directory.
Set build variables.
Download all required tasks.
Download all required tasks.
Downloading task: Bash (3.163.2)
Checking job knob settings.
   Knob: AgentToolsDirectory = /opt/hostedtoolcache Source: ${AGENT_TOOLSDIRECTORY} 
   Knob: AgentPerflog = /home/vsts/perflog Source: ${VSTS_AGENT_PERFLOG} 
Start tracking orphan processes.
##[section]Finishing: Initialize job
##[section]Starting: Configure Job Name
==============================================================================
---
========================== Starting Command Output ===========================
[command]/bin/bash --noprofile --norc /home/vsts/work/_temp/82d383ba-96a5-4fb3-9222-676d86b41a67.sh

##[section]Finishing: Disable git automatic line ending conversion
##[section]Starting: Checkout rust-lang/rust@refs/pull/69890/merge to s
Task         : Get sources
Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.
Version      : 1.0.0
Author       : Microsoft
---
##[command]git remote add origin https://github.com/rust-lang/rust
##[command]git config gc.auto 0
##[command]git config --get-all http.https://github.com/rust-lang/rust.extraheader
##[command]git config --get-all http.proxy
##[command]git -c http.extraheader="AUTHORIZATION: basic ***" fetch --force --tags --prune --progress --no-recurse-submodules --depth=2 origin +refs/heads/*:refs/remotes/origin/* +refs/pull/69890/merge:refs/remotes/pull/69890/merge
---
 ---> 3adb0605cc65
Step 6/7 : ENV RUN_CHECK_WITH_PARALLEL_QUERIES 1
 ---> Using cache
 ---> 28dbc326cb7f
Step 7/7 : ENV SCRIPT python3 ../x.py test src/tools/expand-yaml-anchors &&            python3 ../x.py check --target=i686-pc-windows-gnu --host=i686-pc-windows-gnu &&            python3 ../x.py build --stage 0 src/tools/build-manifest &&            python3 ../x.py test --stage 0 src/tools/compiletest &&            python3 ../x.py test src/tools/tidy &&            python3 ../x.py doc --stage 0 src/libstd &&            /scripts/validate-toolstate.sh
 ---> 537a01811900
Successfully built 537a01811900
Successfully tagged rust-ci:latest
Built container sha256:537a018119009dc218456238dec90b5530050db1e2a1e166550c218003f6159d
---
   Compiling bootstrap v0.0.0 (/checkout/src/bootstrap)
error[E0308]: mismatched types
   --> src/bootstrap/compile.rs:255:9
    |
255 |         cargo.rustflag("-Cforce-unwind-tables=yes")
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `()`, found `&mut builder::Cargo`
help: try adding a semicolon
    |
    |
255 |         cargo.rustflag("-Cforce-unwind-tables=yes");
help: try adding a return type
    |
    |
168 | pub fn std_cargo(builder: &Builder<'_>, target: Interned<String>, stage: u32, cargo: &mut Cargo) -> &mut builder::Cargo {

error: aborting due to previous error

For more information about this error, try `rustc --explain E0308`.
---
  local time: Sat May 30 14:24:05 UTC 2020
  network time: Sat, 30 May 2020 14:24:05 GMT
== end clock drift check ==

##[error]Bash exited with code '1'.
##[section]Finishing: Run build
##[section]Starting: Checkout rust-lang/rust@refs/pull/69890/merge to s
Task         : Get sources
Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.
Version      : 1.0.0
Author       : Microsoft
Author       : Microsoft
Help         : [More Information](https://go.microsoft.com/fwlink/?LinkId=798199)
==============================================================================
Cleaning any cached credential from repository: rust-lang/rust (GitHub)
##[section]Finishing: Checkout rust-lang/rust@refs/pull/69890/merge to s
Cleaning up task key
Start cleaning up orphan processes.
Terminate orphan process: pid (4769) (python)
##[section]Finishing: Finalize Job

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @rust-lang/infra. (Feature Requests)

@hanna-kruppe
Copy link
Contributor

LGTM. r=me except that I'd like someone more familiar with rustbuild to take a second look at the rustbuild part, so r? @Mark-Simulacrum

We have been seeing some very inefficient code that went away when using
`-Cforce-frame-pointers=no`. For instance `core::ptr::drop_in_place` at
`-Oz` was compiled into a function which consisted entirely of saving
registers to the stack, then using the frame pointer to restore the same
registers (without any instructions between the prolog and epilog).

The RISC-V LLVM backend supports frame pointer elimination, so it makes
sense to allow this to happen when using Rust. It's not clear to me that
frame pointers have ever been required in the general case.

In rust-lang#61675 it was pointed out that this made reassembling
stack traces easier, which is true, but there is a code generation
option for forcing frame pointers, and I feel the default should not be
to require frame pointers, given it demonstrably makes code size worse
(around 10% in some embedded applications).

The kinds of targets mentioned in rust-lang#61675 are popular, but
should not dictate that code generation should be worse for all RISC-V
targets, especially as there is a way to use CFI information to
reconstruct the stack when the frame pointer is eliminated. It is also
a misconception that `fp` is always used for the frame pointer. `fp` is
an ABI name for `x8` (aka `s0`), and if no frame pointer is required,
`x8` may be used for other callee-saved values.

This commit does ensure that the standard library is built with unwind
tables, so that users do not need to rebuild the standard library in
order to get a backtrace that includes standard library calls (which is
the original reason for forcing frame pointers).
@lenary lenary force-pushed the lenary/riscv-frame-pointers branch from 654f414 to 3da3d15 Compare May 30, 2020 17:24
@lenary
Copy link
Contributor Author

lenary commented May 30, 2020

I have just confirmed in a build on my own machine (of riscv32imc-unknown-none-elf) that libcore-<hash>.rlib does contain .eh_frame information with this change.

Update: This has also removed the code generation issues seen in core::ptr::drop_in_place which is now 2 bytes not 16.

@JohnCSimon
Copy link
Member

ping from triage:
@lenary is this ready for review?

@lenary
Copy link
Contributor Author

lenary commented Jun 6, 2020

Yes, this is ready from my point of view.

@JohnCSimon JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 16, 2020
@Mark-Simulacrum
Copy link
Member

@bors r=hanna-kruppe,Mark-Simulacrum

@bors
Copy link
Contributor

bors commented Jun 16, 2020

📌 Commit 3da3d15 has been approved by hanna-kruppe,Mark-Simulacrum

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 16, 2020
@lenary
Copy link
Contributor Author

lenary commented Jun 16, 2020

Thanks!

@bors
Copy link
Contributor

bors commented Jun 17, 2020

⌛ Testing commit 3da3d15 with merge 2935d29...

Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Jun 17, 2020
… r=hanna-kruppe,Mark-Simulacrum

[RISC-V] Do not force frame pointers

We have been seeing some very inefficient code that went away when using
`-Cforce-frame-pointers=no`. For instance `core::ptr::drop_in_place` at
`-Oz` was compiled into a function which consisted entirely of saving
registers to the stack, then using the frame pointer to restore the same
registers (without any instructions between the prolog and epilog).

The RISC-V LLVM backend supports frame pointer elimination, so it makes
sense to allow this to happen when using Rust. It's not clear to me that
frame pointers have ever been required in the general case.

In rust-lang#61675 it was pointed out that this made reassembling
stack traces easier, which is true, but there is a code generation
option for forcing frame pointers, and I feel the default should not be
to require frame pointers, given it demonstrably makes code size worse
(around 10% in some embedded applications).

The kinds of targets mentioned in rust-lang#61675 are popular, but
should not dictate that code generation should be worse for all RISC-V
targets, especially as there is a way to use CFI information to
reconstruct the stack when the frame pointer is eliminated. It is also
a misconception that `fp` is always used for the frame pointer. `fp` is
an ABI name for `x8` (aka `s0`), and if no frame pointer is required,
`x8` may be used for other callee-saved values.

---

I am partly posting this to get feedback from @fintelia who introduced the change to require frame pointers, and @hanna-kruppe who had issues with the original PR. I would understand if we wanted to remove this setting on only a subset of RISC-V targets, but my preference would be to remove this setting everywhere.

There are more details on the code size savings seen in Tock here: tock/tock#1660
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Jun 17, 2020
… r=hanna-kruppe,Mark-Simulacrum

[RISC-V] Do not force frame pointers

We have been seeing some very inefficient code that went away when using
`-Cforce-frame-pointers=no`. For instance `core::ptr::drop_in_place` at
`-Oz` was compiled into a function which consisted entirely of saving
registers to the stack, then using the frame pointer to restore the same
registers (without any instructions between the prolog and epilog).

The RISC-V LLVM backend supports frame pointer elimination, so it makes
sense to allow this to happen when using Rust. It's not clear to me that
frame pointers have ever been required in the general case.

In rust-lang#61675 it was pointed out that this made reassembling
stack traces easier, which is true, but there is a code generation
option for forcing frame pointers, and I feel the default should not be
to require frame pointers, given it demonstrably makes code size worse
(around 10% in some embedded applications).

The kinds of targets mentioned in rust-lang#61675 are popular, but
should not dictate that code generation should be worse for all RISC-V
targets, especially as there is a way to use CFI information to
reconstruct the stack when the frame pointer is eliminated. It is also
a misconception that `fp` is always used for the frame pointer. `fp` is
an ABI name for `x8` (aka `s0`), and if no frame pointer is required,
`x8` may be used for other callee-saved values.

---

I am partly posting this to get feedback from @fintelia who introduced the change to require frame pointers, and @hanna-kruppe who had issues with the original PR. I would understand if we wanted to remove this setting on only a subset of RISC-V targets, but my preference would be to remove this setting everywhere.

There are more details on the code size savings seen in Tock here: tock/tock#1660
@bors
Copy link
Contributor

bors commented Jun 17, 2020

☀️ Test successful - checks-azure
Approved by: hanna-kruppe,Mark-Simulacrum
Pushing 2935d29 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jun 17, 2020
@bors bors merged commit 2935d29 into rust-lang:master Jun 17, 2020
@lenary lenary deleted the lenary/riscv-frame-pointers branch June 17, 2020 16:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. O-riscv Target: RISC-V architecture S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. WG-embedded Working group: Embedded systems
Projects
None yet
Development

Successfully merging this pull request may close these issues.