Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[codegen] Unnecessary panicking branches in resumption of infinite generator (stored in static variable) #66100

Closed
japaric opened this issue Nov 5, 2019 · 3 comments · Fixed by #69814
Assignees
Labels
A-async-await Area: Async & Await A-codegen Area: Code generation A-coroutines Area: Coroutines AsyncAwait-Triaged Async-await issues that have been triaged during a working group meeting. C-enhancement Category: An issue proposing an enhancement or a PR with one. F-coroutines `#![feature(coroutines)]` F-type_alias_impl_trait `#[feature(type_alias_impl_trait)]` I-heavy Issue: Problems and improvements with respect to binary size of generated code. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-embedded Working group: Embedded systems

Comments

@japaric
Copy link
Member

japaric commented Nov 5, 2019

Tested with nightly-2019-11-04

Observed behavior

The following no_std program which stores a generator in a static variable and
then resumes it (no indirection / dynamic-dispatch involved):

#![feature(generator_trait)]
#![feature(generators)]
#![feature(never_type)]
#![feature(type_alias_impl_trait)]
#![no_main]
#![no_std]

use core::{mem::MaybeUninit, ops::Generator, pin::Pin};
use cortex_m_rt::{entry, exception};
use panic_halt as _;

type A = impl Generator<Yield = (), Return = !>;

fn foo() -> A {
    || loop {
        yield
    }
}

static mut X: MaybeUninit<A> = MaybeUninit::uninit();

#[entry]
fn main() -> ! {
    unsafe {
        X.as_mut_ptr().write(foo());
    }

    loop {}
}

#[exception]
fn SysTick() {
    unsafe {
        Pin::new_unchecked(&mut *X.as_mut_ptr()).resume();
    }
}

when fully optimized for the thumbv7m-none-eabi target (panic = abort)
produces the following machine code for the SysTick function:

00000400 <SysTick>:
 400:   f240 0000       movw    r0, #0
 404:   f2c2 0000       movt    r0, #8192       ; 0x2000
 408:   6801            ldr     r1, [r0, #0]
 40a:   e8df f001       tbb     [pc, r1]
 40e:   0502            .short  0x0502
 410:   020d            .short  0x020d
 412:   2103            movs    r1, #3
 414:   6001            str     r1, [r0, #0]
 416:   4770            bx      lr
 418:   f240 5030       movw    r0, #1328       ; 0x530
 41c:   2122            movs    r1, #34 ; 0x22
 41e:   f2c0 0000       movt    r0, #0
 422:   f000 f80c       bl      43e <core::panicking::panic> ; <- 1
 426:   defe            udf     #254    ; 0xfe
 428:   f240 5000       movw    r0, #1280       ; 0x500
 42c:   2121            movs    r1, #33 ; 0x21
 42e:   f2c0 0000       movt    r0, #0
 432:   f000 f804       bl      43e <core::panicking::panic> ; <- 2
 436:   defe            udf     #254    ; 0xfe

This function resumes the generator stored in the static variable X. The
generator is extremely simple and contains no explicit panicking branches yet
the machine code contains 2 panicking branches which are not necessary for
memory safety (more on this later).

If the program is slightly modified like this (semantics are not changed):

@@ -7,7 +7,7 @@

 use core::{mem::MaybeUninit, ops::Generator, pin::Pin};
 use cortex_m_rt::{entry, exception};
-use panic_halt as _;
+// use panic_halt as _; // "source-inlined" as the `panic_halt` function at the bottom

 type A = impl Generator<Yield = (), Return = !>;

@@ -34,3 +34,8 @@
         Pin::new_unchecked(&mut *X.as_mut_ptr()).resume();
     }
 }
+
+#[panic_handler]
+fn panic_halt(_: &core::panic::PanicInfo) -> ! {
+    loop {}
+}

Then SysTick is optimized to the following machine code, which is free of
panicking branches:

00000400 <SysTick>:
 400:   f240 0000       movw    r0, #0
 404:   2103            movs    r1, #3
 406:   f2c2 0000       movt    r0, #8192       ; 0x2000
 40a:   6001            str     r1, [r0, #0]
 40c:   4770            bx      lr

Expected behavior

This second machine code (the one with 5 instructions) is what one would expect
from the first version of the program. Changing the shape / complexity of unrelated code
should not introduce panicking branches.

Cause

The Generator.resume function decides which part of the generator body to
execute based on the value of a discriminant. As of nightly-2019-11-04 the code
generated for the resume function contains 2 "built-in" panicking branches,
which are present in all generators:

; Function Attrs: nofree norecurse nounwind
define void @SysTick() unnamed_addr #1 {
start:
  %_3.i = load i32, i32* bitcast (<{ [4 x i8] }>* @_ZN3app1X17h48a9105f802c5ed4E to i32*), align 4
  switch i32 %_3.i, label %bb6.i [
    i32 0, label %"_ZN3app3foo28_$u7b$$u7b$closure$u7d$$u7d$17hcc44d33a27bb7de8E.exit"
    i32 1, label %panic1.i ; <- 1
    i32 2, label %panic.i  ; <- 2
    i32 3, label %"_ZN3app3foo28_$u7b$$u7b$closure$u7d$$u7d$17hcc44d33a27bb7de8E.exit"
  ]
; ..
}

The first built-in panicking branch, which corresponds to discriminant 1, is
used to panic a generator that has been resume-d beyond completion -- a
generator that has reached a return statement has reached completion.

The second built-in branch, which corresponds to discriminant 2, is used to
panic a generator that has been resume-d after panicking -- I'm not sure when
this can occur in practice but looking at the compiler codegen unwinding is required to
reach this "poisoned" state.

However, neither of these situations can occur in the example. The discriminant
of the generator can never become 1 because it does not contain a return
statement in its body. Its discriminant can not become 2 either because the
target has been configured for panic = abort meaning that unwinding is not
supported at the codegen level (UnwindResume functions are never generated). This
can be confirmed by scanning the optimized LLVM IR produced by the compiler: the
discriminant is stored in static memory (.bss) and the only values written to
this memory are 0 and 3.

Possible fixes (?)

Naively, I would improve the MIR codegen by:

  • Not producing the discriminant == 1 branch if the Return type of the
    generator is ! (or, I guess, any other empty type like enum Void {}). This
    type indicates that the generator does not contain a return statement.

  • Not producing the discriminant == 2 branch if the target has set its panic
    option to abort. Alternatively, one could check if the unwind-related
    Resume function is generated and omit the branch if it's not generated.
    Relevant code:

// Poison the generator when it unwinds
for block in body.basic_blocks_mut() {
let source_info = block.terminator().source_info;
if let &TerminatorKind::Resume = &block.terminator().kind {
block.statements.push(
transform.set_discr(VariantIdx::new(POISONED), source_info));
}
}

Does these seem like sensible fixes?


STR

Steps to produce the above machine code. This is not a minified repro because a
certain complexity is required to get the inferior codegen -- trivial programs
get properly optimized.

$ rustup target add thumbv7m-none-eabi
$ rustup component add llvm-tools-preview
$ cargo new --bin app && cd app
$ # or edit Cargo.toml
$ cargo add cortex-m-rt --vers '=0.6.10'
$ cargo add panic-halt --vers '=0.2.0'
$ cat <<EOF >>Cargo.toml
[profile.release]
codegen-units = 1
lto = true
EOF
$ mkdir .cargo
$ cat <<EOF >.cargo/config
[target.thumbv7m-none-eabi]
rustflags = ["-C", "link-arg=-Tlink.x"]
[build]
target = "thumbv7m-none-eabi"
EOF
$ cat <<EOF >memory.x
MEMORY
{
  FLASH : ORIGIN = 0x00000000, LENGTH = 256K
  RAM : ORIGIN = 0x20000000, LENGTH = 64K
}
EOF
$ $EDITOR src/main.rs # enter code shown at the beginning of this issue
$ cargo build --release
$ # or use `arm-none-eabi-objdump` from the `binutils-arm-none-eabi` package (Debian)
$ $(find $(rustc --print sysroot) -name llvm-objdump) -Cd -triple=thumbv7m-none-eabi target/thumbv7m-none-eabi/release/app
@japaric japaric changed the title [codegen] Unnecessary panicking branches in resumption infinite generator (stored in static variable) [codegen] Unnecessary panicking branches in resumption of infinite generator (stored in static variable) Nov 5, 2019
@Mark-Simulacrum Mark-Simulacrum added A-async-await Area: Async & Await I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Nov 5, 2019
@Centril Centril added F-coroutines `#![feature(coroutines)]` F-type_alias_impl_trait `#[feature(type_alias_impl_trait)]` requires-nightly This issue requires a nightly compiler in some way. labels Nov 5, 2019
@jonas-schievink jonas-schievink added the A-codegen Area: Code generation label Nov 5, 2019
@nikomatsakis nikomatsakis added the AsyncAwait-Triaged Async-await issues that have been triaged during a working group meeting. label Nov 12, 2019
@Zoxc
Copy link
Contributor

Zoxc commented Nov 26, 2019

@tmandry Do you want to fix this?

It should be pretty easy. Don't emit the return case if there's no Return terminator in MIR. Don't emit the unwind case if there's no Resume terminator in MIR.

@Zoxc
Copy link
Contributor

Zoxc commented Nov 26, 2019

I guess the Resume thing is not true since unwinds are implicit and other terminators can unwind too. I guess skipping it if tcx.sess.no_landing_pads() will suffice. This does mean that the generator transformation doesn't poison all the unwind paths currently, just Resume. so that has to be fixed too.

@Zoxc
Copy link
Contributor

Zoxc commented Nov 26, 2019

The generator transformation could be fixed by inserting a basic block with just Resume in it at the start of create_generator_resume_function and change cleanup/unwind edges of terminators with None to point to the new block.

@jonas-schievink jonas-schievink added the A-coroutines Area: Coroutines label Nov 26, 2019
@jonas-schievink jonas-schievink added C-enhancement Category: An issue proposing an enhancement or a PR with one. WG-embedded Working group: Embedded systems labels Dec 6, 2019
@jonas-schievink jonas-schievink added the I-heavy Issue: Problems and improvements with respect to binary size of generated code. label Feb 6, 2020
@jonas-schievink jonas-schievink removed the requires-nightly This issue requires a nightly compiler in some way. label Mar 7, 2020
@jonas-schievink jonas-schievink self-assigned this Mar 7, 2020
Centril added a commit to Centril/rust that referenced this issue Mar 18, 2020
Smaller and more correct generator codegen

This removes unnecessary panicking branches in the resume function when the generator can not return or unwind, respectively.

Closes rust-lang#66100

It also addresses the correctness concerns wrt poisoning on unwind. These are not currently a soundness issue because any operation *inside* a generator that could possibly unwind will result in a cleanup path for dropping it, ultimately reaching a `Resume` terminator, which we already handled correctly. Future MIR optimizations might optimize that out, though.

r? @Zoxc
Centril added a commit to Centril/rust that referenced this issue Mar 19, 2020
Smaller and more correct generator codegen

This removes unnecessary panicking branches in the resume function when the generator can not return or unwind, respectively.

Closes rust-lang#66100

It also addresses the correctness concerns wrt poisoning on unwind. These are not currently a soundness issue because any operation *inside* a generator that could possibly unwind will result in a cleanup path for dropping it, ultimately reaching a `Resume` terminator, which we already handled correctly. Future MIR optimizations might optimize that out, though.

r? @Zoxc
@bors bors closed this as completed in 5570a23 Mar 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-async-await Area: Async & Await A-codegen Area: Code generation A-coroutines Area: Coroutines AsyncAwait-Triaged Async-await issues that have been triaged during a working group meeting. C-enhancement Category: An issue proposing an enhancement or a PR with one. F-coroutines `#![feature(coroutines)]` F-type_alias_impl_trait `#[feature(type_alias_impl_trait)]` I-heavy Issue: Problems and improvements with respect to binary size of generated code. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-embedded Working group: Embedded systems
Development

Successfully merging a pull request may close this issue.

6 participants