[codegen] Unnecessary panicking branches in resumption of infinite generator (stored in static variable) #66100

japaric · 2019-11-05T02:32:48Z

Tested with nightly-2019-11-04

Observed behavior

The following no_std program which stores a generator in a static variable and
then resumes it (no indirection / dynamic-dispatch involved):

#![feature(generator_trait)]
#![feature(generators)]
#![feature(never_type)]
#![feature(type_alias_impl_trait)]
#![no_main]
#![no_std]

use core::{mem::MaybeUninit, ops::Generator, pin::Pin};
use cortex_m_rt::{entry, exception};
use panic_halt as _;

type A = impl Generator<Yield = (), Return = !>;

fn foo() -> A {
    || loop {
        yield
    }
}

static mut X: MaybeUninit<A> = MaybeUninit::uninit();

#[entry]
fn main() -> ! {
    unsafe {
        X.as_mut_ptr().write(foo());
    }

    loop {}
}

#[exception]
fn SysTick() {
    unsafe {
        Pin::new_unchecked(&mut *X.as_mut_ptr()).resume();
    }
}

when fully optimized for the thumbv7m-none-eabi target (panic = abort)
produces the following machine code for the SysTick function:

00000400 <SysTick>:
 400:   f240 0000       movw    r0, #0
 404:   f2c2 0000       movt    r0, #8192       ; 0x2000
 408:   6801            ldr     r1, [r0, #0]
 40a:   e8df f001       tbb     [pc, r1]
 40e:   0502            .short  0x0502
 410:   020d            .short  0x020d
 412:   2103            movs    r1, #3
 414:   6001            str     r1, [r0, #0]
 416:   4770            bx      lr
 418:   f240 5030       movw    r0, #1328       ; 0x530
 41c:   2122            movs    r1, #34 ; 0x22
 41e:   f2c0 0000       movt    r0, #0
 422:   f000 f80c       bl      43e <core::panicking::panic> ; <- 1
 426:   defe            udf     #254    ; 0xfe
 428:   f240 5000       movw    r0, #1280       ; 0x500
 42c:   2121            movs    r1, #33 ; 0x21
 42e:   f2c0 0000       movt    r0, #0
 432:   f000 f804       bl      43e <core::panicking::panic> ; <- 2
 436:   defe            udf     #254    ; 0xfe

This function resumes the generator stored in the static variable X. The
generator is extremely simple and contains no explicit panicking branches yet
the machine code contains 2 panicking branches which are not necessary for
memory safety (more on this later).

If the program is slightly modified like this (semantics are not changed):

@@ -7,7 +7,7 @@

 use core::{mem::MaybeUninit, ops::Generator, pin::Pin};
 use cortex_m_rt::{entry, exception};
-use panic_halt as _;
+// use panic_halt as _; // "source-inlined" as the `panic_halt` function at the bottom

 type A = impl Generator<Yield = (), Return = !>;

@@ -34,3 +34,8 @@
         Pin::new_unchecked(&mut *X.as_mut_ptr()).resume();
     }
 }
+
+#[panic_handler]
+fn panic_halt(_: &core::panic::PanicInfo) -> ! {
+    loop {}
+}

Then SysTick is optimized to the following machine code, which is free of
panicking branches:

00000400 <SysTick>:
 400:   f240 0000       movw    r0, #0
 404:   2103            movs    r1, #3
 406:   f2c2 0000       movt    r0, #8192       ; 0x2000
 40a:   6001            str     r1, [r0, #0]
 40c:   4770            bx      lr

Expected behavior

This second machine code (the one with 5 instructions) is what one would expect
from the first version of the program. Changing the shape / complexity of unrelated code
should not introduce panicking branches.

Cause

The Generator.resume function decides which part of the generator body to
execute based on the value of a discriminant. As of nightly-2019-11-04 the code
generated for the resume function contains 2 "built-in" panicking branches,
which are present in all generators:

; Function Attrs: nofree norecurse nounwind
define void @SysTick() unnamed_addr #1 {
start:
  %_3.i = load i32, i32* bitcast (<{ [4 x i8] }>* @_ZN3app1X17h48a9105f802c5ed4E to i32*), align 4
  switch i32 %_3.i, label %bb6.i [
    i32 0, label %"_ZN3app3foo28_$u7b$$u7b$closure$u7d$$u7d$17hcc44d33a27bb7de8E.exit"
    i32 1, label %panic1.i ; <- 1
    i32 2, label %panic.i  ; <- 2
    i32 3, label %"_ZN3app3foo28_$u7b$$u7b$closure$u7d$$u7d$17hcc44d33a27bb7de8E.exit"
  ]
; ..
}

The first built-in panicking branch, which corresponds to discriminant 1, is
used to panic a generator that has been resume-d beyond completion -- a
generator that has reached a return statement has reached completion.

The second built-in branch, which corresponds to discriminant 2, is used to
panic a generator that has been resume-d after panicking -- I'm not sure when
this can occur in practice but looking at the compiler codegen unwinding is required to
reach this "poisoned" state.

However, neither of these situations can occur in the example. The discriminant
of the generator can never become 1 because it does not contain a return
statement in its body. Its discriminant can not become 2 either because the
target has been configured for panic = abort meaning that unwinding is not
supported at the codegen level (UnwindResume functions are never generated). This
can be confirmed by scanning the optimized LLVM IR produced by the compiler: the
discriminant is stored in static memory (.bss) and the only values written to
this memory are 0 and 3.

Possible fixes (?)

Naively, I would improve the MIR codegen by:

Not producing the discriminant == 1 branch if the Return type of the
generator is ! (or, I guess, any other empty type like enum Void {}). This
type indicates that the generator does not contain a return statement.
Not producing the discriminant == 2 branch if the target has set its panic
option to abort. Alternatively, one could check if the unwind-related
Resume function is generated and omit the branch if it's not generated.
Relevant code:

rust/src/librustc_mir/transform/generator.rs

Lines 1061 to 1068 in 881ebeb

    
           // Poison the generator when it unwinds 
        
           for block in body.basic_blocks_mut() { 
        
               let source_info = block.terminator().source_info; 
        
               if let &TerminatorKind::Resume = &block.terminator().kind { 
        
                   block.statements.push( 
        
                       transform.set_discr(VariantIdx::new(POISONED), source_info)); 
        
               } 
        
           }

Does these seem like sensible fixes?

STR

Steps to produce the above machine code. This is not a minified repro because a
certain complexity is required to get the inferior codegen -- trivial programs
get properly optimized.

$ rustup target add thumbv7m-none-eabi
$ rustup component add llvm-tools-preview
$ cargo new --bin app && cd app
$ # or edit Cargo.toml
$ cargo add cortex-m-rt --vers '=0.6.10'
$ cargo add panic-halt --vers '=0.2.0'
$ cat <<EOF >>Cargo.toml
[profile.release]
codegen-units = 1
lto = true
EOF
$ mkdir .cargo
$ cat <<EOF >.cargo/config
[target.thumbv7m-none-eabi]
rustflags = ["-C", "link-arg=-Tlink.x"]
[build]
target = "thumbv7m-none-eabi"
EOF
$ cat <<EOF >memory.x
MEMORY
{
  FLASH : ORIGIN = 0x00000000, LENGTH = 256K
  RAM : ORIGIN = 0x20000000, LENGTH = 64K
}
EOF
$ $EDITOR src/main.rs # enter code shown at the beginning of this issue
$ cargo build --release
$ # or use `arm-none-eabi-objdump` from the `binutils-arm-none-eabi` package (Debian)
$ $(find $(rustc --print sysroot) -name llvm-objdump) -Cd -triple=thumbv7m-none-eabi target/thumbv7m-none-eabi/release/app

The text was updated successfully, but these errors were encountered:

Zoxc · 2019-11-26T08:57:17Z

@tmandry Do you want to fix this?

It should be pretty easy. Don't emit the return case if there's no Return terminator in MIR. Don't emit the unwind case if there's no Resume terminator in MIR.

Zoxc · 2019-11-26T09:22:25Z

I guess the Resume thing is not true since unwinds are implicit and other terminators can unwind too. I guess skipping it if tcx.sess.no_landing_pads() will suffice. This does mean that the generator transformation doesn't poison all the unwind paths currently, just Resume. so that has to be fixed too.

Zoxc · 2019-11-26T09:30:35Z

The generator transformation could be fixed by inserting a basic block with just Resume in it at the start of create_generator_resume_function and change cleanup/unwind edges of terminators with None to point to the new block.

@Zoxc

Smaller and more correct generator codegen This removes unnecessary panicking branches in the resume function when the generator can not return or unwind, respectively. Closes rust-lang#66100 It also addresses the correctness concerns wrt poisoning on unwind. These are not currently a soundness issue because any operation *inside* a generator that could possibly unwind will result in a cleanup path for dropping it, ultimately reaching a `Resume` terminator, which we already handled correctly. Future MIR optimizations might optimize that out, though. r? @Zoxc

@Zoxc

Smaller and more correct generator codegen This removes unnecessary panicking branches in the resume function when the generator can not return or unwind, respectively. Closes rust-lang#66100 It also addresses the correctness concerns wrt poisoning on unwind. These are not currently a soundness issue because any operation *inside* a generator that could possibly unwind will result in a cleanup path for dropping it, ultimately reaching a `Resume` terminator, which we already handled correctly. Future MIR optimizations might optimize that out, though. r? @Zoxc

japaric changed the title ~~[codegen] Unnecessary panicking branches in resumption infinite generator (stored in static variable)~~ [codegen] Unnecessary panicking branches in resumption of infinite generator (stored in static variable) Nov 5, 2019

Mark-Simulacrum added A-async-await Area: Async & Await I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Nov 5, 2019

Centril added F-coroutines `#![feature(coroutines)]` F-type_alias_impl_trait `#[feature(type_alias_impl_trait)]` requires-nightly This issue requires a nightly compiler in some way. labels Nov 5, 2019

jonas-schievink added the A-codegen Area: Code generation label Nov 5, 2019

nikomatsakis added the AsyncAwait-Triaged Async-await issues that have been triaged during a working group meeting. label Nov 12, 2019

jonas-schievink added the A-coroutines Area: Coroutines label Nov 26, 2019

jonas-schievink added C-enhancement Category: An issue proposing an enhancement or a PR with one. WG-embedded Working group: Embedded systems labels Dec 6, 2019

jonas-schievink added the I-heavy Issue: Problems and improvements with respect to binary size of generated code. label Feb 6, 2020

jonas-schievink removed the requires-nightly This issue requires a nightly compiler in some way. label Mar 7, 2020

jonas-schievink self-assigned this Mar 7, 2020

This was referenced Mar 8, 2020

Smaller and more correct generator codegen #69814

Merged

Tracking Issue for Coroutine Memory Usage #69826

Open

bors closed this as completed in 5570a23 Mar 19, 2020

oli-obk moved this to Done in type alias impl trait stabilization Sep 9, 2022

oli-obk added this to type alias impl trait stabilization Sep 9, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[codegen] Unnecessary panicking branches in resumption of infinite generator (stored in static variable) #66100

[codegen] Unnecessary panicking branches in resumption of infinite generator (stored in static variable) #66100

japaric commented Nov 5, 2019

Zoxc commented Nov 26, 2019

Zoxc commented Nov 26, 2019

Zoxc commented Nov 26, 2019

[codegen] Unnecessary panicking branches in resumption of infinite generator (stored in static variable) #66100

[codegen] Unnecessary panicking branches in resumption of infinite generator (stored in static variable) #66100

Comments

japaric commented Nov 5, 2019

Observed behavior

Expected behavior

Cause

Possible fixes (?)

STR

Zoxc commented Nov 26, 2019

Zoxc commented Nov 26, 2019

Zoxc commented Nov 26, 2019