Unreachable code optimization failure when matching on Rust enum #77812

MSxDOS · 2020-10-11T05:05:58Z

In this example:

#[derive(Copy, Clone, Eq, PartialEq)]
pub enum Variant {
    Zero,
    One,
    Two,
}

#[inline]
fn unreachable() {
    println!("Impossible");
}

extern {
    fn exf1();
    fn exf2();
}

#[no_mangle]
pub static mut GLOBAL: Variant = Variant::Zero;

pub unsafe fn test() {
    let g = GLOBAL;
    if g != Variant::Zero {
        match g {
            Variant::One => exf1(),
            Variant::Two => exf2(),
            Variant::Zero => unreachable(),
        }
    }
}

the unreachable() branch is not removed. Adding any #[repr(n)] (but not #[repr(C)]) to Variant, or making the reachable branches call only one of the external functions, allows the optimization.
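For anyone who wants to try the #[repr(n)] observation, here is a minimal sketch. It makes several assumptions that are not in the report: #[repr(u8)] is just one possible choice of n, `dispatch` is a hypothetical name, `g` is passed as a parameter instead of being read from GLOBAL, and the external exf1/exf2 are replaced by local stand-ins that return a string so the snippet is self-contained. Because the stand-ins are visible to the optimizer, the generated code will not match the Godbolt links; swap the extern declarations back in to compare assembly.

```rust
// Same enum as in the report, with an explicit integer representation added.
// `u8` is an assumption; the report only says "any #[repr(n)]".
#[repr(u8)]
#[derive(Copy, Clone, Eq, PartialEq, Debug)]
pub enum Variant {
    Zero,
    One,
    Two,
}

// Hypothetical stand-ins for the external functions `exf1` and `exf2`.
fn exf1() -> &'static str {
    "exf1"
}

fn exf2() -> &'static str {
    "exf2"
}

// Mirrors the control flow of `test()` from the report.
pub fn dispatch(g: Variant) -> Option<&'static str> {
    if g != Variant::Zero {
        Some(match g {
            Variant::One => exf1(),
            Variant::Two => exf2(),
            // Logically dead: `g != Variant::Zero` was checked above.
            Variant::Zero => unreachable!("Impossible"),
        })
    } else {
        None
    }
}
```

Calling `dispatch(Variant::One)` goes through the first arm, and `dispatch(Variant::Zero)` never enters the match at all.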

There's no such issue with a similar C example using clang, so it's likely something on the Rust side.

Example links:
Rust https://rust.godbolt.org/z/9oh6sP
C https://godbolt.org/z/Tnbfx6


scottmcm · 2020-10-11T07:52:35Z

This is weird; LLVM has plenty enough information to be able to do this from what's emitted.

The IR it currently produces: https://rust.godbolt.org/z/GWacEc

define void @_ZN7example4test17h9c70bf34953e7145E() unnamed_addr #0 !dbg !6 {
start:
  %_2.i = alloca %"std::fmt::Arguments", align 8
  %0 = load i8, i8* getelementptr inbounds (<{ [1 x i8] }>, <{ [1 x i8] }>* @GLOBAL, i64 0, i32 0, i64 0), align 1, !dbg !10, !range !11
  %_10.i.i.not = icmp eq i8 %0, 0, !dbg !12
  br i1 %_10.i.i.not, label %bb8, label %bb3, !dbg !22

bb3:                                              ; preds = %start
  %_6 = zext i8 %0 to i64, !dbg !23
  switch i64 %_6, label %bb5 [
    i64 0, label %bb4
    i64 1, label %bb6
    i64 2, label %bb7
  ], !dbg !23

That clearly has sufficient information to know that bb4 is unreachable, but it's not taking advantage of it.

Curiosity: why is rustc insisting on zexting to i64 for the discriminant comparison? That's something rust is doing; it's there even in -O0 https://rust.godbolt.org/z/a1nxzx

SkiFire13 · 2020-10-11T11:25:41Z

It gets even weirder if you put Variant::Zero as the first pattern in the match. Notice in particular the two consecutive je instructions: https://rust.godbolt.org/z/3K933z

erikdesjardins · 2020-10-11T19:14:47Z

This looks like the same problem as #73031. LLVM can't see through the zext in some cases.

rustc extends to i64 because the default discriminant type is isize, even though the memory representation is usually smaller.
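To make the gap between the two widths concrete, here is a small sketch. The helper names are hypothetical, and the one-byte size is what current rustc picks for a three-variant fieldless enum with the default representation; it is not a language guarantee.

```rust
use std::mem::size_of;

#[derive(Copy, Clone, Eq, PartialEq)]
pub enum Variant {
    Zero,
    One,
    Two,
}

/// Size in bytes of the enum as it is stored in memory.
pub fn stored_size() -> usize {
    size_of::<Variant>()
}

/// Size in bytes of `isize`, the default discriminant type.
pub fn default_discr_size() -> usize {
    size_of::<isize>()
}
```

On a 64-bit target the second value is 8, which is where the zext from i8 to i64 in the IR comes from.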

Although this can be fixed in LLVM, it may be worthwhile to try making the discr type the same as the memory repr, i.e. do this

rust/compiler/rustc_middle/src/ty/layout.rs

Lines 74 to 78 in 8cc82ee

    
/// Finds the appropriate Integer type and signedness for the given
/// signed discriminant range and `#[repr]` attribute.
/// N.B.: `u128` values above `i128::MAX` will be treated as signed, but
/// that shouldn't affect anything, other than maybe debuginfo.
fn repr_discr<'tcx>(

earlier, so this fn

rust/compiler/rustc_middle/src/ty/mod.rs

Lines 2290 to 2292 in 8cc82ee

    
pub fn discr_type(&self) -> attr::IntType {
    self.int.unwrap_or(attr::SignedInt(ast::IntTy::Isize))
}

defaults to the smallest type that will fit instead of isize. (I think this would be backwards compatible: you can still as-cast to any integer type, though I'm not sure whether the underlying discriminant type is exposed anywhere.)
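As an illustration of the as-casts mentioned here, a short sketch with hypothetical helper names. It only shows that a fieldless enum can be cast to integer types of different widths and that the result is the variant's discriminant value; it makes no claim about what a changed default discriminant type would do to other code.

```rust
#[derive(Copy, Clone)]
pub enum Variant {
    Zero,
    One,
    Two,
}

// Each helper casts the enum to a different integer type.
pub fn as_u8(v: Variant) -> u8 {
    v as u8
}

pub fn as_i64(v: Variant) -> i64 {
    v as i64
}

pub fn as_u128(v: Variant) -> u128 {
    v as u128
}
```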

scottmcm · 2020-10-12T00:57:48Z

So is the fix here to codegen it using <T as DiscriminantKind>::Discriminant instead of i64?

erikdesjardins · 2020-10-12T01:13:47Z

<T as DiscriminantKind>::Discriminant is isize (i64 on 64-bit targets), which is the problem: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=3f5f5c4a34d07152653c173a7e3407e4

tgnottingham · 2020-10-12T20:50:07Z

FYI, #74215 has some discussion about the trade-offs of changing DiscriminantKind::Discriminant to be the smallest possible integer type.

I've also found that a not insignificant amount of hashing in the compiler is being done on discriminant values, so there is a potential compile time improvement in doing this too, though I don't think that should be a primary motivation for the change.

scottmcm · 2020-10-12T22:39:10Z

Hmm, there's nothing that would force that type and the HIR->MIR desugaring to use the same internal construct value, though, right? Like we could change the MIR Rvalue::Discriminant to return the same width as it's actually stored, and the discriminant_value intrinsic to [sz]ext to whatever DiscriminantKind wants?

That way LLVM would just see a switch on the loaded value, not an extended one. (So this might still happen if the if g != Variant::Zero { check were replaced with a mem::Discriminant-based one, but that would be a different problem.)
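For reference, this is roughly what a mem::Discriminant-based check would look like at the source level. It is a sketch with a hypothetical helper name, using the stable std::mem::discriminant function; it shows only the form of the comparison and says nothing about how it is optimized.

```rust
use std::mem::discriminant;

#[derive(Copy, Clone, Eq, PartialEq)]
pub enum Variant {
    Zero,
    One,
    Two,
}

/// Compares discriminants via `mem::discriminant` instead of `!=` on the enum.
pub fn is_not_zero(g: Variant) -> bool {
    discriminant(&g) != discriminant(&Variant::Zero)
}
```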

nikic · 2020-12-12T19:15:26Z

The case is eliminated with LLVM 12 by IPSCCP, probably as a result of https://reviews.llvm.org/D84270: https://llvm.godbolt.org/z/nbfnzc

I would have expected CVP to handle this already before, but it doesn't due to https://github.com/llvm/llvm-project/blob/7beee561e23db56c7e24462a9870a3ba58a9b2b3/llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp#L333-L336.

nikic · 2020-12-12T20:15:11Z

The CVP issue is addressed by llvm/llvm-project@afbb6d9. Assigning this to myself to re-check this after the LLVM 12 upgrade.

MSxDOS · 2021-03-05T13:15:27Z

"Assigning this to myself to re-check this after the LLVM 12 upgrade."

AFAICT, the issue has been successfully obliterated by #81451 🎉

Add codegen tests for some issues closed by LLVM 12 Namely rust-lang#73031, rust-lang#75546, and rust-lang#77812

scottmcm added the A-codegen Area: Code generation label Oct 12, 2020

erikdesjardins mentioned this issue Dec 12, 2020

Unneeded call to panic!() #73031

Closed

nikic self-assigned this Dec 12, 2020

MSxDOS closed this as completed Mar 5, 2021

erikdesjardins mentioned this issue Mar 7, 2021

Add codegen tests for some issues closed by LLVM 12 #82874

Merged

m-ou-se added a commit to m-ou-se/rust that referenced this issue Mar 8, 2021

Rollup merge of rust-lang#82874 - erikdesjardins:cgtests, r=nagisa

a5035c9

Add codegen tests for some issues closed by LLVM 12 Namely rust-lang#73031, rust-lang#75546, and rust-lang#77812
