Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unreachable code optimization failure when matching on Rust enum #77812

Closed
MSxDOS opened this issue Oct 11, 2020 · 10 comments
Closed

Unreachable code optimization failure when matching on Rust enum #77812

MSxDOS opened this issue Oct 11, 2020 · 10 comments
Assignees
Labels
A-codegen Area: Code generation A-mir-opt Area: MIR optimizations C-enhancement Category: An issue proposing an enhancement or a PR with one. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@MSxDOS
Copy link

MSxDOS commented Oct 11, 2020

In this example:

#[derive(Copy, Clone, Eq, PartialEq)]
pub enum Variant {
    Zero, 
    One,
    Two,
}

#[inline]
fn unreachable() {
    println!("Impossible");
}

extern {
    fn exf1();
    fn exf2();
}

#[no_mangle]
pub static mut GLOBAL: Variant = Variant::Zero;

pub unsafe fn test() {
    let g = GLOBAL;
    if g != Variant::Zero {
        match g {
            Variant::One => exf1(),
            Variant::Two => exf2(),
            Variant::Zero => unreachable(),
        }
    }
}

the unreachable() branch is not removed. Adding any #[repr(n)] but not #[repr(C)] to Variant or making reachable branches call only one of external functions allows the optimization.

There's no such issue with a similar C example using clang so it's likely something on Rust side.

Example links:
Rust https://rust.godbolt.org/z/9oh6sP
C https://godbolt.org/z/Tnbfx6

@rustbot rustbot added A-mir-opt Area: MIR optimizations C-enhancement Category: An issue proposing an enhancement or a PR with one. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 11, 2020
@scottmcm
Copy link
Member

This is weird; LLVM has plenty enough information to be able to do this from what's emitted.

The IR it currently produces: https://rust.godbolt.org/z/GWacEc

define void @_ZN7example4test17h9c70bf34953e7145E() unnamed_addr #0 !dbg !6 {
start:
  %_2.i = alloca %"std::fmt::Arguments", align 8
  %0 = load i8, i8* getelementptr inbounds (<{ [1 x i8] }>, <{ [1 x i8] }>* @GLOBAL, i64 0, i32 0, i64 0), align 1, !dbg !10, !range !11
  %_10.i.i.not = icmp eq i8 %0, 0, !dbg !12
  br i1 %_10.i.i.not, label %bb8, label %bb3, !dbg !22

bb3:                                              ; preds = %start
  %_6 = zext i8 %0 to i64, !dbg !23
  switch i64 %_6, label %bb5 [
    i64 0, label %bb4
    i64 1, label %bb6
    i64 2, label %bb7
  ], !dbg !23

That clearly has sufficient information to know that bb4 is unreachable, but it's not taking advantage of it.

Curiosity: why is rustc insisting on zexting to i64 for the discriminant comparison? That's something rust is doing; it's there even in -O0 https://rust.godbolt.org/z/a1nxzx

@SkiFire13
Copy link
Contributor

It gets even weirder if you put Variant::Zero as the first pattern in the match, notice in particular the two consecutive je https://rust.godbolt.org/z/3K933z

@erikdesjardins
Copy link
Contributor

erikdesjardins commented Oct 11, 2020

This looks like the same problem as #73031. LLVM can't see through the zext in some cases.

rustc extends to i64 because the default discriminant type is isize, even though the memory representation is usually smaller.

Although this can be fixed in LLVM, it may be worthwhile to try making the discr type the same as the memory repr, i.e. do this

/// Finds the appropriate Integer type and signedness for the given
/// signed discriminant range and `#[repr]` attribute.
/// N.B.: `u128` values above `i128::MAX` will be treated as signed, but
/// that shouldn't affect anything, other than maybe debuginfo.
fn repr_discr<'tcx>(
earlier, so this fn
pub fn discr_type(&self) -> attr::IntType {
self.int.unwrap_or(attr::SignedInt(ast::IntTy::Isize))
}
defaults to the smallest type that will fit instead of isize. (I think this would be backwards compatible, you can still as-cast to any integer type, I'm not sure if the underlying discriminant type is exposed anywhere.)

@scottmcm
Copy link
Member

So is the fix here to codegen it using <T as DiscriminantKind>::Discriminant instead of i64?

@scottmcm scottmcm added the A-codegen Area: Code generation label Oct 12, 2020
@erikdesjardins
Copy link
Contributor

<T as DiscriminantKind>::Discriminant is i64, which is the problem: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=3f5f5c4a34d07152653c173a7e3407e4

@tgnottingham
Copy link
Contributor

FYI, #74215 has some discussion about the trade-offs of changing DiscriminantKind::Discriminant to be the smallest possible integer type.

I've also found that a not insignificant amount of hashing in the compiler is being done on discriminant values, so there is a potential compile time improvement in doing this too, though I don't think that should be a primary motivation for the change.

@scottmcm
Copy link
Member

Hmm, there's nothing that would force that type and the HIR->MIR desugaring to use the same internal construct value, though, right? Like we could change the MIR Rvalue::Discriminant to return the same width as it's actually stored, and the discriminant_value intrinsic to [sz]ext to whatever DiscriminantKind wants?

That way LLVM would just see a switch on the loaded value, not an extended one. (So this might still happen if the if g != Variant::Zero { check were replaced with mem::Discriminant one, but that would be a different problem.)

@nikic
Copy link
Contributor

nikic commented Dec 12, 2020

The case is eliminated with LLVM 12 by IPSCCP, probably as a result of https://reviews.llvm.org/D84270: https://llvm.godbolt.org/z/nbfnzc

I would have expected CVP to handle this already before, but it doesn't due to https://github.com/llvm/llvm-project/blob/7beee561e23db56c7e24462a9870a3ba58a9b2b3/llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp#L333-L336.

@nikic
Copy link
Contributor

nikic commented Dec 12, 2020

The CVP issue is addressed by llvm/llvm-project@afbb6d9. Assigning this to myself to re-check this after the LLVM 12 upgrade.

@MSxDOS
Copy link
Author

MSxDOS commented Mar 5, 2021

Assigning this to myself to re-check this after the LLVM 12 upgrade.

AFAICT, the issue has been successfully obliterated by #81451 🎉

@MSxDOS MSxDOS closed this as completed Mar 5, 2021
m-ou-se added a commit to m-ou-se/rust that referenced this issue Mar 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation A-mir-opt Area: MIR optimizations C-enhancement Category: An issue proposing an enhancement or a PR with one. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

7 participants