Compile bools to i1, as per Clang #8106

bstrie · 2013-07-29T14:11:27Z

<strcat> aatch: well, can we just do what clang does? does it just say i1 is represented as i8?
<strcat> clang stores them as i8, passes them as i1 zeroext and treats them as i1 after a load
<strcat> lets just copy that
<strcat> that's what LLVM knows how to optimize

The text was updated successfully, but these errors were encountered:

bstrie · 2013-07-29T14:12:11Z

@thestinger has been mentioning this for a long time.

Nominating for Production Ready.

graydon · 2013-07-29T14:23:18Z

I thought we used to do just this, and stopped because "it wasn't what clang did".

thestinger · 2013-07-29T18:34:46Z

@graydon: the issue before was that we were also storing them as i1

brson · 2013-07-31T04:29:58Z

The IRC convo is missing context. Why do we want to do this?

thestinger · 2013-07-31T04:42:46Z

It's mostly because our IR is resulting in poor optimization right now, because LLVM is designed to consume clang's output. The way clang handles this is to truncate down to i1 after a load, and zext to i8 before a store. Since there are range assertions output for the loads, LLVM will optimize the truncation to a no-op, resulting in it branching directly on the value in every case.

We do actually output range assertions now, but they're not triggering the relevant knowledge in LLVM.

There are also currently issues like this, because we aren't truncating to i1:

strcat | rusti: let x = 5 as bool; (x, x == true)
    -- | Notice(rusti): (true, false)

I could try to put together an isolated example where using i1 leads to better results.

graydon · 2013-07-31T14:54:37Z

Oh I see. Carry on then!

catamorphism · 2013-09-12T17:27:33Z

Just a bug, de-nominating

eddyb · 2013-12-12T10:59:57Z

And I found an example where it really stops optimizations:

// optimized to ret 0
fn foo(i: uint) -> uint {if i % 5 == 1 {if i % 4 == 0 {i % 4}else{0}}else{0}}
// not optimized to ret 0
fn foo(i: uint) -> uint {if i % 5 == 1 && i % 4 == 0 {i % 4}else{0}}

// optimized in clang to ret 0, but I think it does short-circuiting differently,
// so it's equivalent to the optimized rustc one
unsigned long foo(unsigned long i) {return (i % 5 == 1 && i % 4 == 0) ? i % 4 : 0;}

I realized the same thing as @thestinger by looking at the output of --emit-llvm.
By manually editing that output to get rid of the two casts to i8, and optimizing it again, I've obtained an improved version.

Update: the missing link in the optimization process is %.demorgan = and i1 %3, %5 implying %5 = icmp eq i64 %4, 0, replacing the use of %.demorgan with %5 produces ret 0.
I've tested with the LLVM version currently bundled with rust, so I believe this last step is a still unsolved LLVM issue.

// this non-short-circuiting example behaves just like the unoptimized rust one,
// with my manual bool optimization applied
unsigned long foo(unsigned long i) {return (i % 5 == 1 & i % 4 == 0) ? i % 4 : 0;}

dotdash · 2014-03-25T22:31:10Z

I have a patch for this, but it's currently blocked on http://llvm.org/bugs/show_bug.cgi?id=19250 -- just to make sure that nobody wastes his time on this.

We currently compiled bools to i8 values, because there was a bug in LLVM that sometimes caused miscompilations when using i1 in, for example, structs. Using i8 means a lot of unnecessary zero-extend and truncate operations though, since we have to convert the value from and to i1 when using for example icmp or br instructions. Besides the unnecessary overhead caused by this, it also sometimes made LLVM miss some optimizations. Fixes rust-lang#8106.

To fix rust-lang#8106, we need an LLVM version that contains r211082 aka 0dee6756 which fixes a bug that blocks that issue. There have been some tiny API changes in LLVM, and cmpxchg changed its return type. The i1 part of the new return type is only interesting when using the new weak cmpxchg, which we don't do.

We currently compiled bools to i8 values, because there was a bug in LLVM that sometimes caused miscompilations when using i1 in, for example, structs. Using i8 means a lot of unnecessary zero-extend and truncate operations though, since we have to convert the value from and to i1 when using for example icmp or br instructions. Besides the unnecessary overhead caused by this, it also sometimes made LLVM miss some optimizations. Fixes rust-lang#8106.

Remind users of separate `rustc` lint groups changelog: none fixes rust-lang#8104

dotdash mentioned this issue Jun 17, 2014

Update to r211089 rust-lang/llvm#7

Closed

bors closed this as completed in 90a9f65 Jun 22, 2014

flip1995 pushed a commit to flip1995/rust that referenced this issue Dec 17, 2021

Auto merge of rust-lang#8106 - medwards:rustc-lint-groups, r=flip1995

5abecad

Remind users of separate `rustc` lint groups changelog: none fixes rust-lang#8104

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Compile bools to i1, as per Clang #8106

Compile bools to i1, as per Clang #8106

bstrie commented Jul 29, 2013

bstrie commented Jul 29, 2013

graydon commented Jul 29, 2013

thestinger commented Jul 29, 2013

brson commented Jul 31, 2013

thestinger commented Jul 31, 2013

graydon commented Jul 31, 2013

catamorphism commented Sep 12, 2013

eddyb commented Dec 12, 2013

dotdash commented Mar 25, 2014

Compile bools to i1, as per Clang #8106

Compile bools to i1, as per Clang #8106

Comments

bstrie commented Jul 29, 2013

bstrie commented Jul 29, 2013

graydon commented Jul 29, 2013

thestinger commented Jul 29, 2013

brson commented Jul 31, 2013

thestinger commented Jul 31, 2013

graydon commented Jul 31, 2013

catamorphism commented Sep 12, 2013

eddyb commented Dec 12, 2013

dotdash commented Mar 25, 2014