Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustc_mir_transform: Add a local value numbering pass, off by default. #92051

Closed
wants to merge 1 commit into from

Conversation

pcwalton
Copy link
Contributor

This commit adds a local value numbering pass, which is intended to
eventually subsume constant and copy propagation. It's designed to detect
whether a value has been computed multiple times and replace such duplicate
computations with a reference to the originally-computed value. LLVM performs
this optimization, but it's limited because it doesn't have the ability to
reason about immutability of memory the way Rust MIR can. For example, this
pass optimizes the following code:

let s1: &S = ...;
let s2: &mut S = ...;
let mut sum = 0;
sum += s1.a + s1.b;
s2.a = 1;
sum += s1.a + s1.b;

into:

let s1: &S = ...;
let s2: &mut S = ...;
let mut sum = 0;
sum += s1.a + s1.b;
s2.a = 1;
sum += sum;

LLVM won't do this optimization because it can't prove that the assignment to a
field of s2 doesn't mutate the fields of s1.

Because this pass is likely to have bugs and may be slow, I've turned it off by
default for now. I didn't notice any show-stopping bugs in the test suite, but
out of an abundance of caution I'm keeping it off.

Closes #91688.

@rust-highfive
Copy link
Collaborator

r? @davidtwco

(rust-highfive has picked a reviewer for you, use r? to override)

@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Dec 17, 2021
@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 17, 2021
@rust-log-analyzer

This comment has been minimized.

@camelid camelid added the A-mir-opt Area: MIR optimizations label Dec 17, 2021
Cargo.lock Outdated
@@ -2451,9 +2451,9 @@ checksum = "dd20eec3dbe4376829cb7d80ae6ac45e0a766831dca50202ff2d40db46a8a024"

[[package]]
name = "os_info"
version = "3.0.7"
version = "3.0.8"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this intentionally being updated?

Comment on lines +206 to +209
let terminator = match block.terminator {
None => return,
Some(ref mut terminator) => terminator,
};
Copy link
Contributor

@ecstatic-morse ecstatic-morse Dec 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once the MIR is built, Terminator is guaranteed to be Some (unless you've moved from it elsewhere in this pass). You're free strongly encouraged to call the terminator and terminator_mut methods.

Comment on lines +37 to +38
// First, promote mutable locals to immutable ones. This helps our rudimentary alias
// analysis a lot.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The precedent for indirect mutation is to traverse the MIR to looking for any locals that are mutably borrowed at any point. I'm not sure if you need to look at the mutability of a LocalDecl separately, but maybe doing it that way helps you avoid dealing with that failing test case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean, could you elaborate?

Copy link
Contributor

@ecstatic-morse ecstatic-morse Dec 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, see this comment for context. My understanding is that you're using the mutability field of each LocalDecl as a conservative approximation for whether it can be mutated indirectly (through a pointer). Is that correct? What you're doing is fine of course, but it would be slightly more precise to traverse the MIR looking for any local that is the base of an Rvalue::Ref(Mut) or Rvalue::AddrOf. I don't know how much the extra precision will matter.

Copy link
Contributor

@ecstatic-morse ecstatic-morse Dec 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, actually is what you're doing fine? Is the current version correct in the presence of locals with interior mutability?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look at the code to handle UnsafeCell.

Copy link
Contributor

@ecstatic-morse ecstatic-morse Dec 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of this concern is resolved by the fact that we don't have to care about function calls (duh).

One more question: could you explain what would happen in this (incredibly niche) case?

//! I am in libcore/src/cell.rs, so I have access to `UnsafeCell`'s field.

let x  = UnsafeCell { value: 1 }; // `x` is not marked as a mutable local (I think)?
let px = &x;
unsafe { *(px as *mut _).value  = 2 };
read(x.value);

I think you'll handle this correctly due to several conservative assumptions you've made. UnsafeCell is not Copy, for one, and also you bail out on field projections. Is this description accurate or is something more robust stopping these kinds of shenanigans?

Frankly, I suspect there are other MIR transformations that are unsound if direct accesses to the field of an UnsafeCell are allowed, but it doesn't hurt to ask I guess.

Comment on lines +703 to +708
ProjectionElem::Deref => match ty_kind {
ty::Ref(_, inner_ty, Mutability::Not) => {
ty_kind = inner_ty.kind();
}
_ => return false,
},
Copy link
Contributor

@tmiasko tmiasko Dec 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Place behind a shared reference might be mutable?

fn f(x: &mut u32) -> u32 {
    let a = &*x;
    let c = *a;
    *x = 2;
    c
}

fn main() {
    let mut x = 1;
    assert_eq!(f(&mut x), 1);
}
$ rustc +stage1 -O b.rs && ./b
$ rustc +stage1 -O -Znumber_values b.rs && ./b
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `2`,
 right: `1`', b.rs:10:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because of NLL I guess. I suppose we should check to see whether the live ranges of the immutable reference overlap the mutation site. If they don't we have to kill the immutable reference.

Copy link
Contributor Author

@pcwalton pcwalton Apr 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, new plan: contents of an immutable reference are considered immutable if and only if the immutable reference either came from a function parameter or has a use reachable from the current basic block. In your example, a has no uses reachable from the BB, so it would be considered potentially mutable. I think this is sound. It would probably be possible to intertwine this pass with NLL data structures to get precise here, but I'm very hesitant to do that because right now MIR optimizations don't interact with NLL borrow checking.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other words, the reference must not only be immutable, it must be immutably borrowed across the entire basic block.

Here's a modified form of the example that fails to borrow check because it fails the "uses reachable from the statement" check: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=2dcd401a1b807854207c732e2153fc3a

And here's an example that shows that the "function parameter" check is sound because the borrow check rejects an attempt to violate it: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=2dcd401a1b807854207c732e2153fc3a

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidtwco
Copy link
Member

r? @ecstatic-morse

@camelid camelid added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 25, 2022
@nnethercote
Copy link
Contributor

What's the status of this? Looks like it's waiting for a review?

@ecstatic-morse
Copy link
Contributor

ecstatic-morse commented Feb 4, 2022

I'm also excited about this PR. However, it's incorrect in at least one case.

Also, I think the way its detecting indirect mutation is misguided. Local value numbering is perfectly able to handle direct reassignment of variables, but AFAICT this PR refuses to optimize code involving variables that are declared with mut. I think the author is using that as a proxy to detect whether a variable can be mutated through a pointer/reference (I asked for clarification).

Looking for mut declarations is fine (although imprecise) for things without interior mutability. For now, it works even for things with interior mutability because the only statement (not terminator) that can write through a shared reference is a StatementKind::Assign to the field of an UnsafeCell, and that's handled explicitly. Everything else goes through a function call. This assumption no longer holds if this pass were extended to global value numbering (as the module docs suggest), since now you have to handle the effects of terminators (function calls, etc.). There's no mention of this anywhere.

I was going to suggest that the pass to look explicitly for mutable references to locals or shared references to locals whose type has interior mutability (!Freeze). However, I think this might be unsound in the global case since Freeze is shallow?

@ecstatic-morse ecstatic-morse added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 4, 2022
@JohnCSimon JohnCSimon added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 6, 2022
@Dylan-DPC
Copy link
Member

@pcwalton any updates on this?

@bors
Copy link
Contributor

bors commented Mar 18, 2022

☔ The latest upstream changes (presumably #88098) made this pull request unmergeable. Please resolve the merge conflicts.

@pcwalton
Copy link
Contributor Author

pcwalton commented Apr 9, 2022

I think the author is using that as a proxy to detect whether a variable can be mutated through a pointer/reference

Yes, that is the goal.

I was going to suggest that the pass to look explicitly for mutable references to locals or shared references to locals whose type has interior mutability (!Freeze).

I would prefer to keep this patch small and conservative for now. Adding more and more precision to it is something that can be done over time. The perfect shouldn't be the enemy of the good, especially when we don't even know what sound semantics of the perfect should be!

local_info.statement_index_of_last_kill = location.statement_index;
local_info.current_live_range_index += 1;
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: this is where we need to kill immutable references in some cases.

This commit adds a local [value numbering] pass, which is intended to
eventually subsume constant and copy propagation. It's designed to detect
whether a value has been computed multiple times and replace such duplicate
computations with a reference to the originally-computed value. LLVM performs
this optimization, but it's limited because it doesn't have the ability to
reason about immutability of memory the way Rust MIR can. For example, this
pass optimizes [the following code]:

    let s1: &S = ...;
    let s2: &mut S = ...;
    let mut sum = 0;
    sum += s1.a + s1.b;
    s2.a = 1;
    sum += s1.a + s1.b;

into:

    let s1: &S = ...;
    let s2: &mut S = ...;
    let mut sum = 0;
    sum += s1.a + s1.b;
    s2.a = 1;
    sum += sum;

LLVM won't do this optimization because it can't prove that the assignment to a
field of `s2` doesn't mutate the fields of `s1`.

Because this pass is likely to have bugs and may be slow, I've turned it off by
default for now. I didn't notice any show-stopping bugs in the test suite, but
out of an abundance of caution I'm keeping it off.

Closes rust-lang#91688.

[value numbering]: https://en.wikipedia.org/wiki/Value_numbering

[the following code]: https://rust.godbolt.org/z/7f6Gff7Ms
@rust-log-analyzer
Copy link
Collaborator

The job x86_64-gnu-llvm-12 failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
failures:

---- [ui] src/test/ui/destructuring-assignment/slice_destructure.rs stdout ----
normalized stderr:
warning: variable `a` is assigned to, but never used
  --> $DIR/slice_destructure.rs:4:12
   |
LL |   let (mut a, mut b);
   |
   = note: `#[warn(unused_variables)]` on by default
   = note: `#[warn(unused_variables)]` on by default
   = note: consider using `_a` instead

warning: variable `b` is assigned to, but never used
Some tests failed in compiletest suite=ui mode=ui host=x86_64-unknown-linux-gnu target=x86_64-unknown-linux-gnu
  --> $DIR/slice_destructure.rs:4:19
   |
LL |   let (mut a, mut b);
   |
   |
   = note: consider using `_b` instead
warning: value assigned to `a` is never read
  --> $DIR/slice_destructure.rs:5:4
   |
   |
LL |   [a, b] = [0, 1];
   |
   = note: `#[warn(unused_assignments)]` on by default
   = note: `#[warn(unused_assignments)]` on by default
   = help: maybe it is overwritten before being read?

warning: value assigned to `b` is never read
  --> $DIR/slice_destructure.rs:5:7
   |
LL |   [a, b] = [0, 1];
   |
   |
   = help: maybe it is overwritten before being read?
warning: variable does not need to be mutable
  --> $DIR/slice_destructure.rs:4:8
   |
   |
LL |   let (mut a, mut b);
   |        |
   |        help: remove this `mut`
   |
   = note: `#[warn(unused_mut)]` on by default
   = note: `#[warn(unused_mut)]` on by default

warning: variable does not need to be mutable
  --> $DIR/slice_destructure.rs:4:15
   |
LL |   let (mut a, mut b);
   |               |
   |               help: remove this `mut`

warning: variable does not need to be mutable
---
To only update this specific test, also pass `--test-args destructuring-assignment/slice_destructure.rs`

error: 1 errors occurred comparing output.
status: exit status: 0
command: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "/checkout/src/test/ui/destructuring-assignment/slice_destructure.rs" "-Zthreads=1" "--target=x86_64-unknown-linux-gnu" "--error-format" "json" "--json" "future-incompat" "-Ccodegen-units=1" "-Zui-testing" "-Zdeduplicate-diagnostics=no" "-C" "prefer-dynamic" "-o" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/destructuring-assignment/slice_destructure/a" "-Crpath" "-O" "-Cdebuginfo=0" "-Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/destructuring-assignment/slice_destructure/auxiliary"
stdout: none
--- stderr -------------------------------
warning: variable `a` is assigned to, but never used
   |
   |
LL |   let (mut a, mut b);
   |
   = note: `#[warn(unused_variables)]` on by default
   = note: `#[warn(unused_variables)]` on by default
   = note: consider using `_a` instead

warning: variable `b` is assigned to, but never used
   |
   |
LL |   let (mut a, mut b);
   |
   |
   = note: consider using `_b` instead
warning: value assigned to `a` is never read
  --> /checkout/src/test/ui/destructuring-assignment/slice_destructure.rs:5:4
   |
   |
LL |   [a, b] = [0, 1];
   |
   = note: `#[warn(unused_assignments)]` on by default
   = note: `#[warn(unused_assignments)]` on by default
   = help: maybe it is overwritten before being read?

warning: value assigned to `b` is never read
   |
   |
LL |   [a, b] = [0, 1];
   |
   |
   = help: maybe it is overwritten before being read?
warning: variable does not need to be mutable
  --> /checkout/src/test/ui/destructuring-assignment/slice_destructure.rs:4:8
   |
   |
LL |   let (mut a, mut b);
   |        |
   |        help: remove this `mut`
   |
   = note: `#[warn(unused_mut)]` on by default
   = note: `#[warn(unused_mut)]` on by default

warning: variable does not need to be mutable
  --> /checkout/src/test/ui/destructuring-assignment/slice_destructure.rs:4:15
   |
LL |   let (mut a, mut b);
   |               |
   |               help: remove this `mut`

warning: variable does not need to be mutable

@bors
Copy link
Contributor

bors commented May 28, 2022

☔ The latest upstream changes (presumably #97158) made this pull request unmergeable. Please resolve the merge conflicts.

@pcwalton pcwalton closed this Oct 18, 2022
@pcwalton
Copy link
Contributor Author

I'm closing this as it doesn't make sense to work on it until after updated const prop/dest prop land.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-mir-opt Area: MIR optimizations S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Local value numbering for MIR