Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustc: SIMD types use pointers in Rust's ABI #47743

Merged
merged 1 commit into from
Jan 26, 2018

Conversation

alexcrichton
Copy link
Member

This commit changes the ABI of SIMD types in the "Rust" ABI to unconditionally
be passed via pointers instead of being passed as immediates. This should fix a
longstanding issue, #44367, where SIMD-using programs ended up showing very odd
behavior at runtime because the ABI between functions was mismatched.

As a bit of a recap, this is sort of an LLVM bug and sort of an LLVM feature
(today's behavior). LLVM will generate code for a function solely looking at the
function it's generating, including calls to other functions. Let's then say
you've got something that looks like:

define void @foo() { ; no target features enabled
  call void @bar(<i64 x 4> zeroinitializer)
  ret void
}

define void @bar(<i64 x 4>) #0 { ; enables the AVX feature
  ...
}

LLVM will codegen the call to bar without using AVX registers becauase foo
doesn't have access to these registers. Instead it's generated with emulation
that uses two 128-bit registers. The bar function, on the other hand, will
expect its argument in an AVX register (as it has AVX enabled). This means we've
got a codegen problem!

Comments on #44367 have some more contexutal information but the crux of the
issue is that if we want SIMD to work in general we'll need to ensure that
whenever a function calls another they ABI of the arguments being passed is in
agreement.

One possible solution to this would be to insert "shim functions" where whenever
a target_feature mismatch is detected the compiler inserts a shim function
where you pass arguments via memory to the shim and then the shim loads the
values and calls the target function (where the shim and the target have the
same target features enabled). This unfortunately is quite nontrivial to
implement in rustc today (especially when accounting for function pointers and
such).

This commit takes a different solution, always passing SIMD arguments through
memory instead of passing as immediates. This strategy solves the problem at the
LLVM layer because the ABI between two functions never uses SIMD registers. This
also shouldn't be a hit to performance because SIMD performance is thought to
often rely on inlining anyway, where a call instruction, even if using SIMD
registers, would be disastrous to performance regardless. LLVM should then be
more than capable of fixing all our memory usage to use registers instead after
enough inlining has been performed.

Note that there's a few caveats to this commit though:

  • The "platform intrinsic" ABI is omitted from "always pass via memory". This
    ABI is used to define intrinsics like simd_shuffle4 where LLVM and rustc
    need to have the arguments as an immediate.

  • Additionally this commit does not fix the extern ("C") ABI. This means
    that the bug in repr(simd) is unsound #44367 can still happen when using non-Rust-ABI functions. My
    hope is that before stabilization we can ban and/or warn about SIMD types in
    these functions (as AFAIK there's not much motivation to belong there anyway),
    but I'll leave that for a later commit and if this is merged I'll file a
    follow-up issue.

All in all this...

Closes #44367

@rust-highfive
Copy link
Collaborator

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

@alexcrichton
Copy link
Member Author

cc @BurntSushi, @gnzlbg, @eddyb

if arg.is_ignore() { return; }

match arg.layout.abi {
layout::Abi::Aggregate { .. } => {}
layout::Abi::Vector { .. } if abi != Abi::PlatformIntrinsic => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, intrinsics do end up using this code, I forgot about that.

if is_ret {
arg.make_indirect();
} else {
arg.make_indirect_byval();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want this? If so we should also use it for other indirect arguments below. But I think that if you already have a memory location to pass in, byvalwill make an extra stack copy instead of reusing the original.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh hm so this may or may not be correct, I sort of just assumed that it was naively!

I did find though that this is what clang does, it marks all SIMD arguments as byval along with a pointer indirection behind them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand, Clang often passes vectors as immediates? see https://godbolt.org/g/2cSZJD

Clang may be using byval in some circumstances where the ABI dictates it (I believe byval has some ABI implications that differ from just passing a pointer), but for the "Rust" ABI we have more leeway.

Copy link
Member

@eddyb eddyb Jan 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

byval doesn't change correctness (as long as you're consistent), it's just a (platform) ABI choice, and we just don't do it for the Rust ABI. AFAIK it means "we have a pointer at the IR level but the data behind the pointer is passed on the stack" vs the default "this is a pointer like any other passed in a register (or the stack if we ran out of registers)" - but note that in the default case it's still just the pointer itself passed on the stack, not what it points to.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rkruppe oh interesting! I wonder if this is an integer/float difference? I've been using https://godbolt.org/g/t4qzjA as testing ground which shows using the byval attribute...

Maybe it's 128/256 ABI? Unsure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eddyb ok cool, I'll switch to just make_indirect for now.

@kennytm kennytm added T-lang Relevant to the language team, which will review and decide on the PR/issue. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. A-SIMD Area: SIMD (Single Instruction Multiple Data) labels Jan 25, 2018
@eddyb
Copy link
Member

eddyb commented Jan 25, 2018

cc @rkruppe @nagisa

@eddyb
Copy link
Member

eddyb commented Jan 25, 2018

@bors r+

@bors
Copy link
Contributor

bors commented Jan 25, 2018

📌 Commit 9068e9c has been approved by eddyb

// across this boundary would cause corrupt data to show up.
//
// This problem is fixed by unconditionally passing SIMD
// arguments through memory between callers callees which
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing and ?

@alexcrichton
Copy link
Member Author

@bors: r=eddyb

@bors
Copy link
Contributor

bors commented Jan 25, 2018

📌 Commit ab55f08 has been approved by eddyb

alexcrichton added a commit to alexcrichton/stdarch that referenced this pull request Jan 25, 2018
In rust-lang/rust#47743 the SIMD types in the Rust ABI are being switched to
getting passed through memory unconditionally rather than through LLVM's
immediate registers. This means that a bunch of our assert_instr invocations
started breaking as LLVM has more efficient methods of dealing with memory than
the instructions themselves.

This switches `assert_instr` to unconditionally use a shim that is an `extern`
function which should revert back to the previous behavior, using the simd types
as immediates and testing the same.
@nikomatsakis nikomatsakis assigned eddyb and unassigned nikomatsakis Jan 25, 2018
alexcrichton added a commit to rust-lang/stdarch that referenced this pull request Jan 25, 2018
In rust-lang/rust#47743 the SIMD types in the Rust ABI are being switched to
getting passed through memory unconditionally rather than through LLVM's
immediate registers. This means that a bunch of our assert_instr invocations
started breaking as LLVM has more efficient methods of dealing with memory than
the instructions themselves.

This switches `assert_instr` to unconditionally use a shim that is an `extern`
function which should revert back to the previous behavior, using the simd types
as immediates and testing the same.
@alexcrichton
Copy link
Member Author

@bors: r-

gotta update a codegen test

@kennytm kennytm added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 25, 2018
This commit changes the ABI of SIMD types in the "Rust" ABI to unconditionally
be passed via pointers instead of being passed as immediates. This should fix a
longstanding issue, rust-lang#44367, where SIMD-using programs ended up showing very odd
behavior at runtime because the ABI between functions was mismatched.

As a bit of a recap, this is sort of an LLVM bug and sort of an LLVM feature
(today's behavior). LLVM will generate code for a function solely looking at the
function it's generating, including calls to other functions. Let's then say
you've got something that looks like:

```llvm
define void @foo() { ; no target features enabled
  call void @bar(<i64 x 4> zeroinitializer)
  ret void
}

define void @bar(<i64 x 4>) #0 { ; enables the AVX feature
  ...
}
```

LLVM will codegen the call to `bar` *without* using AVX registers becauase `foo`
doesn't have access to these registers. Instead it's generated with emulation
that uses two 128-bit registers. The `bar` function, on the other hand, will
expect its argument in an AVX register (as it has AVX enabled). This means we've
got a codegen problem!

Comments on rust-lang#44367 have some more contexutal information but the crux of the
issue is that if we want SIMD to work in general we'll need to ensure that
whenever a function calls another they ABI of the arguments being passed is in
agreement.

One possible solution to this would be to insert "shim functions" where whenever
a `target_feature` mismatch is detected the compiler inserts a shim function
where you pass arguments via memory to the shim and then the shim loads the
values and calls the target function (where the shim and the target have the
same target features enabled). This unfortunately is quite nontrivial to
implement in rustc today (especially when accounting for function pointers and
such).

This commit takes a different solution, *always* passing SIMD arguments through
memory instead of passing as immediates. This strategy solves the problem at the
LLVM layer because the ABI between two functions never uses SIMD registers. This
also shouldn't be a hit to performance because SIMD performance is thought to
often rely on inlining anyway, where a `call` instruction, even if using SIMD
registers, would be disastrous to performance regardless. LLVM should then be
more than capable of fixing all our memory usage to use registers instead after
enough inlining has been performed.

Note that there's a few caveats to this commit though:

* The "platform intrinsic" ABI is omitted from "always pass via memory". This
  ABI is used to define intrinsics like `simd_shuffle4` where LLVM and rustc
  need to have the arguments as an immediate.

* Additionally this commit does *not* fix the `extern` ("C") ABI. This means
  that the bug in rust-lang#44367 can still happen when using non-Rust-ABI functions. My
  hope is that before stabilization we can ban and/or warn about SIMD types in
  these functions (as AFAIK there's not much motivation to belong there anyway),
  but I'll leave that for a later commit and if this is merged I'll file a
  follow-up issue.

All in all this...

Closes rust-lang#44367
@alexcrichton
Copy link
Member Author

@bors: r=eddyb

@bors
Copy link
Contributor

bors commented Jan 25, 2018

📌 Commit 502de01 has been approved by eddyb

@bors bors merged commit 502de01 into rust-lang:master Jan 26, 2018
@alexcrichton alexcrichton deleted the simd-always-mem branch January 26, 2018 21:55
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Oct 14, 2018
The issue of passing around SIMD types as values between functions has
seen [quite a lot] of [discussion], and although we thought [we fixed
it][quite a lot] it [wasn't]! This PR is a change to rustc to, again,
try to fix this issue.

The fundamental problem here remains the same, if a SIMD vector argument
is passed by-value in LLVM's function type, then if the caller and
callee disagree on target features a miscompile happens. We solve this
by never passing SIMD vectors by-value, but LLVM will still thwart us
with its argument promotion pass to promote by-ref SIMD arguments to
by-val SIMD arguments.

This commit is an attempt to thwart LLVM thwarting us. We, just before
codegen, will take yet another look at the LLVM module and demote any
by-value SIMD arguments we see. This is a very manual attempt by us to
ensure the codegen for a module keeps working, and it unfortunately is
likely producing suboptimal code, even in release mode. The saving grace
for this, in theory, is that if SIMD types are passed by-value across
a boundary in release mode it's pretty unlikely to be performance
sensitive (as it's already doing a load/store, and otherwise
perf-sensitive bits should be inlined).

The implementation here is basically a big wad of C++. It was largely
copied from LLVM's own argument promotion pass, only doing the reverse.
In local testing this...

Closes rust-lang#50154
Closes rust-lang#52636
Closes rust-lang#54583
Closes rust-lang#55059

[quite a lot]: rust-lang#47743
[discussion]: rust-lang#44367
[wasn't]: rust-lang#50154
bors added a commit that referenced this pull request Oct 14, 2018
rustc: Fix (again) simd vectors by-val in ABI

The issue of passing around SIMD types as values between functions has
seen [quite a lot] of [discussion], and although we thought [we fixed
it][quite a lot] it [wasn't]! This PR is a change to rustc to, again,
try to fix this issue.

The fundamental problem here remains the same, if a SIMD vector argument
is passed by-value in LLVM's function type, then if the caller and
callee disagree on target features a miscompile happens. We solve this
by never passing SIMD vectors by-value, but LLVM will still thwart us
with its argument promotion pass to promote by-ref SIMD arguments to
by-val SIMD arguments.

This commit is an attempt to thwart LLVM thwarting us. We, just before
codegen, will take yet another look at the LLVM module and demote any
by-value SIMD arguments we see. This is a very manual attempt by us to
ensure the codegen for a module keeps working, and it unfortunately is
likely producing suboptimal code, even in release mode. The saving grace
for this, in theory, is that if SIMD types are passed by-value across
a boundary in release mode it's pretty unlikely to be performance
sensitive (as it's already doing a load/store, and otherwise
perf-sensitive bits should be inlined).

The implementation here is basically a big wad of C++. It was largely
copied from LLVM's own argument promotion pass, only doing the reverse.
In local testing this...

Closes #50154
Closes #52636
Closes #54583
Closes #55059

[quite a lot]: #47743
[discussion]: #44367
[wasn't]: #50154
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Oct 16, 2018
The issue of passing around SIMD types as values between functions has
seen [quite a lot] of [discussion], and although we thought [we fixed
it][quite a lot] it [wasn't]! This PR is a change to rustc to, again,
try to fix this issue.

The fundamental problem here remains the same, if a SIMD vector argument
is passed by-value in LLVM's function type, then if the caller and
callee disagree on target features a miscompile happens. We solve this
by never passing SIMD vectors by-value, but LLVM will still thwart us
with its argument promotion pass to promote by-ref SIMD arguments to
by-val SIMD arguments.

This commit is an attempt to thwart LLVM thwarting us. We, just before
codegen, will take yet another look at the LLVM module and demote any
by-value SIMD arguments we see. This is a very manual attempt by us to
ensure the codegen for a module keeps working, and it unfortunately is
likely producing suboptimal code, even in release mode. The saving grace
for this, in theory, is that if SIMD types are passed by-value across
a boundary in release mode it's pretty unlikely to be performance
sensitive (as it's already doing a load/store, and otherwise
perf-sensitive bits should be inlined).

The implementation here is basically a big wad of C++. It was largely
copied from LLVM's own argument promotion pass, only doing the reverse.
In local testing this...

Closes rust-lang#50154
Closes rust-lang#52636
Closes rust-lang#54583
Closes rust-lang#55059

[quite a lot]: rust-lang#47743
[discussion]: rust-lang#44367
[wasn't]: rust-lang#50154
kennytm added a commit to kennytm/rust that referenced this pull request Oct 18, 2018
rustc: Fix (again) simd vectors by-val in ABI

The issue of passing around SIMD types as values between functions has
seen [quite a lot] of [discussion], and although we thought [we fixed
it][quite a lot] it [wasn't]! This PR is a change to rustc to, again,
try to fix this issue.

The fundamental problem here remains the same, if a SIMD vector argument
is passed by-value in LLVM's function type, then if the caller and
callee disagree on target features a miscompile happens. We solve this
by never passing SIMD vectors by-value, but LLVM will still thwart us
with its argument promotion pass to promote by-ref SIMD arguments to
by-val SIMD arguments.

This commit is an attempt to thwart LLVM thwarting us. We, just before
codegen, will take yet another look at the LLVM module and demote any
by-value SIMD arguments we see. This is a very manual attempt by us to
ensure the codegen for a module keeps working, and it unfortunately is
likely producing suboptimal code, even in release mode. The saving grace
for this, in theory, is that if SIMD types are passed by-value across
a boundary in release mode it's pretty unlikely to be performance
sensitive (as it's already doing a load/store, and otherwise
perf-sensitive bits should be inlined).

The implementation here is basically a big wad of C++. It was largely
copied from LLVM's own argument promotion pass, only doing the reverse.
In local testing this...

Closes rust-lang#50154
Closes rust-lang#52636
Closes rust-lang#54583
Closes rust-lang#55059

[quite a lot]: rust-lang#47743
[discussion]: rust-lang#44367
[wasn't]: rust-lang#50154
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Oct 19, 2018
The issue of passing around SIMD types as values between functions has
seen [quite a lot] of [discussion], and although we thought [we fixed
it][quite a lot] it [wasn't]! This PR is a change to rustc to, again,
try to fix this issue.

The fundamental problem here remains the same, if a SIMD vector argument
is passed by-value in LLVM's function type, then if the caller and
callee disagree on target features a miscompile happens. We solve this
by never passing SIMD vectors by-value, but LLVM will still thwart us
with its argument promotion pass to promote by-ref SIMD arguments to
by-val SIMD arguments.

This commit is an attempt to thwart LLVM thwarting us. We, just before
codegen, will take yet another look at the LLVM module and demote any
by-value SIMD arguments we see. This is a very manual attempt by us to
ensure the codegen for a module keeps working, and it unfortunately is
likely producing suboptimal code, even in release mode. The saving grace
for this, in theory, is that if SIMD types are passed by-value across
a boundary in release mode it's pretty unlikely to be performance
sensitive (as it's already doing a load/store, and otherwise
perf-sensitive bits should be inlined).

The implementation here is basically a big wad of C++. It was largely
copied from LLVM's own argument promotion pass, only doing the reverse.
In local testing this...

Closes rust-lang#50154
Closes rust-lang#52636
Closes rust-lang#54583
Closes rust-lang#55059

[quite a lot]: rust-lang#47743
[discussion]: rust-lang#44367
[wasn't]: rust-lang#50154
Manishearth added a commit to Manishearth/rust that referenced this pull request Oct 20, 2018
The issue of passing around SIMD types as values between functions has
seen [quite a lot] of [discussion], and although we thought [we fixed
it][quite a lot] it [wasn't]! This PR is a change to rustc to, again,
try to fix this issue.

The fundamental problem here remains the same, if a SIMD vector argument
is passed by-value in LLVM's function type, then if the caller and
callee disagree on target features a miscompile happens. We solve this
by never passing SIMD vectors by-value, but LLVM will still thwart us
with its argument promotion pass to promote by-ref SIMD arguments to
by-val SIMD arguments.

This commit is an attempt to thwart LLVM thwarting us. We, just before
codegen, will take yet another look at the LLVM module and demote any
by-value SIMD arguments we see. This is a very manual attempt by us to
ensure the codegen for a module keeps working, and it unfortunately is
likely producing suboptimal code, even in release mode. The saving grace
for this, in theory, is that if SIMD types are passed by-value across
a boundary in release mode it's pretty unlikely to be performance
sensitive (as it's already doing a load/store, and otherwise
perf-sensitive bits should be inlined).

The implementation here is basically a big wad of C++. It was largely
copied from LLVM's own argument promotion pass, only doing the reverse.
In local testing this...

Closes rust-lang#50154
Closes rust-lang#52636
Closes rust-lang#54583
Closes rust-lang#55059

[quite a lot]: rust-lang#47743
[discussion]: rust-lang#44367
[wasn't]: rust-lang#50154
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-SIMD Area: SIMD (Single Instruction Multiple Data) S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

repr(simd) is unsound
8 participants