-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rustc: Implement stack probes for x86 #42816
Conversation
(rust_highfive has picked a reviewer for you, use r? to override) |
This intends to address #16012 for x86/x86_64, and it also shouldn't land until the upstream diff has landed (although this seems likely to do so, so I figured I may as well go ahead and get this reviewed). Note that this is using a fork of our fork temporarily until the patches are upstreamed, this will not land pointing at a different fork of Rust. |
70a3f8f
to
16098e7
Compare
☔ The latest upstream changes (presumably #42682) made this pull request unmergeable. Please resolve the merge conflicts. |
9727e7a
to
0c33a2e
Compare
// our goal here is to take %eax and add it to %rsp, but we're also going to | ||
// touch each page between %rsp+8 and %rsp+8-%rax | ||
// | ||
// The ABI here is that the stack frame size is located in `%eax`. Upon |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
%rax
(also on the line below)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think technically it's %eax
because I'm seeing this before calls to probestack:
mov $0x2008,%eax
Although it's essentiall %rax
as that zeroes out the top bits anyway.
// | ||
// The ABI here is that the stack frame size is located in `%eax`. Upon | ||
// return we're not supposed to modify `%esp` or `%eax`. It's unclear | ||
// whether we can modify caller-saved registers, but to be on the safe side |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to SystemV x86_64 ABI, %rcx
is the 4th function argument, so you'd better save it, no doubt here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, but I think the comment still applies. We could pick any caller-saved register for that scratch space (I think) but I'm not sure what LLVM guarantees here. Figured it'd be good to just stay conservative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could pick any caller-saved register for that scratch space (I think) but I'm not sure what LLVM guarantees here.
I don't think we're on the same page. The probestack function does not get invoked via the standard function call sequence; the prologue code inserted by LLVM consists of just a call
instruction. So anything that's not defined by the ABI as a scratch register has to be saved in it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I might be wrong, let me double-check it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes sorry what I mean here is:
- At the start of a function, all caller-saved registers are scratch space
- Some unknown set of instructions happen
- Then there's
call __rust_probestack
From what I've seen the "unknown set of instructions" is the empty set, but I'm not sure if that's true 100% of the time with LLVM. If that set of instructions is always empty then all caller-saved registers should be scratch space here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How so? %rcx
is used for passing the 4th argument in SysV ABI. If it is not saved by the unknown set of instructions nor __rust_probestack
then how will the callee know the value of its 4th argument?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To illustrate. Let's say we translate this LLVM IR:
declare void @use([40096 x i8]*)
define i32 @test(i32 %a, i32 %b, i32 %c, i32 %d) "probe-stack"="__probestack" {
%array = alloca [40096 x i8], align 16
call void @use([40096 x i8]* %array)
ret i32 %d
}
This results in the following assembly:
test: # @test
pushq %rbx
movl $40096, %eax # imm = 0x9CA0
callq __probestack
subq %rax, %rsp
movl %ecx, %ebx
movq %rsp, %rdi
callq use
movl %ebx, %eax
addq $40096, %rsp # imm = 0x9CA0
popq %rbx
retq
As you can see %ecx
is not saved around the __probestack
call. You have to save it there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is clearly just confusing, I'm going to delete it.
Ah also, I've updated our llvm fork now that patches are upstream and that's here, this should be ready to land. |
#[cfg(target_arch = "x86_64")] | ||
pub unsafe extern fn __rust_probestack() { | ||
// our goal here is to take %eax and add it to %rsp, but we're also going to | ||
// touch each page between %rsp+8 and %rsp+8-%rax |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this comment isn't accurate anymore — you don't actually modify rsp in this function.
#[no_mangle] | ||
#[cfg(all(target_arch = "x86", windows))] | ||
pub unsafe extern fn __rust_probestack() { | ||
// This is similar to 32-bit unix but we're going to actually perform the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure you don't need this function: just omit the probe-stack
attribute on Windows and LLVM will automatically add a call to Microsoft's implementation. It's mandated by the Windows ABI, so LLVM has to do that.
@@ -16,6 +16,7 @@ pub fn target() -> TargetResult { | |||
base.cpu = "x86-64".to_string(); | |||
base.pre_link_args.get_mut(&LinkerFlavor::Gcc).unwrap().push("-m64".to_string()); | |||
base.max_atomic_width = Some(64); | |||
base.stack_probes = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, I don't think you need this.
@@ -15,6 +15,7 @@ pub fn target() -> TargetResult { | |||
let mut base = super::windows_msvc_base::opts(); | |||
base.cpu = "x86-64".to_string(); | |||
base.max_atomic_width = Some(64); | |||
base.stack_probes = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, I don't think you need this.
@@ -15,6 +15,7 @@ pub fn target() -> TargetResult { | |||
let mut base = super::windows_msvc_base::opts(); | |||
base.cpu = "pentium4".to_string(); | |||
base.max_atomic_width = Some(64); | |||
base.stack_probes = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, I don't think you need this.
@@ -21,6 +21,7 @@ pub fn target() -> TargetResult { | |||
// space available to x86 Windows binaries on x86_64. | |||
base.pre_link_args | |||
.get_mut(&LinkerFlavor::Gcc).unwrap().push("-Wl,--large-address-aware".to_string()); | |||
base.stack_probes = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, I don't think you need this.
//! guard page. If a function did not have a stack probe then there's a risk of | ||
//! having a stack frame *larger* than the guard page, so a function call could | ||
//! skip over the guard page entirely and then later hit maybe the heap, | ||
//! possibly leading to a security vulnerability. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
worth maybe doing
//! possibly leading to security vulnerabilities such as [The Stack Clash], for example.
//!
//! [The Stack Clash]: https://blog.qualys.com/securitylabs/2017/06/19/the-stack-clash
// The ABI here is that the stack frame size is located in `%eax`. Upon | ||
// return we're not supposed to modify `%esp` or `%eax`. | ||
asm!(" | ||
push %rcx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think you can use r11 here and avoid this spill, since r11 is a designated scratch register.
//! thread has a guard page then a stack overflow is guaranteed to hit that | ||
//! guard page. If a function did not have a stack probe then there's a risk of | ||
//! having a stack frame *larger* than the guard page, so a function call could | ||
//! skip over the guard page entirely and then later hit maybe the heap, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not just this issue. The Stack Clash vulnerability is not necessarily easy to trigger, especially on amd64, but if you have secondary threads, their stacks is just anonymous memory that is very likely to be close to heap in the first place. Even worse if the threads have small stacks.
//! Note that `#[naked]` is typically used here for the stack probe because the | ||
//! ABI corresponds to no actual ABI. Additionally it means that on Windows we | ||
//! don't have to maintain assembly for MSVC and MinGW, we can just have one | ||
//! blob of assembly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Windows remark is no longer relevant.
cmp $$0x1000,%rax | ||
ja 2b | ||
|
||
// Finish up the last remainings stack space requested, getting the last |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uber-nit: "remainings" -> "remaining"
☔ The latest upstream changes (presumably #42828) made this pull request unmergeable. Please resolve the merge conflicts. |
Sorry for interrupting, I would like to submit a patch for MSP430, but it sits on top of the changes from this pull request. Do I need to wait until it's merged? |
@bors: r=nikomatsakis |
📌 Commit 51b54ca has been approved by |
🔒 Merge conflict |
☔ The latest upstream changes (presumably #42899) made this pull request unmergeable. Please resolve the merge conflicts. |
Will be required for rust-lang/rust#42816
Add `__rust_probestack` intrinsic Will be required for rust-lang/rust#42816
Will be required for rust-lang/rust#42816
Add `__rust_probestack` intrinsic Will be required for rust-lang/rust#42816
This commit implements stack probes on x86/x86_64 using the freshly landed support upstream in LLVM. The purpose of stack probes here are to guarantee a segfault on stack overflow rather than having a chance of running over the guard page already present on all threads by accident. At this time there's no support for any other architecture because LLVM itself does not have support for other architectures.
@bors: r=nikomatsakis |
📌 Commit 5dbd97d has been approved by |
rustc: Implement stack probes for x86 This commit implements stack probes on x86/x86_64 using the freshly landed support upstream in LLVM. The purpose of stack probes here are to guarantee a segfault on stack overflow rather than having a chance of running over the guard page already present on all threads by accident. At this time there's no support for any other architecture because LLVM itself does not have support for other architectures.
☀️ Test successful - status-appveyor, status-travis |
Retry downloading llvm commit tarball As promised on #42816 (comment) r? @alexcrichton
This commit implements stack probes on x86/x86_64 using the freshly landed
support upstream in LLVM. The purpose of stack probes here are to guarantee a
segfault on stack overflow rather than having a chance of running over the guard
page already present on all threads by accident.
At this time there's no support for any other architecture because LLVM itself
does not have support for other architectures.