Reusing allocation addresses should imply a happens-before relationship #3450
Hm, I've never heard that the allocator would be required to establish happens-before relationships when doing address reuse. The compiler already can't move things that access this memory down below the `free` or up across the `malloc`, so it may be sufficient for the allocator to use relaxed operations. Would it help to make these accesses `Relaxed`?
No, because those could still load an old value.
If there is no happens-before, then writes from before the deallocation are not guaranteed to be visible to the thread that reuses the address, so even a relaxed atomic load could return an outdated value.
I am not sure I fully understand what happens. So we're reading an outdated `owner` value?
Yes, exactly.
No, because that read would still race with the write of the count in the other thread (or at least miri would think so).
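To make the pattern under discussion concrete, here is a minimal sketch, assuming (this is not std's actual `ReentrantLock` source) that the lock stores the address of a TLS variable as the owner id and checks it with a `Relaxed` load:

```rust
use std::sync::atomic::{AtomicUsize, Ordering::Relaxed};

static OWNER: AtomicUsize = AtomicUsize::new(0);

thread_local! {
    // The address of this thread-local serves as a unique id for the
    // current thread while the thread is alive.
    static TID: u8 = const { 0 };
}

fn current_thread_id() -> usize {
    TID.with(|t| t as *const u8 as usize)
}

fn already_owned() -> bool {
    // A Relaxed load may return a stale `owner` value. If the TLS storage
    // of a dead thread is reused, the stale value can equal this thread's
    // id, and the thread wrongly concludes it already holds the lock,
    // unless address reuse itself implies a happens-before edge that rules
    // the stale read out.
    OWNER.load(Relaxed) == current_thread_id()
}

fn main() {
    assert!(!already_owned()); // nobody owns the "lock" yet
    OWNER.store(current_thread_id(), Relaxed);
    assert!(already_owned());
}
```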
Is it possible to reproduce this without involving `ReentrantLock`? I thought something like this would do it, but there doesn't seem to be a race here.

```rust
use std::thread;

// Just a type with a rare size to make it more likely that we will
// reuse the right allocation.
type T = [u8; 7];

fn thread() {
    for _ in 0..100 {
        let b = Box::new(T::default());
        drop(b);
        thread::yield_now();
    }
}

fn main() {
    let j1 = thread::spawn(thread);
    let j2 = thread::spawn(thread);
    j1.join().unwrap();
    j2.join().unwrap();
}
```
Ah, I guess it's not enough to just reuse the memory, as the data race detector will consider the old and new allocation to have nothing to do with each other. It's about using (abusing?) the address as a signal for synchronization on another shared location. I am honestly not convinced that your reasoning is sound here; it sounds extremely sketchy to me.
Cc @rust-lang/opsem any opinions on this? I feel like if we're changing Miri to accommodate this, we at least need to also document this as a requirement for our allocator traits. I don't know enough about allocators to judge whether there's a risk of a real-world allocator violating this property.
In this case it's a thread-local; that wouldn't even go through a regular allocator, right? I have no idea how the memory backing thread-locals is managed, some sort of page table magic as part of thread creation/destruction or so? Where would the happens-before come from there?
musl libc either puts TLS on the (mmapped) stack or uses malloc to allocate it.
The kernel would have a happens-before between the munmap and mmap, right? It will only reuse an mmap region after it is certain all threads have flushed the TLB entries for the old mapping (which requires synchronization). Same with a memory allocator, which won't reuse a memory location until after it has synchronized with the free call.
Yes, this is invariant over all correct memory allocators. In terms of hardware, if the same address is handed out to another thread, that thread can only have obtained it by observing the deallocation through some synchronized access to the allocator's state.
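For illustration, here is a minimal sketch, assuming a simplified lock-free free-list allocator (not any real allocator's code, and ignoring ABA and block-sizing concerns): freeing pushes with a Release CAS and allocating pops with an Acquire load, so the thread that reuses an address synchronizes-with the thread that freed it.

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering::{Acquire, Relaxed, Release}};

struct Node {
    next: *mut Node,
}

static FREE_LIST: AtomicPtr<Node> = AtomicPtr::new(ptr::null_mut());

// "free": push the block with a Release CAS, publishing every write the
// freeing thread made to the block before this point.
unsafe fn dealloc_block(block: *mut Node) {
    let mut head = FREE_LIST.load(Relaxed);
    loop {
        (*block).next = head;
        match FREE_LIST.compare_exchange_weak(head, block, Release, Relaxed) {
            Ok(_) => return,
            Err(h) => head = h,
        }
    }
}

// "malloc": pop with Acquire, so the allocating thread synchronizes-with
// whichever thread pushed (freed) the block it receives.
unsafe fn alloc_block() -> *mut Node {
    loop {
        let head = FREE_LIST.load(Acquire);
        if head.is_null() {
            return ptr::null_mut();
        }
        let next = (*head).next;
        if FREE_LIST.compare_exchange_weak(head, next, Acquire, Relaxed).is_ok() {
            return head;
        }
    }
}

fn main() {
    unsafe {
        // Demonstration: "free" a block, then "reallocate" the same address.
        let block = Box::into_raw(Box::new(Node { next: ptr::null_mut() }));
        dealloc_block(block);
        let reused = alloc_block();
        assert_eq!(reused, block);
        drop(Box::from_raw(reused));
    }
}
```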
Separately, doesn't address reuse mean that `ReentrantLock` ownership is transferred to a different thread when a thread exits without releasing the lock? That doesn't seem right either.
It's the same thing: as long as there is a happens-before relationship between the deallocation and allocation, that is totally sound.
Sure, it might be sound, but is it safe? It seems wrong that you can have multiple threads locking the same lock without unlocking it first (EDIT: i.e., it is inconsistent with the guarantees documented in `ReentrantLock`).
@joboet is there a way to extract this into a self-contained testcase that does not rely on `ReentrantLock`?
Yes! This one works for any seed and will only succeed if the happens-before relationship is formed:

```rust
#![feature(strict_provenance)]
#![feature(sync_unsafe_cell)]

use std::cell::SyncUnsafeCell;
use std::sync::atomic::{AtomicUsize, Ordering::Relaxed};
use std::thread;

static ADDR: AtomicUsize = AtomicUsize::new(0);
static VAL: SyncUnsafeCell<i32> = SyncUnsafeCell::new(0);

fn addr() -> usize {
    let alloc = Box::new(42);
    <*const i32>::addr(&*alloc)
}

fn thread1() {
    unsafe {
        VAL.get().write(42);
    }
    let alloc = addr();
    ADDR.store(alloc, Relaxed);
}

fn thread2() -> bool {
    // We try to get an allocation at the same address. If we fail too often,
    // try again with a different allocation to ensure it is still in Miri's
    // reuse cache.
    for _ in 0..16 {
        let alloc = addr();
        let addr = ADDR.load(Relaxed);
        if alloc == addr {
            // If the new allocation is at the same address as the old one,
            // there must be a happens-before relationship between them.
            // Therefore, we can read VAL without racing and must observe
            // the write above.
            let val = unsafe { VAL.get().read() };
            assert_eq!(val, 42);
            return true;
        }
    }
    false
}

fn main() {
    let mut success = false;
    while !success {
        let t1 = thread::spawn(thread1);
        let t2 = thread::spawn(thread2);
        t1.join().unwrap();
        success = t2.join().unwrap();

        // Reset everything.
        ADDR.store(0, Relaxed);
        unsafe {
            VAL.get().write(0);
        }
    }
}
```
Since C11, per cppreference:

> A previous call to `free` or `realloc` that deallocates a region of memory synchronizes-with a call to any allocation function that allocates the same or a part of the same region of memory. This synchronization occurs after any access to the memory by the deallocating function and before any access to the memory by the allocation function.

(synchronizes-with is the relation from a release store to an acquire load; the happens-before relation includes any synchronizes-with edges.) This doesn't mean that Rust's allocators are currently documented to provide the same guarantee, though. However, this also ties into provenance and allocation planes: this reasoning relies on it being impossible for two disjoint allocated objects that are simultaneously live to have the same address. It's not necessarily guaranteed that this is true (see discussions around resource exhaustion and optimizing out allocations).
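For reference, synchronizes-with in a nutshell (a standard message-passing example, not code from this thread): an Acquire load that reads the value written by a Release store makes all of the storing thread's earlier writes visible.

```rust
use std::sync::atomic::{AtomicBool, AtomicI32, Ordering::{Acquire, Relaxed, Release}};
use std::thread;

static DATA: AtomicI32 = AtomicI32::new(0);
static READY: AtomicBool = AtomicBool::new(false);

fn main() {
    let producer = thread::spawn(|| {
        DATA.store(42, Relaxed);    // plain data write (Relaxed for brevity)
        READY.store(true, Release); // release store: publishes the write above
    });
    let consumer = thread::spawn(|| {
        // Acquire load: once it reads `true`, it synchronizes-with the
        // release store, so the data write happens-before this point.
        while !READY.load(Acquire) {}
        assert_eq!(DATA.load(Relaxed), 42); // guaranteed to see 42
    });
    producer.join().unwrap();
    consumer.join().unwrap();
}
```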
Thanks! That's useful. So yeah Miri should implement that then. But also IMO our allocator traits should document this.
Ah indeed, this entire approach that `ReentrantLock` takes is kind of incompatible with LLVM optimizations that reduce memory usage. That's a separate problem, but still a challenging one. Would people rather be able to optimize away unused allocations, or be able to rely on address reuse for synchronization like this?
I believe we were already considering something like this in the model, some kind of "plane 0" used for external allocations and such, and which we force the usage of once the user starts doing difficult-to-analyze things with pointers. (We called it "exposed" but this meaning is overloaded, I think... I don't think it's necessarily the same as the strict-provenance "expose" but there are some relationships between them.) My guess is that ReentrantLock is either already doing something that would imply the use of this allocation plane, or could be modified to do so once we have enough details hammered out to suggest actual code changes.
FWIW my interpretation was that it is exactly the same "exposed". But anyway, let's continue in rust-lang/unsafe-code-guidelines#328. |
Address reuse improvements and fixes
- when an address gets reused, establish a happens-before link in the data race model
- do not reuse stack addresses, and make the reuse rate configurable

Fixes rust-lang/miri#3450
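The core idea of that happens-before link can be sketched with vector clocks. This is a hypothetical model, not Miri's actual code (the `AddressReuseModel` type and its methods are invented for illustration): the detector records the deallocating thread's clock per freed address, and joins it into the clock of whichever thread later reuses that address.

```rust
use std::collections::HashMap;

#[derive(Clone, Default, Debug, PartialEq)]
struct VClock(Vec<u64>);

impl VClock {
    // Pointwise maximum: afterwards, `self` is ordered after both inputs.
    fn join(&mut self, other: &VClock) {
        if self.0.len() < other.0.len() {
            self.0.resize(other.0.len(), 0);
        }
        for (s, o) in self.0.iter_mut().zip(&other.0) {
            *s = (*s).max(*o);
        }
    }
}

#[derive(Default)]
struct AddressReuseModel {
    // Clock of the thread that freed each address, recorded at free time.
    freed: HashMap<usize, VClock>,
}

impl AddressReuseModel {
    fn on_dealloc(&mut self, addr: usize, thread_clock: &VClock) {
        self.freed.insert(addr, thread_clock.clone());
    }

    // When an allocation reuses `addr`, join the deallocating thread's
    // clock into the allocating thread's clock, establishing the
    // happens-before edge this issue asks for.
    fn on_alloc(&mut self, addr: usize, thread_clock: &mut VClock) {
        if let Some(dealloc_clock) = self.freed.remove(&addr) {
            thread_clock.join(&dealloc_clock);
        }
    }
}

fn main() {
    let mut model = AddressReuseModel::default();
    let freeing_thread = VClock(vec![3, 0]);
    let mut allocating_thread = VClock(vec![0, 5]);
    model.on_dealloc(0x1000, &freeing_thread);
    model.on_alloc(0x1000, &mut allocating_thread);
    // The allocating thread now happens-after the deallocation.
    assert_eq!(allocating_thread, VClock(vec![3, 5]));
}
```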
The following code fails under miri:

Commit: e38a828bb8c1faa95be23d1fa902913d218de848

Command: `./miri run -Zmiri-seed=706 -Zmiri-track-weak-memory-loads -Zmiri-preemption-rate=0 race.rs`

Result:

`ReentrantLock` uses the address of a TLS variable to check whether the current thread already owns the lock. As `-Zmiri-track-weak-memory-loads` shows, the second thread reads an outdated `owner` value, and as with this specific seed the TLS allocation happens to be at the same address, it thinks it already owns the lock, thus triggering UB inside miri.

However, I do not believe this to be an issue with `ReentrantLock`. In any allocator, reallocating the same address requires observing that the address has been deallocated. Thus, there must be a happens-before relationship between deallocation and allocation. miri currently does not form such a relationship, leading to the issue above. With happens-before, the outdated read cannot happen, because the second thread will observe that the owner has been reset and released after allocating the TLS variable.