Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stacked Borrows vs inner pointers into "stable-deref" types #194

Closed
Aaron1011 opened this issue Aug 12, 2019 · 21 comments
Closed

Stacked Borrows vs inner pointers into "stable-deref" types #194

Aaron1011 opened this issue Aug 12, 2019 · 21 comments
Labels
A-aliasing-model Topic: Related to the aliasing model (e.g. Stacked/Tree Borrows) C-open-question Category: An open question that we should revisit

Comments

@Aaron1011
Copy link
Member

Aaron1011 commented Aug 12, 2019

When running Miri on owning-ref, I ran into an error with this (minimized) snippet (playground):

fn main() {
    let val: Box<u8> = Box::new(25);
    let ptr: *const u8 = &*val;
    
    let _moved_val = val;
    let _my_val: u8 = unsafe { *ptr };
}

which errors with:

error[E0080]: Miri evaluation error: trying to reborrow for SharedReadOnly, but parent tag <1359> does not have an appropriate item in the borrow stack
   --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/liballoc/alloc.rs:219:28
    |
219 |     let size = size_of_val(&*ptr);
    |                            ^^^^^ Miri evaluation error: trying to reborrow for SharedReadOnly, but parent tag <1359> does not have an appropriate item in the borrow stack
    |

In this snippet, we're creating a raw pointer into a Box, moving the Box, then dereferencing the raw pointer. The owning-ref crate (and more generally, anything relying on stable_deref_trait) relies on this working. In fact, this is the entire reason for the existence of stable_deref_trait.

The error message is a little confusing. However, I believe that the read from ptr causes the Unique item on the stack to be disabled (since the Unique is above the SharedReadOnly that grants access).

Does it make sense to consider this kind of behavior UB, given that stable_deref_trait and owning-ref (and probably other crates as well) rely on it?

@Aaron1011
Copy link
Member Author

cc @RalfJung

@RalfJung RalfJung added the A-aliasing-model Topic: Related to the aliasing model (e.g. Stacked/Tree Borrows) label Aug 13, 2019
@RalfJung
Copy link
Member

On first glance this looks like a duplicate of #148. Can you confirm?

Does it make sense to consider this kind of behavior UB, given that stable_deref_trait and owning-ref (and probably other crates as well) rely on it?

Finding trade-offs like this is exactly why we have Stacked Borrows implemented in Miri. If this is really a duplicate of the above, it will be very interesting to figure out which optimizations it will cost us to support code like this. Maybe we end up needing some kind of annotation for self-referential structs.

@hanna-kruppe
Copy link

I don't see anything self-referential here. The code is just taking a pointer to the contents of the Box and expects this pointer to remain usable as the Box itself is moved around.

@comex
Copy link

comex commented Aug 13, 2019

So the Box<T> contains a Unique<T> pointer, asserting that all accesses to the underlying object are done through that pointer or something derived from it, and the example breaks that assumption.

If I modify the example to involve mutation, this can actually cause miscompilation:

Playground link

use std::cell::Cell;

fn helper(val: Box<Cell<u8>>, ptr: *const Cell<u8>) -> u8 {
    val.set(10);
    unsafe { (*ptr).set(20); }
    val.get()
}

fn main() {
    let val: Box<Cell<u8>> = Box::new(Cell::new(25));
    let ptr: *const Cell<u8> = &*val;
    let res = helper(val, ptr);
    assert_eq!(res, 20);
}

At the time of writing, the assertion passes in debug mode but fails in release mode.

I believe the modified example still corresponds to what owning-ref does, but I'm not sure as I haven't used that crate myself.

What's happening is that the val parameter to helper is being marked noalias. I'm using Cell to ensure I'm doing "as much as I can" to assert non-uniqueness, but it doesn't actually make a difference: Cell disables noalias for shared references, but not for unique references.

Surprisingly, noalias is applied even without -Z mutable-noalias=yes; looking at the compiler code, mutable-noalias only affects actual &mut references, even though Box is also a mutable pointer. This is problematic, since I don't see why Box would be immune to the LLVM bugs that made us default to mutable-noalias=no.

In this example, though, the misoptimization happens because we've violated the noalias invariant, not because of an LLVM bug.

Edit: Interestingly, my example does not trigger a miri error.

@comex
Copy link

comex commented Aug 13, 2019

I looked into it, and sure enough, you can write an equivalent program without unsafe code using the APIs provided by owning_ref: playground link.

It should be possible for owning_ref to fix this by not storing the Box directly in the OwningRef struct, but instead storing a raw pointer that represents it, using into_raw/from_raw as appropriate. However, doing that for all supported types would require redesigning its StableDeref trait.

@CAD97
Copy link

CAD97 commented Aug 13, 2019

I definitely think it's the same underlying reasoning that would make this and self-referential (i.e. #148) sound.

It boils down to:

  • I get handed &impl StableDeref<Target=T>
  • I make a *const T and store it
  • Later, I observe &impl StableDeref<Target=T> again
  • Thus, I can upgrade my *const T to &T
    • Effectively, re-borrowing from the new reference

The problem is that this is a completely implicit reborrowing, effectively from just having the &impl StableDeref<Target=T> in scope.

This is the exact behavior that StableDeref purports to provide:

More specifically, implementors must ensure that the result of calling deref() is valid for the lifetime of the object, not just the lifetime of the borrow, and that the deref is valid even if the object is moved. [...] Basically, it must be valid to convert the result of deref() to a pointer, and later dereference that pointer, as long as the original object is still live, even if it has been moved or &self methods have been called on it.

Pin and self-referential structures through Pin use a mostly equivalent guarantee to allow observation of Pin<P> to "re-validate" stored references.

The result is that Pin<impl Deref<Target=T>>/impl StableDeref<Target=T> aren't fully noalias, because they guarantee that it's valid to use raw pointers derived from their Deref implementation beyond the lifetime provided by said Deref. I think the only way that these types that naively "should" be noalias can be noalias is if the owning Pin/impl StableDeref is consulted to "upgrade" these "swindled" references to valid ones again, allowing "re-deriving" them and changing their parent such that they are children of the valid noalias pointer.

@comex
Copy link

comex commented Aug 14, 2019

Another update: It's also possible to violate the noalias condition with just async/await, similar to the example in #148. But for arcane reasons, it seems almost impossible to actually trigger miscompilation, even with -Z mutable-noalias=yes.

This is all you need to violate the condition:

pub async fn test() -> i32 {
    let a: Cell<i32> = Cell::new(0);
    let mut b: &Cell<i32> = &a;
    SomeFuture.await;
    b.set(100);
    a.get()
}

It's desugared into something like

struct Test {
    state: ?,
    a: Cell<i32>,
    b: *const Cell<i32>,
}
impl Future for Test {
    fn poll(self: Pin<&mut Self>, ...) {
        loop {
            match self.state {
                State1 => {
                    self.b = &self.a;
                    ...poll on SomeFuture...
                },
                State2 => {
                    (*self.b).set(100);
                    Poll::Ready(self.a.get())
                },
            }
        }
    }
}

Here, (*self.b).set(100) writes to self.a, but if we enter poll with self.state == State2, the value of self.b is not directly or indirectly 'based on' self. Thus the write violates the noalias condition for self.

On the other hand, if we enter poll in the initial state and the poll is ready on the first try, then we haven't violated the condition, because the value of self.b is 'based on' self (because we wrote it as part of the same function call). Thus, LLVM can only assume no-aliasing in State2 if it knows that the poll won't be ready on the first try. Because of quirks of the await implementation (use of thread-local storage), I wasn't able to come up with a proof-of-concept that allowed LLVM to prove this, even using extremely contrived Future implementations. There's probably some way to do it, but the issue is unlikely to affect actual async code.

I was sort of hoping to drop a bombshell that might affect async/await stabilization, but no such luck. ;)

For a long-term solution, we need to avoid marking self as noalias in this case. Perhaps, as @CAD97 suggested, Pin<&mut T> should just never be noalias...

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 14, 2019

@comex I think it is worth mentioning that in your example here helper has as a pre-condition that the pointer is valid to dereference and that it does not alias the Box<T> (a Box<T> owns T and using it is like using a &mut T) . The helper function should therefore be unsafe, and the calling code violates its pre-conditions, so the program has undefined behavior, and gets mis-optimized.

@RalfJung
Copy link
Member

I don't see anything self-referential here. The code is just taking a pointer to the contents of the Box and expects this pointer to remain usable as the Box itself is moved around.

Oh, I forgot what owning-ref does. The pointer into the box lives next to the box, not inside it. I was also confused by "the box is moved", because the box has to stay in place for the pointer to remain valid. But the Box pointer is moved -- this is the kind of "move" that is allowed even when something is pinned.

However, in terms of Stacked Borrows, it doesn't matter where the pointer is stored, whether inside the box or outside of it. What matters is that an inner pointer gets created, then an outer pointer gets used (for a move of a box or a reborrow of a mutable reference), and then the old inner pointer gets used again. So the pattern is the same.

@RalfJung RalfJung changed the title Stacked Borrows: Using raw pointer into moved box Stacked Borrows vs inner pointers into "stable-deref" types Aug 14, 2019
@RalfJung RalfJung added the C-open-question Category: An open question that we should revisit label Aug 14, 2019
@RalfJung
Copy link
Member

@comex

So the Box contains a Unique pointer, asserting that all accesses to the underlying object are done through that pointer or something derived from it, and the example breaks that assumption.

Note that right now, Stacked Borrows special-cases Box but not Unique.

Edit: Interestingly, my example does not trigger a miri error.

Nice example! Also the one for async/await.

Interesting indeed. Probably this is because all raw pointers are equal for Miri (at least currently; I hope to change this eventually but without &raw that's not possible), and there is a raw pointer being created inside val.set(10).

@CAD97

The result is that Pin<impl Deref<Target=T>>/impl StableDeref<Target=T> aren't fully noalias, because they guarantee that it's valid to use raw pointers derived from their Deref implementation beyond the lifetime provided by said Deref.

Exactly.

Maybe such things "just" need to be specially marked to the compiler.

@gnzlbg

The helper function should therefore be unsafe, and the calling code violates its pre-conditions, so the program has undefined behavior, and gets mis-optimized.

The compiler doesn't know about the preconditions, so violating the preconditions never explains UB (from an abstract machine perspective).

The reason it is UB is that aliasing constraints are violated, and as @comex demonstrated owning-ref can do the same.


So should we close this now in favor of #148? I think we established that they are definitely closely related.

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 14, 2019

The compiler doesn't know about the preconditions, so violating the preconditions never explains UB (from an abstract machine perspective).

Of course it does? All Box function arguments are noalias because the compiler does know about this precise pre-condition (EDIT: that no other argument to the function can alias them, e.g., example (godbolt)).

@RalfJung
Copy link
Member

I think it is just ontologically wrong to say "this is UB because the precondition is violated", the IMO more correct view is "this is a precondition because otherwise there is UB" (and the UB is caused by the aliasing model that's part of the abstract machine).

Following your argument "the calling code violates its pre-conditions, so the program has undefined behavior", calling Vec::set_len the wrong way causes UB. It does not. That's all I am saying. I think this is just sloppy terminology on your side but I wanted to avoid people reading this interpreting it the wrong way.

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 14, 2019

I claimed that:

  • helper has a pre-condition that its callers must upheld,
  • violating this particular precondition of helper invokes undefined behavior,
  • helper doesn't check its pre-condition and therefore the function should be unsafe,
  • programs that invoke undefined behavior can be mis-optimized.

I'm not sure how you end up arguing that there are pre-conditions that can be violated without invoking undefined behavior (e.g. Vec::set_len). I don't disagree with that, and I don't think I claimed otherwise.

I did not make any claims about the source of the undefined behavior, I certainly did not intend to make the tautological claim that this is UB because helper has a precondition due to this being UB.

When I read the thread, I got the impression from @comex comment that this might be a mis-optimization. I just wanted to point out that this is intended behavior.

@danielhenrymantilla
Copy link
Contributor

danielhenrymantilla commented Aug 14, 2019

In the case of OwningRef<Box<T>>, this could be solved if the implementation decided to replace the input Box<T> given at construction by a NonNull<UnsafeCell<T>> (the NonNull to opt out of uniqueness; the UnsafeCell to be able to temporarily upgrade back into it).

More generally, for a pointer P : Deref, we will need to also have IntoRawNonNull + FromRawNonNull to be able to ensure opting out of uniqueness.

I don't know how this would generalise toPin, though (at the end of the day, we miss the PinMut abstraction).

@RalfJung
Copy link
Member

RalfJung commented Aug 14, 2019

@gnzlbg

The helper function should therefore be unsafe, and the calling code violates its pre-conditions, so the program has undefined behavior, and gets mis-optimized.

This phrasing heavily implied a general entailment of the form "precondition violated implies UB". Specifically, the "so" in your sentence indicates a causal connection. If you didn't mean to say that, take this as feedback for trying a clearer wording next time.
Maybe I am nitpicking too much here, but I think it is important that we use clear language. The next time you write "A, so B" I need to know if this indicates a causal connection or not.

For @comex' example, it also doesn't really matter whether that function is marked unsafe or not as we were only considering its operational behavior.

When I read the thread, I got the impression from @comex comment that this might be a mis-optimization. I just wanted to point out that this is intended behavior.

Ah, the term "miscompilation" is the culprit here. Do we have a good word for "code that has UB that the compiler actually exploits", in contrast to "code that has UB but still works as intended by the programmer"? We often say "miscompilation" but that sounds like a compiler bug, which it isn't if the code really does have UB.

As far as I can tell, there was never any disagreement that that code is wrong according to our current aliasing rules, and so the compiler is in its right to "miscompile". @comex was not trying to argue the UB away. The point of that comment was that this code does the exact same thing as what you can do with owning-ref using only safe code. So either owning-ref is wrong or our aliasing rules are wrong.

Cc @Kimundi (owning-ref author); I also reported this upstream.

at the end of the day, we miss the PinMut abstraction

We might... :/

@bluss
Copy link
Member

bluss commented Apr 18, 2020

FWIW, we've attempted to address this in ndarray, where it is using a Vec for allocation/deallocation (partly historical purposes, now for zero-copy transfer of data between Array and Vec).

The description of the problem is this:

The owned array has used a Vec that it owns, and kept a separate "head pointer" ptr on the side; the head pointer points to the first element in the array, somewhere inside the vector's data.

and solution this:

We replace the Vec with a deconstructed vector; it's equivalent but uses NonNull for its pointer, so it's an aliasable owner.

It feels like a non-issue, because I can't find anywhere that we "interleave" read/writes through the Vec with reads/writes through our own "head pointer". The only places where we write to the Vec (for example in Clone::clone_from), afterwards we immediately refresh our head pointer, without using the old value. But it seems like the right thing to do, to avoid using a Vec with its Unique<T>.

@RalfJung
Copy link
Member

It feels like a non-issue, because I can't find anywhere that we "interleave" read/writes through the Vec with reads/writes through our own "head pointer".

To clarify, you are saying this code should be fine and Stacked Borrows should not complain, for the reasons given?

@bluss
Copy link
Member

bluss commented Apr 18, 2020

If this code is the ndarray code in question: Yes, but I don't know the model that well, so it's just a hunch.

@danielhenrymantilla
Copy link
Contributor

danielhenrymantilla commented Apr 18, 2020

I still feel that if people / libraries (i.e., I am not talking about the compiler-generated unsugaring for generators and the Pin interaction) want to "alias" a Box's pointee, then they should downgrade the Box to a different type (which could be in std, that's a different topic) which would not include the Unique property: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=039a3ae5d241eeee37b0f8bc40446f6a

That is, a Box that only assumes Unique-ness when DerefMut (which in my example requires an explicit method call, as a "lint") and obviously when dropped.

Otherwise, its pointee can be aliased, and depending on the usage, either by a pointer that assumes immutability, so as long as the Box itself never mutates the pointee (should be enough for ::owning_ref), or one that could even support the Box mutating the pointee, so as long as it does not assume immutability (and is not used for mutation itself).


  • To go back to Pin, maybe Pin<Box< _ >> could be such a pointer? (my wavy-hands intuition: if "the memory location itself owns the pointee", then it makes no sense for the Box to assume Unique-ness anymore).

weiznich added a commit to weiznich/diesel that referenced this issue Nov 30, 2021
Any rust container like `Box<T>`, `Vec<T>` or `String<T>` internally
contains a `Unique<T>` pointer, which communicates to the compiler that
this container is the owner of that memory location and all access goes
through that pointer. See
rust-lang/unsafe-code-guidelines#194 for
details. Passing out a pointer to the underlying buffer to sqlite could
cause UB according to this definition, at least if someone else accesses
the buffer through the originial pointer. To prevent that we temporarily
leak the Buffer and manage the pointer by ourself.

Additionally this change introduces a way to construct the
`BoundStatement` as early as possible as part of the
`BoundStatement::bind` function, so that all cleanup code can be
concetracted in the corresponding `Drop` impl
weiznich added a commit to weiznich/diesel that referenced this issue Dec 15, 2021
Any rust container like `Box<T>`, `Vec<T>` or `String<T>` internally
contains a `Unique<T>` pointer, which communicates to the compiler that
this container is the owner of that memory location and all access goes
through that pointer. See
rust-lang/unsafe-code-guidelines#194 for
details. Passing out a pointer to the underlying buffer to sqlite could
cause UB according to this definition, at least if someone else accesses
the buffer through the originial pointer. To prevent that we temporarily
leak the Buffer and manage the pointer by ourself.

Additionally this change introduces a way to construct the
`BoundStatement` as early as possible as part of the
`BoundStatement::bind` function, so that all cleanup code can be
concetracted in the corresponding `Drop` impl
weiznich added a commit to weiznich/diesel that referenced this issue Dec 16, 2021
Any rust container like `Box<T>`, `Vec<T>` or `String<T>` internally
contains a `Unique<T>` pointer, which communicates to the compiler that
this container is the owner of that memory location and all access goes
through that pointer. See
rust-lang/unsafe-code-guidelines#194 for
details. Passing out a pointer to the underlying buffer to sqlite could
cause UB according to this definition, at least if someone else accesses
the buffer through the originial pointer. To prevent that we temporarily
leak the Buffer and manage the pointer by ourself.

Additionally this change introduces a way to construct the
`BoundStatement` as early as possible as part of the
`BoundStatement::bind` function, so that all cleanup code can be
concetracted in the corresponding `Drop` impl
@JakobDegen
Copy link
Contributor

We have separate issues to discuss the design of MaybeDangling and to discuss the open questions around Box aliasing requirements. Closing.

@RalfJung
Copy link
Member

RalfJung commented Aug 8, 2023

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-aliasing-model Topic: Related to the aliasing model (e.g. Stacked/Tree Borrows) C-open-question Category: An open question that we should revisit
Projects
None yet
Development

No branches or pull requests

9 participants