Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The ref to *const implicit conversion has become a footgun with e.g. Rc::from_raw #112337

Open
glandium opened this issue Jun 6, 2023 · 17 comments
Labels
C-discussion Category: Discussion or questions that doesn't represent real issues. T-opsem Relevant to the opsem team

Comments

@glandium
Copy link
Contributor

glandium commented Jun 6, 2023

The following code does nothing in practice (no strong reference added to the Rc) with 1.70 or earlier versions compiled against LLVM 16.

use std::rc::Rc;
use std::mem;

pub struct Foo;

pub unsafe extern "C" fn addref(f: &Foo) {
    let raw = Rc::from_raw(f);
    mem::forget(Rc::clone(&raw));
    mem::forget(raw);
}

The corresponding (release) IR, per the playground:

define void @_ZN10playground6addref17h4ebcfc6cc7bb1438E(ptr noalias noundef nonnull readonly align 1 %f) unnamed_addr #0 personality ptr @rust_eh_personality {
start:
  %0 = getelementptr inbounds i8, ptr %f, i64 -16
  %self.val.i = load i64, ptr %0, align 8, !noundef !2
  %_3.i = icmp ne i64 %self.val.i, 0
  tail call void @llvm.assume(i1 %_3.i)
  %_10.i = icmp eq i64 %self.val.i, -1
  br i1 %_10.i, label %bb4.i, label %bb6

bb4.i:                                            ; preds = %start
  tail call void @llvm.trap()
  unreachable

bb6:                                              ; preds = %start
  ret void
}

Changing the ref to a *const "fixes" the problem.

use std::rc::Rc;
use std::mem;

pub struct Foo;

pub unsafe extern "C" fn addref(f: *const Foo) {
    let raw = Rc::from_raw(f);
    mem::forget(Rc::clone(&raw));
    mem::forget(raw);
}

IR looks like this:

define void @_ZN10playground6addref17h3157527175b7319bE(ptr noundef %f) unnamed_addr #0 personality ptr @rust_eh_personality {
start:
  %0 = getelementptr inbounds i8, ptr %f, i64 -16
  %self.val.i = load i64, ptr %0, align 8, !noundef !2
  %_3.i = icmp ne i64 %self.val.i, 0
  tail call void @llvm.assume(i1 %_3.i)
  %strong1.i = add i64 %self.val.i, 1
  store i64 %strong1.i, ptr %0, align 8
  %_10.i = icmp eq i64 %strong1.i, 0
  br i1 %_10.i, label %bb4.i, label %bb6

bb4.i:                                            ; preds = %start
  tail call void @llvm.trap()
  unreachable

bb6:                                              ; preds = %start
  ret void
}

Here, the Rc is updated with a new strong reference.

This is another footgun that comes from llvm/llvm-project@01859da, along with #111229.

@glandium glandium added C-bug Category: This is a bug. regression-untriaged Untriaged performance or correctness regression. labels Jun 6, 2023
@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Jun 6, 2023
@glandium
Copy link
Contributor Author

glandium commented Jun 6, 2023

Cc: @pcwalton @nikic

@glandium glandium changed the title The ref to *const implicit conversion has become a footgun The ref to *const implicit conversion has become a footgun with e.g. Rc::from_raw Jun 6, 2023
@emilio
Copy link
Contributor

emilio commented Jun 6, 2023

I think at least the addref case ought to work. It is a sensible pattern to have an object that's known-refcounted and allow passing &T rather than &Rc<T> around IMO...

It's less clear if the equivalent "release" code should work:

pub unsafe extern "C" fn release(f: &Foo) {
    let _ = Rc::from_raw(f);
}

since you could argue that f could be dead at the end of the function...

@Noratrieb Noratrieb added C-discussion Category: Discussion or questions that doesn't represent real issues. T-opsem Relevant to the opsem team and removed C-bug Category: This is a bug. labels Jun 6, 2023
@Noratrieb
Copy link
Member

Even just the addref case is writing through a &T where T: Freeze, I don't see how that should ever be supposed to work. In general, working with references around unsafe code is a bad idea. I think this is a great example of why something like Rc::from_raw should really have taken a *mut T instead.

I don't know how this should best be addressed. Even #111229 doesn't help here, because Rc::from_raws signature implies that it won't write to anything behind the pointer.
Maybe some kind of more general "reference coerced to owning pointer" lint, where "owning pointer" would have to be special cased for methods like Rc::from_raw.

@Noratrieb Noratrieb removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. regression-untriaged Untriaged performance or correctness regression. labels Jun 6, 2023
@glandium
Copy link
Contributor Author

glandium commented Jun 6, 2023

It's not necessarily about owning, though, more about interior mutability.

@emilio
Copy link
Contributor

emilio commented Jun 6, 2023

@Nilstrieb Rc::from_raw doesn't write to the pointer itself, right? It's the Drop impl of Rc, or clone-ing the RC what actually performs the write.

@Noratrieb
Copy link
Member

Noratrieb commented Jun 6, 2023

yes, but having an Rc without write access is a bug. It's also breaking the safety invariant of Rc::from_raw, which says that the pointer must be valid for the Rc.

This function is unsafe because improper use may lead to memory unsafety, even if the returned Rc is never accessed.

So even if it's not language level UB, it's still breaking a safety invariant.

@emilio
Copy link
Contributor

emilio commented Jun 6, 2023

yes, but having an Rc without write access is a bug.

How so? That's the whole point of Rc.

It's also breaking the safety invariant of Rc::from_raw, which says that the pointer must be valid for the Rc.

The pointer might be valid for the Rc, right? That depends on the caller.

So even if it's not language level UB, it's still breaking a safety invariant.

I don't see how the code in the OP breaks that invariant.

@emilio
Copy link
Contributor

emilio commented Jun 6, 2023

Rc::increment_strong_count and Rc::decrement_strong_count take the (const) pointer and increment the reference count.

@Noratrieb
Copy link
Member

Noratrieb commented Jun 6, 2023

The pointer doesn't have write access to the field, that's expected. But it also does not have write access to the reference counter, which is the problem: See Miri on https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=bcc0acaa89f36841786b0cc10f626968
A basic safety invariant of Rc is that you can clone it (not that the safety invariant is particularly important, because this code has UB because it calls clone).

@emilio
Copy link
Contributor

emilio commented Jun 6, 2023

Sure, I get it, I just don't think it's reasonable behavior.

As in, it seems silly (to me) that this code is fine:

    let rc = Rc::new(0);
    let r = Rc::into_raw(rc);
    unsafe {
        Rc::increment_strong_count(r);
        Rc::decrement_strong_count(r);
    }

But this is UB in a way that it can cause security issues:

    let rc = Rc::new(0);
    let r = Rc::into_raw(rc);
    unsafe {
        Rc::increment_strong_count(&*r);
        Rc::decrement_strong_count(&*r);
    }

In particular, it seems reasonable to have types that are effectively "always-refcounted" in your program, in a way that calling Rc::from_raw on a reference to that type is always fine.

Or at the very least there should be a warning / lint of sorts. Because it seems like an extremely easy mistake to make.

@joboet
Copy link
Member

joboet commented Jun 6, 2023

It's completely reasonable behaviour. Creating a reference &T restricts the provenance of the pointer to T. This allows some nice optimizations:

Optimizing the Rc allocation in create_rc away is valid without knowing the insides of takes_reference (LLVM doesn't currently do so):

fn takes_reference(t: &i32) {
    // Some operation
}

fn create_rc() {
    let rc = Rc::new(0);
    takes_reference(&*rc);
}

This optimization would become invalid if takes_reference were allowed to rely on the value being stored inside an Rc and used Rc::clone.

moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Jun 7, 2023
…tions. r=glandium

Even tho rust-lang/rust#112337 is IMO a rust
bug, it's easy to work around in our code and it doesn't really affect
expressiveness.

Differential Revision: https://phabricator.services.mozilla.com/D180065
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Jun 8, 2023
…tions. r=glandium, a=dmeehan

Even tho rust-lang/rust#112337 is IMO a rust
bug, it's easy to work around in our code and it doesn't really affect
expressiveness.

Differential Revision: https://phabricator.services.mozilla.com/D180065
ErichDonGubler pushed a commit to erichdongubler-mozilla/firefox that referenced this issue Jun 8, 2023
…tions. r=glandium

Even tho rust-lang/rust#112337 is IMO a rust
bug, it's easy to work around in our code and it doesn't really affect
expressiveness.

Differential Revision: https://phabricator.services.mozilla.com/D180065
surapunoyousei pushed a commit to Floorp-Projects/Floorp that referenced this issue Jun 16, 2023
…tions. r=glandium, a=dmeehan

Even tho rust-lang/rust#112337 is IMO a rust
bug, it's easy to work around in our code and it doesn't really affect
expressiveness.

Differential Revision: https://phabricator.services.mozilla.com/D180065
@emilio
Copy link
Contributor

emilio commented Jul 14, 2023

It's completely reasonable behaviour. Creating a reference &T restricts the provenance of the pointer to T.

There are reasonable use-cases for casting references and transmute to a container. Seems reasonable to have stuff like C-style inheritance:

#[repr(C)]
struct Derived {
    base: Base,
    something: RefCell<...>,
}

Casting from &Base to &Derived seems like a reasonable thing to do, and something you sometimes must do if you're interfacing with C / C++. But not exclusively. E.g., Servo's DOM works like that.

@Noratrieb
Copy link
Member

That's an interesting variation of rust-lang/unsafe-code-guidelines#256, you should leave a comment there.

@RalfJung
Copy link
Member

For any such tricks, you'll have to use raw pointers throughout. Making &T (where T: Freeze) read-only is pretty fundamental.

Depending on how rust-lang/unsafe-code-guidelines#236 is resolved, something like &(UnsafeCell<()>, Base) might be sufficient to preserve the information that "there is some interior mutable stuff elsewhere", but this is the best option I can imagine here.

@glandium
Copy link
Contributor Author

If PhantomData<UnsafeCell<()>> worked, something like &(PhantomData<Derived>, Base) would work, and would be a no-brainer to plaster everywhere than to figure out whether Derived needs the hack or not (thinking about cases where that code is proc-macro'ed or generated some other way).

@RalfJung
Copy link
Member

Currently it looks like UnsafeCell inside PhantomData does not have any effect. So this would have to be changed to permit that pattern.

@RalfJung
Copy link
Member

RalfJung commented Sep 6, 2023

Currently it looks like UnsafeCell inside PhantomData does not have any effect. So this would have to be changed to permit that pattern.

I tried that in #114559 and it's not looking good.

Loirooriol pushed a commit to Loirooriol/servo that referenced this issue Nov 22, 2023
Even tho rust-lang/rust#112337 is IMO a rust
bug, it's easy to work around in our code and it doesn't really affect
expressiveness.

Differential Revision: https://phabricator.services.mozilla.com/D180065
Loirooriol pushed a commit to Loirooriol/servo that referenced this issue Nov 23, 2023
Even tho rust-lang/rust#112337 is IMO a rust
bug, it's easy to work around in our code and it doesn't really affect
expressiveness.

Differential Revision: https://phabricator.services.mozilla.com/D180065
Loirooriol pushed a commit to Loirooriol/servo that referenced this issue Nov 23, 2023
Even tho rust-lang/rust#112337 is IMO a rust
bug, it's easy to work around in our code and it doesn't really affect
expressiveness.

Differential Revision: https://phabricator.services.mozilla.com/D180065
mrobinson pushed a commit to servo/servo that referenced this issue Nov 24, 2023
Even tho rust-lang/rust#112337 is IMO a rust
bug, it's easy to work around in our code and it doesn't really affect
expressiveness.

Differential Revision: https://phabricator.services.mozilla.com/D180065
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-discussion Category: Discussion or questions that doesn't represent real issues. T-opsem Relevant to the opsem team
Projects
None yet
Development

No branches or pull requests

6 participants