-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC for a Rust Memory Model #1578
Conversation
with thanks to Amanieu, huonw, durka42, aatch, acrichto, nmatsakis, and anyone else I might have missed.
This opens up the possibility of performing TBAA on types which have implementation-defined bytes since these types are only allowed to be read through a pointer of the correct type. I don't really see this as a big issue since this is only possible from unsafe code and the struct layout isn't guaranteed to match any other type. |
It's not that they're only allowed to be read through a pointer of a correct type. It's that if they are then used, it's undefined behavior (and therefore, TBAA might be a thing). Although, I'm thinking about changing it to implementation defined offsets instead of implementation defined bytes... I'm not sure. |
Enthusiasm is commendable, but I'd prefer this particular work to be done by professionals. |
I agree with @petrochenkov, there are several shortfallings in this model. |
@ticki Anything less, e.g. "obvious" useful guarantees on struct layout or valid values of primitive types, can be specified on case by case basis. |
Exactly. It should be formalized (to avoid these ambigiuities). It is very vague for a memory model, as is. |
@ticki @petrochenkov Honestly, this has been simplified and such a lot from how it would appear originally, in order to fit in the RFC format. Being a "professional" also has very little to do with the work here; I've been thinking about this memory model for months, and I'm likely one of the experts on Rust's. People send people to me to explain it. And, I get it; you would be more comfortable if the memory model was written by a professional. But I'm not a professional, and I wrote (part of) a memory model (the important part, the pointer aliasing part). The other things are just a basis for pointer aliasing, which is what really needs to be defined. |
@usban Don't get me wrong. I don't doubt that you know a lot about this subject, and have put a lot of effort into it. What I am saying is just that this wasn't really what was inteded with a memory model. What was inteded is a formalized model, instead. |
@ticki well, I am working on formalizing the model now... I just wanted to talk about it first, in public. It's important that we are all able to give feedback on it. |
The fact is, any EDIT: Clarification |
This RFC reads to me as if there are two different operations that are both called In fact, if I see it correctly, replacing |
I am not gonna weigh in on this RFC very much, but one comment:
When it comes to guarantees that you're making, nothing can often be better. If we were to adopt a poor model, it could inhibit Rust for the rest of its life. You really want something like this to be absolutely foolproof, or as close to it as possible. Adopting something that's not ready because "something is better than nothing" is not a good idea here. |
For the moment, then, can we at least have some documentation that pointer aliasing of any form is Undefined Behavior? Because that's what it really is without any rules on its use. |
@archshift This is, in fact, true currently. |
@steveklabnik One thing I found really compelling was something that @dikaiosune said in IRC: "is it just me, or is waiting for 5 or more years for any guarantees about unsafe pointers kinda untenable for rust adoption?" Currently, this: let mut x = 0;
let ptr1 = &mut x as *mut i32;
let ptr2 = &mut x as *mut i32;
*ptr1 = 1;
*ptr2 = 2;
*ptr1 is completely undefined. And most unsafe code assumes it is defined. |
Just to be clear here, I am not right now working on a memory model for Rust. Nobody should feel blocked by me here. For my Rust formalization I decided to shortcut this discussion, and instead define too many behaviors. This means that at least I can prove the desired libraries to be safe in my model (hopefully), but they may still turn out to rely on behavior that Rust wishes not to define. |
@ubsan Even that is unclear, as according to the Rust Doc:
|
@archshift Exactly. It's completely undefined. |
Since @ubsan pinged me here, I'd like to share some of my thoughts on the level of formalism or academic chops present in the RFC. One of the things that most impresses me about and attracts me to open source in general (and to a greater degree, Rust specifically) is the power that "expert amateurs" can have when they ignore credentials and focus on a substantive problem, emphasizing collaboration and results over social status and signalling. In many cases when one lacks the time, specific knowledge, or some other resource to substantively critique a work, the easiest signal to process can be from a contributor's credentials and prior achievements, or from the format of the presentation. But I think that's a bad trap for an inclusive community like Rust's to fall into. I have always been impressed by the pragmatism in the other discussions I've read here, and I think it's a distraction to focus (explicitly or implicitly) on ubsan's qualifications or the current level of formalism of the RFC, especially when its author has approached it in this way:
I don't see any reason to reject this out of hand. The product of this work will be very important for Rust's community and its future, and I think it deserves concrete feedback and a substantive appraisal. |
I have to disagree with @steveklabnik on this one. In this case something is better than nothing, as long as we make sure that the guarantees are not going to get broken in the future. Personally, I would prefer an access based model over the one presented here. I am using an access based model in my formal verification of Redox. I have outlined my idea below: (unfortunately, github doesn't support Latex, so I have rendered it as images) and formalized through: |
I think we're on the same page. It's not so much that it has to be absolutely, fully, 100% perfect at first blush, just that we're not painted into a corner. |
@steveklabnik, so what you are saying is that you think it should rather make one promise less than one promise more? |
@ticki I'm not familiar with that idiom. But I don't mean "something is worse than nothing" in the sense that it's an all-or-nothing enterprise. But I wouldn't want to rush to get something down just to have some kind of guarantee, only to find out that we're not actually happy with it later. Anyway, I said I wasn't going to say much, and now I've made three posts, so I'll stop 😉 |
let mut x = 0;
let ptr1 = &mut x as *mut i32;
let ptr2 = &mut x as *mut i32;
*ptr1 = 1;
*ptr2 = 2;
*ptr1 This will actually work on every model that we plan to use. It is undefined in the exact same sense as everything else. What is more dubious, is for example #[derive(Debug)] struct Foo<T>(T);
fn main() {
let five = 5;
let five_ref = &five;
let mut x = Some(Foo(five_ref));
let x_addr = &mut x as *mut Option<_>;
match x {
x_copy => {
if let Some(ref inner) = x_copy {
unsafe { *x_addr = None; }
println!("{:?}", unsafe { (*(inner as *const Foo<_>)).0 });
}
}
}
} This generates code that crashes on current rustc, through I don't think it should. In any case, I don't like the derived pointer model because it is fundamentally based on "confidentality", while Rust basically tries to guarantee "integrity". |
Additionally, LLVM likes attributes on loads, so lvalue-to-rvalue-conversions require the rvalue to be valid. |
So have drop glue call OTOH, you ought to be able to do partial drops, which would make struct Foo(Box<u32>, Box<u32>);
fn main() {
let x = Box::new(Foo(Box::new(0), Box::new(1)));
drop(x.1);
// drop glue:
let d = DerefMove::deref_move(&mut x); // unsafe! can access `x.1`!
drop(d.0);
mem::forget(d.0);
Drop::drop(&mut x);
} |
@ubsan and I had talked a bit about My "4 types of unique borrowed pointer" proposal is really less about pointers and more about making "initializedness" first-class---part of the type system instead some extra static analysis tacked onto borrow checking. Once you have that, the generalized pointer types are fairly inevitable. |
@arielb1 well, in that case, it's not unsafe :) Notice how you're let inner = x.deref_move();
drop(inner.1);
// drop glue
drop(inner.0);
Drop::drop(&mut x); @Ericson2314 see above ^ |
So there's gotta be some sort of guarantee about only being called once, I'm thinking... something like that. |
I have just opened #1643, which proposes that rather than adopting one RFC, we adopt a "strike-team-based" approach to work these rules out in a more systematic way. |
@ubsan |
But why? DerefMove seems fine if we just... don't call DerefMove multiple times. I think it's actually a really nice guarantee that DerefMove doesn't get called multiple times, just like Drop. |
This allows for things like dropping in DerefMove, since it's only called once. If, for example, you have a cache, and you don't need it once you're moved out of, it would be nice to get rid of it once you're no longer useful; and, I don't see the point of requiring DerefPure. It just makes it more annoying to implement DerefMove for users. |
Most of the discussion here is well beyond the scope currently covered by my formalization, but this one part is actually already "in scope" and I have definitions which I think make sense. So I will try to share my formal thoughts on this. Funny enough, the outcome is exactly the opposite if what you suggest. Let me try to explain the relevant parts of my model of types: The type So, if we consider Let's consider |
@RalfJung But the topic was |
@RalfJung I believe there's a distinction to be made here that About ZSTs, I believe you're talking about memory reads, not syntactical dereferences, which do not touch memory at all to "read" a ZST. |
@RalfJung still reading the rest of your post, but btw I totally agree that |
Ah OK, so even besides @eddyb's concerns I consider a load from Also I'd like to point out that @RalfJung's model is another good reason why the conflation of ()-isomorphic and Void-isomorphic types as both having size 0 is a dangerous thing to do. |
@Ericson2314 To be specific, lvalue->rvalue conversion for a |
Oh, I see. As far as I can tell, Rust attaches absolutely no guarantees to raw pointers, so their dereferencability depends entirely on the meaning you assign to them. If your invariants ensure that the given EDIT: I should add that, for this reasoning, I think using
Oh, the good ol' lvalue-rvalue confusion. I am entirely sidestepping that issue in my formalization by not having lvalues, which is why I payed no attention to it. ;-) Anyway, when I wrote about dereferencing pointers above, I was talking about actual load operations that happen on the machine.
I am not sure that is the case. ;-) Actually, I did not even assume any particular size for
I would say that lvalue -> rvalue conversion for |
@RalfJung The issue is raw pointers, where |
@ubsan Right, as long as you don't dereference it. Whether |
Quoting my comment from #1216 with respect to using
|
I am not sure about |
Agreed. I'd argue that |
@RalfJung But having |
Here are some of my own thoughts (not an expert, but am working on a garbage collector):
|
Nominating for discussion at lang-team meeting - the unsafe guidelines team exists now, so we should consider moving this PR either somewhere else or just the team tag. |
I agree with @nrc that, given that we have accepted #1643, we should close this RFC, since its contents are subsumed by the discussion taking place at the unsafe guidelines repo. I reviewing the thread and have tried to extract out the most notable questions that were debated. It would make sense to move many of these to issues on the rust-memory-model repo, I think:
@rust-lang/lang members, please check off your name to signal agreement. Leave a comment with concerns or objections. Others, please leave comments. Thanks!
|
I'm just going to close it now, since I owns it :P |
with thanks to Amanieu, huonw, durka42, aatch, acrichto, nmatsakis, and anyone
else I might have missed.
Fixes, or at least starts the process of fixing, #1447