Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rust needs a memory model #1447

Open
pnkfelix opened this issue Jan 6, 2016 · 135 comments
Open

Rust needs a memory model #1447

pnkfelix opened this issue Jan 6, 2016 · 135 comments
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.

Comments

@pnkfelix
Copy link
Member

pnkfelix commented Jan 6, 2016

As noted by @arielb1, we need an organized place for resolving questions about the Rust memory model.

While I could have made this a post on internals.rust-lang.org, I think this topic 1. needs visibility, and 2. needs to be able to be easily cross-referenced by the other issues and discussions that are hosted on github.

Eventually such discussion will hopefully culminate in an actual RFC document with a definition of the Rust memory model.

So, this issue is meant to be a collecting place for either such discussion, (or linking other discussions that perhaps should funnel into here).

@pnkfelix
Copy link
Member Author

pnkfelix commented Jan 6, 2016

cc rust-lang/rust#30424

@pnkfelix
Copy link
Member Author

pnkfelix commented Jan 6, 2016

cc @arielb1 @nikomatsakis @RalfJung

@pnkfelix
Copy link
Member Author

pnkfelix commented Jan 6, 2016

cc rust-lang/rust#9883

@pnkfelix
Copy link
Member Author

pnkfelix commented Jan 6, 2016

@eddyb
Copy link
Member

eddyb commented Jan 6, 2016

cc @ubsan

@RalfJung
Copy link
Member

While writing rust-lang/rust#30424 (comment), some more general thoughts about the memory model came to my mind, so I'm now dropping them in this thread. (Note that in this context, the core purpose of the memory model is to define which programs have UB and which don't. At least that's my reading of it. There is probably little discussion about what the actual behavior of the program should be, if it has defined behavior.)

I am pretty worried about how sensitive the UB-related discussion we are having in rust-lang/rust#30424 is to the concrete lifetimes that the compiler infers. Considering that it has a lot of freedom in this inference (and it was able to surprise me with its results), the fact that whether a program has UB or not can depend on the assignment of lifetimes is... troublesome, to say the least. I don't think this should be the case. But I'm also not sure if it can be avoided. Depending on the goal that the memory model wants to achieve, it may be impossible to avoid.

Right now, the only goal I know of is "the noalias annotations should be correct", and since this restricts the entire story to function arguments which are the place where lifetimes are most explicit, we may be good (in the sense that, there may be a way to define things such that UB doesn't depend on the lifetimes anymore). But you may have optimizations in mind that need stronger guarantees.

It would be great if there was a write-up or something of the kind of assertions that you want to be "made true by introducing UB", and the program transformations that rely on these assertions (and that hence form the motivation). Such a document would form the design constraints a memory model has to satisfy. There is probably enough room for discussion already on these design constraints that it's a bad idea to start serious work on a detailed memory model before we know how it is supposed to behave ;-) . I also assume some of you guys have lots of intuition about this, and it would be tremendously helpful if that intuition could be documented.

@mahkoh
Copy link
Contributor

mahkoh commented Jan 13, 2016

Right now, the only goal I know of is "the noalias annotations should be correct", and since this restricts the entire story to function arguments

noalias annotations only being used for function arguments is an open bug.

@RalfJung
Copy link
Member

See, that's what I meant ;-) . The way the semantics of noalias are defined in LLVM, I have no idea what, formally, noalias could mean anywhere else. (Except for return values, where they say it means the function is like malloc - which means this is no fit for functions returning &mut.)

@mahkoh
Copy link
Contributor

mahkoh commented Jan 13, 2016

It means that memory operations don't alias overlap.

unsafe fn f(x: *mut u8, y: *mut u8) -> u8 {
    let x = &mut *x;
    let y = &mut *y;
    *x = 1;
    *y = 2;
    *x
}

always returns 1.

@RalfJung
Copy link
Member

That would be one of the examples for the aforementioned document :D . The idea here seems to be, two mutable references are disjoint.

What about this code?

unsafe fn f(x: *mut u8, y: *mut u8) -> u8 {
    let x = &mut *x;
    *x = 1;
    *y = 2;
    *x
}

Now the "overlapping" write happens through a raw pointer, not through a mutable reference.

@mahkoh
Copy link
Contributor

mahkoh commented Jan 13, 2016

It returns 1. Consider

unsafe fn f(x: *mut u8, y: *mut u8) -> u8 {
    unsafe fn g(x: &mut u8, y: *mut u8) -> u8 {
        *x = 1;
        *y = 2;
        *x
    }
    let x = &mut *x;
    g(x, y)
}

which returns 1 unconditionally.

The behavior must not be changed by this transformation.

@mahkoh
Copy link
Contributor

mahkoh commented Jan 13, 2016

The behavior must not be changed by this transformation.

Because otherwise safe code can become unsafe by either manually inlining a function or moving part of a function to another function.

@mahkoh
Copy link
Contributor

mahkoh commented Jan 13, 2016

This is also the reason that noalias information should be used outside of function arguments. It is not uncommon to see the advice to simply cast *mut to &mut to make access easier. This currently works because noalias information is not used for &mut created inside a function, but once you refactor your code and pass the &mut to another function, the code breaks because suddenly noalias information is applied.

@Gankra Gankra added the T-lang Relevant to the language team, which will review and decide on the RFC. label Jan 13, 2016
@arielb1
Copy link
Contributor

arielb1 commented Jan 13, 2016

@RalfJung

The main surprise for me in lifetime inference is not reborrows, but rather mutable pointers not being correctly disposed of when they are coerced to raw pointers.

@strega-nil
Copy link

@arielb1 It's not that they should be thrown away upon coercion. It's that as soon as any aliasing pointer is written to, excluding reborrowing, any references that point to that same memory are invalid, and can't be written to or read from. In the following example:

let src: &mut [u8];
let dst: &mut [u8];
std::ptr::copy_nonoverlapping(src.as_mut_ptr(), dst.as_mut_ptr(), std::cmp::max(src.len(), dst.len()));

src is still valid. It's not super useful to make it into a *mut u8, yes, but it still should be valid.

@mahkoh
Copy link
Contributor

mahkoh commented Jan 13, 2016

Consider

unsafe fn f(x: *mut u8, y: *mut u8) -> u8 {
    unsafe fn g(x: &mut u8, y: *mut u8) -> u8 {
        *x = 1;
        *y = 2;
        *x
    }
    let x = &mut *x;
    g(x, y)
}

which returns 1 unconditionally.

Isn't this incorrect? This is what happens right now, but I think it should not be happening. As @ubsan's example, shows, raw pointers are always allowed to alias mutable references. The most elementary rust code relies on this. Unless the compiler can prove that the raw pointer y is not based on x, the code above should not be allowed to return 1 unconditionally. But since raw pointers are currently used everywhere (Box, Vec, etc.), this practically means that noalias becomes almost useless.

Consider the following example:

fn write(buf: &mut [u8]) -> u8  {
    buf[0] = 1;
    memcpy(buf.as_mut_ptr(), [2].as_ptr(), 1);
    buf[1]
}

This code must be correct and return 2 (since, after inlining, such code can be found everywhere and since the pointer in the memcpy is obviously based on the slice) But if we perform the following transformation:

fn write(buf: &mut [u8])  -> u8 {
    fn f(p: *mut u8, buf: &mut u8) -> u8 {
        *buf = 1;
        memcpy(p, [2].as_ptr(), 1);
        *buf
    }
    f(buf.as_mut_ptr(), &mut buf[0])
}

the code returns 1.

The correctness of code should not be changed by inlining. I see no way to fix this but to ignore noalias information completely if any raw pointers are mutably accessed inside a function.

@mahkoh
Copy link
Contributor

mahkoh commented Jan 13, 2016

#1410 would of course solve this by allowing raw pointers in containers to be marked noalias.

@mahkoh
Copy link
Contributor

mahkoh commented Jan 13, 2016

There is of course the argument that the code is not quite equivalent since the write through the reference happens after the raw pointer was created. But since the consensus seems to be that raw pointers are just integers (casting between integers and pointers is even safe) I don't think the argument that a raw pointer somehow borrows the reference it was created from (for whatever short lifetime) is valid.

@arielb1
Copy link
Contributor

arielb1 commented Jan 13, 2016

@mahkoh

Your version of f is supposed to be UB.

On the other hand, write would naïvely be UB, but it would be perfectly well-behaved if as_mut_ptr consumed the local buf instead of reborrowing it - and we are trying to find out if we can find some way of ergonomically and reliably consuming &mut references (you can consume them using a {buf}, but that is ugly and probably not done).

@arielb1
Copy link
Contributor

arielb1 commented Jan 13, 2016

And if as_mut_ptr really consumed buf, write would be obviously borrowck-invalid unless you do some reborrow control.

@mahkoh
Copy link
Contributor

mahkoh commented Jan 13, 2016

@arielb1 I don't like this idea. I think "if between the last write through this reference and the next read through this reference a raw pointer was used mutably, then the "cached value" becomes invalid" is a much easier approach. Although it relies on some kind of pointer annotations inside of containers to be useful.

@arielb1
Copy link
Contributor

arielb1 commented Jan 13, 2016

@mahkoh

The problem is that every function you call can have a stealth raw pointer alias to your mutable reference.

@mahkoh
Copy link
Contributor

mahkoh commented Jan 13, 2016

@arielb1 A function call implies that the "cache" is invalidated. (Unless the function call is inlined and the compiler can then prove that the cache is still valid.) I don't think this is a huge problem.

@arielb1
Copy link
Contributor

arielb1 commented Jan 13, 2016

@mahkoh

@arielb1 A function call implies that the "cache" is invalidated. (Unless the function call is inlined and the compiler can then prove that the cache is still valid.) I don't think this is a huge problem.

That's not how LLVM noalias works, and would frustrate many optimizations.

The primary advantage of the "active mutable references freeze" idea is that it is fully modular, and that therefore it

  • allows full optimizations in safe code
  • does not have complications with respect to external functions, threads, etc.
  • only requires unsafe code to care about the correct disposal of mutable references

The main problem is that disposing of mutable references is quite subtle and annoying, and we don't have a good ergonomic semantics for it.

@mahkoh
Copy link
Contributor

mahkoh commented Jan 13, 2016

That's not how LLVM noalias works

I'm aware but that seems to be a problem of the implementation only.

and would frustrate many optimizations.

I don't buy that this is an optimization problem. In the other issue someone wrote that removing noalias completely only slowed the code (rustc) down by a few percent. It would be interesting to see where the performance is lost and how my more optimistic version would affect it.

@arielb1
Copy link
Contributor

arielb1 commented Jan 13, 2016

I guess that our main disagreement is that I mostly care about safe code (in which the optimization we discuss is 100% safe) while you mostly care about unsafe code (in which this optimization is scary).

I am not sure I would give up a few percents of performance for some feature I don't even use.

@strega-nil
Copy link

@steveklabnik The issue is that it can't really be made better until we define what is and isn't correct. However, the entirety of what is undefined behavior is completely incorrect, as well as pointer aliasing.

@strega-nil
Copy link

@RalfJung then defining it programatically does seem like the correct solution.

However, I still think that function calls should not be anywhere near what makes a rust program defined or not. I still believe that we can make a really nice definition based upon ptr::read, ptr::write, and ptr::offset.

@comex
Copy link

comex commented Feb 5, 2016

Maybe it shouldn't, but this discussion is starting to give me an anxious feeling. Things like too-short lifetimes being inferred for &muts which are casted to *mut (possibly within an unseen function body), instant aliasing death which can easily be accidentally triggered when dealing with raw pointers (&(*raw_ptr).field as *const _ - oops, a reference temporary existed for an instant, if someone else had it mutably borrowed then you lose), or any more lenient alternative that would surely be more complex and therefore both harder to remember and harder to ensure the optimizer actually obeys...

They're starting to feel like footguns on the level of C's strict aliasing and overflow rules. Well, not quite so bad - learning obscure details of the C standard is a hobby of mine and I still get things wrong in my code - but tricky enough that a lot of otherwise competently written code will get them wrong in practice. (And then people like djb and Linus accuse compiler writers of breaking their code, while compiler folks smugly quote the rules and blame the code, and they both have valid points.) In this case, only unsafe code, but then a large fraction of unsafe code exists for the sake of dealing with and converting raw pointers, so its relative paucity in most Rust codebases is no statistical refuge. Indeed, this should be expected: even competent programmers make mistakes from time to time, and only detect them when a test, automated or otherwise, finds the program actually misbehaving. Most of the time, aliasing violations will never actually cause misbehavior - until some surrounding code changes and they do. It's almost like the situation with memory corruption bugs in C programs, where sometimes the corruption is small enough that the program usually works anyway, making the bug highly nondeterministic and hard to reproduce; but that nondeterminism is mainly at runtime, whereas here it's typically entirely at compile time.

I've always liked the fact that even while Rust's abstraction creates obstacles to low level programming, when it comes to playing ball with the optimizer it's better, in some sense closer to the metal than C - pointers are just integers as they are in assembly, there's no hidden "state" about what type a particular bit of memory has. It's freeing, a feeling I've seen echoed elsewhere, a sort of extension of Rust's usual stronger guarantees to even unsafe code. This doesn't change that per se, but I should've realized things would be trickier than I thought.

Not that I have an alternative - as far as I can tell, there really is no way to do common sense, useful optimizations on references without creating large pitfalls. Even if someone comes up with some memory model formalization that could hypothetically make things safer, to some extent Rust is constrained by its need to generate efficient code using existing compiler backends designed for C, both LLVM today and, ideally, GCC and anything else someone might want to write a Rust implementation targeting someday.

But I do have a suggestion: solve the invisibility problem. Add an "aliasing sanitizer" mode to rustc that generates code to reliably detect all violations at runtime, along the lines of ASan. Instead of references just being lowered to pointers, rustc would generate calls to stubs whenever they're created, destroyed, and reborrowed, which would look the pointers up in a global map and ensure proper borrowing (presumably causing significant overhead). Not sure how hard this would be to implement, but I don't see any big theoretical obstacles.

@mahkoh
Copy link
Contributor

mahkoh commented Feb 5, 2016

And then people like djb [...] accuse compiler writers of breaking their code

Got a citation?

@RalfJung
Copy link
Member

RalfJung commented Feb 5, 2016

And then people like djb [...] accuse compiler writers of breaking
their code

Got a citation?

This was a seminar at ETAPS 2015, which he announced at
http://blog.cr.yp.to/20150314-optimizing.html. I would not be
surprised if he also used other occasions to talk about this ;-)

@mahkoh
Copy link
Contributor

mahkoh commented Feb 5, 2016

Unless I missed something, he doesn't blame anyone for breaking his code.

@strega-nil
Copy link

@comex I don't understand what could be undefined about the bit of code you showed there.

I don't think the rules are very difficult to parse out. There's nothing like C or C++'s aliasing stuff. The definition I've been working off of is basically just:

If you have a reference R, and a pointer (raw or ref) P which is not derived from R
AND P is "aliased" to R (for some definition of "aliased")
AND you write to P
AND you read or write from/to R

you get undefined behavior.

@RalfJung
Copy link
Member

RalfJung commented Feb 5, 2016

However, I still think that function calls should not be anywhere near what makes a rust program defined or not. I still believe that we can make a really nice definition based upon ptr::read, ptr::write, and ptr::offset.

That would be nice :)
However, I don't think this will actually work. Not if we want any of these aliasing-based optimizations. If you only look at th raw memory events, and you see two consecutive writes ptr::write(0x1234, ...); ptr::write(0x1234, ...), then you can't tell whether these two have been issued through the same variable or not. If it's all integers only, you cannot distinguish raw pointer accesses and accesses through references.

If you have a reference R, and a pointer (raw or ref) P which is not derived from R
AND P is "aliased" to R (for some definition of "aliased")
AND you write to P
AND you read or write from/to R

This requires tracking how addresses got derived. So, they cannot be just integers, they have to remember some history of how they have been computed.
If you just see 0x126 is derived from 0x124, write to 0x126, write to 0x126, maybe both of these writes happened on the same variable that was derived from some original pointer, or maybe the second write is through a separate, raw pointer.

Worse though, it requires knowing when you "have" a reference. You need to track lifetimes to know where there are references, because only those references that have active lifetimes count. As the discussion in rust-lang/rust#30424 showed, we have a very hard time defining what exactly makes up an active reference once there are raw pointers involved.

@comex
Copy link

comex commented Feb 5, 2016

@mahkoh His proposal for boringcc is pretty accusatory (he's also written similar things on Twitter IIRC). For completeness, here's a colorful take from Linus Torvalds who I also cited.

@strega-nil
Copy link

@RalfJung that's actually what we do now, with mutable reborrows. This just gives a nice definition for them that works with both references and pointers. There are some extra things I didn't specify; for example, a read or write of the original reference makes all derived pointers no longer derived, and a move is equivalent to a derivation. But this definition just works with current mutable reborrow rules.

@strega-nil
Copy link

for an example:

let original: &mut i32= &mut 0;
{
    let derived: &mut i32 = &mut *original;
    *derived = 1;
}
println!("{}", *original); // prints "1"

similarly, with raw pointers

let original: &mut i32 = &mut 0;
let derived: *mut i32 = original;
unsafe { *derived = 1; }
println!("{}", *original); // prints "1"
// you shouldn't access derived after this

@Amanieu
Copy link
Member

Amanieu commented Feb 18, 2016

I think a good source of inspiration for defining Rust's aliasing rules is the definition of restrict in the C11 standard (page 141, section 6.7.3.1). This formally defines when operations involving restrict-qualified pointers are undefined behavior.

@strega-nil
Copy link

@Amanieu it looks fairly similar, although rule 11 would have to be changed. In C, the following is invalid (for some reason):

int* restrict p = &po;
int* restrict q = &qo;
p = q;

while the following is valid in Rust:

let p: &mut i32 = &mut po;
let q: &mut i32 = &mut qo;
p = q;

@RalfJung
Copy link
Member

This formally defines when operations involving restrict-qualified pointers are undefined behavior.

While this definition in some sense is "formal"; it is unfortunately not formal in a mathematical sense. They certainly try hard to make a mathematical definition, but this is still embedded into a huge pile of plain English. This usually means that the interaction between different sections of the standard is very poorly defined. For example, this applies ton the interaction of undefined behavior and weak memory effects (this is discussed, for example, in https://www.cl.cam.ac.uk/~jp622/popl16-thinair/).

For example, compare the description of the weak atomic behaviors in the standard, with https://www.cl.cam.ac.uk/~pes20/cpp/popl085ap-sewell.pdf. The former is (mathematically speaking) informal, the latter is formal.
Or, compare the description of which memory accesses are legal when through which pointers (which is spread across the entire standard) with http://robbertkrebbers.nl/thesis.html. The latter is formal, the former is not.
I am not aware of any precise description of C11 precise or LLVM noalias. This makes it impossible to perform any proofs like "this piece of code, which uses restrict, will always do the right thing when compiled by a standards-compliant compiler".

I think it would be a pity if Rust ended up in a similar situation. A plain English, semi-formal definition is certainly important. It should however be developed in tandem with a mathematically precise, formal model, since only the latter allows proofs that verify that certain desired properties actually hold. Experience shows that if the mathematical model is only developed later, it ends up uncovering serious flaws and/or ambiguities in the original definitions, which could have been avoided, had formal methods been part of the development process from the start.

@aidanhs
Copy link
Member

aidanhs commented Feb 29, 2016

Is there any ETA or roadmap for this RFC? Even if it's not possible to pin down all the edge cases right now, surely it's possible to do better than the current state of documentation.

Currently the official docs have the following to say on the subject of aliasing, references and pointers:

  1. https://doc.rust-lang.org/std/cell/struct.UnsafeCell.html, specifically:
    • "The UnsafeCell type is the only legal way to obtain aliasable data that is considered mutable. In general, transmuting an &T type into an &mut T is considered undefined behavior."
  2. https://doc.rust-lang.org/reference.html#behavior-considered-undefined, specifically three points (which are duplicated in the rust book):
    • "Breaking the pointer aliasing rules with raw pointers (a subset of the rules used by C)"
    • "&mut T and &T follow LLVM’s scoped noalias model, except if the &T contains an UnsafeCell. Unsafe code must not violate these aliasing guarantees."
    • "Mutating non-mutable data (that is, data reached through a shared reference or data owned by a let binding), unless that data is contained within an UnsafeCell."
  3. https://doc.rust-lang.org/nomicon/transmutes.html, specifically:
    • "Transmuting an & to &mut is UB"

Reading 1, it sounds like there's an unspoken implication on the end of the second sentence ("...is considered undefined behavior, except with UnsafeCell"), which makes me think that something the following is fine:

let u = UnsafeCell::new(5);
let a = unsafe { &mut *u.get() };
let b = unsafe { &mut *u.get() };

Except, from comments on this issue, I think I've picked up that you may never have aliasing &muts!

Reading 2 is...not helpful.

  1. The subtlety that an &mut containing an UnsafeCell is noalias (and basically breaks UnsafeCell completely?) is easily missed and should be called out more clearly (old issue: UnsafeCell semantics rust#16022).
  2. "unless that data is contained within an UnsafeCell" from the second point is pretty unclear. For example, I thought that I could create an array of UnsafeCell<T>, then freely pass around aliased &mut references into this array. It wasn't until I spent a while digging into what UnsafeCell actually means to the compiler that I realised that this is a contradiction of the previous point, so tough luck (this complaint and the previous compress into: "&mut must never alias with anything, ever, even if UnsafeCell is involved").
  3. "data is contained within" could mean "the type of the data is UnsafeCell<T>" or "the owner of the data somewhere else in the program is an UnsafeCell" (e.g. the 'UnsafeCell array' idea), and only disambiguates to the first by considering the other points or reading the rustc code.
  4. Deferring to the LLVM documentation for detail on 'aliasing' and 'noalias' doesn't make life easy. I am sympathetic to "If you have an understanding of aliasing, the rules are clear." (Mutable Pointer Aliasing Rules are Unclear for Unsafe Code rust#19733 (comment)), but getting this right is pretty important for extensive communication with C/C++ code. I like the note "(a subset of the rules used by C)" - I wouldn't dare to presume I know the rules for pointers in C, especially after reading http://www.mail-archive.com/[email protected]/msg01647.html!
    • failed understanding example: wouldn't calling f(&x, &x) violate noalias?

Regarding 3, I reference the four comment chain starting at #1447 (comment). I can't really say much here, but this doesn't inspire faith in the official docs.

Sorry for the lengthy comment and basic understanding of the topic, but this is a source of continuing confusion for me (and others I know) - I'm not looking for answers to questions, I just want to point out that uncertainty in this area is causing issues (at least to me!).

@arielb1
Copy link
Contributor

arielb1 commented Feb 29, 2016

@aidanhs

The main issue that annoys us with this thread is the "disposing of mutable references" issue (AFAICT) - we may just give up and say that mutable references should always be considered disposed-of or something. The rest of the model is pretty much settled.

The rules around UnsafeCell are pretty - the effect of UnsafeCell is that it is an exception to "data contained within" for memory that is frozen by an immutable reference (but not for memory that is owned by a mutable reference).

The "data contained within" rule is just transitive reachability (it can reach through references, and theoretically through containers if they use a yet-to-be-invented special pointer), but not counting objects of type UnsafeCell that are accessed immutably.

That's it, unless you came up with some special case I was not aware of.

I thought that I could create an array of UnsafeCell<T>, then freely pass around aliased &mut references into this array.

That would make &mut pointless, won't it (unless you mean &mut UnsafeCell<T> element pointers - but in that case, why not just use &UnsafeCell<T>)?

failed understanding example: wouldn't calling f(&x, &x) violate noalias?

Aliasing only talks about memory accesses. Passing immutable pointers should not cause aliasing violations unless f writes to (non-UnsafeCell parts) of its arguments.

@arielb1
Copy link
Contributor

arielb1 commented Feb 29, 2016

In any case, the issue Linus ranted about does not affect new versions of gcc (at least 5.3.1).

@aidanhs
Copy link
Member

aidanhs commented May 14, 2016

@arielb1 thanks for your comment. I've done some reading around LLVM aliasing and dug into rustc and now realise that the majority of my questions would be answered by better docs (which I'll take a look at) rather than this issue, which is only tangentially related.

One question: "Aliasing only talks about memory accesses." One page in the LLVM docs also said this and it wasn't until I read another page that talked about dependence that it became clear - my intuition was that pointer reads would fall under 'memory accesses'. Is it common knowledge that 'memory accesses' implies writing? (I was thinking about proposing a change to the first page)

@arielb1
Copy link
Contributor

arielb1 commented May 15, 2016

One question: "Aliasing only talks about memory accesses." One page in the LLVM docs also said this and it wasn't until I read another page that talked about dependence that it became clear - my intuition was that pointer reads would fall under 'memory accesses'. Is it common knowledge that 'memory accesses' implies writing? (I was thinking about proposing a change to the first page)

"aliasing only cares about memory accesses" does not imply "aliasing does not care whether an access is a read/write". It means that an aliasing violation consists of a bad set of accesses (which must contain a write). If your code does not actually access an address (e.g. it only does casts on pointers) it can't be a part of an aliasing violation involving that address.

@nikomatsakis
Copy link
Contributor

I wrote a blog post discussing my thoughts on a high-level approach to the memory model:

http://smallcultfollowing.com/babysteps/blog/2016/05/27/the-tootsie-pop-model-for-unsafe-code/

The heart of the proposal is the intution that:

  • when you enter the unsafe boundary, you can rely that the Rust type system invariants hold;
  • when you exit the unsafe boundary, you must ensure that the Rust type system invariants are restored;
  • in the interim, you can break a lot of rules (though not all the rules).

Discuss thread: http://internals.rust-lang.org/t/tootsie-pop-model-for-unsafe-code/3522

@ticki
Copy link
Contributor

ticki commented Oct 2, 2016

http://ticki.github.io/blog/a-hoare-logic-for-rust/

@arielb1
Copy link
Contributor

arielb1 commented Oct 3, 2016

@ticki

Not a memory model. The Hoare logic follows from the denotational semantics, which already assume you have a memory model.

@ticki
Copy link
Contributor

ticki commented Oct 4, 2016

@arielb1 I never said it was one. It is a minor step towards one. It formalizes the MIR and below.

@gnzlbg
Copy link
Contributor

gnzlbg commented Feb 12, 2017

People from the future, most of the discussion around this is happening here: https://github.com/nikomatsakis/rust-memory-model

@Centril
Copy link
Contributor

Centril commented Oct 7, 2018

And I believe now it is happening mostly at https://github.com/rust-rfcs/unsafe-code-guidelines.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

No branches or pull requests