proposal: Go Memory Model clarifications #50590
Comments
It's not clear to me how this differs from the current Go memory model, which says "Reads and writes of values larger than a single machine word behave as multiple machine-word-sized operations in an unspecified order." |
Ah, that would be a difference. But it's not clearly related to a specific type. What matters there is how a loop like "for running {}" is compiled. I believe that the current memory model does not require the compiler to generate code that reloads running on each iteration. |
Ian, As far as I can tell both a goroutine closure with "for running {}" and a goroutine function with "for *running {}" work correctly. If you are saying that this is expected reliable behavior, then I think that is wonderful and I humbly submit it needs to be clarified in the memory model. Perhaps it seems clear to you because you already know the answer. I completely understand your perspective in pointing out the quote about multi-word writes. The quote does explicitly say that multi-word writes are not atomic, but it does not actually say that single-word writes are safe. If the behavior I've suggested is already expected, then I propose adding another sentence something like this:
In defense of clarifying this, I will say that before posting this proposal I read the "Happens Before" section over and over, and it did not seem clear. I posted on Go Forum, and no one was confident that unsynchronized single-writer multi-reader was safe. Funny thing: one person said they would not feel comfortable with it, said maybe I would lose writes, and suggested I read the memory model. Another suggested it is working for me because I'm on x86. With all that said, I really think this is a super important guarantee, and it should be clear to Go users. |
You generally need some sort of synchronization to guarantee that accesses see up-to-date contents of memory. It is true that currently we don't optimize unsynchronized loops completely away. But strange things will happen in such loops, see for example #15635. They are definitely not recommended.
I think we're ok with guaranteeing this no-out-of-thin-air property.
All the trickiness is hidden under the word "done". When is an assignment done? It can't be when the instruction issues, or retires, because in fact there are processors which will still give you the old value on another processor for a while. I think the best you could say is "eventually"? But that's not much of a guarantee. And as we can see in #15635, we don't actually obey that today. Writes can be delayed indefinitely. |
Ok, I see that possibly implied in this:
I read that as: "you should use synchronization because your program won't work correctly if you don't know the order things are happening." which is completely different from "without synchronization, there is no guarantee a given goroutine will ever observe changes written by another concurrent goroutine." I do however see that down at the bottom in the examples:
No one has said this, but reading between the lines, it sounds like the "no corruption" guarantee of concurrent word-sized reads and writes does extend to multi-writer. I don't know of any scenario where that is good programming, but bad code aside, how about this:
Is that accurate? I mean, it sounds scary as hell, and maybe it should if that is the guarantee. |
The second sentence is correct. The first sentence is correct in spirit, but I think uses terms that aren't really defined well: what does "behave individually" or "do not conflict" mean? |
I'm just making this up. I tried to take the existing language here:
And use it to write similar language for single-word-sized operations. Ultimately, I'm hoping we can eventually land on some language that is appropriate for a non-Go-contributor to clearly understand the guarantees in the Memory Model and how to work with it, so your feedback is perfect. I agree. How about this as another crack at it (the first sentence is copied from the memory model):
The following current sentence can be read like programming advice (i.e., if you don't synchronize you won't see the right data), but really this is where the issue of "not observed" should be clarified. Existing:
Proposed:
In order to define this behavior, I really have to know what specifically is required to establish "synchronization events". For instance, can both routines just call atomic.Load() on a shared variable at the very top of the routine? That would establish one load before the other. I'm guessing it is something like: both goroutines must a) use a shared variable that is typed in sync, or b) use atomic on a shared variable, or c) use a shared channel. This is not hypothetical for me. I need to know the answer because I have lock-free code that needs the observed guarantee, but I'm also happy to document this for others.

With all that said, so far it sounds like the only "not observed" problems have to do with compiler optimization, and without those specific optimizations, we would not have the caveat about "not observing". From my perspective, this raises the question of which is more important. Is it better to define the existing behavior around this caveat, or is it better to not optimize away writes to shared variables? This really circles all the way back to Ian's initial comment of "aren't we already doing this" and pointing out some compiler differences. I'm honestly cool with anything... I just need to know the rules... but if it seems more obvious and straightforward when writes are guaranteed to be observed, then maybe writes should be guaranteed to be observed. |
The general issue of atomics and the memory model is #5045. The general intent is #5045 (comment). |
OK, thanks, that helps a lot. I'd like to narrow this proposal down to adding the following two guarantees to the Memory Model. Not necessarily these exact words, but these guarantees.
Based on the discussion, I think we already have the first item. I think we should include it because it is a very important feature, and we need to guarantee it rather than leave it ambiguous. It defines a basic and critical operation that is otherwise unspecified. The second would necessitate a change in Go (probably only in a few compiler optimizations). I don't think we need to define what "after the write" really means given the existing text of the Memory Model, but in principle, it means Go will not optimize away writes to variables that can be observed in other goroutines.

I think we need this second item for two reasons. If we don't do this, the alternative is ugly: 1) I believe we must make explicitly clear the lack of a guarantee on shared writes, because it is unusual and is a big problem if you don't know about it, and 2) we must then define the minimum requirement for "synchronizing events" needed to guarantee writes to shared variables... both of which are ugly. On the other hand, if we do this, it makes things clear and expected: programs work as they appear they should, even if that means including the general software-quality problems associated with unsynchronized access. Any problems are in the hands of the programmer, as opposed to being introduced by the compiler. |
The memory model already says
I don't understand how your suggestion "All goroutines observing a shared variable will observe values written to the variable by other goroutines after the write" differs from that. |
The memory model has this in the examples:
That would change with this guarantee. More importantly, I'm with you 100%. It seems super weird that this is not a fundamental guarantee already. |
If there are no synchronization events between two threads, then the notion of "after" is not well defined. You suggested earlier that we don't have to define "after the write." If "after the write" means something other than "a write, then a happens-before relationship, then a read", then I think we very much do have to define it. In a highly parallel world there is no such thing as "after" without some sort of synchronization. |
I agree that without synchronization, the exact "when" is unknown, but this guarantee is really about the difference between "after" and "never". It is saying "never" is not an option. The point is that sometimes a program can have valid communication between threads just knowing: "this shared variable's newly assigned value can be observed elsewhere after this assignment, whenever that happens." This is the basis of lock-free programming. The referenced example in the memory model where main never finished is a great example of lock-free programming. It would be normal to expect that done does in fact get observed by main eventually ("after"), but the example also points out the risk of lock-free programming regarding the order of operations in the assignment of variable a.

While it is easy to debate the pitfalls of lock-free, and lock-free is definitely NOT the Go way, Go has never seemed to intrinsically preclude programming outside the Go way, and philosophically, in some ways, the sync and atomic packages are evidence that the pure way of "share by communicating" is not the only way. I continue to believe the downsides of this "possibly never" anti-guarantee, as stated in a prior post, far outweigh the benefit of just making it simple and straightforward. Ultimately, I don't think this guarantee would even need to be stated because of the other guarantees you pointed out. We just need to remove the "maybe never" anti-guarantee. |
In a concurrent language like Go, "never" is always an option. If I start a million goroutines some of them may never run. Also, if two different goroutines both write to a variable with no synchronization, what does it mean to say that other goroutines will eventually observe both writes? If I understand you correctly (which I may not) I think you are trying to say two things: all stores to non-local variables must actually be performed, and all reads of non-local variables must actually be performed. This would be roughly equivalent to saying that in C all non-local variables are volatile. It would mean that a loop like

```go
var Count int

func Loop(n int) {
	for i := 0; i < n; i++ {
		Count++
	}
}
```

can't be optimized into `Count += n`. |
I could quibble about whether that is really what lock-free programming means, but more importantly I think this goal is somewhat misguided. Go tries hard to make it easier to write correct concurrent code. One of the ways it does so is by stressing that you should always communicate through communication mechanisms, not by simply setting a variable in one place and reading it in another place without synchronization. If I could easily make such programs break, I would. Because I believe that programs should use explicit synchronization, and should not rely on the fact that on a shared memory system a store to memory will eventually be observable if nothing else happens. |
Your message about the optimized loop is very close to what I'm saying, but I don't think the optimization is the problem, nor should the compiler be required to execute every read and write of non-local variables. I'll use your example to try to make the distinction clear.

The optimization is good because the effect of the unsynchronized writes in the loop is exactly the same whether it is optimized or not. Even if that loop actually executed every write, no other goroutine would ever see any intermediate writes; they would see the final result. Here is what I think is not OK: we are saying that the final result (Count += n) may never be observed. Saying that is not the same as optimizing away individual writes. If you tell me it may never be observed, then as a programmer I really need to understand what is going on so I can avoid it.

My position has been that we need to either a) clearly and precisely document what "synchronizing events" are required to ensure a write will eventually be observed, or b) not say it may never be observed, or c) of course, as you said, just declare all communication by sharing in Go to be broken.

Here is my concrete and real situation (leaving aside the debate on why I do this): multiple goroutines are looking at the same []byte data, and one goroutine changes a byte to communicate with the other goroutines. I was told that the write may never be observed by other goroutines. After the "Whaaat??!", then reading the Memory Model over and over, I said, "Dang, it really does say that." I think maybe the Memory Model could mean all writes (including writes to shared variables) are cached in the goroutine's local memory, then either never flushed if there is no synchronizing event, or delayed for a very long time until the next synchronizing event occurs (whatever that is). Of course, I'm sure you can imagine that after lots of hardcore testing, I can't get this to break.
As far as I can tell, it does work as expected, but the Memory Model is explicitly telling me I can't trust it, which is bad and raises weird possibilities like the one above about caching. That brings me back to the idea of the two guarantees, which when applied to my concrete situation mean:
Is that making the disconnect more clear? |
I agree, and the existing Go memory model picks option a): it documents the exact set of synchronizing events that are required to ensure that a read will observe a write. (Modulo #5045, which we all agree should be addressed.) |
This is definitely true given the current memory model for reads/writes to a []byte. You previously asked about a similar, but somewhat different property:
I interpreted this as if there are two writes to
I'm not sure if this guarantee is enough for you, but it is the case that the compiler can't throw away writes unless it can prove that a goroutine will never again execute a synchronizing operation. If a synchronization event is possible somewhere in the goroutine's future, all writes must be preserved in case that synchronization makes all those writes visible to another goroutine. I.e., the compiler can only throw away writes in compile-time provable infinite loops (or at known goroutine exits). |
Keith, thank you so much for being clear on this. This is super helpful. I'm really sorry for asking so many questions. I'm on the outside (as a Go programmer, not a Go contributor), not knowing what is really happening under the covers. You clearly answered my question about the []byte writes.

I think you have also already answered a generalization of that, but I'd like to be sure I understood you. I assumed incorrectly from the context of some of the conversations that multi-writer was also safe. Dialing that back to single-writer, may I please confirm that, given a machine word of 16 bits or larger, if v = 0xaaaa and a single writer sets v = 0xbbbb, then concurrent readers will never see anything except either 0xaaaa or 0xbbbb?

On the second item, your last paragraph makes sense. Thank you again for taking the time to write it. My question may come down to when the compiler can throw away writes to unsynchronized shared variables "at known goroutine exits." Specifically, the memory model says the following code may never finish. Is that really true? And if so, what is going on? Like... is the write to done being thrown away, or is it the racey loop, or some other specific optimization?
|
Correct.
It is really true that the memory model says that program can run forever. This program will currently run forever:
Because the compiler does remove the write from inside the infinite loop. (The compiler can prove that the memory state with |
OK, that is very clear. Thank you. I understand that. I'm totally comfortable with someone just closing this out as a bad proposal for any number of reasons, but just to be unambiguous I will leave the proposal as follows:
I'm not suggesting any new guarantee for the second item. I just think those two sentences create a huge anti-guarantee without explaining it. It can be absolutely true while also being so broad that it creates problems because it is not clear. Sort of like the difference between saying "writes to variables may not happen" and saying "we may optimize away writes that are unobservable." Again, thanks to everyone for their patience. |
Would it be also possible to add a guarantee that it is safe (not a data race) for multiple writers to concurrently write the same value to the same location, even if these values may be more than one machine word, provided that there are no concurrent readers? |
@DemiMarie If I understand what you are saying, then, no, it would not be possible. It would be a significant performance loss if writers were required to ensure that a reader will always see a multi-word value as entirely written by either one writer or another. It would in effect require a lock around all reads and writes of multi-word values. |
@ianlancetaylor that is not what I meant 😄. The specific exemptions I am thinking of are as follows:
My reasoning is that in both cases, all possible orderings of the writes are safe, including the writes being interleaved. This is because if two writers are writing the same value, it does not matter who wins, and writing the same value as the one already in a location is a no-op. |
That would dramatically complicate the race detector for a case that is accurately described as performing duplicated or useless work. To take advantage of such a guarantee, you would need to know enough about the program that you could just not do that. |
I believe there are some algorithms where avoiding the duplicate work would require costly synchronization that would hamper overall system performance. I have never implemented such an algorithm, and would be fine with the race detector producing false positives for such an algorithm. |
Hi @DemiMarie, FWIW, I think avoiding false positives is an explicit goal of the race detector: |
@thepudds good point |
This proposal has been added to the active column of the proposals project |
Regarding #50590 (comment), we are not going to make those changes. They may seem innocent enough, but the implication is that the compiler would have to give up very important optimizations, all for the sake of racy programs. The better way is to use atomics (or channels or mutexes or ...) when you want to actually share something between goroutines. Or, as the saying goes, "Don't communicate by sharing memory, share memory by communicating." We already restrict the compiler quite a lot compared to C/C++, and I think justifiably so, to limit the possible damage done by races. But races are races. What you are proposing is to make them not races anymore, which would require much less efficient code in a variety of settings, slowing down the extant body of entirely-race-free Go code for no benefit to that code. It may be that at some point in the future we might want to add weak atomics - although I suspect not, the jury is still out on that. But if we did, we would do it by having special types, not by redefining the behavior of basic types and slowing down all programs that use them. |
@rsc It looks like you are referring back to my comment. I'm assuming you are referring to the first item. That is not intended to have any change to the compiler at all. It should only be a clarification to the memory model. The specific details are just above that post and in a few posts above that. The scope of the clarification is here:
|
We are going to add various clarifications as part of other efforts. I don't think we need to keep this issue open. But yes, there are no out-of-thin-air values, nor any shearing on machine-word-sized values. The text in go.dev/ref/mem that starts with "A read r of a variable v is allowed to observe a write w to v if both of the following hold:" enumerates the only conditions when a read can observe a write (has to be an earlier write). |
@rsc I can only submit my personal experience. For the two items listed, I could not find any documentation of the behavior, and after asking in the Go Forum I got confusing and actually wrong feedback as other people did not understand these issues either. |
Based on the discussion above, this proposal seems like a likely decline. |
No change in consensus, so declined. |
Russ Cox has written well on the need to include the concept of atomics in the Go Memory Model: here. This proposal assumes an understanding of atomics and an understanding of the statement of the Go Memory Model (as opposed to an understanding of the memory model's implementation).
My proposal is to define a certain set of native data types (maybe as few as bool, byte, and int) that are intrinsically and always safe for single-writer, multi-reader applications.
Specifically, if initially variable v = v1 and several routines are concurrently reading variable v while a single routine concurrently assigns v = v2, then it is guaranteed that no reader will see any value other than either v1 or v2 and that after the assignment is done, all routines will only see the value v2.
Importantly, this proposal makes no guarantee about serializing multiple concurrent writers, and as such it does not represent a complete implementation of atomic.Load* and atomic.Store*. While the implementation may ultimately require a true atomic store, that should be an implementation detail and a possible expansion of the definition of the type. I believe the benefit of this proposal is entirely realized just by defining the behavior of a single writer for certain types.
This does not change the expression of the language, and in fact guarantees that some simple, straightforward code will work as expected. For instance, it allows a variable running=true to be used by many routines containing a "for running { doStuff() }" block such that they can all be canceled by a single routine setting running=false.
Additionally, while this does not change the Go Memory Model Advice at all, this does enable some very powerful and simple (no atomic package) high-performance lock-free programming where the Advice is impractical.
By defining these types in the Memory Model document, it would also explicitly acknowledge that other types are not safe for single-writer multi-reader code.
While I suspect the cost of implementing byte, bool, and int is low and would cover a great number of use cases, someone would have to consider the cost of other primitive data types across platforms.
Specifically for each primitive type under consideration, what is the run time cost of implementation? We should probably weigh the cost for each platform against the popularity of the platform. For instance, (and I'm not saying this is true), but if bytes are already atomic on all platforms except s/390 we should consider the benefit of the implementation against any narrow run time cost on s/390.