proposal: sync/atomic: add more relaxed atomics #35639
Go's atomics are sequentially consistent, which requires memory fences or instructions with implicit memory fences, even on strongly ordered architectures like x86/amd64.
It would be nice to have some more relaxed atomics so that lock-free algorithms can be implemented efficiently in Go, because implementing them inefficiently defeats the purpose. Here's an example of a gopher discovering that porting C++ lock-free algorithms to Go results in poor performance (an order of magnitude slower, according to them). This is especially true on non-x86 architectures. I'm a bit of a lock-free junkie; my first Go program was a lock-free hashtable. So for me personally this has been a long-standing pet peeve with the language, and I'm volunteering my help if we can agree on a direction forward.
This is a fairly uncommon need, but when you do need more performance there are no good alternatives. You have to write the whole function in assembly, not just the load/store, for each architecture you want to optimize. It should be noted that the Go runtime itself requires and implements acquire loads and release stores, and uses them judiciously where more performance is required. This also means the feature could be implemented fairly easily by exporting those internal atomics from the sync/atomic package, with appropriate names and signatures.
There was an old feature request from fellow lock-free junkie @dvyukov asking for more relaxed atomics. I remember reading the thread but could not find it in the issue tracker. The Go maintainer balked at the idea of adding all of the atomics from C++, but was more open to a limited set that covers the majority of use cases. Nothing concrete came out of the discussion, though.
I think having Acquire versions of the loads and Release versions of the stores in sync/atomic would cover 95% of the use cases. I point to the Go runtime as an example to validate that claim, and also to underscore why it's useful. It would double the number of load/store functions in sync/atomic, which is reasonable in my opinion.
So to sum up, this is an advanced feature that is sometimes necessary with no good workarounds. It is already implemented and used by the Go runtime, so it can be implemented easily. Can this be my Christmas present to the Go community, and by extension, myself?