
Document some places where crossbeam tests the limits of the C++ memory model -- or crosses them #234

Merged
merged 4 commits into crossbeam-rs:master on Nov 20, 2018

Conversation

RalfJung
Contributor

Thanks to @jeehoonkang for some early feedback.

@ghost

ghost commented Nov 20, 2018

Thank you!

Now that I think of it, there should probably be an atomic::compiler_fence(Ordering::SeqCst) right after the compare_and_swap operation. Do you agree, and if so, can you add it?
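
For readers following along, a minimal sketch of the suggested shape, assuming a dummy atomic as the CAS target (the names here are illustrative, not crossbeam's actual code; compare_and_swap was the API of the day and has since been deprecated in favor of compare_exchange):

```rust
use std::sync::atomic::{compiler_fence, AtomicUsize, Ordering};

// Illustrative sketch: a SeqCst CAS compiles to LOCK CMPXCHG on x86,
// which is a full hardware barrier; the compiler fence after it then
// keeps the compiler itself from reordering memory accesses across
// this point.
fn fence_via_cas(dummy: &AtomicUsize) {
    let _ = dummy.compare_and_swap(0, 0, Ordering::SeqCst);
    compiler_fence(Ordering::SeqCst);
}
```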

@RalfJung
Contributor Author

TBH I have no idea; I don't know nearly enough about this code to answer your question. ^^

But my reading of the comment (above the part I added) is that this fence is deliberately omitted: it is not needed in x86 assembly, and the compiler is not smart enough to perform that optimization itself. Of course, that doesn't mean the compiler won't do something stupid here, but unfortunately SC accesses (mixed with other accesses to the same location) are among the least understood parts of the C++ concurrency model...

Let's see what @jeehoonkang has to say.

@jeehoonkang
Contributor

jeehoonkang commented Nov 20, 2018

Yeah, I think the motivation here is to avoid an MFENCE in the generated assembly while still achieving strong enough synchronization via LOCK CMPXCHG. I believe this code should really be written in inline assembly, rather than as C++ code that happens to compile to LOCK CMPXCHG, but unfortunately we cannot do that in stable Rust.

To summarize my stance, there are two hacks in play:

  • We're using x86 LOCK CMPXCHG instead of a C++ fence(SeqCst) to avoid the expensive MFENCE. Its correctness has not yet been theoretically investigated, but it seems to work. (Both shapes are sketched below.)
  • In fact, we're using a C++ update operation instead of x86 LOCK CMPXCHG directly, because inline assembly is currently nightly-only. To be snobbish, we should check that the generated assembly really does contain LOCK CMPXCHG.
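
To make the first hack concrete, here is a sketch of the two code shapes being compared, again assuming a dummy atomic (function and variable names are illustrative):

```rust
use std::sync::atomic::{fence, AtomicUsize, Ordering};

// The straightforward version: a SeqCst fence, which LLVM compiles to
// MFENCE on x86.
fn full_fence_plain() {
    fence(Ordering::SeqCst);
}

// The hack: a SeqCst read-modify-write on a dummy location, which
// compiles to LOCK CMPXCHG -- also a full barrier on x86, but cheaper
// than MFENCE on many microarchitectures.
fn full_fence_cas(dummy: &AtomicUsize) {
    let _ = dummy.compare_and_swap(0, 0, Ordering::SeqCst);
}
```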

@RalfJung I just realized this point. Sorry to make you work twice, but if you agree with my explanation, would you please document it in a comment?

@RalfJung
Contributor Author

The first part isn't a hack though, is it? It is quite clear from the x86/TSO model that a CAS has the effect of a fence.

So the only hack here is to pretend we are working in assembly while we are not -- to rely on how CAS is codegen'd, and to hope that either this is correct in the C++ model as well or else LLVM does not notice what we are doing here.

@ghost

ghost commented Nov 20, 2018

Exactly - we'd ideally use inline assembly here but can't on stable Rust. The CAS will compile into the desired lock cmpxchg instruction, and it is indeed an ugly hack. :)

I think it'd be a good idea to put a compiler fence after the CAS so that it becomes a real fence from the processor's standpoint and from the compiler's standpoint.
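
For contrast, a sketch of the two standpoints (illustrative only): fence orders memory at both the hardware and the compiler level, while compiler_fence emits no machine instructions and constrains only the compiler.

```rust
use std::sync::atomic::{compiler_fence, fence, Ordering};

// Restricts reordering by both the processor and the compiler; on x86
// this emits an MFENCE.
fn hardware_and_compiler_barrier() {
    fence(Ordering::SeqCst);
}

// Emits no machine code at all; it only forbids the compiler from
// moving memory accesses across this point. The hardware-level ordering
// must come from elsewhere -- here, from the preceding LOCK CMPXCHG.
fn compiler_only_barrier() {
    compiler_fence(Ordering::SeqCst);
}
```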

@RalfJung
Contributor Author

I extended the comment to mention inline assembly.

Do we have compiler fences on stable Rust...?

@ghost

ghost commented Nov 20, 2018
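
Yes, we do: std::sync::atomic::compiler_fence is available on stable Rust (stable since 1.21).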

@RalfJung
Contributor Author

RalfJung commented Nov 20, 2018

I added a compiler fence after the CAS.

@ghost

ghost commented Nov 20, 2018

@RalfJung I like how precise and cautious your comments are :)

bors r+

bors bot added a commit that referenced this pull request Nov 20, 2018
234: Document some places where crossbeam tests the limits of the C++ memory model -- or crosses them r=stjepang a=RalfJung

Thanks to @jeehoonkang for some early feedback.

Co-authored-by: Ralf Jung <[email protected]>
@bors
Contributor

bors bot commented Nov 20, 2018

Build succeeded

@bors bors bot merged commit d94e5ee into crossbeam-rs:master Nov 20, 2018