
Document some places where crossbeam tests the limits of the C++ memory model -- or crosses them #234

Merged
merged 4 commits into crossbeam-rs:master on Nov 20, 2018

Conversation

RalfJung
Contributor

Thanks to @jeehoonkang for some early feedback.

@ghost

ghost commented Nov 20, 2018

Thank you!

Now that I think of it, there should probably be an atomic::compiler_fence(Ordering::SeqCst) right after the compare_and_swap operation. Do you agree, and if so, can you add it?
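
For readers following along, a minimal sketch of the suggested shape, assuming a dummy atomic as the CAS target (the names here are illustrative, not crossbeam's actual code; compare_and_swap was the API of the day and has since been deprecated in favor of compare_exchange):

```rust
use std::sync::atomic::{compiler_fence, AtomicUsize, Ordering};

// Illustrative sketch: a SeqCst CAS compiles to LOCK CMPXCHG on x86,
// which is a full hardware barrier; the compiler fence after it then
// keeps the compiler itself from reordering memory accesses across
// this point.
fn fence_via_cas(dummy: &AtomicUsize) {
    let _ = dummy.compare_and_swap(0, 0, Ordering::SeqCst);
    compiler_fence(Ordering::SeqCst);
}
```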

@RalfJung
Contributor Author

TBH I have no idea; I don't know nearly enough about this code to answer your question. ^^

But my reading of the comment (above the part I added) is that this fence is deliberately omitted: it is not needed in x86 assembly, and the compiler is not smart enough to perform that optimization itself. Of course, that doesn't mean the compiler won't do something stupid here, but unfortunately SC accesses (mixed with other accesses to the same location) are among the least understood parts of the C++ concurrency model...

Let's see what @jeehoonkang has to say.

@jeehoonkang
Contributor

jeehoonkang commented Nov 20, 2018

Yeah, I think the motivation here is to avoid an MFENCE in the generated assembly while still achieving strong enough synchronization via LOCK CMPXCHG. I believe this code should really be written in inline assembly, rather than as C++ code that happens to compile to LOCK CMPXCHG, but unfortunately we cannot do that in stable Rust.

To summarize my stance, there are two hacks in play:

  • We're using x86 LOCK CMPXCHG instead of a C++ fence(SeqCst) to avoid the expensive MFENCE. Its correctness has not yet been theoretically investigated, but it seems to work. (Both shapes are sketched below.)
  • In fact, we're using a C++ update operation instead of x86 LOCK CMPXCHG directly, because inline assembly is currently nightly-only. To be snobbish, we should check that the generated assembly really does contain LOCK CMPXCHG.
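
To make the first hack concrete, here is a sketch of the two code shapes being compared, again assuming a dummy atomic (function and variable names are illustrative):

```rust
use std::sync::atomic::{fence, AtomicUsize, Ordering};

// The straightforward version: a SeqCst fence, which LLVM compiles to
// MFENCE on x86.
fn full_fence_plain() {
    fence(Ordering::SeqCst);
}

// The hack: a SeqCst read-modify-write on a dummy location, which
// compiles to LOCK CMPXCHG -- also a full barrier on x86, but cheaper
// than MFENCE on many microarchitectures.
fn full_fence_cas(dummy: &AtomicUsize) {
    let _ = dummy.compare_and_swap(0, 0, Ordering::SeqCst);
}
```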

@RalfJung I just realized this point. Sorry to make you work twice, but if you agree with my explanation, would you please document it in a comment?

@RalfJung
Contributor Author

The first part isn't a hack though, is it? It is quite clear from the x86/TSO model that a CAS has the effect of a fence.

So the only hack here is to pretend we are working in assembly while we are not -- to rely on how CAS is codegen'd, and to hope that either this is correct in the C++ model as well or else LLVM does not notice what we are doing here.

@ghost

ghost commented Nov 20, 2018

Exactly - we'd ideally use inline assembly here but can't on stable Rust. The CAS will compile into the desired lock cmpxchg instruction, and it is indeed an ugly hack. :)

I think it'd be a good idea to put a compiler fence after the CAS so that it becomes a real fence from the processor's standpoint and from the compiler's standpoint.
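
For contrast, a sketch of the two standpoints (illustrative only): fence orders memory at both the hardware and the compiler level, while compiler_fence emits no machine instructions and constrains only the compiler.

```rust
use std::sync::atomic::{compiler_fence, fence, Ordering};

// Restricts reordering by both the processor and the compiler; on x86
// this emits an MFENCE.
fn hardware_and_compiler_barrier() {
    fence(Ordering::SeqCst);
}

// Emits no machine code at all; it only forbids the compiler from
// moving memory accesses across this point. The hardware-level ordering
// must come from elsewhere -- here, from the preceding LOCK CMPXCHG.
fn compiler_only_barrier() {
    compiler_fence(Ordering::SeqCst);
}
```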

@RalfJung
Contributor Author

I extended the comment to mention inline assembly.

Do we have compiler fences on stable Rust...?

@ghost

ghost commented Nov 20, 2018
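
Yes, we do: std::sync::atomic::compiler_fence is available on stable Rust (stable since 1.21).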

@RalfJung
Contributor Author

RalfJung commented Nov 20, 2018

I added a compiler fence after the CAS.

@ghost

ghost commented Nov 20, 2018

@RalfJung I like how precise and cautious your comments are :)

bors r+

bors bot added a commit that referenced this pull request Nov 20, 2018
234: Document some places where crossbeam tests the limits of the C++ memory model -- or crosses them r=stjepang a=RalfJung

Thanks to @jeehoonkang for some early feedback.

Co-authored-by: Ralf Jung <[email protected]>
@bors
Contributor

bors bot commented Nov 20, 2018

Build succeeded

@bors bors bot merged commit d94e5ee into crossbeam-rs:master Nov 20, 2018