compiler_fence may emit machine code #62256

Open
Disasm opened this issue Jun 30, 2019 · 14 comments
Labels
A-atomic Area: Atomics, barriers, and sync primitives A-codegen Area: Code generation A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-heavy Issue: Problems and improvements with respect to binary size of generated code. O-riscv Target: RISC-V architecture T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@Disasm
Contributor

Disasm commented Jun 30, 2019

As discovered there, compiler_fence produces an atomic_fence(ordering, SingleThread) construct, which in turn can produce a non-empty code sequence. In fact, the LLVM backends for AVR, PowerPC, RISC-V and SPARC do not treat a SingleThread fence as anything special. At the same time, the Rust docs say that "compiler_fence does not emit any machine code".
Seems like Rust misuses this "SingleThread means CompilerBarrier" semantics, but I could be wrong.
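
For anyone who wants to reproduce this, a minimal sketch follows. The function name `store_then_fence` is hypothetical and only serves as something to look for in the output. Compiling it as a library with optimizations and `--emit asm` for the target in question should show whether the fence turns into an instruction; the result depends on the target and on the LLVM version in use.

```rust
use core::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical helper: writes `value`, issues a compiler-only fence,
/// then reads the slot back. In the generated assembly, any fence
/// instruction between the store and the load comes from the
/// single-thread fence that compiler_fence lowers to.
pub fn store_then_fence(slot: &mut u32, value: u32) -> u32 {
    *slot = value;
    // Restricts compiler reordering of memory accesses across this point.
    compiler_fence(Ordering::SeqCst);
    *slot
}
```

Functionally the fence changes nothing here; the only thing worth inspecting is the emitted code.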

@jonas-schievink jonas-schievink added A-codegen Area: Code generation C-bug Category: This is a bug. I-heavy Issue: Problems and improvements with respect to binary size of generated code. O-riscv Target: RISC-V architecture T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 30, 2019
@nagisa
Member

nagisa commented Jul 1, 2019

> Seems like Rust misuses this "SingleThread means CompilerBarrier" semantics

Yes, it does.

@jonas-schievink jonas-schievink added the A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools label Jul 1, 2019
@RalfJung
Member

So what is the right way in LLVM to emit a compiler barrier that definitely emits no machine code?

@nagisa
Member

nagisa commented Nov 27, 2020

clang++ emits the same LLVM instruction (fence syncscope("singlethread")) for std::atomic_signal_fence(std::memory_order_seq_cst) and has the same issue on at least the RISC-V target, despite the C++ documentation claiming the same no-instructions guarantee.

This is most likely a bug in the LLVM backends in question, rather than in Rust's lowering to LLVM. However, llvm_asm!("" ::: "memory") (or the equivalent in the new asm! syntax) is a plausible workaround for this bug.
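
A rough sketch of that workaround in the stabilized asm! syntax, assuming a target on which asm! is available. The wrapper name `asm_compiler_barrier` is made up for illustration. The new syntax has no explicit "memory" clobber; instead, an asm! block without the `nomem` or `readonly` options is assumed to possibly read and write memory, which is what gives it the barrier effect.

```rust
use core::arch::asm;

/// Hypothetical wrapper around an empty inline-assembly block.
/// Because neither `nomem` nor `readonly` is specified, the compiler
/// must assume the block may access memory, so it cannot move such
/// memory accesses across it. The template is empty, so the block
/// itself contributes no instructions.
#[inline(always)]
pub fn asm_compiler_barrier() {
    // SAFETY: the template is empty; nothing executes, the stack is not
    // touched, and flags are preserved.
    unsafe {
        asm!("", options(nostack, preserves_flags));
    }
}
```

Note that this only constrains memory the asm block could legally reach; locals whose address never escapes may still be optimized freely.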

@nagisa nagisa added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label Nov 27, 2020
@lenary
Contributor

lenary commented Nov 30, 2020

I'm working on this from the LLVM side. I'll try to take a generic approach that supports both RISC-V and other targets, but if that doesn't pan out, we'll be looking at a RISC-V only patch in LLVM 12.

@lenary
Contributor

lenary commented Dec 4, 2020

Aside: I'm not sure either Rust or C++ can provide or enforce the guarantee that no instructions should be emitted. On OoO cores with relaxed memory models it is almost certain that the compiler will need to insert an instruction to tell the core to ensure all stores/loads are completed before proceeding, even if you don't need to synchronize outside of the currently executing core.

While I think it is probably valid for RISC-V to not emit any instruction, the central semantic guarantee on the memory accesses is preserved by emitting the kind of fence instruction corresponding to the ordering requested, and hence I don't think the current RISC-V implementation is fundamentally incorrect, even if it is inefficient.

@nagisa
Member

nagisa commented Dec 4, 2020

compiler_fence's only purpose is to prevent reordering of reads and writes by the compiler itself. OoO execution at hardware level is entirely out of scope as far as this barrier is concerned.

But yes, if we were to ignore the documentation guarantee, it is not strictly incorrect from functional perspective for it to emit a more strict barrier.

@lenary
Contributor

lenary commented Dec 4, 2020

It's not clear to me that the C++ version excludes OoO execution, even though the Rust definition does, but I'm happy to follow GCC, which leaves out the fence for RISC-V when compiling atomic_signal_fence. I hope to have a patch ready today that introduces a cross-target way of doing this, rather than the ad-hoc, single-target implementations of compiler barriers that exist to date.

@thomcc
Member

thomcc commented Feb 18, 2021

Note that this line: https://github.com/llvm/llvm-project/blob/eb75f250feb6822d57be95e8535e28724cde6e9d/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp#L3998 (found by @Lymia) means it might lower into a call to __sync_synchronize() (on platforms without atomics, I think), which is annoying for no_std code.

I also would consider it to be "strictly incorrect from functional perspective", since there's no guarantee that function is linked in.

@workingjubilee
Member

workingjubilee commented Jul 19, 2022


It is my assertion that we never had such a contract from the codegen backend, so the contract we were forging with users was always erroneous, and that we should simply drop the promise of no machine code emission, even if we manage to persuade our backends to comply, as frustrating as that might be. It should be explained as the most minimal barrier possible, which should be zero machine instructions in all practical cases, not as a guaranteed zero-machine-instructions barrier.

@thomcc
Member

thomcc commented Jul 22, 2022

I think it's still wrong to emit a call to a function we don't provide via compiler-builtins on that target.

Admittedly, looking at compiler-builtins now, it's unclear why it's not provided, although ideally we wouldn't emit the call either way.

@workingjubilee
Member

Oh, I would absolutely agree that we shouldn't be linking in code on no_std platforms that we can't provide. I just think we should think of emitting a call to it, if we provided it, as... disappointing, a missed opt, not a violation of our compiler contract. Strengthening a barrier should remain a valid interpretation of a barrier if the codegen backend is ever in doubt.

@RalfJung
Member

RalfJung commented Sep 9, 2024

> Seems like Rust misuses this "SingleThread means CompilerBarrier" semantics, but I could be wrong.

This is not a misuse, this is exactly the intended semantics -- compiler_fence is our name for C++'s atomic_signal_fence, it is explicitly and only intended to synchronize with operations that are "on the same logical thread" (i.e. in a signal handler). Arguably our name is not great, but the latest nightly docs at least should clarify that.

That should AFAIK never need to emit any instructions, so indeed I would say this is an LLVM bug.
Cc @nikic
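
To illustrate the "same logical thread" use case, here is a minimal sketch of the pattern. The statics and the functions `prepare` and `handler_body` are hypothetical names, and no real signal handler is installed: `handler_body` stands in for code that would run in a handler interrupting the thread that called `prepare`.

```rust
use std::sync::atomic::{compiler_fence, AtomicBool, AtomicU32, Ordering};

// Hypothetical shared state between the main code and its handler.
static PAYLOAD: AtomicU32 = AtomicU32::new(0);
static READY: AtomicBool = AtomicBool::new(false);

/// Runs in the main flow of the thread.
pub fn prepare(value: u32) {
    PAYLOAD.store(value, Ordering::Relaxed);
    // Keeps the compiler from moving the PAYLOAD store below the
    // READY store; no cross-thread synchronization is intended.
    compiler_fence(Ordering::Release);
    READY.store(true, Ordering::Relaxed);
}

/// Stand-in for the body of a signal handler on the same thread.
pub fn handler_body() -> Option<u32> {
    if READY.load(Ordering::Relaxed) {
        // Pairs with the Release fence in `prepare`.
        compiler_fence(Ordering::Acquire);
        Some(PAYLOAD.load(Ordering::Relaxed))
    } else {
        None
    }
}
```

If another thread, rather than a handler on the same thread, were reading these statics, this would not be sufficient; that case calls for a real `fence` or stronger orderings on the atomic operations.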

@hanna-kruppe
Contributor

As one of the inbound links to this issue points out, it also affects PowerPC: https://godbolt.org/z/qdK8bT68a

It's possible that several LLVM targets have been overly conservative about this and none of them need to emit any code for fence syncscope("singlethread"). Or maybe that fence implies something stronger than needed for Rust's compiler_fence and C++'s atomic_signal_fence, and that's why it emits code. But it's still only "suboptimal codegen", and the promise about not generating machine code should still be removed or rephrased (it still exists in the current nightly docs) for the reasons discussed above.

8 participants