-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
How should we expose atomic load/store on targets that don't support full atomics #99668
Comments
Just wanted to add my $0.02, option two, "Remove support for atomic load/store on targets that don't support full atomics", would be a breaking change to (at least) one Tier 2 target: It would break a non-trivial number of embedded projects (there are lots of Cortex-M0+ cores out there, including the very popular RP2040), and might not be picked up by a crater run, as crate may not exercise specific targets during the run. |
Also just as a note, we might want to check in that It's likely their load/stores will also be affected by any change here. |
AFAIK |
Currently the core::ptr docs say:
One downside of requiring users on these platforms to use volatile ops instead is that (in Rust) volatile is a property of an access (so a static UnsafeCell/static mut could still be accessed with a non-volatile operation) while the Atomic types only have atomic operations (via AtomicU8, AtomicBool, etc), and I don't think there's any appetite for There is the atomic-polyfill crate which is quite popular in embedded Rust and provides CAS operations on thumbv6 etc while directly forwarding |
That is unsound on multi-core systems. (See tokio-rs/bytes#461 (comment) for an approach to handle such cases soundly.) |
I thought it used the critical-section crate, which requires users to provide a critical section implementation for their system - so on a multi-core system you need to provide something with the appropriate guarantees, and it won't compile if no implementation is provided. It doesn't simply disable all interrupts on the CM0 core to obtain the CAS locks. Edit: The currently-released critical-section 0.2.7 does by default provide a "disable all interrupts" critical section for thumbv* platforms, which is unsound on multi-core, though users on multi-core systems can still provide their own implementation which is sound. The next release of critical-section removes all the built-in implementations entirely, and it's expected most platform support crates will provide an implementation instead, which should be sound for their respective platform. In any event I don't think it gets in the way of having atomic-polyfill provide replacement Atomic* types that can do load/store on thumbv6 if need be. |
WG-prioritization assigning priority (Zulip discussion). @rustbot label -I-prioritize +P-critical +T-compiler |
Note that rust-embedded/critical-section@a3cdfd8 has not been released yet. The releases available on crates.io do not work this way. And even if it were released, atomic-polyfill is still unsound on multi-core systems because it mixes native atomic load/store and critical-section. |
FWIW, my portable-atomic crate implements the equivalent of the atomic types that was be removed in #99595 (by using inline assembly). It also provides equivalents on other platforms where atomic load/store is not provided (riscv without A-extension, msp430, avr). |
That seems in conflict with us telling people for years now that volatile operations are not atomic. And they aren't, so this strategy risks subtle miscompilations. |
If we implement this in the compiler (i.e., the first way that Amanieu mentioned), I think it is safe to use volatile load/store -- If there are tests to ensure that these operations are lowered properly by LLVM, we can guarantee compiler that abusing UB of using non-atomic volatile load/store as atomic load/store will never be shipped. |
Revert "Mark atomics as unsupported on thumbv6m" This is a breaking change for the `thumbv6m` target. See rust-lang#99668 for discussion on how we can proceed forward from here. This reverts commit 7514610. cc `@nikic`
If we do this after LLVM, yes. |
I think the ideal solution would be to add a new target feature to LLVM, which makes the promise that no atomic CAS will be used, and thus allow lowering atomic load/store without libcalls (I've commented to that effect here: https://reviews.llvm.org/D120026#3674333). That would restore the status quo. We should probably also document the special assumptions made by this target (or any other with "only load/store atomics"), which should boil down to:
|
Dropping regression label and downgrading priority as the change is reverted now. We still need a solution for the LLVM upgrade of course. Quoting Eli's response from https://reviews.llvm.org/D120026#3674604:
I think option one is what we want, and it seems like a pretty elegant solution. With a custom target spec one could even set I am kinda curious about the last question though -- what do people actually use pure load/store atomics for? |
I believe load/store atomics are mainly used for synchronization with an interrupt handler. Somewhat like |
I think almost all use cases are indeed synchronisation with interrupt contexts, but then again that's pretty much the only concurrency present in many bare-metal thumbv6-m systems. Usually multi-core Cortex-M0 systems (like RP2040) have to use some hardware-specific primitive for cross-core synchronisation, which is outside the scope of atomics. Many uses are probably in application firmware which doesn't tend to be as easy to find on GitHub, but there are a couple of examples in widely used libraries:
I expect another significant use case is that atomics provide a safe interior-mutable type for a static, so if you want to coordinate a bool flag or an int or something with an interrupt, you need to use a static, and an Atomic* type lets you easily load/store it without using unsafe or a mutex or anything else. Since all the word-sized operations are atomic anyway you could just do this with a regular static I'll see if I can find any other actual application examples too. |
If you're looking for examples: on the GBA (ARMv4T / ARM7TDMI) I do exactly what you're describing with UnsafeCell so that the interrupt handler can communicate with the main program (gba_cell.rs). Currently this is technically unsound to mix with non-volatile accesses (though in practice there's never been an issue). |
load/store appears to still not work with ARMv4T, though it should See #101300 |
T-compiler reviewed this as part of 2022 Q3 P-high review What is current situation here? It seems like there are still unresolved questions about what we want to tell people to do; but also, I see that @nikic added target changes on the LLVM side for ARM and RISC-V: https://reviews.llvm.org/D130480 and https://reviews.llvm.org/D130621 |
We already use Once we update to LLVM 16, we can consider using I think at this point in time, the only thing that can be done here is adding upstream LLVM support for |
#101300 is still open. The TLDR of that issue is that, at least on some targets, the fake atomic access has extremely bad performance, to the point that it might drive people to use inline assembly instead. |
@nikic I have no bandwidth at this time to add LLVM support for
This is still true. |
powerpc (32-bit) is also affected. |
Discussed at 2023 Q1 P-high review My current understanding is that there is no one objecting to adding support to +forced-atomics for the targets that need it, but no one is taking up the task of actually doing so for all such targets. I'm not clear on who would be best to take ownership of that, but maybe that doesn't matter yet; the first order of business is to collectively agree that is our strategy going forward. @Amanieu @nikic am I right in inferring that you're both aligned on the plan of leveraging |
We have I believe the only real concern is that it makes Rust's atomics ABI-incompatible with C's atomics (unless the C code also uses |
I've been thinking about this; does anyone have a concrete example of where calling C code using libatomic from Rust code using Rust atomics is unsound/memory-unsafe (single core and multi core case )? Or is it more a "precaution that interop will subtly break"? I can't visualize an example after thinking about it, because I already can't think of a way to share Rust atomics and use as if they were C |
FWIW, I'm planning to find some spare time to address some atomic weirdness on MSP430 in LLVM. |
Set max_atomic_width for AVR to 16 This is currently set to 0 https://github.com/rust-lang/rust/blob/90f0b24ad3e7fc0dc0e419c9da30d74629cd5736/compiler/rustc_target/src/spec/avr_gnu_base.rs#L26-L27 However, LLVM supports {8,16}-bit atomic load/store on AVR (support for RMW is still quite incomplete and only partially supported). https://github.com/llvm/llvm-project/blob/llvmorg-15.0.0/llvm/test/CodeGen/AVR/atomics/load8.ll#L5-L13 https://github.com/llvm/llvm-project/blob/llvmorg-15.0.0/llvm/test/CodeGen/AVR/atomics/load16.ll#L3-L12 https://github.com/llvm/llvm-project/blob/llvmorg-15.0.0/llvm/test/CodeGen/AVR/atomics/store.ll#L3-L22 cc rust-lang#99668 r? `@Amanieu`
Visited during the compiler team's P-high review. Based on the most recent discussion points, we believe this can relabeled P-medium as |
Some embedded targets (
thumbv6m
,riscv32i
) don't support general atomic operations (compare_exchange
,fetch_add
, etc) but do support atomicload
andstore
operations. We previously supported this onthumbv6m
(but notriscv32i
, see #98333) bycfg
ing out most of the methods onAtomic*
except forload
andstore
. However recent changes to LLVM (#99595) cause it to emit calls to libatomic forload
andstore
(it already does this for the other atomic operations).It seems that LLVM's support for lowering atomic loads and stores on ARM was an accident that is being reverted. However the reason for this revert is that the directly lowered load/store cannot interoperate correctly with the other atomic operations which are lowered to libcalls. This concern doesn't apply to Rust since we don't expose emulated (read: not lock free) atomics, so there is no interoperability concern (except maybe for FFI with C that uses atomics?).
I see 2 ways we can move forward:
The text was updated successfully, but these errors were encountered: