How should we expose atomic load/store on targets that don't support full atomics #99668

Amanieu · 2022-07-24T12:07:27Z

Some embedded targets (thumbv6m, riscv32i) don't support general atomic operations (compare_exchange, fetch_add, etc) but do support atomic load and store operations. We previously supported this on thumbv6m (but not riscv32i, see #98333) by cfging out most of the methods on Atomic* except for load and store. However recent changes to LLVM (#99595) cause it to emit calls to libatomic for load and store (it already does this for the other atomic operations).

It seems that LLVM's support for lowering atomic loads and stores on ARM was an accident that is being reverted. However the reason for this revert is that the directly lowered load/store cannot interoperate correctly with the other atomic operations which are lowered to libcalls. This concern doesn't apply to Rust since we don't expose emulated (read: not lock free) atomics, so there is no interoperability concern (except maybe for FFI with C that uses atomics?).

I see 2 ways we can move forward:

Continue supporting atomic load/store on such targets by lowering them in rustc to a volatile load/store. This will avoid breakage in the embeded Rust ecosystem.
Remove support for atomic load/store on targets that don't support full atomics. Users are expected to switch to using volatile load/store instead.

The text was updated successfully, but these errors were encountered:

jamesmunns · 2022-07-24T12:15:16Z

Just wanted to add my $0.02, option two, "Remove support for atomic load/store on targets that don't support full atomics", would be a breaking change to (at least) one Tier 2 target: thumbv6m-none-eabi.

It would break a non-trivial number of embedded projects (there are lots of Cortex-M0+ cores out there, including the very popular RP2040), and might not be picked up by a crater run, as crate may not exercise specific targets during the run.

jamesmunns · 2022-07-24T12:16:58Z

Also just as a note, we might want to check in that armv5te still builds with these relevant changes. It has "native" load/stores, but no CAS operations. These are provided by an OS-level polyfill for the linux target.

It's likely their load/stores will also be affected by any change here.

Amanieu · 2022-07-24T12:19:45Z

AFAIK armv5te should not be affected by this change, but we should double check with the upgrade to LLVM 15.

adamgreig · 2022-07-24T12:20:03Z

Currently the core::ptr docs say:

All accesses performed by functions in this module are non-atomic in the sense of atomic operations used to synchronize between threads. This means it is undefined behavior to perform two concurrent accesses to the same location from different threads unless both accesses only read from memory. Notice that this explicitly includes read_volatile and write_volatile: Volatile accesses cannot be used for inter-thread synchronization.

One downside of requiring users on these platforms to use volatile ops instead is that (in Rust) volatile is a property of an access (so a static UnsafeCell/static mut could still be accessed with a non-volatile operation) while the Atomic types only have atomic operations (via AtomicU8, AtomicBool, etc), and I don't think there's any appetite for VolatileU32 or similar.

There is the atomic-polyfill crate which is quite popular in embedded Rust and provides CAS operations on thumbv6 etc while directly forwarding core::sync::atomic on other platforms. If option 2 was chosen, probably this crate could help smooth out a transition.

taiki-e · 2022-07-24T12:25:02Z

the atomic-polyfill crate

That is unsound on multi-core systems. (See tokio-rs/bytes#461 (comment) for an approach to handle such cases soundly.)

adamgreig · 2022-07-24T12:27:26Z

That is unsound on multi-core systems.

I thought it used the critical-section crate, which requires users to provide a critical section implementation for their system - so on a multi-core system you need to provide something with the appropriate guarantees, and it won't compile if no implementation is provided. It doesn't simply disable all interrupts on the CM0 core to obtain the CAS locks.

Edit: The currently-released critical-section 0.2.7 does by default provide a "disable all interrupts" critical section for thumbv* platforms, which is unsound on multi-core, though users on multi-core systems can still provide their own implementation which is sound. The next release of critical-section removes all the built-in implementations entirely, and it's expected most platform support crates will provide an implementation instead, which should be sound for their respective platform. In any event I don't think it gets in the way of having atomic-polyfill provide replacement Atomic* types that can do load/store on thumbv6 if need be.

apiraino · 2022-07-24T12:36:07Z

WG-prioritization assigning priority (Zulip discussion).

@rustbot label -I-prioritize +P-critical +T-compiler

taiki-e · 2022-07-24T12:41:14Z

requires users to provide a critical section implementation for their system - so on a multi-core system you need to provide something with the appropriate guarantees, and it won't compile if no implementation is provided

Note that rust-embedded/critical-section@a3cdfd8 has not been released yet. The releases available on crates.io do not work this way.

And even if it were released, atomic-polyfill is still unsound on multi-core systems because it mixes native atomic load/store and critical-section.

taiki-e · 2022-07-24T12:58:59Z

FWIW, my portable-atomic crate implements the equivalent of the atomic types that was be removed in #99595 (by using inline assembly). It also provides equivalents on other platforms where atomic load/store is not provided (riscv without A-extension, msp430, avr).

RalfJung · 2022-07-24T13:31:31Z

Remove support for atomic load/store on targets that don't support full atomics. Users are expected to switch to using volatile load/store instead.

That seems in conflict with us telling people for years now that volatile operations are not atomic. And they aren't, so this strategy risks subtle miscompilations.

taiki-e · 2022-07-24T13:37:26Z

using volatile load/store

If we implement this in the compiler (i.e., the first way that Amanieu mentioned), I think it is safe to use volatile load/store -- If there are tests to ensure that these operations are lowered properly by LLVM, we can guarantee compiler that abusing UB of using non-atomic volatile load/store as atomic load/store will never be shipped.

Revert "Mark atomics as unsupported on thumbv6m" This is a breaking change for the `thumbv6m` target. See rust-lang#99668 for discussion on how we can proceed forward from here. This reverts commit 7514610. cc `@nikic`

RalfJung · 2022-07-24T13:50:19Z

If we implement this in the compiler (i.e., the first way that Amanieu mentioned), I think it is safe to use volatile load/store

If we do this after LLVM, yes.
But if we tell LLVM that these are (implicitly non-atomic) volatile accesses, it might optimize accordingly. Checking how LLVM lowers them is not enough, it does not take into account the effect of optimizations. So at that point inline assembly would be the safer choice.

nikic · 2022-07-24T14:00:23Z

I think the ideal solution would be to add a new target feature to LLVM, which makes the promise that no atomic CAS will be used, and thus allow lowering atomic load/store without libcalls (I've commented to that effect here: https://reviews.llvm.org/D120026#3674333). That would restore the status quo.

We should probably also document the special assumptions made by this target (or any other with "only load/store atomics"), which should boil down to:

Linking against non-Rust code using atomics is not supported (and may be silently unsound).
Atomic CAS can only be safely emulated on single-core systems.
To support atomic CAS on multi-core systems, a custom target spec that disables all atomics should be used and all atomic operations should go through a polyfill (with a shared lock) instead.

nikic · 2022-07-24T21:55:29Z

Dropping regression label and downgrading priority as the change is reverted now. We still need a solution for the LLVM upgrade of course.

Quoting Eli's response from https://reviews.llvm.org/D120026#3674604:

I see two possible approaches to restore the functionality you want.

One, you could add a target feature that says "I have 32-bit (or 64-bit) atomics". That would override the normal rule for setMaxAtomicSizeInBitsSupported. The target environment would be expected to provide whatever _sync* functions are necessary. If you don't actually use any atomic operations that require CAS, you won't see any calls, so everything works even if the functions don't actually exist. (Or it's possible to write an operating system that implements lock-free CAS on Cortex-M0, the same way the Linux kernel implements atomics on old ARM cores.)

Two, you could relax the constraint that atomic load/store are required to be atomic with respect to cmpxchg. We can add a value to syncscope to represent this. This would allow mixing code that uses load-store atomics with code that requires "real" atomics.

That said, I'm not sure how you use load-store atomics in practice. I mean, I guess you can use Dekker's alogorithm, or some kinds of lock-free queues, but that seems a bit exotic. Or do you use some sort of lock implemented by disabling interrupts?

I think option one is what we want, and it seems like a pretty elegant solution. With a custom target spec one could even set atomic_cas: true, as long as one provides the necessary __sync libcalls.

I am kinda curious about the last question though -- what do people actually use pure load/store atomics for?

Amanieu · 2022-07-24T23:34:25Z

I believe load/store atomics are mainly used for synchronization with an interrupt handler. Somewhat like volatile sig_atomic_t in C, hence my suggestion to lower it to volatile load/store.

adamgreig · 2022-07-25T00:32:59Z

I think almost all use cases are indeed synchronisation with interrupt contexts, but then again that's pretty much the only concurrency present in many bare-metal thumbv6-m systems. Usually multi-core Cortex-M0 systems (like RP2040) have to use some hardware-specific primitive for cross-core synchronisation, which is outside the scope of atomics.

Many uses are probably in application firmware which doesn't tend to be as easy to find on GitHub, but there are a couple of examples in widely used libraries:

heapless' SPSC queue is a lock-free queue using just atomic load/store (no CAS): https://github.com/japaric/heapless/blob/master/src/spsc.rs
bbqueue has a separate implementation for cortex-m0 that uses a disable-all-interrupts critical section (notably only sound on single-core) alongside atomic load/store operations to emulate CAS

I expect another significant use case is that atomics provide a safe interior-mutable type for a static, so if you want to coordinate a bool flag or an int or something with an interrupt, you need to use a static, and an Atomic* type lets you easily load/store it without using unsafe or a mutex or anything else. Since all the word-sized operations are atomic anyway you could just do this with a regular static u32, but that would need a static mut or an UnsafeCell etc. This is often in a context where it's not incorrect to only use load/store, for example a timer interrupt incrementing a timestamp which many other contexts read from (only one writer, many readers).

I'll see if I can find any other actual application examples too.

Lokathor · 2022-07-25T00:52:51Z

If you're looking for examples: on the GBA (ARMv4T / ARM7TDMI) I do exactly what you're describing with UnsafeCell so that the interrupt handler can communicate with the main program (gba_cell.rs). Currently this is technically unsound to mix with non-volatile accesses (though in practice there's never been an issue).

Lokathor · 2022-09-03T20:38:52Z

load/store appears to still not work with ARMv4T, though it should

See #101300

pnkfelix · 2022-10-28T14:39:29Z

T-compiler reviewed this as part of 2022 Q3 P-high review

What is current situation here? It seems like there are still unresolved questions about what we want to tell people to do;

but also, I see that @nikic added target changes on the LLVM side for ARM and RISC-V: https://reviews.llvm.org/D130480 and https://reviews.llvm.org/D130621
do we need to be leveraging the +atomic-32 and +forced-atomics features in some fashion within the rustc interface to LLVM in order to make progress here?

nikic · 2022-10-28T14:51:59Z

We already use +atomic-32 for relevant ARM/Thumb targets.

Once we update to LLVM 16, we can consider using +forced-atomics on RISC-V (it would allow landing the change from #98333).

I think at this point in time, the only thing that can be done here is adding upstream LLVM support for +forced-atomics to more targets, like MSP430 and AVR mentioned above.

Lokathor · 2022-10-28T15:02:22Z

#101300 is still open. The TLDR of that issue is that, at least on some targets, the fake atomic access has extremely bad performance, to the point that it might drive people to use inline assembly instead.

cr1901 · 2022-10-28T20:50:37Z

@nikic I have no bandwidth at this time to add LLVM support for +forced-atomics or lowering atomics for MSP430, but I'm not against the idea and neither is the maintainer last I checked.

If this is only simple thing that just drops some bits and lowers atomic stuff to normal loads / stores, then probably it could be fine

Right now, MSP430 doesn't lower atomic operations correctly in LLVM (long story)

This is still true.

pkubaj · 2022-11-09T11:18:00Z

powerpc (32-bit) is also affected.

pnkfelix · 2023-04-14T14:52:32Z

Discussed at 2023 Q1 P-high review

My current understanding is that there is no one objecting to adding support to +forced-atomics for the targets that need it, but no one is taking up the task of actually doing so for all such targets.

I'm not clear on who would be best to take ownership of that, but maybe that doesn't matter yet; the first order of business is to collectively agree that is our strategy going forward.

@Amanieu @nikic am I right in inferring that you're both aligned on the plan of leveraging +forced-atomics where we need it, and the main work items here are adding support for it upstream with LLVM? (And I guess maybe there are some efficiency issues with the current support, as flagged e.g. in #101300 ?)

nikic · 2023-04-14T15:11:20Z

We have +forced-atomics support for RISCV in LLVM 16, so we could give adding that to our target specs (and enabling atomic load/store) a try.

I believe the only real concern is that it makes Rust's atomics ABI-incompatible with C's atomics (unless the C code also uses +forced-atomics), but I think people value having atomics in Rust at all much more than being able to work with atomics across an ABI boundary.

cr1901 · 2023-04-14T16:21:25Z

I believe the only real concern is that it makes Rust's atomics ABI-incompatible with C's atomics (unless the C code also uses +forced-atomics)

I've been thinking about this; does anyone have a concrete example of where calling C code using libatomic from Rust code using Rust atomics is unsound/memory-unsafe (single core and multi core case )? Or is it more a "precaution that interop will subtly break"?

I can't visualize an example after thinking about it, because I already can't think of a way to share Rust atomics and use as if they were C _Atomic_s; Rust uses repr(C) on a newtype, and C _Atomics are not guaranteed to be the same size as the underlying type (do gcc and clang accommodate this?).

asl · 2023-04-23T22:45:40Z

We have +forced-atomics support for RISCV in LLVM 16, so we could give adding that to our target specs (and enabling atomic load/store) a try.

FWIW, I'm planning to find some spare time to address some atomic weirdness on MSP430 in LLVM.

Set max_atomic_width for AVR to 16 This is currently set to 0 https://github.com/rust-lang/rust/blob/90f0b24ad3e7fc0dc0e419c9da30d74629cd5736/compiler/rustc_target/src/spec/avr_gnu_base.rs#L26-L27 However, LLVM supports {8,16}-bit atomic load/store on AVR (support for RMW is still quite incomplete and only partially supported). https://github.com/llvm/llvm-project/blob/llvmorg-15.0.0/llvm/test/CodeGen/AVR/atomics/load8.ll#L5-L13 https://github.com/llvm/llvm-project/blob/llvmorg-15.0.0/llvm/test/CodeGen/AVR/atomics/load16.ll#L3-L12 https://github.com/llvm/llvm-project/blob/llvmorg-15.0.0/llvm/test/CodeGen/AVR/atomics/store.ll#L3-L22 cc rust-lang#99668 r? `@Amanieu`

wesleywiser · 2023-11-03T14:49:51Z

Visited during the compiler team's P-high review. Based on the most recent discussion points, we believe this can relabeled P-medium as +forced-atomic support has landed in LLVM for all relevant Tier 1 or Tier 2 targets and interoperability of passing pointers to atomic values between C and Rust in this context does not seem to be commonly used.

Amanieu mentioned this issue Jul 24, 2022

Re-enable atomic loads and stores for all RISC-V targets #98333

Merged

Amanieu added the regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. label Jul 24, 2022

rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Jul 24, 2022

This was referenced Jul 24, 2022

Mark atomics as unsupported on thumbv6m #99595

Merged

Revert "Mark atomics as unsupported on thumbv6m" #99670

Merged

rustbot added P-critical Critical priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Jul 24, 2022

Mark-Simulacrum added this to the 1.64.0 milestone Jul 24, 2022

nikic added P-high High priority and removed P-critical Critical priority regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. labels Jul 24, 2022

taiki-e mentioned this issue Sep 17, 2022

Can atomic-polyfill and portable-atomic soundly coexist in the same dependency graph? taiki-e/portable-atomic#39

Closed

hudson-ayers mentioned this issue Oct 5, 2022

linkage error when compiling esp32-c3-devkitM-1 kernel tock/tock#3266

Closed

TheButlah mentioned this issue Nov 20, 2022

Use atomic_polyfill on targets without atomics knurling-rs/defmt#707

Closed

phil-opp mentioned this issue Feb 22, 2023

Can't use the crate for the target riscv32i-unknown-none-elf rust-osdev/linked-list-allocator#76

Open

This was referenced Feb 26, 2023

Embassy HAL implementation for Espressif chips embassy-rs/embassy#745

Closed

Use atomics using atomic-polyfill embassy-rs/embassy#1240

Closed

Use atomic polyfill for targets missing atomic support probe-rs/rtt-target#21

Closed

Dirbaio mentioned this issue Mar 14, 2023

Use atomic polyfill for core::sync::atomic. embassy-rs/embassy#1278

Closed

This was referenced Apr 14, 2023

Executor: Replace unnecessary atomics in runqueue embassy-rs/embassy#1361

Merged

fix: embassy-executor - use atomic_polyfill for riscv & xtensa embassy-rs/embassy#1381

Closed

Dirbaio mentioned this issue Apr 24, 2023

Support architectures without atomics with polyfill rust-embedded-community/usb-device#115

Merged

taiki-e mentioned this issue Aug 5, 2023

Set max_atomic_width for AVR to 16 #114495

Merged

taiki-e mentioned this issue Sep 24, 2023

document when atomic loads are guaranteed read-only #115577

Merged

wesleywiser added P-medium Medium priority and removed P-high High priority labels Nov 3, 2023

michaelkamprath mentioned this issue Jan 2, 2024

Unable to compile in no_std project fereidani/rclite#7

Open

ProfFan mentioned this issue Aug 30, 2024

Atomic operations in PSRAM (esp. with 3rd party libraries) esp-rs/esp-hal#2027

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How should we expose atomic load/store on targets that don't support full atomics #99668

How should we expose atomic load/store on targets that don't support full atomics #99668

Amanieu commented Jul 24, 2022

jamesmunns commented Jul 24, 2022

jamesmunns commented Jul 24, 2022

Amanieu commented Jul 24, 2022

adamgreig commented Jul 24, 2022

taiki-e commented Jul 24, 2022

adamgreig commented Jul 24, 2022 •

edited

Loading

apiraino commented Jul 24, 2022

taiki-e commented Jul 24, 2022

taiki-e commented Jul 24, 2022 •

edited

Loading

RalfJung commented Jul 24, 2022

taiki-e commented Jul 24, 2022

RalfJung commented Jul 24, 2022 •

edited

Loading

nikic commented Jul 24, 2022

nikic commented Jul 24, 2022

Amanieu commented Jul 24, 2022

adamgreig commented Jul 25, 2022

Lokathor commented Jul 25, 2022

Lokathor commented Sep 3, 2022 •

edited

Loading

pnkfelix commented Oct 28, 2022

nikic commented Oct 28, 2022

Lokathor commented Oct 28, 2022 •

edited

Loading

cr1901 commented Oct 28, 2022

pkubaj commented Nov 9, 2022

pnkfelix commented Apr 14, 2023

nikic commented Apr 14, 2023

cr1901 commented Apr 14, 2023 •

edited

Loading

asl commented Apr 23, 2023 •

edited

Loading

wesleywiser commented Nov 3, 2023

How should we expose atomic load/store on targets that don't support full atomics #99668

How should we expose atomic load/store on targets that don't support full atomics #99668

Comments

Amanieu commented Jul 24, 2022

jamesmunns commented Jul 24, 2022

jamesmunns commented Jul 24, 2022

Amanieu commented Jul 24, 2022

adamgreig commented Jul 24, 2022

taiki-e commented Jul 24, 2022

adamgreig commented Jul 24, 2022 • edited Loading

apiraino commented Jul 24, 2022

taiki-e commented Jul 24, 2022

taiki-e commented Jul 24, 2022 • edited Loading

RalfJung commented Jul 24, 2022

taiki-e commented Jul 24, 2022

RalfJung commented Jul 24, 2022 • edited Loading

nikic commented Jul 24, 2022

nikic commented Jul 24, 2022

Amanieu commented Jul 24, 2022

adamgreig commented Jul 25, 2022

Lokathor commented Jul 25, 2022

Lokathor commented Sep 3, 2022 • edited Loading

pnkfelix commented Oct 28, 2022

nikic commented Oct 28, 2022

Lokathor commented Oct 28, 2022 • edited Loading

cr1901 commented Oct 28, 2022

pkubaj commented Nov 9, 2022

pnkfelix commented Apr 14, 2023

nikic commented Apr 14, 2023

cr1901 commented Apr 14, 2023 • edited Loading

asl commented Apr 23, 2023 • edited Loading

wesleywiser commented Nov 3, 2023

adamgreig commented Jul 24, 2022 •

edited

Loading

taiki-e commented Jul 24, 2022 •

edited

Loading

RalfJung commented Jul 24, 2022 •

edited

Loading

Lokathor commented Sep 3, 2022 •

edited

Loading

Lokathor commented Oct 28, 2022 •

edited

Loading

cr1901 commented Apr 14, 2023 •

edited

Loading

asl commented Apr 23, 2023 •

edited

Loading