- Relevant components:
svd2rust
- Feature Name: deferenceable_volatile
- Start Date: 2019-10-23
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)
The current implementation of the peripheral API generated by svd2rust
is
affected by bug rust-lang/rust#55005. The (hypothetical) observable effect of
this bug on the svd2rust
API is that LLVM may insert spurious reads to MMIO
registers in optimized builds. These spurious reads in turn may cause some bits
of the register to be cleared (e.g. interrupt flags) or flipped, changing the
intended behavior of the program.
The way to prevent this rustc/LLVM bug is to never create a reference (&-
or &mut-
) to a MMIO register, or in general to memory that is meant to be
accessed only through volatile operations. This document explores alternative
implementations of the svd2rust
API to avoid this bug.
To visualize the problem consider an hypothetical device with 2 similar USART
peripherals. Each USART peripheral has two registers: a 32-bit control register
(CR
) and a 16-bit data register (DR
); these registers are laid out next to
each other in memory; the CR
registers appears first in memory (has the lower
address).
In the current implementation these peripherals are represented using the following types:
pub struct USART1 { _not_sync: PhantomData<*mut ()> }
pub struct USART2 { _not_sync: PhantomData<*mut ()> }
impl Deref for USART1 { // repeat for USART2
type Target = usart::RegisterBlock;
fn deref(&self) -> &usart::RegisterBlock {
unsafe { &*(0x4000_0000 as *const usart::RegisterBlock) }
}
}
// repeat for USART2 with address `0x4001_0000`
mod usart {
#[repr(C)]
pub struct RegisterBlock {
pub CR: CR, // newtype over `VolatileCell<u32>`
pub DR: DR, // newtype over `VolatileCell<u16>`
}
}
The VolatileCell<T>
is a newtype over UnsafeCell<T>
with the following API:
struct VolatileCell<T> { inner: UnsafeCell<T> }
impl<T> VolatileCell<T> {
unsafe fn read(&self) -> T { // it also has a `write` method
ptr::read_volatile(self.inner.get())
}
}
The UnsafeCell::get(&self)
method (implicitly) creates a shared reference to
an UnsafeCell<T>
and thus runs into rust-lang/rust#55005. Because
VolatileCell::{read,write}
call UnsafeCell::get
they are also affected by
the bug.
Code like &usart1.cr
(where usart1: USART1
) may also trigger the bug because
it creates a shared reference to CR
, which is a newtype over
UnsafeCell<u32>
.
The main problem with the current implementation is that RegisterBlock
is a
sized type that represents a register block in MMIO memory. Therefore all
instances of &any::RegisterBlock
are references (pointers) into MMIO memory
and will run into the bug. The logical solution is to replace the concept of
RegisterBlock
with something else.
This section explores two alternative implementations.
One approach is to turn RegisterBlock
into a zero sized type (ZST):
Registers
(*). This Registers
structure will contain more ZSTs that
represent the registers of a peripheral. Each of these ZST registers will
implement today's API by directly calling ptr::{read,write}_volatile_
on a
raw pointer -- this way references to MMIO memory are never created. The code
below illustrates the idea:
(*) we don't really have to change the name of the type in the actual implementation but using different names makes the change easier to explain.
struct USART1 { _not_sync: PhantomData<*const ()> }
pub mod usart {
pub struct Registers {
pub CR: CR,
pub DR: DR,
}
pub struct CR { _not_send_or_sync: PhantomData<*const ()> }
impl CR {
// simplified so it doesn't match the current svd2rust API
pub fn read(&self) -> u32 {
unsafe { ptr::read_volatile(0x4000_0000 as *const u32) }
}
}
impl DR {
pub fn read(&self) -> u16 {
unsafe { ptr::read_volatile(0x4000_0004 as *const u16) }
// ^ ^^
}
}
}
In today's API one can write a generic function that works with peripherals
USART1 and USART2 using the following trait bound: U: Deref<Target = usart::RegisterBlock>
. Alternatively, one can use a concrete function that
takes an argument of type &usart::RegisterBlock
; this function works with both
&USART1
and &USART2
because they implement the Deref
trait.
fn generic<U>(usart: U) where U: Deref<Target = usart::RegisterBlock> {
usart.dr.read();
}
fn concrete(usart: &usart::RegisterBlock) {
usart.dr.read();
}
fn pass_usart1(usart: USART1) {
concrete(&usart); // (A)
generic(usart);
}
fn pass_usart2(usart: USART2) {
// ^
concrete(&usart); // (B)
generic(usart);
}
The ability to write the concrete
function is lost when moving to the ZST
approach. The reason is that under this approach &usart::Registers
is also a
ZST; in contrast, in the current implementation &usart::RegisterBlock
is a
pointer into MMIO memory so in line (A) we are passing the pointer 0x4000_0000
to concrete
and in line (B) we are passing the pointer 0x4001_0000
.
Under the ZST approach it's possible to write something like the generic
function but the implementation needs to become more complex:
trait Peripheral {
const BASE_ADDRESS: usize;
}
pub type USART1 = usart::Registers<usart::_1>;
pub type USART2 = usart::Registers<usart::_2>;
mod usart {
struct _1;
struct _2;
// repeat for `_2` but with address = `0x4000_1000`
impl Peripheral for _1 {
const BASE_ADDRESS: usize = 0x4000_0000;
}
pub struct Registers<P>
where
P: crate::Peripheral,
{
pub CR: CR<P>,
pub DR: DR<P>,
}
pub struct CR<P>
where
P: Peripheral,
{
_p: PhantomData<P>,_
_not_send: PhantomData<*mut ()>,
}
// repeat for `DR<P>` but with `OFFSET=0x4` and `u16` instead of `u32`
impl<P> CR<P>
where
P: Peripheral,
{
const OFFSET: usize = 0x0;
pub fn read(&self) -> u32 {
unsafe {
ptr::read_volatile((P::BASE_ADDRESS + Self::OFFSET) as *const u32)
}
}
}
impl<P> Registers<P> {
pub(crate) unsafe fn new() -> Self {
Registers {
CR: CR { _p: PhantomData, _not_send: PhantomData_},
// ..
}
}
}
}
pub struct Peripherals {
pub USART1: USART1,
pub USART2: USART2,
}
impl Peripherals {
pub fn take() -> Option<Self> {
// omitted: runtime check
unsafe {
Some(Peripherals {
USART1: usart::Registers::new(),
USART2: usart::Registers::new(),
})
}
}
}
Then generic code will look like this:
fn generic_zst<P>(usart: USART<P>) where P: Peripheral {
usart.dr.read();
}
Furthermore, this ZST implementation with generics changes the signatures of
registers types: CR<_1>
is the control register of USART1 and CR<_2>
is the
control register of USART2. These two types are ZST so the exact address of the
register must be stored in the type system. In today's implementation, the &CR
value is a pointer, that points to the particular register (0x4000_0000
or
`0x4000_1000**).
In summary, the breaking changes specific to this approach are:
-
You can't write a
concrete
function that works with all instances of a peripheral. -
Generic functions that work with all instances of a peripheral need to be modified (
generic
->generic_zst
). -
The types of registers changes from a concrete type (
today::CR
) to a generic type (zst::CR<P>
). Note that registers of peripherals that have only a single instance do not need to be turned into generic types.
Another approach is to not use ZSTs but instead store the address of each
register "on the stack" using the VolAddress
type. The code below
illustrates the idea:
pub struct USART1(usart::Registers);
pub struct USART2(usart::Registers);
impl Deref for USART1 {
type Target = usart::Registers;
fn deref(&self) -> &Self::Target {
&self.0
}
}
mod usart {
pub struct Registers {
// simplified; these should be newtypes
pub CR: VolAddress<u32>,
pub DR: VolAddress<u16>,
}
impl Registers {
pub(crate) unsafe fn new(base_address: usize) -> Self {
Registers {
CR: VolAddress::new(base_address + 0x0),
DR: VolAddress::new(base_address + 0x4),
}
}
}
}
pub struct Peripherals {
pub USART1: USART1,
pub USART2: USART2,
}
impl Peripherals {
pub fn take() -> Option<Self> {
// omitted: runtime check
unsafe {
Some(Peripherals {
USART1: USART1(usart::Registers::new(0x4000_0000)),
USART2: USART2(usart::Registers::new(0x4000_1000)),
})
}
}
}
This approach has no specific breaking changes. The concrete
and generic
that one can write today will continue to work under this alternative
implementation. The register type does not change in any visible way when
switching to the VolAddress
approach.
The VolAddress
approach is not zero cost in comparison to today's
implementation and to the ZST approach. This can be observed when a peripheral
type (or an abstraction that contains a peripheral) is stored in a static
variable. This pattern is commonly used in RTFM applications (peripherals are
used as late resources) but also appears in cortex-m-rt
-only applications
that use the Mutex<RefCell<Option<_>>>
pattern.
The following program demonstrates the non-zero-cost-ness.
// a late resource
// (or this could be `static _: _Mutex<RefCell<Option<usart::Registers>>>`)
static mut USART1: MaybeUninit<usart::Registers> = MaybeUninit::uninit();
#[entry]
fn main() -> ! {
// simulate the initialization of a late resource
unsafe {
USART1
.as_mut_ptr()
.write(usart::Registers::new(0x4000_0000));
}
loop {}
}
#[exception]
fn SVCall() {
unsafe {
(*USART1.as_ptr()).DR.read();
}
}
#[exception]
fn SysTick() {
unsafe {
(*USART1.as_ptr()).CR.read();
}
}
Disassembly (arm-none-eabi-objdump
):
00000400 <SVCall>:
400: f240 0000 movw r0, #0
404: f2c2 0000 movt r0, #8192 ; 0x2000
408: 6840 ldr r0, [r0, #4]
40a: 8800 ldrh r0, [r0, #0]
40c: 4770 bx lr
0000040e <SysTick>:
40e: f240 0000 movw r0, #0
412: f2c2 0000 movt r0, #8192 ; 0x2000
416: 6800 ldr r0, [r0, #0]
418: 6800 ldr r0, [r0, #0]
41a: 4770 bx lr
Binary size (arm-none-eabi-size
):
app :
section size addr
(..)
.bss 0x8 0x20000000
(..)
static mut USART1: MaybeUninit<USART1> = MaybeUninit::uninit();
// ^^^^^^
#[entry]
fn main() -> ! {
unsafe {
USART1.as_mut_ptr().write(USART1::steal());
// ^^^^^^^^^^^^^^^
}
loop {}
}
// SVCall and SysTick don't change
Disassembly (arm-none-eabi-objdump
):
00000400 <SVCall>:
400: 2004 movs r0, #4
402: f2c4 0000 movt r0, #16384 ; 0x4000
406: 8800 ldrh r0, [r0, #0]
408: 4770 bx lr
0000040a <SysTick>:
40a: f04f 4080 mov.w r0, #1073741824 ; 0x40000000
40e: 6800 ldr r0, [r0, #0]
410: 4770 bx lr
Binary size (arm-none-eabi-size
):
app :
section size addr
(..)
.bss 0x0 0x20000000
(..)
The ZST approach produces the same machine code as the current implementation.
The difference boils down to voladdress::USART1
not being a ZST. This causes
the register pointers (VolAddress
values) to be stored in static memory; this
can use hundred of bytes of SRAM for peripherals that have many registers (each
register uses 4-byte of memory on a 32-bit device). Apart from the memory usage,
each register has indirection: the register address is first loaded from static
memory.
It is not possible to implement the RegisterBlock::ptr
API under any of the
alternative implementations. This API creates a raw pointer into the register
block, and thus into MMIO memory. This action by itself does not trigger the bug
(rust-lang/rust#55005) but in both alternative implementations the Registers
struct that replaces RegisterBlock
no longer represents MMIO memory so *mut Registers*
is not equivalent to *mut RegisterBlock
. In the ZST approach,
&Registers
is also a ZST; and in the VolAddress
approach, &Registers
is a
pointer into some stack-allocated value.
An alternative is to keep the svd2rust
API as it is and hope for
rust-lang/rust#55005 to be fixed upstream (e.g. by removing the deferenceable
attribute from &UnsafeCell<T>
when rustc
lowers it to LLVM-IR).
Three options have been explored in order to address bug rust-lang/rust#55005.
-
Keep the current implementation and hope the bug is fixed soon.
-
Re-implement the
svd2rust
API using the ZST approach. This approach is as zero cost as the current implementation but causes 4 breaking changes in the public API. -
Re-implement the
svd2rust
API using theVolAddress
approach. This approach only causes one breaking change but it's not zero cost: it increases memory usage and reduces performance in a common use case.
If it is decided to use an alternative implementation it is also advisable to
deprecate the vcell
and volatile-register
crates and point current direct
users of those crates to the alternatives presented in this document.