Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix undefined behavior identified by Miri #280

Merged
merged 4 commits into from
May 2, 2022

Conversation

jgallagher
Copy link
Contributor

Hi! We ran into an exception triggered by new undefined behavior checks inserted into the nightly compiler (https://github.com/rust-lang/rust/pull/92686/files#diff-54110dcedc5a4d976321aa5d2a6767ac0744a3ef1363b75ffc62faf81cf14c30R230-L229). Running heapless's test suite under Miri didn't flag anything at first, but it did once we added MIRIFLAGS="-Zmiri-tag-raw-pointers". All three of the fixes in this PR were identified via

MIRIFLAGS="-Zmiri-tag-raw-pointers -Zmiri-ignore-leaks" cargo +nightly miri test -- --skip pool::

and the fixes came from copying the implementations from the equivalent methods in std. Note that I skipped the pool:: tests; there is at least one miri failure in them, but it wasn't immediately obvious how to fix it so I skipped it for now. It's probably worth adding the flag above to the CI miri run, but I didn't do that either (since it would immediately cause failures given I didn't fix the problem in pool).

The specific output for pool is

test pool::singleton::tests::sanity ... error: Undefined Behavior: trying to reborrow <untagged> for SharedReadWrite permission at alloc36[0x1], but that tag does not exist in the borrow stack for this location
   --> /home/john/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/non_null.rs:380:18
    |
380 |         unsafe { &*self.as_ptr() }
    |                  ^^^^^^^^^^^^^^^
    |                  |
    |                  trying to reborrow <untagged> for SharedReadWrite permission at alloc36[0x1], but that tag does not exist in the borrow stack for this location
    |                  this error occurs as part of a reborrow at alloc36[0x1..0x9]
    |
    = help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still experimental
    = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information

    = note: inside `std::ptr::NonNull::<pool::stack::Node<u8>>::as_ref` at /home/john/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/non_null.rs:380:18
note: inside `pool::stack::Stack::<u8>::push` at src/pool/cas.rs:43:17
   --> src/pool/cas.rs:43:17
    |
43  | /                 new_head
44  | |                     .as_raw()
45  | |                     .as_ref()
    | |_____________________________^
note: inside `pool::Pool::<u8>::grow` at src/pool/mod.rs:390:25
   --> src/pool/mod.rs:390:25
    |
390 |                         self.stack.push(p);
    |                         ^^^^^^^^^^^^^^^^^^
note: inside `<pool::singleton::tests::sanity::A as pool::singleton::Pool>::grow` at src/pool/singleton.rs:78:9
   --> src/pool/singleton.rs:78:9
    |
78  |         Self::ptr().grow(memory)
    |         ^^^^^^^^^^^^^^^^^^^^^^^^
note: inside `pool::singleton::tests::sanity` at src/pool/singleton.rs:362:9
   --> src/pool/singleton.rs:362:9
    |
362 |         A::grow(unsafe { &mut MEMORY });
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
note: inside closure at src/pool/singleton.rs:353:5
   --> src/pool/singleton.rs:353:5
    |
352 |       #[test]
    |       ------- in this procedural macro expansion
353 | /     fn sanity() {
354 | |         const SZ: usize = 2 * mem::size_of::<Node<u8>>() - 1;
355 | |         static mut MEMORY: [u8; SZ] = [0; SZ];
356 | |
...   |
373 | |         assert_eq!(*A::alloc().unwrap().init(1), 1);
374 | |     }
    | |_____^
    = note: this error originates in the attribute macro `test` (in Nightly builds, run with -Z macro-backtrace for more info)

Copy link
Member

@japaric japaric left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the PR. changes look good to me.

the sanitizer tests are failing. I think it may be because rustc updated the dwarf it outputs. there's this in the output

/usr/bin/addr2line: DWARF error: invalid or unhandled FORM value: 0x25

I cannot repro locally but will investigate in the CI environment

@japaric
Copy link
Member

japaric commented Apr 29, 2022

the sanitizer tests are failing.

PR #283 fixes CI. this will need to wait until that one lands.

@korken89
Copy link
Contributor

korken89 commented May 1, 2022

#283 is merged now, @jgallagher could you rebase on master?

Copy link
Member

@japaric japaric left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks again

@japaric
Copy link
Member

japaric commented May 2, 2022

bors r+

@japaric
Copy link
Member

japaric commented May 2, 2022

bors merge

@japaric
Copy link
Member

japaric commented May 2, 2022

bors r+

@japaric
Copy link
Member

japaric commented May 2, 2022

bors ping

1 similar comment
@japaric
Copy link
Member

japaric commented May 2, 2022

bors ping

@bors
Copy link
Contributor

bors bot commented May 2, 2022

pong

@japaric
Copy link
Member

japaric commented May 2, 2022

bors merge

@bors
Copy link
Contributor

bors bot commented May 2, 2022

Build succeeded:

@bors bors bot merged commit 6877eed into rust-embedded:master May 2, 2022
@jgallagher jgallagher deleted the miri-tagged-raw-pointers branch May 2, 2022 13:43
@jgallagher
Copy link
Contributor Author

Thanks!

@jgallagher
Copy link
Contributor Author

jgallagher commented May 2, 2022

I read a bit more about Miri and the -Zmiri-tag-raw-pointers flag, and it's possible the remaining instance of undefined behavior it identifies is a false positive. The error message begins with

trying to reborrow <untagged> for SharedReadWrite permission at alloc36[0x1], but that tag does not exist in the borrow stack for this location

and the docs for -Zmiri-tag-raw-pointers state

You can recognize false positives by occurring in the message -- this indicates a pointer that was cast from an integer, so Miri was unable to track this pointer.

I tried enabling #![feature(strict_provenance)] and updating the int -> pointer casts to use the new APIs gated behind that feature, but got stuck in https://github.com/japaric/heapless/blob/6877eedfd4ff0c19f66b504cfe523e8d3f097884/src/pool/cas.rs#L95-L109 where we cast a pointer from a usize stored in a spin::Once. It's beyond my current level of understanding how (or even if) Miri can understand that.

@saethlin
Copy link
Contributor

👋

Since the rules in Rust aren't set, the right way to interpret Miri's errors are the interpreter reporting "This is UB with respect to the model you have requested with whatever combination of MIRIFLAGS you passed". This is why you tend to get more reports of UB with -Zmiri-tag-raw-pointers, you're asking for a stricter (but IMO more sensible) model.

I'll admit I don't fully understand this anchoring system, but I've poked around it enough to mostly grasp what's going on from Miri's perspective. First off, it's entirely possible to remove the ptr-int-ptr roundtrips from this code. As is very often the case, the key is to not store a usize, and to store some kind of pointer instead. In this case you need a Send + Sync pointer wrapper, annoying but just 4 lines of code.

The reason you need this is that pointers in Rust (and probably in C) have provenance, extra shadow state that identifies the allocation they are for. In C and Rust, integers do not have provenance. So a cast from a pointer to an integer loses data and thus cannot necessarily round-trip. Pointers definitely do not round-trip through integers in Stacked Borrows with raw pointer tagging. That's the error you're seeing above.

But in the existing Stacked Borrows with Untagged which is the default behavior in Miri, a cast from an integer to a pointer will check to see if there is an allocation at the address represented by the value of the usize, then assign that allocation's provenance to the new pointer: https://github.com/rust-lang/miri/blob/c4dd3f4ef98f8527aa652e8154ff044ca3e88455/src/intptrcast.rs#L139-L148

That is the crux of why this code works without raw pointer tagging and as far as I can tell, cannot work with raw pointer tagging: A single anchor needs to be able to hand out multiple provenances. If the anchor is adjusted to contain a pointer, it will always hand out pointers with the provenance of that pointer, and you will get errors of a different sort, about attempting to dereference a pointer way outside its allocation. The anchor will be initialized once for example in the test called grow then it will be re-used with a different allocation in grow_exact, and the pointers you get back will have the provenance of the MEMORY in the grow test when you're trying to use them with the MEMORY in the grow_exact test.

saethlin added a commit to saethlin/heapless that referenced this pull request Jun 18, 2022
The tweaks in rust-embedded#280 missed one instance of UB. The get_unchecked_mut
inside VacantEntry::Insert can be out of bounds of the initialized
region of the backing Vec. When that happens, the call is UB. This is
detected both by the standard library's debug assertions which can be
enabled with -Zbuild-std and with Miri but only with
-Zmiri-tag-raw-pointers.

This also adds inherent as_ptr and as_mut_ptr methods to Vec which
shadow those provided by the Deref to a slice. Without this shadowing,
this change doesn't actually fix the problem identified by the debug
assertions or Miri, it just hides it from the debug assertions. The core
problem is that references narrow provenance, so if we want to access
outside of the initialized region of a Vec we need to get a pointer to
the array without passing through a reference to the initialized region
first. The pointers from these shadowing methods can be used to access
anywhere in the allocation, whereas vec.as_slice().as_ptr() would be UB
to use for access into the uninitialized region.
saethlin added a commit to saethlin/heapless that referenced this pull request Jun 18, 2022
The tweaks in rust-embedded#280 missed one instance of UB. The get_unchecked_mut
inside VacantEntry::Insert can be out of bounds of the initialized
region of the backing Vec. When that happens, the call is UB. This is
detected both by the standard library's debug assertions which can be
enabled with -Zbuild-std and with Miri but only with
-Zmiri-tag-raw-pointers.

This also adds inherent as_ptr and as_mut_ptr methods to Vec which
shadow those provided by the Deref to a slice. Without this shadowing,
this change doesn't actually fix the problem identified by the debug
assertions or Miri, it just hides it from the debug assertions. The core
problem is that references narrow provenance, so if we want to access
outside of the initialized region of a Vec we need to get a pointer to
the array without passing through a reference to the initialized region
first. The pointers from these shadowing methods can be used to access
anywhere in the allocation, whereas vec.as_slice().as_ptr() would be UB
to use for access into the uninitialized region.
saethlin added a commit to saethlin/heapless that referenced this pull request Jun 18, 2022
The fixes in rust-embedded#280 missed one instance of UB. The get_unchecked_mut
inside VacantEntry::Insert can be out of bounds of the initialized
region of the backing Vec. When that happens, the call is UB. This is
detected both by the standard library's debug assertions which can be
enabled with -Zbuild-std and with Miri but only with
-Zmiri-tag-raw-pointers.

This also adds inherent as_ptr and as_mut_ptr methods to Vec which
shadow those provided by the Deref to a slice. Without this shadowing,
this change doesn't actually fix the problem identified by the debug
assertions or Miri, it just hides it from the debug assertions. The core
problem is that references narrow provenance, so if we want to access
outside of the initialized region of a Vec we need to get a pointer to
the array without passing through a reference to the initialized region
first. The pointers from these shadowing methods can be used to access
anywhere in the allocation, whereas vec.as_slice().as_ptr() would be UB
to use for access into the uninitialized region.
saethlin added a commit to saethlin/heapless that referenced this pull request Jun 18, 2022
The fixes in rust-embedded#280 missed one instance of UB. The get_unchecked_mut
inside VacantEntry::Insert can be out of bounds of the initialized
region of the backing Vec. When that happens, the call is UB. This is
detected both by the standard library's debug assertions which can be
enabled with -Zbuild-std and with Miri but only with
-Zmiri-tag-raw-pointers.

This also adds inherent as_ptr and as_mut_ptr methods to Vec which
shadow those provided by the Deref to a slice. Without this shadowing,
the change from get_unchecked_mut to as_mut_ptr.add wouldn't actually
fix the problem identified by the debug assertions or Miri, it just
hides it from the debug assertions. The core problem is that references
narrow provenance, so if we want to access outside of the initialized
region of a Vec we need to get a pointer to the array without passing
through a reference to the initialized region first. The pointers from
these shadowing methods can be used to access anywhere in the allocation,
whereas vec.as_slice().as_ptr() would be UB to use for access into the
uninitialized region.
bors bot added a commit that referenced this pull request Jun 20, 2022
300: Fix OOB get_unchecked, shadow Vec::as_ptr methods r=japaric a=saethlin

The fixes in #280 missed one instance of UB. The get_unchecked_mut
inside VacantEntry::Insert can be out of bounds of the initialized
region of the backing Vec. When that happens, the call is UB. This is
detected both by the standard library's debug assertions which can be
enabled with -Zbuild-std and with Miri but only with
-Zmiri-tag-raw-pointers.

This also adds inherent as_ptr and as_mut_ptr methods to Vec which
shadow those provided by the Deref to a slice. Without this shadowing,
the change from get_unchecked_mut to as_mut_ptr.add wouldn't actually
fix the problem identified by the debug assertions or Miri, it just
hides it from the debug assertions. The core problem is that references
narrow provenance, so if we want to access outside of the initialized
region of a Vec we need to get a pointer to the array without passing
through a reference to the initialized region first. The pointers from
these shadowing methods can be used to access anywhere in the allocation,
whereas vec.as_slice().as_ptr() would be UB to use for access into the
uninitialized region.

Co-authored-by: Ben Kimock <[email protected]>
@RalfJung
Copy link

RalfJung commented Jul 12, 2022

I read a bit more about Miri and the -Zmiri-tag-raw-pointers flag, and it's possible the remaining instance of undefined behavior it identifies is a false positive.

Yeah, using tag-raw-pointers and getting an "untagged" error basically always indicates a false positive. That's why tag-raw-pointers was not enabled by default. Miri has changed a lot the last month to eliminate those false positives. The entire notion of "untagged" pointers disappeared. Can you try again with the latest version of Miri?

Of course ideally we'd get the code to work with -Zmiri-strict-provenance, but that seems pretty hard. Sadly the code is rather sparsely commented and I have no clue what all this anchor business is about and how it makes any sense. Is there a design document somewhere describing the algorithm at a higher level?

@CAD97
Copy link

CAD97 commented Jul 12, 2022

(👋 you got mentioned in the Zulip thus the attention)

It should in theory be possible to fix anchored pointers to work under strict provenance, though it'll be a bit of work. The short version is that cas::Ptr<T> and cas::Atomic<T> need to be pointer types and contain the provenance, so that when converting from the anchor-relative pointer to the real pointer you can with_addr to use the provenance carried on the anchor-relative pointer. cas::Ptr<T> can be made to work by doing this, I think.

However, cas::Atomic<T> may make it impossible currently to be strict pointer compatible on 32 bit x86. (You'd think the anchor-relative pointers are the hard part, but...) This is because you're using a 64bit atomic CAS, and it needs to preserve provenance. There's currently no way to do this; you'd need AtomicPtr64.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants