Skip to content
This repository has been archived by the owner on Feb 7, 2021. It is now read-only.

correctly handle uninitialized bytes #10

Closed
wants to merge 3 commits into from

Conversation

RustyYato
Copy link
Contributor

fixes #8

@RustyYato
Copy link
Contributor Author

I have made 1 change: convert

struct Stowaway {
    storage: *mut T,
}

to

struct Stowaway {
    storage: MaybeUninit<*mut T>,
}

in order to correctly handle uninitialized bytes inside of T. The rest of the changes are just fallout from this change. In the fallout we have 2 breaking changes:

  • Stowaway::into_raw must be marked unsafe because we can't guarantee that it contains initialized bytes
  • stow must be marked unsafe for the same reason

We could provide a checked version that only returns a valid pointer if we T is SizeClass::Zero or SizeClass::Boxed and panics otherwise, but that can be done in a later PR.

@RustyYato
Copy link
Contributor Author

RustyYato commented Apr 8, 2020

One more note: while #3 seems to be fixed prior to this PR, I don't think it is sound. A perfectly valid implementation of ptr::write would be

fn write<T>(ptr: *mut T, value: T) {
    ptr.cast::<u8>().copy_from_nonoverlapping(&value as *const T as *const u8, std::mem::size_of::<T>());
    forget(value);
}

Since write doesn't guarantee to not copy padding bytes, you can't assume that, and padding bytes may be copied over in the future. This will also be fixed in this PR, but I left the tests related to #3 for now.

(testing with this implementation of ptr::write shows the unsoundness, as miri fails on the tests added due to #3)

@RustyYato
Copy link
Contributor Author

RustyYato commented Apr 8, 2020

After playing around with the tests from #3 some more, this type fails MIRI when you use stow or Stowaway::into_raw

#[repr(C)]
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
struct GapsTest {
    a: u8,
    _align: [usize; 0],
}

This is because (as I said above), Rust does not guarantee that paddings bytes won't be copied. And given that this type is pointer aligned, it uses a word-aligned store to move this type. This also copies all padding bytes (even though you try and zero them).

reorganize tests
remove faulty test from Lucretiel#3
@RustyYato
Copy link
Contributor Author

Because of this new finding, I removed the faulty tests from #3, and manually copied padding bytes when running MIRI (using the alternative implementation of ptr::write I described above), so that any new tests will have to abide by the rules.

@RustyYato
Copy link
Contributor Author

I left the MaybeUninit::zeroed() in case the type is smaller than usize, then the upper bytes will be zero, this should be fine as we can see exactly how large a type is and handle it accordingly. Padding bytes are different, because we have no way to control where they are and how they are handled.

@RustyYato
Copy link
Contributor Author

On a separate note: You should have a PhantomData<T> in Stowaway to tell Rust that you own a T. Also, instead of MaybeUninit<*mut T>, you can use MaybeUninit<*const T> to get a more lenient variance, (which is also the same variance as Box). *mut T is unnecessarily restrictive.

I can do both in a separate PR.

@RustyYato RustyYato changed the title correctly hande uninitialized bytes correctly handle uninitialized bytes Apr 8, 2020
@Lucretiel
Copy link
Owner

Thank you for all the work you've put into this; I'm going to look more into the precise semantics of ptr::read / ptr::write in the presence of uninit padding bytes to ensure that these changes make sense. I'm not sure I agreed with the MaybeUninit use case, since it's not clear to me that using a Stowaway<MaybeUninit> doesn't violate the unsafe contract of MaybeUninit::uninit. However, the example of a struct with padding bytes is much more compelling, so I'm going to check and see what the read/write functions promise about them.

In the meantime, a design question: if we don't allow uninitialized bytes when using into_raw, what's the point anymore of allowing them to be stored in a Stowaway in the first place? There's not really a reason for a Stowaway to exist unless someone plans on calling into_raw on it, so would it make more sense to move the unsafe from into_raw to new (requiring people to only ever create Stowaways out of types without padding bytes in the first place), and get rid of the inner MaybeUninit on storage?

@RustyYato
Copy link
Contributor Author

RustyYato commented Apr 8, 2020

In the meantime, a design question: if we don't allow uninitialized bytes when using into_raw, what's the point anymore of allowing them to be stored in a Stowaway in the first place?

You can use Stowaway as a small box implementation (similar to small-vec), regardless of padding bytes. Making new unsafe would not allow this usecase.

since it's not clear to me that using a Stowaway doesn't violate the unsafe contract of MaybeUninit::uninit

MaybeUninit::uninit is a safe function, so there is no unsafe contract to uphold there. However, MaybeUninit::assume_init is unsafe. This is where things went wrong before, the usage of MaybeUninit::assume_init was incorrect, due to the precense of uninitialized bytes.

So all together, my Stowaway<MaybeUninit<usize>> test demonstrated an unsound new function, because of the incorrect call to assume_init inside of new.

@Lucretiel
Copy link
Owner

And to be clear, this happens because ptr::write is allowed to overwrite initialized bytes with uninitialized bytes? That is, when ptr::write copies bytes into a fully initialized buffer, it preserves their uninitialized-ness (as opposed to overwriting existing initialized bytes with "initialized but unknown" bytes)

@RustyYato
Copy link
Contributor Author

RustyYato commented Apr 8, 2020

When you call ptr::write it is allowed to overwrite the next size_of::<T>() following the pointee with the given value. It could:

  • overwrite with initialized bytes
  • overwrite with uninitialized bytes
  • not overwrite (only the case with padding bytes, but being a padding byte doesn't mean that it will be omitted as seen in the over-aligned byte case)

But if it does overwrite the a byte, and that byte was initialized, then the new byte will be the same as the corrosponding byte in the value

@RustyYato
Copy link
Contributor Author

RustyYato commented Apr 8, 2020

I think an example would be helpful,

// assuming 32-bit for brevity
// assume (u8, u16) is laid out like [init, padding, init, init]

// *ptr starts as all zero
// *ptr == [0, 0, 0, 0]

write(ptr, (0_u8, 0xEFAB_u16));

// *ptr could be in one of the following
// [0, uninit, 0xEF, 0xAB] could be uninit because of padding
// [0, 0, 0xEF, 0xAB] or padding could have been omitted

@Lucretiel
Copy link
Owner

A perfectly valid implementation of ptr::write would be: ...

Doesn't this implementation read uninitialized bytes? Doesn't doing that immediately trigger undefined behavior?

@RustyYato
Copy link
Contributor Author

RustyYato commented Apr 9, 2020

It's fine to memcopy uninitialized bytes, because memcpy is special (and is working with raw pointers). Note the saftey section for copy_nonoverlapping says nothing about uninitialized bytes, only the validity of the pointer.

In the example in the docs, the manually implemented Vec::append using copy_nonoverlapping, if the type T is MaybeUninit<_> and it was UB to copy uninit bytes, that example would be unsound.

@RustyYato
Copy link
Contributor Author

@Lucretiel just pinging to check on progress, can this be merged or do you need more time/information to review this?

@Lucretiel
Copy link
Owner

Lucretiel commented Apr 10, 2020

No updates; I'm mostly trying to find some more authoritative information on the current rules regarding padding bytes, and researching if there's ways to detect / force Rust to not copy over padding bytes, even if it means adding some kind of derive or an unsafe trait to types you wish to stow.

I admit I'm biased here; the changes as proposed would permanently break cooked-waker, because cooked-waker promises to be able to safely turn any struct into a waker, backed by stowaway. I'm therefore considering something like this:

// Implement this trait for any type which is known
// to never contain uninitialized memory or padding
// bytes. Implemented for stdlib types.
//
// Possibly derivable , conditional on
// (sizeof(T) == sum(sizeof(fields)) && fields: Stowable
unsafe trait Stowable {};

fn new<T: Stowable>(value: T) -> Stowaway<T>

Your original bug report (#8) specifically called out Stowaway<MaybeUninit> as being unsound. Did you have a particular use case for this pattern that you'd like Stowaway to support, or was that simply a demonstration of how to create UB?

@RustyYato
Copy link
Contributor Author

RustyYato commented Apr 10, 2020

That was just a demonstration, however, I don't think that it was unreasonable. For example, you could put a once-cell inside a stowaway, once-cell with parking_lot locks is small enough to be inline, and contains a MaybeUninit.

Moreover relying on the behavior of write wrt padding bytes is always unsound. One notable example I'veseen is in crossbeam's AtomicCell, see

crossbeam-rs/crossbeam#388

crossbeam-rs/crossbeam#315

@RustyYato
Copy link
Contributor Author

I admit I'm biased here; the changes as proposed would permanently break cooked-waker, because cooked-waker promises to be able to safely turn any struct into a waker, backed by stowaway

Yes, that is unfortunate. Your Stowable trait should be fine, but this will break cooked-wakers.

@Lucretiel
Copy link
Owner

Have to go to bed, will review in the morning. Thanks for the discussions from crossbeam, that's very helpful.

@RustyYato
Copy link
Contributor Author

Oh, I just realized something! If you implement Stowable for Box<T> and Arc, for all T, you don't loose any of the power of Stowaway in cooked-waker. For those who care about safety, they can allocate either a Box or a Arc and be on their way, for those who care about performance they can opt into the unsafe Stowable trait and ensure that they don't have any uninit bytes!

This way you can still support all types, although it will be suboptimal, or you can opt-in to the optimal way with a small amount of easy to check unsafe. This means cooked-waker is still valuable.

@Lucretiel
Copy link
Owner

Yup, that was exactly my point 😀. You wrote it out before I could even get to it this morning. Would you mind if I asked you to file a separate pull request with just the failing tests while I review this solution vs an unsafe trait?

@Lucretiel
Copy link
Owner

Feel free to add the variant, UB-detecting write, as well.

@RustyYato
Copy link
Contributor Author

Ok, I'll do that!

@RustyYato
Copy link
Contributor Author

because of #11, I'll close this. Thanks for being so responsive to this issue @Lucretiel.

@RustyYato RustyYato closed this Apr 10, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

small uninitialized types are unsound
2 participants