Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Safe Transmute RFC #5

Merged
merged 61 commits into from
Aug 31, 2020
Merged

Safe Transmute RFC #5

merged 61 commits into from
Aug 31, 2020

Conversation

jswrenn
Copy link
Member

@jswrenn jswrenn commented Aug 3, 2020

Rendered

For a (quicker) summary of the proposed API surface, see this rustdoc.

@jswrenn

This comment has been minimized.

rfcs/0000-safe-transmute.md Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
typo fixes from @jonas-schievink

Co-authored-by: Jonas Schievink <[email protected]>
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
#### Stability & Transmutation
A `Src` type is *stably* transmutable into a `Dst` type *only if* `<Src as PromiseTransmutableInto>::Archetype` is transmutable, stability notwithstanding, into `<Dst as PromiseTransmutableFrom>::Archetype`; formally:
```rust
unsafe impl<Src, Dst> TransmuteFrom<Src> for Dst
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... you mention TransmuteFrom here which you've hinted at above but never formally introduce, and you never explicitly state that this will be implemented by the compiler on behalf of the user automatically.


The type `<Src as PromiseTransmutableInto>::Archetype` exemplifies the furthest extreme of non-breaking changes that could be made to the layout of `Src` that could affect its use as a source type in transmutations. Conversely, `<Dst as PromiseTransmutableFrom>::Archetype` exemplifies the furthest extreme of non-breaking changes that could be made to the layout of `Dst` that could affect its use as a destination type in transmutations. If a transmutation between these extremities is valid, then so is `Src: TransmuteInto<Dst>`.

#### Common Use-Case: Promising a Fixed Layout
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've mention this before but I would really love a short-hand for types that implement PromiseTransmutableFrom<Archetype=Self> and PromiseTransmutableInto<Archetype=Self> - something like FixedLayout.

This would also allow us to introduce FixedLayout first if we wanted to let that use case bake before we allowed flexibly declaring stability guarantees.

Copy link
Member Author

@jswrenn jswrenn Aug 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rylev I'm unsure how useful this shorthand would be.

For end-users, I anticipate that PromiseTransmutableFrom and PromiseTransmutableInto will virtually never appear outside of the context of implementing those traits. In writing the examples for this RFC, I haven't yet encountered an instance where I needed to use either of those traits in a where bound. So, I don't think this shorthand would be useful outside of the context of implementing the traits.

And, within the context of implementing those traits, nearly all users will be using the shorthand #[derive(PromiseTransmutableFrom, PromiseTransmutableInto)]. So here, #[derive(FixedLayout)] would be a shorthand for a shorthand.

Is there a use-case you have in mind?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Short-hand for short-hand is not necessarily a bad thing 😄 . I believe a vast majority of the time that users will want #[derive(PromiseTransmutableFrom, PromiseTransmutableInto)]. Given the long names and the frequency of its use #[derive(FixedLayout)] strikes me as an ergonomic choice. But I realize that it's not super elegant to introduce a new trait just as a short-hand for deriving.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that's a better trait name for sure. People will understand it more when looking at docs and source.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rylev Defining a trait is rather tricky (since those derives are a little more nuanced than just Archetype = Self to account for field stability declarations), but it's very technically feasible to provide a derive(PromiseTransmutableFromAndInto) macro that would be equivalent to derive(PromiseTransmutableFrom, PromiseTransmutableInto)without a PromiseTransmutableFromAndInto trait.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defining a trait is rather tricky (since those derives are a little more nuanced than just Archetype = Self to account for field stability declarations)

To clarify a little more about why having a FixedLayout shorthand/trait is tricky, it's because the concept of "fixed layout" is a little tricky.


First, there's the name: PromiseTransmutableFrom isn't really making a promise about layout, it's making a promise about future transmutability. Transmutability involves factors beyond layout, like constructability. If we wanted to create a derive shorthand for derive(PromiseTransmutableFrom, PromiseTransmutableInto), the name PromiseFixedLayout (or PromiseStableLayout, etc...) misrepresents what's actually being promised.


Second, there's the semantics: what does promising a fixed layout mean? You're probably mean to promise that you won't make changes to the type's transmutability, but what about others? What if your type's fields come from other crates? What if your type's fields are generic?

The behavior of #[derive(PromiseTransmutableFrom, PromiseTransmutableInto)] reflects this inherent complexity: it doesn't mean your type has totally-assured-transmutability, it means your type has as-assured-as-possible-transmutability. Concretely: When you write #[derive(PromiseTransmutable{From,Into})], your type inherits the instability of its fields. If its fields have strong stability promises, so too will your type; if its fields have weak stability promises, so too will your type.


To conclude, I don't think there's a good mapping of a mem-markers-style FixedLayout abstraction onto PromiseTransmutableFrom and PromiseTransmutableInto. While it might seem simpler, at first glance, to just present one trait (FixedLayout) instead of two (PromiseTransmutableFrom and PromiseTransmutableInto), the concept of a FixedLayout isn't well-defined, and the complexities of PromiseTransmutable{From,Into} reflect the complexity inherent to the problem of transmutation stability.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I'm arguing here is that there is a use case for "POD" that will be quite common. By "POD" in this case I mean types where the in memory-representation of that type is intrinsic to its functionality and changing that would be a breaking change. This is common in the parsing scenario for instance. Having some sort of short hand for saying this type's layout/alignment/etc cannot change would be a simple "happy path" for many use cases. I'm not arguing this as a replacement to PromiseTransmutable{From,Into} but as a compliment. This doesn't have to be a derive either. I can imagine also #[repr(C, fixed)] for instance.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per a conversation with @rylev, ed172fe proposes a #[derive(PromiseTransmutable)] ergonomic extension. I'm fairly convinced that adding such a shorthand is the right thing to do and I refer to it in the Guide-Level Explanation, but it's formally defined as an Extension because it's technically non-central to the RFC and poses a few unusual design quirks.

rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved

The question is, then: *how can the author of a type reason about transmutations they did not write, from-or-to types they did not write?* We address this problem by introducing two traits which both allow an author to opt-in to stability guarantees for their types, and allow third-parties to reason at compile-time about what guarantees are provided for such types.

#### `PromiseTransmutableFrom` and `PromiseTransmutableInto`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you feel about these names? Open to bike shedding?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fond of these names because Promise reflects that the user implements these traits to make a promise, and Transmutable{From,Into} corresponds exactly to what they're promising. I'm quite open to bikeshedding, though!

@rylev
Copy link
Member

rylev commented Aug 4, 2020

First, thank you @jswrenn. This is an amazing document, and I think you're really on to something here.

In general, I'm a bit concerned that the flexibility brings a lot of the complexity forward when the user might often need to think about the complexity. That's what I suggested the #[derive(FixedLayout)] because I imagine this is the use-case for a large proportion of users (any change to the type would be considered a breaking change). I don't have any additional suggestions to help with this beyond good documentation that starts with the presumed common case and then expands to the more advanced use case where the flexibility is needed.

Beyond this, I'm wondering if validity currently captures two distinct ideas and whether it's worth it to treat these as separate. As far as I can tell validity captures both the idea of in memory values being valid representations of a type (e.g., 3u8 is not a valid representation of bool) as well as whether the value is a valid instance of the type given the current context. For instance, given a pointer wrapper type struct MyPointer(*const T), a usize is always valid to transmute to that type. In other words, all possible values of a usize could be a valid value of MyPointer, but a given usize might not be valid given a specific context (namely that a valid T doesn't reside at the memory address that the usize represents). Is it worth separating these ideas? Is it useful to say that usize -> MyPointer is always a valid transmute, but might not be a correct one?

rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
@jswrenn
Copy link
Member Author

jswrenn commented Aug 4, 2020

As far as I can tell validity captures both the idea of in memory values being valid representations of a type (e.g., 3u8 is not a valid representation of bool) as well as whether the value is a valid instance of the type given the current context.

@rylev I think the distinction you're referring to is captured by the distinction the RFC draws between soundness and safety. usize -> MyPointer is a sound transmutation, because usize is a bit-valid instance of MyPointer. However, the resulting MyPointer value is not necessarily safe to use. The *const T field of MyPointer is private, and we should assume it's private because MyPointer imposes invariants on it that make it safe to dereference.

I actually like this MyPointer example quite a bit, because raw pointers are virtually always subjected to validity invariants by the types they appear in. I might change my example of implicit constructibility to instead use a type containing a raw pointer!

Copy link
Member Author

@jswrenn jswrenn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
Copy link
Member Author

@jswrenn jswrenn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
rfcs/0000-safe-transmute.md Outdated Show resolved Hide resolved
@JulianKnodt
Copy link

Just curious, why is the versioning baked into the type system? Is this not solved by Cargo and version locking? I think this was one of the largest complexities I found while reading the proposal

@jswrenn
Copy link
Member Author

jswrenn commented Aug 28, 2020

Just curious, why is the versioning baked into the type system? Is this not solved by Cargo and version locking? I think this was one of the largest complexities I found while reading the proposal

@JulianKnodt This is a great question!

Why do we need stability?

We need a stability system for at least two reasons:

  1. The usual rules SemVer stability dictates that if a trait is implemented in version m.a.b, it'll continue to be implemented for all versions m.x.y, where x ≥ a and y ≥ b. TransmuteFrom<Src, NeglectStability> is the exception to this rule. I think the drawbacks of introducing exceptional behavior are small compared to the drawbacks of not solving safer transmutation, but it would be irresponsible to do nothing to mitigate this stability hazard. The compromise of this RFC is that TransmuteFrom should be stable-by-default: Dst: TransmuteFrom<Src> follows the usual SemVer rules; Dst: TransmuteFrom<Src, NeglectStability> does not.
  2. I believe we'll need to use the simplified definition of constructability (or it will be a very long time before safer transmutation is realized), but that definition has a soundness hole. We have three options:
    • Pretend it doesn't exist. I don't view this as a real option.
    • Give up on completely safe transmutation; only offer unsafe transmutation. This option fails to remove any unsafe blocks from end-users code.
    • Allow safe transmutation only when the type authors have promised they're not creating a situation where this soundness hole arises. This is the option the RFC recommends, and stability declaration provides a natural mechanism for making this promise.

Why is stability so complicated?

I go into a ton of depth exploring the implications and design rationale behind stability, but I think it might be one of the simpler parts of the RFC. Whereas ensuring soundness and safety requires non-trivial compiler support, stability doesn't—it's just two normal traits and an impl:

pub trait PromiseTransmutableFrom
{
    type Archetype
        : TransmuteInto<Self, NeglectStability>
        + PromiseTransmutableFrom;
}

pub trait PromiseTransmutableInto
{
    type Archetype
        : TransmuteFrom<Self, NeglectStability>
        + PromiseTransmutableInto;
}

unsafe impl<Src, Dst> TransmuteFrom<Src, ()> for Dst
where
    Src: PromiseTransmutableInto,
    Dst: PromiseTransmutableFrom,

    <Dst as PromiseTransmutableFrom>::Archetype:
        TransmuteFrom<
            <Src as PromiseTransmutableInto>::Archetype,
            NeglectStability>
{}

That's the entire stability system!

In short: I think the depth I go into when documenting stability might make it seem more complicated than it is. I'm increasingly thinking that removing the in-depth explanations of stability from the RFC might be a good idea.


UPDATE: I've incorporated this answer into the RFC. I've also removed the 'Dissecting Stability' and 'Uncommon Use-Case: Weak Stability Guarantees' sections. If anyone has objections, please let me know.

@jswrenn
Copy link
Member Author

jswrenn commented Aug 31, 2020

The RFC is ready to formally submit! I'm going to merge this PR. I'll file a PR against rust-lang/rfcs either tonight or tomorrow morning.

@jswrenn jswrenn merged commit 038f193 into rust-lang:master Aug 31, 2020
jswrenn added a commit to jswrenn/rfcs that referenced this pull request Aug 31, 2020
@jswrenn
Copy link
Member Author

jswrenn commented Sep 1, 2020

RFC submitted!

// | ^^^^^^^^^^^^^^ the trait `TransmuteFrom<foo::Foo, _>` is not implemented for `u32`
// |
// = note: required because of the requirements on the impl of `TransmuteInto<u32, _>` for `foo::Foo`
// = note: byte 8 of the source type may be uninitialized; byte 8 of the destination type cannot be uninitialized.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this say "bit 8" instead of "byte 8" ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.