What is the safety invariant, if any, for unions? #352
Comments
This isn't quite correct. Unless This isn't an unspecified safety invariant. It is the baseline safety invariant, of "only use safe code." This is a vaguely operational rule rather than an axiomatic representational one, but it is the baseline for every Rust type.
It wouldn't create new UB; any existing codepath would remain UB-free. What it does do is make code which was potentially sound previously now unsound (able to produce UB from safe inputs). I'm obviously in favor of (1) +
Making it safe to construct an uninitialized means that But more than anything else, I think a pub
Your sentence is cut off here.
Whoops, that's what I get for writing these comments out-of-order. Finished my thought there.
To clarify, by "trivial safety invariant", I mean the invariant of
This seems reasonable; I changed my comment to use "baseline safety requirement" to refer to the nontrivial requirement put onto all Rust code to not do unsound
Thanks. With this clarification I definitely wrote my comment up wrong. I will edit it to capture the distinction I was trying to make.
Now to actually respond...
I had to think about why this concept didn't quite sit right with me, and I've figured it out. This leads to me agreeing with CAD97 (and by extension RalfJung), but I'll take the time to explain why, mostly for the benefit of others who thought similarly, and for the benefit of documentation writers trying to prevent it. I think it's largely because of the following intuitive heuristic:
Trying to fit unions into this framework goes horribly. It's easy for people, starting from this intuition, to end up on a logical journey. (For illustrative purposes only):
That's almost worse 😂. More chance of the problem going undetected for longer.
There is no default construction option for a union at present. That was a spitball idea that we discussed briefly, but Ralf is opposed and I am not proposing it. Someone else can spend their time on that if they really want.
A false peace? Who is the war between? 😛 But yes, it really is. Such a type is almost completely useless. Which is probably another thing that affects heuristic reasoning: if I have access to the field, surely I'm meant to do something with it?
[digs through the old code bin] I dunno, I've got this thing here that's been sitting in a lib for ages and the API seems pretty clear to me:

```rust
#[allow(non_camel_case_types)]
#[repr(C, align(16))]
#[rustfmt::skip]
union ConstUnionHack128bit {
    f32a4: [f32; 4],
    f64a2: [f64; 2],
    i8a16: [i8; 16],
    i16a8: [i16; 8],
    i32a4: [i32; 4],
    i64a2: [i64; 2],
    u8a16: [u8; 16],
    u16a8: [u16; 8],
    u32a4: [u32; 4],
    u64a2: [u64; 2],
    f32x4: f32x4,
    f64x2: f64x2,
    i8x16: i8x16,
    i16x8: i16x8,
    i32x4: i32x4,
    i64x2: i64x2,
    u8x16: u8x16,
    u16x8: u16x8,
    u32x4: u32x4,
    u64x2: u64x2,
    u128: u128,
}
```

The union and fields happen to not be
True, but this also seems like a perfect example of a (assuming it is as self-evident as we think it is, and we're both imagining similar usage for it)
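For concreteness, one plausible use of a union like this is reinterpreting bits in `const` context. The following is only a cut-down sketch under that assumption: `Bits128`, `f32a4_from_bits` and `ONES` are hypothetical names, and the SIMD fields are omitted so that the example is self-contained.

```rust
// Hypothetical, reduced version of the union above: arrays and u128 only.
#[allow(non_camel_case_types, dead_code)]
#[derive(Clone, Copy)]
#[repr(C, align(16))]
union Bits128 {
    f32a4: [f32; 4],
    u32a4: [u32; 4],
    u128: u128,
}

/// Reinterprets four u32 bit patterns as four f32 values in a const context.
const fn f32a4_from_bits(bits: [u32; 4]) -> [f32; 4] {
    // SAFETY: every field is 16 bytes of plain data, and every bit pattern
    // is a valid [f32; 4], so reading a different field than the one
    // written is fine for this particular union.
    unsafe { Bits128 { u32a4: bits }.f32a4 }
}

// 0x3F80_0000 is the IEEE 754 bit pattern of 1.0f32.
const ONES: [f32; 4] = f32a4_from_bits([0x3F80_0000; 4]);

fn main() {
    println!("{:?}", ONES);
}
```

Because every field here accepts every bit pattern, a type like this is a natural candidate for the opt-in "all fields are mutually transmutable" invariant discussed in the thread.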
This is a somewhat odd discussion because generally speaking, safety invariants are defined by the author of the respective type. That's, like, their entire purpose. On the UCG / lang-team side, we should not be in the business of defining other people's safety invariants.

That said, there is somewhat of a "default safety invariant" that justifies all the operations which are provided "by default" for such a type. For a struct, that is simply the safety invariant of all its fields. Crate authors can strengthen and/or weaken this as they see fit. However, as you noted, you have to be careful that the public API surface is still justified by the safety invariant -- so if you have a

We never had to spell any of this out because it is somewhat obvious -- if you don't use

For unions, the only default safe operations we provide are
That's not a lot, and hence the "default safety invariant" can indeed be trivial, i.e.,

However the general sentiment seems to be that we cannot do that? I think at this point we are entering the area of libs API guidance -- we are basically defining, if the crate author fails to document the safety invariant of their

I could get behind a way to add an attribute that explicitly declares a safety invariant and also interacts with safe transmute things. I don't think anything like that should happen by default though. Basically, adding

So, I think I am in full agreement with @CAD97. :) Option 1 in the original post, plus maybe an attribute for option 3iii, if it indeed has sufficient motivation.
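The point about a crate author strengthening a struct's safety invariant, and the public API having to stay justified by it, can be illustrated with a small sketch. The `Positive` type and its methods are hypothetical and are not taken from the thread.

```rust
mod positive {
    use std::num::NonZeroU32;

    /// Author-chosen safety invariant, stronger than the default for `u32`:
    /// the wrapped value is never zero.
    pub struct Positive(u32); // private field: safe code outside cannot break the invariant

    impl Positive {
        /// The only safe constructor; it enforces the invariant.
        pub fn new(n: u32) -> Option<Positive> {
            if n != 0 { Some(Positive(n)) } else { None }
        }

        pub fn as_nonzero(&self) -> NonZeroU32 {
            // SAFETY: relies on the invariant documented on `Positive`.
            unsafe { NonZeroU32::new_unchecked(self.0) }
        }
    }
}

fn main() {
    let p = positive::Positive::new(5).expect("5 is non-zero");
    println!("{}", p.as_nonzero());
}
```

If the field were `pub`, safe code could write `Positive(0)` and `as_nonzero` would become unsound, even though no `unsafe` block changed.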
Now it's my turn to point out that your thought got cut off :) It also looks like markdown may have eaten some context in that section.
I have tried to document everything I believe we have consensus on. I've left some things open that I possibly could have closed, but because this PR is very big, I would like to focus on getting it in as quickly as possible and worrying about whatever's left afterwards. I strongly encourage others to submit follow up PRs to close out the other open issues. Closes rust-lang#156. Closes rust-lang#298. Closes rust-lang#352.
Per #73, discussion is converging on unions having no validity invariant, i.e., any bit-pattern is valid for a union type, including completely uninitialized memory. In safe Rust, however, not all valid bit-patterns can necessarily be created, because unions are checked for initialization:
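A minimal sketch of that check, assuming a hypothetical one-field union `U` (the names here are illustrative and are not the issue's original snippet):

```rust
// Hypothetical one-field union used only for illustration.
#[derive(Clone, Copy)]
#[repr(C)]
union U {
    i: i32,
}

/// Safe construction: a union expression must initialize exactly one field.
fn make_initialized() -> U {
    U { i: 0 }
}

fn main() {
    // Safe code cannot use a union binding that was never initialized:
    //
    //     let u: U;
    //     let v = u; // rejected by the compiler: `u` is used before being initialized
    //
    let u = make_initialized();
    // Reading a union field always requires `unsafe`, even for a one-field union.
    let i = unsafe { u.i };
    println!("{i}");
}
```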
Because union values cannot be consumed in safe Rust, however, we find ourselves needing to decide which, if either, of the following two functions are unsound:
At least one must be unsound, because another crate outside the trust boundary of either of them can call `producer::get_i(consumer::get_i())`, which is clearly UB.

To summarize the Zulip background, @RalfJung expressed the opinion that, by default, unions have an unspecified safety invariant and therefore, in the absence of clear documentation from the `definer` crate on a safety invariant, both the `producer` and `consumer` crate are unsound. The `producer` should not create a value which cannot be created with safe Rust, and the `consumer` crate cannot assume that the value has any properties.

@Lokathor brought up that the safe transmute project is also interested in safe union field access in situations where all fields can be safely transmuted to one another. That is to say, `unsafe` would not be required to use a type like `union { i: i32, f: f32 }`. Since this is trivially true of one-field unions, this would mean that `get_i` would be not only sound but actually safe.

This requires a safety invariant, as clearly the uninitialized union would cause safe field access to be unsound. In reply, @RalfJung suggested that for unions he would simply have suggested the trivial invariant, i.e., all values are safe, including uninitialized ones, but that this safe transmutation would suggest an additional invariant. Note that if all unions have a trivial safety invariant, then the `producer` crate above would be sound.

So what actually is the safety invariant? Some options:
- `producer` and `definer` are unsound.
- `producer` is unsound, but `definer` is sound (and possibly later safe with compiler support), because safe code cannot produce the uninitialized value.
- `unsafe` code, as it would require the coder to carefully think about whether the union has safe transmutation. It would also create a risk that changes to the union definition would silently create new UB (e.g. by adding a new field to a struct member of the union, which is normally not a breaking change).
- `#[safe_transmute_union]`, which could allow an opt-in mutual transmutation safety invariant. Thus either 1 or 2 would apply by default, but the attribute would change the invariant.

I am extremely partial to this last option because it makes it clear when a union does or does not support this type of invariant.
And I do not like the idea of unspecified safety invariants, so I would go with option 2 for the default. (edit: see below)

I'm opening a separate issue about the offset of the field (which is why `#[repr(C)]` is required in the example).
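To make the producer/consumer question concrete, here is an illustrative sketch of the kind of situation described above. It is written under assumptions and is not the code from the issue: the module layout, the union `U`, and the function name `make_uninit` are all hypothetical.

```rust
mod definer {
    // Hypothetical union with no documented safety invariant.
    #[derive(Clone, Copy)]
    #[repr(C)]
    pub union U {
        pub i: i32,
    }
}

mod producer {
    use super::definer::U;
    use std::mem::MaybeUninit;

    /// Returns a union whose bytes were never written. This is only
    /// acceptable if uninitialized memory is a valid (and safe) value of `U`.
    #[allow(dead_code)]
    pub fn make_uninit() -> U {
        unsafe { MaybeUninit::<U>::uninit().assume_init() }
    }
}

mod consumer {
    use super::definer::U;

    /// Reads the field behind a safe signature. This is only sound if every
    /// safe value of `U` has an initialized `i`.
    pub fn get_i(u: U) -> i32 {
        unsafe { u.i }
    }
}

fn main() {
    // Fine: the union was built by safe code, so `i` is initialized.
    println!("{}", consumer::get_i(definer::U { i: 7 }));

    // Entirely safe code could also write the following, which would read
    // an uninitialized i32 and is therefore UB:
    //
    //     consumer::get_i(producer::make_uninit());
}
```

Under a trivial safety invariant the blame falls on `consumer::get_i`; under an "initialized `i`" invariant it falls on `producer::make_uninit`. With no stated invariant, neither function can justify its safe signature.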