Detail layout of repr(C) unions #160

Merged: 20 commits, Aug 15, 2019
Changes from 9 commits
86 changes: 76 additions & 10 deletions reference/src/layout/unions.md
@@ -18,16 +18,6 @@ is already entirely determined by their types, and since we intend to allow
creating references to fields (`&u.f1`), unions do not have any wiggle-room
there.

### C-compatible layout ("repr C")

For unions tagged `#[repr(C)]`, the compiler will apply the C layout scheme. Per
sections [6.5.8.5] and [6.7.2.1.16] of the C11 specification, this means that
the offset of every field is 0. Unsafe code can cast a pointer to the union to
a field type to obtain a pointer to any field, and vice versa.

[6.5.8.5]: http://port70.net/~nsz/c/c11/n1570.html#6.5.8p5
[6.7.2.1.16]: http://port70.net/~nsz/c/c11/n1570.html#6.7.2.1p16

### Default layout ("repr rust")

**The default layout of unions is not specified.** As of this writing, we want
@@ -38,3 +28,79 @@ contents are.
Even if the offsets happen to be all 0, there might still be differences in the
function call ABI. If you need to pass unions by-value across an FFI boundary,
you have to use `#[repr(C)]`.
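
As a rough illustration of the by-value FFI rule above, the sketch below passes a `#[repr(C)]` union by value to an `extern "C"` function. The names `Value` and `value_as_int` are hypothetical, and the function is defined in Rust only to keep the example self-contained; in real FFI code it would typically be implemented in C.

```rust
// Hypothetical union shared with C code; `repr(C)` fixes both its layout
// and the way it is passed by value in the C ABI.
#[repr(C)]
#[allow(dead_code)]
pub union Value {
    int: i32,
    float: f32,
}

// Stand-in for a foreign function that takes the union by value.
pub extern "C" fn value_as_int(v: Value) -> i32 {
    // SAFETY: callers in this sketch always initialize the `int` field.
    unsafe { v.int }
}

fn main() {
    assert_eq!(value_as_int(Value { int: 7 }), 7);
}
```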

### Layout of "repr C" unions

The layout of `repr(C)` unions follows the C layout scheme. Per sections
[6.5.8.5] and [6.7.2.1.16] of the C11 specification, this means that the offset
of every field is 0. Unsafe code can cast a pointer to the union to a field type
to obtain a pointer to any field, and vice versa.

[6.5.8.5]: http://port70.net/~nsz/c/c11/n1570.html#6.5.8p5
[6.7.2.1.16]: http://port70.net/~nsz/c/c11/n1570.html#6.7.2.1p16
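
The pointer casts described above might look like the following minimal sketch. The union `IntOrFloat` and the helper `read_f_via_cast` are hypothetical names chosen for illustration, and the example assumes that both fields are always fully initialized when read.

```rust
#[repr(C)]
union IntOrFloat {
    i: u32,
    f: f32,
}

// Casts a pointer to the union into a pointer to the field `f` and reads it.
fn read_f_via_cast(u: &IntOrFloat) -> f32 {
    let p = u as *const IntOrFloat as *const f32;
    // SAFETY: `f` is at offset 0 of the repr(C) union, and any four
    // initialized bytes are a valid `f32`.
    unsafe { *p }
}

fn main() {
    let u = IntOrFloat { f: 1.5 };
    assert_eq!(read_f_via_cast(&u), 1.5);

    // The other direction: a pointer to a field, cast back to the union.
    let mut v = IntOrFloat { i: 0 };
    let field_ptr = &mut v as *mut IntOrFloat as *mut u32;
    // SAFETY: `field_ptr` points to the start of `v`, which is valid for writes.
    unsafe { field_ptr.write(1.5f32.to_bits()) };
    let union_ptr = field_ptr as *mut IntOrFloat;
    // SAFETY: `union_ptr` points to `v`, whose four bytes are initialized.
    assert_eq!(unsafe { (*union_ptr).f }, 1.5);
}
```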

#### Padding

Since all fields are at offset 0, `repr(C)` unions do not have padding before
their fields. They can, however, have trailing padding, to make sure the size is
a multiple of the alignment:

```rust
# use std::mem::{size_of, align_of};
#[repr(C, align(2))]
union U { x: u8 }
# fn main() {
// The repr(align) attribute raises the alignment requirement of U to 2
assert_eq!(align_of::<U>(), 2);
// This introduces trailing padding, raising the union size to 2
assert_eq!(size_of::<U>(), 2);
# }
```
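
Trailing padding can also appear without `repr(align)`, when the largest field is not the most-aligned one. The sketch below uses a hypothetical union `Bytes3OrU16` and assumes `u16` has an alignment of 2, as on mainstream targets.

```rust
use std::mem::{align_of, size_of};

#[repr(C)]
#[allow(dead_code)]
union Bytes3OrU16 {
    bytes: [u8; 3], // largest field: 3 bytes, alignment 1
    half: u16,      // most-aligned field: 2 bytes, alignment 2
}

fn main() {
    // The alignment is the maximum of the field alignments.
    assert_eq!(align_of::<Bytes3OrU16>(), 2);
    // 3 bytes are rounded up to the next multiple of 2: one byte of
    // trailing padding.
    assert_eq!(size_of::<Bytes3OrU16>(), 4);
}
```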

> **Note**: Fields are overlapped instead of laid out sequentially, so
> unlike structs there is no "between the fields" that could be filled
> with padding.

#### Zero-sized fields

If a `#[repr(C)]` union contains a zero-sized field, that field does not
occupy space in the union. For example:

```rust
# use std::mem::{size_of, align_of};
#[repr(C)]
union U {
    x: u8,
    y: (),
}
# fn main() {
assert_eq!(size_of::<U>(), 1);
# }
```

The field does, however, participate in the layout computation of the union, and
can raise its alignment requirement, which in turn can introduce trailing
padding. For example:

```rust
# use std::mem::{size_of, align_of};
#[repr(C)]
union U {
    x: u8,
    y: [u16; 0],
}
# fn main() {
// The zero-sized type [u16; 0] raises the alignment requirement to 2
assert_eq!(align_of::<U>(), 2);
// This introduces trailing padding, raising the union size to 2
assert_eq!(size_of::<U>(), 2);
# }
```

This handling of zero-sized types is equivalent to the handling of zero-sized
types in struct fields, and matches the behavior of GCC and Clang for unions in
C when zero-sized types are allowed via their language extensions.
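
To make the comparison with structs concrete, the following sketch declares a `repr(C)` struct and a `repr(C)` union with the same two fields and checks that the zero-sized field affects both in the same way. The names `S` and `U` are illustrative only, and the numbers assume `u16` has an alignment of 2.

```rust
use std::mem::{align_of, size_of};

#[repr(C)]
#[allow(dead_code)]
struct S {
    x: u8,
    y: [u16; 0],
}

#[repr(C)]
#[allow(dead_code)]
union U {
    x: u8,
    y: [u16; 0],
}

fn main() {
    // In both cases the zero-sized field raises the alignment to 2 ...
    assert_eq!(align_of::<S>(), 2);
    assert_eq!(align_of::<U>(), 2);
    // ... and the size is rounded up to 2, although `y` occupies no space.
    assert_eq!(size_of::<S>(), 2);
    assert_eq!(size_of::<U>(), 2);
}
```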

**C++ compatibility hazard**: C++ does, in general, give a size of 1 to empty
Member

Should we maybe say "struct with no field"? "empty" in Rust means something else than in C++.

comex, Aug 14, 2019

Can we add a link somewhere to the section about zero-sized types in structs-and-tuples.md? That section goes into a lot more detail, e.g.

  • explicitly stating that empty structs are illegal in C
  • mentioning [[no_unique_address]]
  • differentiating between structs with no fields and structs with fields of zero size

There's no need to repeat all that here, but a link would be helpful.

(Perhaps also mention that Rust unions can't have zero fields.)

Contributor Author

The text now uses "struct with no fields" and refers and links to the structs chapter for details.

Member

RalfJung, Aug 15, 2019

Some of your commits seem to have been lost? Like, "Link to struct chapter" just is not in this PR any more.

Contributor Author

That might not have been intended. I merged the "suggestion changes" and that had conflicts so I had to force push :/

Member

Force-push shouldn't be needed after merging...? Weird.

Contributor Author

I screwed up. I need to learn how to "pull" changes to my branch locally and rebase them without spurious merge commits when my local branch has other changes :/

structs. If an empty struct in C++ is used as a union field, a "naive"
translation of that code into Rust will not produce a compatible result.
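
The hazard can be seen from the Rust side in the sketch below. All type names are hypothetical. `Naive` is what a field-for-field translation of a C++ union holding an empty struct might look like. `Padded` shows one possible adjustment, an assumption of this example rather than a recommendation from the text, in which a one-byte placeholder field mimics the size of 1 that C++ generally gives to such structs.

```rust
use std::mem::size_of;

// A struct with no fields is zero-sized in Rust, even with repr(C).
#[repr(C)]
#[derive(Clone, Copy)]
struct NoFields;

// "Naive" translation: the union ends up with size 0.
#[repr(C)]
#[allow(dead_code)]
union Naive {
    e: NoFields,
}

// Hypothetical adjustment: a one-byte placeholder field.
#[repr(C)]
#[derive(Clone, Copy)]
#[allow(dead_code)]
struct OneByte {
    _placeholder: u8,
}

#[repr(C)]
#[allow(dead_code)]
union Padded {
    e: OneByte,
}

fn main() {
    assert_eq!(size_of::<NoFields>(), 0);
    assert_eq!(size_of::<Naive>(), 0);
    assert_eq!(size_of::<Padded>(), 1);
}
```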