Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Layout of homogeneous structs #36

Open
nikomatsakis opened this issue Oct 11, 2018 · 37 comments
Open

Layout of homogeneous structs #36

nikomatsakis opened this issue Oct 11, 2018 · 37 comments
Labels
A-layout Topic: Related to data structure layout (`#[repr]`) S-not-opsem Despite being in this repo, this is not primarily a T-opsem question T-lang

Comments

@nikomatsakis
Copy link
Contributor

From #31: If you have homogeneous structs, where all the N fields are of a single type T, can we guarantee a mapping to the memory layout of [T; N]? How do we map between the field names and the indices? What about zero-sized types?

A specific proposal:

  • If all fields have the same type T (ignoring zero-sized types), then the layout will be the same as [T; N] where N is the number of fields
  • If the struct is defined with named fields, the mapping from fields to their indices is undefined (so foo.bar maps an undefined index)
  • If the struct is defined as a tuple struct (or a tuple), then the indices are derived from the definition (so foo.0 maps to [0])

This is basically because it's convenient but also because it's about the only sensible thing to do (unless you imagine we might want to start inserting padding between random fields or something, which connects to #35).

Note that for tuple structs (and tuples), the ordering of (public) fields is also significant for semver and other purposes -- changing the ordering will affect your clients. The same is not true for named fields and so this proposal attempts to avoid making it true.

@nikomatsakis nikomatsakis added A-layout Topic: Related to data structure layout (`#[repr]`) active discussion topic labels Oct 11, 2018
@the8472
Copy link
Member

the8472 commented Oct 11, 2018

(unless you imagine we might want to start inserting padding between random fields or something

Would there ever be a reason to automatically insert tail padding for homogenous structs?

@hanna-kruppe
Copy link

The most plausible reason for not guarantee this would be to raise the alignment of the aggregate to allow copying the type with fewer or more efficient instructions (e.g., align a struct of four u8s to 32 bit and use aligned 32 bit load and store instructions for it).

@Gankra
Copy link

Gankra commented Oct 11, 2018

I can also imagine two hypothetical but very dubious scenarios to add to the pile:

  • We explicitly want to break code that's relying on the order without an annotation because someone might reorder the fields syntactically without knowing it's important (in some sense, using repr(C) as a "I promise this type will have some kind of stable ABI" annotation)

  • With profile-guided optimization (PGO), it is determined that (say) Foo.7 being first in the type is better due to cache effects.

@gnzlbg
Copy link
Contributor

gnzlbg commented Oct 12, 2018

The most plausible reason for not guarantee this would be to raise the alignment of the aggregate to allow copying the type with fewer or more efficient instructions

I think we should be able to do this for arrays too, and also for homogenous parts of aggregates like struct A { a: u8, x: f32, y: f32, z: f32, w: f32, b: u8 } where it might be worth it to reorder {xyzw} to the front not only to reduce A's size, but also to be able to perform SIMD operations on these fields, which might require raising the alignment of A from 4 to, e.g., 16. I don't think any implementation will do these optimizations in the near future (maybe never), but if leaving the door open for these does not causes many headaches it might still be worth doing nevertheless. We can always provide more guarantees later.

@hanna-kruppe
Copy link

I think we should be able to do this for arrays too

That would break ABI compatibility with C arrays.

Also, one reason why one might not want to raise the alignment of any aggregate is that it tends to cause extra padding, either internally (in your example struct A, increasing alignment to 16 bytes also increases the size of the struct from 20 bytes to 32 bytes) or externally, i.e., requiring extra padding in an aggregate that contains a field whose alignment requirement was raised.

@gnzlbg
Copy link
Contributor

gnzlbg commented Oct 12, 2018

That would break ABI compatibility with C arrays.

Yes, repr(Rust) arrays would then differ from repr(C) arrays - I don't know how big of a deal this would be.

Also, one reason why one might not want to raise the alignment of any aggregate is that it tends to cause extra padding, either internally (in your example struct A, increasing alignment to 16 bytes also increases the size of the struct from 20 bytes to 32 bytes) or externally, i.e., requiring extra padding in an aggregate that contains a field whose alignment requirement was raised.

Yes, that's a very good reason not to do that. FWIW I wasn't suggesting to do this unconditionally, but rather that we don't have to forbid an implementation from doing this if it deems it worthwhile right now.

@Gankra
Copy link

Gankra commented Oct 12, 2018

You can't put a repr on a builtin type, so there's no such thing as a "repr(c) array". Also we've long guaranteed their layout, and that ship has sailed.

@gnzlbg
Copy link
Contributor

gnzlbg commented Oct 12, 2018

An alternative would be to asymmetrically require that tuples are layout compatible with arrays, but to not require the layout of tuples and arrays to be identical. That is, arrays would not be layout compatible with tuples.

This would allow the compiler to increase the alignment of homogeneous structs and tuples, and also allow &(T, T, T, T) as &[T; 4]. It would not allow &[T; 4] as &(T, T, T, T) because the alignment of [T; 4] might be smaller than the alignment of (T, T, T, T). An union like U { a: [T; 4], t: (T, T, T, T) } could, however, be used to re-interpret the bytes of an array as a tuple because the union will properly align the array.

AFAICT it would be backwards compatible to, in the future, make the stronger guarantee that the layout of homogeneous tuples and arrays is identical, which would allow &[T; 4] as &(T, T, T, T).

@RalfJung
Copy link
Member

These ordering issues make me think that for homogeneous non-tuple-structs, while we could guarantee that the layout matches that of an array, we should not guarantee how the fields are mapping to array indices. Getting that from the definition order seems strange to me.

I do not have a strong opinion on this, though.

@nikomatsakis
Copy link
Contributor Author

nikomatsakis commented Oct 18, 2018

@RalfJung

These ordering issues make me think that for homogeneous non-tuple-structs, while we could guarantee that the layout matches that of an array, we should not guarantee how the fields are mapping to array indices. Getting that from the definition order seems strange to me.

This was, I believe, what I meant when I wrote "If the struct is defined with named fields, the mapping from fields to their indices is undefined (so foo.bar maps an undefined index)".

This also seems to address @gankro's first concern:

We explicitly want to break code that's relying on the order without an annotation because someone might reorder the fields syntactically without knowing it's important (in some sense, using repr(C) as a "I promise this type will have some kind of stable ABI" annotation)


With respect to the PGO or SIMD scenarios, if we adopted this proposal, the idea would probably be that one out to explicitly opt in to "non-standard" layouts of this kind (e.g., #[repr(opt)] or something). I think this is probably reasonable -- basically #[repr(Rust)] gives you a compact layout by default that is optimized to do reasonably well on any program with any usage pattern. If you have specific needs (e.g., simd, avoiding false sharing) then those would be something you declare.

@hanna-kruppe
Copy link

hanna-kruppe commented Oct 18, 2018

I think this is probably reasonable -- basically #[repr(Rust)] gives you a compact layout by default that is optimized to do reasonably well on any program with any usage pattern. If you have specific needs (e.g., simd, avoiding false sharing) then those would be something you declare.

This has been my position for the longest time, but recently I'm less sure since the scenario outlined in #36 (comment) doesn't really need SIMD or any hypothetical whole-program/profile-guided optimizations to be profitable. It would be nice to be able to raise e.g. the alignment of struct Rgba { r: u8, g: u8, b: u8, a: u8 } to four bytes because doing so would rarely increase data structure sizes (if the struct is private we can even sometimes prove this locally) and on many targets it could make copies of Rgba use 4x fewer instructions.

This is relatively niche compared to reordering to remove internal packing, and considering the nice guarantees it would conflict with, I am not arguing we should reserve that freedom, but I do regret that more than I regret PGO-driven or SIMD-targeted layout changes.

@briansmith
Copy link

basically #[repr(Rust)] gives you a compact layout by default that is optimized to do reasonably well on any program with any usage pattern. If you have specific needs (e.g., simd, avoiding false sharing) then those would be something you declare.

I would expect #[repr(Rust)] to be able to optimize however it wants (not just for compactness), based on whatever information it has, including all whole-program analysis. In particular, I would expect it to eventually be able to notice which types might be best optimized for SIMD or avoiding false sharing.

If all fields have the same type T

What if all fields have different types, A, B, C, D, all of which are #[repr(transparent)] wrappers around the same T? There should be no performance penalty to using #[repr(transparent)] wrappers.

What if all fields have similar types, e.g. i8 and u8, or i32 and u32, or T& and T* and some transparent wrapper around T*? I think it would be too surprising to change T* to a T& or change the signness of a field and have that change affect the layout in the way this issue suggests.

I think there's a lot of advantages to treating homogeneous and heterogeneous structs identically. For one, there's no need to define "homogeneous."

@hanna-kruppe
Copy link

@briansmith Fair points, but if we decide to treat homogenuous aggregates the same as other aggregates then I'd suggest teasing out the salient properties (e.g. size+align) and give guarantees about the layout of all repr(Rust) aggregates that in particular imply that things like (u8, i8), (&T, *const T), etc. are laid out like arrays. Special casing homogenuous aggregates was, after all, originally conceived as a compromise solution between that and giving no guarantees at all.

@briansmith
Copy link

Another thing to consider: Let's say you have a heterogeneous type with fields of types A, B, and A. Then you change the type of the second field to A, making the struct homogeneous. Should this really reduce the kinds of optimizations the compiler can perform, especially if A and B are the same size and alignment?

@the8472
Copy link
Member

the8472 commented Nov 24, 2018

@briansmith for the heterogeneous type the compiler is free to reorder and add padding. For the homogeneous type it would at most be allowed to raise alignment up to the size itself.

@RalfJung RalfJung added the C-open-question Category: An open question that we should revisit label Nov 29, 2018
@gnzlbg gnzlbg mentioned this issue Feb 21, 2019
@gnzlbg
Copy link
Contributor

gnzlbg commented Sep 21, 2019

While full layout compatibility might be useful (size, alignment, call ABI, niches, and whatnot), I think maybe restrict this to either being able to transmute between two types (e.g. [T; 2] to (T, T)), which only requires the size to match, or even further, to be able to memcpy the smaller type into the larger type (e.g. iff (T, T) is larger than [T; 2], you can memcpy [T; 2] to the beginning of (T, T), and that is guaranteed to give you the array field i at tuple index i).

This would suffice for many use cases (transmute, repr(C) union field access), while leaving out many degrees of freedom open such as type alignment, trailing padding, etc.

gnzlbg added a commit to gnzlbg/unsafe-code-guidelines that referenced this issue Dec 5, 2019
@gnzlbg
Copy link
Contributor

gnzlbg commented Dec 5, 2019

PR #220 shows the diff of what that last proposal would look like.

I would personally prefer to just guarantee that:

All homogeneous aggregates with the same length and of the same repr have the same size and field offsets.

I'm not sure if this would imply that the first field of a struct/tuple/tuple_struct maps to the array index 0, the second field to index 1, etc. but we probably would make things clearer by spelling that out as well.

This would allow raising the alignment of aggregates up to their size, but it wouldn't allow PGO optimizations like re-ordering fields.

@gnzlbg
Copy link
Contributor

gnzlbg commented Dec 5, 2019

Summary of all drawbacks mentioned against providing layout guarantees.

Guaranteeing the layout of homogeneous aggregates:

  1. @rkruppe: prevents raising the alignment to allow copying with fewer instructions(here and here).

  2. @Gankra prevents reordering fields for better cache utilization (e.g. using PGO information) (here).

  3. @Gankra prevents randomizing field order to break code that relies on a particular order: (here

  4. @briansmith prevents optimizing types for SIMD (here)

  5. @briansmith prevents optimizing to avoid false sharing (here)

  6. @briansmith is surprising if, when a field type changes, the aggregate changes between homogeneous and heterogeneous, and different optimizations apply. (here)

I think those are all drawbacks mentioned.

The value and drawbacks of the following guarantees:

  • All homogeneous aggregates of the same repr and len have the same size
    • Drawbacks that apply: 4, 5, 6.
    • Value: type punning / transmute between aggregates works: [T; 2] -> (T, T) -> [T; 2]
  • All homogeneous aggregates of the same repr and len have the same size and field offsets:
    • Drawbacks that apply: (4,5,6), 2, 3
    • Value: on top of type punning, [T; 2][i] and (T, T).i access the same element.

From the drawbacks that apply,

  • \4. SIMD optimizations: a SIMD optimization was suggested also here but it has severe drawbacks mentioned here. No other SIMD optimization has been mentioned.

  • \5. Optimizing to avoid false sharing: if Rust is able to detect at compile-time or run-time that there is false sharing somewhere, I think it would be better to report it to the user, so that the user fixes its code. This might not be trivial, e.g., due to generics, but false sharing is bad enough that it should never happen, and it happening should not depend on optimizations, PGO information, etc.

  • \6. Surprising change in optimizations between heterogeneous and homogeneous aggregates: how surprising this is will depend on the impact of the different optimizations that get applied.

  • \2. Re-order fields for better cache utilization based on PGO: A report that includes the worst offenders could be enough to allow the user to apply it manually for those parts of the code for which it is worth it, and that will already be hard due to generics. But this optimization could be applied to every type in the program, and it feels unrealistic for users to apply it manually, so this might be a real loss. C and C++ do not allow this optimization, so I don't know how bad losing it would be, and whether opting into it via a #[repr(pgo)] or similar flag would be possible or worth it.

  • \3. Randomizing field order to break broken code that accidentally relies on a particular order: if we guarantee the order, then that code isn't broken, so I don't know why would it be worth it to break it.

At least from the drawbacks and value mentioned, to me this boils down to whether sacrificing being able to re-order fields for better cache utilization is worth allowing users to type-pun between homogeneous aggregates and switching between name-based and index-based field access.

@RalfJung
Copy link
Member

RalfJung commented Dec 5, 2019

All homogeneous aggregates of the same repr and len have the same size and field offsets:

To be clear, with this proposal you mean that the fields of a (non-tuple) struct are guaranteed to be laid out in source order, and for tuples and tuple structs in index order? Drawbacks 2 and 3 only apply with such a guarantee, as far as I can tell.

@the8472
Copy link
Member

the8472 commented Dec 5, 2019

For structs and tuple structs it would make the order of the named members part of the API, i.e. reordering would become a breaking change where it wasn't before.

@RalfJung
Copy link
Member

RalfJung commented Dec 5, 2019

For structs and tuple structs it would make the order of the named members part of the API, i.e. reordering would become a breaking change where it wasn't before.

No that is not correct. API stability does not imply stability for clients that choose to transmute.

@the8472
Copy link
Member

the8472 commented Dec 5, 2019

Does that also apply when the fields are public? If someone reads the UCGs and then sees a homogeneous struct then they could conclude "aha, this is the same as an array, I can transmute this safely!"

@gnzlbg
Copy link
Contributor

gnzlbg commented Dec 5, 2019

@RalfJung

To be clear, with this proposal you mean that the fields of a (non-tuple) struct are guaranteed to be laid out in source order, and for tuples and tuple structs in index order?

For homogeneous structs yes, that what I think that implies. That is, this would be guaranteed to work and never panic:

struct Foo { x: i32, y: i32 }
let foo = Foo { x: 42, y: 43 };
let foo: [i32; 2] = transmute(foo); 
assert_eq!(foo[0], 42);
assert_eq!(foo[1], 43);

@the8472

If all fields of the struct are public, then users can rely on the layout of the struct, and changing it is a breaking change. In particular, adding a private field to a struct whose fields are all public is a breaking change (and breaks, e.g., constructors, patterns, etc.).

The current way to protect a struct against users relying on its layout today is to add a private field to the struct. If the struct has a private field, users cannot rely on you not adding more private fields within the struct, changing field offsets, its size, alignment, etc.

@the8472
Copy link
Member

the8472 commented Dec 5, 2019

If all fields of the struct are public, then users can rely on the layout of the struct,

repr(rust) structs have unspecified layout, thus people in fact could not rely on field order even if fields were public. That would change with this proposal.

@gnzlbg
Copy link
Contributor

gnzlbg commented Dec 5, 2019

repr(rust) structs have unspecified layout,

Not completely, some parts of the layout of repr(Rust) structs are not specified, but

  • if you have a struct with a public NonZeroU8 field, you can rely on the struct having a niche,
  • if you have a struct with a public field, you can rely on the struct alignment requirement being at least as high as that of the public field, etc.
  • if you have a struct with all public fields, you can rely on the struct size being at least as high as the sum of the sizes of its fields, etc.

So there are things about the struct layout that you can already rely on, and things that you can't (EDIT: in the UCGs reference, the degrees of freedom that the compiler has are specified).

Today users cannot rely on the field order of homogeneous aggregates, with the exception of arrays. The whole point of guaranteeing the field order for all homogeneous aggregates, is to allow users to rely on it. So yes, if we were to guarantee that, then it becomes a guaranteed thing, and users could rely on it.

@the8472
Copy link
Member

the8472 commented Dec 5, 2019

So there are things about the struct layout that you can already rely on

Yes, but those are promises made by the compiler, not by struct's crate owner. Adding struct ordering as guarantee means that now the owning crate of the struct has to uphold an additional guarantee (field order) that it did not have to do before.

Adding niche optimization didn't impose additional constraints on maintainers. Ordering does. It is perhaps minor, but I think it is something that everyone needs to be informed of because reordering struct members could now break other people's code where that was impossible before.

@gnzlbg
Copy link
Contributor

gnzlbg commented Dec 6, 2019

Yes, but those are promises made by the compiler, not by struct's crate owner.

If the "crate owner" makes all the fields of a struct public, it is promising that it won't add any private or public fields to that struct, and doing either is a breaking change. This is not a promise the compiler makes, but a promise the crate owner makes.

Adding niche optimization didn't impose additional constraints on maintainers.

WIthout niche optimizations, whether NonNull is implemented internally using NonZero, or just using *mut T, is not observable by users. WIth niche optimizations, even though the fields involved are private, these changes are observable, and maintainers should be aware of them, even though users can't rely on them unless maintainers say so (due to them being restricted to private fields - if the fields were public, changing niches requires changing types, and that's already a breaking change).

Adding struct ordering as guarantee means that now the owning crate of the struct has to uphold an additional guarantee (field order) that it did not have to do before.

Without this guarantee, changing the field order of an homogeneous struct whose fields are all public is not a breaking change. With this guarantee, it is. If a homogeneous struct does not want to guarantee this, it can add a private 1-ZST field to the struct. Maintainers would need to be made aware of this, if we were to make this guarantee.

@the8472
Copy link
Member

the8472 commented Dec 6, 2019

it can add a private 1-ZST field to the struct.

That works for new structs, But for existing structs maintainers would have no choice but to accept the new requirement imposed on them.

Maintainers would need to be made aware of this, if we were to make this guarantee.

Indeed, that's what I am after.

@gnzlbg
Copy link
Contributor

gnzlbg commented Dec 6, 2019

But for existing structs maintainers would have no choice but to accept the new requirement imposed on them.

Yes, maintainers of homogeneous aggregates whose fields are all public won't be able to reorder these fields in the struct definition after such a change lands without doing a breaking semver bump.

Updating rust-semverver to automatically warn them of this when it happens shouldn't be hard.

@RalfJung
Copy link
Member

RalfJung commented Dec 9, 2019

I don't think that any guarantee the compiler makes automatically becomes part of semver for crate authors. I think that should be a separate discussion. But TBH I care very little about the outcome of that discussion (as long as there is some clear consensus), so I will stay out of this. ;)

But, IMO, such discussion is off-topic for UCG issues that are about compiler guarantees.

@kornelski
Copy link

Can we get some progress on this? I'd love to have homogeneous tuples defined.

@Lokathor
Copy link
Contributor

Lokathor commented Jul 5, 2020

To be clear: nothing defined anywhere in the UCG is actually officially defined and you can't rely on the info you find in the UCG docs until it's actually accepted via RFC.

@nikomatsakis
Copy link
Contributor Author

I don't think that any guarantee the compiler makes automatically becomes part of semver for crate authors.

I agree with @RalfJung on this, but it's true that we should say one way or the other what we consider to be a breaking change. In other words, it seems quite reasonable to me for the rule to be that:

  • The compiler will not reorder public fields
  • But a crate author may still choose to do so in a minor release

and, therefore, one should not assume that one can transmute or whatever else unless there is an affirmative comment promising not to reorder fields.

This seems fairly consistent to me, in that we generally make our "semver rules", which are more focused on "API" than on "ABI". But as @RalfJung said, I could be persuaded either way I imagine. I think though that we should definitely state explicitly the result, and we should try to be consistent overall.

@Lokathor

This comment has been minimized.

@the8472

This comment has been minimized.

@RalfJung
Copy link
Member

If you want to continue discussing semver-related library API guarantees, could you please open a new issue for that discussion? This thread here is about what the compiler guarantees.

@the8472
Copy link
Member

the8472 commented Jul 28, 2020

Ok, see #242

@JakobDegen JakobDegen added S-not-opsem Despite being in this repo, this is not primarily a T-opsem question T-lang and removed C-open-question Category: An open question that we should revisit labels Jun 6, 2023
tyilo added a commit to tyilo/std-traits that referenced this issue Oct 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-layout Topic: Related to data structure layout (`#[repr]`) S-not-opsem Despite being in this repo, this is not primarily a T-opsem question T-lang
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants