Layout of homogeneous structs #36
Would there ever be a reason to automatically insert tail padding for homogeneous structs? |
The most plausible reason for not guaranteeing this would be to raise the alignment of the aggregate to allow copying the type with fewer or more efficient instructions (e.g., align a struct of four |
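To make the alignment idea concrete, here is a minimal sketch using today's explicit opt-in. The type `Aligned4` and the helper `layout_of` are hypothetical names for illustration; `#[repr(align(4))]` stands in for what a compiler might choose on its own for a default-repr homogeneous struct.

```rust
use std::mem::{align_of, size_of};

// Hypothetical wrapper: four bytes with the alignment explicitly raised to 4,
// so the whole value can be moved with a single aligned 32-bit access.
#[repr(align(4))]
#[derive(Clone, Copy)]
struct Aligned4([u8; 4]);

/// Returns (size, align) of a type in bytes.
fn layout_of<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}

fn main() {
    // Arrays have a guaranteed layout: N * size_of::<T>() bytes, aligned like T.
    println!("[u8; 4]  -> {:?}", layout_of::<[u8; 4]>());
    // Same size, but a stricter alignment requirement.
    println!("Aligned4 -> {:?}", layout_of::<Aligned4>());
}
```

The two types have the same size but different alignment, which is exactly the kind of difference a blanket "same layout as an array" guarantee would rule out.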
I can also imagine two hypothetical but very dubious scenarios to add to the pile:
|
I think we should be able to do this for arrays too, and also for homogeneous parts of aggregates like |
That would break ABI compatibility with C arrays. Also, one reason why one might not want to raise the alignment of any aggregate is that it tends to cause extra padding, either internally (in your example |
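A rough sketch of the padding cost described here, assuming hypothetical types `Rgb`, `RgbAligned`, `Tagged` and `TaggedAligned`. `repr(C)` is used so that the numbers are guaranteed rather than left to the compiler.

```rust
use std::mem::{align_of, size_of};

// Hypothetical three-byte colour type with C layout: size 3, align 1.
#[repr(C)]
#[derive(Clone, Copy)]
struct Rgb { r: u8, g: u8, b: u8 }

// Same fields with the alignment raised to 4: one byte of tail padding appears.
#[repr(C, align(4))]
#[derive(Clone, Copy)]
struct RgbAligned { r: u8, g: u8, b: u8 }

// Embedding each type after a single byte shows the internal padding as well.
#[repr(C)]
struct Tagged { tag: u8, colour: Rgb }
#[repr(C)]
struct TaggedAligned { tag: u8, colour: RgbAligned }

fn main() {
    println!("Rgb:           size {}, align {}", size_of::<Rgb>(), align_of::<Rgb>());
    println!("RgbAligned:    size {}, align {}", size_of::<RgbAligned>(), align_of::<RgbAligned>());
    println!("Tagged:        size {}", size_of::<Tagged>());
    println!("TaggedAligned: size {}", size_of::<TaggedAligned>());
    println!("1000 elements: {} vs {} bytes",
             size_of::<[Rgb; 1000]>(), size_of::<[RgbAligned; 1000]>());
}
```

Raising the alignment costs one byte per element in an array and three more bytes of internal padding in the enclosing struct.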
Yes,
Yes, that's a very good reason not to do that. FWIW I wasn't suggesting to do this unconditionally, but rather that we don't have to forbid an implementation from doing this if it deems it worthwhile right now. |
You can't put a repr on a builtin type, so there's no such thing as a "repr(c) array". Also we've long guaranteed their layout, and that ship has sailed. |
An alternative would be to asymmetrically require that tuples are layout compatible with arrays, but to not require the layout of tuples and arrays to be identical. That is, arrays would not be layout compatible with tuples. This would allow the compiler to increase the alignment of homogeneous structs and tuples, and also allow AFAICT it would be backwards compatible to, in the future, make the stronger guarantee that the layout of homogeneous tuples and arrays is identical, which would allow |
These ordering issues make me think that for homogeneous non-tuple-structs, while we could guarantee that the layout matches that of an array, we should not guarantee how the fields are mapping to array indices. Getting that from the definition order seems strange to me. I do not have a strong opinion on this, though. |
This was, I believe, what I meant when I wrote "If the struct is defined with named fields, the mapping from fields to their indices is undefined (so This also seems to address @gankro's first concern:
With respect to the PGO or SIMD scenarios, if we adopted this proposal, the idea would probably be that one ought to explicitly opt in to "non-standard" layouts of this kind (e.g., |
This has been my position for the longest time, but recently I'm less sure since the scenario outlined in #36 (comment) doesn't really need SIMD or any hypothetical whole-program/profile-guided optimizations to be profitable. It would be nice to be able to raise e.g. the alignment of This is relatively niche compared to reordering to remove internal padding, and considering the nice guarantees it would conflict with, I am not arguing we should reserve that freedom, but I do regret that more than I regret PGO-driven or SIMD-targeted layout changes. |
I would expect
What if all fields have different types, What if all fields have similar types, e.g. I think there's a lot of advantages to treating homogeneous and heterogeneous structs identically. For one, there's no need to define "homogeneous." |
@briansmith Fair points, but if we decide to treat homogeneous aggregates the same as other aggregates then I'd suggest teasing out the salient properties (e.g. size+align) and giving guarantees about the layout of all repr(Rust) aggregates that in particular imply that things like |
Another thing to consider: Let's say you have a heterogeneous type with fields of types A, B, and A. Then you change the type of the second field to |
@briansmith for the heterogeneous type the compiler is free to reorder and add padding. For the homogeneous type it would at most be allowed to raise alignment up to the size itself. |
While full layout compatibility might be useful (size, alignment, call ABI, niches, and whatnot), I think maybe restrict this to either being able to This would suffice for many use cases (transmute, repr(C) union field access), while leaving out many degrees of freedom open such as type alignment, trailing padding, etc. |
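As an illustration of the `repr(C)` union use case, here is a minimal sketch with hypothetical types `Xyz` and `Vec3` and a hypothetical helper `get`. The struct is marked `repr(C)` so the example is well-defined today; the guarantee under discussion would be what lets the same pattern work for a default-repr homogeneous struct.

```rust
// Hypothetical vector type; repr(C) guarantees field order and offsets today.
#[repr(C)]
#[derive(Clone, Copy)]
struct Xyz { x: f32, y: f32, z: f32 }

// Name-based and index-based views over the same twelve bytes.
#[repr(C)]
union Vec3 {
    named: Xyz,
    indexed: [f32; 3],
}

/// Reads component `i` through the array view.
fn get(v: &Vec3, i: usize) -> f32 {
    // SAFETY: both union fields are three consecutive f32 values at offset 0,
    // and every bit pattern is a valid f32.
    unsafe { v.indexed[i] }
}

fn main() {
    let v = Vec3 { named: Xyz { x: 1.0, y: 2.0, z: 3.0 } };
    println!("{} {} {}", get(&v, 0), get(&v, 1), get(&v, 2));
}
```

Only the field offsets matter for this pattern; the alignment and trailing padding of `Xyz` could differ from the array's without breaking it.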
PR #220 shows the diff of what that last proposal would look like. I would personally prefer to just guarantee that:
I'm not sure if this would imply that the first field of a struct/tuple/tuple_struct maps to the array index 0, the second field to index 1, etc. but we probably would make things clearer by spelling that out as well. This would allow raising the alignment of aggregates up to their size, but it wouldn't allow PGO optimizations like re-ordering fields. |
Summary of all drawbacks mentioned against providing layout guarantees. Guaranteeing the layout of homogeneous aggregates:
I think those are all drawbacks mentioned. The value and drawbacks of the following guarantees:
From the drawbacks that apply,
At least from the drawbacks and value mentioned, to me this boils down to whether sacrificing being able to re-order fields for better cache utilization is worth allowing users to type-pun between homogeneous aggregates and switching between name-based and index-based field access. |
To be clear, with this proposal you mean that the fields of a (non-tuple) struct are guaranteed to be laid out in source order, and for tuples and tuple structs in index order? Drawbacks 2 and 3 only apply with such a guarantee, as far as I can tell. |
For structs and tuple structs it would make the order of the named members part of the API, i.e. reordering would become a breaking change where it wasn't before. |
No that is not correct. API stability does not imply stability for clients that choose to |
Does that also apply when the fields are public? If someone reads the UCGs and then sees a homogeneous struct then they could conclude "aha, this is the same as an array, I can transmute this safely!" |
For homogeneous structs yes, that's what I think that implies. That is, this would be guaranteed to work and never panic:

```rust
use std::mem::transmute;

struct Foo { x: i32, y: i32 }

let foo = Foo { x: 42, y: 43 };
// `transmute` always needs `unsafe`; the proposed guarantee is what would make this sound.
let foo: [i32; 2] = unsafe { transmute(foo) };
assert_eq!(foo[0], 42);
assert_eq!(foo[1], 43);
```

If all fields of the struct are public, then users can rely on the layout of the struct, and changing it is a breaking change. In particular, adding a private field to a struct whose fields are all public is a breaking change (and breaks, e.g., constructors, patterns, etc.). The way to protect a struct against users relying on its layout today is to add a private field to the struct. If the struct has a private field, users cannot rely on you not adding more private fields within the struct, changing field offsets, its size, alignment, etc. |
|
Not completely, some parts of the layout of
So there are things about the struct layout that you can already rely on, and things that you can't (EDIT: in the UCGs reference, the degrees of freedom that the compiler has are specified). Today users cannot rely on the field order of homogeneous aggregates, with the exception of arrays. The whole point of guaranteeing the field order for all homogeneous aggregates, is to allow users to rely on it. So yes, if we were to guarantee that, then it becomes a guaranteed thing, and users could rely on it. |
Yes, but those are promises made by the compiler, not by the struct's crate owner. Adding struct ordering as a guarantee means that the crate that owns the struct now has to uphold an additional guarantee (field order) that it did not have to uphold before. Adding niche optimization didn't impose additional constraints on maintainers. Ordering does. It is perhaps minor, but I think it is something that everyone needs to be informed of because reordering struct members could now break other people's code where that was impossible before. |
If the "crate owner" makes all the fields of a struct public, it is promising that it won't add any private or public fields to that struct, and doing either is a breaking change. This is not a promise the compiler makes, but a promise the crate owner makes.
Without niche optimizations, whether
Without this guarantee, changing the field order of a homogeneous struct whose fields are all public is not a breaking change. With this guarantee, it is. If a homogeneous struct does not want to guarantee this, it can add a private 1-ZST field to the struct. Maintainers would need to be made aware of this, if we were to make this guarantee. |
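A minimal sketch of the private-field opt-out mentioned here, assuming a hypothetical module `shapes` with a hypothetical struct `Size`. The `_private: ()` marker is a zero-sized field with alignment 1.

```rust
mod shapes {
    // Hypothetical homogeneous struct: every data field is public, but the
    // private zero-sized marker stops code outside this module from building
    // the struct with a literal or destructuring it exhaustively.
    pub struct Size {
        pub width: u32,
        pub height: u32,
        _private: (),
    }

    impl Size {
        pub fn new(width: u32, height: u32) -> Self {
            Size { width, height, _private: () }
        }
    }
}

fn main() {
    let s = shapes::Size::new(3, 4);
    // Field access still works from outside the module;
    // `shapes::Size { width: 3, height: 4 }` would not compile here.
    println!("{} x {}", s.width, s.height);
    // Printed rather than asserted: default-repr size is not a documented guarantee.
    println!("size = {}", std::mem::size_of::<shapes::Size>());
}
```

Because the struct has a private field, clients have no basis for assuming anything about its layout, so the maintainer keeps the freedom to reorder the public fields.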
That works for new structs, but for existing structs maintainers would have no choice but to accept the new requirement imposed on them.
Indeed, that's what I am after. |
Yes, maintainers of homogeneous aggregates whose fields are all public won't be able to reorder these fields in the struct definition after such a change lands without doing a breaking semver bump. Updating rust-semverver to automatically warn them of this when it happens shouldn't be hard. |
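To illustrate why reordering would become observable, here is a small sketch with two hypothetical versions of the same struct, `PointV1` and `PointV2`. Both use `repr(C)` so that layout follows declaration order today; under the proposed guarantee the same effect would apply to default-repr homogeneous structs.

```rust
use std::mem::transmute;

// Two hypothetical releases of one public struct; V2 swaps the declaration order.
#[repr(C)]
struct PointV1 { x: i32, y: i32 }
#[repr(C)]
struct PointV2 { y: i32, x: i32 }

fn v1_as_array(p: PointV1) -> [i32; 2] {
    // SAFETY: a repr(C) struct of two i32 fields has the same size and
    // field offsets as [i32; 2].
    unsafe { transmute(p) }
}

fn v2_as_array(p: PointV2) -> [i32; 2] {
    // SAFETY: same reasoning as above.
    unsafe { transmute(p) }
}

fn main() {
    let a = v1_as_array(PointV1 { x: 1, y: 2 });
    let b = v2_as_array(PointV2 { x: 1, y: 2 });
    // Name-based construction is identical, yet the array views differ.
    println!("{:?} vs {:?}", a, b);
}
```

Client code that only uses field names is unaffected by the swap, while client code that transmutes silently sees its indices change meaning.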
I don't think that any guarantee the compiler makes automatically becomes part of semver for crate authors. I think that should be a separate discussion. But TBH I care very little about the outcome of that discussion (as long as there is some clear consensus), so I will stay out of this. ;) But, IMO, such discussion is off-topic for UCG issues that are about compiler guarantees. |
Can we get some progress on this? I'd love to have homogeneous tuples defined. |
To be clear: nothing defined anywhere in the UCG is actually officially defined and you can't rely on the info you find in the UCG docs until it's actually accepted via RFC. |
I agree with @RalfJung on this, but it's true that we should say one way or the other what we consider to be a breaking change. In other words, it seems quite reasonable to me for the rule to be that:
and, therefore, one should not assume that one can transmute or whatever else unless there is an affirmative comment promising not to reorder fields. This seems fairly consistent to me, in that we generally make our "semver rules", which are more focused on "API" than on "ABI". But as @RalfJung said, I could be persuaded either way I imagine. I think though that we should definitely state explicitly the result, and we should try to be consistent overall. |
If you want to continue discussing semver-related library API guarantees, could you please open a new issue for that discussion? This thread here is about what the compiler guarantees. |
Ok, see #242 |
From #31: If you have homogeneous structs, where all the `N` fields are of a single type `T`, can we guarantee a mapping to the memory layout of `[T; N]`? How do we map between the field names and the indices? What about zero-sized types?

A specific proposal:

- If all fields have the same type `T` (ignoring zero-sized types), then the layout will be the same as `[T; N]` where `N` is the number of fields
- If the struct is defined with named fields, the mapping from fields to their indices is undefined (so `foo.bar` maps to an undefined index)
- If the struct is defined as a tuple struct (or a tuple), then the indices are derived from the definition (so `foo.0` maps to `[0]`)

This is basically because it's convenient but also because it's about the only sensible thing to do (unless you imagine we might want to start inserting padding between random fields or something, which connects to #35).
Note that for tuple structs (and tuples), the ordering of (public) fields is also significant for semver and other purposes -- changing the ordering will affect your clients. The same is not true for named fields and so this proposal attempts to avoid making it true.
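For anyone who wants to see what a given compiler currently does, here is a rough sketch that measures field offsets instead of assuming them. The types `Named` and `Tuple` and the helpers `offset`, `is_array_like`, `named_offsets` and `tuple_offsets` are hypothetical names for illustration. The output describes one compiler on one target, not a guarantee.

```rust
use std::mem::size_of;

// Hypothetical default-repr homogeneous types.
struct Named { a: u16, b: u16, c: u16 }
struct Tuple(u16, u16, u16);

/// Byte offset of `field` within `base`.
fn offset<B, F>(base: &B, field: &F) -> usize {
    field as *const F as usize - base as *const B as usize
}

/// True if the offsets are 0, s, 2s in some order, i.e. the fields are packed
/// like array elements, possibly permuted.
fn is_array_like(mut offsets: [usize; 3], elem_size: usize) -> bool {
    offsets.sort();
    offsets.iter().enumerate().all(|(i, &o)| o == i * elem_size)
}

fn named_offsets() -> [usize; 3] {
    let n = Named { a: 0, b: 0, c: 0 };
    [offset(&n, &n.a), offset(&n, &n.b), offset(&n, &n.c)]
}

fn tuple_offsets() -> [usize; 3] {
    let t = Tuple(0, 0, 0);
    [offset(&t, &t.0), offset(&t, &t.1), offset(&t, &t.2)]
}

fn main() {
    let elem = size_of::<u16>();
    println!("Named: {:?}, array-like: {}", named_offsets(), is_array_like(named_offsets(), elem));
    println!("Tuple: {:?}, array-like: {}", tuple_offsets(), is_array_like(tuple_offsets(), elem));
    println!("same size as [u16; 3]: {}", size_of::<Tuple>() == size_of::<[u16; 3]>());
}
```

`is_array_like` accepts any permutation of the offsets, matching the named-field case where the mapping from fields to indices is left undefined. For the tuple case one would additionally check that the offsets are already in ascending order.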