Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add VarTuple and VarTupleULE #5511

Merged
merged 11 commits into from
Sep 9, 2024
Merged

Add VarTuple and VarTupleULE #5511

merged 11 commits into from
Sep 9, 2024

Conversation

sffc
Copy link
Member

@sffc sffc commented Sep 6, 2024

This abstraction makes things much easier. It is easier to debug than make_varule, and it composes nicely. I really want to use this for the plurals and patterns things.

I had to use transmute_copy because the compiler doesn't know that the generic is a fat pointer. But, we know it is, because it is a VarULE type. I don't think there is anything unsound about this PR.

@sffc sffc requested a review from robertbastian September 6, 2024 02:28
@sffc sffc requested a review from Manishearth as a code owner September 6, 2024 02:28
Copy link
Member

@robertbastian robertbastian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about the unsafe code, @Manishearth pls

/// See the module for examples.
#[derive(Debug)]
#[allow(clippy::exhaustive_structs)] // well-defined type
pub struct VarTupleULE<A: AsULE, V: ?Sized> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs to be repr(packed)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sure. repr(C, packed), yes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It lets me apply repr(C), but repr(packed) and repr(C, packed) say:

error[E0277]: the size for values of type `V` cannot be known at compilation time
  --> utils/zerovec/src/ule/vartuple.rs:74:19
   |
72 | pub struct VarTupleULE<A: AsULE, V: ?Sized> {
   |                                  - this type parameter needs to be `Sized`
73 |     pub sized: A::ULE,
74 |     pub variable: V,
   |                   ^ doesn't have a size known at compile-time
   |
   = note: the last field of a packed struct may only have a dynamically sized type if it does not need drop to be run
   = help: change the field's type to have a statically known size

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think repr(C) is fine.

https://doc.rust-lang.org/reference/type-layout.html#the-c-representation

Our safety requirement is that we are align(1) with no padding.

The Rust Reference says:

The alignment of the struct is the alignment of the most-aligned field in it.

Since the two generics are bound to ULE and VarULE which are defined to be align 1, then the repr(C) containing only those fields is defined to be align 1. (Note: I put those bounds on the impl, but I can move them to the struct to satisfy this)

The size and offset of fields is determined by the following algorithm.

Start with a current offset of 0 bytes.

For each field in declaration order in the struct, first determine the size and alignment of the field. If the current offset is not a multiple of the field’s alignment, then add padding bytes to the current offset until it is a multiple of the field’s alignment. The offset for the field is what the current offset is now. Then increase the current offset by the size of the field.

Finally, the size of the struct is the current offset rounded up to the nearest multiple of the struct’s alignment.

Since all field lengths are a multiple of 1, then repr(C) will never add padding bytes.

@sffc
Copy link
Member Author

sffc commented Sep 6, 2024

Another way for me to do this without transmute_copy would be for the variable-length field to be [u8] and then there's an associated function that returns the &V. I think this is slightly less efficient for nested VarTuples since the metadata has to keep changing: with the current design, the metadata is the metadata of the actual variable-length field, even if deeply nested.

@sffc sffc requested a review from Manishearth September 6, 2024 18:34
Copy link
Member

@Manishearth Manishearth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks about right, but we should be more detailed with the safety comments

// For now we rely on all DST metadata being a usize to extract it via a fake slice pointer
// Rust doesn't know that `&V` is a fat pointer so we have to use transmute_copy
debug_assert_eq!(size_of::<*const V>(), size_of::<*const [u8]>());
let variable_fat_ptr: *const [u8] = core::mem::transmute_copy(&variable_ptr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this should have a safety comment dedicated to it that starts with Safety:, that describes why the length and the transmute are valid

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also below

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also when calling transmute_copy we should redundantly specify the types in a turbofish

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

let composed_fat_ptr: *const [u8] = core::slice::from_raw_parts(bytes.as_ptr(), metadata);
debug_assert_eq!(size_of::<*const Self>(), size_of::<*const [u8]>());
let composed_ref: *const Self = core::mem::transmute_copy(&composed_fat_ptr);
&*(composed_ref)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we should also have comments explaining what's going on with all the transmutes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@sffc sffc requested a review from Manishearth September 7, 2024 00:15
@sffc
Copy link
Member Author

sffc commented Sep 8, 2024

@Manishearth Ready to merge?

@Manishearth Manishearth merged commit 7751731 into unicode-org:main Sep 9, 2024
28 checks passed
@sffc sffc deleted the vartuple branch September 9, 2024 04:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants