Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extern types v2 #3396

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open

Extern types v2 #3396

wants to merge 7 commits into from

Conversation

Skepfyr
Copy link

@Skepfyr Skepfyr commented Feb 26, 2023

Define types external to Rust and introduce syntax to declare them.
This additionally introduces the MetaSized trait to allow these types to be interacted with in a generic context.
This supersedes #1861, and is related to the open RFCs #2984 and #3319.

Rendered

Comment on lines +49 to +51
extern {
type Foo;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the extern type Foo shorthand still allowed under this proposal or do all extern types have to be inside an extern block?

Similarly, how do extern blocks interact with ABI tags? For example, if I have an extern "C" block, is a type declaration inside it invalid, or just a warning? I would say that adding an ABI to these should be a deny-by-default lint that states that the tag is unused, with the potential for them having some meaning in the future.

Copy link
Member

@thomcc thomcc Feb 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that extern { type Blah; } will currently be reformatted by rustfmt to extern "C" { type Blah; }. So this question should probably have t-style weigh in.

I believe we've guaranteed the default is "C" anyway, so it's not clear to me that this makes sense.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This question gets at one of the niggles I have with this proposal. The ABI doesn't really mean anything as we're using extern to mean something slightly different to what it normally means. I think this is a point in favour of repr(unsized) but extern makes so much sense in that it heavily suggests what this feature is useful for.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When it comes to bikeshedding, what about something like the following?

#[repr(opaque)]
struct Foo;

I personally don't like the use of type and I think opaque captures the meaning better than either extern or unsized. I also think it just makes more sense as a repr, and it would go nicely with specifying alignment (eg. repr(opaque, align(16))) if we allow that, which I think we should (though that could be added later if necessary).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's your reasoning for preferring opaque over unsized? I'm thinking of the common header use case where you may want:

#[repr(C, unsized)]
struct Header {
    field1: u8,
    field2: usize,
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are described as opaque types throughout the RFC, which I think describes them well. The thing that makes them unique isn't that you don't know their size, but that you don't know (nor care about) their composition. This is in contrast to types like [T], str, and dyn Trait, which are unsized, but not opaque.

Regarding the example you gave, if I didn't already know what it meant, I would find it very confusing. I wouldn't understand what unsized meant here, especially considering that the struct would appear to be composed entirely of sized types. It would be a lot less confusing, though, if the struct had no fields.

To create a type with a non-opaque header followed by opaque data, I would imagine creating two separate types: one for the opaque data itself, and a separate type for the opaque data with a header (though that would seem to require repr(align) on the opaque type). Using the word opaque in this context (where the opaque type has no fields) would clearly indicate not only that we don't know the size, but that we don't know (nor care about) the composition of the type. Here's how I would imagine doing that:

#[repr(opaque, align(4)]
struct Opaque;

#[repr(C)]
struct OpaqueWithHeader {
    field1: u8,
    field2: usize,
    opaque: Opaque,
}

That said, I'm not opposed to either extern type T nor the repr(unsized) syntax. I just wanted to express my view that repr(opaque) might be a better fit. However, regardless of the syntax used, I would discourage the idea of allowing fields within the opaque type itself, as that would seem quite confusing to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One issue I see with the repr attribute is that it looks too much like it's a unit struct declaration, if we're using a struct declaration I wonder if we could have native syntax for saying there are unknown fields (maybe with the repr too)

#[repr(opaque)]
struct Opaque {
  ..
}

Though, this has been proposed as syntax for #[non_exhaustive] previously. I think it fits opaqueness better since the fields are unknown even locally, whereas #[non_exhaustive] only affects metadata and the visible struct declaration is what it is known to be locally.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's one place where extern "C" { type X; } and extern "Rust" { type X; } could differ: the former obviously should be FFI-safe, but the latter doesn't necessarily have to be. If and only if it's not, then it's presumably semver-compatible to replace it with a standard struct in the future.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I see "opaque (type)", I think of RPIT (return-position impl trait) types, and soon TAIT, RPITIT, etc.

This would provide a mechanism for allowing users to implement types with entirely custom metadata and therefore custom implementations of `size_of_val` and `align_of_val`.
This would likely mean that users could implement types that acted like extern types without using the `extern type` syntax.
This should not be an issue as `extern type` communicates the intent of these types well, and guarantees FFI-safety.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One other valuable extension would be to make MaybeUninit accept ?Sized + MetaSized types.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MaybeUninit currently takes a sized T, why would this enable relaxing that bound? T: ?Sized currently means the same as I'm proposing T: ?Sized + MetaSized will mean, so I don't see how this helps.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I misread the proposal and assumed that ?Sized would opt out of MetaSized too.

Without a notion of MetaSized, MaybeUninit has to be sized, since there's no good way to determine the layout of an unsized value. Adding MetaSized means we can now do this with just metadata, making it valid again.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to be clear the proposal is that ?Sized will opt out of MetaSized after the edition change. So far I've assumed that, because all types in current Rust are MetaSized, that introducing this won't help with issues like this. It's possible that's not true and this is an exception.

How would you create a MaybeUninit<T: ?Sized + MetaSized>, where would you get the metadata from? Can it be uninit? Aren't those the same issues you'd face if you opened an RFC to make MaybeUninit take ?Sized today?

Comment on lines +89 to +90
* statically - the size/alignment is known by the Rust compiler at compile time. This is the current `Sized` trait.
Most types in Rust are statically sized and aligned, like `u32`, `String`, `&[u8]`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a way, this is the same as knowing from metadata -- the metadata is just ().

(I think that this way of wording is still clearer than omitting it, but figured it's worth pointing out somehow.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I possibly should clarify that all statically sized things are metadata sized and all metadata sized things are dynamically sized (and similarly for alignment). Note there is still a meaningful distinction between these things: you can put as many statically sized&aligned things in a struct but only one metadata aligned thing in.

@ehuss ehuss added the T-lang Relevant to the language team, which will review and decide on the RFC. label Feb 26, 2023
@Aloso
Copy link

Aloso commented Mar 4, 2023

Should MetaSized be a supertrait of Sized?

Yes, I think so, unless there are plans to add types that are Sized but not MetaSized, which makes no sense IMO.

Personally, I think it would be better to leave the meaning of ?Sized unchanged. MetaSized could be a trait that is automatically implied for trait bounds, so these are semantically identical:

fn foo<T>();
fn foo<T: Sized + MetaSized>();

as are those:

fn foo<T: ?Sized>();
fn foo<T: ?Sized + MetaSized>();

In other words, MetaSized would be opt-out rather than opt-in, like Sized is today. And if MetaSized is a super-trait of Sized, then you can opt-out of both traits with ?MetaSized, which is convenient.

Note that this is not a breaking change and requires no edition migration.

@scottmcm
Copy link
Member

scottmcm commented Mar 8, 2023

then you can opt-out of both traits with ?MetaSized

FWIW, lang has historically said "no more ?Trait bounds" like this. That's why there's no ?Move, no ?Unpin, no ?AlignSized , etc.

So I like the edition switch approach here: the conceptual meaning of ?Sized is still essentially the same, and code that just wants to deal in references can continue to use T: ?Sized, with only code that needs to look at the dynamic size of something needing to add the extra bound. (You can follow old docs that say ?Sized, and if you do something that needs the runtime size it'll be an obvious error message saying to add the + MetaSized.)

Comment on lines +49 to +51
extern {
type Foo;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I see "opaque (type)", I think of RPIT (return-position impl trait) types, and soon TAIT, RPITIT, etc.

In the 2024 edition and later, `T: ?Sized` no longer implies any knowledge of the size and alignment so opaque types can be used in generic contexts.
If you require your generic type to have a computable size and alignment you can use the bound `T: ?Sized + MetaSized`, which will enable you to store the type in a struct.

The automated tooling for migrating from the 2021 edition to the 2024 edition will replace `?Sized` bounds with `?Sized + MetaSized` bounds.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tooling should only do the replacement if Metasized is required, or a lot of declarations will get a lot noisier.

I feel it would be useful to have an idea on how much of an impact this will have on the ecosystem -- how common will it be to need a Metasized bound.

Doing the review of the standard library mentioned later would be a start.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tooling should only do the replacement if Metasized is required, or a lot of declarations will get a lot noisier.

How do you propose deciding which case is which? Note that (usually) relaxing the bound is non-breaking but adding it in later is breaking so a human decision is often required for what promises the API wants to make.

This requires `MetaSized` because the size must be known at run time.

```rust
pub struct Box<T: ?Sized + MetaSized, A: Allocator = Global>(Unique<T>, A);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without implied bounds, this will require the bound be mentioned everywhere Box (or Arc etc) are mentioned and can take a non-Sized parameter, including in traits.

pub trait Trait {
    fn foo(self: Box<Self>) where Self: MetaSized;
    fn bar(self: Arc<Self>) where Self: MetaSized;
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting but probably bad idea: MetaSized as a default bound for all trait, unless they opt out.

It's worth noting that since receivers are special, we could add special rules to make a Box<Self> receiver imply Self: MetaSized in traits. On the other hand, we'd probably prefer to make receivers less special.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implied bounds could solve this if they are ever implemented…

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really sad. It feels like a blocker that can't be easily skipped. How reasonable would it be to make receivers special until implied bounds is implemented? Or is the only option here to go and implement implied bounds?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just had a thought, doesn't implied bounds for self types arguably already exist given that you can write this:

trait Trait {
    fn foo(self);
}

You don't have to specify where Self: Sized on that. Granted, you can't implement it for unsized types but that feels like the correct behaviour.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, technically:

trait Trait {
    fn foo(self);
}

doesn't actually imply where Self: Sized, just that you need unstable features to implement it without where Self: Sized. for a good example, see FnOnce which can be used as Box<dyn FnOnce(...)> to call fn call_once(self, ...) with Self = dyn FnOnce(...)

@clarfonthey
Copy link
Contributor

Since the general consensus on the syntax is... very mixed, it might be worth focusing on MetaSized and it's semantics first, and maybe delegating the exact way to define extern types to another RFC?

Having a stable MetaSized would allow libraries to determine a strategy of handling extern types while we figure out the syntax, and the semantics seem like the largest barrier to getting that right now.

I just don't want to see more progress on refining Sized semantics for DSTs get canned because of syntax bikeshedding.

@Skepfyr
Copy link
Author

Skepfyr commented Mar 12, 2023

@Aloso I've added a note to the alternatioves section about ?MetaSized, but as Scott says the lang team have been opposed to these so this RFC is intentionally written to avoid adding it.

@Skepfyr
Copy link
Author

Skepfyr commented Mar 12, 2023

@clarfonthey

Since the general consensus on the syntax is... very mixed, it might be worth focusing on MetaSized and it's semantics first, and maybe delegating the exact way to define extern types to another RFC?

That's an option, although I'm tempted to leave everything as is for now and keep that option in my back pocket for if the extern types syntax is the only thing blocking this. Adding MetaSized without any types that can be !MetaSized would be hard to justify.

Comment on lines 226 to 228
- Should users be able to slap a `#[repr(align(n))]` attribute onto opaque types to give them an alignment?
This would allow us to represent `CStr` properly but would necessitate splitting `MetaSized` and `MetaAligned` as they would not be "metadata sized" in general.
(We may be able to get away with the [Aligned trait](https://github.com/rust-lang/rfcs/pull/3319))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've long been told that "extern type is needed to fix CStr and CString", so if this whole thing happens but doesn't definitely fix those types as part of it then things it would be extremely disappointing.

bors added a commit to rust-lang/libc that referenced this pull request May 26, 2023
Skip round-trip tests for structs with FAMs

For example:

```
strcut inotify_event {
  int      wd;       /* Watch descriptor */
  uint32_t mask;     /* Mask describing event */
  uint32_t cookie;   /* Unique cookie associating related
                        events (for rename(2)) */
  uint32_t len;      /* Size of name field */
  char     name[];   /* Optional null-terminated name */
};
```

the `name` field at the end of this struct is a Flexible Array Member (FAM - https://en.wikipedia.org/wiki/Flexible_array_member) and represents an inline array of _some_ length (in this case the `len` field gives the length, but FAMs in general are unsized).  These forms cause problems in two ways:

1. There's no rust-equivalent for FAM-containing structs (see rust-lang/rfcs#3396 for details) - the current approach is just to omit the `name` field in the Rust representation.
2. These structs cannot be passed by value (basically the compiler cannot know how many elements of the `name` field should be copied) and different compilers do different things when asked to do so.

This PR focusses on the second of these problems since it was causing test failures on my machine.  If you run the libc-test suite with GCC as your C compiler you get a compiler note saying the following:

```
/out/main.c: In function ‘__test_roundtrip_inotify_event’:
/out/main.c:21011:13: note: the ABI of passing struct with a flexible array member has changed in GCC 4.4
```

and the test suite passes.  OTOH if you build with clang as your compiler you get no compile time warnings/errors but the test suite fails:

```
size of struct inotify_event is 16 in C and 185207048 in Rust
error: test failed, to rerun pass `--test main`

Caused by:
  process didn't exit successfully: `/deps/main-e32ea4d2acb868af` (signal: 11, SIGSEGV: invalid memory reference)
```

(note that 185207048 is `0x08090A0B` (which is what the round-tripped value is initialized with by the test suite) so clearly the Rust and C code are disagreeing about the calling convention/register layout when passing such a struct).

Given that there doesn't seem to be a specification for passing these objects by value, this PR simply omits them from the roundtrip tests and the test suite now passes on GCC _and_ clang without warnings.
## Extern types
[extern-types]: #extern-types

Extern types are defined as above, they are thin DSTs, that is their metadata is `()`. They cannot ever exist except behind a pointer, and so attempts to dereference them fail at compile time similar to trait objects.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume &*x is allowed with such types. This means that dereferencing (using the * operator) is actually fine, it's only the place-to-value coercion which is forbidden. That matches trait objects (and slices).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly. I hadn't considered that distinction, I'll try to clarify that later.

This would allow us to represent `CStr` properly but would necessitate splitting `MetaSized` and `MetaAligned` as it is only "dynamically sized" but "statically aligned".
(We may be able to get away with the [Aligned trait](https://github.com/rust-lang/rfcs/pull/3319))
- Should the `extern type` syntax exist, or should there just be a `repr(unsized)`?
This would allow headers with opaque tails (which are very common in C code) but is a more significant departure from the original RFC, and looks more like custom DSTs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need repr(unsized) for this? Couldn't you just allow the last field of a type to be ?MetaSized? (Perhaps requiring the struct to be repr(C)?)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imagine we have a setup like the following:

#[repr(C)]
struct Foo {
    a: u32,
    b: u8,
    foo: String,
}
#[repr(C)]
struct Bar {
    a: u32,
    b: u8,
    bar: u8,
}
extern type OpaqueTail;
#[repr(C)]
struct HeaderWithTail {
    a: u32,
    b: u8,
    tail: OpaqueTail,
}
#[repr(C, unsize)]
struct HeaderUnsized {
    a: u32,
    b: u8,
}

What alignment does OpaqueTail have? &header.tail doesn't really have a well defined meaning as it's either pointing at bar directly (assuming an alignment of one) or some random padding bytes before foo starts. However, you can freely convert between a &Foo (or &Bar) and a &HeaderUnsize without worrying about that.
This alignment issue is why this RFC doesn't propose allowing "dynamically aligned" or "unknown aligned" (?MetaSized) types as fields of structs (except as the only non-ZST field of a #[repr(transparent)] struct).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I suppose you could forbid referencing the field entirely…

@scottmcm
Copy link
Member

scottmcm commented Aug 9, 2023

it might be worth focusing on MetaSized and it's semantics first, and maybe delegating the exact way to define extern types to another RFC?

One potential middle ground would be to add a stable Unknown extern type in core, and for now have the syntax be "make a transparent wrapper on that".

I don't know that I'd call that a 90% solution, but it's maybe a 60% solution.

(I'd rather not punt syntax out if possible, but if it's a matter of landing an edition change in time, I'd rather the bounds change without the syntax than nothing.)

@tgross35
Copy link
Contributor

tgross35 commented Aug 10, 2023

What is the reasoning that optional traits aren't allowed in supertraits? I feel like a builtin trait JustSizedEnough: ?Sized + MetaSized {} may be useful to reduce the clunkiness of ?Sized + MetaSized. This combined bound is expected to be used more often than just ?Sized outside of FFI, right?

Also, is this understanding in this table correct? It might be good to include some kind of table or diagram in the RFC, I've found it kind of hard to wrap my head around just the text.

bound bound-like thing implied by the bound bound-like thing note
Sized MetaSized implemented for everything by default
MetaSized nothing (?Sized) unsized types like str or [T] (?Sized + MetaSized is needed to allow these types)
!Sized !MetaSized something that is !Sized can never be MetaSized

@CAD97
Copy link

CAD97 commented Aug 10, 2023

This combined bound is expected to be used more often than just ?Sized outside of FFI, right?

An owning pointer use will generally want ?Sized + MetaSized, and a non-owning pointer use will generally want ?Sized + ?MetaSized. The size of a non-owned object is only really needed if you're going to replace it, which happens fairly rarely in generic code, and when it does, usually you are also manipulating values by-ownership.

Interestingly, with implied bounds, fn<T: ?Sized>(Box<T>) would be able to still imply T: MetaSized. I don't know if the fundamentalness of MetaSized are enough to overwhelm the known pitfalls with implied bounds, though.

What is the reasoning that optional traits aren't allowed in supertraits?

Because ?Sized isn't adding a bound, it's opting out of one. When you write fn<T>, there's an invisible implied bound of T: Sized, unless you say that T: ?Sized.

Traits don't have this default implied bound of Sized, so ?Sized doesn't mean anything as a supertrait bound. Writing fn<T: JustSizedEnough> is the same as writing fn<T: Sized + JustSizedEnough>, and writing fn<T: ?Sized + Sized> is the same as writing fn<T: Sized>.

I didn't actually review the RFC for this, but if it doesn't yet, it should say that traits have a default MetaSized bound in editionBefore and no longer do in editionAfter (matching the behavior of ?Sized). It also needs to specify how this interacts with dyn; is Self: MetaSized dyn-safe, is Self: ?MetaSized, and when does dyn Trait: MetaSized hold?

!Sized

!Sized isn't a bound. !Sized is a colloquial way of referring to types where T: Sized doesn't hold, so colloquially we would say that str is !Sized + MetaSized. T: ?Sized thus says to permit both Sized and !Sized types.

@Skepfyr
Copy link
Author

Skepfyr commented Aug 10, 2023

@tgross35 I agree with what CAD97's said, I'm confused by your table and don't understand what you're trying to convey. Sized implies MetaSized (in fact I'm now convinced that it should be a supertrait of Sized) so all things that are Sized are MetaSized, but some things are not Sized but are MetaSized (eg trait objects), and finally extern types are neither.

@CAD97

I don't know if the fundamentalness of MetaSized are enough to overwhelm the known pitfalls with implied bounds, though.

I'm beginning to worry that this RFC will require implied bounds to be acceptable, can you link to the known pitfalls with implied bounds?

I didn't actually review the RFC for this, but if it doesn't yet, it should say that traits have a default MetaSized bound in editionBefore and no longer do in editionAfter (matching the behavior of ?Sized). It also needs to specify how this interacts with dyn; is Self: MetaSized dyn-safe, is Self: ?MetaSized, and when does dyn Trait: MetaSized hold?

I hadn't considered this actually, I'll update the RFC soon. dyn Trait is always MetaSized (in fact the vtable directly holds the size and alignment so the implementation is nice and obvious) so I think there isn't much interaction between this and object-safety.

@Skepfyr
Copy link
Author

Skepfyr commented Aug 10, 2023

For anyone following along there was a lang team design meeting about this yesterday, the minutes are available here: Extern types V2 - HackMD. There's also some Zulip discussion here: #t-lang > Design meeting 2023-08-09 - rust-lang - Zulip

That meeting came out with some suggestions for investigations that would help move this RFC along, the current set of things I think need doing are:

  • Clarify dereferencing is allowed but place-to-value coercion is not.
  • Remove uses of "opaque types" to reduce confusion with impl Trait
  • Make MetaSized a supertrait of Sized
  • Clarify implicit MetaSized bound on traits that drops on edition change.
  • Investigate how many ?Sized bounds would now need + MetaSized
  • Investigate whether the MetaSized bound can be removed on Box (and similar) so that it no-one needs to type it when just mentioning boxed things (the drop impl currently needs it to deallocate).

@bjorn3
Copy link
Member

bjorn3 commented Aug 10, 2023

Sized is an auto trait, but auto traits can't have trait bounds I believe.

@Skepfyr
Copy link
Author

Skepfyr commented Aug 10, 2023

Unless I'm missing something that could entirely prevent this kind of thing from working?

fn size_of_val<T: ?Sized + MetaSized>(val: &T) -> usize { ... }
fn foo<T>(val: &T) -> usize {
    size_of_val(val)
}

@tgross35
Copy link
Contributor

tgross35 commented Aug 10, 2023

@tgross35 I agree with what CAD97's said, I'm confused by your table and don't understand what you're trying to convey. Sized implies MetaSized (in fact I'm now convinced that it should be a supertrait of Sized) so all things that are Sized are MetaSized, but some things are not Sized but are MetaSized (eg trait objects), and finally extern types are neither.

I didn't put it together very well, but what I was going for was:

  1. Anything with a Sized bound (doesn't specify ?Sized) will always be MetaSized - this is the default case
  2. Something that is MetaSized may or may not be Sized, which is why you must specify ?Sized + MetaSized
  3. Something that is strictly not Sized can never be MetaSized

Based on your comment it sounds like item 1 is implied (by MetaSized being a supertrait of Sized), item 2 is true, and I'm now realizing that item 3 is just contradictory to item 2 🙃.

What is the reasoning that optional traits aren't allowed in supertraits?
Because ?Sized isn't adding a bound, it's opting out of one. When you write fn, there's an invisible implied bound of T: Sized, unless you say that T: ?Sized.

Traits don't have this default implied bound of Sized, so ?Sized doesn't mean anything as a supertrait bound. Writing fn<T: JustSizedEnough> is the same as writing fn<T: Sized + JustSizedEnough>, and writing fn<T: ?Sized + Sized> is the same as writing fn<T: Sized>.

Thanks for the explanation. I know the discussion of trait aliases came up before, it just seems unfortunate that two traits ?Sized + MetaSized bound is now needed in many places where ?Sized works. But, I suppose that is what this step is meant to determine:

  • Investigate how many ?Sized bounds would now need + MetaSized

@Skepfyr
Copy link
Author

Skepfyr commented Sep 20, 2023

(This is mostly a brain dump so I don't forget it)

boats' post Changing the rules of Rust has made me realise that my plan of relaxing the meaning of ?Sized has backwards compatibility hazards that may prove problematic. Specifically any traits containing associated types can't obviously have the bounds on those associated types be weakened. This would be a massive problem for, e.g., the Deref trait, however I think we might be able to work around it by treating all pre-2024 code as implicitly having bounds like Deref::Target: MetaSized everywhere that they place a bound using a trait with an associated type.

Worse however is traits that contain functions that are generic over their argument because implementations of those traits could rely on T: ?Sized being MetaSized. However, with this doozy of a command:

jq '. as $root | .index[] | select(.crate_id == 0) | select(.inner | has("trait")) | (.inner.trait.items[] |= $root.index[.]) | select(.inner.trait.items[].inner.function.generics | .. | .modifier? == "maybe") | .name'

run over the standard library rustdoc json output, I believe the only stdlib trait this affects is RangeBounds because of its contains method. This is probably fine, but may present an issue still because it could force breaking changes to a bunch of ecosystem traits, for example serde's Serializer trait`, if they wanted to support these types.



# Drawbacks
[drawbacks]: #drawbacks
Copy link
Member

@RalfJung RalfJung Sep 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another drawback is that this RFC rejects a type that is used widely inside rustc:

extern "C" {
    type OpaqueListContents;
}

pub struct List<T> {
    len: usize,
    data: [T; 0],
    opaque: OpaqueListContents,
}

This must definitely be mentioned in the RFC. Ideally there is some kind of proposal for what to do with this type...

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A simple suggestion: allow structs with extern tails, but don't allow computing the offset of (or taking a reference of) the extern tail. This is sufficient for the rustc impl, which uses the data field rather than the opaque field.

A less simple suggestion is a way to specify a known alignment to use for an extern type; this could either be via repr(align = N) or unsafe impl marker::Aligned { const ALIGN: usize = N; }.


The lack of the `Sized` and `MetaSized` traits on these structs prevent you from calling `ptr::read`, `mem::size_of_val`, etc, which are not meaningful for opaque types.

In the 2021 edition and earlier, these types cannot be used in generic contexts as `T: Sized` and `T: ?Sized` both imply that `T` has a computable size and alignment.
Copy link
Member

@RalfJung RalfJung Sep 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if a new-edition trait has an associated type with ?Sized bound? If now old-edition code is generic over that trait, Trait::T would be a not-meta-sized type! IOW, old-edition code might actually have not-meta-sized types in a generic context.

Similarly, what if a new-edition trait has a generic function with T: ?Sized bound. If now old-edition code implements this with a T: ?Sized bound, this should raise an error since in the old edition, T: ?Sized actually means T: ?Sized+MetaSized and this the implementation is less generic than the trait requires. This means having generic functions with ?Sized bounds in a new-edition trait makes the trait impossible to implement in old editions.

(Both of these examples came out of boats's recent exploration into adding a Leak trait.)

@CAD97
Copy link

CAD97 commented Sep 28, 2023

I got curious and ended up writing a partial implementation of ?MetaSized1 for the current edition (i.e. ?Sized still implies MetaSized, but ignoring anything beyond the trait itself). My goal was to change as few library unbounds to ?MetaSized as are required2. I think my major takeaways got brought up thanks to boats' post, but as a short list:

  • Current edition code should be able to write ?MetaSized. But I do generally agree that next edition code should have ?Sized unbound MetaSized as well and write ?Sized + MetaSized for current edition ?Sized.
  • A decent amount of code assumes only one ?Trait unbound may be present.
  • traits need a default Self: MetaSized bound/supertrait.3 We elaborate supertrait bounds, so relaxing this implied bound on a trait should be semver-incompatible.{^supertrait-break]
  • MetaSized is an object safe supertrait, so now most object safe traits fire warn(multiple_supertrait_upcastable) since the MetaSized supertrait is considered upcastable.4
  • When you write a generic list for<T: Deref>, this needs to be elaborated to essentially for<T: Deref<Target=$0>, $0: ?Sized + MetaSized>, where the ?Sized unbound is determined by ?Sized's presence on the associated item definition, but the MetaSized (un)bound is determined by ?MetaSized's presence in this bounds context, and this elaboration needs to be recursive.5

Footnotes

  1. I build the +stage1 compiler with (incomplete) ?MetaSized support, but then error during x check --stage 1 trying to derive Copy for ParamEnv, as for some reason it's determined that the std::marker::Copy impl for CopyTaggedPtr<&'tcx list::List<Clause<'tcx>>, ParamTag, true> requires that list::OpaqueListContents: std::marker::MetaSized (an extern type). I think it's actually the requirement P = &List<_>, P: Pointer, Pointer: Deref, since I still have Deref::Target: ?Sized; the odd error timing (the bound is structural on CopyTaggedPtr) is likely due to fast-accept paths for &_: Deref.

  2. I only relaxed ?Sized to ?MetaSized for *const T, *mut T, &T, &mut T, NonNull<T>, PtrRepr<T>, PhantomData<T>, Unsize<U> (both U and Self), Pointee (Self), and the feature(ptr_metadata) fn.

  3. I sort of do this -- adding a predicate alongside the Self: Trait predicate -- but with a huge hack where Unsize and Pointee are specifically opted out instead of looking for a ?MetaSized bound.

  4. I just hacked around this problem by ignoring MetaSized in the lint. The better solution is to exclude MetaSized from dyn upcast eligibility... which actually the implied bound currently isn't permitted as a trait upcast by my impl, though it might still be put into the vtable; I didn't check.

  5. I was unable to figure out a way to do this bound elaboration yet, so my incomplete impl currently just doesn't. This is necessary because Deref::Target needs to be relaxed to ?MetaSized for basically anything to work.

@RalfJung
Copy link
Member

RalfJung commented Sep 28, 2023

When you write a generic list for<T: Deref>, this needs to be elaborated to essentially for<T: Deref<Target=$0>, $0: ?Sized + MetaSized>, where the ?Sized unbound is determined by ?Sized's presence on the associated item definition, but the MetaSized (un)bound is determined by ?MetaSized's presence in this bounds context, and this elaboration needs to be recursive.5

I cannot parse your syntax here and have no idea what you are talking about (not even parts of it), could you explain in more detail?

@CAD97
Copy link

CAD97 commented Sep 29, 2023

It's easiest to explain by example:

// Assuming some trait associated type, e.g.
trait Deref {
    type Target: ?MetaSized;
    // ...
}

// when declaring the function
fn f<P: Deref>()
// this needs to be treated as having
where
    <P as Deref>::Target: MetaSized,
{
    // because the code might have previously relied on that bound, e.g.
    _ = size_of_val::<P::Target>;
}

// This application of implicit bound needs to be applied recursively;
// consider e.g.
trait Strategy {
    type Pointer: Deref
    where
        <Self::Pointer as Deref>::Target: ?MetaSized,
        // NB: bound would be implied here without unbound
    ;
}

// Then with a function defined as
fn g<S: Strategy>() {
    // we need both S::Pointer: MetaSized
    _ = size_of_val::<S::Pointer>;
    // and S::Pointer::Target: MetaSized
    _ = size_of_val::<<S::Pointer as Deref>::Target>;
}

It is technically an option to say that relaxing an associated type to unbound ?MetaSized is a major breaking change, but this is essentially untenable because Deref is critical to basic functionality which should continue to work for extern types. If we need to apply this implicit projected bound at preprojection bound declaration for existing traits anyway, it makes more sense to apply it to all trait projections than only some.1

This actually makes me realize one more edge case to consider: trait DerefExt: Deref needs to have an implied bound of <Self as Deref>::Target: MetaSized as well, as default method bodies are currently able to (in effect) rely on the presence of that bound.

When to not apply the implicit bound is a bit more involved. If a generic binder context (params and where clause) introduces a trait obligation T: Deref (even for nongeneric T!), then an implicit obligation on the projection of <T as Deref>::Target: MetaSized needs to be introduced to the binder unless that same binder also contains a <T as Deref>::Target: ?MetaSized unbound2 or we already have an equality constraint on the projection (typically written in source as T: Deref<Target = U>) in which case the implicit bounds on U of course apply instead. The recursive property is that <T as Deref>::Target may itself contain some positive obligations (Deref doesn't, but Strategy above does), any trait bound obligations introduced for the projected type also need this implicit bound to be added in order to ensure current edition code is never exposed to ?MetaSized generics/projections without writing the ?MetaSized unbound themselves.

The extent of needing to apply MetaSized implicit bounds everywhere does make me wonder if it could be simpler to "just" "give up" and have the current edition code entirely unable to use ?MetaSized unbounds, with every reachable placeholder/generic type obligated to be MetaSized. It would be a strong cutoff between editions (and maybe even justify a distinctly settable dialect axis), but generically consuming ?MetaSized types can be expected to be rare. But I don't think applying a universal implicit obligation is really made significantly easier by not doing so in a way capable to being unbound. But I could see not allowing projection unbounds in the current edition, at least. Perhaps even supertrait unbounds3?

Pessimistic takeaway: extern type support in the current edition is going to feel awkward no matter what we do, in order to forbid ?MetaSized types from "leaking". The best we can do is to make it acceptably predictable and at least somewhat usable, then clean it up to our regular standards for the next edition.

Making ?Sized the only unbound again does seem to be the better solution for the future, but with one potential exception I just thought of and want to point out: dyn Trait. Unowned &dyn Trait can be ?MetaSized just fine, but it would seem unfortunate to require that most owned trait objects specify Box<dyn Trait + MetaSized> instead.

Footnotes

  1. There's an awkward middle ground where existing traits can opt into getting associated type MetaSized obligations elaborated, but new traits can have their associated types not get it. This enables traits to use impl detail associated types which are ?MetaSized without downstream current edition needing to unbound them. I don't even know if it's possible to make a type projection usable in your crate but not downstream without resorting to unreachable pub tricks, which would still get the implicit bound applied.

  2. Note that unbounds in this position (on a type other than a generic parameter introduced by that binder) are currently always a semantic error.

  3. Supertrait MetaSized is actually one case where I'm wondering if it could be desirable to keep it as an implicit bound with an unbound even in the next edition. The reason is that the implicit bound is enabled to behave differently from an explicit bound; namely, be exempt from dyn upcasting (but allow it for an explicit bound for consistency) and not be elaborated (such that adding the unbound is nonbreaking), which is potentially desirable. On the other hand, traits do fine without an implicit Sized and adding Sized when necessary, so there's reasonably little reason to have a conservative MetaSized bound either.

@Jules-Bertholet
Copy link
Contributor

This actually makes me realize one more edge case to consider: trait DerefExt: Deref needs to have an implied bound of <Self as Deref>::Target: MetaSized as well, as default method bodies are currently able to (in effect) rely on the presence of that bound.

You could gate the default method bodies behind the implied bound without gating the trait itself, as MetaSized would presumably be specialization-safe (no lifetime-dependent impls).

@snmps2

This comment has been minimized.

@GoldsteinE
Copy link

There could be use in defining extern types with known alignment, like so:

extern {
    #[repr(align = 8)]
    type Foo;
}

This type can be the last field of a struct, but it’s not MetaSized, merely MetaAligned. Supporting such a feature would require separation of these two concepts, so I think they should be separated from the start to avoid having to migrate custom DSTs two times.

@WaffleLapkin
Copy link
Member

@GoldsteinE unless I misunderstand what MetaSized/MetaAligned are, isn't Foo not just MetaAligned, but Aligned (i.e. the alignment is statically known even without metadata, similarly to slices).

Would also be nice to be able to specify "the same alignment as T", but that's besides the point.

@GoldsteinE
Copy link

GoldsteinE commented Apr 15, 2024

@WaffleLapkin Sure, it’s also Aligned. MetaAligned still seems to me like a more natural restriction for a tail field, although I’m not sure MetaAligned + !Aligned types are even achievable without fully custom DSTs (i.e. DSTs with user-defined size_of_val and align_of_val).

Would also be nice to be able to specify "the same alignment as T", but that's besides the point.

I think that with generic_const_exprs it would be just Aligned<ALIGN = align_of::<T>()>.

@Skepfyr
Copy link
Author

Skepfyr commented Apr 24, 2024

@nikomatsakis has written https://smallcultfollowing.com/babysteps/blog/2024/04/23/dynsized-unsized/ which suggests an alternative to relaxing ?Sized by using positive bounds (e.g. : MetaSized) that inherently relax the implied Sized bound. I'm sure I saw this suggestion before too but also have no idea where I saw it.

This suggestions should definitely go into the alternatives section at the very least, the way I see it the pros are:

  • It doesn't require an edition boundary to make the change.
  • It gets rid of the somewhat confusing ?Sized notation

The cons are:

  • The implied relaxation of Sized could be surprising (especially if it works in supertraits), T: Unsized isn't really a bound at all, as it doesn't let the function assume anything about T making these traits especially weird.
  • trait definitions currently assume Self is ?Sized + MetaSized (kinda? they allow functions that require Sized too which is odd), if this proposal means that in order to use a trait with an extern type it would have to be declared trait Foo: Unsized, it would be annoying to use them as it's likely crates would forget to do that.

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Apr 24, 2024

trait definitions currently assume Self is ?Sized + MetaSized (kinda? they allow functions that require Sized too which is odd), if this proposal means that in order to use a trait with an extern type it would have to be declared trait Foo: Unsized, it would be annoying to use them as it's likely crates would forget to do that.

Hmm, I overlooked that, I think we would want to use an edition for this, or possibly we could just get away with changing the default to Unsized.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
Status: Rejected/Not lang
Development

Successfully merging this pull request may close these issues.