Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create non-primitive-constant-data.md #46

Closed

Conversation

davidwrighton
Copy link
Member

Runtime level non-primitive constant support

@jkotas
Copy link
Member

jkotas commented Aug 11, 2018

There is existing very similar proposal https://github.com/dotnet/corefx/issues/26948

The differences between the two is:

  • This proposal is extended to handle arbitrary structs. The older proposal was scoped to primitives and Span because of that's what C# syntax was designed for.
  • RuntimeFieldHandle vs. field address + generic argument. Is there a good reason why to choose field address and generic argument, and not just follow the existing InitializeArray pattern with RuntimeFieldHandle ?
  • API names - the API review folks converged to CreateSpan as the preferred API name

@jkotas
Copy link
Member

jkotas commented Aug 11, 2018

cc @VSadov

return LoadConstantData(ref inData, 1)[0];
}

// General replacement for ArrayInitialize pattern, also handles non-primitive constants.
Copy link
Member

@jkotas jkotas Aug 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternative design could be to teach the existing InitializeArray API to support more types, and not introduce new API.

Copy link
Member

@VSadov VSadov Aug 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

InitializeArray requires that there is an array instance. Span-based API feels a bit more flexible.

It might be nice if InitializeArray works with generalized const-able types, but having span based APIs, it would not be strictly necessary. Compiler couild just emit ToArray or something.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, its straightforward to enable InitializeArray API to support more types. I generally prefer to avoid extending existing apis, but that would be reasonable to do.


### Definition of types which can be represented as constants
- Must be a struct
- The struct must have sequential layout
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about explicit layout?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the requirement should just be that the type has a constant size (so no auto-layout and no pointer or pointer-like types).
If the user opts-for Sequential layout, they are responsible for ensuring that the layout is the same on all systems; or the runtime should throw when the constant size and the actual size are different

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the requirement should just be that the type has a constant size

That is not sufficient to make this work on bigendian platforms.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Layout is sufficiently variable that it must either support explicit layout only(without overlapping fields), or sequential layout could also be supported. If sequential layout is supported at all, it doesn't make sense to put arbitrary restrictions on it, like must be consistent between platforms, as that's impossible to define.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also occurs to me that the current proposal would actually also work for auto layout. I'm not sure how I feel about that, but we could certainly make it work.

- Must be a struct
- The struct must have sequential layout
- Without pointers of any form (No object reference, IntPtr, UIntPtr, pointer, function pointer)
- All fields must be public, or the type must be a primitive type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does public matter at all here? I would assume the type must be primitive not any or qualification.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requiring all fields be public would also block "opaque" types, such as Vector128<T>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does public matter at all here?

The field layout has to be public contract for this to work with versioning. The easiest way to guarantee it is to require all fields to be public. Consider the case where the struct is defined in one assembly and used in different assembly.

"opaque" types, such as Vector128<T>

Types like Vector128<T> do not work with this in general. The swizzler would have to special case them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The field layout has to be public contract for this to work with versioning

I don't believe so. It could work with fields of any accessibility, provided the user doesn't do anything that modifies the layout of the struct later.

At the language level, it would be possible for the user to specially attribute structs that they want constant to work with (which tells the compiler, "I am opting into this behavior, and I won't change the layout later"). If they were to change it later, then it would be a breaking change and would fail at runtime (the same as any number of other breaking changes that are possible for people to make).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the language level, it would be possible for the user to specially attribute structs that they want constant to work with

I think it's going to be a requirement for these types to be marked in a special way. There are too many requirement,s most outlined in this document, that go against how struct are generated in general.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reference assemblies and the tooling around them would need to learn how to deal with it.

It would be great if the tooling could just reuse the C# Ref Assembly feature (which already deals with fields, etc)... It would be conceivable, even for partial facades, if the reference definitions were pulled from the general S.P.Corelib reference assembly (as needed), rather than being manually recreated.

Copy link
Member Author

@davidwrighton davidwrighton Aug 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tannergooding - Types like Vector128 which are special cased by the runtime are effectively primitives. I'll update the doc to note that Vector128 and Vector256 fall into that category. If we do end up allowing

@jaredpar Generally, non-public fields are not considered to be part of the public contract of a type. I am aware that in some circumstances, the private details of a structure affect various behaviors of the C#, but we generally want to make public contracts impacted by private details.

An alternative approach would be to describe the language compiler generated layout of structures via a set of attributes on the types. This would possibly allow us to describe the special case for VectorXXX as well as make the discussion of public vs private fields moot (at the runtime level). The idea of private fields being part of this sort of a contract makes me uncomfortable, but it works for me if one is talking about constants for types defined in the same module or something.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great if the tooling could just reuse the C# Ref Assembly

I agree that it would be great, but last time I have checked the C# Ref Assembly feature was insufficient for what CoreFX needs.

rather than being manually recreated

The manually maintaned public surface definition lets us ensure that we have same surface between platforms, and compatible surface betwen versions and different runtimes. The C# Ref Assembly feature does not have equivalent for this today.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, non-public fields are not considered to be part of the public contract of a type. I am aware that in some circumstances, the private details of a structure affect various behaviors of the C#, but we generally want to make public contracts impacted by private details.

Struct private fields are very much a part of the c# contract. I think every time people have tried to get cute about them not they get broken. I don't think I'd want to require public fields here in C#. Would detract a bit from the feature.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it would be great, but last time I have checked the C# Ref Assembly feature was insufficient for what CoreFX needs.

We should follow up again on this as I'm not sure what feature is missing at this point. There shouldn't be any gap at this point, other than changing the infrastructure for calling the compiler.

The proposal is to provide a set of CompilerServices apis implemented as intrinsics (in most cases) which will allow the use of more complex constants. This api will accept as a byref argument a reference to data in a well defined type layout, and return a byref to data in the correct platform layout.

### Definition of types which can be represented as constants
- Must be a struct
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it be a generic type? - like int? ?
It feels "why not", but then we must be sure that such types respect sequential layout and have predictable packing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, int? is probably a bad example, since it has private fields, but the question still remains - would generic types be supported?
A better example would be tuples like (byte, byte, byte) - can that be a constant?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be a generic type, but it only make sense if we were willing to make setting the fields of a nullable a well known detail of the frameworks. Given the history of nullable I think it would be fine, but it would probably require special casing in the runtime to have this knowledge.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And yes, it would be ok to declare the fields of ValueTuple to be a public contract as well. With regards to layout of generic structures, by allowing the layout to differ between the PE file and the actual runtime, support for a variety of capabilities pops out. I hadn't thought of generics, but they should just work.

@davidwrighton
Copy link
Member Author

davidwrighton commented Aug 13, 2018

With regards to the proposal by @VSadov, that proposal might be a better api to access this functionality. Initially, I had been thinking that avoiding the RuntimeFieldHandle and using just the ref parameter would be simpler, but as I dug into putting together the handling to allow proper interning of the data and only allowing this in safe ways, the design has gotten more complex. Support for the ref T returning variant is quite important, as it allows for individual constants to be more efficiently encoded, but we could easily swap to using the RuntimeFieldHandle to get at the data, and obviously the names can be swapped around.

@VSadov
Copy link
Member

VSadov commented Aug 13, 2018

Yes, we went through exactly the same reasoning. Referring to RVA constants symbolically - via RuntimeFieldHandle is sufficient for scenarios that matter.
It is also safer since in the handle case we can know the size of the data.

- allow auto and explicit layout
- explicitly declare that generics are supported
- Add comment about special types (Vector128<T>, Vector256<T>)
…s of api, but still retaining the handling of arbitrary types.
@davidwrighton
Copy link
Member Author

@jaredpar @VSadov I've updated the proposal.

Jared, do you think that there is a path to getting non-primitive constants into the language?

class RuntimeHelpers
{
// Placing this attribute on a type indicates that it can be stored and loaded using the CreateSpan<T>, LoadConstant<T> and InitializeArray helper functions
public WellKnownLayoutTypeAttribute : Attribute
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We never have attributes as nested types.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this attribute be synthetic and encoded as 0x18 in the type flags?


The proposal is to provide a set of CompilerServices apis implemented as intrinsics (in most cases) which will allow the use of more complex constants. The intention is to permit more complex constant forms for ReadOnlySpan<T> access, constant access, and through IL compiler magic, add new forms of array initialization that are not currently supported through the InitializeArray pattern.

### Definition of types which can be represented as constants
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The list of banned types should include mono's size-variadic types nint and nfloat as they don't fit on any of those categories.

- All fields must also be of a type which can be represented as a constant

### Well known type layout
Given that IL binaries may be loaded on many different platforms with different type layout rules, constant data must be represented in a consistent manner across all of them. For consistent type layout, types shall be laid out in precisely a sequential manner as described in metadata, with *no* packing between any fields. For instance, this struct will utilize 9 bytes when stored. In addition, if the type uses explicit layout, even if there are overlapping fields, those fields are not to be overlapped in the WellKnownLayout, the layout of the final form will be based on copying out of the data stored in the PE file in metadata order into the runtime layout. An empty structure will consume 0 bytes of storage, and thus while it may be used as a field within a constant, is not itself permitted to be used to instantiate CreateSpan<T> or LoadConstant<T>
Copy link

@kumpera kumpera Aug 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mono supports platforms that doesn't support unaligned loads, meaning all WellKnownLayout types and its fields must be accessed with unaligned. prefix.

class RuntimeHelpers
{
// Placing this attribute on a type indicates that it can be stored and loaded using the CreateSpan<T>, LoadConstant<T> and InitializeArray helper functions
public WellKnownLayoutTypeAttribute : Attribute
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this attribute be synthetic and encoded as 0x18 in the type flags?

@jaredpar
Copy link
Member

@davidwrighton

Jared, do you think that there is a path to getting non-primitive constants into the language?

By non-primitive do you mean not the CLR primitives (int, string, double, etc ...)?

@davidwrighton
Copy link
Member Author

Yes

And about naming... the naming on the attributes I've whipped up is very suspect. These would need more formal review before I would consider the naming to be at all finished.

@tannergooding
Copy link
Member

@davidwrighton. How much work would it be to add a RuntimeHelpers.CreateReadOnlySpan method that just supports the 12 primitive types listed here: http://source.roslyn.io/#Microsoft.CodeAnalysis/SpecialTypeExtensions.cs,69087839d91b67fd

Currently, adding the support for constant user-defined structs is likely out of scope for C# 8.0

However, updating C# to directly call RuntimeHelpers.CreateReadOnlySpan instead of RuntimeHelpers.InitializeArray is likely a compiler only change and is would be "in scope" (since it is basically just a bug fix/perf improvement).

@davidwrighton
Copy link
Member Author

That would be rather less work. If we don't have any real need for the more complex stuff, it would be much simpler to do that the rather large pile of stuff I suggest in this proposal.

@terrajobst
Copy link
Member

What's the status of this?

@davidwrighton
Copy link
Member Author

Probably not going to happen any time soon, there doesn't seem to be a great need for this at this time.

@terrajobst
Copy link
Member

Should this be closed then?

@pentp
Copy link

pentp commented Mar 27, 2020

No, there's still need for this, it's just lower priority than having constant data for primitive types.

@terrajobst terrajobst marked this pull request as draft April 9, 2020 23:44
@richlander
Copy link
Member

If we're not doing anything with them, can be merge it into "proposed"? We should only use open PRs for active proposals (that we're actively working on, as a proposal).

@davidwrighton davidwrighton marked this pull request as ready for review April 27, 2020 18:21
@davidwrighton
Copy link
Member Author

@richlander, I'm unfamiliar with the merge policies of this repo. If you believe that the it is appropriate to merge, please do so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants