Proposal: "nominal records" for C# #1667

agocke · 2018-06-25T20:57:55Z

This is the first part of a set of proposals about data manipulation and representation in C#, but can be viewed as a standalone language feature.

agocke · 2018-06-25T20:58:06Z

cc @jaredpar

CyrusNajmabadi · 2018-06-25T21:18:51Z

proposals/data-classes.md

+First, the generation of equality support. Data members are only public
+fields and auto-properties. This allows data classes to have private
+implementation details for things like caching without giving up simple
+equality semantics. There are a few places this could be problematic. For


I'm not saying this is wrong. But this def feels like it needs to be expanded upon, esp. give the above sections on what data is/isn't. Talking about impl-details(i.e computation) and caching, after somewhat saying that data is not these things isn't wrong, but it's very subtle and muddies things a bit.

I'll elaborate more on this. The idea is that the public contract should be seen as a collection of data. As always, there's some room to hide your dirty laundry in private implementation details if the real world is more complicated than that.

FWIW, this also breaks serialization (in particular, _de_serialization), at least the way it's commonly implemented in e.g. Newtonsoft.Json.

CyrusNajmabadi · 2018-06-25T21:20:20Z

proposals/data-classes.md

+
+`GetHashCode` would be implemented by calling `GetHashCode` on each of
+the data members and these would be combined using a symmetric operation
+(like `xor`) to prevent ordering from affecting the results.


is it expected that 'data' is unordered? Above, you mentioned:

Data *is* a collection of values with potentially heterogenous types.

Should that be "an unordered collection of values"?

Or, potentially:

Data *is* a collection of **named** values with potentially heterogenous types.

@agocke No, this is a different issue. The problem with using a symmetric operation for combining hash codes is that it means that e.g. new Point { X = 1, Y = 0 }.GetHashCode() returns the same value as new Point { X = 0, Y = 1 }.GetHashCode().

This is easily solvable by adding the name of the field to the hash.

The comment about pretending that types are the same is because I assumed the reason for symmetry is that you wanted e.g. new Contoso.Person { FirstName = "Andy", LastName = "Gocke" }.GetHashCode() to be the same as new AdventureWorks.Person { LastName = "Gocke", FirstName = "Andy" }.GetHashCode().

Again, making the name of the field part of the hash solves the problem.

CyrusNajmabadi · 2018-06-25T21:24:58Z

proposals/data-classes.md

+semantics with `readonly` semantics. Requiring initialization via constructor
+means that field order becomes a public API and requires careful versioning,
+which is not true of mutable `data` types and is a constraint that should
+be irrelevant to `readonly` semantics.


interesting idea. This feels like this may be unpalatable for a section of users. It sounds like you would initialize with something akin to the field/property initializers you have today in C#. i.e. new Foo { X = x, Y = y. etc }. I would contend there are customers for whom concerns like smell and careful versioning do not apply, who would find the above far too verbose and cumbersome for their tastes. Not having a form like new Foo(x, y) may be too bitter a pill for some to swallow.

The most unpalatable aspect of this would be the compiler surfacing public mutable fields with autogenerated unspeakable names with only a wink and a nudge to enforce their immutability. There must be a significantly better way to accomplish this behavior without requiring private implementation details to be exposed which then every other compiler either has to ignore or has to respect.

What about creating a nested public struct to contain all of the data members and having the constructor or a static factory method accepting that struct?

public data class LoginResource { public string Username { get; } public string Password { get; } } var loginResource = new LoginResource { Username = "andy", Password = password }; // translates into public class LoginResource { private readonly DataMembers _data; public LoginResource(DataMembers data) => _data = data; public string Username { get => _data.Username; } public string Password { get => _data.Password; } public struct DataMembers { public string Username { get; set; } public string Password { get; set; } } // other members elided for brevity } var loginResource = new LoginResource(new LoginResource.DataMembers { Username = "andy", Password = password });

@HaloFour I'm came up with this exact same idea! The problem is, what if you have a data struct? Now you have two structs for every struct! That seems like a lot of metadata bloat.

@agocke

That seems like a lot of metadata bloat.

Which, in my opinion, is a much more palatable issue to have than publicly-exposed secretly named writable "readonly" fields. 😄

The builder pattern isn't an uncommon one to see, especially with readonly data classes. To see it codified and accessible as a language feature would be a welcome addition, in my opinion.

I, too, think this would be worth exploring.

Though, i'm also amenable to either approach being taken. I think @HaloFour (understandably) has a very visceral reaction to 'faked up' readonly. I'm more ok with it. But it would be nice if the underlying generation was actually just 'clean' and followed some sort of pattern that any language could consume/produce without having to understand hackery...

Not to reveal too many secrets but...

all of F#'s supposedly thoroughly safe, immutable data structures are not marked initonly in metadata

😉

This is not to imply F# is unsafe -- just the opposite. I think the truth is that language rules provide the vast majority of safety. This is also the position we took with the new ref readonly feature, where we allow ref variables to be taken directly to readonly fields and rely on the C# language rules to enforce the safety of ref readonly (which the CLR has no conception of).

@agocke I think relying on language rules with private or internal members is fine, but doing the same with public members is much more problematic. Does F# ever produce public non-initonly fields for its immutable data structures?

And ref readonly uses modreq, which means that some compiler could circumvent those rules, but it would have to be wilful.

Using unspeakable names offers questionable safety, especially since there is a major .Net language that can speak them, namely F# with its double backtick syntax.

@agocke

Warning: I saw Hello World implemented in F# once so I'm totally qualified to comment on the language.

F# also encodes a boatload of secret metadata in binary resources that is intended to only be understood and supported by F#. Where you can expose metadata publicly either the rules are different or the magic breaks down. An example might be F#'s extended generic type parameter constraints which most languages don't understand and don't enforce only to have the program explode at runtime if the generic type argument isn't "valid". In my opinion C# should be a much better citizen in the .NET ecosystem of languages and should try to avoid relying on magic public behavior where it is possible.

I do agree that readonly ref does involve some magic, but I would argue that it's different in that it's relatively niche and that the magic mostly happens on the implementer side. IIRC it would take someone writing some pathological code to simulate a readonly ref from another language which could abuse the lack of enforcement. Data classes, on the other hand, are intended for a significantly wider audience to eliminate the boilerplate of creating these very common classes. Having a leaky abstraction where any consumer can wreck the internal state of the class seems incredibly dangerous.

Anywho, the crux of my argument is that I would prefer that clean/safe approaches be considered before hacky ones. In fact, I think support for a builder pattern would be nice for C# as an orthogonal feature and data classes can simply piggyback on that.

HaloFour · 2018-06-25T22:02:30Z

When will an Issue be created for this proposal? I imagine that some of these points will be quite contentious. Just briefly scanning the "readonly initialization" section is enough to give me nightmares.

gulshan · 2018-06-26T10:32:20Z

How this proposal relates to Records #39 ? As I commented there, I am for just a primary constructor based short hand definition capability for all/any types, and then the equality, Hash, with and deconstructor generation on top of that when associated with some keyword like data (or attribute like [Data]). The same data keyword being proposed here with equality and hash generation. Shouldn't these two proposals be consolidated?

I think proposed readonly initialization changes current meaning of initializers in C#. With non-nullable references are coming, initialization of individual reference type fields will be a concern. I personally think, default values for primary constructor parameters are enough to deal with this. It also optionally provides names for fields and ignore the order of the arguments to some extent.

svick · 2018-06-26T10:57:45Z

proposals/data-classes.md

+is a subtype of `A` (and therefore has an implicit conversion) and all of
+the shared members between `A` and `B` are equal. In contrast, `A` would
+likely not have an implicit conversion to `B`, so they would not be
+considered equal.


If I understand what you're proposing correctly, then I think this is a terrible idea:

It means that a.Equals(b) can return true, even though a.GetHashCode() and b.GetHashCode() are not equal.

As mentioned, it means Equals is not symmetric.

Both of these points mean that these types won't work correctly in hash-based collections like Dictionary<TKey, TValue> or HashSet<T>. And they also break the invariants required by implementations of Equals(object) and GetHashCode(). #Resolved

This is a really good point. The problem I ran into was: how do I prohibit it?

Making .Equals do a dynamic type check seems problematic. So does calling GetHashCode inside Equals.

However, Neal has sent me a great idea for producing a symmetric Equals:

class Person { public string Name { get; } public Person(string name) { Name = name; } protected virtual Type EqualityContractOrigin => typeof(Person); // Note here public override bool Equals(object obj) { var that = obj as Person; return that != null && that.EqualityContractOrigin == this.EqualityContractOrigin && // Note here that.Name.Equals(this.Name); } }

I really like this solution.

#Resolved

@agocke

Making .Equals do a dynamic type check seems problematic.

I think that's actually the right solution. Different types should not compare as equal, even if e.g. a derived type does not override methods it should override.

Why would it be problematic?

If it's about performance, I think code like that.GetType() == this.GetType() has the potential to be more efficient than two virtual calls (though the CLR could do a better job at optimizing it). #Resolved

svick · 2018-06-26T11:04:09Z

proposals/data-classes.md

+
+`GetHashCode` would be implemented by calling `GetHashCode` on each of
+the data members and these would be combined using a symmetric operation
+(like `xor`) to prevent ordering from affecting the results.


I think the primary result of this would be a bad hash code.

Trying to pretend that two different data classes are the same if they have the same members sounds like a bad idea to me.

This sounds like you're restating the previous comment. If Equals is symmetric, do you have any other objection?

@agocke No, this is a different issue. The problem with using a symmetric operation for combining hash codes is that it means that e.g. new Point { X = 1, Y = 0 }.GetHashCode() returns the same value as new Point { X = 0, Y = 1 }.GetHashCode().

I believe that in some data sets, this could increase the number of collisions by a lot.

The comment about pretending that types are the same is because I assumed the reason for symmetry is that you wanted e.g. new Contoso.Person { FirstName = "Andy", LastName = "Gocke" }.GetHashCode() to be the same as new AdventureWorks.Person { LastName = "Gocke", FirstName = "Andy" }.GetHashCode().

Sorry I was unclear.

Ah, in that case the intention is: yes, fields are deliberately unordered. However, there is no structural equality -- the types must be nominally compatible for two types to be considered equal. It sounds like we're also centering around subtypes not being equal to supertypes, so that would imply the types are equal as well.

@agocke In that case, I think the order of fields should matter specifically for the purpose of computing the hash code. Hash codes should not be preserved across executions, so I think it's fine if changing the order of fields changes the hash code.

(Even stronger example is .Net Core 2.1 HashCode. It uses randomization, which means hash codes produced by it change even if the types themselves don't. If that behavior is acceptable, then I think depending on the order of fields should be also acceptable, even if they are otherwise unordered.)

svick · 2018-06-26T11:07:29Z

proposals/data-classes.md

+default initialization in the base class, the subclass is required to
+define a protected constructor. The constructor must assign all readonly
+members of the base class before the constructor ends, or an error is
+produced.


Do I understand it correctly that because of this, it's not possible to make something like the following code work?

data class Person { public string Name { get; } } data class Student : Person { public int Grade { get; } } … new Student { Name = "Jane Doe", Grade = 1 }

Or do these rules only apply to inheritors that are normal classes, not data classes?

This should only apply to normal classes. Data classes should inherit the requirements of their base data classes.

gafter · 2018-06-26T14:33:59Z

@agocke we figured out how to make equals be symmetric in the case of inheritance. I’ll send you notes tomorrow.

agocke · 2018-06-26T16:00:08Z

@HaloFour Several members of the LDM requested a PR, since that's essentially an issue in GitHub, but it has better change tracking and you can leave comments on individual sections.

agocke · 2018-06-26T16:06:08Z

@gulshan @CyrusNajmabadi If you're looking for an ultra-short syntax for data members, you may be interested in my forthcoming companion proposal for expanding tuples (I call them "named tuples").

jaredpar · 2018-06-26T16:07:47Z

proposals/data-classes.md

+{
+    public string Username { get; set; }
+    public string Password { get; set; }
+    public bool RememberMe { get; set; } = false;


Why have = false here? It's the default and having it put there makes me think it's relevant to the feature. #Resolved

jaredpar · 2018-06-26T16:08:34Z

proposals/data-classes.md

+Unfortunately, there are still serious problems. There is no piecewise
+comparer implicitly defined for C# classes, so if you want simple data
+comparison, the real example looks like
+[this](https://gist.github.com/agocke/6ba6e64f77f1212ba7292bfd1f1000e5)


Rather than link to a GIST I would just inline the sample here. Yes it's a big sample but that's part of the point. #Resolved

jaredpar · 2018-06-26T16:09:39Z

proposals/data-classes.md

+1. A constructor must be manually defined.
+1. The constructor parameters are ordered, while the fields are not.
+   Consumers have now taken a dependency on the parameter ordering.
+1. The constructor must be maintained with any field changes.


Another issue: constructor parameter names don't necessarily line up with field / property names hence named argument passing doesn't have the same ease of use as object initializers. #Resolved

jaredpar · 2018-06-26T16:09:48Z

proposals/data-classes.md

+
+There is also no way to create a copy of a data structure with readonly
+fields with only one item changed. A new type must be constructed manually.
+


double blank line. #Resolved

jaredpar · 2018-06-26T16:10:53Z

proposals/data-classes.md

+## Proposal
+
+To resolve many of these issues, I propse a new modifier for classes and structs: `data`.
+`data` classes or structs are meant to satisfy the goals listed above by doing the


Can we have a sample of a data class usage in the proposal?

jaredpar · 2018-06-26T16:11:07Z

proposals/data-classes.md

+`data` classes or structs are meant to satisfy the goals listed above by doing the
+following things:
+
+1. Automatically generating `Equals`, `GetHashCode`, `ToString`, `==`, `!=`, and `IEquatable<T>`


How is ToString relevant here?

+1 on including ToString. It's handy for debugging.
A debugger display tip would be ok, as an alternative.

In reply to: 198207292 [](ancestors = 198207292)

jnm2 · 2018-06-26T16:38:16Z

proposals/data-classes.md

+
+## Proposal
+
+To resolve many of these issues, I propse a new modifier for classes and structs: `data`.


Typo propse #Resolved

jnm2 · 2018-06-26T16:42:33Z

proposals/data-classes.md

+`readonly` in metadata. The CLR treats `readonly` mostly as guidance -- it
+can easily be overriden using reflection anyway. Most of the safety of
+`readonly` members in C# is not provided by the runtime, but by C# safety
+rules. One way we could enfore compiler rules would be to generate public


Typo enfore

jnm2 · 2018-06-26T16:56:40Z

Concerned about the public unspeakable members, but I'm thrilled about this proposal in general. Can hardly wait for this!

There is also no way to create a copy of a data structure with readonly fields with only one item changed. A new type must be constructed manually.

I'm particularly interested in this since today I'm maintaining With* methods by hand. What would the syntax be for creating a copy with only one item changed?
If I wanted to suppress a certain With* method or add a custom With* method to require two fields to be set at once, perhaps with very simple validation, would I have to swap the entire class over to a normal class first?
Would permitting the manual override of ToString on a data type ever be considered? I write data types and override ToString for UI data binding to control the display of the object, choosing one or two of the fields to use in the display string.

HaloFour · 2018-06-26T17:22:10Z

@agocke

Several members of the LDM requested a PR, since that's essentially an issue in GitHub, but it has better change tracking and you can leave comments on individual sections.

I agree that PRs make it easier to discuss specific portions of a proposal, but it doesn't feel appropriate to have this one discussion segregated from every other discussion. Once this proposal is merged the conversation is effectively lost since a user would have to know to look in a completely different place to find it. The only thing worse than the shortcomings of the Issues is scattering the discussions all over the place.

agocke · 2018-06-26T17:23:57Z

@dotnet/csharplangdesign See @HaloFour's comment above about proposal formatting.

gulshan · 2018-06-26T18:08:36Z

So, what about the Records #39 ? Abandoned?

Fixes some typos and changes equality to be commutative even in the presence of subtyping.

agocke · 2018-06-26T18:49:39Z

@gulshan You can see this as part of a broader "records" proposal. If the thing you were looking for in records was the syntax, that will be part of a follow-up proposal I'm calling "named tuples".

CyrusNajmabadi · 2018-06-26T18:54:08Z

proposals/data-classes.md

+        hashCode = hashCode * -1521134295 + EqualityComparer<string>.Default.GetHashCode(Username);
+        hashCode = hashCode * -1521134295 + EqualityComparer<string>.Default.GetHashCode(Password);
+        hashCode = hashCode * -1521134295 + RememberMe.GetHashCode();
+        return hashCode;


note: this is simpler now with HashCode.Combine. (just to make the example code fair :)).

CyrusNajmabadi · 2018-06-26T18:55:55Z

proposals/data-classes.md

+1. The constructor must be maintained with any field changes.
+1. Constructor parameter names don't necessarily line up with field/property
+names hence named argument passing doesn't have the same ease of use as
+object initializers.


Note: tehse are definitely cons of a constructor. But it might be nice to mention the pros as well. Namely terse and sensible syntax for 'data' that has well-understood positional ordering.

CyrusNajmabadi · 2018-06-26T18:56:38Z

proposals/data-classes.md

+
+1. Automatically generating `Equals`, `GetHashCode`, `ToString`, `==`, `!=`, and `IEquatable<T>`
+   based on the member data of the type.
+1. Allow object initializers to also initialize readonly members.


not sure if it's mentioned later. But is it possible to override any of this? For example, if you want to provide a better GethashCode impl? Or if you want to provide your own specific ToString?

Yup. We may want to give a warning/error if you override everything and data just means nothing.

Well, i was hoping 'data' would solve the issue of:

+There is also no way to create a copy of a data structure with readonly
+fields with only one item changed. A new type must be constructed manually.

So i could imagine having a data class where i override everything. But i still benefit from the fact that you provided me the easy way to generate a copy with only some pieces changed.

We would not give the warning whenever there are read-only data members because someone may be using the object initializer support for them.

As far as With-ers goes, they will be generated too, but I don't have the details worked out yet. I think we may want to introduce an actual with { } expression to mirror the object initializer syntax to make handling read-only members easier (and not generate many intermediate objects).

CyrusNajmabadi · 2018-06-26T18:57:23Z

proposals/data-classes.md

+{
+    public string Username { get; set; }
+    public string Password { get; set; }
+    public bool RememberMe { get; set; }


very nervous about these being mutable. can i make immutable if i want?

Ok good :) Personally, in any docs, i think it might be good to start with the data being immutable. but then showing how you can opt-into mutability pretty easily. but that' sa very minor point.

I've changed the example to read-only to show off the codegen hacks.

CyrusNajmabadi · 2018-06-26T18:58:17Z

proposals/data-classes.md

+        {
+            return EqualityComparer<string>.Default.GetHashCode(Username) +
+                EqualityComparer<string>.Default.GetHashCode(Password) +
+                RememberMe.GetHashCode();


since you can only compare with other LoginResource's... is there a reason that hashing is unordered?

CyrusNajmabadi · 2018-06-26T18:59:43Z

proposals/data-classes.md

+To support these cases and provide an easy escape hatch, I propose a
+new attribute, `DataMemberAttribute` with a boolean flag argument on the
+constructor. This allows users to override the normal behavior and include
+or exclude extra members in equality. The previous example would now read:


this seems odd to me. Given that it's a Data-Class or Data-Struct, my bias is that the public props/fields should be part of the data-ness. It seems like if someone does not want that, the attribute should be to opt-out, instead of needing to opt-in.

auto-properties are part of data-ness, computed properties may not be. That's especially true if you have, say, a half dozen computed properties all reading from single piece of data that you could compare directly for equality. For instance, flags enums to boolean properties.

auto-properties are part of data-ness, computed properties may not be.

Agreed. but my point is simply: when something "may not be", what side of hte fence do you decide they're on. My argument is we shoudl just assume they are part of the data-ness. And if you don't want that, you opt-out.

Note: i also think opt-out is a good idea because maybe i want an auto-prop and i do not want that auto-prop to be part of the data-ness. Now, i'd have a consistent way for auto-props and computed-props to say that. i.e.:

[NotData] internal int LogLevel { get; set; }

I want this to be an auto-prop because i really don't need to compute anything. But i don't want it participating in the data-ness.

That's especially true if you have, say, a half dozen computed properties all reading from single piece of data that you could compare directly for equality.

This example is interesting to me, because for the data classes I would want to write, that piece of data would not be a public prop/field, and would instead be a private field. The example that immediately comes to mind is CommonConversion in Roslyn, where it's a bunch of bool properties that we internally store as bits in an integer. For that example, what would you propose for comparison? Would there be some way of saying "These public things aren't part of equality, but this private field is"?

@333fred I think in that case, the best thing would be to just provide your own override of GetHashCode/Equals. i.e. you know better. So you can give the fastest impl.

@333fred That's what the proposal is talking about. You put [DataMember(true)] on the private field and it's considered in equality. The public computed properties are not.

I think @CyrusNajmabadi's approach is too heavy-handed.

:D

Note: in fred's example, he jsut wants the private field used for equals/gethashcode. Is there then a way to opt-out of something being a data member? i.e. even if you have an auto-prop, can you put [DataMember(false)] on it?

If so, i think i'm happy. Basically, the language would have rules about if an unadorned member was a data-member or not (i.e. auto-props are, blah blah blah are not). But you can always put on DataMember(true/false) on anything to explicitly opt-in/out.

Does that make sense ot you Andy?

@CyrusNajmabadi Yeah, the boolean flag in the constructor was supposed to work both ways. To both opt-in and opt-out.

CyrusNajmabadi · 2018-06-26T19:01:31Z

proposals/data-classes.md

+are equal. The members are compared by `==` if it is available. Otherwise,
+the method `Equals` is tried according to overload resolution rules (st. an
+`Equals` method with an identity conversion to the target type is preferred
+over the virtual `Equals(object)` method).


is this better than saying that you use EqualityComparer<X>.Default.Equals(a, b)? That shoudl do "the right thing" for basically all types, right? Of course, the compiler could consider optimizing that for some things like primitive types. but it seems odd to bake in the knowledge about == vs .Equals vs IEquatable.Equals.

That's a great question that I don't know the answer to. (yet)

CyrusNajmabadi · 2018-06-26T19:02:33Z

proposals/data-classes.md

+
+`GetHashCode` would be implemented by calling `GetHashCode` on each of
+the data members and these would be combined using a symmetric operation
+(like `+`) to prevent ordering from affecting the results.


not clear ot me why/how ordering could/would matter.

Since you can't compare against a different type... your orders will be the same.

You could certainly save the hash code and compare before/after reordering members.

This is interesting because now you're providing an even stronger guarantee. That the hashcode is not only safe across reordering. But also safe across versioning of the compiler/language. I'm ok with such a guarantee. but we would really have to be specific about it.

Personally though, i don't like it. :)

I would much rather us get behind a weaker guarantee. That only within a session could you even be sure that hte hashcodes would be the same. I would even do what some systems do and introduce some random value so that each session produces new hashcodes. That way, if you actually tried to take a dependency on hashcode values, that you would break almost immediately.

@agocke The documentation of GetHashCode() strongly warns against doing that.

That's fair. Still, a commutative hashcode (+) is simpler, so I'm not sure why you would want a non-commutative one.

Basically, because collisions are far more likely. frankly, i think from a language perspective we should say nothing more than: if you have the same member values in teh same session, you'll get teh same hashcode. Then, from an impl perspective we should basically either call into HashCode.Combine (if it exists), or just emit the same GetHashCode code as we do for anonymous-types.

Also: > is simpler

This isn't really a virtue (to me at least) for generated code. We already have fairly good hashcodes for tuples/anonymous-types. I would not want us to pick a poorer hashcode here (i.e. more likely to collide) and then regret that later on because customers are negatively impacted.

Sure. You've convinced me. We should use whatever hash code combine we think is fast and provides a good distribution. Hash.Combine is probably a decent starting point.

agocke · 2018-08-04T21:22:51Z

I've made some changes here and renamed my proposal to "nominal records" to differentiate from the existing records proposal, which I'll now refer to as "positional records." I have also changed to an "initializer struct" design for lowering, which I previously abandoned because I was worried it had too much metadata overhead. However, looking at it more, I think it's a bit cleaner and the metadata overhead is almost nothing in my tests.

This is not quite @HaloFour's proposal, because I still favor unspeakable initializers to allow compilers to enforce safety rules around required initialization.

HaloFour · 2018-08-04T21:34:51Z

@agocke

This is not quite @HaloFour's proposal, because I still favor unspeakable initializers to allow compilers to enforce safety rules around required initialization.

In my opinion that makes them unnecessarily inaccessible from other languages or downlevel compilers.

CyrusNajmabadi · 2018-08-05T15:27:11Z

proposals/data-classes.md

-    public string Password => <>Backing_Password;
-    public string <>Backing_RememberMe = false;
-    public bool RememberMe => <>Backing_RememberMe;
+    public struct <>Initializer


my personal preference would be that this just be a well known name that then supports this pattern 'by convention'. Like so many patterns in the language, we would just be looking for a general accessible/public API-shape, and if found, would light up this functionality.

So, in this case, having htis be called "Initializer" or "Builder" would be far preferred to some unnameable name like <>Initializer.

You realize this is a ridiculously complicated pattern, right? You have to have a public builder. You have to manually move your initializers from the class to the builder (the compiler can't help with or verify this step). The fields must have the exact same names as the properties. What if the property has a setter? Is the pattern illegal now (allowing it would be a breaking change)? What message does the compiler produce?

Worse, now this whole system is a public API surface that can't be changed. What if we want to change the lowering strategy, as we just did in the course of this PR?

This isn't really a pattern, it's a lowering strategy.

You realize this is a ridiculously complicated pattern, right?

Not to me :)

You have to have a public builder.

Not ridiculous. Not complicated (IMO).

You have to manually move your initializers from the class to the builder (the compiler can't help with or verify this step).

Neither ridiculous nor complicated :)

The fields must have the exact same names as the properties.

Neither ridiculous nor complicated :)

What if the property has a setter? Is the pattern illegal now (allowing it would be a breaking change)? What message does the compiler produce?

I'm happy to discuss all those potential concerns. But i would like to start with an open mind that this isn't defacto 'absurd' as you are making it out to be :)

It may be that there is enough cases to help drive the idea that this is better as something truly unnameable. But i would prefer that be the result of a good discussion with enough informign cases, rather than haivng that be the presumed starting point.

Effectively, none of the above seemed at all strange or onerous to have as a defacto pattern. Heck, it seems exceptionally tame compared to many other patterns we have in the language :) (Just my 2c on these).

I don't underestand what it buys you if it was a pattern? why you'd want to implement it manually? also I think @agocke's point about it being "complicated" is not the pattern itself, the compiler should be able to extract semantical information from that pattern which could change as the feature involve in the future, defining such pattern can be quite compilicated to be future-proof. on the other hand, as a "lowering strategy" it will come after all the semnatic analysis and also can change which has a greater value imo.

CyrusNajmabadi · 2018-08-05T15:29:51Z

proposals/data-classes.md

+`readonly` semantics. Requiring initialization via constructor means that
+field order becomes a public API and requires careful versioning, which is
+not true of mutable `data` types and is a constraint that should be
+irrelevant to `readonly` semantics.


ok... i've read this a few times... and i feel like i'm not getting it. Specifically, i don't see why it's necessary for the initializer name to be unspeakable.

HaloFour · 2018-08-05T16:13:11Z

@CyrusNajmabadi

I think the argument for unspeakability is that then the compiler can enforce that some members must be initialized. If the initializer is simply based on a normal coding pattern then downlevel compilers can consume it directly and initialization of those required members can't be enforced.

Given that these protections are at best suggestions which can't be enforced particularly with languages like F# that can speak unspeakables and that adding a new required member is very intentionally not a breaking charge and existing consumers would continue to not initialize that member I don't think it's a point worth making the type unspeakable. The compiler enforcement could just as easily be accomplished through an analyzer on the builder member.

alrz · 2018-08-05T16:28:11Z

proposals/data-classes.md

+look like:
+
+```C#
+public class LoginResource : IEquatable<LoginResource>


Can we make IEquatable impl opt-in? e.g. only if spelled out in the source: data class C : IEquatable<C>

(or opt-out a la fsharp)

to reduce the amount of possibly unused code generated. the opt-in solution would be synonymous to derive or deriving in rust and haskell.

alrz · 2018-08-05T16:31:02Z

proposals/data-classes.md

+```C#
+public data class LoginResource
+{
+    public string Username { get; }


Does it even make sense to make these non-public? Could it default to public?

Private fields and properties are perfectly fine. We could consider it, but right now I'm not wild about changing the default accessibility of members. That's a significant amount of extra complexity in the language, and peoples' style guidelines often depend on it.

also related, I think it's actually useful to force properties to come first in the class body, otherwise it doesnt look like a "data class" anymore. sure it might be basically a style preference but it worth to consider.

agocke · 2018-08-05T19:01:46Z

@HaloFour That's one argument, but I also want to leave this entire initialization phase as implementation defined to allow for codegen changes later.

agocke · 2018-08-05T19:59:08Z

@CyrusNajmabadi @HaloFour I think you guys are misremembering how patterns work in C#. Patterns are for allowing user-written types to substitute for well-known framework types. So instead of just Task being awaitable, anything that looks like Task is awaitable.

There's no such thing as a "compiler-generated" pattern. If the pattern components are public and speakable, that means that regular C# code can interface with them, which means that they must be real and actually exist in source. If you can type it, then you need to be able to go-to-def on it, so the source must be there and navigable. The compiler, never, ever, alters source trees or produces source code. This is just a non-starter.

svick · 2018-08-05T20:24:58Z

@agocke

If you can type it, then you need to be able to go-to-def on it, so the source must be there and navigable.

What about properties of anonymous types or the Invoke method of a delegate type? You can type them, but you can't really go-to-def on them.

alrz · 2018-08-05T20:27:26Z

@svick I think it means that you can type them at the declaration site.

agocke · 2018-08-05T23:00:24Z

Well, the rule isn't exact: certainly the generated constructor of a class with no constructors is "real" and has a go-to-def target, but this isn't a pattern, per se. I actually have the same question for the "with" methods proposed in the records proposal. If you can call those methods and treat them as "real" methods, how do they appear to the compiler? Do we actually create fake method symbols to represent them? Where's their syntax?

CyrusNajmabadi · 2018-08-06T01:59:10Z

@CyrusNajmabadi @HaloFour I think you guys are misremembering how patterns work in C#. Patterns are for allowing user-written types to substitute for well-known framework types.

I'm just 'pattern' in a generic sense. i.e. just the general idea of "what the compiler would look for."

If what we lower to is something reasonable (which i feel like it totally could be0, i don't see any reason why i wouldn't want the compiler to be able to see such a thing in IL (which might have been produced by something else) and go: "ah, i know how to use that! i'll let you use things like Object-Initializers, but then have it compile down to that Class and it's Builder class automatically for you".

So instead of just Task being awaitable, anything that looks like Task is awaitable.

Yes. And that's effectively what i want. Instead of 'object initializers' only working with classes/structs with mutable fields/properties, i want anything that looks 'initializable' to be initializable. However, we would extend the 'initializable' pattern to recognize seeing a well known convention to enable this facility.

CyrusNajmabadi · 2018-08-06T02:00:11Z

The compiler, never, ever, alters source trees or produces source code. This is just a non-starter.

I don't know why that needed to be mentioned :) There must be some confusion as i haven't ever expected the compiler to do anything of the sort :)

CyrusNajmabadi · 2018-08-06T02:05:04Z

That's one argument, but I also want to leave this entire initialization phase as implementation defined to allow for codegen changes later.

I definitely get that. And i think that's a reasonable position to take. It's just one i disagree with.

I think we should spend hte cycles (even though they may not work out) to consider trying to make this not an implementation detail, but to strongly specify what the phase should look like.

If you then created a data class you would get that specific lowering output. However, conversely, if we saw that lowered 'pattern' (i'm not sure what a great term here is, so i'm going to keep using 'pattern'), we would say: "oh, this is a data class!".

That's all. :)

Note: i'm not saying: "we have to do this". Or "it's ludicrous and stupid to do this another way". I'm simply saying that we should examine and consider this and properly weigh out the pros/cons here before dismissing it out of hand right at the beginning. I personally think there is a lot of merit in defining a convention-based approach to mutable/immutable data. And i think if we're going to spend cycles looking into this space, we shouldn't be so quick to just immediately reject this idea.

HaloFour · 2018-08-06T02:09:36Z

@agocke

I think you guys are misremembering how patterns work in C#. Patterns are for allowing user-written types to substitute for well-known framework types.

Are collection or indexer initializers not patterns? If not, what are they? I certainly don't want to get bogged down in an argument over what things are named. I care about how they behave. I'd rather a general purpose feature, one that enables me to then add said support to my own types or possibly even other types via extension. Immutable object initializers are an orthogonal concern with other existing requests/proposals requesting this capability. Having this proposal come up with a method that both can't be used elsewhere in the language and also can't be used well from other languages or downlevel compilers seems like a real shame.

CyrusNajmabadi · 2018-08-06T02:17:44Z

I think i feel very similarly to @HaloFour. It seems like a shame to go down a path that seems less capable and less enabling, when (my hope is) that there is a very cheap alternative that surfaces the same user experience, while also not preventing these other interesting scenarios.

HaloFour · 2018-08-06T02:19:01Z

One question, which I don't think is explicitly addressed in the proposal: is the addition of a new required readonly property to a nominal record considered a breaking change?

agocke · 2018-08-06T05:17:40Z

@HaloFour It would produce a new warning at the minimum, so yes.

gulshan · 2018-08-06T06:14:25Z

Will positional and nominal records be mutually exclusive?

agocke · 2018-08-06T06:44:16Z

@gulshan I don't think they have to be, no. But the intersection of the two could be very different than either alone.

HaloFour · 2018-08-06T12:14:40Z

@agocke

It would produce a new warning at the minimum, so yes.

Ok. Just mentally reconciling the gymnastics around initialization with the end result of being able to add new members without breaking existing consumers, at least until they're recompiled. This feature seems closely related to records in F# which take a fairly different approach regarding initialization and consider the addition of members to be a hard breaking change which requires recompilation.

HaloFour · 2018-11-19T19:45:35Z

I know that I've beaten this poor horse to death but I wanted to note the contention that language-specific unspeakable encoding of features is causing in CoreFX to the extent that they are recommending that such features not be used in public APIs:

https://github.com/dotnet/corefx/issues/33553

agocke · 2018-11-19T19:50:13Z

@HaloFour You're not wrong. In fact, you've convinced me. The nail in the coffin for me is usage in older versions of the C# compiler. I don't want to completely lock out old compiler versions, so I think the builder should be speakable, but marked with [EditorBrowsable(false)] so it doesn't pollute IDE suggestions.

agocke · 2018-11-19T19:50:42Z

I didn't mention it in the notes, but in the LDM discussion on this I actually proposed making the builder speakable.

agocke · 2019-07-27T15:38:09Z

I'm gonna close this out because I have a new proposal that I'll add soon that doesn't require builders or marking anything unspeakable

Add data class proposal

dd84917

agocke requested a review from a team June 25, 2018 20:58

jcouv added the Proposal label Jun 25, 2018

CyrusNajmabadi reviewed Jun 25, 2018

View reviewed changes

svick reviewed Jun 26, 2018

View reviewed changes

jaredpar reviewed Jun 26, 2018

View reviewed changes

jnm2 reviewed Jun 26, 2018

View reviewed changes

Respond to comments

b6fab99

Fixes some typos and changes equality to be commutative even in the presence of subtyping.

CyrusNajmabadi reviewed Jun 26, 2018

View reviewed changes

agocke changed the title ~~Proposal: "data" classes for C#~~ Proposal: "nominal records" for C# Aug 4, 2018

CyrusNajmabadi reviewed Aug 5, 2018

View reviewed changes

alrz reviewed Aug 5, 2018

View reviewed changes

agocke closed this Jul 27, 2019

This was referenced Mar 18, 2020

Named tuple type alias #3276

Closed

C# Constructor Shorthand #3279

Closed


		There is also no way to create a copy of a data structure with readonly
		fields with only one item changed. A new type must be constructed manually.


		## Proposal

		To resolve many of these issues, I propse a new modifier for classes and structs: `data`.

Proposal: "nominal records" for C# #1667

Proposal: "nominal records" for C# #1667

Conversation

agocke commented Jun 25, 2018

agocke commented Jun 25, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

CyrusNajmabadi Jun 25, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

HaloFour Jun 26, 2018 • edited Loading

Choose a reason for hiding this comment

agocke Jun 26, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

HaloFour commented Jun 25, 2018

gulshan commented Jun 26, 2018

svick Jun 26, 2018 • edited by jcouv Loading

Choose a reason for hiding this comment

agocke Jun 26, 2018 • edited by jcouv Loading

Choose a reason for hiding this comment

svick Jun 26, 2018 • edited by jcouv Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gafter commented Jun 26, 2018

agocke commented Jun 26, 2018

agocke commented Jun 26, 2018

jaredpar Jun 26, 2018 • edited by jcouv Loading

Choose a reason for hiding this comment

jaredpar Jun 26, 2018 • edited by jcouv Loading

Choose a reason for hiding this comment

jaredpar Jun 26, 2018 • edited by jcouv Loading

Choose a reason for hiding this comment

jaredpar Jun 26, 2018 • edited by jcouv Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jnm2 Jun 26, 2018 • edited by jcouv Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jnm2 commented Jun 26, 2018 • edited Loading

HaloFour commented Jun 26, 2018

agocke commented Jun 26, 2018

gulshan commented Jun 26, 2018

agocke commented Jun 26, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

agocke Jun 26, 2018 • edited Loading

CyrusNajmabadi Jun 25, 2018 •

edited

Loading

HaloFour Jun 26, 2018 •

edited

Loading

agocke Jun 26, 2018 •

edited

Loading

svick Jun 26, 2018 •

edited by jcouv

Loading

agocke Jun 26, 2018 •

edited by jcouv

Loading

svick Jun 26, 2018 •

edited by jcouv

Loading

jaredpar Jun 26, 2018 •

edited by jcouv

Loading

jaredpar Jun 26, 2018 •

edited by jcouv

Loading

jaredpar Jun 26, 2018 •

edited by jcouv

Loading

jaredpar Jun 26, 2018 •

edited by jcouv

Loading

jnm2 Jun 26, 2018 •

edited by jcouv

Loading

jnm2 commented Jun 26, 2018 •

edited

Loading

agocke Jun 26, 2018 •

edited

Loading