-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal/Discussion: Builder-based Initialization #1689
Comments
100% support from me on this one. I had the same idea back when code generators would come around, though a bit different. My idea was something like Since it seems like generators are way too far out now, a formal proposal to bake something like this into the compiler is definitely the next best thing. |
I very much like this idea, particularly when compared to some uglier alternatives... 👀 I don't want to derail the discussion from the basics, but could this include array- and dictionary-initializer syntax too? If so, |
I'd prefer a builder method over builder type, and don't want an additional allocation. |
A lot of magic is going on here. Looking at Point, you have no idea a Pointbuilder exists, so the code leaves you clueless as to why this works. Somebody who has never yet seen this code syntax wouldn't even know what to search for, whereas somebody who does still doesn't know where Pointbuilder is. Also what happens if two Builders exist? How is the correct one determined? I feel that if this syntax is added, it should be done by changing the Point class, not the PointBuilder. |
I think the nested-type and attribute questions would help clear that up. |
Nothing would preclude the builder type from being a
I intentionally left a lot of the details as to how the compiler might identify the builder for a type for the questions at the bottom. I agree that it should be something that is easily recognized, but I don't want the conversation to get bogged down on any particular details. Could be a requirement that the builder type be nested in the target type, or that the builder type needs to be indicated by an attribute, or that the target type needs a static method that returns a builder type, or any combination of the above, or some other mechanism that I haven't mentioned. I welcome ideas in this space. |
It's probably worth updating your examples to use a |
How upcoming "Non-nullable ref" type member initialization will be ensured with either approach, builder based initialization or readonly object initializers? The proposal doesn't talk clearly about auto-generating the builder type. If generation is not part of the proposal and all it is about simply converting |
Just my opinion, constructors with named optional parameters is the most convenient way to address the problem. And if we are talking new feature, auto-generating properties from primary constructor is the way to go IMHO. |
This proposal doesn't deal with generation of builder types. This is intended to be a stepping stone on which that conversation could happen, an alternative to the approach described for "data classes" as already proposed.
For types of relatively few members I probably agree. But for larger types it becomes cumbersome, especially as the type might evolve over time by acquiring more members.
This isn't intended to preclude primary constructors at all. I hope that they also come to the language and enable case classes, ADTs, DUs, etc.. But as with normal constructors they would be cumbersome for "data classes" with a large number of members. And the conversation seems to be that "data classes" aren't positional as primary constructors are. |
I agree that maintaining constructors for large types are cumbersome. But doesn't using builder pattern require similar (if not more) effort? |
More code, yes. But you don't have to be concerned about overloading the constructor or ordering of parameters. I'm not trying to knock constructors here, they are often a great tool for this job. But after I exceed a certain number of parameters and the constructor starts to look like a nested type unto itself that's usually when I start to reach for a builder. My goal here is to gain more traction with the idea of builders and hopefully get that considered with "data" classes so that all you have to code is the following: public data class MyDataClass {
public string Member1 { get; }
public string Member2 { get; }
public string MemberN { get; }
} And that will generate all of the builder boilerplate necessary so that you can initialize an instance with object initializer syntax as follows, despite the class being immutable: var mdc = new MyDataClass { Member1 = "Foo", Member2 = "Bar", MemberN = "Baz" }; |
What if the builder class had the exact same layout and members as the immutable class, and you could just blit one into the other. Is that totally horrendous, or a possibility? |
Retracting this as I now realise it would cause problems with the constructor not being invoked. |
I'm mostly concerned about the shape of the builder and how the compiler would interact with it. What the builder does internally to actually initialize the immutable type would be entirely opaque. I don't think that we'd want the compiler to emit that code, at least not until the compiler is generating builders for us. |
@yaakov-h Re: blitting, I'd have the same concern as here: #1683 (comment) |
The builder type should accept a constructed instance of the target type so that something like this: var mdc = new MyDataClass { Member1 = "Foo", Member2 = "Bar", MemberN = "Baz" }; can execute constructor logic prior to object initialization running. I think that above code should lower into something like this: var constructed = new MyDataClass()
var mdcBuilder = new mdcBuilder(constructed)
mdcBuilder.Member1 = "Foo";
mdcBuilder.Member2 = "Bar";
mdcBuilder.MemberN = "Baz";
var mdc = mdcBuilder.Build(); With structs that design is fairly straightforward, you can simply reinterpret the struct data as the correct type so long as it has the same structure and nothing observes a reference to the type (and even then the worst thing that would happen is something would hold an old value). I think that approach would work for classes as long as nothing closes over
The devil is that those things can be OK as long as they don't hold a reference to edit: I guess it isn't really that important that the constructor runs before the build method... is that observable now? edit2: class Program {
static void Main(string[] args) {
var d = new MyClass { Member1 = AssignMember1() };
Console.ReadKey();
}
static string AssignMember1() {
Console.WriteLine("AssignMember1");
return "";
}
}
class MyClass {
public MyClass() { Console.WriteLine("Constructor"); }
public string Member1 { get; set; }
} output:
Yep. |
That seems really weird to me, necessitating that the target type expose an empty constructor when that might not be desired and causing two allocations instead of one if the target type is a What if the builder could be created from a var mdcBuilder = MyDataClass.NewBuilder();
mdcBuilder.Member1 = "Foo";
mdcBuilder.Member2 = "Bar";
mdcBuilder.MemberN = "Baz";
var mdc = mdcBuilder.Build(); That would allow an arbitrary and opaque relationship between the builder and target instances. |
That could work, I think you still need to somehow call a constructor on |
My reasoning for the constructor call requirement is that in an existing object initializer call, the constructor invocation is observable and understood to readers of the code: var mdc = new MyDataClass(arg1, arg2) { Member1 = Foo(), Member2 = "Bar", MemberN = "Baz" }; is a valid statement today and can be seen to call the constructor that accepts those arguments prior to the method Additionally if the builder accepted a constructed instance, it would work as an abstraction to support a var mdc = new MyDataClass(arg1, arg2) { Member1 = Foo(), Member2 = "Bar", MemberN = "Baz" };
//becomes
var temp1 = new MyDataClass(arg1, arg2);
var mdcBuilder = MyDataClass.NewBuilder(temp1);
mdcBuilder.Member1 = Foo();
mdcBuilder.Member2 = "Bar";
mdcBuilder.MemberN = "Baz";
var mdc = mdcBuilder.Build();
//...
var another = mdc with { Member1 = "Zap" };
// becomes
var mdcBuilder = MyDataClass.NewBuilder(mdc);
mdcBuilder.Member1 = "Zap";
var mdc = mdcBuilder.Build(); edit: gist: https://gist.github.com/bbarry/75db9239ad9da9b58d9312be4e4601c9 |
As to questions: (note updated gist: https://gist.github.com/bbarry/75db9239ad9da9b58d9312be4e4601c9 not really happy with the copier class but that could be improved with some CLR intrinsic or runtime generation)
If the class has a static builder method, it doesn't matter.
I think a closed set of shapes would be ideal: shape SType {
public static SBuilderType ToBuilder(SType instance);
}
shape SBuilderType {
public SType Build();
} (names tbd) The parameter I think this functionality should be opt in somehow if/when/where the compiler is to generate the builders.
Yes, there may not be a 1-1 correspondence; the expressions inside the initializer should be referring to the builder type.
Instance method on the builder. Otherwise the initializer statement would either be observable to invoke in different orders for builder types vs non-builder types OR the constructor would be observed to execute twice.
No. The initializer should function as if it is an initializer for the builder type. I think the compiler should generate these members for the target type on the builder type but that is a discussion for whatever mechanism triggers that. |
But doesn't the example break design that doesn't allow certain members to be set from outside? Allowing to set get-only properties from the object initializer breaks stuff, I'm pretty sure. Or did I misunderstand something? |
I think you are misunderstanding something (though I might be, of course). For the type: class SomeType
{
public string A { get; }
public string B { get; }
public SomeType (string a) (A, B) = (a, "");
} Then the resultant builder would have to be: public struct SomeTypeBuilder
{
public string a { get; set; }
public SomeType Build() => new SomeType(a);
} and it would be used like: var x = new SomeType { a = "foo" }; which gets lowered to: var x = new SomeTypeBuilder { a = "foo" }.Build(); Since |
@DavidArno if it is that way, I agree its a useful proposal. Point 5 on the proposal however assigns get only properties. I am against this proposal if assigning get only properties via this syntax with no corresponding constructor on the type is allowed. @HaloFour : |
There are no "points" on the proposal, only questions.
The example omits a constructor for brevity, just as it omits the declaration of
That is part of question 2. |
For me, answering this question has a huge impact on how this feature would work. What I want the answer to be is "no, the compiler will generate that builder type (once per assembly?) on the fly when it encounters initialization syntax that assigns to read-only properties. However, I suspect that this would then have significant complexity/performance implications for VS/intellisense and so is likely a non-starter. And I'm unclear as to whether I'm actually the only person that even wants this feature to work this way. |
I'm not sure I get how that would work. The builder type is intimately linked to the constructor and serves to bridge between named and positional. If the compiler is expected to generate the builder based on consumption that means that the compiler already knows how to map the properties to constructor arguments, in which case an intermediate builder type becomes unnecessary. |
In that case, I'm confused as to how this proposal would work. If I take your example class, public class Point {
public int X { get; }
public int Y { get; }
public Point(int x, int y) => (X, Y) = (x, y);
} How will the builder be auto-generated, other than by mapping properties to constructor parameters? Without |
@DavidArno that is interesting, so a builder compatible type (for example a readonly struct in issue #1684) in assembly namespace N1
{
[BuilderCompatible]
public readonly struct Name
{
public string FirstName { get; }
public string LastName { get; }
}
} And then when a usage is seen some internal types to the using assembly would be generated: internal static class Extensions
{
public static NameBuilder ToBuilder(in this Name instance) => Unsafe.As<Name, NameBuilder>(ref Unsafe.AsRef(instance));
}
internal struct NameBuilder
{
public string FirstName { get; set; }
public string LastName { get; set; }
public Name Build() => Unsafe.As<NameBuilder, Name>(ref this);
} This implementation has a dependency on the structure of the internal static class Extensions
{
public static (string FirstName, string LirstName) ToBuilder(in this Name instance) =>
Unsafe.As<Name, (string, string) >(ref Unsafe.AsRef(instance));
public static Name Build(ref (string, string) instance) =>
Unsafe.As<(string, string) , Name>(ref instance);
} I see that as no better than simply allowing one assembly to modify the unspeakable private implementation of a type in another assembly while ignoring the initonly attribute. How could it be written some other way, reflection maybe (presumably a clr change that internalizes that specific method)? |
@bbarry, generating compile-time info like For the record, I really don't like the look of that ` |
I am bypassing the constructor. The constructor is observable. It could have side effects that can be seen if it happens more than once or out of order. I don't see a way around them without severely limiting this builder pattern to only types that have no additional constructors. |
I am looking at the whole picture as this:
Is there another way to view the problem? |
This proposal doesn't explore the generation of a builder ... yet. It's assumed in that example that I'm hoping that when proposals like "data classes" are discussed in combination with this one that the conversation around generating builders will happen. In that case the compiler would generate the builder and the constructor(s) for a given type based on the definition of the type. The exact shape/relationship between the builder and constructor would be an implementation detail (well, depending on question 4 anyway). Potentially, given data class: public data class Point {
public int X { get; }
public int Y { get; }
} The compiler might emit something akin to this: [MyBuilderIs(typeof(Point.PointBuilder))]
public class Point : ILotsOfInterfaces<Point> {
public int X => _builder.X;
public int Y => _builder.Y;
private readonly PointBuilder _builder;
private Point(in PointBuilder builder) => _builder = builder;
public struct PointBuilder {
public int X;
public int Y;
public Point Build() => new Point(this);
}
// lots of other members elided for brevity
} In that way the public surface only cares about the members of
While that concept gives me the hibbie jibbies, if construction happens through an instance method of the builder it doesn't really matter what you decide to do, or what the compiler might decide to do internally in the case of a generated builder. As long as the public surface is clean and it doesn't bypass the intention of the developer it's all fair game. |
Oh. Oops. I missed that small detail. Sorry, but I've now gone off this proposal as it doesn't really do anything other than save a few keystrokes in that case 😞 |
Yep, this proposal is pretty useless on its own. But it opens the door for exploring situations where the compiler can auto-generate those builders, which is where it gets more interesting. And keeping the builder syntax candy nicely separated from those specific scenarios also allows developers and tools to define builders for their own situations, e.g. EntityFramework could perhaps have T4 generators to emit immutable versions of entity classes without necessarily having to opt-in to whatever other opinions might be along for the ride with "data" classes or other similar proposals. |
@HaloFour in an implementation where the builder pattern would only be used for a better object initialization dev experience it doesn't matter at all where construction happens because the developer has full control when it is hand written and we can constrain the language for where it would generate code to avoid the issues. It does matter if the builder pattern is used with a @DavidArno As to where this particular proposal is useful on its own, it is useful because it provides an abstraction layer that gives developers in one assembly the ability to control how objects initialize in other assemblies. For example I could decide that initialization of certain properties must happen in a particular order and can enforce that by providing my own builder type: public class Point {
public int X { get; }
public int Y { get; }
public Point() { }
public Point(int x, int y) {
X = x;
Thread.MemoryBarrier();
Y = y;
}
public static PointBuilder ToBuilder(Point p) => new PointBuilder(X, Y);
}
public class PointBuilder {
public int X { get; set; }
public int Y { get; set; }
internal PointBuilder(int x, int y) => (X, Y) = (x, y);
public Point Build() => new Point(X, Y);
} A user cannot produce a var p = new Point {
Y = 1,
X = 2
}; |
I'm certainly not knocking this proposal, or @HaloFour. And I'm still upvoting it. I'm just frustrated with myself that this is the second thread on the topic of initializers where I've jumped to the wrong conclusions about the intent of the OP and effectively derailed things. I need to read things more carefully. |
For better or worse the team has decided on using |
Currently object initializer syntax in C# requires that the members be writable. This doesn't work for types that are intended to be immutable/readonly. Usually such types accept their data in a constructor, but for a type with a lot of members this can be quite cumbersome for consumers and is also fragile. Another approach is to define another type duplicating the members of the immutable type with mutable members which can be initialized directly and the "build" into an instance of the immutable type. Here's a simple example:
You can then create an instance of the class using the builder as follows:
There might not seem to be an advantage to this approach with two members, but for a class containing several dozen members it does simplify the creation of those types. And since the object initializer on the builder is not positional the type and builder can both be modified to include additional members without breaking any existing consumers.
I propose that the C# language learn to recognize the builder pattern by convention and/or metadata and offer support for it through object initializer syntax, thus allowing:
This would be translated into the use of the
PointBuilder
class, as above.Advantages:
Disadvantages:
Questions:
While I think that the compiler could recognize a "builder" type based on its shape it is possible that the compiler might pick up existing implementations of that pattern and interact with them incorrectly if used in an object initializer. In my opinion the likelihood of this would be fairly rare, and would represent little harm. But it may be useful to require that either the immutable type or the builder be annotated with a new attribute:
I welcome open discussion/debate on the subject.
The text was updated successfully, but these errors were encountered: