Generalized generative constructor initializer code. #3002
Comments
I really like this proposal. I have heard from many users over the years that the constructor initializer syntax is one of the most unintuitive parts of the language. It only really makes sense if you have C++ experience and the set of people who do is not exactly growing these days.
Top-level super calls
I think requiring this to be at the top level would be annoyingly restrictive. I can see users wanting to write:
SomeClass(bool b) {
  if (b) {
    super('some', 'stuff');
  } else {
    super('different', 'things');
  }
}
Given that we're already basing the proposal around definite assignment analysis, I think a natural way to model this is to consider the super constructor call as "initializing
Closures
Closures are nasty, as always. I'd be inclined to just say that you can't close over an instance member at all until after the superclass constructor call. |
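For comparison, the branch-per-path shape of the `if`/`else` example in this comment is already accepted by definite assignment analysis for `final` local variables in current Dart. A minimal sketch (the function name `describe` and its strings are hypothetical, chosen only for illustration):

```dart
// A `final` local declared without an initializer. The analysis accepts the
// read at the end because every control-flow path assigns it exactly once.
String describe(bool b) {
  final String label;
  if (b) {
    label = 'some stuff';
  } else {
    label = 'different things';
  }
  return label; // Definitely assigned here.
}
```

Whether a `super(...)` call per branch could be modeled the same way is the open question raised above; this sketch only shows the existing behavior for locals.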
I understand the desire to have a more expressive language available for the initialization phase of construction, but I'm worried about the non-homogeneous semantics. In particular, any pre-super access to an instance variable declared in the current class would be similar to an access to a local variable, but every other access to that instance variable (at any location with access to
For example:
class A {
  final int i;
  A() {
    i = 0;
    print(i); // Prints '0'.
    super();
    print(i); // Same. No wait, if `this is B` then it throws!
  }
}
class B extends A {
  int get i => throw "Not the same as reading the instance variable like a local variable";
}
The fact that
In short, I don't think this kind of semantics is particularly readable, maintainable, debuggable, etc. We could consider a different way to get a similar level of expressive power, but maintaining the current semantics:
class A {
  final int i, j; // Declare two instance variables to show that it works with more than one.
  // Use a pattern assignment in the initializer list to set all instance variables in one step
  // (assumes that https://github.com/dart-lang/language/issues/2774 has been accepted).
  // Use a function literal returning a record to provide all the values in one step.
  // Then call `super()` as usual at the end of the initializer list.
  A(int arg): (i, j) = ((){
    ... // No access to `this`, but otherwise all of Dart.
    return (arg, arg + 1); // Return tuple for pattern assignment to `(i, j)`.
  }()), super() {
    ... // Normal constructor body code. No special rules.
  }
}
I'm not suggesting that anyone would be really happy about writing code in this style (even though it could actually be used exactly as shown if we assume #2774), but it could serve as a starting point for a mechanism whose semantics is as shown in the code above, and whose syntax is a non-redundant and readable abbreviation thereof. For example, we could simply consider using a block expression:
class A {
  final int i, j;
  A(int arg): (i, j) = {
    ... // Initialization code.
    return (arg, arg + 1);
  }, super() {
    ... // Normal constructor body code.
  }
}
A block expression is similar to a function literal with no formal parameters, but it is always executed (we just run the code when it is reached, there is never a function object). We might well want block expressions anyway, so why not use them here, together with pattern assignment. Return expressions in the block expression must fit the context (in this case: they must return an
Of course, we can mix and match these approaches: If we need to declare local variables and perform arbitrary computations in order to initialize some instance variables |
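The dispatch concern in this comment can be observed in current Dart, where a read of `i` in a constructor body goes through normal member lookup. A minimal sketch, assuming a hypothetical `outcome` field that records what happened instead of printing:

```dart
class A {
  final int i;
  String outcome = 'not run';

  A() : i = 0 {
    // In the body, `i` means `this.i`, so a subclass getter can intercept it.
    try {
      outcome = 'read $i';
    } on StateError {
      outcome = 'threw';
    }
  }
}

class B extends A {
  // Overrides the implicit getter of the field `A.i`.
  @override
  int get i => throw StateError('Not the same as reading the field directly');
}
```

Constructing `A()` leaves `outcome` as `'read 0'`, while constructing `B()` leaves it as `'threw'`, even though both run the same constructor body of `A`.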
I am not a fan of the injection of the implicit I think if you combine the explicit super with the mental model of definite assignment for |
The difference is that a But it is a point where the syntax gets confusing, and why I am more worried about Erik's point that So: Before And Bob's idea of considering That also makes me less worried about implicitly inserting the It's true that a statement-expression/block-expression with a record result, and pattern assignments in initializer lists, gives almost the same behavior. It doesn't allow local variables surviving across the |
This still feels way too magic to me, and it provides very little value. Requiring it to be explicit makes the code more readable/understandable. For instance, how does the debugging story work here? All of a sudden you just get launched into the super constructor? It means you can get exceptions in between synchronous lines of user code etc. (and in general, arbitrary code can run between user visible statements). All of that is pretty horrible IMO. |
We could say that if there is no But that changes behavior relative to the current behavior, where the Or we could insert it there, and if you want the constructor body to do initialization, you have to write a |
I would feel a lot more comfortable with it only being injected at the end. |
Is there something fundamentally new here, or is it 'just syntax'? Can generalized generative constructor be implemented by rewrite to a combination of existing constructors? I think closure scope possibly can't. One thing I have wanted but is not addressed here is the initialization of cycles, both self-cycles ( |
Most likely, now that we have records, everything can be rewritten into something we can do today.
class Foo {
  final int v1;
  final String v2;
  Foo(args) : {
    // initBlock assigns to this.v1, this.v2
    super(super1);
    // post-`super`-constructor-block
  }
}
could become
class Foo {
  final int v1;
  final String v2;
  Foo(args) : this._(_computeValues(args));
  Foo._(({int v1, String v2, bool $super1, void Function(Foo) $init}) values)
      : this.v1 = values.v1,
        this.v2 = values.v2,
        super(values.$super1) {
    values.$init(this);
  }
  static ({int v1, String v2, bool $super1, void Function(Foo) $init}) _computeValues(args) {
    final int $this_v1;
    final String $this_v2;
    bool $super1;
    // initialization block, with `this.v1 = ...` replaced by `$this_v1 = ...` and `super(value)` with `$super1 = value`.
    return (v1: $this_v1, v2: $this_v2, $super1: $super1, $init: (Foo self) {
      // post-`super()`-constructor-code, with `self` instead of `this`, and using `self.v1` to access fields.
    });
  }
  ///...
}
Where it gets a little tricky is Bob's idea to allow choosing dynamically between super-constructors. Another possible issue is local variables accessed across the That problem is in the semantic details of the rewrite, not something we definitely can't find a rewrite that solves. We just have to be very careful. |
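To make the shape of this rewrite concrete, here is a small version that compiles with current Dart 3 records. It is only a sketch under assumptions: the `Base` class, the `FooValues` typedef, the `log` field and the arithmetic are all hypothetical stand-ins for the elided initialization code, and the record fields are named `superArg` and `init` rather than `$super1` and `$init`.

```dart
// Hypothetical superclass taking one constructor argument.
class Base {
  final bool flag;
  Base(this.flag);
}

// Record carrying everything computed before the object exists.
typedef FooValues = ({
  int v1,
  String v2,
  bool superArg,
  void Function(Foo) init,
});

class Foo extends Base {
  final int v1;
  final String v2;
  final List<String> log = [];

  // Public constructor: forwards the pre-computed values.
  Foo(int arg) : this._(_computeValues(arg));

  Foo._(FooValues values)
      : v1 = values.v1,
        v2 = values.v2,
        super(values.superArg) {
    values.init(this); // Stands in for the post-`super` part of the block.
  }

  // Stands in for the pre-`super` part of the block: no access to `this`.
  static FooValues _computeValues(int arg) {
    final shared = arg * 2; // A local used by several field values.
    return (
      v1: shared,
      v2: 'value $shared',
      superArg: shared > 2,
      init: (Foo self) => self.log.add('initialized with ${self.v1}'),
    );
  }
}
```

With these stand-ins, `Foo(3)` ends up with `v1 == 6`, `v2 == 'value 6'`, `flag == true` and a single `log` entry. Note that the closure passed as `init` reads fields through `self`, which is how the rewrite sidesteps the question of locals and fields shared across the `super` call.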
This proposes a generalization of object initialization, which allows more powerful and expressive computations during initialization, while still maintaining a separation between code running before an object has been fully initialized (no access to `this`) and after (the current constructor body).
Motivation
The current syntax for initializing instance variables in non-redirecting generative constructors, the "initializer list", is very restricted in what it can express.
The only allowance is assigning the result of a single expression to one instance variable. If two fields need to share a value in any way, say one containing a stream-controller, and another a value depending on the stream of that controller, it cannot be expressed in a single constructor.
The immediate workaround is to use a factory constructor which does all computation, and then calls a private generative constructor which just initializes fields with pre-computed values.
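The factory workaround can be sketched in current Dart using the stream-controller case mentioned above. The class name `FactoryTicker`, the doubling transform and the `hasListener` getter are hypothetical, added only to make the sharing visible:

```dart
import 'dart:async';

class FactoryTicker {
  final StreamController<int> _controller;
  final Stream<int> doubled;

  // Private generative constructor: only stores pre-computed values.
  FactoryTicker._(this._controller, this.doubled);

  // Factory constructor: creates the shared controller first, then derives
  // the second field from it.
  factory FactoryTicker() {
    final controller = StreamController<int>();
    return FactoryTicker._(controller, controller.stream.map((n) => n * 2));
  }

  void add(int n) => _controller.add(n);

  // True once something listens to the controller's stream.
  bool get hasListener => _controller.hasListener;
}
```

Listening to `doubled` makes `hasListener` true, which shows that both fields refer to the same controller. The cost is that subclasses cannot use `FactoryTicker()` as a super-constructor.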
If a public generative constructor is needed, another workaround is to use a forwarding generative constructor which creates the shared object and passes it to another constructor, which can then refer to it through the parameter, like a kind of "let" constructor using constructor chaining.
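The forwarding variant keeps the public constructor generative. A minimal sketch in current Dart, where `LetTicker`, `CountingTicker` and the `hasListener` getter are hypothetical names used only for illustration:

```dart
import 'dart:async';

class LetTicker {
  final StreamController<int> _controller;
  final Stream<int> doubled;

  // Public generative constructor: creates the shared object and forwards it.
  LetTicker() : this._(StreamController<int>());

  // Private generative constructor: the parameter acts like a "let" binding
  // that both initializers can refer to.
  LetTicker._(StreamController<int> controller)
      : _controller = controller,
        doubled = controller.stream.map((n) => n * 2);

  // True once something listens to the controller's stream.
  bool get hasListener => _controller.hasListener;
}

// Subclassing through the public constructor still works, because it is
// generative rather than a factory.
class CountingTicker extends LetTicker {
  CountingTicker() : super();
}
```

This works, but it needs two constructors for one logical initialization, which is the overhead the proposal aims to remove.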
Proposal
Allow initialization of instance fields to happen inside the constructor body, as well as in an initializer list.
To do that, the super-constructor invocation is allowed to be moved into the constructor body as well.
The grammar is changed such that a non-redirecting generative constructor with:
`super` constructor invocation
can be followed by a constructor body block which can contain at most one `super` constructor invocation as a top-level "statement" in the block.
If such a constructor contains zero `super` constructor invocations, one is inserted automatically at the latest possible place where it would be allowed in the body block.
A `super` constructor invocation statement has the form `super(args);` or `super.name(args);`. That is, the same syntax as the entry in the initializer list, followed by a semicolon.
All existing syntax remains valid. A constructor with no body block, just a `;`, will still get its `super` constructor invocation appended to the initializer list. A constructor with a `super` constructor invocation in the initializer list will work exactly like today.
The behavior of such a constructor is that:
`super` constructor invocation is initialization code.
`this` is restricted as follows:
`this.x` is allowed, where `x` is an instance variable of the current class.
Any other use of `this` is a compile-time error.
`super.foo()` invocations.
`x`, resolving to instance variables are allowed, as equivalent to `this.x`. Those will be shadowed by parameters or other local variables as normal (unlike initializer lists which allow `x = x` instead of `this.x = x`).
similar to local variables.
definitely assigned to, using the same rules as for definite assignment of local variables.
`final`, definitely uninitialized `final`, and potentially unassigned `late final` variables may be assigned to.
`late`, variables may be read. (This is new!)
`super` constructor invocation, all non-nullable instance variables must be definitely assigned.
`super` constructor invocation then chains to the superclass object initialization as normal.
`this` freely. Because they're all either definitely assigned or `late`. (Obviously an implementation can remember information for optimizations, like knowing that a `late` variable is definitely assigned.)
`super` constructor invocation are still in scope and can be accessed.
If neither the initializer list, nor the constructor body block, contains a `super` constructor invocation, an invocation of `super()` is inserted as late as possible.
If there is no constructor body block, it's inserted at the end of the initializer list as normal.
If there is a constructor body block, it's inserted at the latest possible point in that block, which means just before the first statement of the block which references `this` or `super` in a way that is not allowed in the initialization code. If there is no such statement, the `super()` constructor invocation is inserted at the end of the constructor body block.
That is, the only change in behavior occurs when the constructor body block contains a `super` constructor invocation, which is entirely new syntax, or the constructor does not contain any `super` constructor invocation at all. In the latter case, the `super` constructor invocation may be moved to later in the body, so some local computation may now happen before the super-constructor invocation, but it's only about computation ordering, the computations should not affect each other, unless they do so through global state.
A `const` constructor must still not have a body, which restricts them to the existing initializer list and no statement control flow.
Consequences
With this change, you never need more than one constructor to construct an object.
You can still have multiple constructors, doing different things, but you never need to add an extra private constructor just to do more complicated computation before initialization.
Closures
I conspicuously avoided mentioning closures.
If one creates a closure in the initialization code of a constructor body block, which references an instance variable, and then calls the closure after the `super` constructor invocation, what happens?
Preferably it should just work as if the closure had always referenced the same instance variable. But it's not unreasonable that the object doesn't yet exist during initialization, and the `this.x` variables are really place-holder local variables that are being initialized, and only stored into an object later, when it's been allocated.
Maybe that issue solves itself, if creating a closure containing an instance variable will always treat the variable as potentially, but not definitely, unassigned, so any attempt to read it will fail. An attempt to write might be valid, though, if the variable isn't `final`.
The most direct solution is to say that it "just works", but that may force a specific implementation approach onto back-ends, where the object is always allocated first, and variables during initialization are backed by the object instance's memory slots.
In most cases, it just won't matter, because capturing instance variables during initialization is incredibly rare, and reusing the closure afterwards is even rarer. And if the compiler can optimize the remaining constructors, a few de-optimized cases won't be a problem.
(But maybe accidentally capturing becomes a bigger issue if we start allowing more kinds of code, like doing `someList.any((x) => x.name == inputName)` where `inputName` is an already initialized non-`final` instance variable. We can't directly see that this closure won't escape to be called after object initialization, so we may need to treat the `inputName` field less efficiently during instantiation. But if `inputName` is `final`, we can choose to just close over the value, not the variable, which must be definitely assigned already for the code to even be valid.)
I'd suggest that when accessing instance fields during initialization is allowed, we should also allow closing over them, and then take whatever hit it costs us if someone does that.
An alternative is to not allow reading initialized variables, only allow writing to them. It's slightly less ergonomic, but it's what we do today, and it isn't too bad.
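The distinction drawn in this section between closing over a variable and closing over its value can be illustrated with local variables in current Dart. This is only an analogy for the proposed field semantics, and `makeReaders` is a hypothetical name:

```dart
// Returns two closures: the first captures the variable `counter` itself,
// the second captures a copy of its value taken before the later write.
(int Function(), int Function()) makeReaders() {
  var counter = 1;
  final snapshot = counter; // Copy of the current value.
  int readVariable() => counter; // Closes over the variable.
  int readValue() => snapshot; // Closes over the copied value.
  counter = 2; // Later write, visible only through the variable.
  return (readVariable, readValue);
}
```

Calling the first closure yields `2` and the second yields `1`. For a `final` field that is already definitely assigned, the two strategies are indistinguishable, which is why closing over the value is a possible implementation choice there.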
Variants and extensions
Don't allow reading initialized instance fields
Instead of allowing you to read `this.x` in initializer code if it's already definitely assigned, we just don't allow that.
The only valid use of `this.x` is to assign to it.
That also means that capturing an instance variable is less likely to happen. You have to capture a write, `this.x = v`, which only makes sense if the variable is non-`final` or `late` (because otherwise the closure itself forces `this.x` to be potentially assigned, in case the closure is called more than once).
It's still possible to refer to local variables with the same value, it just requires changing:
to
Which isn't bad.
(Or go all-in on brevity and do:
)
"Factory" generative constructors.
Sometimes you don't want to expose a public generative constructor, because you don't want people to subclass your class through that constructor.
With Dart 3, you can make the class `final` or `interface` to prevent that entirely, but if you want to push subclassing to use a specific constructor, and still expose another constructor for creating instances, you'd have to make the other constructor a factory constructor.
We could allow you to write `factory` on a generative constructor:
The `factory` modifier is put after the constructor parameters, because putting it in front will make the second line above conflict with the existing factory syntax of `factory Foo(args) { body-returning-value }`.
.The effect would be that this particular constructor cannot be used as a super-constructor by subclasses (maybe only "outside of the same library", like other access modifiers).
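For comparison, the current workaround described at the start of this section looks roughly like the following in today's Dart. The names `Shape`, `Circle` and `forSubclass` are hypothetical:

```dart
class Shape {
  final String label;

  // Generative constructor intended for subclasses.
  Shape.forSubclass(this.label);

  // Private generative constructor, used only by the factory below.
  Shape._(this.label);

  // Factory constructor for creating instances. A subclass cannot use a
  // factory constructor as its super-constructor.
  factory Shape(String name) => Shape._('shape:$name');
}

class Circle extends Shape {
  Circle() : super.forSubclass('circle');
}
```

Here `Shape('x')` creates an instance through the factory, while `Circle` has to go through `Shape.forSubclass`. Code in the same library could still call `Shape._`, which matches the "maybe only outside of the same library" caveat above.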
Initializer list blocks
Rather than, or in addition to, moving initialization into the constructor body, we could allow code blocks inside the initializer list.
Each initializer list block will be treated the same as the initializer code inside the body proper. It can initialize instance variables, and access ones already initialized earlier in the initializer list.
At the end of it, some instance variables will have been definitely or potentially assigned, and that carries forward to the rest of the initializer list, and the body initializer code, if any.
Local variables in initializer list blocks are not visible in later initializers.
Unless we want them to be.
The syntax is a little hard to read, e.g., `Foo(args): {initblock}, {initblock} {bodyblock}`. The separation between initializer block and body block is hard to read. This readability issue is the primary reason why the proposal doesn't try to split initialization code into its own block.