Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow if-scoped variable declarations to help address field promotion #1991

Open
leafpetersen opened this issue Nov 24, 2021 · 22 comments
Open
Labels
field-promotion Issues related to addressing the lack of field promotion

Comments

@leafpetersen
Copy link
Member

Breaking out a proposal from discussion in this issue, we could consider adding the ability to declare a variable in the condition of an if statement which is scoped to the if and the else branch:

if (var x = obj.x; x != null) {
    // x is in scope here
} else { 
  // and here
}
// But not in scope here

As usual, you can replace var with final.

Related features

Go has a version of this, as does C#.

Questions

Should you be able to declare multiple variables?

Possible extensions

@lrhn proposed a generalization of this in which we introduce a general let syntax using var x = e1; e2 where x is bound in e2, and then further specify that in a few specific constructs (if statements at least) we extrude the scope into the subsequent blocks.

Discussion

This is a fairly minimal change. It doesn't really eliminate any verbosity, and requires you to explicitly write the name of the variable twice:

class A {
  int? myNullableField;
  void test() {
     if (var myNullableField = this.myNullableField; myNullableField != null) {
      // Stuff
     }

There is nonetheless a general consensus in discussions that moving the variable binding into the if statement is materially better than writing it as a standalone statement before hand: the latter feels somehow more ad hoc.

This proposal doesn't help with the guard-let use case where a negative check is done and the body of the if throws. Possibly we could allow the variable to be in scope in the continuation even when the failure continuation is the enclosing block to handle this case? Or possibly only when the failure continuation is the enclosing block and the success continuation never completes (i.e. always either returns or throws)?

cc @munificent @lrhn @eernstg @jakemac53 @natebosch @stereotype441

@leafpetersen leafpetersen added the field-promotion Issues related to addressing the lack of field promotion label Nov 24, 2021
@lrhn
Copy link
Member

lrhn commented Nov 25, 2021

This proposal doesn't help with the guard-let use case where a negative check is done and the body of the if throws.

Could you please provide an example, just to illustrate what this means?

if (var field = myNullableField; field == null) {
  return null;
}
// use `field` as non-null here.

You can do that for local variables, because they are also in scope after the if statement, so all we do is promote them.
For the (var x=e1;e2) expression, the x is only in scope in e2, and we have to explicitly extend it to the body of an if statement if we need it. To allow the guarded escape promotion, we'd also have to extend the scope to after the if statement (but only if the if-statement definitely doesn't complete normally?)

@eernstg
Copy link
Member

eernstg commented Nov 25, 2021

I don't think there's a real problem here: If you want to use the variable after the end of the if statement then the variable should be declared outside the if statement:

var field = myNullableField;
if (field == null) return null;
// use `field` as non-null here.

This kind of scoping is unsurprising, and the "more global" declaration of field justifies the more global availability.

I just changed #1210 to restrict the scope of variables introduced by a binding expression to the nearest <expressionStatement> if it occurs in such a statement. It was already restricting the scope to the nearest enclosing composite statement (<ifStatement>, <forStatement>, etc). So if a binding expression introduces a variable in the guard of an if then it is in scope anywhere in the if (then and else), and nowhere outside; and if the variable is introduced inside an expression statement (say, in the then part of an if or at top level in a function, whatever) then it is confined to that expression statement alone.

@mit-mit
Copy link
Member

mit-mit commented Nov 25, 2021

This proposal doesn't help with the guard-let use case where a negative check is done and the body of the if throws.

This might be viewed as a benefit: it means that putting the variable declaration inside the if-statement also expresses that the variable is only in scope inside that if-statement.

@jodinathan
Copy link

I really wish we had something simpler instead of full declaring the variable.
The reason is that I don't see myself using this instead of a normal variable declaration:

if (obj.prop != null) {

At this moment I think "I have to declare prop to use it".
Then why would I do this:

if (final prop = obj.prop; prop != null) {

Instead of:

final prop = obj.prop;

if (prop != null) {

?

The only place I see myself using this, in the current state, is in List/Map declaration.

We need something that makes sense for using the declaration inside the if. Maybe using := like Go. That would already be something.

At least our team aways tend to make if statements shorter as possible to better readability.
I would probably use a lint to not use this feature when not necessary.

@jodinathan
Copy link

Another point is that you don't have to write "final": for local declarations "var" is a common practice regardless of "finality".

In our team local var is only allowed if the variable will change, if not, it is a rule to use final.
We are used to that but we could change that rule if it was something very beneficial, which for now I don't think is the case.

@lrhn
Copy link
Member

lrhn commented Nov 25, 2021

I don't write final for local variables because I can't be bothered. The net gain is so minimal, that even having to consider whether to write var or final makes it not worth it.
Now, if x := e meant final x = e, then the size-gain might tip the scales.

@jodinathan
Copy link

We also didn't like final at first. It is bigger than var and doesn't make a difference to the compiler.
However, IIRC, when we opted for pedantic and it had the omit local types lint we decided to give final a chance and after a few days using it we really liked the result.
As our rule is final every time except when it can change, once you see a var or a type you know that it may change within that scope. This made reading the code much, much easier.

@bwilkerson
Copy link
Member

have you ever discussed changing syntax highlighting to emphasize the identifiers at the point of declaration?

The analysis server already produces different highlight region types for IntelliJ. It appears that by default the colors are the same, but users can adjust the color for any region type (Editor / Color Scheme / Dart). For LSP-based clients the analysis server produces a SemanticTokenModifiers.declaration modifier at declaration sites. I don't know whether individual clients allow customization based on that modifier or what the defaults are.

@srawlins
Copy link
Member

On the brevity of variables with a short scope: I often read and write Dart code with single-letter variables in for-each loops, and closures:

for (var e in someIterator) { ... }

someIterator.forEach((e) { ... });

but I almost never use single-letter variable names for anything else (other parameters, other local variables, declared in a variable declaration statement, fields). So perhaps developers would lean towards using single-letter if-variables in this way, which would be another benefit to the if-variable over a preceding variable declaration statement.

if (var p = obj.propertyNameWhichIsLong; p != null) {
  p.methodName(p.prop1, p.prop2);
  // more uses of p, without `p!`.
}

@lrhn
Copy link
Member

lrhn commented Nov 29, 2021

I think single-character variables are underused in Dart. The "avoid abbreviations" style guide makes them hard to argue for, but in many cases, a small scope and a clear declaration makes longer variable names unnecessary.
The for/in is an example:

for (var p in properties) {
  something(p);
  p.other;
}

If the body of the loop is short, every reader will remember what p means, or they can lift their eyes by ~3 degrees and see it directly (var p ... stands out because there is space on both sides of p, which only really happens in declarations). Same for catch (e, s) which I personally consider idiomatic. Clear meaning which is easily seen from very close context, no ambiguity.
I'd definitely be using it for var p = something.long.property; p != null) { few lines using p } too.

I'd probably not allow x := e1, y := e2 as a declaration. There are voices who wants to remove var x = e1, y = e2; too, because it makes it harder to scan the declarations. If you write

var x = e1;
var y = e2;

instead, it takes two lines, but it also means the variable names are aligned and next to a var. That does make them easier to find and read, instead of having to parse past the first initializer expression to find the next variable.
(I'm personally undecided. There are definitely times where I care more about brevity and convenience than readability, like when writing stand-alone scripts or tools for myself. If you don't want to use multiple vars on the same line, I'm sure there is a lint for that. I like to have the option.)

@munificent
Copy link
Member

It's an entirely tractable feature, but I just can't convince myself that it adds much value. All it really does is slightly reduce the scope of the local variable. But function bodies tend to be fairly short in Dart anyway, so this is a marginal benefit.

If we didn't already have control flow based type promotion, it would be powerful. But as it is... meh.

@munificent
Copy link
Member

By this logic, "for" loop is also ... meh.

for-in loops provide a lot of value because what they desugar to is fairly complex, verbose, and easy to get wrong. C-style for loops are only somewhat marginal... but they are already in the heads of millions of users, which is itself a large benefit.

@lrhn
Copy link
Member

lrhn commented Nov 30, 2021

C-style for loops actually interferes with my idea to allow while (var x = something; x != null) { ... } where the var declaration is executed on each iteration. It looks almost like for (var x= something; x != null;) { ... } where the var declaration is only executed once. ☹️

Maybe #171's do { var x= something } while (x != null) { x.doThing(); } is a better choice for that.

@stereotype441
Copy link
Member

Personally I would rather see the variable added to the enclosing scope, because it would allow usage patterns like this:

if (var x = obj.x; x == null) {
  return;
}
// x is in scope here

In fact, this feature could be trivially generalized to allow the form (var varName = value; expr) to appear in the place of any subexpression, e.g. this would be allowed:

if (x is! int || (var y = x + 1; y < 0)) {
  // Note: y is in scope here but it can't be used because it's not definitely assigned
  return;
}
// y is in scope here and can be used

and would behave equivalently to:

int y;
if (x is! int || (y = x + 1) < 0) {
  // Note: y is in scope here but it can't be used because it's not definitely assigned
  return;
}
// y is in scope here and can be used

@lrhn
Copy link
Member

lrhn commented Nov 30, 2021

@stereotype441

I like the idea of having (var x = e1; e2) as an <expression> that hoists its variable to the surrounding scope (which is the entire current block scope, like every other variable declaration).

It's very, very close to allowing (var x = e) as an expression with the value e, but it's more powerful because it allows
(var x = e1; test ? x < 0 : x > 0). two occurrences of x where neither dominate the other.

Technically, inside the if, before the return, y is in scope but is not definitely assigned, but it's also not final, so could you assign to it? (I'd personally prefer to make it an error to access such a variable if it's not definitely assigned, even for assignment. That differs from statement-level declarations, and is not strictly necessary, we could say the variable can be accessed any place that is syntactically after the var x = e, it's just potentially or definitely unassigned in some of those places.)

@leafpetersen
Copy link
Member Author

Technically, inside the if, before the return, y is in scope but is not definitely assigned, but it's also not final, so could you assign to it? (I'd personally prefer to make it an error to access such a variable if it's not definitely assigned, even for assignment. That differs from statement-level declarations, and is not strictly necessary, we could say the variable can be accessed any place that is syntactically after the var x = e, it's just potentially or definitely unassigned in some of those places.)

Agreed, I don't think re-using definite assignment here does what you want (consider also the case that the type is nullable, and hence not required to be definitely assigned).

In fact, this feature could be trivially generalized to allow the form (var varName = value; expr) to appear in the place of any subexpression, e.g. this would be allowed:

Do you mean any sub-expression anywhere (i.e. this proposal) or just inside of if conditions? The former is discussed in #1420 and I've given my thoughts there. The latter feels odd to me - it's weird to add a whole new category of expression that can only be used in one context and has a specific set of affordances. It also is another step down the slope of messing with the scoping rules that I'm not very comfortable with.

if (x is! int || (var y = x + 1; y < 0)) {
  // Note: y is in scope here but it can't be used because it's not definitely assigned
  return;
}
// y is in scope here and can be used

meh. That code is 10x more readable if written as follows

if (x is! int) return;
var y = x + 1;
if (y < 0)) return;
// y is in scope here and can be used

@eernstg
Copy link
Member

eernstg commented Dec 20, 2021

@tatumizer, you have to be more specific—local variables will be promoted just fine in the example as shown:

void use(int i) {}

void main() {
  int? prop = 1 as dynamic; // Ensure that `prop` has type `int?`.
  if (prop != null) {
    use(prop); // Promotion - no error
  }
}

The variables that aren't promoted are non-local ones, and we have spent all of 'field-promotion' on debating how to make promotion of non-local variables available in some way (statically checked or dynamically checked, cached or non-cached, etc).

In this case the type of the variable prop is int? at the declaration, and it's int after promotion. There is no need to introduce a special int-ish type like Assumed<int>.

One of the proposals is concerned with an "optimistic" approach, and that one could be said to use a specialized type:

When promotion is done optimistically, a promoted instance variable x could have declared type num? and promoted type int, and the compiler would then keep the information around that this type must be ascertained at each usage, because it is not known to hold. So we compile every access x as (x as int). So we'd need something like Assumed<int> in order to remember that (1) the type is int, but (2) we have to check it every time it's used. However, I don't think it will be necessary to reveal to developers exactly how the two pieces of information are represented, it could be based on an Assumed<> pseudo-class, or it could be handled in many other ways. So we might as well just maintain the essence: With an optimistic approach, the compiler must somehow remember to check the type at every usage. Done.

However, we do have a few cases where there's no good type for a promotion. For example, consider a variable v of type FutureOr<int?>. If we promote v using v != null then we know that the value is either a Future<int?> or an int; but that isn't the same thing as FutureOr<int>, so we can't actually express the knowledge which was established by knowing that v != null evaluated to true. So in those cases we can "promote", but it doesn't help, because v still has the type FutureOr<int?>.

Of course, the pragmatic advice in this case would be to promote v to int or to Future<int?>, because it would probably be necessary to discriminate the Future related union as well as the Null related union at some point before the value can actually be used. And then a final else could handle null.

We did discuss the idea that we'd have a "non-nullable operator" on types, but we decided that the complexity of handling such an operator wasn't worth the small improvement that it would be expected to provide.

@eernstg
Copy link
Member

eernstg commented Dec 20, 2021

Yes, "Optimistic promotion" is what I basically had in mind

OK, very good! Note that it is a double-edged sword, though, because it looks nice, but it introduces a (potentially large) number of dynamic checks, and most of the recent changes to Dart have been aimed at eliminating dynamic checks.

@eernstg
Copy link
Member

eernstg commented Dec 21, 2021

@tatumizer wrote:

Maybe you mean ...

I think the optimistic approach is a perfectly consistent and usable proposal. It also represents a particular language design trade-off, allowing for a potentially large number of dynamic checks with no visible syntax, and this might cause the proposal to get strong support or ditto opposition. So we could do that, but the debate about it is perhaps at the software engineering level rather than the technical level ("do we want to accept those dynamic checks?").

The approaches that are based on evaluating something (in particular, an instance variable, but it could be any expression) and storing the value in a local variable are safe at the immediate level (there are no implicit dynamic checks). However, these approaches may give rise to subtle bugs because they rely on caching a value, and the behavior of the enclosing application may be such that the cached value is stale. So we could do that as well, and it would even make sense to do both, if it's considered appropriate to have support for so much language mechanism in this topic area.

Finally, we have an approach like #1518 where immutability is turned into an interface property (that is, we make it a part of the contract of a particular getter in an interface that it is stable, and it is then enforced by the language that this contract is upheld by all implementations). This approach is properly safe (no dynamic checks, and no values can be stale), but it can only be applied in the case where it is actually an appropriate contract that a specific property should be immutable.

expr@xxx != null, #1210, is an example of a proposal that relies on caching.

@Cat-sushi
Copy link

Cat-sushi commented May 13, 2023

Now, we can write if (obj.x case var x?) 🎉

@rubenferreira97
Copy link

@Cat-sushi Awesome! However I think the correct syntax is if(obj.x case var x?).
Apparently we can also use if(obj.x case var x!) to assert null.

I really need to check all pattern types available😁, I was missing some.

@munificent
Copy link
Member

@Cat-sushi is exactly right, and null check patterns were added in large part to support this idiom. The syntax is a pretty unintuitive but once you learn it, I think it works OK. At least, we never were able to come up with anything better.

Swift has something similar, which is where I got it from, and is a signal that it's useful and works. But the fact that the top search result for that syntax is https://fuckingifcaseletsyntax.com/ does not inspire confidence that the users find the syntax intuitive. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
field-promotion Issues related to addressing the lack of field promotion
Projects
None yet
Development

No branches or pull requests