Skip to content
This repository has been archived by the owner on Jan 25, 2022. It is now read-only.

Issues with ASI #7

Closed
leobalter opened this issue Jun 20, 2017 · 57 comments
Closed

Issues with ASI #7

leobalter opened this issue Jun 20, 2017 · 57 comments

Comments

@leobalter
Copy link
Member

The current grammar for FieldDefinitionLists requires a closing semicolon:

https://tc39.github.io/proposal-class-fields/#sec-updated-syntax

ClassElement [Yield, Await] :
  ...
  FieldDefinitionList [?Yield, ?Await] ;
  static FieldDefinitionList [?Yield, ?Await] ;

I feel like it should optional, but maybe it's intentional and I'm here to request clarification.

Thanks

@bakkot
Copy link
Contributor

bakkot commented Jun 20, 2017

It's optional because of ASI. It shouldn't be optional in general - class A { x y z(){} } is not what we want, I think.

@leobalter
Copy link
Member Author

🔥 🔥 🔥 🔥 🔥 🔥 🔥 ASI 🔥 🔥 🔥 🔥 🔥 🔥 🔥

@leobalter
Copy link
Member Author

btw, ASI has some weird behavior here:

class C {
  foo = 'TC39';
  [1]
  
  bar = 'TC39'
  [2]

  prop() { console.log( this.foo, this.bar ) }
}

new C().prop() // 'TC39', '3'

@bakkot
Copy link
Contributor

bakkot commented Jun 20, 2017

Yeah, and a whole bunch of other cases. I think this snippet covers most of the major surprises:

class A {
  a = 0
  [b](){} // SyntaxError

  a = 0
  *b(){} // SyntaxError

  async // property named `async`
  a(){} // method named `a`

  get
  a(){} // getter for `a`

  get; // property named `get`
  a(){} // method named `a`

  a = 0; // property named `a`, initialized to `0`
  in // property named `in`
  b; // property named `b`

  a = 0 
  in
  b; // property named `a`, initialized to `0 in b`
}

@borela
Copy link

borela commented Jun 20, 2017

It's not wierd, ASI rules are simple, MPJ did a really good video on the topic: https://www.youtube.com/watch?v=Qlr-FGbhKaI

@leobalter
Copy link
Member Author

I want a declaration list of fields, I'm not really interested in any of the cases @bakkot or I presented here. Saying that ASI is simple puts you in a privileged PoV we can't share with other regular developers.

My examples are not common cases I expect, but I would be pretty annoyed in a case like this:

class C {
  a = 0
  *b(){} // SyntaxError
}

I still need time to fill myself with context from the notes and issues as I believe the champions might have had thoughtful discussions on this. I don't want to assume it's going to be easy to simply get rid of semicolon and ASI here.

@borela
Copy link

borela commented Jun 20, 2017

I understand the concern, but I believe that the traps of not knowing the ASI rules are far more problematic specially during the development phase where the code might not have the proper format yet, for example:

function a() {
    return [
        'a',
        'b',
        'c'
    ];
}

function b() {
    return
    [
        'a',
        'b',
        'c'
    ];
}

There are a lot of examples of beginners having "strange" bugs due to not knowing the ASI rules and I always recommend that video to them; I got bitten a couple of times by ASI even in projects that required semicolons before I decided to finally learn how it works.

@littledan
Copy link
Member

@leobalter I agree; that case is pretty annoying. It's hard for me to think of what a better alternative would be, though. We've gotten pretty strong feedback from developers that they don't want to have to start putting semicolons after their declarations. Other languages use an explicit \ where JS relies on ASI logic. Another thing would be to require parens around a multiline expression in an initializer. The downside is that these would be a little incongruous to do just in class bodies. If you have any further ideas, I'd be interested.

@borela
Copy link

borela commented Jun 21, 2017

@littledan Why not make ASI a bit more intelligent by adding a rule for class methods.

@littledan
Copy link
Member

@borela Sure, what would the intelligence be exactly?

@borela
Copy link

borela commented Jun 22, 2017

@littledan The problem there is that ASI is treating the * as a multiplication operator instead of generator modifier, all it has to do is to look at the right hand side of the * for method declaration, then, if a method is found you can safely assume that the * was used to modify that method and a semicolon can be added before it.

@bakkot
Copy link
Contributor

bakkot commented Jun 22, 2017

@borela That would arbitrary lookahead, which is not something we generally allow in the grammar. In particular, you can't distinguish between multiplication and a method until you've gotten to the brace:

class A {
  x = 0
  * y (a1, a2, ...a3) // multiplication
}
class A {
  x = 0;
  * y (a1, a2, ...a3) {} // method
}

Similarly for distinguishing between a computed name and a computed property access.

@borela
Copy link

borela commented Jun 22, 2017

Yes, the idea would be to look for [...](...){ or something(...){ after the * just to give it the correct meaning inside classes.

@bakkot
Copy link
Contributor

bakkot commented Jun 22, 2017

So the proposal would be that

class A {
  x = 0
  * y (a1, a2, ...a3)
}

would parse as a class field followed by a generator method, and then be a syntax error due to the lack of a function body? That would be something very different from ASI as it currently stands; I don't think that would fly.

@borela
Copy link

borela commented Jun 22, 2017

No, if a { is not found after the ), you could assume it is still part of the expression, like you said, it's a look ahead to check the meaning of the *.

@bakkot
Copy link
Contributor

bakkot commented Jun 22, 2017

Right, so distinguishing between a method definition and an expression requires reading arbitrarily far ahead. As I say, I doubt that would fly.

(Technically we did do something similar with arrow parameters, actually, but it's caused a lot of pain in implementations, and I'm not sure how analogous the situations are.)

And it still doesn't solve the problem, anyway: is the following two fields, or one field whose initializer is an array access? Either would be syntactically valid.

class A {
  x = 0
  [y]
}

@borela
Copy link

borela commented Jun 22, 2017

I would consider it two fields, I don't see why I would ever put [] after a numeric constant. But I know it wouldn't be too simple to code.

Another option is to drop ASI completely, @leobalter was right that the rules are not that simple, after talking to other programmers I realized ASI is a bit unnatural to many people.

@bakkot
Copy link
Contributor

bakkot commented Jun 22, 2017

@borela Sorry, a better example would have been x = z.

Dropping ASI is an option, but we've heard a lot of pushback on that idea from people who are accustomed to the no-semi style, who are after all the only people who'd have to deal with the complexity. And, well, making new exceptions to ASI is also kind of confusing.

(Personally I'm approximately neutral on that question, but I don't ever use the no-semi style, so I'm not a good person to ask.)

@borela
Copy link

borela commented Jun 22, 2017

I used no semicolons in my sources because of ASI, I had too many gotchas after years of C++; by not using semicolons I force myself to read my code more carefully and prevent stuff like returning undefined when I meant to return something; I would be ok if there was an option to at least disable ASI.

@leobalter
Copy link
Member Author

I didn't have time to explore the previous discussions yet or the proposal's history, but as this is already being discussed, I have had a slight idea of removing the ; rather than the ASI.

The problem is that ; can already be a ClassElement, regardless this proposal, so removing the ; from FieldDefinitionList is not an option here.

I agree the lookahead issue would be a potential blocker to move this proposal towards to Stage 3.

One thing we might try is allowing a trailing comma in the FieldDefinitionList:

FieldDefinitionList[Yield, Await]:
  FieldDefinition [?Yield, ?Await]
  FieldDefinitionList [?Yield, ?Await] , FieldDefinition [?Yield, ?Await]
  FieldDefinitionList [?Yield, ?Await] ,

This way we can separate a class field from a method in a better visual way:

class C {
  x = 42,
  *gen() {}
}

This should work just fine as the FieldDefinition expects an AssignmentExpression as the Initializer, not Expressions.

I'm reopening this issue to allow more feedback from other people. Feel free to close it if this is noisy.

@leobalter leobalter reopened this Jun 22, 2017
@leobalter leobalter changed the title Clarification question: is semicolon required after every FieldDefinitionList? Issues with ASI Jun 22, 2017
@littledan
Copy link
Member

@bakkot Arrow functions already have a lot of this complexity, but it's not tied into ASI, so I agree, this would not be great.

@leobalter Thanks for the thread name change. Well, this sounds like it's getting back into the ES2015-era discussion, should we use , or ; between class elements. To me, it seems pretty weird to allow a comma after a field declaration, but not between two method declarations. Why is trailing comma easier to write than a semicolon?

@leobalter
Copy link
Member Author

My suggestion only adds the trailing comma, it already exists between fields in this proposal.

, feels better in my PoV the class will have a declaration list, rather than ;. I assume this preference is subjective.

@littledan
Copy link
Member

Declaration lists don't generally permit trailing commas. Should we allow them for var, let, etc declarations as well?

@littledan
Copy link
Member

Ping, any further ideas on this thread?

@leobalter
Copy link
Member Author

I don't believe this should force a trailing comma for var, etc declarations, considering this declaration list also allows the uncommon class method definitions.

@zenparsing
Copy link
Member

For me, @littledan has the important point:

Well, this sounds like it's getting back into the ES2015-era discussion, should we use , or ; between class elements. To me, it seems pretty weird to allow a comma after a field declaration, but not between two method declarations.

We've already decided on semicolons between class elements.

Although, I think it would be best to remove the comma-list option from class fields. It feels bizarre to me for some reason.

@thepeoplesbourgeois
Copy link

random suggestion gremlin:

I realized this morning that this idea might ruin dot-chaining, and a significant portion of the rest of ECMAscript … and I have no idea how hard it might actually be to get implemented… but what if infix operators/keywords were forbidden at the beginnings of lines within class bodies? That way you could enforce the unary meanings of overloaded operators (*) for fields, and wouldn't require a lookahead to do so.

@claudepache
Copy link

@thepeoplesbourgeois The case of * is probably one of the least concerning, because a missing semicolon at the end of the declaration preceding the generator definition will almost surely trigger a syntax error at the level of the next {. So,

  • there will be a syntax error instead of a silent bug;
  • for those that wilfully take advantage of ASI, that would be a gentle reminder that they must watch out for lines beginning with a punctuation mark (not only in class bodies).

Also, your suggestion does not resolve the issue:

(1)

class C {
    foo  // semicolon inserted here
    bar() { }
}

(2)

class C {
    get  // no semicolon inserted here
    bar() { }
}

@littledan
Copy link
Member

I see your point @claudepache , and it seems pretty inherent to anything we do here with ASI. If you want to be able to omit semicolons anywhere, you should be able to do it after a field declaration which has no initializer.

But I'm not sure how much the "field named get/set/static" issue will come up. I think get and set make more sense semantically as methods, rather than uninitialized fields, and static is more of a weird name for a field. We could add future keywords within classes similarly to async, with a no newline restriction.

@thepeoplesbourgeois
Copy link

thepeoplesbourgeois commented Oct 11, 2017

@littledan captured my line of reasoning pretty succinctly. I had actually believed get, set, and static were keywords, like async, all of which I would expect to take effect on the next declaration when left unterminated by a semicolon.

I guess the potential issues this would open up center around forward extensibility; what becomes a keyword in future specs would need immeasurably more consideration in the event that it might already have gained common use in a framework. It's not difficult to imagine the usefulness of something like render to denote a function intended for canvas-like interaction, for instance.

... then again, this would already have needed to be a consideration for const, let, async, await, etc., so maybe it's a solved problem

@littledan
Copy link
Member

The extensibility point is really huge, and @waldemarhorwat emphasized it to me after I presented on ASI previously. If we allow ASI, it has cross-cutting effects for any changes we want to make in the class grammar; the most obvious one is no more get-style keywords in classes which don't prohibit a newline afterwards. Aside from that, it just has to be thought about for any sort of change in the class grammar--how does this interact with ASI? It's a little worse than outside classes since we don't make as heavy use of keywords.

@thepeoplesbourgeois
Copy link

thepeoplesbourgeois commented Oct 11, 2017

Experimenting with different newline combinations in the Babel REPL shows that async actually follows a different set of ASI rules than get, set, static, and *. The latter keywords will take effect on the declaration on the next line, with static being chainable with get and set, but not being chainable with async and *. * pleasantly surprised me by working as the infix multiplication operator and as the generator mark as necessary, though.

// PRESETS: es2015, es2017, react, stage-0

class C {
  get() {}                           // method

  something() {}                     // method
  
  async somethingElse() {}           // async method
  
  async 
  somethingYetEvenMoreDifferent() {} // method
  
  get 
  anotherThing() {}                  // getter
  
  static 
  set 
  anotherThing(param) {}             // class setter
  
  static() {}                        // method

  * 
  yetAnotherThing() {}               // generator

  foo = 5 *
    2                                // == 10
}

@littledan
Copy link
Member

@thepeoplesbourgeois That's right. We will have to go 'async style' with all future keywords within classes if we do this sort of ASI.

@jridgewell
Copy link
Member

We will have to go 'async style' with all future keywords within classes if we do this sort of ASI.

I'm in favor of that. I think the current ASI rules for get/set/static/* are a bit silly.

@littledan
Copy link
Member

We discussed ASI in the November 2017 TC39 meeting. The committee's conclusion was that we should maintain ASI in class fields, but be more explicit in the specification with notes about what using ASI means and the risks of doing so. I'm preparing a patch which does this.

@littledan
Copy link
Member

Closing per resolution in #7 (comment)

@zenparsing
Copy link
Member

@littledan The comment above at #7 (comment) seems to indicate that ASI in the context of class fields is future hostile in the sense that we won't be able to add any member modifiers without peppering newline restrictions all over the place.

Was there consensus that such newline restrictions are acceptable?

@ljharb
Copy link
Member

ljharb commented May 20, 2018

@zenparsing yes, there was (specifically, consensus that future language additions would no longer be held back by asi concerns alone)

@zenparsing
Copy link
Member

There are a couple of places where ASI is not applied, for good reasons (e.g. for heads). At some point I'd like to discuss further whether a class field declaration (with no initializer) should warrant such an exception.

@bakkot
Copy link
Contributor

bakkot commented May 21, 2018

@zenparsing We discussed that specific question in committee and decided that they did not. See notes, slides.

I was initially against allowing ASI here, but I've since been convinced it's not worth adding another exception, mostly because there doesn't seem to be any group of users which would be served by it. (The people who rely on ASI really object to having more required semicolons even if that means more edge cases, and the people who don't won't run into such ASI issues anyway.

And as far as I can tell, no one puts a linebreak between get and m(){}, so personally I don't think the new NLTH restrictions are really that big of a concern. As evidence of this claim, I note that both TypeScript and babel acted as if there was a NLTH restriction after get in class bodies for years, and apparently no one noticed except me when I was looking into ASI for class fields for this proposal, and then no one noticed when that behavior was fixed.

@zenparsing
Copy link
Member

@bakkot Thanks for the links! I certainly agree that in current practice users almost universally place contextual keywords on the same line as the following token.

Still, for this particular case (field declarations without an initializer) my 🕷 sense is tingling quite strongly.

@littledan
Copy link
Member

@zenparsing What do you think we should do for next steps to evaluate this concern?

@littledan
Copy link
Member

@waldemarhorwat Do you have any more thoughts on this issue?

@zenparsing
Copy link
Member

@littledan I have further thoughts on this topic but I'm not quite ready to discuss them yet. I'll follow up soon!

@zenparsing
Copy link
Member

My biggest worry (that ASI would overly constrain MethodDefinition in both classes and object literals) turned out to be unjustified. We can still extend the method syntax with leading operator characters and other things down the road if we choose (and are willing to accept the mild ASI hazards).

The lesser worry is whether relying on NLT for new contextual keywords within class bodies (and object literal methods) creates a confusing situation, since some of the contextual keywords (get, set, and static) will not require NLT but some of them will (async, and all future keywords).

Before async, we had this situation:

  • Outside of class bodies and object literals, contextual keywords required an NLT.
  • For object literals and class bodies, contextual keywords did not require an NLT.

Which is fairly simple.

What is the situation now? Internally, the syntax is a mess, but what is the simple intuition?

I suppose that the simple intuition, if we admit ASI here, is that contextual keywords should never end a line, regardless of what parsing context we might be in. This intuition seems acceptable to me.

It's unfortunate that we have to sacrifice internal consistency and simplicity in order to accommodate history and ergonomics (ASI in this case), but it's a tradeoff that we've had to make many times before, and it's probably the right call here (shrug)...

Unless @waldemarhorwat has any other thoughts?

@bakkot
Copy link
Contributor

bakkot commented Jun 11, 2018

I suppose that the simple intuition, if we admit ASI here, is that contextual keywords should never end a line, regardless of what parsing context we might be in.

Yeah, that's my position.

(Though I might be happier if we could add NLTH restrictions to static, get, and set in class bodies, and relax the NLTH restriction on async in object literals. This would require splitting the grammar, but wouldn't be hard. Then the rule would be "contextual keywords should never end a line except in object literals", and that would actually match what the grammar demanded. Probably this isn't worth worrying about, though, since I believe it is already the case that people don't end lines with contextual keywords in practice.

No idea how risky that would be, though given that almost all tooling got these wrong until relatively recently, I suspect it might be doable.)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests