Better error recovery in comma separated lists. #14509

adampauls · 2022-02-17T20:28:44Z

#14463 improves error recovery in comma-separated lists when an error occurs before a comma is encountered. The way commaSeparated currently works, there is still bad behavior when the individual element of a comma-separated list is successfully parsed, but not followed by a comma. For example, in foo(5 6, 7), the parser will successfully parse 5 into an expression, terminate the comma-separated list (because it encounters 6 instead of a comma, then pop up (producing foo(5)). The parser will then be in a state where it is looking for a terminal ), but instead it finds a 6and issues an error. This is okay behavior because a reasonable error is issued in the right place, but it means that everything after the6` is parsed as if it occurred outside the comma-separated list, leading to potentially confusing errors.

In this PR, we pass down the expected end token for a comma-separated list (where that is defined) and issue an error while still parsing the list if a part of the list is parsed without being followed by a comma or the expected terminal token. More code than I would like for such a small win, but I think the resulting behavior is worth it.

adampauls · 2022-02-17T20:29:03Z

compiler/src/dotty/tools/dotc/parsing/Parsers.scala

@@ -553,19 +553,27 @@ object Parsers {
    def inDefScopeBraces[T](body: => T, rewriteWithColon: Boolean = false): T =
      inBracesOrIndented(body, rewriteWithColon)

-    /** part { `separator` part }
-     */
-    def tokenSeparated[T](separator: Int, part: () => T): List[T] = {


I inlined this since it was only ever used for a comma separated list.

That rang a bell. At the beginning of pandemic, I moved Scala 2 trailing comma handling into parser for better handling, so that is why this method still pulls its weight there.

scala/scala#8780

I said on the PR that would forward-port it; apparently I looked and it seemed similar. But obviously I haven't done that yet. It must have slipped my mind during pandemic brain fog.

I went ahead and wrote up something to get rid of the weird commas. I'll put it up after this PR is in.

adampauls · 2022-02-17T20:30:18Z

compiler/src/dotty/tools/dotc/parsing/Parsers.scala

        in.nextToken()
        ts += part()
      }
+      if (expectedEnd != EMPTY && in.token != expectedEnd) {


The main change.

So in case we have more that one unexpected token for example foo(1 2 3, 4, 5) this will not try to parse the rest of the param clause?

Maybe it makes sense to try to find the comma or end of line?

syntaxErrorOrIncomplete skips to the next safe pint as a side effect, so we will see 2 is not a ), issue an error, then jump to , and keep parsing. I'll add a third digit to one of the test cases to be sure.

adampauls · 2022-02-17T20:31:57Z

tests/neg/i1679.scala

@@ -1,5 +1,5 @@
 class A[T]
 object o {
  // Testing compiler crash, this test should be modified when named type argument are completely implemented
-  val x: A[T=Int, T=Int] = ??? // error: ']' expected, but '=' found // error
+  val x: A[T=Int, T=Int] = ??? // error: ']' expected, but '=' found // error: ']' expected, but '=' found


Previously, this error complained about val x not having a body, which doesn't make any sense.

adampauls · 2022-02-17T20:46:03Z

cc @tgodzik , a better solution to the bad error message from #14463.

tgodzik

That looks great! I just have a few questions.

tgodzik · 2022-02-18T08:15:00Z

compiler/src/dotty/tools/dotc/parsing/Parsers.scala

        in.nextToken()
        ts += part()
      }
+      if (expectedEnd != EMPTY && in.token != expectedEnd) {


So in case we have more that one unexpected token for example foo(1 2 3, 4, 5) this will not try to parse the rest of the param clause?

Maybe it makes sense to try to find the comma or end of line?

tgodzik · 2022-02-18T08:16:13Z

compiler/src/dotty/tools/dotc/parsing/Parsers.scala

@@ -2532,7 +2540,7 @@ object Parsers {
              if (leading == LBRACE || in.token == CASE)
                enumerators()
              else {
-                val pats = patternsOpt()
+                val pats = patternsOpt(EMPTY)


Why EMPTY in this case? Shouldn't it be something like OUTDENT ?

I thought this was to handle for x, y in (1,2) do or for x, y in (1,2) yield, so I think the expected end would have to be do or yield? Passing EMPTY preserves the old behavior, so I figured it was safe. I could pass predicate down instead I suppose. What do you think?

I guess this if fine, the other possibility would be an Option but I don't see it used anywhere in the class, so probably there is a reason for it.

I was maybe being overly cautious but I didn't want to have a perf hit from creating a lot of unnecessary Some wrappers.

tgodzik

Looks good for me, but I would probably ask for a second opinion here 😅

odersky

It does feel like a lot of effort for not that much of a gain. Error recovery is currently very streamlined and centralized. I think it can be improved, but my fear is that local improvements like this will make it harder to come up with a better overall design.

In that sense, even though it improves precision, I fear that #14463 was the already a step in the wrong direction, and this is more of the same. To pass context, I would instead try to rely on the currentRegion abstraction. It tells us about what we expect to come, and we should make more use of it. For instance, in retrospect, instead of the solution in #14463 it would be more elegant to have a region that is a specialized version of InParens that knows that elements can also be separated with commas. That avoids the ad-hoc passing of an additional argument to skip. And likewise we should be able to avoid passing additional arguments of commaSeparated.

So, my proposal: Let's revert #14463 and base everything on regions. I have the impression that skip could be a lot smarter than it is now, if it makes better use of this info.

adampauls · 2022-02-21T21:10:02Z

I spent a little time trying to figure this out, but I'm not sure what you're looking for. If we want to fix the trailing comma bug fixed by #14517, then the parser has to pass down information to about whether it is a in delimited comma-separated list (e.g. (a, b)) or an open comma-separated list (e.g. a, b as in for a, b <- ) that doesn't know what follows. So no matter what, you have to pass down information in all the places I've declared to commaSeparated whether it should expect a delimiter or not.

Perhaps I did this PR a disservice by separating it from #14517, because it gives the appearance of only being relevant to error recovery. In fact, I think we need to pull the current comma-handling out of the scanner to get correct parsing behavior. Try to cram more information into the region, which I think currently exists largely to handle significant indentation, is the wrong way to go I think. I'll also note that the parser already passes down Locations, which are like regions but handle orthogonal concerns.

You could potentially make skip a little smarter by inspecting the region, but when I tried to either add a new CommaSeparated region or add a flag to the existing InBraces and InParens regions, things started to break because the code that inspects regions (of which there is a lot, and it's already fairly brittle from what I can see) now need to think about whether to ignore these flags or not. I was not able to track down all the places that need to change to fix tests, but I'm fairly confident that more code ends up needing to think about comma-separated regions that shouldn't have to, just to avoid passing single flag to skip.

I spent a couple of hours trying to figure out a better way, but I'm not seeing it! Further tips appreciated.

adampauls · 2022-02-21T21:31:58Z

You can see my best attempt here if you're interested.

adampauls · 2022-02-21T22:39:59Z

This is another (much smaller) try that avoid passing down the skipAtComma bit. This one passes all current tests, but note that this version will not handle errors well in undelimited comma-separated lists because skip is no longer smart enough to skip to the next comma.

You can see this change and #14463 rebased against main here.

odersky · 2022-03-13T17:43:21Z

What is an undelimited comma separated list?

odersky · 2022-03-13T17:45:58Z

I think it would be good to merge the three PRs on comma-separated lists with a single one that makes use
of Scanner.Region.

adampauls · 2022-03-15T19:35:21Z

Thanks, I made a single PR that uses Scanner.Region at #14534. An undelimited comma-separated list is one of these

val x, y, z = (1, 2, 3)

Tested here.

They also occur in derives clauses.

odersky

I believe this would be superseded by #14695.

adampauls commented Feb 17, 2022

View reviewed changes

adampauls force-pushed the comma_separated_lookahead branch from 3097f54 to ae845df Compare February 17, 2022 20:30

adampauls commented Feb 17, 2022

View reviewed changes

adampauls force-pushed the comma_separated_lookahead branch 2 times, most recently from 3cc0a28 to d28980b Compare February 17, 2022 20:57

tgodzik reviewed Feb 18, 2022

View reviewed changes

adampauls force-pushed the comma_separated_lookahead branch 2 times, most recently from 5e9d681 to 415e6d5 Compare February 18, 2022 17:29

Better error recovery in comma-separated lists

547ed43

adampauls force-pushed the comma_separated_lookahead branch from 415e6d5 to 547ed43 Compare February 18, 2022 23:46

adampauls mentioned this pull request Feb 18, 2022

Handle trailing commas in parser instead of scanner #14517

Closed

tgodzik approved these changes Feb 21, 2022

View reviewed changes

tgodzik requested a review from odersky February 21, 2022 11:34

odersky requested changes Feb 21, 2022

View reviewed changes

adampauls requested a review from odersky February 23, 2022 16:57

dwijnand assigned odersky Mar 4, 2022

adampauls mentioned this pull request Mar 5, 2022

Stop parsing incorrect trailing commas and improve error recovery in comma-separated lists. #14534

Closed

odersky reviewed Mar 20, 2022

View reviewed changes

adampauls closed this Mar 24, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Better error recovery in comma separated lists. #14509

Better error recovery in comma separated lists. #14509

adampauls commented Feb 17, 2022

adampauls Feb 17, 2022

som-snytt Feb 17, 2022

adampauls Feb 18, 2022

adampauls Feb 17, 2022

tgodzik Feb 18, 2022

adampauls Feb 18, 2022

adampauls Feb 17, 2022

adampauls commented Feb 17, 2022

tgodzik left a comment

tgodzik Feb 18, 2022

tgodzik Feb 18, 2022

adampauls Feb 18, 2022

tgodzik Feb 21, 2022

adampauls Feb 21, 2022

tgodzik left a comment

odersky left a comment

adampauls commented Feb 21, 2022 •

edited

Loading

adampauls commented Feb 21, 2022 •

edited

Loading

adampauls commented Feb 21, 2022 •

edited

Loading

odersky commented Mar 13, 2022

odersky commented Mar 13, 2022

adampauls commented Mar 15, 2022 •

edited

Loading

odersky left a comment

Better error recovery in comma separated lists. #14509

Better error recovery in comma separated lists. #14509

Conversation

adampauls commented Feb 17, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

adampauls commented Feb 17, 2022

tgodzik left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tgodzik left a comment

Choose a reason for hiding this comment

odersky left a comment

Choose a reason for hiding this comment

adampauls commented Feb 21, 2022 • edited Loading

adampauls commented Feb 21, 2022 • edited Loading

adampauls commented Feb 21, 2022 • edited Loading

odersky commented Mar 13, 2022

odersky commented Mar 13, 2022

adampauls commented Mar 15, 2022 • edited Loading

odersky left a comment

Choose a reason for hiding this comment

adampauls commented Feb 21, 2022 •

edited

Loading

adampauls commented Feb 21, 2022 •

edited

Loading

adampauls commented Feb 21, 2022 •

edited

Loading

adampauls commented Mar 15, 2022 •

edited

Loading