-
-
Notifications
You must be signed in to change notification settings - Fork 175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Some small grammar fixes #871
Conversation
any-label = label | ||
|
||
; Allow specifically `Some` in record and union labels. | ||
any-label-or-some = any-label / Some |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why doesn't None
need to be included too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because Some
is a keyword, but None
is a builtin function, and builtins are already allowed as labels of records/unions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks! That distinction wasn't so clear to me.
@@ -1 +1 @@ | |||
[3, 9, [24, null, 0, 7], ["foo", 0]] | |||
["missing//foo", 0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not how the Haskell implementation parses it. Could you explain why this change is correct (and why the Haskell implementation is wrong)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that both parses are technically allowed (i.e. the grammar is ambiguous). However, the text of the grammar states "prefer the first parse" and "prefer alternatives that parse as many repetitions as possible", which can be summed up as "parse greedily". Taking that into account, we get that when parsing whatever// foo
, the whole of whatever//
should be consumed as an identifier (since slashes are allowed in identifiers).
Moreover I'm quite confident that this is the correct answer since I derived a parser directly from the abnf (taking into account greediness) and that's the output I got.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, my reading of the grammar is that this should be parsed as a single label
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, seems like I just misread the grammar then! :)
If it had turned out to be ambiguous, I would have suggested that we change that, but fortunately it doesn't seem to be the case. :)
@@ -295,9 +298,9 @@ unbraced-escape = | |||
; | |||
; See the `valid-non-ascii` rule for the exact ranges that are not allowed | |||
braced-codepoint = | |||
1*3HEXDIG ; %x000-FFF | |||
("1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9" / "A" / "B" / "C" / "D" / "E" / "F" / "10") unicode-suffix; (Planes 1-16) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regardless of the other changes, this tweak should be merged. I brainfarted when I wrote this and didn't think about optimal order to avoid backtracking..
@@ -1 +1 @@ | |||
[3, 9, [24, null, 0, 7], ["foo", 0]] | |||
["missing//foo", 0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, my reading of the grammar is that this should be parsed as a single label
Added an additional tiny tweak because one of the test didn't respect the spec up to variable name equality, whereas the rest of the tests do |
I recently caught up with the latest few months of standard updates. This uncovered a few errors in tests and one case where the order of alternatives in the grammar was causing my PEG parser to choke.
Since we now have
merge
forOptional
, I also altered the grammar to allowSome
as a label for records and unions. I guess I could have allowed other keywords too but I chose to be conservative.