Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some small grammar fixes #871

Merged
merged 5 commits into from
Dec 27, 2019
Merged

Some small grammar fixes #871

merged 5 commits into from
Dec 27, 2019

Conversation

Nadrieril
Copy link
Member

I recently caught up with the latest few months of standard updates. This uncovered a few errors in tests and one case where the order of alternatives in the grammar was causing my PEG parser to choke.
Since we now have merge for Optional, I also altered the grammar to allow Some as a label for records and unions. I guess I could have allowed other keywords too but I chose to be conservative.

any-label = label

; Allow specifically `Some` in record and union labels.
any-label-or-some = any-label / Some
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why doesn't None need to be included too?

Copy link
Member Author

@Nadrieril Nadrieril Dec 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because Some is a keyword, but None is a builtin function, and builtins are already allowed as labels of records/unions.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks! That distinction wasn't so clear to me.

@@ -1 +1 @@
[3, 9, [24, null, 0, 7], ["foo", 0]]
["missing//foo", 0]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not how the Haskell implementation parses it. Could you explain why this change is correct (and why the Haskell implementation is wrong)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that both parses are technically allowed (i.e. the grammar is ambiguous). However, the text of the grammar states "prefer the first parse" and "prefer alternatives that parse as many repetitions as possible", which can be summed up as "parse greedily". Taking that into account, we get that when parsing whatever// foo, the whole of whatever// should be consumed as an identifier (since slashes are allowed in identifiers).

Moreover I'm quite confident that this is the correct answer since I derived a parser directly from the abnf (taking into account greediness) and that's the output I got.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, my reading of the grammar is that this should be parsed as a single label

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, seems like I just misread the grammar then! :)

If it had turned out to be ambiguous, I would have suggested that we change that, but fortunately it doesn't seem to be the case. :)

@@ -295,9 +298,9 @@ unbraced-escape =
;
; See the `valid-non-ascii` rule for the exact ranges that are not allowed
braced-codepoint =
1*3HEXDIG ; %x000-FFF
("1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9" / "A" / "B" / "C" / "D" / "E" / "F" / "10") unicode-suffix; (Planes 1-16)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regardless of the other changes, this tweak should be merged. I brainfarted when I wrote this and didn't think about optimal order to avoid backtracking..

@@ -1 +1 @@
[3, 9, [24, null, 0, 7], ["foo", 0]]
["missing//foo", 0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, my reading of the grammar is that this should be parsed as a single label

@Nadrieril
Copy link
Member Author

Nadrieril commented Dec 24, 2019

Added an additional tiny tweak because one of the test didn't respect the spec up to variable name equality, whereas the rest of the tests do

@Nadrieril Nadrieril merged commit e2d08eb into dhall-lang:master Dec 27, 2019
@Nadrieril Nadrieril deleted the fixes branch December 27, 2019 13:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants