Earley parser produces non-deterministic results even with explicit terminal weights #201
Comments
Earley currently does not support weights on terminals. But it should work as you expect it to with weights on rules, such as:

    root: a | b
    a.1: ("0".."9")+
    b.2: ("0".."9")+

This is probably not the best behavior. Hopefully I'll get the chance to fix it soon.
I see. Perhaps you could help me work around this issue with a project I'm working on. In the project I have terminals similar to the following for unquoted strings, decimal numbers, and hexadecimal numbers:
I'd like to be able to have things like "10" always parse as a decimal and "0xFF" as a hexadecimal, but the non-deterministic behavior is causing them to sometimes be interpreted as strings. Is there an easy way I can refactor this to work around the unexpected behavior?
Of course. Just add intermediate rules.
etc. Hope this helps!
Thanks for the advice. It seems to work in some cases but not in others. For example, with the grammar and text below, the hexadecimal literal seems to be interpreted as a string unless I remove the string terminal entirely, in which case it is interpreted as a hexadecimal.
I'm not sure if this is an issue with my grammar or with the parser. Any assistance would be greatly appreciated.
Hmm, yeah, this is an actual bug in the current implementation. I'm currently working on a new Earley implementation that should solve this problem. You can check out the branch
I was able to switch over to If you feel that this issue falls under the scope of #191, please feel free to close it.
0.7 published
Perhaps related to #191
When using the Earley parser with weighted terminals, the result is still non-deterministic in certain cases.
Example:
It is my understanding that using weights was a workaround for this non-determinism (as indicated in #167), but this does not seem to be the case.
I am using the latest lark version available on pip (0.6.2) on Fedora Linux 27 x86_64.