Get different results when run the parser multiple times #251

Closed
dongli opened this issue Oct 16, 2018 · 3 comments

Comments

dongli commented Oct 16, 2018

I am writing a parser for a configuration file format. Here is the grammar that I have written:

  start: expr*

  %import common.CNAME
  %import common.ESCAPED_STRING
  %import common.SIGNED_NUMBER
  %import common.WS

  COMMENT: /\/\/.*/
  OPERATOR: ">=" | "<=" | ">" | "<" | "=="
  TRUE: "TRUE"
  FALSE: "FALSE"

  %ignore WS
  %ignore COMMENT
  %ignore ";"

  expr: key "=" value

  key: CNAME
  value: dict
       | array
       | boolean
       | string
       | number
       | threshold
       | keyword

  dict: "{" expr* "}"
  array: "[" (value "," )* [value] "]"
  string: ESCAPED_STRING
  number: SIGNED_NUMBER
  boolean: TRUE | FALSE
  threshold: OPERATOR number
  keyword: CNAME

And the sample file is:

climo_mean = {

   file_name = [];
   field     = [];

   regrid = {
      vld_thresh = 0.5;
      method     = NEAREST;
      width      = 1;
   }

   time_interp_method = DW_MEAN;
   match_day          = FALSE;
   time_step          = 21600;
}

I also wrote a transformer to turn the tree into a dict object, and I noticed that FALSE is parsed as boolean in some runs and as keyword in others. I have already put the boolean rule before the keyword rule. Any idea why I get random results? Thanks!

RUN 1:

keyword NEAREST
keyword DW_MEAN
boolean FALSE
{'climo_mean': {'field': [],
                'file_name': [],
                'match_day': False,
                'regrid': {'method': 'NEAREST', 'vld_thresh': 0.5, 'width': 1},
                'time_interp_method': 'DW_MEAN',
                'time_step': 21600}}

RUN 2:

keyword NEAREST
keyword DW_MEAN
keyword FALSE
{'climo_mean': {'field': [],
                'file_name': [],
                'match_day': 'FALSE',
                'regrid': {'method': 'NEAREST', 'vld_thresh': 0.5, 'width': 1},
                'time_interp_method': 'DW_MEAN',
                'time_step': 21600}}
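
The two runs differ only in which rule claimed FALSE. A minimal sketch of why both outcomes are possible, assuming CNAME behaves roughly like the regex `[A-Za-z_][A-Za-z0-9_]*` (an approximation written for illustration, not copied from Lark) and using a hypothetical helper `candidate_rules`:

```python
import re

# Assumption: approximates the common.CNAME terminal imported in the grammar.
CNAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# The two literal terminals TRUE and FALSE from the grammar above.
BOOLEAN = re.compile(r"TRUE|FALSE")

def candidate_rules(token):
    """Return the value rules whose terminal matches the whole token."""
    rules = []
    if BOOLEAN.fullmatch(token):
        rules.append("boolean")
    if CNAME.fullmatch(token):
        rules.append("keyword")
    return rules

for tok in ["NEAREST", "DW_MEAN", "FALSE"]:
    print(tok, candidate_rules(tok))
```

Only FALSE is accepted by both terminals, so the grammar has two valid derivations for `match_day = FALSE`, and the order of the alternatives in `value` does not by itself settle which one is chosen.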

erezsh (Member) commented Oct 16, 2018

Hi, I believe this is a known bug: #201

I'm going to roll out Lark v0.7 in a week or two, which should solve this problem.

Meanwhile, you can use the earley_sppf branch for testing, and to verify that this is indeed a known issue.

erezsh (Member) commented Oct 16, 2018

By the way, as an interim workaround, you can try adjusting rule weights. For example:

boolean.10: TRUE | FALSE

dongli (Author) commented Oct 16, 2018

@erezsh I have tried the rule weight, but it does not work.
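
Until a fixed release is available, one possible stopgap (not proposed in this thread, so treat it as an assumption) is to make the transformer insensitive to which rule matched. The sketch below uses a hypothetical helper, `convert_keyword`, that a transformer's `keyword` callback could call on the token text:

```python
# Hypothetical helper: the name and its use from a transformer callback are
# assumptions made for illustration, not part of the original transformer.
_BOOLEAN_LITERALS = {"TRUE": True, "FALSE": False}

def convert_keyword(text):
    """Map the literals TRUE/FALSE to bool; return any other keyword unchanged."""
    return _BOOLEAN_LITERALS.get(text, text)

print(convert_keyword("FALSE"))    # False
print(convert_keyword("DW_MEAN"))  # DW_MEAN
```

With this in place, `match_day` would come out as `False` whether the parser labelled the token as boolean or as keyword. It hides the ambiguity rather than resolving it.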

erezsh closed this as completed Mar 10, 2019