Skip to content
This repository has been archived by the owner on Apr 17, 2019. It is now read-only.

problematic recursive block-text (EBNF) definition #13

Open
GlenDC opened this issue Dec 2, 2016 · 3 comments
Open

problematic recursive block-text (EBNF) definition #13

GlenDC opened this issue Dec 2, 2016 · 3 comments

Comments

@GlenDC
Copy link
Contributor

GlenDC commented Dec 2, 2016

Currently I'm working on a C# implementation of L20n (FTL), which I do mostly based on the specified EBNF grammar.

TLDR: Could we rewrite the block-text / unquoted-pattern section of this EBNF grammar, to make a block-text not n*2-recursive, where n is the amount of lines defined for a (group?) block-text.

A block-text is defined as NL __ '|' unquoted-pattern and a unquoted-pattern is defined as: unquoted-text | placeable | block-text)+. This is a bit problematic as it is completely recursive in its nature.

Not sure if this is a problem in the way I parse it in C#, or in the way the EBNF definition is setup. But the way I would expect it, is that a block-text such as:

description =
  | a
  | b
  | c

would be simply parsed as 1 block-text, consisting out of 3 unquoted-patterns {a, b, c} (which aren't block-texts on their own. In this case those 3 unquoted-patterns would be 3 unquoted-texts {a, b, c}.

The problem is that because of the current definition, this would instead result in 1 block-text a that has 1 child block-text b, which has 1 child block-text of its own c. This is very recursive, making the parsing go very deep in its call-stacks, and I can imagine that a block text could be big enough to actually get a callstack overflow on runtime.

My C# implementation is written by hand, so in theory I should be able to fix this by refactoring the code, to be not exactly like the EBNF grammar specified, while still getting the same result. However I was thinking to write parsers automatically based on the EBNF grammar for some other languages, and for those there wouldn't be such a solution possible.

@GlenDC
Copy link
Contributor Author

GlenDC commented Dec 6, 2016

A possible fix for this issue(?) would be to change the definition of a block-text to the following:

block-text           ::= NL __ '|' (unquoted-text | placeable)+;

@GlenDC
Copy link
Contributor Author

GlenDC commented Dec 11, 2016

Looking back at this issue, it might be that recursive definitions are normal for an EBNF grammar. For example an arglist and placeable-list are also defined recursively. While parsers will in most cases parse such cases iterative, as this is more idiomatic for most commonly used programming languages these days.

Is there a good reason this is defined recursively? As there are enough places where definitions are iterative, using the + operator. I don't see why the situations mentioned in this issue aren't defined in a similar fashion.

@GlenDC
Copy link
Contributor Author

GlenDC commented Jan 21, 2017

@stasm is this issue in fact valid, or should it be marked as a question and it can be closed with some explanation? If you want I can re-open it in https://github.com/projectfluent/syntax, as that is now the new repository of this project, no?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant