Lexer actions and semantic predicates are executed out of order #3611
Comments
I've run into this problem before and suspected I was using ANTLR4 incorrectly, but now it looks like it might be a bug in ANTLR4. I used an action to track the currently matched question number and to compute the next question number, which serves as the boundary of the current question's content, but it didn't work.
TestParser.g4:
TestLexer.g4:
TestParseTest.java:
import org.antlr.v4.runtime.CharStream;
import org.antlr.v4.runtime.CharStreams;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.Lexer;
import org.antlr.v4.runtime.tree.ParseTree;

public class TestParseTest {
    public static void main(String[] args) {
        CharStream charStream = CharStreams.fromString("1. hahaha\n" +
                "2. hahaha\n");
        Lexer lexer = new TestLexer(charStream);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        TestParser parser = new TestParser(tokens);
        ParseTree parseTree = parser.root();
        System.out.println(parseTree.toStringTree(parser));
    }
}
The output is as follows:
The exception above indicates that the semantic predicate was evaluated before the action was executed, which is not what I want. I want the action to be executed before the final semantic predicate.
@parrt Even if we correct #3606, actions and predicates are going to be executed out of order because actions are queued whereas semantic predicates are not. This should also be addressed (documented, checked+flagged as error, or changed to be more like a parser).
OK, the book makes this clear but the online doc does not. (@kaby76, can I send you a copy?) I'll update the online doc. It seems actions in lexers should only appear at the right edge of a rule, so I should add an error message or something. The general rule is that predicates cannot be a function of actions that are visible when making a parsing decision. Actions are never executed while a parsing decision is being made, only during actual parsing or lexing.
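To make that rule concrete, here is a minimal Java sketch of a predicate that depends only on the characters matched so far, never on state an action would have updated. It is a toy matcher, not ANTLR code; the class name `InputDrivenPredicateDemo`, the `EXACTLY_THREE_AS` predicate, and the "exactly three 'a's" condition are all assumptions made for illustration.

```java
import java.util.function.Predicate;

public class InputDrivenPredicateDemo {
    // Hypothetical predicate: inspects only the text matched so far,
    // never a field that an embedded action would have updated.
    static final Predicate<String> EXACTLY_THREE_AS =
            text -> text.chars().filter(ch -> ch == 'a').count() == 3;

    /** Toy matcher: accepts a run of 'a' characters only if the predicate holds. */
    static boolean match(String input) {
        int i = 0;
        while (i < input.length() && input.charAt(i) == 'a') {
            i++;
        }
        // The whole input must be consumed, and the predicate sees only that text.
        return i == input.length() && EXACTLY_THREE_AS.test(input.substring(0, i));
    }

    public static void main(String[] args) {
        System.out.println(match("aaa"));   // true
        System.out.println(match("aaaa"));  // false
    }
}
```

Because the predicate is a pure function of the matched text, its result does not depend on when, or whether, any action has run.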
Actually it looks like some of the book has been copied in doc already:
Hmm... On the other hand, I see that we fixed a bug where we had actions in the middle of a lexer rule:
…tions and semantic predicates; Fixes antlr#3611. Fixes antlr#3606. Signed-off-by: Terence Parr <[email protected]>
This should be reopened since the introduction of the warning.
This concerns lexer rules that contain a mix of actions and semantic predicates within one rule. It's somewhat related to #3606, insofar as I found this problem while debugging that one.
Suppose we have the following grammars:
lexer:
parser:
input:
The expectation here is that the lexer counts the number of 'a's in the input and produces a valid token only when there are exactly three 'a's. Yes, it is contrived, but it illustrates a deep ordering assumption that I did not know about, even after using ANTLR for many, many years.
Unfortunately, the parser does not work. The lexer's ExecATN() evaluates the semantic predicate first, before the action is executed. It does this because semantic predicates are evaluated "on the fly", while actions are queued up and executed at the end of the function. I don't understand why actions are queued up at all, especially since people often refer to "semantic predicates" as "actions" of a special type.
This is not "referentially transparent": the actions are not interleaved with the semantic predicates at evaluation time, even though the action is listed in the rule's RHS before the semantic predicate. Normal "users" would expect actions and semantic predicates to be evaluated in the order in which they occur on the RHS of the rule.
In this example, the rule never matches, and it is impossible to parse anything.
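The ordering problem described above can be reproduced outside ANTLR with a small Java sketch. This is a toy model under stated assumptions, not ANTLR's implementation: the class `DeferredActionDemo`, its `count` field, and the rule shape shown in the comments are hypothetical stand-ins for a rule that increments a counter in an action and then checks it in a predicate.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.BooleanSupplier;

/** Toy model (NOT ANTLR's code): predicates run immediately, actions are queued. */
public class DeferredActionDemo {
    private int count = 0;                                   // state the action mutates
    private final Deque<Runnable> pending = new ArrayDeque<>();

    /** Hypothetical rule shape: ('a' {count++;})+ {count == 3}? with queued actions. */
    public boolean matchDeferred(String input) {
        count = 0;
        pending.clear();
        for (char c : input.toCharArray()) {
            if (c != 'a') return false;
            pending.add(() -> count++);                      // queued, not executed yet
        }
        BooleanSupplier predicate = () -> count == 3;
        boolean ok = predicate.getAsBoolean();               // runs now: count is still 0
        if (ok) {
            while (!pending.isEmpty()) pending.poll().run(); // actions run only on accept
        }
        return ok;
    }

    /** Same rule shape, with each action executed in the order it is written. */
    public boolean matchInterleaved(String input) {
        count = 0;
        for (char c : input.toCharArray()) {
            if (c != 'a') return false;
            count++;                                         // action runs immediately
        }
        return count == 3;                                   // predicate sees updated state
    }

    public static void main(String[] args) {
        DeferredActionDemo demo = new DeferredActionDemo();
        System.out.println(demo.matchDeferred("aaa"));       // false
        System.out.println(demo.matchInterleaved("aaa"));    // true
    }
}
```

In the deferred variant the predicate always sees the initial counter value, so no input can ever satisfy it, which mirrors the "rule never matches" outcome reported here.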
I did a quick search in the grammars-v4 repository to see if there are rules like this. It's not an extensive search, but there is one here.