Add a LALR grammar for Rust with testing support #21452
(rust_highfive has picked a reviewer for you, use r? to override) |
😍 |
+10000, so glad to finally see this work start to move in-tree. |
Wow, this is awesome. Did any of these tools find any ambiguity in the grammar? |
+1 |
Whoa...! |
Strictly speaking, this isn't a LALR grammar for Rust, because it relies on the use of a |
This looks pretty awesome. What are the prospects for testing that files the Rust parser rejects are also rejected by the grammar? |
I love where this is headed! |
The grammar as it stands has a number of S/R conflicts, but they are all resolved through use of the precedence features in bison to (hopefully) match how the production rust parser works in these situations. As far as testing goes, the testing script does not do negative tests for programs that should fail to parse, but that feature can easily be added to the script. We can do that using programs in the compile-fail directory, however not all files there fail to parse as they are meant to fail in a later stage of compilation. We can check whether it's supposed to parse first with |
This adds a new lexer/parser combo for the entire Rust language that can be generated with flex and bison, taken from my project at https://github.com/bleibig/rust-grammar. There is also a testing script that runs the generated parser on all *.rs files in the repository (except for tests in compile-fail or ones that are marked as "ignore-test" or "ignore-lexer-test"). If you have flex and bison installed, you can run these tests using the new "check-grammar" make target.
This does not depend on or interact with the existing testing code in the grammar, which only provides and tests a lexer specification.
OS X users should take note that the version of bison that comes with the Xcode toolchain (2.3) is too old to work with this grammar; they need to download and install version 3.0 or later.
The parser builds up an S-expression-based AST, which can be displayed by giving the "-v" argument to parser-lalr (normally it only gives output on error). It is only a rough approximation of what is parsed and doesn't capture every detail and nuance of the program.
Hopefully this should be sufficient for issue #2234, or at least a good starting point.
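The file selection rule described above could be sketched roughly as follows. This is only an illustration, not the testing script from the PR: the function name `should_parse` is hypothetical, and matching the ignore markers anywhere in the file text is an assumption about how they are detected.

```rust
/// Hypothetical helper: decide whether a file should be fed to the generated parser.
/// `path` is the file's path relative to the repository root; `contents` is its text.
fn should_parse(path: &str, contents: &str) -> bool {
    // Only Rust sources are candidates.
    if !path.ends_with(".rs") {
        return false;
    }
    // Files under compile-fail are meant to be rejected at some stage of
    // compilation, not necessarily by the parser, so they are skipped.
    if path.split('/').any(|component| component == "compile-fail") {
        return false;
    }
    // Assumption: a marker counts wherever it appears in the file text.
    !(contents.contains("ignore-test") || contents.contains("ignore-lexer-test"))
}
```

For example, `should_parse("src/test/compile-fail/bad.rs", "fn main() {}")` returns `false`, while an ordinary source file without markers returns `true`.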
Where should one send changes to grammar now? bleibig/rust-grammar or rust-lang/rust? |
rust-lang/rust |
This grammar is likely not LALR(1). When an ambiguity exists in an LALR or LR grammar, it could be resolved in two ways, either by:
In the first case, the grammar is guaranteed to be (LA)LR; in the second, it may or may not be. This Rust grammar resolves ambiguities with precedence, so it might not be (LA)LR.
But, practically speaking, most (LA)LR parser generators allow you to resolve grammar ambiguities with precedence, so this is probably not a big deal. |
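To make the ambiguity concrete, here is a small sketch (not taken from the PR) that enumerates every parse of a subtraction chain under the ambiguous rule `E -> E '-' E | NUM` and evaluates each one. The function name `all_parses` is hypothetical. Different parse trees give different values, which is the kind of conflict a precedence or associativity declaration settles by picking a single tree.

```rust
/// Evaluate every parse of `n0 - n1 - ... - nk` under the ambiguous rule
/// `E -> E '-' E | NUM`, returning one value per parse tree.
fn all_parses(nums: &[i64]) -> Vec<i64> {
    if nums.len() == 1 {
        return vec![nums[0]];
    }
    let mut results = Vec::new();
    // Choose which '-' is the root of the tree: it splits the operands
    // into a left part nums[..split] and a right part nums[split..].
    for split in 1..nums.len() {
        for left in all_parses(&nums[..split]) {
            for right in all_parses(&nums[split..]) {
                results.push(left - right);
            }
        }
    }
    results
}
```

For `1 - 2 - 3` this yields two values: `2` from the tree `1 - (2 - 3)` and `-4` from `(1 - 2) - 3`. A `%left '-'` declaration in bison selects the second, left-leaning tree.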
This grammar seems to define assignment and compound assignment operators as left-associative (which corresponds to the reference description); however, this example confirms that they are right-associative:
fn main() {
let mut u: ();
let mut a: u8;
u = (a = 2); // right associativity
//(u = a) = 2; left associativity, doesn't work
u = a = 2; // this works, so it must be right-associative
print!("{} {}", a, u == ());
} |
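The two candidate trees for `u = a = 2` can be sketched with a toy parser for assignment chains. This is an illustration only: it is not the bison grammar, the names `Assoc` and `parse_assign` are hypothetical, and the S-expression shape is an assumption rather than the actual output format of parser-lalr.

```rust
enum Assoc {
    Left,
    Right,
}

/// Parse a whitespace-separated chain such as "u = a = 2" and render the
/// resulting tree as an S-expression. Returns None on malformed input.
fn parse_assign(input: &str, assoc: Assoc) -> Option<String> {
    let tokens: Vec<&str> = input.split_whitespace().collect();
    // Expect operand ('=' operand)*: an odd number of tokens,
    // with '=' at every odd position and an operand at every even one.
    if tokens.len() % 2 == 0 {
        return None;
    }
    for (i, tok) in tokens.iter().enumerate() {
        if (i % 2 == 1) != (*tok == "=") {
            return None;
        }
    }
    let operands: Vec<&str> = tokens.iter().step_by(2).copied().collect();
    let tree = match assoc {
        // Left: fold from the front, giving ((u = a) = 2).
        Assoc::Left => {
            let mut iter = operands.iter();
            let first = iter.next()?.to_string();
            iter.fold(first, |acc, rhs| format!("(= {} {})", acc, rhs))
        }
        // Right: fold from the back, giving (u = (a = 2)).
        Assoc::Right => {
            let mut iter = operands.iter().rev();
            let last = iter.next()?.to_string();
            iter.fold(last, |acc, lhs| format!("(= {} {})", lhs, acc))
        }
    };
    Some(tree)
}
```

With `Assoc::Right` the input `u = a = 2` becomes `(= u (= a 2))`, which matches the form rustc accepts in the example above. With `Assoc::Left` it becomes `(= (= u a) 2)`, the form that the commented-out line shows does not work.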
// prefix_exprs
%precedence RETURN

%left '=' SHLEQ SHREQ MINUSEQ ANDEQ OREQ PLUSEQ STAREQ SLASHEQ CARETEQ PERCENTEQ
This is where the erroneous left-associativity of the assignment operators is defined.