Separate semantic analysis and parsing #370

jyn514 · 2020-04-12T04:52:42Z

This is basically a rewrite of the entire parser. ~~It is very much a work in progress and all the tests are failing.~~ Additionally, I think I lost some of the unit tests during the rewrite and need to find them in the git history. That said, most of the work for the rewrite is done at this point; everything left is just cleanup.

Changes

Most of these could be broken up into separate PRs, except for the main 'separate parsing and analysis' change.

Instead of having ExprType::{Add(left, right), Sub(left, right), ...}, have ExprType::Binary(BinaryOp, left, right), which makes life much easier for the constant folder and backend. This also makes the code much easier to understand since BinaryOp is now a struct instead of a function.
Clean up declaration_specifiers a lot. This is now sane and won't randomly miss specifiers if they occur in the wrong order. Cannot be in a separate PR.
Add benchmark for nested parentheses. rcc now reliably handles 3000+ parentheses in a row when before it had trouble even with 300.
Implement limited span merging. This merges spans for expressions, statements, and declarations but is not very good at retrieving the original subspans. For example, this will show the whole function declarator as an error:

int f(int i, int j, void);
<stdin>:1:6 error: invalid program: void must be the first and only parameter if specified
int f(int i, int j, void);
     ^^^^^^^^^^^^^^^^^^^^

Correctly parse qualifiers for pointers and variables (Parse qualifiers correctly #347). Cannot be in a separate PR.
Remove the unused Type::Bitfield
Rename the AssignmentToken variants to not look dumb (AddAssign instead of PlusAssign)
Move most of the scaffolding for typedefs earlier in the parser to avoid having enormous match statements everywhere. This also makes is_decl_specifier much more reliable. The cost, of course, is that it's super hacky, but it works very reliably.
The backend is no longer responsible for desugaring complex assignment (good riddance!). Cannot be in a separate PR.
Variables are now given a Metadata when they are declared which is reused across scopes. This means the backend no longer has to have any idea of what a scope is, making it easier to do codegen. In particular, there are no more bugs where the frontend's scope is different from the backend's scope.
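
To illustrate why a single ExprType::Binary variant is easier on the constant folder, here is a minimal sketch. The types are simplified stand-ins, not rcc's real definitions: the Literal and Id variants, the three operators, and the fold function are assumptions made up for this example.

```rust
/// Hypothetical, trimmed-down operator set (the real one has many more).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum BinaryOp {
    Add,
    Sub,
    Mul,
}

/// Hypothetical, trimmed-down expression type.
#[derive(Clone, Debug, PartialEq)]
pub enum ExprType {
    Literal(i64),
    Id(String),
    Binary(BinaryOp, Box<ExprType>, Box<ExprType>),
}

/// Fold constant subexpressions bottom-up.
/// One match arm covers every binary operator; only the arithmetic differs.
pub fn fold(expr: ExprType) -> ExprType {
    match expr {
        ExprType::Binary(op, left, right) => {
            let (left, right) = (fold(*left), fold(*right));
            if let (ExprType::Literal(a), ExprType::Literal(b)) = (&left, &right) {
                let folded = match op {
                    BinaryOp::Add => a.checked_add(*b),
                    BinaryOp::Sub => a.checked_sub(*b),
                    BinaryOp::Mul => a.checked_mul(*b),
                };
                // On overflow, leave the expression unfolded.
                if let Some(value) = folded {
                    return ExprType::Literal(value);
                }
            }
            ExprType::Binary(op, Box::new(left), Box::new(right))
        }
        other => other,
    }
}
```

With one variant per operator (Add(left, right), Sub(left, right), ...), the recursion and the literal check above would have to be repeated in every arm.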

Action Items

These should be reverted or fixed before merging.

The lexer now turns identifiers into keywords again. This broke the preprocessor and needs to be reverted; it was mostly for testing.
I tried making the Location a trait instead of a type. This failed miserably.
Added derive_more for displays. I only used it in one trivial place, so it should either take the place of impl Display for Expr or be removed altogether before this is merged.
codespan is used only for storing the Files table. This is kind of a waste and I should either switch to codespan-reporting once and for all or write my own Files real quick to remove the unnecessary dependency. I'll make a follow-up PR for this; this one is big enough.
There is still a lot of commented-out code that needs to be removed.
The Lexer needs a design decision. Right now it implements Iterator<Item = Result<Token, LexError>>, which I like because it reflects its semantic purpose. However, the parser only accepts Iterator<Item = Result<Token, CompileError>>. ~~Either the lexer needs to yield CompileError or there needs to be a wrapper that turns all the LexErrors into the more generic CompileError.~~ I went with a third option: the parser accepts any iterator over Result<Token, E> where E: Into<CompileError>.
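
The third option for the lexer/parser boundary can be sketched roughly as follows. Everything here is a simplified assumption for illustration: the two-variant Token, the fields of LexError and CompileError, and the parse_sum and expect_int functions are hypothetical and not taken from the rcc source.

```rust
// Hypothetical, heavily simplified stand-ins for the real types.
#[derive(Debug, PartialEq)]
pub enum Token {
    Int(i64),
    Plus,
}

#[derive(Debug, PartialEq)]
pub struct LexError(pub String);

#[derive(Debug, PartialEq)]
pub enum CompileError {
    Lex(LexError),
    Syntax(String),
}

impl From<LexError> for CompileError {
    fn from(err: LexError) -> Self {
        CompileError::Lex(err)
    }
}

/// Require the next token to be an integer literal.
fn expect_int(tok: Option<Result<Token, CompileError>>) -> Result<i64, CompileError> {
    match tok {
        Some(Ok(Token::Int(n))) => Ok(n),
        Some(Ok(other)) => Err(CompileError::Syntax(format!(
            "expected integer, found {:?}",
            other
        ))),
        Some(Err(err)) => Err(err),
        None => Err(CompileError::Syntax("unexpected end of input".to_string())),
    }
}

/// Toy "parser" for `int (+ int)*`, generic over the token source's error type.
pub fn parse_sum<I, E>(tokens: I) -> Result<i64, CompileError>
where
    I: IntoIterator<Item = Result<Token, E>>,
    E: Into<CompileError>,
{
    // Convert the source's error type once, at the boundary.
    let mut tokens = tokens
        .into_iter()
        .map(|res| res.map_err(Into::<CompileError>::into));

    let mut total = expect_int(tokens.next())?;
    while let Some(tok) = tokens.next() {
        match tok? {
            Token::Plus => total += expect_int(tokens.next())?,
            other => {
                return Err(CompileError::Syntax(format!(
                    "expected `+`, found {:?}",
                    other
                )))
            }
        }
    }
    Ok(total)
}
```

Because every type is trivially Into itself, the same function accepts a token stream that already yields CompileError, so neither the lexer nor a preprocessor in between has to change its error type.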

Issues

Closes

#139 is fixed in the parser but now crashes because cranelift hasn't implemented boolean ops (bytecodealliance/wasmtime#1133)

Addresses #59, but only as a misfeature - it ignores all of the keywords listed there.

Makes a great deal of progress towards #266.

cc @pythondude325

jyn514 · 2020-04-26T19:13:33Z

src/analyze/mod.rs

+                self.declarations.typedefs.insert(id, ());
+            } else if ctype == Type::Void {
+                // TODO: catch this error for types besides void?
+                self.err(SemanticError::VoidType, location);


Not quite sure what I was thinking with the comment here. The error is for void i; and things like that.

pythongirl325

Some comments

pythongirl325 · 2020-04-29T03:15:18Z

src/analyze/mod.rs

+        self.parse_typename(ctype, location)
+    }
+    // TODO: I don't think this is a very good abstraction
+    fn parse_typename(&mut self, ctype: ast::TypeName, location: Location) -> Type {


Should this (and the following) parsing function even be in the analysis module?

Yes, why wouldn't they be? These are helper functions for other functions in analyze, but I don't see why they would go somewhere else.

pythongirl325 · 2020-04-29T03:41:27Z

src/analyze/mod.rs

+            },
+        };
+        let mut storage_class = None;
+        for (spec, sc) in &[


I think this loop can be rewritten to be a little clearer, maybe even with iterators.
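
For illustration only, one iterator-based shape such a loop could take is sketched below. The body of the original loop is not visible in the diff excerpt, so this is a guess at its purpose (picking at most one storage class out of the declaration specifiers); the Specifier and StorageClass enums and both functions are hypothetical.

```rust
/// Hypothetical subset of declaration specifiers.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Specifier {
    Static,
    Extern,
    Typedef,
    Int,
    Const,
}

/// Hypothetical subset of storage classes.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum StorageClass {
    Static,
    Extern,
    Typedef,
}

/// Map a specifier to its storage class, if it names one.
fn storage_class_of(spec: Specifier) -> Option<StorageClass> {
    match spec {
        Specifier::Static => Some(StorageClass::Static),
        Specifier::Extern => Some(StorageClass::Extern),
        Specifier::Typedef => Some(StorageClass::Typedef),
        _ => None,
    }
}

/// Return the single storage class among `specifiers`, or an error if
/// more than one storage-class specifier is present.
pub fn storage_class(specifiers: &[Specifier]) -> Result<Option<StorageClass>, String> {
    let mut classes = specifiers.iter().copied().filter_map(storage_class_of);
    match (classes.next(), classes.next()) {
        (Some(first), Some(second)) => Err(format!(
            "conflicting storage classes: {:?} and {:?}",
            first, second
        )),
        (first, _) => Ok(first),
    }
}
```

Here filter_map replaces the explicit table of (specifier, storage class) pairs and the mutable storage_class variable.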

pythongirl325 · 2020-04-29T03:51:23Z

src/analyze/mod.rs

+                None => ctype = Some(Type::Int(signed)),
+            }
+        }
+        // i;


Can you give this comment some context?

i; (in a scope where i has not previously been declared) declares a new variable called i with type int. This 'feature' was removed in C99 but is still common in real-world code.

http://port70.net/~nsz/c/c99/n1256.html#Forewordp5

The bit that allows this in C89 is 3.5.2:

int , signed , signed int , or no type specifiers

See https://stackoverflow.com/questions/26488502/which-section-in-c89-standard-allows-the-implicit-int-rule
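
A minimal sketch of the implicit-int rule, assuming a much smaller specifier set than the real analyzer handles. Only the Type::Int(signed) shape comes from the diff above; TypeSpecifier, the Char variant, and the resolve function are hypothetical names for this example.

```rust
/// Hypothetical subset of C type specifiers.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum TypeSpecifier {
    Int,
    Char,
    Signed,
    Unsigned,
}

/// The `bool` records signedness, mirroring `Type::Int(signed)` in the diff.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Type {
    Int(bool),
    Char(bool),
}

/// Resolve a list of type specifiers to a type.
/// Simplification: plain `char` is treated as signed here; in C its
/// signedness is implementation-defined.
pub fn resolve(specifiers: &[TypeSpecifier]) -> Type {
    let signed = !specifiers.contains(&TypeSpecifier::Unsigned);
    let mut ctype = None;
    for spec in specifiers {
        match spec {
            TypeSpecifier::Int => ctype = Some(Type::Int(signed)),
            TypeSpecifier::Char => ctype = Some(Type::Char(signed)),
            TypeSpecifier::Signed | TypeSpecifier::Unsigned => {}
        }
    }
    // `i;`, `signed i;`, `unsigned i;`: no base type was given, so fall
    // back to int (the C89 implicit-int rule).
    ctype.unwrap_or(Type::Int(signed))
}
```

An empty specifier list, as in i;, therefore resolves to the same type as int i;.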

jyn514 added 30 commits February 29, 2020 14:51

separate AST into separate crate

4240519

some things probably work

86e251e

[BROKEN] start impl Display for Expr

81e9f99

Finished impl Display

b6ae53d

[broken] fixed a little of binary expr

456a29b

it compiles at least

8ea68fe

it turns out there was a reason the algorithm had higher precedence as

1b66aaf

higher numeric values you can't have a precedence that's less than 0, this crashes: 1 = 2 = 3 + 4*5 + 1

It works!

b5a9239

ternaries!

bcf0303

rustfmt

143fa71

ternary made this take up less stack space :)

1010e90

change script for new root

80c700d

Move paren test to benchmarks

a4c9626

This automatically runs it with --release

BinaryPrecedence -> Precedence

cc475aa

match_unary -> match_prefix

ed60b38

Add postfix expressions

0ef045c

add struct deref

092976a

add ++, --

a02878f

add array indexing

1f29941

add tests

94a7f71

Add function calls

7c7dd9d

add missing files

aeed05e

[BROKEN] start on prefix precedence

78aa638

Add alignof to AST

d8dc20a

[probably broken] implement cast_expr, sizeof, pre-increment

4261f18

Parse keywords again

bda24bb

Start on parsing types

6e6f78c

Specifiers are mostly working

0804256

Appease rustfmt

59fa70c

Appease clippy

3b074ac

jyn514 added 2 commits April 26, 2020 13:14

Remove dead code

ec4c5c8

Remove unnecessary clippy allows

cc250c1

pythongirl325 self-requested a review April 26, 2020 17:23

jyn514 added 7 commits April 26, 2020 14:13

Add tests for fixed issues

94d57ae

Get rid of parse_binary

3914775

It was only 3 lines and only used in 1 place

Mark typedefs in expression context as having Type::Error

bc9fe1a

No known bugs this fixes, but there shouldn't be invalid expressions with a valid type.

Fix outdated comment

bf31f68

Add comments to analyze/expr.rs

1b92658

Add comments to analyze/mod.rs; fix linkage bug

d29ff43

Fix impl Display for Qualifiers

af62d8c

jyn514 commented Apr 27, 2020

jyn514 added 10 commits April 26, 2020 20:12

Add more comments to analyze/mod.rs

b52c394

Add more comments to analyze/mod.rs

c171f96

Allow parsing typedefs in an inner scope

e67fe84

This is _such_ a hack

Separate primary_expr and postfix_expr

1251fca

Use iteration instead of recursion for parsing sizeof

d23dd5c

Avoids segfaults (in the parser) for highly nested `sizeof`. Still segfaults in the analyzer.

Allow long int i

8806487

Add back missing tests for const_fold

cb4069d

_technically_ they were never removed https://xkcd.com/1475/

Remove useless assignment

b2b19b3

Where is your clippy now?

Add some more comments to src/analyze/mod.rs

ccadd3a

Add more comments to analyzer

fb3670b

pythongirl325 reviewed May 1, 2020

jyn514 mentioned this pull request May 1, 2020

Make function qualifiers part of FunctionType #391

Open

jyn514 added 3 commits May 1, 2020 14:46

Even more comments

6ec7ef4

Merge branch 'master' into pratt-parsing

44f2725

Fix clippy warnings

6ffd969

jyn514 merged commit baeceb2 into master May 1, 2020

jyn514 deleted the pratt-parsing branch May 1, 2020 20:00

jyn514 mentioned this pull request May 1, 2020

[ICE] infinite recursion on union containing itself #261

Open

jyn514 mentioned this pull request May 13, 2020

[ICE] crash on function initializers #259

Closed
