Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recursive stage2 parser #2405

Merged
merged 6 commits into from
May 12, 2019
Merged

Conversation

hryx
Copy link
Contributor

@hryx hryx commented May 2, 2019

TODO

  • 🏆 Finish implementation such that CI passes
  • 🥊 Remove as many TODO comments as possible by getting answers to questions in code review and creating follow-ups and tickets as needed
  • 💬 Address all review comments
  • 🚿 Squish several commits via interactive rebase to clean up history a bit
  • 📋 Fill out remaining sections of this pull request description

Closes #1729

Keep your eyes open for any TODO comments and leave your suggestions or opinions. Some of these comments will likely be retained and their corresponding changes postponed till a later PR, but maybe some of them can be solved now.

Performance impact

Here are some informal tests of parser_test.zig with perf stat -d on x86_64 Linux. In the absence of visualizations and more formal testing, some quick findings based on my system:

  • CPU cycles, instructions, branches: -55%
  • Page faults are roughly the same
  • Real time elapsed: -45%

Readability impact

Indentation stats of std/zig/parse.zig:

master                          | stage2-recursive-parser
--------------------------------+---------------------------------
indent count                    | indent count
------------                    | ------------
    0     92                    |      0   263
    1    241                    |      1   803
    2    123                    |      2   763
    3    291                    |      3   334
    4    654                    |      4   133
    5    662                    |      5    60
    6    700                    |      6    28
    7    347                    |      7     5
    8    229                    |
    9     42                    |
   10     18                    |
avg indentation level: 4.796999 | avg indentation level: 1.827543
source lines of code:      3399 | source lines of code:      2389
  • Maximum indentation level went from 10 to 7
  • Average indentation level went from 4.79 to 1.82
  • Lines of code (excluding blank/comments) went from 3399 to 2389

Follow-up items

Because this was a heavy change, a few ideas and concerns came up along the way which would unnecessarily increase the scope of this PR or add diff noise but which I would like to address.

 PrefixTypeOp
     <- QUESTIONMARK
      / KEYWORD_promise MINUSRARROW
-     / ArrayTypeStart (ByteAlign / KEYWORD_const / KEYWORD_volatile / KEYWORD_allowzero)*
+     / ArrayTypeStart
+     / SliceTypeStart (ByteAlign / KEYWORD_const / KEYWORD_volatile / KEYWORD_allowzero)*
      / PtrTypeStart (KEYWORD_align LPAREN Expr (COLON INTEGER COLON INTEGER)? RPAREN / KEYWORD_const / KEYWORD_volatile / KEYWORD_allowzero)*
  • Implement disallowing duplicate pointer qualifiers in stage1. Add this test to test/compile_errors.zig:
cases.add(
    "duplicate qualifiers",
    \\test "duplicates" {
    \\    var a: *align(4) align(4) i32 = 0;
    \\    var b: *const const i32 = 0;
    \\    var c: *volatile volatile i32 = 0;
    \\    var d: *allowzero allowzero i32 = 0;
    \\}
,
    "tmp.zig:2:22: error: Extra align qualifier",
    "tmp.zig:3:19: error: Extra const qualifier",
    "tmp.zig:4:22: error: Extra volatile qualifier",
    "tmp.zig:5:23: error: Extra allowzero qualifier",
);
  • The existing parse error system is somewhat cumbersome: Every possible error is its own type, has entries in the switches for the render() and loc() methods, and exists in the ast.Error union. If it's possible, it would be more flexible, concise, legible, and expressive to generate errors from string literals at the site of the error itself. Something like this:
const expr_node = try expectNode(arena, it, tree, parseExpr, "expression or assignment");

Miscellany

🙇‍♂️ Massive thanks to @tyler569 for unblocking me by finding a memory leak I caused along the way.

🌹 I am so sorry I killed your art @Hejsil

std/zig/parse2.zig Outdated Show resolved Hide resolved
@Hejsil
Copy link
Contributor

Hejsil commented May 2, 2019

Exciting. I'll look through all of this once you're done :)

@andrewrk
Copy link
Member

andrewrk commented May 8, 2019

All tests passing 🎉 🎉

Ready for some review?

@hryx hryx marked this pull request as ready for review May 8, 2019 22:53
@hryx
Copy link
Contributor Author

hryx commented May 8, 2019

I still have some work to do to finish the PR description, but since this a big one, I suppose it's time to get started. Let's do it!

std/zig/parse.zig Outdated Show resolved Hide resolved
std/zig/parse.zig Outdated Show resolved Hide resolved
@andrewrk
Copy link
Member

andrewrk commented May 9, 2019

This branch has too many commits. How should the git history be cleaned up when ready to merge?

For a branch this large, if there are conflicts, merge master into the branch, then fix the conflicts. If there are no conflicts - which it looks to be the case - no work is needed on your part, I'll merge it into master with a merge commit.

Copy link
Member

@andrewrk andrewrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks amazing. I left a few comments, but I already consider this ready to merge. I'll give @Hejsil a chance to take a look, and then I think it's good to go!

std/zig/parse.zig Outdated Show resolved Hide resolved
std/zig/parse.zig Outdated Show resolved Hide resolved
std/zig/parse.zig Outdated Show resolved Hide resolved
std/zig/parse.zig Show resolved Hide resolved
std/zig/parse.zig Outdated Show resolved Hide resolved
Copy link
Contributor

@Hejsil Hejsil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @hryx. I have a few comments.

std/zig/parse.zig Outdated Show resolved Hide resolved
std/zig/parse.zig Outdated Show resolved Hide resolved
std/zig/parse.zig Outdated Show resolved Hide resolved
std/zig/parse.zig Outdated Show resolved Hide resolved
std/zig/parse.zig Outdated Show resolved Hide resolved
std/zig/parse.zig Outdated Show resolved Hide resolved
std/zig/parse.zig Outdated Show resolved Hide resolved
std/zig/parse.zig Outdated Show resolved Hide resolved
std/zig/parse.zig Show resolved Hide resolved
std/zig/parse.zig Outdated Show resolved Hide resolved
@hryx
Copy link
Contributor Author

hryx commented May 10, 2019

Thanks for the review and great comments @Hejsil & @andrewrk. Been a busy week but I will have time to address all these comments this weekend. Looking forward to wrapping up!

@Sahnvour
Copy link
Contributor

* This branch has too many commits. How should the git history be cleaned up when ready to merge?

Looking through the commits, I think you can first reduce their number by doing an interactive rebase and use many small commits as fixups for previous ones.

@hryx
Copy link
Contributor Author

hryx commented May 11, 2019

Thanks @Sahnvour, that's my plan now. After addressing all these review comments today, I'll turn a bunch of commits into fixups. But since it sounds like having several commits is allowed, I won't obsess over getting it perfect.

@hryx
Copy link
Contributor Author

hryx commented May 12, 2019

Those ❌ marks gave me a good scare; all tests pass locally. I assume Azure cancels a run when a newer commit is pushed on the branch.

hryx and others added 5 commits May 12, 2019 01:52
The `arena` instance being used bythe parse tree was valid and
pointed to valid memory, but existed as a local variable inside the
stack frame of the `parse` function (the `const arena`), which was never
stored anywhere before leaving the scope.

This meant that code above the `parse` function saw a valid instance of
an `ArenaAllocator` that pointed to the same backing memory, but didn't
posess any of the local state built up after the call to `parseRoot`,
basically the caller saw an empty arena.

This meant that when `deinit` was called, it saw an Arena with 0
allocations in it's `buffer_list` and wasn't able to destroy any of the
memory.  This caused it to leak and caused FailingAllocator to balk.

The fix is to make sure the parse tree is using the same instance of
ArenaAllocator as is reported up the call stack, the one inside the
`Tree{}` object.  I'm not sure why that field is marked with a comment
to remove it, as it's used by the `std.ast.Tree.deinit()` function, but
this change seems to solve the problem.
@hryx hryx force-pushed the stage2-recursive-parser branch from be0f97e to 0d62942 Compare May 12, 2019 09:33
@hryx
Copy link
Contributor Author

hryx commented May 12, 2019

The git history has been cleaned up. Please don't ask me how I did it. The original 138 commits can be seen at this tag: master...hryx:stage2-recursive-parser-pre-cleanup

Regarding the most recent changes: some other files did get touched, either to address review comments or to make the parser conform more strictly to the grammar rules. Either way, I now consider this ready to merge if there are no more concerns or blockers!

and now, art

⬜ ⬜ ⬜ ⬜ 🌞 ⬜ ⬜ ⬜
⬜ ⬜ ⬜ ⬜ ⬜ ⬜ ⬜ ⬜
⬜ 🎅 💦 ⬜ ⬜ ⬜ ⬜ ⛵
🌊 🌊 🌊 🌊 🌊 🦈 🌊 🌊

std/zig.zig Outdated Show resolved Hide resolved
std/zig/ast.zig Show resolved Hide resolved
std/zig/render.zig Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

rewrite the stage2 parser to be recursive and adhere to the official grammar
8 participants