Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Count nodes, scopes, symbols, references in parser #74

Open
overlookmotel opened this issue Jul 23, 2024 · 1 comment
Open

Count nodes, scopes, symbols, references in parser #74

overlookmotel opened this issue Jul 23, 2024 · 1 comment

Comments

@overlookmotel
Copy link

Where we're at

oxc-project/oxc#4367 added an AST traversal at start of building Semantic, to count how many nodes, scopes, symbols, and references there are in the AST.

It uses these counts to pre-allocate sufficient capacity in AstNodes, ScopeTree and SymbolTable before the main traversal which populates these structures with data. This avoids these structures needing to grow during the main traversal, with all the memory-copying involved in that. The result was a big boost for performance.

How we can improve on it

We can do even better by compiling these counts in parser while AST is being built, and remove the extra AST traversal that oxc-project/oxc#4367 introduced. Parser can return a Stats object as part of ParserReturn.

Experiments discussed in oxc-project/oxc#4328 (comment) and oxc-project/oxc#4328 (comment) suggest we can get a significant perf boost by removing the extra AST pass - perhaps another 15%-20% on oxc_semantic benchmarks, and maybe even more than that on RadixUIAdoptionSection.jsx benchmark (which I think is the most important one).

How to count nodes

As Boshen pointed out, counting nodes can be done in AstBuilder. There's a few complications to this, however:

Currently AstBuilder is Copy, which we'll need to change to make it stateful.

Transformer currently creates multiple copies of AstBuilder. This becomes a problem if all these copies need to call methods which require &mut AstBuilder. We don't want the overhead of RefCell. But transformer doesn't need these counts anyway - counts are clear just from nodes.len() etc.

A couple of options:

  1. Make AstBuilder into a trait. The builder which parser uses can include node count, builder transformer uses doesn't have to. So transformer's builder can be Copy.
  2. Only access AstBuilder in transformer via ctx.ast (where ctx is TraverseCtx), rather than self.ctx.ast. This removes the need to have multiple copies of AstBuilder.

I prefer option 2.

Also, the parser currently bypasses AST builder in some places. We'd need all AST node creation to go via the builder.

How to count scopes, symbols, references

Counting scopes, symbols and references can also be done in parser.

Accurately counting scopes is a little complicated because some scopes are conditional, so it requires some surrounding context to decide if a scope is going to be created or not. e.g. for (let key in obj) creates a scope, for (var key in obj) doesn't.

Alternatively, we can skip that complication and just count anything which could be a scope. Scopes count may then be over-estimated, but that's probably not a big problem - over-allocating capacity in ScopeTree is not very costly, it's only under-estimating which is very expensive, as it causes ScopeTree to have to grow.

@overlookmotel
Copy link
Author

oxc-project/oxc#4332 also had a much simpler method for calculating capacities required - guess! Probably we'll find that calculating counts accurately in parser has a very minimal overhead. But in case that's not the case, we could look at estimating counts based on source text length.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant