Count nodes, scopes, symbols, references in parser #74

overlookmotel · 2024-07-23T11:33:26Z

Where we're at

oxc-project/oxc#4367 added an AST traversal at start of building Semantic, to count how many nodes, scopes, symbols, and references there are in the AST.

It uses these counts to pre-allocate sufficient capacity in AstNodes, ScopeTree and SymbolTable before the main traversal which populates these structures with data. This avoids these structures needing to grow during the main traversal, with all the memory-copying involved in that. The result was a big boost for performance.

How we can improve on it

We can do even better by compiling these counts in parser while AST is being built, and remove the extra AST traversal that oxc-project/oxc#4367 introduced. Parser can return a Stats object as part of ParserReturn.

Experiments discussed in oxc-project/oxc#4328 (comment) and oxc-project/oxc#4328 (comment) suggest we can get a significant perf boost by removing the extra AST pass - perhaps another 15%-20% on oxc_semantic benchmarks, and maybe even more than that on RadixUIAdoptionSection.jsx benchmark (which I think is the most important one).

How to count nodes

As Boshen pointed out, counting nodes can be done in AstBuilder. There's a few complications to this, however:

Currently AstBuilder is Copy, which we'll need to change to make it stateful.

Transformer currently creates multiple copies of AstBuilder. This becomes a problem if all these copies need to call methods which require &mut AstBuilder. We don't want the overhead of RefCell. But transformer doesn't need these counts anyway - counts are clear just from nodes.len() etc.

A couple of options:

Make AstBuilder into a trait. The builder which parser uses can include node count, builder transformer uses doesn't have to. So transformer's builder can be Copy.
Only access AstBuilder in transformer via ctx.ast (where ctx is TraverseCtx), rather than self.ctx.ast. This removes the need to have multiple copies of AstBuilder.

I prefer option 2.

Also, the parser currently bypasses AST builder in some places. We'd need all AST node creation to go via the builder.

How to count scopes, symbols, references

Counting scopes, symbols and references can also be done in parser.

Accurately counting scopes is a little complicated because some scopes are conditional, so it requires some surrounding context to decide if a scope is going to be created or not. e.g. for (let key in obj) creates a scope, for (var key in obj) doesn't.

Alternatively, we can skip that complication and just count anything which could be a scope. Scopes count may then be over-estimated, but that's probably not a big problem - over-allocating capacity in ScopeTree is not very costly, it's only under-estimating which is very expensive, as it causes ScopeTree to have to grow.

The text was updated successfully, but these errors were encountered:

overlookmotel · 2024-07-23T14:56:07Z

oxc-project/oxc#4332 also had a much simpler method for calculating capacities required - guess! Probably we'll find that calculating counts accurately in parser has a very minimal overhead. But in case that's not the case, we could look at estimating counts based on source text length.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Count nodes, scopes, symbols, references in parser #74

Count nodes, scopes, symbols, references in parser #74

overlookmotel commented Jul 23, 2024

overlookmotel commented Jul 23, 2024

Count nodes, scopes, symbols, references in parser #74

Count nodes, scopes, symbols, references in parser #74

Comments

overlookmotel commented Jul 23, 2024

Where we're at

How we can improve on it

How to count nodes

How to count scopes, symbols, references

overlookmotel commented Jul 23, 2024