You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
SemanticBuilder and Transformer both create a large number of temporary data structures which are discarded at the end of the pass.
There are 3 problems with this:
It's expensive calling into system allocator to allocate these temp structures.
All these temp structures need to be dropped individually, which is also expensive.
All this allocation/deallocation introduces variance into our benchmarks. This makes it harder to do other perf optimization work, as our measurements are somewhat unreliable.
Proposed solution
I propose we introduce "scratch space" for storing temporary data in. This would be a separate arena which is filled up with data during the pass, and dropped all in one go at the end. This would solve all 3 of these problems.
Call it ScratchAllocator.
It's also a good alternative to #89, which we have no solution for at present.
Prior art: serde appears to do this.
Implementation
We can reuse our existing infrastructure from oxc_allocator:
ScratchAllocator would be a bumpalo Bump.
Temp structures use oxc_allocator's Vec and Box.
hashbrown's HashMap and HashSet support custom allocators in stable Rust. We don't have to implement our own hash map (thankfully!).
3 options for integrating this into our code:
Option 1: Create scratch arena at start of semantic/transformer pass
Create a ScratchAllocator at start of each Transformer pass.
I think we can align lifetimes so the &'a ScratchAllocator uses same lifetime as main &'a Allocator. So we don't need to introduce another lifetime.
Option 2: Reuse a single ScratchAllocator
Create a single ScratchAllocator and pass it in to Transformer::build and SemanticBuilder::build.
Advantages:
We re-use 1 ScratchAllocator over and over, so no overhead of allocating space for it on each Transformer::build call.
The memory backing ScratchAllocator will remain warm in CPU cache.
Disadvantages:
More complex public API. User has to create both an Allocator and a ScratchAllocator, and pass both of them around.
Option 3: Store scratch allocator inside Allocator
Same as option (2), but Allocator contains 2 arenas - main, and scratch.
Advantage: No changes to public API. User only creates one Allocator, same as now.
Disadvantage: Rolldown creates a separate Allocator per AST. Combining main arena and scratch arena into 1 structure prevents reusing the same ScratchAllocator for processing multiple ASTs.
Maybe there are additional APIs we can add for advanced use cases like Rolldown's, which allow taking the scratch arena out of an Allocator and creating a new Allocator that re-uses it - re-cycle a single scratch arena, while keeping the standard API simple.
Which?
I think probably option (3) is the best starting point.
Later on
Sometimes, a bump allocator is not ideal for our uses.
Often temp structures are created and destroyed on entering/leaving an AST node or scope. e.g. UnresolvedReferences in SemanticBuilder works like this. A stack-like allocator would be more appropriate for these cases.
In other places, temp structures are created and destroyed in non-sequential order. An allocator with a simple free list would be better for memory consumption. Maybe something like talc would be good.
We could use a bump allocator initially, and later on optimize further by using more appropriate allocators for these cases.
In meantime, we can use other mechanisms to recycle structures within the scratch arena, the way UnresolvedReferencesStack does. That may be sufficient for our needs, without changing the scratch allocator itself.
The text was updated successfully, but these errors were encountered:
The problem
SemanticBuilder
andTransformer
both create a large number of temporary data structures which are discarded at the end of the pass.There are 3 problems with this:
Proposed solution
I propose we introduce "scratch space" for storing temporary data in. This would be a separate arena which is filled up with data during the pass, and dropped all in one go at the end. This would solve all 3 of these problems.
Call it
ScratchAllocator
.It's also a good alternative to #89, which we have no solution for at present.
Prior art:
serde
appears to do this.Implementation
We can reuse our existing infrastructure from
oxc_allocator
:ScratchAllocator
would be a bumpaloBump
.oxc_allocator
'sVec
andBox
.HashMap
andHashSet
support custom allocators in stable Rust. We don't have to implement our own hash map (thankfully!).3 options for integrating this into our code:
Option 1: Create scratch arena at start of semantic/transformer pass
Create a
ScratchAllocator
at start of eachTransformer
pass.I think we can align lifetimes so the
&'a ScratchAllocator
uses same lifetime as main&'a Allocator
. So we don't need to introduce another lifetime.Option 2: Reuse a single
ScratchAllocator
Create a single
ScratchAllocator
and pass it in toTransformer::build
andSemanticBuilder::build
.Advantages:
ScratchAllocator
over and over, so no overhead of allocating space for it on eachTransformer::build
call.ScratchAllocator
will remain warm in CPU cache.Disadvantages:
Allocator
and aScratchAllocator
, and pass both of them around.Option 3: Store scratch allocator inside
Allocator
Same as option (2), but
Allocator
contains 2 arenas - main, and scratch.Advantage: No changes to public API. User only creates one
Allocator
, same as now.Disadvantage: Rolldown creates a separate
Allocator
per AST. Combining main arena and scratch arena into 1 structure prevents reusing the sameScratchAllocator
for processing multiple ASTs.Maybe there are additional APIs we can add for advanced use cases like Rolldown's, which allow taking the scratch arena out of an
Allocator
and creating a newAllocator
that re-uses it - re-cycle a single scratch arena, while keeping the standard API simple.Which?
I think probably option (3) is the best starting point.
Later on
Sometimes, a bump allocator is not ideal for our uses.
Often temp structures are created and destroyed on entering/leaving an AST node or scope. e.g.
UnresolvedReferences
inSemanticBuilder
works like this. A stack-like allocator would be more appropriate for these cases.In other places, temp structures are created and destroyed in non-sequential order. An allocator with a simple free list would be better for memory consumption. Maybe something like talc would be good.
We could use a bump allocator initially, and later on optimize further by using more appropriate allocators for these cases.
In meantime, we can use other mechanisms to recycle structures within the scratch arena, the way
UnresolvedReferencesStack
does. That may be sufficient for our needs, without changing the scratch allocator itself.The text was updated successfully, but these errors were encountered: