Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ThinVec type #113

Open
overlookmotel opened this issue Sep 11, 2024 · 2 comments
Open

ThinVec type #113

overlookmotel opened this issue Sep 11, 2024 · 2 comments

Comments

@overlookmotel
Copy link

overlookmotel commented Sep 11, 2024

The problem

There are various parts of AST where we have a Vec which is usually empty. e.g.:

  • Function::directives
  • Class::decorators
  • FormalParameter::decorators

To save space, we currently use Box<Vec<T>> for these fields. This is better than plain Vec, as it's 8 bytes instead of 32.

But downsides are:

  1. Checking if Vec is empty or not involves a "far off" memory read (follow Box's pointer).
  2. When the Vec does have content, reading/writing an element involves double-indirection (follow Box's pointer, then Vec's pointer).

ThinVec

We could do a bit better with a ThinVec-like type.

ThinVec stores its length and capacity in same allocation as the vec's data. So ThinVec itself is pointer-sized (8 bytes).

I could not find an existing ThinVec-like crate which accepts a custom allocator, so we'd have to build our own.

Designing for our use case

Because we'll be using it with arena allocator, we don't have to worry about Drop, so we could make a tweak to the design to allow very cheap ThinVec::is_empty:

  • When the ThinVec is empty (len == 0), store a sentinel value in place of the pointer.
  • Sentinel value could be 0.
  • If we want Option<ThinVec> to also be pointer-sized, we could use a NonNull pointer with sentinel value of 1. As long as alignment of T in ThinVec<T> is greater than 1 (all our AST types have alignment of at least 4), then 1 can never be a valid pointer.
  • When a ThinVec has elements and has its last element removed, its pointer is replaced with the sentinel. This means its allocation is discarded, and if you push to the ThinVec again, it'll have to make a fresh allocation. But I think this is a rare case, so not much of a problem in practice.
  • The other trade-off is that ThinVec::len involves a branch to first check for the sentinel (rather than original ThinVec where this is straight-line code). But as we expect almost all ThinVecs to be empty, this branch should be very predictable.

Optimization for single-entry ThinVecs

We could also have an optimization for ThinVecs with a single entry, where ThinVec would set lowest bit of pointer to 1 and the pointer points to the single entry (essentially it's a Box). This does impose cost of checking that bit and an AND instruction to get rid of it, on every read/write, so may not be worthwhile. But might be useful for Vecs which commonly contain only a single entry.

@Dunqing
Copy link
Member

Dunqing commented Oct 12, 2024

I'd like to learn and do it in my leisure time! Before that, I want to check with you and @Boshen, do we still need this currently?

@Boshen
Copy link
Member

Boshen commented Oct 12, 2024

I don't think thin vec will have much effect given the number of usages is rare.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants