Use Vec<T> for keeping track of gc objects #3493

Merged
2 commits merged into main from gc-use-vec-instead-linked-list on Dec 4, 2023

Conversation

HalidOdat
Member

This does not reduce memory usage, since we still keep a pointer to each GC object, but it does reduce the overhead of jumping through the intrusive linked list and also simplifies the GC collection functions.
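
For context, here is a minimal sketch of the difference; GcBox, Heap, and strongs are placeholder names for illustration, not Boa's actual internals. Instead of re-linking next pointers while walking an intrusive list, the sweep phase becomes a single retain pass over a dense Vec of pointers.

// Illustrative sketch only: `GcBox`, `Heap`, and `strongs` are placeholders,
// not Boa's real types.
use std::ptr::NonNull;

struct GcBox {
    marked: bool,
    // Intrusive-list version: every allocation carries the link itself,
    // e.g. `next: Option<NonNull<GcBox>>`.
}

struct Heap {
    // Vec-based version: one dense array of pointers to the live allocations.
    strongs: Vec<NonNull<GcBox>>,
}

impl Heap {
    // Sweep phase: free unmarked boxes and keep the rest. With a Vec this is
    // a single `retain` pass over contiguous memory instead of re-linking
    // `next` pointers while walking a list.
    fn sweep(&mut self) {
        self.strongs.retain(|ptr| {
            // SAFETY: every pointer in `strongs` points to a live, leaked Box.
            let node = unsafe { ptr.as_ref() };
            if node.marked {
                true
            } else {
                // Rebuild the Box so the allocation is actually freed.
                unsafe { drop(Box::from_raw(ptr.as_ptr())) };
                false
            }
        });
    }
}

fn main() {
    let mut heap = Heap {
        strongs: (0..4)
            .map(|i| NonNull::from(Box::leak(Box::new(GcBox { marked: i % 2 == 0 }))))
            .collect(),
    };
    heap.sweep();
    assert_eq!(heap.strongs.len(), 2);
}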


Benchmarks

Main

PROGRESS Richards
RESULT Richards 36.5
PROGRESS DeltaBlue
RESULT DeltaBlue 41.7
PROGRESS Encrypt
PROGRESS Decrypt
RESULT Crypto 55.5
PROGRESS RayTrace
RESULT RayTrace 147
PROGRESS Earley
PROGRESS Boyer
RESULT EarleyBoyer 138
PROGRESS Splay
RESULT Splay 120
PROGRESS NavierStokes
RESULT NavierStokes 111
SCORE 81.0

PR

PROGRESS Richards
RESULT Richards 39.1
PROGRESS DeltaBlue
RESULT DeltaBlue 46.4
PROGRESS Encrypt
PROGRESS Decrypt
RESULT Crypto 56.3
PROGRESS RayTrace
RESULT RayTrace 150
PROGRESS Earley
PROGRESS Boyer
RESULT EarleyBoyer 153
PROGRESS Splay
RESULT Splay 166
PROGRESS NavierStokes
RESULT NavierStokes 112
SCORE 88.9

HalidOdat added the performance and gc labels on Dec 1, 2023
HalidOdat added this to the v0.18.0 milestone on Dec 1, 2023
HalidOdat requested a review from a team on Dec 1, 2023 at 08:33

github-actions bot commented Dec 1, 2023

Test262 conformance changes

Test result    main count    PR count    difference
Total              95,609      95,609             0
Passed             76,526      76,526             0
Ignored            18,132      18,132             0
Failed                951         951             0
Panics                  0           0             0
Conformance        80.04%      80.04%         0.00%


codecov bot commented Dec 1, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (9f181ef) 44.57% compared to head (8330f9d) 44.54%.

❗ Current head 8330f9d differs from pull request most recent head 685b29d. Consider uploading reports for the commit 685b29d to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3493      +/-   ##
==========================================
- Coverage   44.57%   44.54%   -0.03%     
==========================================
  Files         487      487              
  Lines       50621    50598      -23     
==========================================
- Hits        22563    22539      -24     
- Misses      28058    28059       +1     


@jedel1043
Member

jedel1043 commented Dec 1, 2023

Um, I'm pretty sure the point of using intrusive linked lists here is to avoid having live allocations on the GC state. This does make it faster, but consider that the Vec will keep expanding if the number of objects keeps growing, and it'll never reclaim the allocated buffer. This can be seen using our heap profiling tools:

Sample

{
    let array = []
    for (let i = 0; i <= 100000; i++) {
        array.push({data: i});
    }
}

$boa.gc.collect()

{
    let array = []
    for (let i = 0; i <= 100000; i++) {
        array.push({data: i});
    }
}

$boa.gc.collect()

Running cargo run --profile release-dbg --features dhat -- test.js --debug-object

Main

dhat: Total:     248,322,251 bytes in 1,984,424 blocks
dhat: At t-gmax: 53,298,622 bytes in 274,607 blocks
dhat: At t-end:  499,999 bytes in 2,648 blocks

This PR

dhat: Total:     217,362,347 bytes in 1,984,405 blocks
dhat: At t-gmax: 57,012,374 bytes in 283,328 blocks
dhat: At t-end:  4,678,511 bytes in 2,650 blocks

Note how main uses only 500k bytes of heap at the end of the program, while this PR uses 4.5M bytes.

@jedel1043
Member

The alternative would be to resize the Vec on each collection, but that would probably impact performance if the GC needs to run several collections under a high workload, and we'd still have half of the Vec's buffer unused at some points in time.

@HalidOdat
Member Author

HalidOdat commented Dec 1, 2023

In that case we can shrink_to with some_grace_value (maybe len/2 or len/3) on every collect.
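
Roughly something like this (a sketch only; the function name and the exact grace factor are placeholders, not the actual change):

// Sketch of the idea only; `shrink_after_collect` and the 50% grace value
// are made up for illustration.
fn shrink_after_collect<T>(strongs: &mut Vec<T>) {
    // Keep up to 50% headroom over the surviving objects so the buffer
    // can't stay pinned at its high-water mark forever.
    let grace = strongs.len() / 2;
    strongs.shrink_to(strongs.len() + grace);
}

fn main() {
    // Simulate a collection that freed most objects out of a big buffer.
    let mut strongs: Vec<u32> = Vec::with_capacity(100_000);
    strongs.extend(0..10);
    shrink_after_collect(&mut strongs);
    println!("len = {}, capacity = {}", strongs.len(), strongs.capacity());
}

With retain alone the buffer keeps its high-water-mark capacity; shrink_to caps it relative to the number of surviving objects instead.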

@HalidOdat
Member Author

Had the same thought 😆

@HalidOdat
Member Author

Fixed in 685b29d :)

dhat: Total:     259,991,433 bytes in 1,984,563 blocks
dhat: At t-gmax: 57,561,904 bytes in 283,328 blocks
dhat: At t-end:  499,999 bytes in 2,650 blocks

Previous implementation

PROGRESS Richards
RESULT Richards 38.7
PROGRESS DeltaBlue
RESULT DeltaBlue 46.1
PROGRESS Encrypt
PROGRESS Decrypt
RESULT Crypto 55.8
PROGRESS RayTrace
RESULT RayTrace 150
PROGRESS Earley
PROGRESS Boyer
RESULT EarleyBoyer 146
PROGRESS Splay
RESULT Splay 159
PROGRESS NavierStokes
RESULT NavierStokes 108
SCORE 86.9

Current implementation

PROGRESS Richards
RESULT Richards 38.8
PROGRESS DeltaBlue
RESULT DeltaBlue 45.7
PROGRESS Encrypt
PROGRESS Decrypt
RESULT Crypto 56.7
PROGRESS RayTrace
RESULT RayTrace 149
PROGRESS Earley
PROGRESS Boyer
RESULT EarleyBoyer 146
PROGRESS Splay
RESULT Splay 160
PROGRESS NavierStokes
RESULT NavierStokes 110
SCORE 87.2

@jedel1043
Member

jedel1043 commented Dec 1, 2023

Yeah, that's what I expected. Perf is virtually the same with that change. Clarified that the new bench compares two implementations of this PR.

I'd personally keep the current linked list just because it could enable some optimizations in the future; I think CPython uses the pointer as a tagged pointer to store the ref count.

Since I'm tending towards not merging this, I'd probably defer the review to @nekevss and @raskad for an unbiased opinion.

@HalidOdat
Member Author

Yeah, that's what I expected. Perf is virtually the same with that change.

Umm.. note that the "previous implementation" is the previous implementation of this PR (without 685b29d), not the main branch, so there is still a pretty nice perf increase. (I should have added a note on it 😅). I wanted to show that limiting the capacity allocation does not affect the performance; oddly, I saw a very slight improvement (😕)?

There is a ~40-point score increase in Splay over main's implementation, probably because it creates so many objects to construct the tree.

@jedel1043
Member

Yeah, that's what I expected. Perf is virtually the same with that change.

Umm.. note that the "previous implementation" is the previous implementation of this PR (without 685b29d), not the main branch, so there is still a pretty nice perf increase. (I should have added a note on it 😅). I wanted to show that limiting the capacity allocation does not affect the performance; oddly, I saw a very slight improvement (😕)?

There is a ~40-point score increase in Splay over main's implementation, probably because it creates so many objects to construct the tree.

Noted 👍

I still feel it's "wrong" to use allocations to keep track of allocations, so I'd still defer to the rest of the team for more opinions on this.

@nekevss
Member

nekevss commented Dec 2, 2023

I'd tend to agree a bit more with Jedel that using the extra allocation of Vec feels off in this instance. 😕

@HalidOdat
Member Author

This just moves the memory that was used to track the GC pointers (the next field) into a dense array so that the CPU can cache it better, hence the speed improvement.

Why is this weird?

In the future, if we aspire to achieve significant improvements in speed, it may be necessary to develop our own "malloc" and take charge of managing the heap pages ourselves. Perhaps my exposure to a considerable amount of V8/SpiderMonkey code and techniques has influenced my perspective of what is weird, making this proposed change seem normal. 😆

@jedel1043
Member

jedel1043 commented Dec 2, 2023

Perhaps my exposure to a considerable amount of V8/SpiderMonkey code and techniques has influenced my perspective of what is weird, making this proposed change seem normal.

Ooh, does V8/SM use dynamic arrays to keep track of allocated memory? If that's the case, it'll remove my uncertainty completely.

@nekevss
Member

nekevss commented Dec 2, 2023

Most of my hesitation here is around strict provenance and how loading the NonNulls into a Vec may affect their provenance. It may be totally fine; I'm just not 100% certain, hence the hesitation. Whereas Cell helps to enforce provenance.

I agree on the malloc/allocator point, and it may be worth looking into using the Allocator API via allocator_api2 early to handle page allocation. But if we did that, we'd be dealing with two allocations via Box and Vec, rather than just Box.

That all being said, I could be wrong. If there's consensus to move forward with this, we can.

@HalidOdat
Member Author

HalidOdat commented Dec 2, 2023

Ooh, does V8/SM use dynamic arrays to keep track of allocated memory? If that's the case, it'll remove my uncertainty completely.

They preallocate large slabs of memory and keep track of what's free instead of what is allocated. I think they use an in-place free list.

You can find some information here
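
As a toy illustration of the in-place free-list idea mentioned above (not V8/SpiderMonkey's actual code): free slots of a preallocated slab store the index of the next free slot, so tracking free memory needs no extra allocation.

// Toy in-place free list over a fixed-size slab. This only illustrates the
// technique; it is not how V8/SpiderMonkey implement it.
const SLAB_SIZE: usize = 8;

enum Slot {
    Free { next_free: Option<usize> },
    Used(u64),
}

struct Slab {
    slots: [Slot; SLAB_SIZE],
    free_head: Option<usize>,
}

impl Slab {
    fn new() -> Self {
        // Initially every slot is free and points to its neighbour.
        let slots = std::array::from_fn(|i| Slot::Free {
            next_free: if i + 1 < SLAB_SIZE { Some(i + 1) } else { None },
        });
        Self { slots, free_head: Some(0) }
    }

    // Pop a slot off the free list: O(1) and no heap allocation.
    fn alloc(&mut self, value: u64) -> Option<usize> {
        let index = self.free_head?;
        if let Slot::Free { next_free } = &self.slots[index] {
            self.free_head = *next_free;
        }
        self.slots[index] = Slot::Used(value);
        Some(index)
    }

    // Push the slot back onto the free list.
    fn free(&mut self, index: usize) {
        self.slots[index] = Slot::Free { next_free: self.free_head };
        self.free_head = Some(index);
    }
}

fn main() {
    let mut slab = Slab::new();
    let a = slab.alloc(1).unwrap();
    let b = slab.alloc(2).unwrap();
    assert_ne!(a, b);
    slab.free(a);
    // The freed slot is handed out again before any untouched one.
    assert_eq!(slab.alloc(3), Some(a));
}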


Anyway, I just felt it was a waste to leave that big of a performance boost on the table. Though I'm totally fine with closing this if there is any cause for concern or a consensus has been reached. :)

@raskad what are your thoughts on this?

@raskad
Member

raskad commented Dec 4, 2023

To be honest, I thought of this exact change multiple times myself already but was never confident enough to actually try it :D

I think this is the right step forward. As far as I know, storing GC allocations in one or more contiguous blocks of memory is probably the best approach, and starting out with a simple Vec seems like the most natural solution to me.

I'm curious about the strict provenance argument. As far as I know, miri has some strict provenance checks, correct? @HalidOdat did you run miri on the boa_gc tests with this change?

@jedel1043
Member

jedel1043 commented Dec 4, 2023

Did some research, and SpiderMonkey uses a mixed architecture of "Zones", which are mini heaps of memory containing many objects that are tracked using linked lists. Their GC collects at the granularity of zones, and it keeps track of the active zones using a Vec (technically a SmallVec that starts as an inline array but expands into a Vec when exceeding 4 zones).
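
For reference, the pattern described above looks roughly like this with the smallvec crate; Zone is a stand-in struct, not SpiderMonkey's actual zone type.

// Rough illustration of the "inline array that spills to the heap" pattern,
// using the `smallvec` crate (assumes `smallvec` is added as a dependency).
// `Zone` is a stand-in, not SpiderMonkey's real zone type.
use smallvec::SmallVec;

struct Zone {
    id: u32,
}

fn main() {
    // Stays inline (no heap allocation) for up to 4 zones, then spills to a heap buffer.
    let mut zones: SmallVec<[Zone; 4]> = SmallVec::new();
    for id in 0..6 {
        zones.push(Zone { id });
    }
    assert!(zones.spilled()); // more than 4 zones, so the storage moved to the heap
    assert_eq!(zones.last().map(|z| z.id), Some(5));
}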

I think this should be enough evidence to accept this change.

@HalidOdat
Member Author

I'm curious about the strict provenance argument. As far as I know, miri has some strict provenance checks, correct? @HalidOdat did you run miri on the boa_gc tests with this change?

Ran it with the -Zmiri-strict-provenance flag set via MIRIFLAGS and miri did not complain; everything ran fine:

MIRIFLAGS="-Zmiri-strict-provenance" cargo +nightly miri test -p boa_gc

Member

@nekevss nekevss left a comment


If SM is using a Vec equivalent and miri isn't throwing any errors, then I'm fine with moving forward with this.

raskad added this pull request to the merge queue on Dec 4, 2023
Merged via the queue into main with commit 6506f65 on Dec 4, 2023
13 checks passed
raskad deleted the gc-use-vec-instead-linked-list branch on Dec 5, 2023 at 12:15