[rfc] parallel marking #45639

d-netto · 2022-06-10T19:40:40Z

This PR extends the refactoring from #45608 by parallelizing the GC mark-loop.

TODO

Move work-stealing queue implementation to separate file.
Run more benchmarks (and test scalability for large number of threads).
Fix GC debugging infrastructure & annotations.

oscardssmith · 2022-06-10T20:39:10Z

Can we get 32-core benchmarks as well? It would be very unfortunate if this were a regression for CPUs with many cores. (It would be fine to limit the GC to 4 threads if that's all it can use, but negative scaling is very unfortunate.)

d-netto · 2022-06-17T19:02:22Z

(EDIT: see scaling plots below)

oscardssmith · 2022-06-17T19:08:38Z

Were these timings made on a version of Julia that includes the changes from #45714, which make the GC time report correctly?

d-netto · 2022-06-17T19:09:30Z

No. Will merge and update that.

nanosoldier · 2022-08-02T14:11:55Z

Something went wrong when running your job:

NanosoldierError: failed to run benchmarks against primary commit: failed process: Process(`sudo -n /nanosoldier/cset/bin/cset shield -e -- sudo -n -u nanosoldier-worker -- /nanosoldier/workdir/jl_Ru6OqQ/benchscript.sh`, ProcessSignaled(6)) [0]

Logs and partial data can be found here

vchuravy · 2022-08-03T20:22:18Z

@nanosoldier runbenchmarks(!"scalar", vs=":master")

d-netto · 2022-08-05T19:05:21Z

@nanosoldier runbenchmarks(!"scalar", vs=":master")

chflood

So I ran the workspace on several examples and it meets my criterion of getting better performance with 2 parallel threads than with our current single-threaded solution.

The changes look reasonable to me.

This is a huge, disruptive change, and I think we need to think long and hard about whether we want parallel marking to be our only GC algorithm or whether we want users to be able to choose at run time.

oscardssmith · 2022-08-25T20:12:38Z

Were there any regressions? Can you post the before/after?

vchuravy · 2022-09-09T12:29:30Z

@nanosoldier runtests(ALL, vs = ":master", buildflags=["LLVM_ASSERTIONS=1", "FORCE_ASSERTIONS=1"], vs_buildflags=["LLVM_ASSERTIONS=1", "FORCE_ASSERTIONS=1"])

## Previous work

Since #21590, the GC mark-loop was implemented by keeping two manually managed stacks, one of which contained iterator states used to keep track of the object currently being marked. As an example, to mark arrays, we would pop the corresponding iterator state from the stack and iterate over the array until we found an unmarked reference. If we found one, we would update the iterator state (to reflect the index where we left off), "repush" it onto the stack, and proceed with marking the reference we just found.

## This PR

This PR eliminates the need to keep the iterator states by modifying the object graph traversal code. We keep a single stack of `jl_value_t *` currently being processed. To mark an object, we first pop it from the stack, push all unmarked references onto the stack, and proceed with marking.

I believe this doesn't break any invariant of the generational GC. Indeed, the age bits are set after marking (if the object survived one GC cycle, it's moved to the old generation), so this new traversal scheme wouldn't change whether an object has references to old objects or not. Furthermore, we must not update GC metadata for objects in the `remset`, and we ensure this by calling `gc_mark_outrefs` in `gc_queue_remset` with `meta_updated` set to 1.

## Additional advantages

1. There are no recursive function calls in the GC mark-loop code (one of the reasons why #21590 was implemented).
2. Keeping a single GC queue will **greatly** simplify work-stealing in the multi-threaded GC we are working on (cf. #45639).
3. Arrays of references, for example, are now marked in a regular-stride fashion, which could help with hardware prefetching.
4. We can easily modify the traversal mode (to breadth-first, for example) by changing only the `jl_gc_markqueue_t` methods (from LIFO to FIFO, for example) without touching the mark-loop itself, which could enable further exploration of the GC in the future.

Since this PR changes the mark-loop graph traversal code, there are some changes in the heap-snapshot, though I'm not familiar with that PR. Some benchmark results are here: https://hackmd.io/@Idnmfpb3SxK98-OsBtRD5A/H1V6QSzvs.
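
For readers unfamiliar with the scheme, the single-stack traversal described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: `obj_t`, `markqueue_t`, `mq_push`, `mq_pop`, `try_mark_and_push` and `mark_from_root` are hypothetical names, the object layout is not Julia's `jl_value_t`, and setting the mark bit at push time (to avoid queueing an object twice) is a simplification chosen for the sketch. Age bits, the `remset` and all generational bookkeeping are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object layout for illustration only (not jl_value_t). */
typedef struct obj {
    bool marked;        /* mark bit */
    size_t nrefs;       /* number of outgoing references */
    struct obj **refs;  /* outgoing references, entries may be NULL */
} obj_t;

/* Growable LIFO stack of objects whose references still need scanning. */
typedef struct {
    obj_t **items;
    size_t len, cap;
} markqueue_t;

void mq_push(markqueue_t *q, obj_t *o)
{
    if (q->len == q->cap) {
        size_t ncap = q->cap ? 2 * q->cap : 16;
        obj_t **p = realloc(q->items, ncap * sizeof *p);
        assert(p != NULL);
        q->items = p;
        q->cap = ncap;
    }
    q->items[q->len++] = o;
}

/* Returns NULL when the queue is empty. */
obj_t *mq_pop(markqueue_t *q)
{
    return q->len ? q->items[--q->len] : NULL;
}

/* Set the mark bit and enqueue the object, unless it is NULL or already marked. */
void try_mark_and_push(markqueue_t *q, obj_t *o)
{
    if (o != NULL && !o->marked) {
        o->marked = true;
        mq_push(q, o);
    }
}

/* Iterative mark loop: no recursion and no per-object iterator state.
 * Returns the number of objects marked during this call. */
size_t mark_from_root(obj_t *root)
{
    markqueue_t q = {NULL, 0, 0};
    size_t nmarked = 0;
    obj_t *o;

    try_mark_and_push(&q, root);
    while ((o = mq_pop(&q)) != NULL) {
        nmarked++;
        /* Scan every reference of the popped object in one regular pass. */
        for (size_t i = 0; i < o->nrefs; i++)
            try_mark_and_push(&q, o->refs[i]);
    }
    free(q.items);
    return nmarked;
}
```

In this sketch the traversal order is decided entirely by `mq_pop`: replacing it with a dequeue from the front of the buffer would turn the depth-first walk into a breadth-first one without touching `mark_from_root`, which is the kind of flexibility item 4 refers to.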

vtjnash · 2023-02-02T20:15:30Z

What is the status of this now that #21590 was merged?

d-netto · 2023-02-02T20:30:41Z

I think it's not ready to merge/review yet. It's still causing regressions on some of the GCBenchmarks and I'm diagnosing that (will post some scaling plots in a sec).

I think @kpamnany also saw some segfaults on RAI tests.

d-netto · 2023-02-02T20:36:18Z

Speedup/slowdowns on GC times relative to the master commit from which it branched vs nthreads (1, 2, 4, 8, 16):

d-netto · 2023-02-19T02:02:03Z

Superseded by #48600.

d-netto mentioned this pull request Jun 10, 2022

GC/Parallel marking #44643

Closed

3 tasks

d-netto force-pushed the dcn/pmark2 branch from d21d6b5 to 563df8e on June 10, 2022 19:52

jpsamaroo added the GC Garbage collector label Jun 11, 2022

d-netto mentioned this pull request Jun 12, 2022

GC mark-loop rewrite #45608

Closed

3 tasks

d-netto force-pushed the dcn/pmark2 branch 7 times, most recently from afac580 to 5414eb5 on June 26, 2022 21:40

d-netto mentioned this pull request Jun 29, 2022

mark time = 0 JuliaCI/GCBenchmarks#34

Closed

Diogo Netto and others added 14 commits July 5, 2022 10:29

single gc queue

59fc823

only two gc bits

871bd2e

debugging infra

f98429b

docstrings

0815f90

1 outref opt

2a6fa8b

compiler warning

f01e90a

deleted tmp files

e71cc48

spacing

8d41596

fmt

27aca87

fmt

99a767c

gc-debug

c85f362

ws

2c703f0

finlist

58d3de3

fl

8dae1bb

vchuravy requested review from chflood and kpamnany August 2, 2022 13:13

chunking finlist

ef220a2

unroll last iter of marking

0dbb1b1

This comment was marked as off-topic.

Diogo Netto added 2 commits August 12, 2022 19:29

avoid duplicates in remset

5d5dadc

rm useless files

04e3187

d-netto mentioned this pull request Aug 13, 2022

[wip] gc threads #46338

Closed

chflood reviewed Aug 25, 2022

d-netto force-pushed the dcn/pmark2 branch 3 times, most recently from fb4da4a to 04e3187 on September 8, 2022 00:38

pmark as flag

6be3776

d-netto force-pushed the dcn/pmark2 branch from 97127d8 to 6be3776 on September 8, 2022 00:58

passing chunk by ref

23e2e60

d-netto mentioned this pull request Nov 28, 2022

GC mark-loop rewrite #47292

Merged

d-netto closed this Jan 23, 2023

d-netto reopened this Jan 24, 2023

d-netto mentioned this pull request Feb 9, 2023

Run GC on multiple threads #48600

Merged

d-netto closed this Feb 19, 2023
