
make GC counters thread-local #32217

Merged: 1 commit, Jun 11, 2019
Conversation

JeffBezanson (Member) commented Jun 1, 2019

This should fix #31923. I see much less memory growth in alloc-heavy threaded loops.

Also #27173

JeffBezanson added labels on Jun 1, 2019: multithreading (Base.Threads and related functionality), GC (Garbage collector), bugfix (This change fixes an existing bug)
Keno (Member) commented Jun 1, 2019

This seems slightly problematic, because the contention on these variables is likely to be high. Can we do something like keep thread-local counters and every n MB of allocation atomically update the global counters?
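The scheme Keno suggests can be sketched as follows: each thread accumulates allocations in a private counter and only touches the shared global counter once a flush threshold is crossed, so the contended atomic operation happens once per several MB instead of once per allocation. All names here (`FLUSH_THRESHOLD`, `thread_allocd`, `global_allocd`, `note_alloc`) are hypothetical illustrations, not the actual Julia runtime API.

```c
#include <stdatomic.h>
#include <stddef.h>

#define FLUSH_THRESHOLD (4 * 1024 * 1024)     /* flush every ~4 MB (illustrative) */

static _Atomic long global_allocd = 0;        /* shared, touched rarely */
static _Thread_local long thread_allocd = 0;  /* private, uncontended */

void note_alloc(size_t sz)
{
    thread_allocd += (long)sz;                /* cheap thread-local add */
    if (thread_allocd >= FLUSH_THRESHOLD) {
        /* one atomic RMW per ~4 MB instead of one per allocation */
        atomic_fetch_add(&global_allocd, thread_allocd);
        thread_allocd = 0;
    }
}
```

The trade-off is that the global counter lags behind the true total by up to `nthreads * FLUSH_THRESHOLD` bytes, which is why the query path later in this thread has to fold the per-thread remainders back in.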

JeffBezanson (Member, Author) commented:

Agreed. Making gc counters thread-local has been on my list. Some of the counters also seem slightly unnecessary, e.g. the number of malloc calls.

yuyichao (Contributor) left a comment:

Yes, the counter has to be thread-local. Also, I believe that for any reasonable allocation-measurement experience, the global counter needs to be updated whenever someone queries gc_num.

JeffBezanson (Member, Author) commented:

> for any reasonable allocation measurement experience, the global one needs to be updated when someone query gc_num

Yes, of course. I'm thinking we can return a copy of gc_num with all the per-thread counters added in, just with whatever snapshot the calling thread happens to see. Do you think that's good enough?
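The "snapshot" query Jeff describes might look like the sketch below: copy the flushed global totals, then fold in whatever value of each thread's private counter the calling thread happens to observe. The struct and names (`gc_num_t`, `NTHREADS`, `per_thread_allocd`, `snapshot_gc_num`) are illustrative, not the real src/gc.c definitions.

```c
#include <stdatomic.h>

#define NTHREADS 4  /* fixed here for illustration */

typedef struct { long allocd; } gc_num_t;

static gc_num_t global_num;                       /* flushed totals */
static _Atomic long per_thread_allocd[NTHREADS];  /* unflushed remainders */

gc_num_t snapshot_gc_num(void)
{
    gc_num_t copy = global_num;
    for (int t = 0; t < NTHREADS; t++) {
        /* relaxed load: any recent value is fine, no ordering needed */
        copy.allocd += atomic_load_explicit(&per_thread_allocd[t],
                                            memory_order_relaxed);
    }
    return copy;  /* whatever snapshot the calling thread happens to see */
}
```

The result is approximate while other threads keep allocating, but it is never stale by more than the unflushed per-thread remainders, which matches the "good enough" question being asked here.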

@0 (Contributor) commented Jun 3, 2019

I can't tell if this has already been addressed in the above comments, so apologies for the noise if it has. There seems to be some sort of race condition in this patch. If I run

@time Threads.@threads for _ in 1:Threads.nthreads()
    zeros(10^8)
end

with JULIA_NUM_THREADS=16 on commit 5923179, it usually finishes in about 5 seconds. Occasionally, though, it gets stuck (possibly indefinitely, but at least for several hours) with an arbitrary fraction of the threads at 100% CPU utilization. I can't reproduce this on commit 123ff48 (the parent of 5923179).

In case this is useful, here's a snapshot of where the threads are according to gdb (with OPENBLAS_NUM_THREADS=1 so there are fewer irrelevant threads):

Top of stack            Topmost Julia function      Count
__GI___sigtimedwait     signal_listener             1
__GI_epoll_pwait        jl_task_get_next            1
jl_safepoint_wait_gc    jl_safepoint_wait_gc        10
_mm_pause               jl_safepoint_wait_gc        4
_mm_pause               jl_gc_wait_for_the_world    1

JeffBezanson (Member, Author) commented:

Thanks for that summary of the backtraces. Very convenient presentation of the info!

@JeffBezanson JeffBezanson changed the title make GC counters atomic make GC counters thread-local Jun 4, 2019
JeffBezanson (Member, Author) commented:

OK, I pushed the thread-local version. One interesting question is what the exact intervals should be between atomic updates of the global counter. I tried to use a different interval on each thread, but perhaps it's better just to pick a constant.

yuyichao (Contributor) commented Jun 4, 2019

I think the best way is to make a guess at the next interval per thread during GC, and once that's triggered we can do a sync and decide what to do.

StefanKarpinski (Member) commented:

Random?

static inline int maybe_collect(jl_ptls_t ptls)
{
    if (should_collect() || gc_debug_check_other()) {
        int should_collect = 0;
        if (ptls->gc_num.allocd >= 0) {
yuyichao (Contributor) commented:

This function needs to use an atomic relaxed load.

JeffBezanson (Member, Author) commented:

Why?

yuyichao (Contributor) commented:

Hmm, the line numbers show up really weirdly... The comment was meant for combine_thread_gc_counts.

(Actually, to avoid UB, all accesses to these outside of the collection phase should be atomic.)
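The UB point can be made concrete in C11 terms: a plain read of a counter that another thread writes concurrently is a data race, which is undefined behavior even if the hardware load happens to be atomic. Declaring the variable `_Atomic` and using relaxed operations makes the access well-defined without paying for any memory ordering. The names `bump` and `peek` below are illustrative, not runtime functions.

```c
#include <stdatomic.h>
#include <stdint.h>

static _Atomic int64_t allocd;   /* written by the owner thread, read by others */

/* owner thread bumps its counter */
void bump(int64_t n)
{
    /* relaxed RMW: only atomicity is needed, no ordering */
    atomic_fetch_add_explicit(&allocd, n, memory_order_relaxed);
}

/* any thread may observe the counter without a data race */
int64_t peek(void)
{
    return atomic_load_explicit(&allocd, memory_order_relaxed);
}
```

On mainstream hardware a relaxed load of an aligned 64-bit word compiles to an ordinary load, so the cost of being UB-free here is essentially zero; the point is correctness at the language level.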

src/gc.c Outdated
int64_t intvl = per_thread_counter_interval(ptls->tid, gc_num.interval);
size_t localbytes = ptls->gc_num.allocd + intvl;
ptls->gc_num.allocd = -intvl;
jl_atomic_fetch_add(&gc_num.allocd, localbytes);
Keno (Member) commented:

jl_atomic_fetch_add should return the resulting value which is the right thing to use for the comparison on the next line.

JeffBezanson (Member, Author) commented:

I think that's add_fetch? But yes, we should use it.
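The distinction being discussed, in C11 terms: `atomic_fetch_add` returns the value *before* the addition, whereas what "add_fetch" names (as in GCC's `__atomic_add_fetch` builtin) is the value *after* it. Given only fetch-add, the post-add value is the returned old value plus the addend, sketched here with an illustrative `add_then_fetch` helper:

```c
#include <stdatomic.h>

static _Atomic long counter = 0;

long add_then_fetch(long n)
{
    /* atomic_fetch_add yields the pre-add value... */
    long old = atomic_fetch_add(&counter, n);
    /* ...so adding n back gives the post-add value, which is
     * what a threshold comparison on the updated total needs */
    return old + n;
}
```

Using the returned value rather than re-reading the counter is what makes the subsequent comparison race-free: another thread may have bumped the counter again between the RMW and a separate load.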

JeffBezanson (Member, Author) commented:

> I think the best way is to make a guess of the next interval per thread during gc and once that's triggered we can do a sync and decide what to do

Does that basically mean: keep a per-thread allocation count and a per-thread interval, and use the amount allocated by a thread since the last collection as its next interval value?

JeffBezanson (Member, Author) commented:

Note: this needs #32238 first.

yuyichao (Contributor) commented Jun 7, 2019

> Does that basically mean: keep a per-thread allocation count and a per-thread interval, and use the amount allocated by a thread since the last collection as its next interval value?

Yes. I think the logic in the allocation path should be as simple as possible. I feel like the worst case for this is when the allocation pattern changes a lot between threads, but even in that case I think it shouldn't be too bad. We'll certainly put some limit on it.

Another thing is that I think we can limit the sync to page changes, since that's the slower path anyway.
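The heuristic agreed on above might be sketched like this: at each collection, guess the next per-thread interval from how much that thread allocated since the last collection, clamped to some bounds (the "limit" yuyichao mentions). The function name and both constants are hypothetical, not values from src/gc.c.

```c
#include <stdint.h>

#define MIN_INTERVAL ((int64_t)1  * 1024 * 1024)   /* illustrative 1 MB floor */
#define MAX_INTERVAL ((int64_t)64 * 1024 * 1024)   /* illustrative 64 MB ceiling */

int64_t next_interval(int64_t allocd_since_last_gc)
{
    /* assume the recent past predicts the near future */
    int64_t intvl = allocd_since_last_gc;
    if (intvl < MIN_INTERVAL) intvl = MIN_INTERVAL;  /* avoid constant syncs */
    if (intvl > MAX_INTERVAL) intvl = MAX_INTERVAL;  /* avoid drifting too far */
    return intvl;
}
```

A thread that allocates heavily gets a large interval and syncs rarely; a mostly idle thread gets the floor, so a sudden burst of allocation on it still triggers a sync reasonably soon.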

@JeffBezanson JeffBezanson merged commit 5335a94 into master Jun 11, 2019
@JeffBezanson JeffBezanson deleted the jb/atomicgccount branch June 11, 2019 02:33
staticfloat (Member) commented:

In a fascinating turn of events, this PR is the first one to have something caught by the analyzegc run! Unfortunately, this PR was too old to have the new LLVM version that would actually error out on finding an issue, but hopefully this is simple enough:

/buildworker/worker/analyzegc_linux64/build/src/gc.c:1023:5: note: Calling potential safepoint from function annotated JL_NOTSAFEPOINT
    combine_thread_gc_counts(&gc_num);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 warning generated.

Is this a false alarm, or an actual issue?

Successfully merging this pull request may close these issues.

Large memory leak when using threads