-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GC marking cleanup #19940
GC marking cleanup #19940
Conversation
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels |
// refyoung |= gc_push_root(ptls, children[--ci], d); | ||
} | ||
else { | ||
jl_printf(JL_STDOUT, "GC error (probable corruption) :\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like it would be useful to preserve this branch in gc_mark_obj
?
// this check should not be needed but it helps catching corruptions early
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, sorry, I just realized you just moved it out into a separate function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes =)
And also folded it with the check here
I assume the check up there was needed before the tuple type overhaul?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
squash when merging?
@@ -1361,44 +1378,55 @@ void gc_mark_object_list(jl_ptls_t ptls, arraylist_t *list, size_t start) | |||
} | |||
} | |||
|
|||
STATIC_INLINE void gc_assert_datatype(jl_datatype_t *vt) | |||
{ | |||
if (__unlikely(jl_is_datatype(vt))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not that it affects anything on all but a tiny number of target processors with a small subset of compiler versions, but why is this __unlikely
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooops. Flipped the condition without flipping the likely
/unlikely
......
Fixed and squashed manually. |
* Clean up pointer cast in `push_root` Only check `jl_is_datatype` once on the tag. * Rename `push_root` to avoid confusion with `gc_push_root` Also pass the whole tag in to reduce the number of load on the tag. * Atomically check and set gc bits This makes sure we won't overcount with parallel marking. Note that the setmark is still not threadsafe due to global metadata update. * `gc_setmark` overhaul * Add fast path to do nothing if the object was already marked * Return whether the object's gc bits are changed or not * Use this info to replace the custom logic for arrays * Separate marking and scanning of the object Make sure in `gc_mark_obj` that the object will only be scanned once.
These are some simpler changes preparing for parallel marking. The main changes are
Rename
push_root
, it was very confusing what's the difference betweengc_push_root
andpush_root
....Split
push_root
to mark and scanThe main idea is that the marking phase consists of setting the mark and scanning the object and incremental marking (not in current GC), recursion limit and parallel marking (TODO) all require doing the two separately (in time). Having distinct functions for the two also helps clarifying the work intended to be done by the caller (i.e. remset and markstack scanning only need scan and not setmark) and avoid doing extra work in the setmark code.
Make setmark atomic
Though in principle the marking can be done in parallel on many threads, we need the setmark to be atomic since it can otherwise cause inconsistency in metadata. Luckily, I believe we only need a xchg and not cmpxchg for this since setting it multiple times to the same final value is not an issue. We also don't need any ordering from the tag so we can use relaxed ordering so the impact on the performance should be minimum (no memory barrier needed on any archs we support).
Note that setmark is still not thread safe due to global metadata/list update.
@nanosoldier
runbenchmarks(ALL, vs=":master")