-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
output the system image as static data #22254
Conversation
src/staticdata.c
Outdated
// array of definitions for the predefined function pointers | ||
// (reverse of fptr_to_id) | ||
static const jl_fptr_t id_to_fptrs[] = { | ||
NULL, NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be 4 space indents (plus closing }
on its own line)
Cool! You can probably guess what I'm going to ask but here goes: any data on sysimg size and startup time? |
src/gc.c
Outdated
if (update_meta) | ||
gc_setmark(ptls, o, bits, l * sizeof(void*) + sizeof(jl_svec_t)); | ||
if (update_meta) { | ||
if (foreign_alloc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since objprofile_count
is a non-default debug option, I think it'll be better to make update_meta = 0
in this case and do this in an else branch so that it's more clear to the compiler that the two branches can be merged together. It also seems like it's exactly the same code so maybe just do at the very beginning
if (update_meta) {
if (((void*)o >= sysimg_base && (void*)o < sysimg_end)) {
update_meta = 0;
objprofile_count(vt, bits == GC_OLD_MARKED, jl_datasize_size(vt));
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a copy-paste error (weren't supposed to be the same). Sadly, I had implemented this the other way first (as you suggested), then forgot what what I was doing, and changed them all back to this version
RSS 148M from size -A usr/lib/julia/sys.o sysimg size breakdown:
$ ls -lh usr/lib/julia/ |
b192385
to
5c0c93f
Compare
src/staticdata.c
Outdated
size_t item = backref_table_numel++; | ||
char *pos = (char*)HT_NOTFOUND + 1 + item; | ||
size_t item = ++backref_table_numel; | ||
assert(item < (1 << 28) && "too many items to serialize"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
28 appears often here but its significance isn't obvious, give it a name?
Nice. Sysimg is larger as one would expect, but seems to be totally worth it. |
src/staticdata.c
Outdated
static void jl_update_all_gvars(jl_serializer_state *s) | ||
{ | ||
if (sysimg_gvars == NULL) | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one too few spaces
src/staticdata.c
Outdated
uint32_t b2 = *(uint8_t*)(*base)++; | ||
uint32_t b1 = *(uint8_t*)(*base)++; | ||
uint32_t b0 = *(uint8_t*)(*base)++; | ||
return b0 | (b1 << 8) | (b2 << 16) | (b3 << 24); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uint32_t v;
memcpy(&v, base, 4);
return ntohl(v);
Is much easier for the compiler to optimize.
@@ -644,6 +644,37 @@ void gc_setmark_buf(jl_ptls_t ptls, void *o, uint8_t mark_mode, size_t minsz) | |||
gc_setmark_buf_(ptls, o, mark_mode, minsz); | |||
} | |||
|
|||
void jl_gc_force_mark_old(jl_ptls_t ptls, jl_value_t *v) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This (and gc_sweep_sysimg
etc below) should go into a header. It happens way too often when we put declaration in C files and they get out of sync when the implementation is updated.
just for fun: @nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels |
we use this step to try to maximize the number of shared mmap pages and minimize memory (virtual and physical) that is required. the sorting order appears to have a strong impact on the time for a full gc
This is a pretty large change. What about the stack size compiler flag? |
That should work for embedders, more-or-less, but it's probably unreasonable to ask Python to recompile. That said, I have no idea how many people use pyjulia. |
This replaces the current lisp-y system image compression with an static image format. This is more similar to how it would be handled in a static compiler.
benefits:
fixes #19199
fixes #22320