Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Page based heap size heuristics #50144

Merged
merged 13 commits into from
Jul 23, 2023
1 change: 1 addition & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ Language changes

Compiler/Runtime improvements
-----------------------------
* Updated GC heuristics to count allocated pages instead of individual objects ([#50144]).

Command-line option changes
---------------------------
Expand Down
12 changes: 9 additions & 3 deletions doc/src/devdocs/gc.md
Original file line number Diff line number Diff line change
Expand Up @@ -67,6 +67,12 @@ This scheme eliminates the need of explicitly keeping a flag to indicate a full
## Heuristics

GC heuristics tune the GC by changing the size of the allocation interval between garbage collections.
If a GC was unproductive, then we increase the size of the allocation interval to allow objects more time to die.
If a GC returns a lot of space we can shrink the interval. The goal is to find a steady state where we are
allocating just about the same amount as we are collecting.

The GC heuristics measure how big the heap size is after a collection and set the next
collection according to the algorithm described by https://dl.acm.org/doi/10.1145/3563323,
in summary, it argues that the heap target should have a square root relationship with the live heap, and that it should also be scaled by how fast the GC is freeing objects and how fast the mutators are allocating.
The heuristics measure the heap size by counting the number of pages that are in use and the objects that use malloc. Previously we measured the heap size by counting
the alive objects, but that doesn't take into account fragmentation which could lead to bad decisions, that also meant that we used thread local information (allocations) to make
decisions about a process wide (when to GC), measuring pages means the decision is global.

The GC will do full collections when the heap size reaches 80% of the maximum allowed size.
17 changes: 15 additions & 2 deletions src/gc-debug.c
Original file line number Diff line number Diff line change
@@ -1,7 +1,10 @@
// This file is a part of Julia. License is MIT: https://julialang.org/license

#include "gc.h"
#include "julia.h"
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

// re-include assert.h without NDEBUG,
Expand Down Expand Up @@ -1217,15 +1220,25 @@ JL_DLLEXPORT void jl_enable_gc_logging(int enable) {
gc_logging_enabled = enable;
}

void _report_gc_finished(uint64_t pause, uint64_t freed, int full, int recollect) JL_NOTSAFEPOINT {
void _report_gc_finished(uint64_t pause, uint64_t freed, int full, int recollect, int64_t live_bytes) JL_NOTSAFEPOINT {
if (!gc_logging_enabled) {
return;
}
jl_safe_printf("GC: pause %.2fms. collected %fMB. %s %s\n",
pause/1e6, freed/1e6,
pause/1e6, freed/(double)(1<<20),
full ? "full" : "incr",
recollect ? "recollect" : ""
);

jl_safe_printf("Heap stats: bytes_mapped %.2f MB, bytes_resident %.2f MB, heap_size %.2f MB, heap_target %.2f MB, live_bytes %.2f MB\n, Fragmentation %.3f",
jl_atomic_load_relaxed(&gc_heap_stats.bytes_mapped)/(double)(1<<20),
jl_atomic_load_relaxed(&gc_heap_stats.bytes_resident)/(double)(1<<20),
jl_atomic_load_relaxed(&gc_heap_stats.heap_size)/(double)(1<<20),
jl_atomic_load_relaxed(&gc_heap_stats.heap_target)/(double)(1<<20),
live_bytes/(double)(1<<20),
(double)live_bytes/(double)jl_atomic_load_relaxed(&gc_heap_stats.heap_size)
);
// Should fragmentation use bytes_resident instead of heap_size?
}

#ifdef __cplusplus
Expand Down
4 changes: 4 additions & 0 deletions src/gc-pages.c
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,8 @@ char *jl_gc_try_alloc_pages_(int pg_cnt) JL_NOTSAFEPOINT
// round data pointer up to the nearest gc_page_data-aligned
// boundary if mmap didn't already do so.
mem = (char*)gc_page_data(mem + GC_PAGE_SZ - 1);
jl_atomic_fetch_add_relaxed(&gc_heap_stats.bytes_mapped, pages_sz);
jl_atomic_fetch_add_relaxed(&gc_heap_stats.bytes_resident, pages_sz);
return mem;
}

Expand Down Expand Up @@ -115,6 +117,7 @@ NOINLINE jl_gc_pagemeta_t *jl_gc_alloc_page(void) JL_NOTSAFEPOINT
// try to get page from `pool_freed`
meta = pop_lf_page_metadata_back(&global_page_pool_freed);
if (meta != NULL) {
jl_atomic_fetch_add_relaxed(&gc_heap_stats.bytes_resident, GC_PAGE_SZ);
gc_alloc_map_set(meta->data, GC_PAGE_ALLOCATED);
goto exit;
}
Expand Down Expand Up @@ -188,6 +191,7 @@ void jl_gc_free_page(jl_gc_pagemeta_t *pg) JL_NOTSAFEPOINT
madvise(p, decommit_size, MADV_DONTNEED);
#endif
msan_unpoison(p, decommit_size);
jl_atomic_fetch_add_relaxed(&gc_heap_stats.bytes_resident, -decommit_size);
}

#ifdef __cplusplus
Expand Down
Loading