
Memory leak in Base #50345

Open
schlichtanders opened this issue Jun 29, 2023 · 11 comments
Labels
GC Garbage collector

Comments

@schlichtanders

schlichtanders commented Jun 29, 2023

Hi there,

I found a small example that reproduces a memory leak which kept me busy for several days. (I hope to find a workaround soon.)

Running the following in the Julia REPL several times shows a steady increase in memory usage (about 200 KB per run).

begin
forecast_samples = [randn(30) for i in 1:10_000]
sums = vec(mapslices(sum, reduce(vcat, forecast_samples'), dims=1))
GC.gc(); Base.gc_live_bytes() / 2^20
end

This example is certainly not yet minimal, but at least it is tiny.

(tested on Julia 1.9.0 and 1.9.1)

EDIT: I just realized that gc_live_bytes is quite noisy, so it is better to run the above a couple of times. E.g.

[begin
forecast_samples = [randn(30) for i in 1:10_000]
sums = vec(mapslices(sum, reduce(vcat, forecast_samples'), dims=1))
GC.gc(); Base.gc_live_bytes() / 2^20
end for i in 1:50]
@schlichtanders
Author

Doing it with explicit for loops indeed does not lead to a memory leak:

[begin
forecast_samples = [randn(30) for i in 1:10_000]
sums = [sum([forecast_samples[s][i] for s in eachindex(forecast_samples)]) for i in 1:30]
GC.gc(); Base.gc_live_bytes() / 2^20
end for i in 1:50]

This is my workaround for now.

@KristofferC
Member

Could be a duplicate of #49545, where the GC decides to not do a full collection.

@vchuravy
Member

GC.gc(true); GC.gc(false) to force a full collection

@schlichtanders
Author

schlichtanders commented Jun 29, 2023

true is the default according to the documentation, and setting it explicitly still leaks the same amount of memory.

EDIT: Now I understand that you are saying one should run both!
Unfortunately, it still leaks:

[begin
forecast_samples = [randn(30) for i in 1:10_000]
sums = vec(mapslices(sum, reduce(vcat, forecast_samples'), dims=1))
GC.gc(true); GC.gc(false); Base.gc_live_bytes() / 2^20
end for i in 1:50]

@barucden
Contributor

barucden commented Jun 29, 2023

I think the example can be reduced to a reduce of vcat over an adjoint.

const forecast_samples     = [randn(30) for i in 1:10_000];
const forecast_samples_adj = [randn(30) for i in 1:10_000]';

function f(arr)
    reduce(vcat, arr)
    GC.gc(true)
    GC.gc(false)
    return Base.gc_live_bytes() / 2^20
end
without adjoint
julia> [f(forecast_samples) for i in 1:50]
50-element Vector{Float64}:
 10.236428260803223
 10.23690128326416
 10.23690128326416
  ⋮
 10.23690128326416
 10.23690128326416
with adjoint
julia> [f(forecast_samples_adj) for i in 1:50]
50-element Vector{Float64}:
 10.919855117797852
 11.124429702758789
 11.328531265258789
  ⋮
 20.71720314025879
 20.92130470275879

@KristofferC KristofferC added the GC Garbage collector label Jun 29, 2023
@gbaraldi
Member

I can reproduce the leak on master, but I can't find what the difference is. The GC is finding the memory, because it counts it as live, but I'm not sure what is going on; the reduce call is leaking memory somewhere.

@bvdmitri
Contributor

bvdmitri commented Jun 29, 2023

I can reproduce on a Mac M2. @barucden I also noticed that the "without adjoint" version also allocates, but only on the first iteration (perhaps the result array itself?)

julia> [f(forecast_samples) for i in 1:50] |> diff
49-element Vector{Float64}:
 0.0004730224609375
 0.0
 0.0
  ⋮
 0.0
 0.0
julia> [f(forecast_samples_adj) for i in 1:50] |> diff
49-element Vector{Float64}:
 0.2045745849609375
 0.2041015625
 0.2041015625
  ⋮
 0.2041015625
 0.2041015625
julia> versioninfo()
Julia Version 1.9.1
Commit 147bdf428cd (2023-06-07 08:27 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  CPU: 10 × Apple M2 Pro
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
  Threads: 1 on 6 virtual cores

@vchuravy
Member

One potential way of understanding this would be to take heap snapshots; Google Chrome has the ability to diff two snapshots.
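
For reference, a minimal sketch of that workflow (assuming Julia ≥ 1.9, where Profile.take_heap_snapshot is available; the file names are arbitrary):

```julia
using Profile

# Set up the leaking reproducer from this issue.
const samples_adj = [randn(30) for i in 1:10_000]'

# Snapshot before the leaking call.
Profile.take_heap_snapshot("before.heapsnapshot")

reduce(vcat, samples_adj)
GC.gc(true); GC.gc(false)

# Snapshot after; load both files in Chrome DevTools
# (Memory tab -> Load) and use the "Comparison" view to diff them.
Profile.take_heap_snapshot("after.heapsnapshot")
```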

@gbaraldi
Member

So I did that, but I couldn't find anything conclusive, though I might have been misunderstanding how the tool works.

@inkydragon
Member

inkydragon commented Jun 30, 2023

Run with different versions of Julia

test.jl
const forecast_samples_adj = [randn(30) for i in 1:10_000]';

function f(arr)
    reduce(vcat, arr)
    GC.gc(true)
    GC.gc(false)
    return Base.gc_live_bytes() / 2^20
end

a = [f(forecast_samples_adj) for i in 1:50]

println("START=$(a[1])\nEND  =$(a[end])\n")
  • Julia 1.6.7: 5.8616 --> 15.8630
  • Julia 1.8.5: 6.1122 --> 16.1137
  • Julia 1.9.1: 6.3575 --> 16.3590
  • master 02f80c6: 9.0230 --> 19.0246

Run with valgrind

test script

test with master 02f80c6, Julia Version 1.10.0-DEV.1607

gc.jl

const forecast_samples_adj = [randn(30) for i in 1:10_000]';

function f(arr)
    reduce(vcat, arr)
    GC.gc(true)
    GC.gc(false)
    return Base.gc_live_bytes() / 2^20
end

count = parse(Int, ARGS[1]);
a = [f(forecast_samples_adj) for i in 1:count]

print("run=$count times\nSTART=$(a[1])\nEND  =$(a[end])\n")

valgrind output log

raw logs: https://gist.github.com/inkydragon/12a26f5ab5acfd5fb93a76862ee493ca

no  name      runs  START   END      END-START  in use at exit  diff  possibly lost  still reachable
                    (MiB)   (MiB)    (MiB)      (bytes)         (B)   (bytes)        (bytes)
1   baseline                                    17,357,612            2,981,723      14,374,385
2   gc1       1     8.6297   8.6297   0.0000    25,726,368            3,709,413      22,015,227
3   gc2       2     8.6297   8.8339   0.2043    25,726,408       40   3,708,932      22,015,748
4   gc5       5     8.6297   9.4463   0.8166    25,726,432       24   3,709,132      22,015,572
5   gc10      10    8.6297  10.4668   1.8372    25,726,496       64   3,709,723      22,015,045
6   gc50      50    8.6297  18.6312  10.0015    25,726,792      296   3,708,634      22,016,430
7   gc100     100   8.6297  28.8366  20.2070    25,727,200      408   3,707,076      22,018,396
8   gc200     200   8.6297  49.2477  40.6180    25,728,160      960   3,709,647      22,016,785
9   gc400     400   9.6297  90.0694  80.4397    25,729,568     1408   3,708,284      22,019,556
  • baseline script: julia -e 'print("1\n")'
  • START / END / END - START: Base.gc_live_bytes() / 2^20
  • HEAP SUMMARY: in use at exit / diff
  • LEAK SUMMARY: possibly lost / still reachable
    definitely lost: 1,728 bytes, remains the same for gc1 ~ gc400

The "in use at exit" value in the HEAP SUMMARY increases gradually, consistent with the increase in the number of runs.
The three measurements in the LEAK SUMMARY do not change significantly and show no increasing trend.

In the last test (#9), Base.gc_live_bytes() / 2^20 outputs 90.0694 MiB, which is far more than the "in use at exit" heap of 24.5 MiB (25_729_568 / 2^20).

in use at         definitely  possibly    still 
exit              lost        lost        reachable
25,729,568  ==  ( 1,728     + 3,708,284 + 22,019,556 )

Maybe Julia ran a garbage collection before exiting?
Another possibility is that the count reported by Base.gc_live_bytes() is not accurate.
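
One way to probe that second possibility is to cross-check the GC's own accounting against the OS-level peak RSS. This is only a sketch: Sys.maxrss() can only grow, so it cannot show memory being freed, but if gc_live_bytes keeps climbing while maxrss plateaus, the "leak" may just be an accounting artifact rather than real memory growth.

```julia
const samples_adj = [randn(30) for i in 1:10_000]'

for i in 1:10
    reduce(vcat, samples_adj)
    GC.gc(true); GC.gc(false)
    # Compare the GC's live-bytes counter with the process peak RSS.
    live = Base.gc_live_bytes() / 2^20
    rss  = Sys.maxrss() / 2^20
    println("run $i: live = $(round(live, digits=4)) MiB, maxrss = $(round(rss, digits=4)) MiB")
end
```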

@d-netto
Copy link
Member

d-netto commented Jul 24, 2024

Any chance this could be an issue with the memory-accounting code (i.e. the GC not reporting live_bytes accurately)?

FWIW, we recently saw a similar case in #54275, and internally at RAI we've been struggling with memory-accounting bugs (e.g. negative live_bytes in some workloads).

8 participants