
Memory leak in Base #50345

Open
schlichtanders opened this issue Jun 29, 2023 · 11 comments
Labels
GC Garbage collector

Comments

@schlichtanders

schlichtanders commented Jun 29, 2023

Hi there,

I found a small example that reproduces a memory leak which kept me busy for several days. (I hope to find a workaround soon.)

Running the following in the Julia REPL several times shows a steady increase in memory usage (about 200 KB per run).

begin
forecast_samples = [randn(30) for i in 1:10_000]
sums = vec(mapslices(sum, reduce(vcat, forecast_samples'), dims=1))
GC.gc(); Base.gc_live_bytes() / 2^20
end

This example is certainly not yet minimal, but at least it is tiny.

(tested on Julia 1.9.0 and 1.9.1)

EDIT: I just realized that gc_live_bytes is quite noisy, so it is better to run the above a couple of times. E.g.

[begin
forecast_samples = [randn(30) for i in 1:10_000]
sums = vec(mapslices(sum, reduce(vcat, forecast_samples'), dims=1))
GC.gc(); Base.gc_live_bytes() / 2^20
end for i in 1:50]
@schlichtanders
Author

Doing it with explicit for loops indeed does not lead to a memory leak:

[begin
forecast_samples = [randn(30) for i in 1:10_000]
sums = [sum([forecast_samples[s][i] for s in eachindex(forecast_samples)]) for i in 1:30]
GC.gc(); Base.gc_live_bytes() / 2^20
end for i in 1:50]

This is my workaround for now.

@KristofferC
Member

Could be a duplicate of #49545, where the GC decides to not do a full collection.

@vchuravy
Member

GC.gc(true); GC.gc(false) to force a full collection

@schlichtanders
Author

schlichtanders commented Jun 29, 2023

true is the default according to the documentation, and setting it explicitly still leaks the same amount of memory.

EDIT: Now I understand that you are saying one should run both!
Unfortunately, it still leaks:

[begin
forecast_samples = [randn(30) for i in 1:10_000]
sums = vec(mapslices(sum, reduce(vcat, forecast_samples'), dims=1))
GC.gc(true); GC.gc(false); Base.gc_live_bytes() / 2^20
end for i in 1:50]

@barucden
Contributor

barucden commented Jun 29, 2023

I think the example can be reduced to a reduce of vcat over an adjoint.

const forecast_samples     = [randn(30) for i in 1:10_000];
const forecast_samples_adj = [randn(30) for i in 1:10_000]';

function f(arr)
    reduce(vcat, arr)
    GC.gc(true)
    GC.gc(false)
    return Base.gc_live_bytes() / 2^20
end
without adjoint
julia> [f(forecast_samples) for i in 1:50]
50-element Vector{Float64}:
 10.236428260803223
 10.23690128326416
 10.23690128326416
  ⋮
 10.23690128326416
 10.23690128326416
with adjoint
julia> [f(forecast_samples_adj) for i in 1:50]
50-element Vector{Float64}:
 10.919855117797852
 11.124429702758789
 11.328531265258789
  ⋮
 20.71720314025879
 20.92130470275879

@KristofferC KristofferC added the GC Garbage collector label Jun 29, 2023
@gbaraldi
Member

I can reproduce the leak on master, but I can't find what the difference is. The GC is finding the memory, because it counts it as live, but I'm not sure what is going on; the reduce call is leaking memory somewhere.

@bvdmitri
Contributor

bvdmitri commented Jun 29, 2023

I can reproduce on a Mac M2. @barucden I also noticed that the "without adjoint" version also allocates, but only on the first iteration (perhaps the result array itself?)

julia> [f(forecast_samples) for i in 1:50] |> diff
49-element Vector{Float64}:
 0.0004730224609375
 0.0
 0.0
  ⋮
 0.0
 0.0
julia> [f(forecast_samples_adj) for i in 1:50] |> diff
49-element Vector{Float64}:
 0.2045745849609375
 0.2041015625
 0.2041015625
  ⋮
 0.2041015625
 0.2041015625
julia> versioninfo()
Julia Version 1.9.1
Commit 147bdf428cd (2023-06-07 08:27 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  CPU: 10 × Apple M2 Pro
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
  Threads: 1 on 6 virtual cores

@vchuravy
Member

One potential way of understanding this would be to take heap snapshots; Google Chrome has the ability to diff two snapshots.
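
For reference, a minimal sketch of that workflow (assuming Julia ≥ 1.9, where Profile.take_heap_snapshot is available; the file names are arbitrary):

```julia
using Profile

# Set up the leaking reproducer from this issue.
const samples_adj = [randn(30) for i in 1:10_000]'

# Snapshot before the leaking call.
Profile.take_heap_snapshot("before.heapsnapshot")

reduce(vcat, samples_adj)
GC.gc(true); GC.gc(false)

# Snapshot after; load both files in Chrome DevTools
# (Memory tab -> Load) and use the "Comparison" view to diff them.
Profile.take_heap_snapshot("after.heapsnapshot")
```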

@gbaraldi
Member

So I did that, but I couldn't find anything conclusive, though I might have been misunderstanding how the tool works.

@inkydragon
Member

inkydragon commented Jun 30, 2023

Run with different versions of Julia

test.jl
const forecast_samples_adj = [randn(30) for i in 1:10_000]';

function f(arr)
    reduce(vcat, arr)
    GC.gc(true)
    GC.gc(false)
    return Base.gc_live_bytes() / 2^20
end

a = [f(forecast_samples_adj) for i in 1:50]

println("START=$(a[1])\nEND  =$(a[end])\n")
  • Julia 1.6.7: 5.8616 --> 15.8630
  • Julia 1.8.5: 6.1122 --> 16.1137
  • Julia 1.9.1: 6.3575 --> 16.3590
  • master 02f80c6: 9.0230 --> 19.0246

Run with valgrind

test script

test with master 02f80c6, Julia Version 1.10.0-DEV.1607

gc.jl

const forecast_samples_adj = [randn(30) for i in 1:10_000]';

function f(arr)
    reduce(vcat, arr)
    GC.gc(true)
    GC.gc(false)
    return Base.gc_live_bytes() / 2^20
end

count = parse(Int, ARGS[1]);
a = [f(forecast_samples_adj) for i in 1:count]

print("run=$count times\nSTART=$(a[1])\nEND  =$(a[end])\n")

valgrind output log

raw logs: https://gist.github.com/inkydragon/12a26f5ab5acfd5fb93a76862ee493ca

no  name      runs  START   END      END-START  in use at exit  diff  possibly lost  still reachable
                    (MiB)   (MiB)    (MiB)      (bytes)         (B)   (bytes)        (bytes)
1   baseline                                    17,357,612            2,981,723      14,374,385
2   gc1       1     8.6297   8.6297   0.0000    25,726,368            3,709,413      22,015,227
3   gc2       2     8.6297   8.8339   0.2043    25,726,408       40   3,708,932      22,015,748
4   gc5       5     8.6297   9.4463   0.8166    25,726,432       24   3,709,132      22,015,572
5   gc10      10    8.6297  10.4668   1.8372    25,726,496       64   3,709,723      22,015,045
6   gc50      50    8.6297  18.6312  10.0015    25,726,792      296   3,708,634      22,016,430
7   gc100     100   8.6297  28.8366  20.2070    25,727,200      408   3,707,076      22,018,396
8   gc200     200   8.6297  49.2477  40.6180    25,728,160      960   3,709,647      22,016,785
9   gc400     400   9.6297  90.0694  80.4397    25,729,568     1408   3,708,284      22,019,556
  • baseline script: julia -e 'print("1\n")'
  • START / END / END - START: Base.gc_live_bytes() / 2^20
  • HEAP SUMMARY: in use at exit / diff
  • LEAK SUMMARY: possibly lost / still reachable
    definitely lost: 1,728 bytes, remains the same for gc1 ~ gc400

The "in use at exit" value in the HEAP SUMMARY increases gradually, consistent with the increase in the number of runs.
The three measurements in the LEAK SUMMARY do not change significantly and show no increasing trend.

In the last test (#9), Base.gc_live_bytes() / 2^20 outputs 90.0694 MiB, which is far more than the "in use at exit" heap of 24.5 MiB (25_729_568 / 2^20).

in use at         definitely  possibly    still 
exit              lost        lost        reachable
25,729,568  ==  ( 1,728     + 3,708,284 + 22,019,556 )

Maybe Julia ran a garbage collection before exiting?
Another possibility is that the count reported by Base.gc_live_bytes() is not accurate.
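
One way to probe that second possibility is to cross-check the GC's own accounting against the OS-level peak RSS. This is only a sketch: Sys.maxrss() can only grow, so it cannot show memory being freed, but if gc_live_bytes keeps climbing while maxrss plateaus, the "leak" may just be an accounting artifact rather than real memory growth.

```julia
const samples_adj = [randn(30) for i in 1:10_000]'

for i in 1:10
    reduce(vcat, samples_adj)
    GC.gc(true); GC.gc(false)
    # Compare the GC's live-bytes counter with the process peak RSS.
    live = Base.gc_live_bytes() / 2^20
    rss  = Sys.maxrss() / 2^20
    println("run $i: live = $(round(live, digits=4)) MiB, maxrss = $(round(rss, digits=4)) MiB")
end
```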

@d-netto
Copy link
Member

d-netto commented Jul 24, 2024

Any chance this could be an issue with the memory-accounting code (i.e. the GC not reporting live_bytes accurately)?

FWIW, we recently saw a similar case in #54275, and internally at RAI we've been struggling with memory-accounting bugs (e.g. negative live_bytes in some workloads).

8 participants