runtime: HeapSys increases until OOM termination #35890
Comments
Sometimes HeapSys crosses the 20 GB line and continues to grow, but in the end the process is always killed by the OOM killer.
I just realized that --memory-swap must be set equal to --memory to turn swapping off. Anyway, I got the same result using --memory 10 --memory-swap 10 (and with 8 it actually crashes a bit faster).
/cc @aclements
From what I see, there is no infinite growth. Maybe your machine is simply close enough to its memory limit that it dies before memory use plateaus.

The memory use of this code is definitely challenging for the allocator. It allocates a very large array, O(n^2), for the all-pairs shortest path results, and it also makes lots of smaller allocations for the various maps it uses. I think what is going on is that we allocate the large array, free it, then allocate a few small items from the freed space. When we go to allocate another large array, it no longer fits in the freed space, so we allocate a new large array. Repeat that a few times, and trouble ensues.

@mknyszek: is there another bug related to this behavior? Maybe #14045? Tip should be at least a bit better than 1.13 for this program, since tip scavenges memory more promptly (that is, assuming your OOM is related to physical memory use, not virtual).
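A minimal sketch of the fragmentation pattern described above (the sizes and loop structure are illustrative assumptions, not the reporter's actual code):

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	var small [][]byte
	var stats runtime.MemStats
	for i := 0; i < 10; i++ {
		// A very large allocation, standing in for the O(n^2)
		// all-pairs shortest path results.
		big := make([]byte, 1<<30) // 1 GiB
		big[0] = 1                 // touch it so it is really used
		big = nil                  // drop the reference so the GC can free it
		_ = big

		// A few small allocations may land in the space just freed,
		// so the next large allocation no longer fits there and the
		// heap has to grow instead.
		for j := 0; j < 1000; j++ {
			small = append(small, make([]byte, 4096))
		}

		runtime.GC()
		runtime.ReadMemStats(&stats)
		fmt.Printf("iter %d: HeapSys = %d MiB\n", i, stats.HeapSys>>20)
	}
	_ = small
}
```

If the fragmentation theory holds, HeapSys printed by this loop should ratchet upward across iterations even though the live heap stays roughly constant.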
@randall77 #14045 is related, I think, since heap growth should be causing lots of small bits of memory to get scavenged (maybe the scavenger erroneously thinks it should be off?). I suspect that in Go 1.12 and 1.13 this is easier to trigger because we don't allocate across memory that is both scavenged and unscavenged (and consequently, @randall77, maybe that is why you see it max out a bit lower). But I don't fully understand how this differs from the problem in #35848. Perhaps the two can be folded into one issue?
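One quick way to test whether the scavenger is the suspect is to force an eager scavenge with `runtime/debug.FreeOSMemory` and watch RSS from outside the process. A diagnostic sketch (the "heavy iteration" placeholder is hypothetical):

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

func main() {
	// ... run one heavy iteration of the workload here ...

	// Force a full GC and return as much memory to the OS as possible.
	debug.FreeOSMemory()

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("after FreeOSMemory: HeapSys=%d MiB, HeapReleased=%d MiB\n",
		m.HeapSys>>20, m.HeapReleased>>20)
	// If the container's RSS drops sharply at this point, the growth was
	// free-but-unscavenged memory rather than live heap.
}
```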
@mknyszek maybe I'm hitting a similar problem. It's a database system: a SQL workload starts at 15:00 and stops at 15:40, after which there is no other load. The metrics are taken from MemStats: HeapSys, HeapInuse, HeapIdle, and HeapReleased, as well as (HeapIdle minus HeapReleased). I tested this with Go 1.13 on Linux.
@lysu This is exactly the behavior I'd expect, and it is working as intended. We never unmap heap memory; we only return memory to the OS via madvise.
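A sketch of how the MemStats fields discussed here relate, assuming the point above: because heap memory is never unmapped, HeapSys never shrinks, and memory given back to the OS accumulates in HeapReleased, so the physically retained heap is approximately HeapSys minus HeapReleased:

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

func main() {
	var m runtime.MemStats
	for {
		runtime.ReadMemStats(&m)
		// HeapSys never shrinks because heap memory is never unmapped;
		// memory returned to the OS shows up in HeapReleased instead.
		retained := (m.HeapSys - m.HeapReleased) >> 20
		fmt.Printf("HeapSys=%d HeapInuse=%d HeapIdle=%d HeapReleased=%d retained≈%d (MiB)\n",
			m.HeapSys>>20, m.HeapInuse>>20, m.HeapIdle>>20, m.HeapReleased>>20, retained)
		time.Sleep(10 * time.Second)
	}
}
```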
@mknyszek thank you for your detailed explanations ~ I'd like to learn more about this.
I found a bug in my example: it did not actually apply the docker memory limits. So I've updated the example and tested it on different host OSs: Ubuntu 18.04 (with two different kernel versions) among others. In the real service we perform this kind of graph initialization once an hour, so there is plenty of time to collect all the garbage after this heavy operation. But it leaks anyway.
What version of Go are you using (`go version`)?

What operating system and processor architecture are you using?
Host machine: Mac, macOS Catalina 10.15.1, Core i7 2.5 GHz, 16 GB RAM
In Docker (19.03.5): ubuntu:latest, 10 GB memory limit, swap off
I build the app for Linux using the GOOS/GOARCH flags.
What did you do?
I've prepared an extended example based on #35848, but this one has an important difference: we have a 'current' pointer to the data, and also an 'old' pointer that is cleared (old = nil) right after the new data is generated. A minimal sketch of this pattern appears after the reproduction steps below.
https://github.com/savalin/example/tree/extendend_dockerized_oom_example
branch: extendend_dockerized_oom_example (master is for #35848)
It demonstrates the problem we faced on production servers (Kubernetes). In this example we use one dataset and load it in a loop. The OOM killer terminates the process after ~10 iterations.
make build && make run
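A minimal sketch of the load loop described above; `Graph` and `loadData` are hypothetical stand-ins for the dataset and the expensive initialization in the linked repository:

```go
package main

import "runtime"

// Graph and loadData are hypothetical stand-ins for the dataset and the
// expensive O(n^2) initialization in the linked example.
type Graph struct {
	dist [][]float64 // all-pairs distances
}

func loadData() *Graph {
	const n = 4096 // ~128 MiB of float64 per graph at this size
	d := make([][]float64, n)
	for i := range d {
		d[i] = make([]float64, n)
	}
	return &Graph{dist: d}
}

func main() {
	var current *Graph
	for i := 0; i < 20; i++ {
		old := current       // keep a handle on the previous dataset
		current = loadData() // generate the new data ("current" pointer)
		old = nil            // drop the old one right after generation
		_ = old
		runtime.GC()
	}
	_ = current
}
```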
Output

What did you expect to see?
I expect the app to consume a constant amount of memory instead of increasing its consumption iteration by iteration.
What did you see instead?
When HeapSys reaches ~20 GB, the container is terminated by the OOM killer.
Sometimes it takes longer to reproduce the issue, but on my laptop it fails at approximately the 10th iteration (see the output above).