[Question] Does Nomad 0.9 add significant memory overhead? #6543

Closed
notnoop opened this issue Oct 24, 2019 · 2 comments

Comments

@notnoop
Contributor

notnoop commented Oct 24, 2019

Nomad 0.9 introduced new auxiliary processes (e.g. logmon, docker_logger) per task, whereas Nomad 0.8 only had an executor process for raw_exec/exec/java driver tasks. On Linux, each of these processes consumes around 30MB of RSS, though Nomad 0.9.6 reduced that to 10-25MB via #6341.

Is there cause for concern here? How does memory usage scale with the number of tasks? Would 100 running raw_exec tasks add 3-5GB of overhead?

@notnoop
Contributor Author

notnoop commented Oct 24, 2019

While there is some overhead in Nomad 0.9, it is not of the order of magnitude the question assumes. Naive analysis is misleading here, and extrapolating from per-process RSS leads us astray, for two main reasons.

Nomad binary and RSS counting

When running tens or hundreds of tasks, summing plain RSS across processes is a poor way to extrapolate. Besides process-specific memory (e.g. heap and stack), RSS also includes the loaded portions of the binary and shared libraries. The kernel caches those pages and shares them across multiple instances of the same binary, so they are counted once per process rather than once overall.

For example, the kernel loads the glibc library once and shares it among all the processes linked against it, yet that single copy is reported in the RSS of each of those processes.

The large nomad binary, around 84MB currently, inflates the reported RSS of each auxiliary process, but its pages are cached and shared effectively when running at scale. In some of our tests, ~26MB out of a 30MB RSS was attributable to the nomad binary and shared libraries (e.g. libc, libpthread, ld), though values differ between tests and the exact executor.
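To see the difference concretely, here is a minimal sketch (not part of Nomad, assuming a Linux kernel new enough to expose /proc/<pid>/smaps_rollup) that compares Rss and Pss for a list of PIDs. Pss divides shared pages proportionally among the processes that map them, so summing Pss avoids counting the nomad binary and libc once per process:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// readKB returns the value in kB of a field (e.g. "Rss", "Pss") from
// /proc/<pid>/smaps_rollup.
func readKB(pid, field string) (int64, error) {
	f, err := os.Open("/proc/" + pid + "/smaps_rollup")
	if err != nil {
		return 0, err
	}
	defer f.Close()

	s := bufio.NewScanner(f)
	for s.Scan() {
		if strings.HasPrefix(s.Text(), field+":") {
			parts := strings.Fields(s.Text()) // e.g. ["Rss:", "30124", "kB"]
			return strconv.ParseInt(parts[1], 10, 64)
		}
	}
	return 0, fmt.Errorf("field %s not found for pid %s", field, pid)
}

func main() {
	var rssTotal, pssTotal int64
	for _, pid := range os.Args[1:] { // pass logmon/docker_logger PIDs as arguments
		rss, err := readKB(pid, "Rss")
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			continue
		}
		pss, err := readKB(pid, "Pss")
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			continue
		}
		fmt.Printf("pid %s: Rss=%d kB Pss=%d kB\n", pid, rss, pss)
		rssTotal += rss
		pssTotal += pss
	}
	fmt.Printf("sum Rss=%d kB, sum Pss=%d kB (Pss splits shared pages across processes)\n", rssTotal, pssTotal)
}
```

Run against a set of auxiliary process PIDs, the summed Pss should come in well below the summed Rss, since the binary and library pages are shared rather than duplicated.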

Golang memory usage and Garbage Collection Behavior

Detailing Go memory management and garbage collection (GC) is beyond the scope of this issue, and there is a wealth of resources on the topic [1][2][3]. The salient point is that each auxiliary process manages its own heap: each may allocate more memory than it immediately needs, may be slow to release freed memory back to the operating system, and the kernel may reclaim that released memory lazily [4].

When running many tasks, the nomad auxiliary processes may therefore appear to claim more memory than they actually need, which complicates naive analysis; that memory is returned to the system under memory pressure.
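As a rough illustration of that gap, the following sketch (not Nomad code) prints a few runtime.MemStats fields after allocating and dropping some memory: HeapInuse is what the program is actually using, HeapSys is what the runtime has obtained from the OS, and HeapReleased is what has already been handed back (which the kernel may reclaim lazily):

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	// Allocate ~256 MiB and then drop the references so the heap has free space.
	buf := make([][]byte, 0, 256)
	for i := 0; i < 256; i++ {
		buf = append(buf, make([]byte, 1<<20)) // 1 MiB chunks
	}
	buf = nil
	runtime.GC()

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	// HeapInuse:    bytes in in-use heap spans (what the program really uses).
	// HeapSys:      bytes of heap memory obtained from the OS.
	// HeapReleased: bytes already returned to the OS; the kernel may reclaim
	//               these lazily, so tools can still charge them to the process.
	fmt.Printf("HeapInuse:    %d MiB\n", m.HeapInuse>>20)
	fmt.Printf("HeapSys:      %d MiB\n", m.HeapSys>>20)
	fmt.Printf("HeapReleased: %d MiB\n", m.HeapReleased>>20)
}
```

The spread between HeapInuse and HeapSys is exactly the kind of slack that shows up in a per-process RSS number without reflecting real memory demand.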

Takeaway

At this point, we don't believe that externalizing these processes causes a substantial increase in memory usage. We recognize that there is room for improvement (e.g. tweaking RPC buffers) to reduce memory usage overall.

I may follow up with more detailed reports of our findings, as well as follow-up GitHub issues arising from this research.

[1] https://blog.golang.org/ismmkeynote
[2] https://medium.com/samsara-engineering/running-go-on-low-memory-devices-536e1ca2fe8f
[3] https://povilasv.me/go-memory-management/
[4] https://golang.org/doc/go1.13#runtime

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 17, 2022