Why don't we use jemalloc? #21973
Related: nodejs/node-v0.x-archive#5339
I've tested this briefly. I'm seeing slight performance improvements (2.5%-5%) but higher RSS usage (100 MB -> 140 MB) with some HTTP usage. I would like to test it with some heavy HTTP applications with very high memory usage first, to see whether the extra 40 MB of RSS is a fixed cost or a percentage. This is not a buffer-heavy use case.
Also related: #17007
Should this remain open? Or is this a conversation that has run its course and the issue can be closed?
@mcollina Any updates re: HTTP performance? Also, just in case: which jemalloc version did you use? @Trott I believe that we need some data on how this affects something closer to real-world applications, not just microbenchmarks.
I was not able to do any further testing.
As a point of reference, Rust just finished removing all of jemalloc, saying that the system allocators tend to be better.
@Fishrock123 Quite incorrect there. jemalloc was removed as the default because it forced binary bloat on projects that didn't care much about it. jemalloc is often more performant.
Part of the reason for the removal of jemalloc as the only runtime allocator in Rust was to provide support for a wider range of target platforms, architectures and tooling. It now allows an optional, configurable global_allocator that works with crates such as jemallocator. In my experience the reduced heap fragmentation in long-running, multi-threaded, glibc-based Linux processes is jemalloc's greatest benefit. Native Node.js modules using the libuv threadpool and worker threads would be good examples of where it might help some users some of the time. The ability to inject jemalloc via LD_PRELOAD on Linux already provides a runtime integration on the platform it benefits most, from which one could infer it doesn't need to live in Node.js itself. Perhaps an addition to the documentation about how to do this would be appropriate?
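For reference, the `LD_PRELOAD` injection described above is a one-line change on Linux. A minimal sketch, assuming jemalloc is installed at `/usr/lib/libjemalloc.so` (the actual path is distribution-specific, e.g. `/usr/lib/x86_64-linux-gnu/libjemalloc.so.2` on Debian/Ubuntu; `app.js` stands in for any application entry point):

```sh
# Run a Node.js app with jemalloc injected at startup (Linux only).
# The library path below is an assumption; adjust it for your distribution.
LD_PRELOAD=/usr/lib/libjemalloc.so node app.js

# One way to verify the preload actually took effect: check the process maps.
LD_PRELOAD=/usr/lib/libjemalloc.so node -e "
  const fs = require('fs');
  const loaded = fs.readFileSync('/proc/self/maps', 'utf8').includes('jemalloc');
  console.log('jemalloc loaded:', loaded);
"
```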
I think the consensus is we're not going to make this change? And as @lovell points out, you can already use jemalloc through LD_PRELOAD. See also the conclusion in #17007. I'll go ahead and close this out. If someone wants to document the LD_PRELOAD approach, please open a pull request. I don't have suggestions on where to add it exactly, though.
Note: while switching to `jemalloc` might be suboptimal, it could still be useful to gather information about the positive and negative implications of `jemalloc` usage. Let's do that in this issue. I am not advocating switching to `jemalloc` (yet), but IMO it is worth investigating.

In some situations that I observed, it consumes slightly more memory (~5%), but in some cases it is able to reduce memory usage by orders of magnitude (basically the subset of cases where `glibc` behaves significantly suboptimally). In some situations jemalloc consumes significantly more memory, but appears to be faster.
Testcase 1 (based on #21967):

Atm (with Node.js v10.7.0) it produces the following results:

I traced that down to C++ `malloc()` behavior (testcase in #21967 (comment)). This is what happens just with `LD_PRELOAD=/usr/lib/libjemalloc.so`:

Testcase 2:

Normal — `697 MiB`, jemalloc — `37 MiB`.

Testcase 3 (jemalloc consumes more memory):

Normal:
jemalloc:
Testcase 4:

Measured with `/usr/bin/time -f '%M KiB, %e seconds' node testcase-4.js`.

- Normal (1e4 * 1e7): `129 576 KiB, 11.41 seconds`
- jemalloc (1e4 * 1e7): `34 928 KiB, 3.33 seconds`
- Normal (1e5 * 1e6): `109 196 KiB, 11.51 seconds`
- jemalloc (1e5 * 1e6): `35 636 KiB, 4.13 seconds`

Testcase 5 (like 4, but now we fill the buffer with `1`-s):

- Normal (1e4 * 1e7): `139 060 KiB, 14.38 seconds`
- jemalloc (1e4 * 1e7): `170 308 KiB, 10.65 seconds`
- Normal (1e5 * 1e6): `105 120 KiB, 12.25 seconds`
- jemalloc (1e5 * 1e6): `112 548 KiB, 10.99 seconds`
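The `testcase-4.js` / `testcase-5.js` scripts are not reproduced in this thread. As a purely hypothetical sketch of the kind of workload the "1e4 * 1e7" labels suggest (1e4 iterations allocating 1e7-byte Buffers), not the original code:

```sh
# Hypothetical reproduction only -- not the original testcase-4.js / testcase-5.js.
# Allocate 1e4 Buffers of 1e7 bytes each and let them be garbage-collected,
# measuring peak RSS and wall time without and with jemalloc preloaded.
/usr/bin/time -f '%M KiB, %e seconds' node -e "
  const iterations = 1e4, size = 1e7;
  for (let i = 0; i < iterations; i++) {
    const buf = Buffer.allocUnsafe(size); // Testcase 5 variant: add buf.fill(1)
  }
"

# The same workload with jemalloc injected (library path is distro-specific).
LD_PRELOAD=/usr/lib/libjemalloc.so /usr/bin/time -f '%M KiB, %e seconds' node -e "
  const iterations = 1e4, size = 1e7;
  for (let i = 0; i < iterations; i++) {
    const buf = Buffer.allocUnsafe(size);
  }
"
```

The second invocation differs only by the `LD_PRELOAD` prefix, matching the measurement command quoted under Testcase 4.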
Testcase 6 (from #8871, where @bnoordhuis mentioned `jemalloc`):

No improvement in this case, and roughly a 5% increase in memory usage.

- Normal: `3 022 496 KiB, 5.88 seconds`
- jemalloc: `3 179 716 KiB, 5.91 seconds`

/cc @addaleax @bnoordhuis @mscdex