Buffer/Uint8Array using lots of memory #173
Comments
I would not make this assumption. I would expect at least, say, 48 bytes and up to 256 bytes even for an empty Uint8Array instance. Have a look at these blog posts: Merely storing a single integer in a set in C++ can take 32 bytes!!! There is just no way that creating a whole new … Now, if you create sizeable … Can you run the following code and tell me what you get?
const arr = []; // keep references so the arrays are not garbage collected
let count = 0; // total payload bytes requested so far
const unit = 128; // bytes per Uint8Array
for (let i = 0; i < 10000; i++) {
  arr.push(new Uint8Array(unit));
  count += unit;
  // prints: payload bytes, bytes reported for ArrayBuffers, and their ratio
  const { arrayBuffers } = process.memoryUsage();
  console.log(count + " " + arrayBuffers + " " + arrayBuffers / count);
}
I would allocate a buffer
Thanks for the info @lemire. You're correct that there seems to be a lot of overhead for each TypedArray created, and creating a single large typed array really does consume only the memory I was expecting. I've been doing some digging and found this interesting explanation from a V8 developer:
I also tried the same with ArrayBuffers + DataView, with very slightly better memory efficiency. But that is somewhat moot, since an ObjectId can be represented as a 24-character hex string, which consumes only 40 bytes in V8. That is much better than a Buffer consuming 96 bytes to represent the same raw 12 bytes.
I don't really have much control over this in our implementation, since bson is instantiating lots of Buffers under the hood. Do you know of any good libraries for managing disparate data within a large ArrayBuffer? There's some complexity around removing unused elements and redistributing the available space. Thanks again for your insight here. I think we can close this, since it does not seem to be a Node.js issue directly.
You can grab the returned buffer and copy it to your own larger buffer.
Your project does end up looking like you are trying to build your own custom database engine... which is unavoidably going to require some engineering effort.
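The copy-into-a-larger-buffer idea could look roughly like the sketch below. It is only an illustration, not something proposed in this thread: `PackedIdStore` and its methods are hypothetical names, it assumes fixed-width 12-byte ids (such as the raw bytes of an ObjectId, however your driver exposes them), and it does not handle removal or compaction.

```javascript
// Hypothetical helper: stores fixed-width ids back to back in one Uint8Array,
// so there is one TypedArray object in total instead of one per id.
class PackedIdStore {
  constructor(initialCapacity = 1024, idSize = 12) {
    this.idSize = idSize;
    this.length = 0; // number of ids stored so far
    this.bytes = new Uint8Array(Math.max(1, initialCapacity) * idSize);
  }

  // Copies the id's bytes into the shared buffer and returns its index.
  push(id) {
    if (id.length !== this.idSize) {
      throw new RangeError(`expected ${this.idSize} bytes, got ${id.length}`);
    }
    const offset = this.length * this.idSize;
    if (offset + this.idSize > this.bytes.length) {
      // Out of room: allocate a buffer twice as large and copy the old contents.
      const grown = new Uint8Array(this.bytes.length * 2);
      grown.set(this.bytes);
      this.bytes = grown;
    }
    this.bytes.set(id, offset);
    return this.length++;
  }

  // Returns a view (no copy) of the id at `index`. The view is itself a
  // TypedArray object, so create it on demand rather than keeping many alive.
  // Views obtained before a later growth still point at the old buffer.
  get(index) {
    if (index < 0 || index >= this.length) {
      throw new RangeError(`index ${index} out of range`);
    }
    const offset = index * this.idSize;
    return this.bytes.subarray(offset, offset + this.idSize);
  }

  // Hex form of the id at `index`, for logging or comparison.
  hexAt(index) {
    return Buffer.from(this.get(index)).toString('hex');
  }
}

const store = new PackedIdStore(2);
const first = store.push(Uint8Array.from([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]));
store.push(new Uint8Array(12).fill(255));
store.push(new Uint8Array(12)); // exceeds the initial capacity, triggers growth
console.log(store.hexAt(first)); // 000102030405060708090a0b
```

Edges can then be represented by integer indexes into the store rather than by one Buffer per id. Removing ids and reusing the freed slots is the part that needs the extra engineering effort mentioned above.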
FWIW, when I investigated nodejs/node#53579 I noticed that even an empty array buffer in V8 takes 88 bytes, which is surprisingly big if you ask me. But that also has something to do with us not turning on pointer compression + V8 sandbox (otherwise it would've been ~44 bytes). Also not all the fields are strictly necessary for all array buffers but they are there in advance, or there should've been some clever ways to encode them to save space. But that could incur additional code complexity in V8 that makes it not worth it, and it's mostly a V8 issue.
@joyeecheung So an empty buffer is made of 11 pointers? That sounds like a lot.
Tested in NodeJS Versions: v22.3.0, v20.14.0, v18.20.3, v16.20.2
It appears as though Buffer/Uint8Array consumes much more memory than I would expect. This is particularly obvious with many small instances.
For example:
const data = new Uint8Array(12) // <-- I would think this would consume ~12 bytes
It appears to have a shallow size of 96 bytes and retains 196 bytes.
I'm not sure if this is a V8 issue, but when I try the same in Chrome 126 I see similar behaviour, with slightly less memory used.
Why this is an issue
I stumbled on this while profiling memory issues when pulling large numbers of MongoDB documents into memory, even when projecting the documents to return just 2 ObjectIds each (we're building potentially large graphs in memory from the links).
A BSON ObjectId is 12 bytes, so we estimated ~24MB per million edges (maybe a bit more for object overhead, etc.).
In reality this uses almost 500MB.
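To make the gap concrete, the arithmetic behind these two figures can be written out as below. The constant names are illustrative, MB is taken as 1,000,000 bytes, and the observed figure is rounded to 500MB, so the per-id number is a rough estimate rather than a measurement.

```javascript
const BYTES_PER_OBJECT_ID = 12; // raw size of a BSON ObjectId
const IDS_PER_EDGE = 2;         // each edge links two documents
const EDGES = 1_000_000;
const MB = 1_000_000;           // decimal megabytes (assumption)

// What was expected: raw payload only.
const estimatedBytes = EDGES * IDS_PER_EDGE * BYTES_PER_OBJECT_ID;

// What was observed, rounded to 500MB (assumption).
const observedBytes = 500 * MB;
const bytesPerId = observedBytes / (EDGES * IDS_PER_EDGE);

console.log(estimatedBytes / MB);                         // 24
console.log(bytesPerId);                                  // 250
console.log((observedBytes / estimatedBytes).toFixed(1)); // 20.8
```

Roughly 250 bytes per id is in the same range as the 196 bytes retained per Uint8Array seen in the heap snapshot, once the slot in the containing array and other bookkeeping are included.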
At first I thought this was an issue with BSON's implementation but this can be recreated using Uint8Array directly.
Try it out
It doesn't appear that the memory used increases much with the size of the Uint8Array. Doubling the size of each Uint8Array from 12 -> 24 only increases the memory usage to 485MB in the above test. This tells me there's probably some fixed overhead in the data structure itself, rather than some data being duplicated or something.
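A minimal sketch of how this kind of measurement could be run is shown below. It is not necessarily the code behind the figures above: `measure` is a hypothetical helper, the counts are illustrative, and results will vary by Node.js version and flags. Running with `node --expose-gc` makes the numbers steadier but is optional.

```javascript
// Allocates `count` Uint8Arrays of `size` bytes and reports how much memory
// process.memoryUsage() attributes to them.
function measure(count, size) {
  const gc = typeof global.gc === 'function' ? global.gc : () => {};
  gc();
  const before = process.memoryUsage();

  const arrays = new Array(count);
  for (let i = 0; i < count; i++) {
    arrays[i] = new Uint8Array(size);
  }

  gc();
  const after = process.memoryUsage();

  // Small typed arrays may live entirely on the JS heap, larger ones use
  // off-heap backing stores, so both deltas are reported.
  const heapDelta = after.heapUsed - before.heapUsed;
  const arrayBufferDelta = after.arrayBuffers - before.arrayBuffers;

  return {
    count: arrays.length, // also keeps `arrays` alive until after measuring
    size,
    heapDelta,
    arrayBufferDelta,
    bytesPerArray: (heapDelta + arrayBufferDelta) / count,
  };
}

// 100,000 arrays keeps the run quick; use 2_000_000 to mirror one million
// edges with two ids each.
for (const size of [12, 24]) {
  console.log(measure(100_000, size));
}
```

If the per-array cost barely changes between the two sizes, that points to fixed per-object overhead rather than payload size.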
Curiously, when I try the same thing with Buffer.from(new Uint8Array(12)), it only outputs ~240MB. I assume this is because Buffer doesn't keep a reference to something(?) and GC happens sometime before capturing heapUsed.
See below: when using Buffer.from(new Uint8Array(12)), it retains 100 bytes less 🤔
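One possible factor worth checking: the Node.js documentation notes that Buffer.from() with array-like input may draw from the same internal pool that Buffer.allocUnsafe() uses, in which case many small Buffers would share one backing ArrayBuffer. Whether that explains the difference seen here is an assumption. The sketch below only shows how to inspect it.

```javascript
const a = Buffer.from(new Uint8Array(12));
const b = Buffer.from(new Uint8Array(12));
const plain = new Uint8Array(12);

// Size of the internal pool that small Buffers may be carved out of.
console.log('Buffer.poolSize:', Buffer.poolSize);

// If both Buffers came from the pool, they are views over the same ArrayBuffer.
console.log('a and b share an ArrayBuffer:', a.buffer === b.buffer);
console.log('backing ArrayBuffer size for a:', a.buffer.byteLength);

// A plain Uint8Array always has its own backing ArrayBuffer of exactly its length.
console.log('backing ArrayBuffer size for plain:', plain.buffer.byteLength);
```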
Thanks
Big thanks to the Node.js Performance Team in advance. You're doing amazing work 👍 Please let me know if this is an issue with V8 directly or if this is completely expected behaviour. It really caught me off guard.