Is node-addon-api@1.3.0 leaking memory? #237
My first guess is that you may just be missing some required cleanup after your async event runs, or you create so many objects so quickly that they just don't get to run fast enough and you get a large backlog, eventually running out of memory due to that backlog. I'm travelling today but will take a closer look later in the week. |
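For context, "required cleanup" at the raw N-API level looks roughly like the sketch below. This is illustrative only (node-addon-api's AsyncWorker normally does this for you), and the names WorkData, OnExecute and OnComplete are placeholders, not code from this addon: the complete callback has to release the callback reference and the async work handle, otherwise every call leaks a little native memory.

```cpp
#include <node_api.h>

// Hypothetical per-call state; the names are illustrative.
struct WorkData {
  napi_async_work work;
  napi_ref callback_ref;
};

static void OnExecute(napi_env env, void* data) {
  // Runs on a worker thread; must not touch JavaScript values here.
}

static void OnComplete(napi_env env, napi_status status, void* data) {
  WorkData* wd = static_cast<WorkData*>(data);
  // ... look up wd->callback_ref and call back into JavaScript here ...
  // Required cleanup: release the callback reference and the work handle.
  napi_delete_reference(env, wd->callback_ref);
  napi_delete_async_work(env, wd->work);
  delete wd;
}
```

Both callbacks would be passed to napi_create_async_work when the work item is queued.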
This would be good. Unfortunately I can't see anything that's missing although I've looked at everything many times.
Only one instance of MyObject is created from JavaScript and it's async
Excellent, no rush. I don't want to send you on a wild goose chase here but this issue reminds me a bit of this one. However, I have no idea if the issues are in any way related. |
@mhdawson Were you able to verify whether or not there's a memory leak here? |
I've not had time to investigate yet. Have a lot of things on my plate, sorry. |
Some investigation using the existing asyncworker
Running under valgrind with 5000 and 10000 iterations showed: … So I don't see an obvious leak on the native side. The heap does seem to grow, but running with … From this I don't think it's something on the asyncworker side, so I will need to integrate the use of ObjectWrap to see if we get unbounded memory growth with that. |
Existing asyncworker test case with use of object wrap integrated:
asyncworker.cc
#include "napi.h"
using namespace Napi;
class TestWorker : public AsyncWorker {
public:
static void DoWork(const CallbackInfo& info) {
bool succeed = info[0].As<Boolean>();
Function cb = info[1].As<Function>();
Value data = info[2];
TestWorker* worker = new TestWorker(cb);
worker->Receiver().Set("data", data);
worker->_succeed = succeed;
worker->Queue();
}
TestWorker(Function cb) : AsyncWorker(cb), _succeed(true) {}
protected:
void Execute() override {
if (!_succeed) {
SetError("test error");
}
}
private:
bool _succeed;
};
class MyObject : public Napi::ObjectWrap<MyObject> {
public:
static Napi::Object Init(Napi::Env env, Napi::Object exports);
MyObject(const Napi::CallbackInfo& info);
private:
static Napi::FunctionReference constructor;
void Open(const Napi::CallbackInfo& info);
};
Napi::FunctionReference MyObject::constructor;
Napi::Object MyObject::Init(Napi::Env env, Napi::Object exports) {
Napi::HandleScope scope(env);
Napi::Function func = DefineClass(env, "MyObject", {
InstanceMethod("open", &MyObject::Open)
});
constructor = Napi::Persistent(func);
constructor.SuppressDestruct();
exports.Set("MyObject", func);
return exports;
}
MyObject::MyObject(const Napi::CallbackInfo& info)
: Napi::ObjectWrap<MyObject>(info) {
}
void MyObject::Open(const Napi::CallbackInfo& info) {
Napi::Function cb = info[0].As<Napi::Function>();
TestWorker* worker = new TestWorker(cb);
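// Note: the worker is not deleted explicitly; AsyncWorker deletes itself after
// the completion callback has run (unless SuppressDestruct() is called).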
worker->Queue();
}
Object InitAsyncWorker(Env env) {
Object exports = Object::New(env);
exports["doWork"] = Function::New(env, TestWorker::DoWork);
return MyObject::Init(env, exports);
}
asyncworker.js
'use strict';
const buildType = process.config.target_defaults.default_configuration;
const assert = require('assert');
test(require(`./build/${buildType}/binding.node`));
test(require(`./build/${buildType}/binding_noexcept.node`));
function test(binding) {
binding.asyncworker.doWork(true, function (e) {
assert.strictEqual(typeof e, 'undefined');
assert.strictEqual(typeof this, 'object');
assert.strictEqual(this.data, 'test data');
}, 'test data');
binding.asyncworker.doWork(false, function (e) {
assert.ok(e instanceof Error);
assert.strictEqual(e.message, 'test error');
assert.strictEqual(typeof this, 'object');
assert.strictEqual(this.data, 'test data');
}, 'test data');
const myObject = new binding.asyncworker.MyObject();
let count = 0;
function open() {
myObject.open(() => {
count += 1;
if (count % 1000000 === 0) {
console.log(count);
}
open();
});
}
open();
} |
using the following to limit the number of loops in the open() function:
if (count < X) {
  open();
}
Valgrind on 10000:
==25463== HEAP SUMMARY:
==25463== in use at exit: 164,324 bytes in 1,470 blocks
==25463== total heap usage: 181,094 allocs, 179,624 frees, 52,618,468 bytes allocated
==25463==
==25463== LEAK SUMMARY:
==25463== definitely lost: 5,160 bytes in 298 blocks
==25463== indirectly lost: 72 bytes in 2 blocks
==25463== possibly lost: 76,425 bytes in 15 blocks
==25463== still reachable: 82,667 bytes in 1,155 blocks
==25463== suppressed: 0 bytes in 0 blocks
==25463== Rerun with --leak-check=full to see details of leaked memory
Valgrind on 20000:
==25478== HEAP SUMMARY:
==25478== in use at exit: 164,324 bytes in 1,470 blocks
==25478== total heap usage: 341,518 allocs, 340,048 frees, 64,446,004 bytes allocated
==25478==
==25478== LEAK SUMMARY:
==25478== definitely lost: 5,136 bytes in 298 blocks
==25478== indirectly lost: 96 bytes in 2 blocks
==25478== possibly lost: 76,425 bytes in 15 blocks
==25478== still reachable: 82,667 bytes in 1,155 blocks
==25478== suppressed: 0 bytes in 0 blocks
==25478== Rerun with --leak-check=full to see details of leaked memory
==25478==
So I don't see any native memory leak based on the number of times open is called. |
Running with … and watching memory with top, memory increases and then stabilizes at the level shown below (captured at about 14 mins into the run):
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
25500 mhdawson 20 0 1290348 236352 12132 R 174.6 1.4 14:11.08 node
1583 jsbuild 20 0 7664760 186020 15260 S 0.7 1.1 33:11.52 java
123 root 20 0 0 0 0 S 0.3 0.0 4:42.11 kworker/3:1
GC activity also stabilizes, with frequent GCs freeing about 6MB each time:
[25500:0x2c2dd60] 512946 ms: Mark-sweep 126.5 (132.3) -> 126.5 (132.3) MB, 176.4 / 33.3 ms (+ 173.2 ms in 193 steps since start of marking, biggest step 61.9 ms, walltime since start of marking 383 ms) finalize incremental marking via task GC in old space requested
[25500:0x2c2dd60] 513336 ms: Mark-sweep 126.5 (132.3) -> 126.5 (132.3) MB, 181.5 / 40.5 ms (+ 175.8 ms in 193 steps since start of marking, biggest step 63.0 ms, walltime since start of marking 390 ms) finalize incremental marking via task GC in old space requested
[25500:0x2c2dd60] 513718 ms: Mark-sweep 126.5 (132.3) -> 126.5 (132.3) MB, 175.1 / 34.5 ms (+ 172.6 ms in 190 steps since start of marking, biggest step 61.3 ms, walltime since start of marking 382 ms) finalize incremental marking via task GC in old space requested
[25500:0x2c2dd60] 514100 ms: Mark-sweep 126.5 (132.3) -> 126.5 (132.3) MB, 175.5 / 34.5 ms (+ 172.9 ms in 193 steps since start of marking, biggest step 61.7 ms, walltime since start of marking 381 ms) finalize incremental marking via task GC in old space requested
[25500:0x2c2dd60] 514487 ms: Mark-sweep 126.5 (132.3) -> 126.5 (132.3) MB, 178.8 / 33.8 ms (+ 174.3 ms in 193 steps since start of marking, biggest step 62.4 ms, walltime since start of marking 386 ms) finalize incremental marking via task GC in old space requested
[25500:0x2c2dd60] 514868 ms: Mark-sweep 126.5 (132.3) -> 126.5 (132.3) MB, 173.3 / 33.5 ms (+ 173.7 ms in 196 steps since start of marking, biggest step 61.7 ms, walltime since start of marking 380 ms) finalize incremental marking via task GC in old space requested
[25500:0x2c2dd60] 515249 ms: Mark-sweep 126.5 (132.3) -> 126.5 (132.3) MB, 175.9 / 33.4 ms (+ 172.0 ms in 192 steps since start of marking, biggest step 61.1 ms, walltime since start of marking 381 ms) finalize incremental marking via task GC in old space requested
|
These were done with the latest node master and the current node-addon-api repo. |
@fivdi can you take a look at the code I used above? I think I integrated your example properly, but it does not look like memory is leaking to me. I see you are on a Raspberry Pi, which might not have as much memory as I do. Can you try with --max_old_space_size=128 to see if the process still gets terminated? |
It looks like the example was integrated correctly into asyncworker.cc.
Yes, it's a Raspberry Pi 3 with 1GB of RAM. I tried --max_old_space_size=128 with my example from the original post, here's the output:
The program was running for several hours before the error occurred. The displaying of … |
@mhdawson I did the last test with v10.0.0-nightly2018040653aaa55a3a and |
@mhdawson When the example you posted above is integrated into asyncworker.cc and asyncworker.js is modified to:
'use strict';
const buildType = process.config.target_defaults.default_configuration;
const assert = require('assert');
test(require(`./build/${buildType}/binding.node`));
test(require(`./build/${buildType}/binding_noexcept.node`));
function test(binding) {
binding.asyncworker.doWork(true, function (e) {
assert.strictEqual(typeof e, 'undefined');
assert.strictEqual(typeof this, 'object');
assert.strictEqual(this.data, 'test data');
}, 'test data');
binding.asyncworker.doWork(false, function (e) {
assert.ok(e instanceof Error);
assert.strictEqual(e.message, 'test error');
assert.strictEqual(typeof this, 'object');
assert.strictEqual(this.data, 'test data');
}, 'test data');
const myObject = new binding.asyncworker.MyObject();
let count = 0;
function open() {
myObject.open(() => {
count += 1;
if (count % 100000 === 0) {
console.log(count + ' ' + new Date().toISOString());
}
open();
});
}
open();
}
The output is:
After |
I found this issue when trying to pinpoint my own memory leak, which I believe is caused by the same problem. While I am fairly new to native add-on development, I'd like to add one observation, hoping it is helpful. The same symptom can be observed on both my application and @fivdi's https://github.com/fivdi/node-addon-api-leak repo, so I am using node-addon-api-leak as the example. By running … BTW, all this is on macOS and I haven't tried it on any other OS. The node-addon-api version is 1.3.0. |
I ran node-addon-api-leak on my laptop, and it seems to top out at 7.8% memory usage, and it stops printing after 26000000. It continues running at 100%. |
While it prints, the CPU usage is steady, but after it stops printing it seems to spike and abate. |
I believe I know what is leaking with the latest; still not sure why, but this is what I think is related:
… calls …
The value of nullptr for the resource results in napi_create_async_work allocating an empty Object to act as the resource. Thinking that might be the problem, I modified the code to allocate a resource once and pass that in with:
With that change, when looking at the heap snapshot with the Chrome debugger, the objects that are leaking are all Strings with the string value "genericXXXX". But since the code above passes in the same string every time (we only see the printout of "created resource id" once), the resource object must be being copied somewhere and then held alive. |
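For reference, here is a minimal sketch of the kind of experiment described above. It is illustrative only (not the actual modification made to napi-inl.h), and GetSharedResource / shared_resource_ref are made-up names: the resource object is created once, pinned with a napi_ref, and reused for every napi_create_async_work call instead of letting a fresh empty Object be allocated per queued work.

```cpp
#include <node_api.h>
#include <cstdio>

// Hypothetical module-level cache for the shared resource object.
static napi_ref shared_resource_ref = nullptr;

static napi_value GetSharedResource(napi_env env) {
  napi_value resource;
  if (shared_resource_ref == nullptr) {
    // First call: create the resource object once and pin it with a reference.
    napi_create_object(env, &resource);
    napi_create_reference(env, resource, 1, &shared_resource_ref);
    std::printf("created resource\n");  // expected to print only once
  } else {
    napi_get_reference_value(env, shared_resource_ref, &resource);
  }
  return resource;
}

// Queueing a work item would then pass the shared resource explicitly, e.g.:
//
//   napi_value resource_name;
//   napi_create_string_utf8(env, "generic", NAPI_AUTO_LENGTH, &resource_name);
//   napi_create_async_work(env, GetSharedResource(env), resource_name,
//                          OnExecute, OnComplete, data, &work);
```

Even with a single shared resource like this, the heap snapshots described above still show the "genericXXXX" strings accumulating, which points at the resource being retained somewhere inside the async machinery rather than in the addon code.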
Reset the persistent that keeps the resource Object alive when the AsyncResource is being destroyed. Fixes: nodejs/node-addon-api#237
This seems to resolve the problem: nodejs/node#20668. I can run the test case, and when I take heap dumps with the Chrome dev tools I no longer see growth in the heap, and the resident set size stays reasonable. |
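The shape of that fix, for readers following along, is roughly the sketch below. This is illustrative only, not the actual nodejs/node#20668 patch, and AsyncResourceSketch is a made-up class: the point is that the implementation pins the resource Object with a persistent reference, and that reference has to be released when the async resource is torn down.

```cpp
#include <node_api.h>

// Illustrative sketch of the fix's idea (not the real node core code):
// the wrapper holds a persistent reference to the resource Object, so the
// destructor must release it, otherwise every queued work leaks one
// resource Object (and the strings it references).
class AsyncResourceSketch {
 public:
  AsyncResourceSketch(napi_env env, napi_value resource) : env_(env) {
    napi_create_reference(env_, resource, 1, &resource_ref_);
  }

  ~AsyncResourceSketch() {
    if (resource_ref_ != nullptr) {
      // The fix: reset the persistent that keeps the resource Object alive.
      napi_delete_reference(env_, resource_ref_);
      resource_ref_ = nullptr;
    }
  }

 private:
  napi_env env_;
  napi_ref resource_ref_ = nullptr;
};
```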
@mhdawson and I backed the JS part of the node-addon-api-leak test with the following native code, lacking an AsyncResource, and it did not leak:
#include <stdio.h>
#include "uv.h"
#include "node.h"
using namespace v8;
typedef struct {
uv_work_t uv_work;
Persistent<Function> perma_cb;
} MyWork;
static void do_work(uv_work_t* work) {}
static void done_work(uv_work_t* work, int status) {
auto isolate = Isolate::GetCurrent();
HandleScope scope(isolate);
auto context = isolate->GetCurrentContext();
node::CallbackScope cb_scope(isolate, Object::New(isolate),
node::async_context());
MyWork* my_work = reinterpret_cast<MyWork*>(work);
Local<Function> js_cb =
Local<Function>::New(isolate, my_work->perma_cb);
js_cb->Call(context, context->Global(), 0, nullptr).ToLocalChecked();
my_work->perma_cb.Reset();
delete my_work;
}
static void MyObject(const FunctionCallbackInfo<Value>& info) {}
static void MyObject_open(const FunctionCallbackInfo<Value>& info) {
MyWork *work = new MyWork;
work->perma_cb.Reset(Isolate::GetCurrent(), info[0].As<Function>());
uv_loop_t *loop = uv_default_loop();
if (uv_queue_work(loop,
reinterpret_cast<uv_work_t*>(work), do_work, done_work)
!= 0) {
fprintf(stderr, "Failed to create work\n");
}
}
static void Init(Local<Object> exports, Local<Value> module) {
auto isolate = Isolate::GetCurrent();
auto context = isolate->GetCurrentContext();
auto js_class = FunctionTemplate::New(isolate, MyObject);
auto js_method = FunctionTemplate::New(isolate, MyObject_open,
Local<Value>(), Signature::New(isolate, js_class));
js_class->PrototypeTemplate()->Set(isolate, "open", js_method);
exports->Set(isolate->GetCurrentContext(),
String::NewFromUtf8(isolate, "MyObject", NewStringType::kNormal)
.ToLocalChecked(),
js_class->GetFunction(context).ToLocalChecked()).FromJust();
}
NODE_MODULE(NODE_GYP_MODULE_NAME, Init) |
Thanks @mhdawson for the quick turnaround! |
Reset the persistent that keeps the resource Object alive when the AsyncResource is being destroyed. Fixes: nodejs/node-addon-api#237 PR-URL: #20668 Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Tiancheng "Timothy" Gu <[email protected]> Reviewed-By: Gabriel Schulhof <[email protected]> Reviewed-By: James M Snell <[email protected]>
If the following addon:
my-object.h
my-object.cc
addon.cc
is tested with the following JavaScript:
then the memory usage of the node process keeps increasing and increasing, and the process ends up crashing after the async open method has been called about 18 million times. Here's the output of the program:
The complete addon can be seen here.