
high tls memory usage (rss) #1522

Closed
ChALkeR opened this issue Apr 25, 2015 · 46 comments
Labels
https (Issues or PRs related to the https subsystem), memory (Issues and PRs related to the memory management or memory footprint), tls (Issues and PRs related to the tls subsystem)

Comments


ChALkeR commented Apr 25, 2015

https.request leaks in this example:

'use strict';
var https = require('https');
var options = {
    hostname: 'google.com'
};
function next() {
    setTimeout(iterate, 10);
}
function onError(error) {
    console.log(error);
}
function noop() {}
function onResponse(res) {
    console.log('statusCode: ', res.statusCode);
    res.on('data', noop);
    res.on('end', next);
}
function iterate() {
    console.log(JSON.stringify(process.memoryUsage()));
    console.log('iterating...');
    https.request(options, onResponse).
        on('error', onError).
        end();
};
iterate();

rss grows over time, while heapUsed stays roughly constant (~30 MiB).
This can be reproduced with every released io.js version that I tried, from v1.0.1 to v1.8.1.

Node v0.12.2 behaves better in this testcase: rss grows to about 100 MB, but goes down periodically (I wouldn't call that perfect behaviour, let's call it «fine»).

53ba494, dad73f6 and 8a83eba are three sequential commits.
53ba494 is fine (same as 0.12), dad73f6 fails to build, 8a83eba leaks.
dad73f6 is «upgrade v8 to 3.31.74.1» with 831 files changed, 104394 insertions(+), 21427 deletions(-).
Node v0.12.2 may be fine because it still uses v8 3.28.73.
io.js 1.0.0/1.1.0 with shared v8 3.30.33.16 behaves fine (the same as 0.12).

All the testcases and massif data are at http://oserv.org/bugs/iojs/memory0/


ChALkeR commented Apr 25, 2015

Could be reproduced with wreck:

'use strict';
var wreck = require('wreck');
function callback(error, res, payload) {
    if (error) {
        console.log(error);
        return;
    }
    console.log('statusCode: ', res.statusCode);
    setTimeout(iterate, 10);
}
function iterate() {
    console.log(JSON.stringify(process.memoryUsage()));
    console.log('iterating...');
    wreck.get('https://google.com/', callback);
};
iterate();


ChALkeR commented Apr 25, 2015

I found a work-around for that particular testcase.
It's not applicable in the real world, I think, but it could give some idea of where the bug is.

'use strict';

var requests = 20;
var total = 1000;
// If `maxSockets` is less than `requests`, then RSS stops growing at some point.
var maxSockets = 10;

var https = require('https');
var agent = new https.Agent();
agent.maxSockets = maxSockets;
var count = 0;
var options = {
    agent: agent,
    hostname: 'google.com'
};
function next() {
    setTimeout(iterate, 0);
}
function onError(error) {
    console.log(error);
}
function noop() {}
function onResponse(res) {
    console.log('statusCode: ', res.statusCode);
    res.on('data', noop);
    res.on('end', next);
}
function iterate() {
    console.log(JSON.stringify(process.memoryUsage()));
    if (count >= total) process.exit();
    console.log('iterating... ' + ' num: ' + (++count));
    https.request(options, onResponse).
        on('error', onError).
        end();
};
for (var i = 0; i < requests; i++) iterate();

In this example, if agent.maxSockets is less than the number of https.requests being executed at the same time, RSS stops growing.
If only one https.request is in flight at any given moment, this work-around can't help.

  • agent.maxSockets = 10 and 20 calls to https.request — ok.
  • agent.maxSockets = 10 and 10 calls to https.request — not ok.

http://oserv.org/bugs/iojs/memory0/test-agent.js — testcase.
http://oserv.org/bugs/iojs/memory0/test-agent.txt — results.

http://oserv.org/bugs/iojs/memory0/test-fast.js could be used to reproduce this a bit faster (20 requests at the same time).


ChALkeR commented Apr 25, 2015

cc @indutny


@mscdex added the https label Apr 25, 2015

ChALkeR commented Apr 25, 2015

io.js 1.8.1:
[image: iojs v1.8.1]

node 0.12.2:
[image: node v0.12.2]

The time those two were running for may differ, but the difference is still clearly visible.


ChALkeR commented Apr 25, 2015

Actually, it goes down after manually calling gc().
If I call gc() manually every 100 requests, RSS does not go above 70 MiB.

Could it be that gc() is never called?
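
A minimal sketch of that change against the original testcase, assuming the process is started with --expose-gc so that gc() is available as a global:

'use strict';
// Same testcase as above, plus a manual gc() every 100 requests.
// Run as: iojs --expose-gc test.js
var https = require('https');
var options = {
    hostname: 'google.com'
};
var count = 0;
function next() {
    setTimeout(iterate, 10);
}
function onError(error) {
    console.log(error);
}
function noop() {}
function onResponse(res) {
    console.log('statusCode: ', res.statusCode);
    res.on('data', noop);
    res.on('end', next);
}
function iterate() {
    // Force a full collection every 100 requests; with this in place
    // rss stays below ~70 MiB here.
    if (++count % 100 === 0) gc();
    console.log(JSON.stringify(process.memoryUsage()));
    console.log('iterating...');
    https.request(options, onResponse).
        on('error', onError).
        end();
}
iterate();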


ChALkeR commented Apr 25, 2015

Until a full gc run (i.e. calling gc()), the native TLSWrap objects are never deleted.
Full gc is not triggered automatically, probably because v8 thinks that incremental gc is doing fine (well, heapUsed is ~30 MiB most of the time).

Full gc is not triggered automatically until rss reaches ~500 MiB (without valgrind/massif).


ChALkeR commented Apr 25, 2015

But even when rss reaches ~500 MiB and a full gc is run (deleting the TLSWrap objects), rss does not go down.


indutny commented Apr 25, 2015

@ChALkeR I have modified the test case to work with a localhost server (partly because I'm on 3G internet, partly because I'm not so patient). It appears to me that it grows up to 1.1 GB very fast, then drops to 800 MB and then stays around 600 MB for 50 parallel connections.

The breaking point (1.1 GB) happens at iteration 21373 for me. Have you tried waiting that long?


indutny commented Apr 25, 2015

I have a fix for lowering the 1.1 GB peak, btw. Will push a PR tomorrow.


ChALkeR commented Apr 26, 2015

Lowering to which point? If agent.maxSockets is lower than the number of parallel connections, rss somehow stabilizes at ~40 MiB. What's the difference in this case?

Just checked — in this case TLSWrap objects are reused.
For 20 parallel connections and 1000 total requests:

  • with agent.maxSockets = 10 — only 10 TLSWrap objects are created,
  • with agent.maxSockets = 17 — only 17 TLSWrap objects are created,
  • with agent.maxSockets = 18 — 21 TLSWrap objects are created,
  • with agent.maxSockets = 19 — 83 TLSWrap objects are created,
  • with agent.maxSockets = 20 — 1000 TLSWrap objects are created.
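
A trivial way to sweep maxSockets with the testcase above — shown here as a hypothetical tweak, assuming the constant is read from the command line (the published test-agent.js may not do this):

// Replace the `var maxSockets = 10;` constant above with:
var maxSockets = parseInt(process.argv[2], 10) || 10;
// and run e.g.: for n in 10 17 18 19 20; do iojs test-agent.js $n; done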


indutny commented Apr 26, 2015

@ChALkeR to 400 MB as far as I remember. Anyway, do you see that everything is getting collected after growing to 1 GB?


ChALkeR commented Apr 26, 2015

No.

For me, full gc kicks in at 500 MiB. But it does not lower the rss; it just (almost) stops it from growing further.

statusCode:  302
{"rss":556351488,"heapTotal":75819520,"heapUsed":25609128}
iterating...  num: 21408
statusCode:  302
{"rss":556351488,"heapTotal":75819520,"heapUsed":25723656}
iterating...  num: 21409
statusCode:  302
{"rss":556351488,"heapTotal":75819520,"heapUsed":25751024}
iterating...  num: 21410


indutny commented Apr 26, 2015

@ChALkeR I wasn't talking about GC in general. I was talking about the point where all accumulated TLSWraps are collected and RSS lowers significantly.


ChALkeR commented Apr 26, 2015

TLSWraps are not collected until a full gc run.
For me, a full gc run happens at 500 MiB; it collects the accumulated TLSWraps (and frees memory for new ones), but it does not lower the rss.
Memory usage does not go significantly above that point.


indutny commented Apr 26, 2015

So why is this issue called a "leak" then? ;) If the RSS stays the same?


ChALkeR commented Apr 26, 2015

Because I found that out (the 500 MiB cap) only after reporting the issue. =)
It's technically not a leak.

@ChALkeR changed the title from "https.request memory leak (rss)" to "https.request memory usage grows to 500 MiB (rss)" Apr 26, 2015

ChALkeR commented Apr 26, 2015

There are two questions here:

  1. Why are TLSWrap objects not reused when there are pauses between requests?
    As I understand it, an object is reused only if another request is already queued because of the agent.maxSockets restriction. Can't it wait for some time?
  2. Why are TLSWrap objects not destroyed by incremental gc after the js-side socket destruction? Is there some kind of circular reference hanging around?


indutny commented Apr 26, 2015

@ChALkeR:

  1. No answer
  2. This is how V8's GC works. I think incremental GC does not destroy C++ objects. There is no circular reference and no bug in our code causing this.


ChALkeR commented Apr 26, 2015

  1. Was not about the gc, it was about TLSWrap reuse.


indutny commented Apr 26, 2015

Erm, I swear I thought I put (2) there :) Sorry!


indutny commented Apr 26, 2015

Yep, I did put it there. It's the flavored markdown that borked it.


ChALkeR commented Apr 26, 2015

Ah, that's the Markdown. It re-numbers lists. Even 1. 1. 1. is displayed as:

  1. One.
  2. Also one.
  3. This also is a «1.».


ChALkeR commented Apr 26, 2015

  1. No, that's not it. C++ objects are collected by an incremental gc if there are no obstacles. You can verify that by manually calling process.binding('tls_wrap').wrap — if you do nothing with the result (or do something with it and then dispose of it), it gets collected almost immediately by an incremental gc run.
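
For illustration, a rough sketch of that kind of check at the public API level rather than via process.binding directly — the tls.TLSSocket constructor arguments here are an assumption, not what I actually ran:

'use strict';
// Allocate TLS wraps in a loop without keeping any references and watch
// the memory counters. Run with plain `iojs`, no --expose-gc.
var net = require('net');
var tls = require('tls');

var count = 0;
setInterval(function() {
    // Each tls.TLSSocket allocates a native TLSWrap under the hood.
    var socket = new tls.TLSSocket(new net.Socket(), { isServer: false });
    socket.destroy();
    // Nothing keeps a reference, so the wrap becomes garbage right away.
    if (++count % 100 === 0)
        console.log(JSON.stringify(process.memoryUsage()));
}, 10);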


indutny commented Apr 26, 2015

I could make RSS stay around 200 MB with this patch:

diff --git a/src/node_crypto.h b/src/node_crypto.h
index 75ffe4f..fcc8cbe 100644
--- a/src/node_crypto.h
+++ b/src/node_crypto.h
@@ -140,12 +140,15 @@ class SSLWrap {
         session_callbacks_(false),
         new_session_wait_(false) {
     ssl_ = SSL_new(sc->ctx_);
+    env_->isolate()->AdjustAmountOfExternalAllocatedMemory(kExternalSize);
     CHECK_NE(ssl_, nullptr);
   }

   virtual ~SSLWrap() {
     if (ssl_ != nullptr) {
       SSL_free(ssl_);
+      const int64_t len = static_cast<int64_t>(kExternalSize);
+      env_->isolate()->AdjustAmountOfExternalAllocatedMemory(-len);
       ssl_ = nullptr;
     }
     if (next_sess_ != nullptr) {
@@ -169,6 +172,11 @@ class SSLWrap {
   inline bool is_waiting_new_session() const { return new_session_wait_; }

  protected:
+  // Size allocated by OpenSSL: one for SSL structure, one for SSL3_STATE and
+  // some for buffers.
+  // NOTE: Actually it is much more than this
+  static const int kExternalSize = sizeof(SSL) + sizeof(SSL3_STATE) + 42 * 1024;
+
   static void InitNPN(SecureContext* sc);
   static void AddMethods(Environment* env, v8::Handle<v8::FunctionTemplate> t);


ChALkeR commented Apr 26, 2015

That makes v8 take the minimum possible TLSWrap size into account when deciding when to run gc? That would be nice, but it doesn't solve the problem completely.


indutny commented Apr 26, 2015

I'm not sure what the problem is then :)


ChALkeR commented Apr 26, 2015

Well, 200MiB just for requests is still a problem, isn't it?

indutny added a commit to indutny/io.js that referenced this issue Apr 27, 2015
Destroy singleUse context right after it is going out of use.

Fix: nodejs#1522
@Fishrock123 changed the title from "https.request memory usage grows to 500 MiB (rss)" to "high tls memory usage (rss)" Apr 28, 2015
@Fishrock123 added the tls label Apr 28, 2015
shigeki pushed a commit to shigeki/node that referenced this issue Apr 30, 2015
Do not keep SSL structure in memory once socket is closed. This should
lower the memory usage in many cases.

Fix: nodejs#1522
shigeki pushed a commit to shigeki/node that referenced this issue Apr 30, 2015
Ensure that GC kicks in at the right times and the RSS does not blow up.

Fix: nodejs#1522
shigeki pushed a commit to shigeki/node that referenced this issue Apr 30, 2015
When connecting to server with `keepAlive` turned off - make sure that
the read/write buffers won't be kept in a single use SSL_CTX instance
after the socket will be destroyed.

Fix: nodejs#1522
shigeki pushed a commit to shigeki/node that referenced this issue Apr 30, 2015
Destroy singleUse context right after it is going out of use.

Fix: nodejs#1522
indutny added a commit that referenced this issue Apr 30, 2015
Do not keep SSL structure in memory once socket is closed. This should
lower the memory usage in many cases.

Fix: #1522
PR-URL: #1529
Reviewed-By: Shigeki Ohtsu <[email protected]>
indutny added a commit that referenced this issue Apr 30, 2015
Ensure that GC kicks in at the right times and the RSS does not blow up.

Fix: #1522
PR-URL: #1529
Reviewed-By: Shigeki Ohtsu <[email protected]>
indutny added a commit that referenced this issue Apr 30, 2015
When connecting to server with `keepAlive` turned off - make sure that
the read/write buffers won't be kept in a single use SSL_CTX instance
after the socket will be destroyed.

Fix: #1522
PR-URL: #1529
Reviewed-By: Shigeki Ohtsu <[email protected]>
indutny added a commit that referenced this issue Apr 30, 2015
Destroy singleUse context right after it is going out of use.

Fix: #1522
PR-URL: #1529
Reviewed-By: Shigeki Ohtsu <[email protected]>
@silverwind
Contributor

Fixed by 2d241b3 ... 1787416
