Lots of connections in CLOSED state #99
Good spot! It doesn't seem to occur on Linux. Are you sure it's SockJS's fault? Or is it maybe a Node.js bug? |
Not sure if it's node's, sockjs', or my fault =) However, I think I've narrowed it down a bit:

```js
sockjs.on('connection', function() {
    // ...
    this._notAuthedSockets.add( socket.remoteAddress + ':' + socket.remotePort, socket );
    // ...
});
```

Then every 30 secs I run a function that checks whether each connection has sent auth or not:

```js
var notAuthedKeys = this._notAuthedSockets.getAllKeys(),
    currentKey = null,
    currentSocket = null;

this.logger.log('info', "Checking which sockets timed out to auth, pending: " + notAuthedKeys.length);
for (var i = notAuthedKeys.length - 1; i >= 0; i--) {
    currentKey = notAuthedKeys[i];
    currentSocket = this._notAuthedSockets.get(currentKey);
    if ( currentSocket.connectTime + this._authTimeoutSeconds < Date.now() ) {
        this._failedToAuthTimeoutCount++;
        //currentSocket.close();
        this._notAuthedSockets.remove(currentKey);
        this.logger.log('info', 'Auth timeout for ' + currentKey );
    }
}
```

If the commented line is uncommented ( currentSocket.close() or currentSocket.end() ), the process leaks sockets in CLOSED state; otherwise it seems to work just fine, except that I have connections just hanging there, doing nothing. Also, connection.end()/close() seems to leave only certain connections from certain hosts stuck in CLOSED state; my guess is that some kind of proxy may be breaking the protocol. I'm now digging through the packet captures of the offending hosts to find something in common. |
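For context, `_notAuthedSockets` is the commenter's own helper, not part of sockjs. A minimal sketch of what such a keyed socket store might look like (only the method names `add`, `get`, `getAllKeys`, and `remove` come from the snippet above; everything else is assumption):

```js
// Hypothetical keyed socket store matching the methods used above.
function SocketStore() {
    this._sockets = {};
}

SocketStore.prototype.add = function(key, socket) {
    socket.connectTime = Date.now(); // stamp arrival so the sweep can compute age
    this._sockets[key] = socket;
};

SocketStore.prototype.get = function(key) {
    return this._sockets[key];
};

SocketStore.prototype.getAllKeys = function() {
    return Object.keys(this._sockets);
};

SocketStore.prototype.remove = function(key) {
    delete this._sockets[key];
};
```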
Still, TCP/IP connections being reported by netstat as 'CLOSED' look wrong. It looks like node.js isn't close()-ing them, or doesn't detect that they were closed by the remote host. |
My current theory is that websockets are not being properly cleaned up, so I went to websocket.js in faye's lib and added the following code:

```js
var timer2 = undefined;
var timer = setTimeout( function() {
    // check the socket after 20 mins; in my scenario they shouldn't survive more than 5 min.
    if ( this.readyState === API.CLOSING ) {
        console.log(this._stream._peername, "Timeout caught socket in: ", this.readyState);
        // if the socket is in CLOSING state, wait 5 more minutes
        timer2 = setTimeout( function() {
            // if the socket is still in CLOSING state, let's try to close it
            if ( this.readyState === API.CLOSING ) {
                console.log("---setTimeout---");
                console.log( this );
                console.log("peer: ", this._stream._peername );
                console.log("---");
                this._stream.end();
            } else {
                console.log(this._stream._peername, "Socket ended up in: ", this.readyState);
            }
        }.bind(this), 5 * 60 * 1000 );
    }
}.bind(this), 20 * 60 * 1000);

// clear both watchdog timers when the socket is ended normally
var oldEnd = request.socket.end;
request.socket.end = function() {
    if ( timer ) {
        clearTimeout( timer );
    }
    if ( timer2 ) {
        clearTimeout( timer2 );
    }
    oldEnd.call(this);
};
```

If the socket had been cleaned up properly, it should end up in API.CLOSED, not API.CLOSING. That's not a real solution though, just an ugly workaround =) |
@yurynix Great investigation, thanks! I think some sockjs-node users have noticed similar behaviour. I've sent a message to the sockjs mailing list calling for verification: https://groups.google.com/group/sockjs/browse_thread/thread/9e847dc03efe7ac8 |
Just went through the entire thread. @yurynix Are you using a proxy in front of the Node process? If so, it might be a problem with the proxy itself: the proxy might not be closing its connection to the Node process. Plain netstat also isn't good enough here, since it reports the state of every connection on the system. Narrow it down to the process's port so we can tell whether it's indeed the node process or the proxy that's causing the problem. |
@shripadk No, sockjs is running on port 80 without anything in front of it. However, here's the output you requested: |
Can you try changing |
@shripadk Come on man, two lines above, the readyState is set to CLOSED. This bug is about readyState being stuck at CLOSING! |
@shripadk The close function there is never called for those sockets; if it were, the socket would end up with readyState === API.CLOSED: https://github.com/faye/faye-websocket-node/blob/master/lib/faye/websocket/api.js#L62 |
a) netstat outputting CLOSED != faye having socket readyState=CLOSED |
I know netstat's CLOSED output isn't the same as readyState=CLOSED. I already mentioned that. Please read what I said :) |
(to avoid doubts, I'm speaking about this internal |
Ah ok, now I get it. I was talking about the outer close() function. |
Also I don't understand why the close() function is so convoluted. There is no need for an
Any reason why we need to check for |
I've applied your suggestion directly to the compiled JS:
The code I inserted into websocket.js in faye's lib isn't called, and I don't see sockets in CLOSED state at all |
@majek Yes, it was for the author of faye-websocket-node (@jcoglan), checking for that condition. Anyway, here is my reasoning behind why this is happening: |
@majek Also, this issue does not exist for the Draft 75/76 parser: https://github.com/faye/faye-websocket-node/blob/master/lib/faye/websocket/draft76_parser.js#L95, as its callback is called immediately, unlike the hybi one, which requires the client to send back a close frame before the stream is closed. I don't think that frame will ever arrive, because the client will have gone away before it can be sent. I guess we have a race condition here! |
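A rough sketch of the two close paths being contrasted (not faye's actual code; `sendCloseFrame` is a made-up name, the `API` constants mirror the states discussed in this thread, and `ws._stream` stands for the underlying TCP stream):

```js
// State constants as discussed in this thread (values illustrative).
var API = { CONNECTING: 0, OPEN: 1, CLOSING: 2, CLOSED: 3 };

// Draft 75/76 style: tear the stream down immediately, no handshake.
function closeDraft76(ws) {
    ws.readyState = API.CLOSING;
    ws._stream.end();              // stream is closed right away
    ws.readyState = API.CLOSED;
}

// Hybi style: send a close frame and wait for the client to echo it.
// If the client is already gone, the echo never arrives and the
// websocket is stuck in API.CLOSING forever.
function closeHybi(ws) {
    ws.readyState = API.CLOSING;
    ws.sendCloseFrame(function onClientEcho() {   // may never fire
        ws._stream.end();
        ws.readyState = API.CLOSED;
    });
}
```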
@shripadk I haven't dug through all the faye-websocket code yet, but from how it behaves and the packet captures I've done on the offending clients, it looks like sockjs/sockjs-client#94 (my server is running on port 80): the user establishes the websocket connection but then fails to send any data (I send an auth token from the client onOpen, and it never arrives). I think the same thing is happening here: the client attempts to send the ack but fails, so the server never receives it, leaving the socket in CLOSING state, never cleaned up. I'll try to track down those users and ask them about their network config; IMO it's some broken firewall/antivirus software. |
@yurynix Yeah, maybe it's firewall/antivirus software that's causing this issue. As far as the API.CLOSING state is concerned, the socket cannot go into that state without the server initiating a close. When you open a websocket connection, does the socket go into CLOSE_WAIT state immediately (NOTE: I'm talking about the socket state, not the inner API state now)? Or does it take a few hours before it goes into CLOSE_WAIT? If it's the latter, then it's a result of keepalive (2 hours or whatever idle time you have set), after which the socket times out and the server then initiates the close. |
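For reference, node lets you shorten that multi-hour kernel keepalive default per socket with net.Socket's setKeepAlive. A minimal sketch, where the 60-second delay and the port are arbitrary example values, not numbers from this thread:

```js
var net = require('net');

var server = net.createServer(function(socket) {
    // Start sending TCP keepalive probes after 60s of idleness instead
    // of the OS default (commonly 2 hours), so dead peers surface sooner.
    socket.setKeepAlive(true, 60 * 1000);
});

server.listen(8080);
```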
@shripadk I'm calling sockjs's end() on connections that don't send auth within 30 sec, so I think what happened is: the client connects over websocket -> never sends auth -> after 30 sec I call end() -> the server sends a close frame and waits for an ack that never arrives. I've also seen constantly high CPU while the sockets were leaking. |
@yurynix I also tried this fix and it seems to be working for me. There are no sockets in CLOSED state now. But RAM usage seems to increase over time when ws is used; perhaps something is not being garbage collected. Are you also experiencing this? Thanks. |
@darklrd My stats atm: ps aux: My memory growth, however, might be related to my own code; I need to rule that out first. |
@yurynix No, CPU usage seems to be normal (4.6%); the only issue I am observing is that RAM usage keeps increasing continuously until I restart the app. node: v0.8.15. From my observation, memory isn't released when connections are closed, and it keeps increasing as new connections are established. |
@darklrd What OS are you on? Later today I'll try removing all my code, leaving just the connection counters, and see if I'm still having CPU/memory issues. |
@yurynix Ubuntu 12.04, what about you? I would be interested in your findings. Thanks. |
@darklrd I'm on FreeBSD 9.0-RELEASE |
I think we might also be hitting a node bug: nodejs/node-v0.x-archive#3613. Notice the FIN_WAIT_2 sockets; that would explain the memory leak if node is indeed not cleaning up the sockets properly. =\ |
I see. Thank you. How long has your server been running? Is it the same one you mentioned before? I didn't keep track of FIN_WAIT_2. I will try again. |
@darklrd It's the same one, 251555 seconds. How is your netstat FIN_WAIT_2 count? Are you seeing the same thing? |
My netstat (as of yesterday): FIN_WAIT2 6. I will restart the app and track it now. @yurynix, is TIME_WAIT a problem here? |
BTW, the |
@darklrd TIME_WAIT/FIN_WAIT_2/any other state is not a problem by itself, as long as you don't have sockets stuck in some state forever. In my previous netstat output, the number of sockets in FIN_WAIT_2 was rising over time, and that's not good: eventually I'd run out of resources. After seeing nodejs/node-v0.x-archive#3613, I've dug a bit into libuv. From what I can see, while end() will await the client's acknowledgment (in my case leaving the socket in FIN_WAIT_2), destroy() will discard the socket sooner. So at https://github.com/faye/faye-websocket-node/blob/master/lib/faye/websocket/api.js#L64 I changed the stream shutdown to this._stream.destroy(); and restarted. After 16 hours, netstat looks clean, so that seems to solve that one for me... Regarding memory/CPU, it's too soon to say whether it was related. @darklrd Has the number of TIME_WAIT sockets on your server risen since yesterday, or remained constant? |
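The change being described, roughly (a sketch of the idea, with the surrounding close logic omitted; see api.js#L64 for the real context):

```js
// Before: politely half-close the TCP stream and wait for the peer's FIN.
// If the peer never answers, the kernel socket lingers in FIN_WAIT_2.
// this._stream.end();

// After: tear the socket down and release the fd immediately,
// without waiting for the peer to acknowledge the shutdown.
this._stream.destroy();
```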
@yurynix I will try this code change this weekend. The number of TIME_WAIT sockets has always been almost half the number of ESTABLISHED sockets. |
Fixed in 0.3.5 |
@yurynix any update on memory usage? Thanks. |
ok, I am going to try this code change and will report back how it goes, thanks! |
@shripadk The first issue @majek handled by passing false to the ack parameter. The second issue, however, can't be handled from sockjs code; the faye-ws code would have to be changed to call destroy() on the socket when the server initiates the close. That is not a polite way to close a TCP connection, though, so I'm not sure whether it should be handled in faye-ws or in node's code. @darklrd Memory usage seems to be OK, rising to ~350MB when I'm at ~2k connections and dropping to ~250MB when I'm down to 1k connections late at night. I don't think there's a problem here, but it's too soon to say; I've restarted again to try @majek's fix for #103 (still too soon to say whether it's stable, it's only been running for a day or so; I'll post stats after the weekend). |
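The polite-vs-impolite distinction at the plain net.Socket level, for reference. A minimal standalone sketch; the server and port are made up for illustration:

```js
var net = require('net');

var server = net.createServer(function(socket) {
    // Graceful: end() sends a FIN, then waits for the peer's FIN.
    // If the peer never replies, the socket sits in FIN_WAIT_2.
    // socket.end();

    // Forceful: destroy() closes the handle and frees the fd at once,
    // without waiting on the peer -- impolite, but nothing can leak.
    socket.destroy();
});

server.listen(8080);
```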
@yurynix I was earlier using [email protected]. I have switched to its master branch now (but my node.js version is still 0.8.15). Memory RSS (image below) increases continuously although the number of connections is constant. Sockets in TIME_WAIT and FIN_WAIT2 state also appear to be constant. Yes, after upgrading to the master branch of faye-websocket my CPU usage is higher now and increasing continuously. |
@darklrd Just wanted to let you know: running node with the --nouse_idle_notification flag fixed my high CPU load. |
@yurynix Thank you so much! Unfortunately I am still struggling with the memory issue, but thanks again! |
The problem still exists on FreeBSD 9.1 amd64 (sockjs 0.3.5). After a few days of server uptime, I get an error about the file descriptor limit (events.js:71). Here is the output of "netstat -an -p tcp | awk '{print $6}' | sort | uniq -c | sort -n": netstat: kvm not available: /dev/mem: Permission denied. Any ideas? Or maybe an ugly workaround that would work? :( |
@Hashi101 What are your kern.maxfiles and kern.maxfilesperproc set to? |
@yurynix Also, I don't have access to check kern.maxfiles and maxfilesperproc, but "ulimit -n" shows 8000. |
@yurynix Are you sure you're running code with this patch: #99 (comment) |
@majek It's at https://github.com/faye/faye-websocket-node/blob/master/lib/faye/websocket/api.js#L57. When I had the problem, I was hitting many more FIN_WAIT_2 sockets than @Hashi101, so I don't know if it's the same issue. |
@yurynix How many more FIN_WAIT_2 sockets did you have? netstat: kvm not available: /dev/mem: Permission denied |
@Hashi101 If in your case the server sometimes initiates the closing of the socket, you can find a discussion about that here: faye/faye-websocket-node#19. Another solution I would try is to deploy sockjs behind haproxy; that's my setup now, on FreeBSD 9.0-RELEASE-p6, and I'm not seeing any FIN_WAIT_2 issues, but then again, I'm running with the patch above. |
I've already been running with this patch for faye-websocket, I mean just "this._stream.destroy();" instead of "this._stream.end();". Now, instead of that, I'm going to try the "if (!ack)" check, but I doubt it will change anything. Anyway, can I run HAProxy with only one IP? I mean HAProxy on a different port, and the SockJS workers on different ports. EDIT: I've updated my faye-websocket, and this new version still had "this._stream.end();", so I've changed it to .destroy(); hope it helps :) I will report back after the weekend. EDIT2: Everything seems to work well. |
Hi,
I'm on FreeBSD 9, node 0.8.15, sockjs 0.3.4.
After the application has been running for a few hours, I have lots of connections in CLOSED state in netstat; after killing the nodejs process, these connections disappear. When I had a lower fds limit, the process died with "accept EMFILE" (same stacktrace as #94). Not sure whether that's a node issue or a sockjs issue not disposing of sockets properly.
Here is the output after the app ran for ~18 hours:

```
netstat -an -p tcp | awk '{print $6}' | sort | uniq -c | sort -n
echo
sysctl kern.openfiles
```

output =>

```
    4 LISTEN
   29 FIN_WAIT_1
   73 TIME_WAIT
   78 FIN_WAIT_2
  489 CLOSE_WAIT
 3454 ESTABLISHED
14590 CLOSED
kern.openfiles: 18652
```

Not sure how to debug it further; any advice?