socket.io 1.0.X - upgrade fails in multi process cluster #1636

Closed
vikram-chaitanya opened this issue Jun 19, 2014 · 9 comments

Comments

@vikram-chaitanya

Hi,

We are running a multi-process cluster to serve multiple clients, and the POST request used to upgrade from polling to WebSocket fails with 400 (Bad Request). We use HAProxy with sticky sessions to load-balance across multiple servers, each hosting multiple processes. This is the HAProxy log:

Jun 19 02:05:45 ip6-localhost haproxy[21221]: x.x.x.x:9845 [19/Jun/2014:02:05:45.227] http-in agent/agent-4003 304/0/1/3/602 200 379 CHATSESS=agent-4003 - --VN 3/3/0/0/0 0/0 "GET /socket.io/?i=3255&s=f828b300f736cae19ca2ca90c807fc6fzafb0034af016c5db1b281aa5520c43c5fc9faa777af06558a40954d5cca6b2d2551b788aac19a3fc817c35e7420a337&siteid=8cc4c7704b276dee7791110f4bb545b2&c=fd&a=1&EIO=2&transport=polling&t=1403143544267-0 HTTP/1.1"

Jun 19 02:05:46 ip6-localhost haproxy[21221]: x.x.x.x:9846 [19/Jun/2014:02:05:45.409] http-in agent/agent-4003 472/0/0/1/771 200 281 CHATSESS=agent-4003 - --VN 2/2/0/0/0 0/0 "GET /socket.io/?i=3255&s=f828b300f736cae19ca2ca90c807fc6fzafb0034af016c5db1b281aa5520c43c5fc9faa777af06558a40954d5cca6b2d2551b788aac19a3fc817c35e7420a337&siteid=8cc4c7704b276dee7791110f4bb545b2&c=fd&a=1&EIO=2&transport=polling&t=1403143545864-2&sid=x6BUCzp1dUEIfurYAAAC HTTP/1.1"

Jun 19 02:05:47 ip6-localhost haproxy[21221]: x.x.x.x:9849 [19/Jun/2014:02:05:46.481] http-in agent/agent-4001 319/0/0/2/623 400 301 - - --NI 6/6/2/0/0 0/0 "OPTIONS /socket.io/?i=3255&s=f828b300f736cae19ca2ca90c807fc6fzafb0034af016c5db1b281aa5520c43c5fc9faa777af06558a40954d5cca6b2d2551b788aac19a3fc817c35e7420a337&siteid=8cc4c7704b276dee7791110f4bb545b2&c=fd&a=1&EIO=2&transport=polling&t=1403143545860-1&sid=x6BUCzp1dUEIfurYAAAC HTTP/1.1"

Jun 19 02:05:47 ip6-localhost haproxy[21221]: x.x.x.x:9847 [19/Jun/2014:02:05:45.452] http-in agent/agent-4003 758/0/0/843/1903 200 280 CHATSESS=agent-4003 - --VN 5/5/2/1/0 0/0 "GET /socket.io/?i=3255&s=f828b300f736cae19ca2ca90c807fc6fzafb0034af016c5db1b281aa5520c43c5fc9faa777af06558a40954d5cca6b2d2551b788aac19a3fc817c35e7420a337&siteid=8cc4c7704b276dee7791110f4bb545b2&c=fd&a=1&EIO=2&transport=polling&t=1403143546194-3&sid=x6BUCzp1dUEIfurYAAAC HTTP/1.1"

Jun 19 02:05:47 ip6-localhost haproxy[21221]: x.x.x.x:9851 [19/Jun/2014:02:05:46.526] http-in agent/agent-4003 40/0/1/2/925 101 141 CHATSESS=agent-4003 - --VN 4/4/1/0/0 0/0 "GET /socket.io/?i=3255&s=f828b300f736cae19ca2ca90c807fc6fzafb0034af016c5db1b281aa5520c43c5fc9faa777af06558a40954d5cca6b2d2551b788aac19a3fc817c35e7420a337&siteid=8cc4c7704b276dee7791110f4bb545b2&c=fd&a=1&EIO=2&transport=websocket&sid=x6BUCzp1dUEIfurYAAAC HTTP/1.1"

Jun 19 02:05:47 ip6-localhost haproxy[21221]: x.x.x.x:9850 [19/Jun/2014:02:05:46.497] http-in agent/agent-4002 662/0/1/2/957 400 301 - - --NI 3/3/0/0/0 0/0 "OPTIONS /socket.io/?i=3255&s=f828b300f736cae19ca2ca90c807fc6fzafb0034af016c5db1b281aa5520c43c5fc9faa777af06558a40954d5cca6b2d2551b788aac19a3fc817c35e7420a337&siteid=8cc4c7704b276dee7791110f4bb545b2&c=fd&a=1&EIO=2&transport=polling&t=1403143547159-4&sid=x6BUCzp1dUEIfurYAAAC HTTP/1.1"

As you can see, the request that fails has the method OPTIONS. The first two requests carry the sticky-session cookie CHATSESS=agent-4003, but for some reason the OPTIONS requests fired for the POST do not carry it, so they were routed to different servers/processes (agent-4001 and agent-4002 in the log above).

With a single process it works perfectly; the failure only appears when I switch back to multiple processes. I am not sure whether this happens because the OPTIONS request is fired before the other two requests have completed.

Is there any way to make the POST request use the same cookie, or, if it helps, to delay the POST request so that it can pick up the cookie from the earlier request?
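
The routing pattern in the log can be checked mechanically. The sketch below is only an illustration: `parseHaproxyLine` and `HaproxyEntry` are hypothetical names, and the parser assumes the field order visible in the log lines above (backend/server, timers, status, bytes, request cookie, response cookie, ...). The two sample lines are abbreviated from the log, with the query strings shortened.

```typescript
// Hypothetical helper: pulls the routing-relevant fields out of one HAProxy HTTP log line.
interface HaproxyEntry {
  server: string;        // backend/server that handled the request
  status: number;        // HTTP status code returned to the client
  requestCookie: string; // cookie captured from the request, "-" when absent
  method: string;        // HTTP method of the request
}

function parseHaproxyLine(line: string): HaproxyEntry | null {
  const tokens = line.trim().split(/\s+/);
  // Locate the "haproxy[pid]:" token; the fields of interest follow at fixed offsets.
  const start = tokens.findIndex((t) => /^haproxy\[\d+\]:$/.test(t));
  if (start < 0 || tokens.length < start + 14) {
    return null;
  }
  return {
    server: tokens[start + 4],
    status: Number(tokens[start + 6]),
    requestCookie: tokens[start + 8],
    method: tokens[start + 13].replace(/^"/, ""),
  };
}

// Sample lines abbreviated from the log above (query strings shortened).
const sampleLines: string[] = [
  'Jun 19 02:05:45 ip6-localhost haproxy[21221]: x.x.x.x:9845 [19/Jun/2014:02:05:45.227] http-in agent/agent-4003 304/0/1/3/602 200 379 CHATSESS=agent-4003 - --VN 3/3/0/0/0 0/0 "GET /socket.io/?EIO=2&transport=polling HTTP/1.1"',
  'Jun 19 02:05:47 ip6-localhost haproxy[21221]: x.x.x.x:9849 [19/Jun/2014:02:05:46.481] http-in agent/agent-4001 319/0/0/2/623 400 301 - - --NI 6/6/2/0/0 0/0 "OPTIONS /socket.io/?EIO=2&transport=polling HTTP/1.1"',
];

for (const line of sampleLines) {
  const entry = parseHaproxyLine(line);
  if (entry !== null) {
    console.log(`${entry.method} -> ${entry.server} status=${entry.status} cookie=${entry.requestCookie}`);
  }
}
```

Run against the full log, this would list every request that arrived without the CHATSESS cookie together with the server it ended up on.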

@vikram-chaitanya
Author

I investigated a little more and found that the preflight request sent before the upgrade POST request does not carry any cookies, including the one that controls where the request should be routed.

So the OPTIONS preflight request ended up going to a different process, and when that process tried to look up the sid in its list of clients, the list was empty.

It looks like the information about a client connected to process 1 was not synchronized with process 2 on the same machine. I thought socket.io would emit some kind of event to inform the other processes that a client has connected, so that they could keep the client's info in memory.

Is this how it's supposed to work, or is something else blocking the synchronization of this client information?
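
The behaviour described in this comment can be reproduced in a toy model. The sketch below is not socket.io or engine.io code: `ToyWorker`, `handshake`, and `handleRequest` are hypothetical names. It only assumes what the comment reports, namely that each process keeps its session ids in its own memory and answers a request carrying an unknown sid with 400.

```typescript
// Toy model of one worker process with a private, in-memory session table.
class ToyWorker {
  name: string;
  private sessions: Set<string> = new Set<string>();
  private nextId: number = 0;

  constructor(name: string) {
    this.name = name;
  }

  // A new client connects: the sid is recorded in this worker only.
  handshake(): string {
    const sid = `${this.name}-sid-${this.nextId++}`;
    this.sessions.add(sid);
    return sid;
  }

  // Any later request must present a sid that this worker knows about.
  handleRequest(sid: string): number {
    return this.sessions.has(sid) ? 200 : 400;
  }
}

const stickyWorker = new ToyWorker("agent-4003");
const otherWorker = new ToyWorker("agent-4001");

// The handshake lands on agent-4003, so only that worker knows the sid.
const clientSid = stickyWorker.handshake();

console.log(stickyWorker.handleRequest(clientSid)); // 200: routed with the sticky cookie
console.log(otherWorker.handleRequest(clientSid));  // 400: routed without it
```

Nothing is shared between the two tables, so any request that the load balancer sends to the wrong worker fails, whatever its method.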

@timfpark

timfpark commented Jul 2, 2014

+1 I'm seeing this with Azure load balancers as well. Same symptoms: Works in single process, fails with multi process with socket.io-redis.

@respectTheCode

+1 same thing here.

Is there a workaround?

@samuelngs

+1 same thing here too.
No solution for this?

@soyuka

soyuka commented Aug 28, 2014

#1723 is related (a slightly better use case for this).

See Unitech/pm2#637 discussion about socket.io and clusters.

@respectTheCode

The best solution I have come up with is to give each instance (socket server) a different port and to store each instance's address in Redis, together with the number of clients connected to it. The client then asks a separate Node app (the socket manager) for a socket server address before connecting or reconnecting, and the socket manager picks the socket.io server with the fewest connections. It's not pretty, but it works and it can be scaled.
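
The selection step of such a socket manager might look like the sketch below. It is not the commenter's code: `SocketServerInfo`, `pickLeastLoaded`, and the addresses are hypothetical, and a plain array stands in for the records that would be read from Redis.

```typescript
// One record per socket server, as the manager might read it from the shared store.
interface SocketServerInfo {
  address: string;     // where clients should connect
  connections: number; // clients currently connected to that server
}

// Return the server with the fewest connections, or null if none is registered.
function pickLeastLoaded(servers: SocketServerInfo[]): SocketServerInfo | null {
  let best: SocketServerInfo | null = null;
  for (const server of servers) {
    if (best === null || server.connections < best.connections) {
      best = server;
    }
  }
  return best;
}

// Hypothetical registry contents; a real manager would load these from Redis.
const registry: SocketServerInfo[] = [
  { address: "http://localhost:4001", connections: 12 },
  { address: "http://localhost:4002", connections: 3 },
  { address: "http://localhost:4003", connections: 7 },
];

const chosen = pickLeastLoaded(registry);
console.log(chosen !== null ? chosen.address : "no socket server available");
```

Because the client connects directly to the address it is given, every request of one session reaches the same process and no sticky routing is needed at the load balancer.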

@darrachequesne
Member

That issue was closed automatically. Please check if your issue is fixed with the latest release, and reopen if needed (with a fiddle reproducing the issue if possible).

@cyrilchapon

Please, this needs to be reopened. A stateful library in 2017 is a real pain to make work in the real world.

The Redis stuff could be integrated for the handshake and upgrade as well, couldn't it?

@darrachequesne
Member

@cyrilchapon please see my answer #2140 (comment)

Note: Currently, sticky sessions can be achieved either with an IP hash (https://socket.io/docs/using-multiple-nodes/) or with a cookie (https://github.com/socketio/socket.io/tree/master/examples/cluster-haproxy).

Note²: I think SockJS has the same limitation: https://github.com/sockjs/sockjs-node#sticky-sessions
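
The IP hash option from the first note can be illustrated with a few lines of code. This is a minimal sketch of the principle only, not the algorithm used by any particular load balancer; `workerIndexForIp` is a hypothetical name and the hash function is an arbitrary choice.

```typescript
// Map a client IP to a worker index deterministically, so that every request
// from the same address reaches the same worker, with or without cookies.
function workerIndexForIp(ip: string, workerCount: number): number {
  if (!Number.isInteger(workerCount) || workerCount < 1) {
    throw new Error("workerCount must be a positive integer");
  }
  let hash = 0;
  for (let i = 0; i < ip.length; i++) {
    // Simple multiplicative string hash, kept in the unsigned 32-bit range.
    hash = (hash * 31 + ip.charCodeAt(i)) >>> 0;
  }
  return hash % workerCount;
}

// Both calls use the same address, so both print the same index.
console.log(workerIndexForIp("203.0.113.7", 4));
console.log(workerIndexForIp("203.0.113.7", 4));
```

Since the routing decision depends only on the source address, a cookie-less preflight request lands on the same worker as the rest of the session. The trade-off is that many clients behind one address all share a worker.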

7 participants