websocket_recv_nb fails under heavy load but websocket_recv is fine #1716
Comments
Bonus: this seems to have also fixed #1602 (which may be the same as #1623). #1602 is about sending single large responses. A large response that takes a long time to send is more likely to overlap with other requests or responses. So again, it's probably not heavy load per se that causes all these bugs, but multiple messages being received or sent at almost the same time.
In the above (failed) tests, which options did you run uWSGI with?
In the tests I was hitting uWSGI directly, so it would most likely have been --http. In production it's behind nginx and uses --socket. The problem appears in both places when using websocket_recv_nb.
How did you run the tests? Could you share a testing script?
The problem is at least partially repeatable with the demos in the tests folder, using this demo: https://github.com/unbit/uwsgi/blob/master/tests/websockets_chat_async.py
First you need to update it for current versions of uWSGI (change websocket_handshake) and then add a setInterval to repeatedly send the message: https://gist.github.com/kylemacfarlane/841efeec92d63d2396dc0cab0d2a4e85
Note that for this demo I had to set the interval to 5ms / 200rps, whereas in my real project 50ms / 20rps is enough. That probably has something to do with message sizes and/or network latency. Run it with:
Access the page and type something in the message box so that it starts sending. Within a few minutes (perhaps even seconds) the websocket connection will drop with
I am interested in the gevent version, so I ran the server app:
I don't observe any errors after several minutes of running this test (over 10000 websocket messages so far)...
I run this version of uWSGI:
gevent.select on stable is the hardest method to make fail. In the tests above I noted that it takes 5-15 minutes at 20rps over the real network, so in a simple test on the same computer you'd probably have to leave it running for ages.
If you put a uWSGI websocket server using websocket_recv_nb as shown in tests/websockets_chat.py under heavy load then a couple of errors start appearing. On the other hand, using websocket_recv inside gevent.spawn and applying gevent monkeypatching as shown in tests/websockets_chat_2.py appears to work fine.
First of all, I don't think it's caused by load per se. I think the problem is due to messages arriving at the same time, which just happens more frequently under heavy load. To recreate the issue I hit the server with two messages at the same time every 100ms.
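The reproduction pattern described above (a burst of two messages every 100ms) can be sketched roughly as follows. This is only an illustration, not the script used for the tests: `send_bursts` and its `send` parameter are hypothetical names, and `send` stands in for whatever coroutine actually writes to an open websocket connection.

```python
import asyncio


async def send_bursts(send, rounds, burst_size=2, interval=0.1):
    """Fire `burst_size` sends concurrently, then wait `interval` seconds.

    `send` is an assumed coroutine function taking one message; in a real
    test it would write to an already opened websocket connection.
    """
    for i in range(rounds):
        # gather() schedules the sends together so they overlap as closely
        # as a single event loop allows.
        await asyncio.gather(
            *(send(f"msg {i}-{j}") for j in range(burst_size))
        )
        await asyncio.sleep(interval)
```

Sends scheduled this way are concurrent within one event loop rather than guaranteed to hit the wire at the same instant, so it may take many rounds before two messages really arrive together.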
First Symptom
You will encounter "delayed" messages as reported in #1241. To solve this you can simply iterate over websocket_recv_nb until it returns None.
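A minimal sketch of that workaround, written against an injected callable so it can be tried without a running server. `drain_websocket` and `max_messages` are hypothetical names; inside a uWSGI app the callable would be uwsgi.websocket_recv_nb. The falsy check is an assumption that treats both None and an empty message as "nothing pending".

```python
def drain_websocket(recv_nb, max_messages=1000):
    """Call `recv_nb` until it reports nothing pending; return the messages.

    `recv_nb` is any zero-argument callable that returns the next message,
    or a falsy value (None or an empty message) when none is waiting.
    """
    messages = []
    # The cap is a safety net so a chatty client cannot starve other work.
    for _ in range(max_messages):
        msg = recv_nb()
        if not msg:
            break
        messages.append(msg)
    return messages
```

Calling this once per wake-up, instead of reading a single message, avoids leaving messages queued until the next event arrives.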
Second Symptom
You will start getting this error when reading from the websocket:
This was reported in #1533. How can there be no PONG response when the client and server are talking at 20 messages per second? Even aggressively reading from the websocket as much as possible doesn't solve the problem.
Third Symptom
You will start getting this error when reading from Redis:
This was reported on StackOverflow and the uWSGI mailing list:
Other Errors
If you start playing around there are other errors that can happen. For example, if you suppress the OSError from Redis, then the next time you try to read the websocket you will get an fd error from that as well. But the above are the three errors you will most commonly encounter if you copy the examples in the tests directory: https://github.com/unbit/uwsgi/tree/master/tests
Testing
I experimented with various settings and the following are the outcomes. I didn't try asyncio because the docs say it's experimental.
uwsgi.wait_fd_read, based on tests/websockets_chat_async.py. Options: --async 100 --ugreen
uwsgi.async_sleep instead of waiting for any fds. Options: --async 100 --ugreen
gevent.select, based on tests/websockets_chat.py. Options: --gevent 100 --gevent-monkey-patch

uwsgi.wait_fd_read, based on tests/websockets_chat_async.py. Options: --async 100 --ugreen
gevent.select, based on tests/websockets_chat.py. Options: --gevent 100 --gevent-monkey-patch. Errors: KeyError: 'env' and SystemError: <built-in function uwsgi_gevent_request> returned a result with an error set

gevent.spawn and request_context, loosely based on tests/websockets_chat_2.py, using uwsgi.websocket_recv_nb and redis.get_message combined with gevent.sleep. Options: --gevent 100 --gevent-monkey-patch
gevent.spawn and request_context, loosely based on tests/websockets_chat_2.py, using uwsgi.websocket_recv and redis.listen and hoping that gevent monkeypatching prevents any actual blocks. Options: --gevent 100 --gevent-monkey-patch

gevent.spawn and request_context:
uwsgi.websocket_recv + gevent monkeypatch with non-blocking redis.get_message + gevent.sleep
uwsgi.websocket_recv_nb + gevent.sleep with "blocking" redis.listen + gevent monkeypatch
websocket_recv_nb
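For reference, the fully non-blocking combination listed above (uwsgi.websocket_recv_nb and redis.get_message combined with gevent.sleep) boils down to a polling step along these lines. This is a hedged sketch, not the code that was tested: `poll_once` and all four parameters are hypothetical, and the message shape assumes redis-py's pubsub dictionaries with "type" and "data" keys.

```python
def poll_once(ws_recv_nb, redis_get_message, on_ws_message, on_redis_message):
    """Drain both sources without blocking; return how many were handled.

    ws_recv_nb: returns the next websocket message or a falsy value.
    redis_get_message: returns the next pubsub dict or None.
    on_ws_message / on_redis_message: callbacks that forward each payload.
    """
    handled = 0
    # Drain the websocket first so no client message is left waiting.
    while True:
        msg = ws_recv_nb()
        if not msg:
            break
        on_ws_message(msg)
        handled += 1
    # Then drain Redis, skipping subscribe confirmations and similar events.
    while True:
        item = redis_get_message()
        if item is None:
            break
        if item.get("type") == "message":
            on_redis_message(item["data"])
            handled += 1
    return handled
```

In a gevent worker this step would run in a loop with a short gevent.sleep between iterations so that other greenlets get a chance to run.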