Fix eventlet graceful timeout handling #1725
This is great. Good find.
Only one change needs to be made. The `sock.close()` should be before `pool.waitall()`. Otherwise, the listening socket stays open in the process, and therefore at the OS level, leaving half-open connections in the queue. We close the socket quickly so that the OS starts rejecting connections and any load balancer in front of Gunicorn can fail over without waiting for a timeout.
Closing the listening socket this way should be safe because:
- If other workers or another Gunicorn arbiter process have it open, the OS will not shut it down.
- The connection is already accepted, so it has its own socket and does not need the listener.
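The second point can be checked with plain standard-library sockets. This is only a minimal sketch, not Gunicorn code: the function name is made up for illustration, and it uses a loopback listener on an ephemeral port rather than a real worker.

```python
import socket


def accepted_connection_survives_listener_close():
    """Hypothetical demo: close the listener, then keep using an accepted socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    client = socket.create_connection(("127.0.0.1", port))
    conn, _addr = listener.accept()

    # Close the listening socket first, as suggested in the review comment.
    listener.close()

    # The accepted connection has its own socket and keeps working.
    client.sendall(b"ping")
    received = conn.recv(4)
    conn.sendall(b"pong")
    reply = client.recv(4)

    conn.close()
    client.close()
    return received, reply


received, reply = accepted_connection_survives_listener_close()
```

In a worker, the same ordering would mean closing the listener and only then waiting on the pool, so in-flight requests finish while new connections are turned away.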
2dbfc3c to 70c3b91
Thanks, that makes sense. The reason I moved
This is because the AsyncWorker However, now I understand a bit better, I can see that the I hope that makes sense.
16badbe to e4db80a
Looks good to me! @berkerpeksag could you take a quick look, in case I missed something?
Do we need to change the signature of the `handle()` methods? There are at least a couple of `AsyncWorker` subclasses in the wild and we can't know how they call `super()` in their implementations. There is a small chance that this may break third-party code if they have a snippet like this:
    def handle(self, listener, client, addr):
        ...
        super().handle(listener=listener, client=client, addr=addr)
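To make the concern concrete, here is a small sketch of how a keyword-style `super()` call fails once a base-class parameter is renamed. All class names and the renamed parameter `listener_name` are hypothetical and chosen only for illustration; they are not taken from this PR.

```python
class BaseWorker:
    """Stand-in for a base class with the existing signature."""

    def handle(self, listener, client, addr):
        return ("handled", addr)


class RenamedBaseWorker:
    """Same base class after a hypothetical parameter rename."""

    def handle(self, listener_name, client, addr):
        return ("handled", addr)


def make_third_party_worker(base):
    """Build a subclass that forwards its arguments by keyword."""

    class ThirdPartyWorker(base):
        def handle(self, listener, client, addr):
            return super().handle(listener=listener, client=client, addr=addr)

    return ThirdPartyWorker


# Works against the existing signature.
ok = make_third_party_worker(BaseWorker)().handle("sock", "conn", ("127.0.0.1", 1))

# Fails against the renamed signature: `listener` is no longer a valid keyword.
try:
    make_third_party_worker(RenamedBaseWorker)().handle("sock", "conn", ("127.0.0.1", 1))
    broke = False
except TypeError:
    broke = True
```

Subclasses that forward positionally would keep working under a rename, which is why the breakage is hard to predict from the outside.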
I have the same feeling, but I'm not sure what to do about it.
77aa82e to e55a5e9
If we use I'm not sure of the side effects of switching the socket object we retrieve the name from, but it appears to be used only for the WSGI I've updated the PR to reflect this change. Let me know how to proceed.
faf5e43 to 982e36b
@tilgovi @berkerpeksag any comments on this last change?
982e36b to ae5c078
The `StopServer` exception can lead to a handler blocked waiting for an available greenthread to never be processed. This change ensures we attempt to handle any accepted socket connection within the graceful timeout period.
ae5c078 to db5a7a1
I propose we remove the change in the base class and get this merged. Then, we could break the API in a separate PR for the next major release, if we want to. Or we could store a dict of
I would very much like to finish this up. The listeners should never change names, so we could store them when we create the sockets and then pass them later from that cache. Any other approaches that don't result in breaking changes? We should make sure this works for every worker, not just eventlet.
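One way the caching idea might look, as a rough sketch only: the helper name `create_listeners` and the dict layout are assumptions for illustration, not Gunicorn's actual socket-creation code.

```python
import socket


def create_listeners(addresses):
    """Hypothetical helper: bind listeners and cache their names at creation."""
    listeners = []
    names = {}  # maps each listener socket object to its bound address
    for host, port in addresses:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.bind((host, port))
        sock.listen()
        listeners.append(sock)
        names[sock] = sock.getsockname()  # captured once, while the socket is open
    return listeners, names


listeners, names = create_listeners([("127.0.0.1", 0)])
listener = listeners[0]
listener.close()

# The cached name is still available after the listener is closed...
cached_host = names[listener][0]

# ...whereas asking the closed socket directly raises OSError.
try:
    listener.getsockname()
    lookup_failed = False
except OSError:
    lookup_failed = True
```

With a cache like this, handlers could receive the stored name without any change to the `handle()` signature, and closing the listener early would no longer matter for name lookups.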
The `StopServer` exception can cause a handler that is blocked waiting for an available greenthread to never be processed. The accepted connection is then terminated, and the client receives a 502 status code because of the invalid response.

This is reproducible with the following settings and a sufficient number of concurrent requests.
This change ensures we attempt to handle any accepted socket connection within the graceful timeout period by resubmitting to the pool after receiving the exception.
In addition, we close the socket after all handling is complete, to prevent other listener errors.