Sockets #1913
Comments
Is there anything interesting upstream of gunicorn in your pod, like a reverse proxy or nginx? |
No, no proxies, no nginx |
I'm having the exact same problem 😕 Upstream we have HAProxy, and in its HTTP log format, the session state at disconnection (see http://cbonte.github.io/haproxy-dconv/1.8/configuration.html#8.5) logs those errors as
So, if I understand that correctly, the client closed the connection while gunicorn was still sending the response. |
Any clues what makes the client abort? Was it waiting a long time for gunicorn to send complete response headers? |
@javabrett it does not seem like that; at least in the few log messages I looked up, it is mostly images or other assets, so it should not be taking much time. Maybe the client closed the browser or took some other action that abruptly closed the connection? 🤔 |
@gforcada are you using the proxy protocol with haproxy? |
@benoitc not that I'm aware of |
anyone get anywhere with this? I don't have much to contribute except the exact same error. My configuration consists of a load balancer that's used to terminate SSL and forward requests to a Django app running in a Docker container. I'm not sure what the LB is implemented with - it's a DigitalOcean product. I'm fairly certain it's related to the load balancer, because I have the same app running in another container that isn't behind an LB and it's never had this problem. Any ideas on the root cause and how to prevent it? |
I wonder if there's any action to take here. If this is a regular client disconnect, we could silence the error and perhaps log a disconnect in the access log, but otherwise I'm not sure what to do. |
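If that direction were taken, a minimal sketch might look like the following (illustrative names only, not gunicorn's actual internals):

```python
import errno
import logging

log = logging.getLogger("gunicorn.sketch")

def handle_safely(sock, handle_request):
    """Run the real request handler, treating a vanished client as a
    normal event to log rather than an error that kills the worker."""
    try:
        handle_request(sock)
    except OSError as e:
        if e.errno in (errno.ENOTCONN, errno.EPIPE, errno.ECONNRESET):
            # The client disconnected mid-request; record it and move on.
            log.debug("client disconnected during request")
        else:
            raise
```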
I just had the same error which crashed our monitoring webserver:
|
I had the same with a pod running the Docker image dpage/pgadmin4:4.2: OSError: [Errno 107] Socket not connected |
Looks very similar to: #2070 |
I'm getting this error occasionally on hosted Google Cloud Run. Below is a simplified version of our container definition:
Stackdriver shows the following stacktrace:
|
Same issue as OP here. Using Google Cloud Platform, Python 3.7, gunicorn 19.9.0
|
I'm having the exact same problem 😕 |
Exact same problem as GAEfan. Running a Flask app with Python 3.7 in App Engine Standard Env. |
Same issue here |
I'm having the same issue running a Django app with Python 3.7 in Google App Engine.

Traceback (most recent call last):
  File "/env/lib/python3.7/site-packages/gunicorn/workers/sync.py", line 134, in handle
    req = six.next(parser)
  File "/env/lib/python3.7/site-packages/gunicorn/http/parser.py", line 41, in next
    self.mesg = self.mesg_class(self.cfg, self.unreader, self.req_count)
  File "/env/lib/python3.7/site-packages/gunicorn/http/message.py", line 181, in __init__
    super(Request, self).__init__(cfg, unreader)
  File "/env/lib/python3.7/site-packages/gunicorn/http/message.py", line 54, in __init__
    unused = self.parse(self.unreader)
  File "/env/lib/python3.7/site-packages/gunicorn/http/message.py", line 230, in parse
    self.headers = self.parse_headers(data[:idx])
  File "/env/lib/python3.7/site-packages/gunicorn/http/message.py", line 74, in parse_headers
    remote_addr = self.unreader.sock.getpeername()
OSError: [Errno 107] Transport endpoint is not connected |
Same issue running GAE python 3.7 gunicorn and fastapi/uvicorn. |
Same issue Google Cloud Run |
which kind of request are we talking about? |
same issue in Google App Engine. POST request. Happens inconsistently. Flask app. @benoitc please let me know what info would be useful and I can post. |
Same issue as well, Google App Engine, POST request too, Flask app. It seems to have started when I changed to a custom entrypoint instead of keeping the default one. The custom entrypoint is the following (in Google App Engine you set it inside an app.yaml file):
The default entrypoint is not set to anything (I don't know what is used as the default entrypoint, though). Not sure if it started because of that, but I noticed this when I made this change (among other changes). |
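For reference, a custom entrypoint of the kind being described is set in app.yaml roughly like this (a hypothetical example, not the commenter's actual configuration; `main:app` is an assumed module and WSGI callable):

```yaml
runtime: python37
# Explicit entrypoint: App Engine runs this command instead of its
# default gunicorn invocation. $PORT is provided by the platform.
entrypoint: gunicorn -b :$PORT --workers 2 main:app
```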
I was using |
the question stands
what do you mean by entry point? can you post a debug log and the way the request is done? (raw http would help) |
I think he's referring to the fact that you're explicitly specifying the entrypoint. Later edit: maybe the updated example over here helps: https://github.com/GoogleCloudPlatform/python-docs-samples/tree/master/appengine/standard_python37/hello_world (so basically you have to let the service handle how the server should be started) |
Firstly, @benoitc THANK YOU. Your work is awesome. I'm also experiencing this same issue on Google Cloud Run w/gunicorn. I'm posting what I have, though it's likely not unique, perusing the above. I'm running a Flask app with Gunicorn as the server (and no proxy) in a Docker container. The traceback (from GC console):

  File "/usr/local/lib/python3.7/site-packages/gunicorn/arbiter.py", line 583, in spawn_worker
    worker.init_process()
  File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/gthread.py", line 104, in init_process
    super(ThreadWorker, self).init_process()
  File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/base.py", line 134, in init_process
    self.run()
  File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/gthread.py", line 211, in run
    callback(key.fileobj)
  File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/gthread.py", line 127, in accept
    sock, client = listener.accept()
  File "/usr/local/lib/python3.7/socket.py", line 212, in accept
    fd, addr = self._accept()
OSError: [Errno 107] Transport endpoint is not connected

And Google's parsed output of the above:
If there is anything else I can provide or do to help here, please let me know. |
A PR would be welcome to handle |
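For anyone picking this up, a sketch of the kind of guard being suggested, modeled loosely on the gthread accept path in the traceback above (not actual gunicorn code):

```python
import errno

def accept_safely(listener, enqueue):
    """Accept one connection, tolerating peers that already hung up.

    A client that connects and closes immediately (e.g. a TCP health
    check) can make accept() fail with ENOTCONN on some platforms;
    that should not take the whole worker down.
    """
    try:
        sock, client = listener.accept()
    except OSError as e:
        if e.errno in (errno.ENOTCONN, errno.ECONNABORTED):
            return  # peer vanished; keep serving other connections
        raise
    enqueue(sock, client)
```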
same behavior |
I think this might be fixed by #2277 |
A couple of socket operations can fail with an ENOTCONN error if the other side of the connection is not connected anymore. In that case, let's not crash the whole worker and give it a chance to accept new connections. In my case, the operation that sometimes fails is getpeername(), which was introduced in b07532b (v19.8.0). Someone in benoitc#1913 mentioned that v19.7.1 was working fine, so it matches. Fixes benoitc#1913
benoitc#2277 was branched off of master. I cherry-picked the PR's commit on top of the 20.0.4 tag of the main repo (and updated this commit message) for a custom build.

Do not raise and crash worker on ENOTCONN error

A couple of socket operations can fail with an ENOTCONN error if the other side of the connection is not connected anymore. In that case, let's not crash the whole worker and give it a chance to accept new connections. In my case, the operation that sometimes fails is getpeername(), which was introduced in b07532b (v19.8.0). Someone in benoitc#1913 mentioned that v19.7.1 was working fine, so it matches. Fixes benoitc#1913
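In rough outline, the fix those commits describe wraps the getpeername() call that v19.8.0 introduced (a sketch under that assumption, not the literal diff):

```python
import errno

def peer_addr(sock):
    """Best-effort peer address for logging/parsing.

    The client may already have disconnected by the time we ask, in
    which case getpeername() raises ENOTCONN; fall back to no address
    instead of crashing the worker.
    """
    try:
        return sock.getpeername()
    except OSError as e:
        if e.errno == errno.ENOTCONN:
            return None
        raise
```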
In my case, Ansible's wait_for module is the cause. I use Ansible to deploy a gunicorn + Flask server (specifically Python 3.6.12, gunicorn 19.9.0, Flask 1.4.1). After starting the service, I use the wait_for module to make sure the service is up and running. I guess other monitoring systems do the same. |
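That theory is easy to reproduce by hand: a probe that connects and immediately closes, like the sketch below (host and port are assumptions), behaves just like wait_for or a plain TCP health check and triggers the error on affected versions:

```python
import socket

# Connect and close without sending a single byte -- exactly what
# Ansible's wait_for or a plain TCP health check does.
s = socket.create_connection(("127.0.0.1", 8000))  # assumed gunicorn bind
s.close()
```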
I got the same error... hmm. Python 3.8, with the Docker properties below:
Any solution? |
Hi @tilgovi |
I will make a release probably today. I will recheck this ENOTCONN issue, as I am not happy with the solution committed. @tilgovi has another fix that can be tested.
|
? |
did you test the other patch to help?
|
thanks, I am wondering now, is there any updated info about the pip package? |
@yehjames is master working for you? A release is planned for today. But any feedback on how master works on different platforms is welcome. |
We'll work to get a release out as soon as we can. We cannot promise a day, but we're working to figure out what remains for this release and to improve the release management for the future. |
Please use GitHub's "Watch" feature for the repository and watch for releases if you want to be notified. |
Hi. I am having the same issue with HAProxy + Gunicorn + Django. My HAProxy backend loses almost all its servers due to unanswered health checks, and the Gunicorn logs are plagued with:
I am working with gunicorn==20.0.4, Django==3.1.5, HA-Proxy version 2.2.11-1ppa1~bionic. Any clue on how to proceed? This is in TCP mode, no SSL, under Locust stress testing. |
Someone please share the solution to this issue |
@krishnamanchikalapudi @ricarhincapie please upgrade to the latest release of Gunicorn :) |
Updating gunicorn on account of experiencing [this error](benoitc/gunicorn#1913)
The service is running in a Kubernetes pod, and out of nowhere, without any specific cause, it happens off and on: