
DAMN ! worker 7 (pid: 1343) died, killed by signal 11 :( trying respawn ... #1792

Open
jedie opened this issue May 16, 2018 · 34 comments

@jedie

jedie commented May 16, 2018

I'm using Docker. After I switched from https://github.com/phusion/baseimage-docker (phusion/baseimage:0.10.1 with Python v3.5) to https://hub.docker.com/_/python/ (python:3.6-alpine with Python v3.6), I very often get the error:

DAMN ! worker X (pid: Y) died, killed by signal 11 :( trying respawn ...

The rest of the setup is the same and uses uWSGI==2.0.17.

Any idea?!?

@robnardo

Hey @jedie - I get the same error. I am building from python:3.6-alpine as well. My ENV and CMD in the Dockerfile look like this:

ENV UWSGI_WSGI_FILE=base/wsgi.py UWSGI_HTTP=:8000 UWSGI_MASTER=1 UWSGI_WORKERS=2 UWSGI_THREADS=8 UWSGI_UID=1000 UWSGI_GID=2000

CMD ["uwsgi", "--http-auto-chunked", "--http-keepalive", "--static-map", "/media/=/code/media/", "--static-map", "/static/=/code/static/"]

I am a bit worried about using this in a production environment.

Rob

@jedie
Author

jedie commented Jul 12, 2018

I switched to 3.6-slim-stretch as a workaround...

@v9Chris

v9Chris commented Jul 18, 2018

I'm also getting this, all of a sudden.

18/07/2018 12:46:08 DAMN ! worker 1 (pid: 75) died, killed by signal 11 :( trying respawn ...
18/07/2018 12:46:08 Respawned uWSGI worker 1 (new pid: 80)

@robnardo

I found that switching the uwsgi config to only one thread makes this go away. Here is my uwsgi config (from my Dockerfile):

ENV UWSGI_WSGI_FILE=base/wsgi.py UWSGI_HTTP=:8000 UWSGI_MASTER=1 UWSGI_WORKERS=8 UWSGI_UID=1000 UWSGI_GID=2000 UWSGI_TOUCH_RELOAD=touch-reload.txt UWSGI_LAZY_APPS=1 UWSGI_WSGI_ENV_BEHAVIOR=holy

@deathemperor

I can confirm that configuring it to use one thread makes this go away.
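
For anyone driving uWSGI from an ini file rather than environment variables, a minimal sketch of this single-thread workaround could look like the following (the wsgi-file path and port are placeholders, not taken from this thread):

[uwsgi]
master = true
wsgi-file = app/wsgi.py   ; placeholder path, not from this thread
http = :8000
workers = 8               ; scale out with extra processes instead of threads
threads = 1               ; a single thread per worker, as in the reports above

The trade-off is memory: every additional worker is a full process rather than a thread.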

@beaugunderson

beaugunderson commented Oct 10, 2018

I'm also seeing this on python:3.6-alpine3.7. Works with threads = 1, random 502s from signal 11s with threads = 2.

@beaugunderson

python:3.7-alpine3.8 did not help, but switching to python:3.7-slim-stretch did. We'd prefer to use Alpine, but this will be our workaround for now.

@zhongdixiu

Hi, I also encountered the same problem. When I run a Flask app under uWSGI that calls a Keras (TensorFlow backend) object detection API, I get the error "DAMN ! worker 1 (pid: 5240) died, killed by signal 11 :( trying respawn ...". I then tried using only one thread, but that doesn't work; instead another error occurs: "!!! uWSGI process 347 got Segmentation Fault !!!". My configuration file is as follows:
[config attachment]
Can anyone give me some help? Thanks!

@kball

kball commented Nov 20, 2018

I ran into a similar issue, though for me the segfaults were traced back to anything that tried to use SSL (e.g. to talk to a remote API). Changing to stretch-slim seemed to resolve the issue.

@cridenour

Just wanted to note that I ran across this issue with python:3.6-alpine3.8, but it was solved with python:3.6-alpine3.9, using uwsgi==2.0.17.1.

@xeor

xeor commented Mar 4, 2019

I'm still getting this using uwsgi 2.0.18 on Alpine 3.7. Is anyone else still having the same problem?

@asyncmind0

asyncmind0 commented Mar 5, 2019

Still having this problem. Is there a way to make uwsgi exit entirely when this happens? I have my service configured to restart on failure.

That would be better than being left in an inconsistent state, running but not alive.

I'm using

:» uwsgi --version
2.0.18

:» lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.1 LTS
Release:        18.04
Codename:       bionic

@tamentis

tamentis commented Mar 8, 2019

Can confirm that switching to Alpine 3.9 fixed that problem for me. I had the same symptoms, completely out of the blue.

One of the most significant changes in 3.9 is the switch back to OpenSSL (from LibreSSL); I can imagine how changing such a foundational library could make a difference. It's also entirely possible that there is a lurking bug somewhere in my software that is simply no longer triggered by the different underlying libraries.

maxking added a commit to maxking/docker-mailman that referenced this issue Mar 19, 2019

@Mon-ius

Mon-ius commented Mar 19, 2019

I'm also running into this problem.

Python 3.7.2
uwsgi --version
2.0.18

@mightydeveloper

I'm also running into this problem (strangely, on Alpine 3.9).
Base image: python:3.6.8-alpine3.9
uwsgi --version: 2.0.18
Switching to threads=1 works around it, though.

@lekksi

lekksi commented Apr 10, 2019

Getting the same with python:3.6.8-alpine3.9 and uwsgi==2.0.15

Seems to get fixed by increasing uwsgi's thread-stacksize to 512. Now rolling with 2 or more threads without workers dying.
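
A hedged uwsgi.ini sketch of that mitigation (512 is the value reported above; the worker and thread counts are only illustrative). One plausible explanation is that musl's default thread stack on Alpine is far smaller than glibc's, so deep native call stacks can overflow and segfault:

[uwsgi]
master = true
workers = 2
threads = 2
thread-stacksize = 512    ; larger per-thread stack; the default on musl-based Alpine images is small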

@koorukuroo

In my case, I turned off the "enable-threads" option.
I'm not sure whether this will help anyone else.

Python version: 3.6.7, uWSGI 2.0.18 (64bit)
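
If a config sets it explicitly, the change is just removing (or commenting out) that single line, since enable-threads is off by default (a sketch, not taken from this thread):

[uwsgi]
master = true
workers = 2
; enable-threads = true   ; removed - only needed when the application itself spawns Python threads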

@aliashkar

Any update on this issue? I have also run into the same problem with uWSGI 2.0.18 and the python:3.6 image.

@adimux

adimux commented Nov 15, 2019

Same issue with python:3.7-alpine3.9. I had to switch to a different distro: Debian.

@robnardo

I think this error is due to the uwsgi config. For my Django projects, I have been using Docker (based on python:3.7-alpine) in production with no issues. Below are my Dockerfile, docker-entrypoint.sh and uwsgi.ini files, which were borrowed from and inspired by other online articles and research. Hope this helps other folks.

Dockerfile:

FROM python:3.7-alpine
COPY ./src/requirements.txt /requirements.txt
RUN set -ex \
	&& apk add --no-cache --virtual .build-deps \
		gcc g++ make libc-dev musl-dev linux-headers pcre-dev \
        mariadb-dev \
		openssl-dev \
		uwsgi-python3 \
	&& pip3 install --upgrade pip \
	&& pip3 install --upgrade wheel \
	&& if [ ! -e /usr/bin/pip ]; then ln -s pip3 /usr/bin/pip ; fi \
	&& if [[ ! -e /usr/bin/python ]]; then ln -sf /usr/bin/python3 /usr/bin/python; fi \
	&& LIBRARY_PATH=/lib:/usr/lib /bin/sh -c "pip install --no-cache-dir -r /requirements.txt" \
	&& runDeps="$( \
		scanelf --needed --nobanner --recursive /usr/local \
			| awk '{ gsub(/,/, "\nso:", $2); print "so:" $2 }' \
			| sort -u \
			| xargs -r apk info --installed \
			| sort -u \
	)" \
	# add dependencies to the '.python-rundeps' virtual package (we will keep these)
	&& apk add --virtual .python-rundeps $runDeps \
	&& apk del .build-deps \
	# add non-build packages..
	&& apk add mariadb-client

RUN mkdir /code
WORKDIR /code/
ADD ./src /code/

EXPOSE 8000
ENV DJANGO_SETTINGS_MODULE=_base.settings
RUN DATABASE_URL='' python manage.py collectstatic --noinput && chmod a+x /code/docker-entrypoint.sh

ENTRYPOINT ["/code/docker-entrypoint.sh"]

docker-entrypoint.sh

#!/bin/sh

while ! mysqladmin ping -h"$MYSQL_HOST" --silent; do
    echo "database is unavailable - sleeping for 2 secs"
    sleep 2
done

if [ "x$DJANGO_MANAGEPY_MIGRATE" = 'xon' ]; then
    echo 'attempting to run "migrate" ..'
    python manage.py migrate --noinput
else
    echo 'DJANGO_MANAGEPY_MIGRATE is not "on", skipping'        
fi

echo "copying mime.types to /etc dir .."
cp mime.types /etc/mime.types

echo "starting uwsgi.."
uwsgi uwsgi.ini

uwsgi.ini

[uwsgi]
strict = true
master = true
enable-threads = true
vacuum = true                        ; Delete sockets during shutdown
single-interpreter = true
die-on-term = true                   ; Shutdown when receiving SIGTERM (default is respawn)
need-app = true

disable-logging = true               ; Disable built-in logging 
log-4xx = true                       ; but log 4xx's anyway
log-5xx = true                       ; and 5xx's

harakiri = 120                       ; forcefully kill workers after XX seconds
; py-callos-afterfork = true           ; allow workers to trap signals

max-requests = 1000                  ; Restart workers after this many requests
max-worker-lifetime = 3600           ; Restart workers after this many seconds
reload-on-rss = 2048                 ; Restart workers after this much resident memory
worker-reload-mercy = 60             ; How long to wait before forcefully killing workers

cheaper-algo = busyness
processes = 64                       ; Maximum number of workers allowed
cheaper = 8                          ; Minimum number of workers allowed
cheaper-initial = 16                 ; Workers created at startup
cheaper-overload = 1                 ; Length of a cycle in seconds
cheaper-step = 8                     ; How many workers to spawn at a time

cheaper-busyness-multiplier = 30     ; How many cycles to wait before killing workers
cheaper-busyness-min = 20            ; Below this threshold, kill workers (if stable for multiplier cycles)
cheaper-busyness-max = 70            ; Above this threshold, spawn new workers
cheaper-busyness-backlog-alert = 16  ; Spawn emergency workers if more than this many requests are waiting in the queue
cheaper-busyness-backlog-step = 2    ; How many emergency workers to create if there are too many requests in the queue

wsgi-file = /code/_base/wsgi.py
http = :8000
static-map = /static/=/code/static/
uid = 1000
gid = 2000
touch-reload = /code/reload-uwsgi

@jacopofar

jacopofar commented Dec 2, 2019

Same problem here, using the debian:buster image as a base and Python 3.7. I tried both values of enable-threads and a few other settings, but it still breaks. Oddly enough, the very same Docker image runs normally on my computer but gives this obscure error on our Kubernetes cluster, so I suspect it has something to do with the kernel or the network.

I noticed that Python 3.7 is not among the officially supported versions, so I downgraded to Python 3.5, but the error manifests nonetheless.

@asherp

asherp commented Dec 9, 2019

@jacopofar I too am getting the same error on kubernetes but not when I run locally. My image is based on https://github.com/dockerfiles/django-uwsgi-nginx

@awelzel
Contributor

awelzel commented Dec 14, 2019

@jacopofar, @asherp, @aliashkar - any chance there is a stack trace in the logs before the "killed by signal 11" line, and could you paste it here?

It would also be very helpful if you could reveal some information about your apps: Are you by any chance using psycopg2 2.7.x wheels and/or other Python wheels that ship their own libssl?

It appears there's a known issue with wheels that include their own libssl (or other libs) - see #1569 and #1590 (also this: http://initd.org/psycopg/articles/2018/02/08/psycopg-274-released/)

@jacopofar

@awelzel I tried to reproduce it, but I can't get it to happen anymore ¯\_(ツ)_/¯

I don't remember any additional stack trace; it only printed that message. This is my requirements.txt for that version:

uwsgi==2.0.18
boto3==1.9.67
pytest==5.2.2
pytest-cov==2.8.1
flake8==3.7.9
pandas==0.25.2
plotly==4.2.1
psycopg2-binary==2.8.3
sqlalchemy==1.2.15
dash==1.5.1
dash_auth==1.3.2
dash-bootstrap-components==0.7.2
requests==2.22.0
pyarrow==0.15.1

I'm not aware of any embedded libssl except for psycopg2, sorry for not being able to provide more details :/

@eburghar

Getting the same with python:3.6.8-alpine3.9 and uwsgi==2.0.15

Seems to get fixed by increasing uwsgi's thread-stacksize to 512. Now rolling with 2 or more threads without workers dying.

It also apparently solved my case. Is there a way to track uwsgi's stack memory consumption, to be sure this happens because of an out-of-memory condition?
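
There doesn't appear to be a counter for thread stack usage specifically, but uWSGI can at least log overall per-request memory, which may help correlate the crashes with memory pressure (a sketch; the reload-on-rss value is arbitrary):

[uwsgi]
memory-report = true      ; log address-space and RSS usage of the worker after each request
reload-on-rss = 512       ; optionally recycle any worker whose resident memory exceeds this many MB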

@wss404

wss404 commented Mar 26, 2020

The same error occurred when I tried to run a job with frequent HTTP requests.
I guess the error is due to a long timeout.
I solved it by setting a much bigger harakiri value in uwsgi.ini, and it has been working well since.
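
For reference, harakiri is a per-request timeout; a hedged sketch with an arbitrary larger value:

[uwsgi]
harakiri = 300            ; allow slow requests up to 300 seconds before the worker is forcefully recycled
harakiri-verbose = true   ; log extra detail whenever harakiri triggers

Note that a longer timeout only helps if the worker deaths are really harakiri kills of slow requests; a genuine segfault (signal 11) inside a C extension is unlikely to be cured this way.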

rjw1 added a commit to nhsx/nhsx-website that referenced this issue Apr 8, 2020

When running in staging without this set we would see workers randomly die, causing 502s:
`DAMN ! worker 5 (pid: 429) died, killed by signal 11 :( trying respawn ...`
unbit/uwsgi#1792 (comment)
@jasonTu

jasonTu commented Jun 5, 2020

I'm still getting this using uwsgi 2.0.18 on Alpine 3.7. Is anyone else still having the same problem?

I'm hitting this issue in the same environment.

@arviCV

arviCV commented Jul 1, 2021

@jacopofar I too am getting the same error on kubernetes but not when I run locally. My image is based on https://github.com/dockerfiles/django-uwsgi-nginx

I tried almost every single thing explained here and still got exactly the same error from the uwsgi server. It happens for one particular Flask endpoint whenever I deploy to the k8s cluster, while it works perfectly on my dev machine.
Surprisingly, requesting more resources fixed the issue:

resources:
  limits:
    memory: 1Gi
  requests:
    memory: 512Mi

duttonw referenced this issue in qld-gov-au/opswx-ckan-cookbook Oct 11, 2022

@ylmuhaha

ylmuhaha commented Sep 5, 2023

I have also encountered this problem. My uwsgi version is 2.0.18, with threads per worker set to 6.
This is my analysis:
One thread finished a request and called uwsgi_close_request. There it found that the worker's delta_requests had reached max_requests, so it called goodbye_cruel_world, cursed the worker, and then called simple_goodbye_cruel_world, which waits for the remaining threads to end.
However, another thread was handling a time-consuming request; it was slow but not actually stuck. So after the reload-mercy time (60s in my case), uwsgi_master_check_mercy killed the worker outright.

I wonder if there is a more graceful way to handle this. For example, in simple_goodbye_cruel_world, set manage_next_request to zero before wait_for_threads so the worker stops accepting new requests, and then have uwsgi_master_check_mercy wait for the threads to end before killing the worker with signal 9. If the worker really is stuck, it can still be killed by harakiri.
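
Until something like that is implemented, a hedged mitigation based on this analysis is simply to give reloading workers more time to drain their in-flight threads before the master force-kills them (values are illustrative):

[uwsgi]
max-requests = 1000          ; triggers the worker-reload path described above
worker-reload-mercy = 300    ; raise from the 60s mentioned above: how long the master waits before force-killing a reloading worker
harakiri = 120               ; genuinely stuck requests are still killed by harakiri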

@Stephane-Ag

So far, upping threads from 1 to 4 seems to have helped for me.

@toabi

toabi commented Sep 18, 2023

I seem to have a similar issue.

alpine 3.17.5, uwsgi 2.0.22 and python 3.10.13

I compile uwsgi on Alpine and copy it into my app container.
The Django app works for GET requests, but when I try some POST requests, it fails with the segfault.

  • setting threads from 1 to 4 did not help
  • disabling threads did not help
  • giving it lots of resources did not help

Everything works locally on macOS/arm64. It fails in our linux/amd64 kubernetes cluster.

@Sprabu4u

Sprabu4u commented Nov 8, 2023

I am also facing a similar issue with alpine 3.17.5, uwsgi 2.0.22 and python 3.10.13.

But my application works fine with the older python 3.9 / alpine 3.15 combination. I tried all the above suggestions, but no luck.

@Sprabu4u

Sprabu4u commented Nov 8, 2023

I seem to have a similar issue.

alpine 3.17.5, uwsgi 2.0.22 and python 3.10.13

I compile uwsgi on Alpine and copy it into my app container. The Django app works for GET requests, but when I try some POST requests, it fails with the segfault.

  • setting threads from 1 to 4 did not help
  • disabling threads did not help
  • giving it lots of resources did not help

Everything works locally on macOS/arm64. It fails in our linux/amd64 kubernetes cluster.

Is this issue resolved? Is there any workaround for it?

snim2 pushed a commit to nhsx/nhsx-website that referenced this issue Nov 20, 2023
When running in staging without this set we would see workers randomly die, causing 502s:
`DAMN ! worker 5 (pid: 429) died, killed by signal 11 :( trying respawn ...`
unbit/uwsgi#1792 (comment)
@jedie
Author

jedie commented Sep 19, 2024

nhsx/nhsx-website@eb91494 is a solution?!?
