CRITICAL: µwsgi segfaulting causing infinite crashing #581

Closed
Datamance opened this issue Oct 5, 2020 · 3 comments
Labels
Bug [Work Type] An issue with the program Diagnosis [Work Type] Requires investigation as to how the bug occurs Important [Priority] Affects/effects critical functionality

Comments

@Datamance
Contributor

After the staging deploy triggered by the changelog introduction, µwsgi logs are showing some very concerning messages:

uwsgi /usr/local/lib/libpython3.8.so.1.0(+0x12b099) [0x7f670baa5099]
uwsgi /usr/local/lib/libpython3.8.so.1.0(+0x12923b) [0x7f670baa323b]
uwsgi /usr/local/lib/libpython3.8.so.1.0(PyObject_CallMethod+0xb2) [0x7f670baa3072]
uwsgi uwsgi(+0xb1664) [0x55f6fa4a7664]
uwsgi uwsgi(uwsgi_ignition+0x112) [0x55f6fa4805f2]
uwsgi uwsgi(uwsgi_worker_run+0x25e) [0x55f6fa484e2e]
uwsgi uwsgi(uwsgi_run+0x434) [0x55f6fa485384]
uwsgi uwsgi(+0x3cf3e) [0x55f6fa432f3e]
uwsgi /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb) [0x7f670b7db09b]
uwsgi uwsgi(_start+0x2a) [0x55f6fa432f6a]
uwsgi *** end of backtrace ***
uwsgi DAMN ! worker 1 (pid: 163) died :( trying respawn ...
uwsgi Respawned uWSGI worker 1 (new pid: 164)
uwsgi *** running gevent loop engine [addr:0x55f6fa4a7100] ***
uwsgi !!! uWSGI process 164 got Segmentation Fault !!!
uwsgi *** backtrace of 164 ***
uwsgi uwsgi(uwsgi_backtrace+0x2a) [0x55f6fa48006a]
uwsgi uwsgi(uwsgi_segfault+0x23) [0x55f6fa480423]
uwsgi /lib/x86_64-linux-gnu/libc.so.6(+0x37840) [0x7f670b7ee840]

Builds are also showing worrying warnings about greenlet:

Step #4 - "run-tests": <frozen importlib._bootstrap>:219: RuntimeWarning: greenlet.greenlet size changed, may indicate binary incompatibility. Expected 144 from C header, got 152 from PyObject
Step #4 - "run-tests": <frozen importlib._bootstrap>:219: RuntimeWarning: greenlet.greenlet size changed, may indicate binary incompatibility. Expected 144 from C header, got 152 from PyObject
Step #4 - "run-tests": <frozen importlib._bootstrap>:219: RuntimeWarning: greenlet.greenlet size changed, may indicate binary incompatibility. Expected 144 from C header, got 152 from PyObject
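A warning like the one above is easy to miss in build output. As a rough sketch (not something this project does today), a build step could import the affected module in a fresh interpreter with `RuntimeWarning` promoted to an error, so a binary mismatch fails the build instead of scrolling past. The helper name `imports_cleanly` is made up for illustration, and this assumes the size-change warning is emitted through the normal Python warnings machinery.

```python
import subprocess
import sys


def imports_cleanly(module_name):
    """Import module_name in a fresh interpreter with RuntimeWarning as an error.

    Returns (ok, stderr_text). A fresh process is used so that modules already
    imported in the current process do not hide the warning.
    """
    result = subprocess.run(
        [sys.executable, "-W", "error::RuntimeWarning", "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stderr


if __name__ == "__main__":
    # "gevent" is the module of interest in this issue; any importable name works.
    ok, err = imports_cleanly("gevent")
    if not ok:
        print(err, file=sys.stderr)
        sys.exit(1)
```

Run as a step before the test suite, this would exit non-zero on the warning (or on any other import failure) rather than letting the image ship.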

This is in progress; I am working on fixing the build.

Datamance added the Bug, Diagnosis, and Important labels on Oct 5, 2020
Datamance self-assigned this on Oct 5, 2020
@Datamance
Contributor Author

The bad build picked up a version bump:

Step #3 - "build-image": Collecting greenlet>=0.4.16; platform_python_implementation == "CPython"
Step #3 - "build-image":   Downloading greenlet-0.4.17-cp38-cp38-manylinux1_x86_64.whl (48 kB)

On the last good build:

Step #3 - "build-image": Collecting greenlet>=0.4.16; platform_python_implementation == "CPython"
Step #3 - "build-image":   Downloading greenlet-0.4.16-cp38-cp38-manylinux1_x86_64.whl (48 kB)

Fun. This is what happens when we have non-deterministic builds. It looks like the greenlet project tagged 0.4.17 13 days ago, so the unpinned transitive requirement (greenlet>=0.4.16) resolved to the new version under our feet and borked µwsgi.
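The comparison between the two build logs above was done by eye. A minimal sketch of automating it, assuming the logs contain pip's `Downloading <name>-<version>-...whl` lines as shown (the function names are hypothetical, and source distributions such as `.tar.gz` are deliberately ignored):

```python
import re

# Matches pip wheel download lines, e.g.
#   Downloading greenlet-0.4.17-cp38-cp38-manylinux1_x86_64.whl (48 kB)
WHEEL_LINE = re.compile(r"Downloading\s+([A-Za-z0-9_.]+)-([^-\s]+)-\S+\.whl")


def downloaded_versions(log_text):
    """Map package name -> version for every wheel pip downloaded in a log."""
    return {m.group(1).lower(): m.group(2) for m in WHEEL_LINE.finditer(log_text)}


def version_drift(good_log, bad_log):
    """Return {name: (good_version, bad_version)} for packages whose version changed."""
    good = downloaded_versions(good_log)
    bad = downloaded_versions(bad_log)
    return {
        name: (good[name], bad[name])
        for name in good.keys() & bad.keys()
        if good[name] != bad[name]
    }


# The two log lines quoted in this issue.
good_log = 'Step #3 - "build-image":   Downloading greenlet-0.4.16-cp38-cp38-manylinux1_x86_64.whl (48 kB)'
bad_log = 'Step #3 - "build-image":   Downloading greenlet-0.4.17-cp38-cp38-manylinux1_x86_64.whl (48 kB)'

print(version_drift(good_log, bad_log))
# {'greenlet': ('0.4.16', '0.4.17')}
```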

Here is the related bug filed with the greenlet team.

I've scaled the replicas for the web service down to 0 for now. I will issue a bugfix shortly that just pins greenlet to the previous version.

Datamance added a commit that referenced this issue Oct 5, 2020
@Datamance
Contributor Author

Actually, let's try upgrading gevent and pinning greenlet to 0.4.17 first.
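For reference, one way to confirm that a pin actually took effect inside the built image is to compare installed versions against the expected ones. This is only a sketch: `find_pin_mismatches` is a hypothetical helper, and the only pin taken from this thread is greenlet 0.4.17 (the upgraded gevent version is not stated here, so it is left out).

```python
from importlib import metadata


def find_pin_mismatches(pins):
    """Compare expected versions against what is installed.

    pins maps distribution name -> expected version string.
    Returns {name: (expected, installed)} for every mismatch; installed is
    None when the distribution is not installed at all.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    # Pin taken from this issue; extend with other pins as needed.
    problems = find_pin_mismatches({"greenlet": "0.4.17"})
    for name, (wanted, installed) in problems.items():
        print(f"{name}: expected {wanted}, found {installed}")
    raise SystemExit(1 if problems else 0)
```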

@Datamance
Contributor Author

"Fixed", and by "fixed" I mean this is a stopgap solution until we have something like #562 to prevent transitive dependency drift from screwing up future builds.
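Not knowing what #562 will end up looking like, here is one possible shape for a drift check: list every installed package that has no exact `==` pin in the requirements file, which is exactly where a transitive dependency like greenlet would show up. The function names and the sample data are illustrative assumptions, and the parser only understands plain `name==version` lines (no extras, no URLs).

```python
import re

# A plain "name==version" requirement line; anything else is treated as unpinned.
PIN = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*[^\s;#]+")


def normalize(name):
    """Normalize a distribution name the way package indexes compare them."""
    return re.sub(r"[-_.]+", "-", name).lower()


def pinned_names(requirements_text):
    """Return the set of normalized names that have an exact == pin."""
    names = set()
    for line in requirements_text.splitlines():
        match = PIN.match(line)
        if match:
            names.add(normalize(match.group(1)))
    return names


def missing_pins(requirements_text, installed_names):
    """Return installed packages (sorted) that the requirements file does not pin."""
    pinned = pinned_names(requirements_text)
    return sorted(
        normalize(name) for name in installed_names if normalize(name) not in pinned
    )


# Illustrative data only: the version numbers below are placeholders.
requirements = "gevent==1.2.3\nuWSGI==4.5.6\n"
installed = ["gevent", "greenlet", "uWSGI"]

print(missing_pins(requirements, installed))
# ['greenlet']
```

In a real build the installed list could come from `importlib.metadata.distributions()`, and a non-empty result would fail the build.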
