Master process should restart expired workers. #517

gnat · 2019-12-10T13:50:23Z

Uvicorn stays alive when all workers are dead. Is this intended behaviour? Could someone explain the benefit of this?

Uvicorn does not seem to restart individual workers if they are dead. Therefore, shouldn't Uvicorn itself exit when there are zero workers left?

A container orchestrator could automatically reload Uvicorn on exit, which would be awesome.

tomchristie · 2019-12-10T15:01:53Z

> Uvicorn does not seem to restart individual workers if they are dead. Therefore, shouldn't Uvicorn itself exit when there are zero workers left?

Really what we'd like to do here is have the master process restart the child processes.
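
A minimal sketch of the restart behaviour described here, not Uvicorn's implementation: a parent process polls its children and replaces any that have exited. The names `WORKER_CMD`, `spawn_worker`, and `supervise` are hypothetical, the worker is a stand-in that exits immediately, and `max_restarts` exists only so the example terminates.

```python
import subprocess
import sys
import time

# Hypothetical worker: a tiny Python program that exits right away, standing in
# for a server worker that crashed or reached its request limit.
WORKER_CMD = [sys.executable, "-c", "import sys; sys.exit(1)"]


def spawn_worker() -> subprocess.Popen:
    """Start one child process running the stand-in worker."""
    return subprocess.Popen(WORKER_CMD)


def supervise(num_workers: int, max_restarts: int, poll_interval: float = 0.05) -> int:
    """Keep `num_workers` children running, replacing any that exit.

    Stops after `max_restarts` replacements so the sketch terminates; a real
    master process would loop until it is told to shut down.
    Returns the number of restarts performed.
    """
    workers = [spawn_worker() for _ in range(num_workers)]
    restarts = 0
    while restarts < max_restarts:
        for index, proc in enumerate(workers):
            # poll() returns None while the child is still running.
            if proc.poll() is not None and restarts < max_restarts:
                workers[index] = spawn_worker()
                restarts += 1
        time.sleep(poll_interval)
    # Reap the remaining children before returning.
    for proc in workers:
        proc.wait()
    return restarts
```

Polling keeps the sketch portable; a production process manager would more likely react to `SIGCHLD` and also handle graceful shutdown, which this example leaves out.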

gnat · 2019-12-10T15:16:24Z

Either sounds good!

Also would Uvicorn restarting workers obsolete the usage of Gunicorn + Uvicorn workers? Or is Gunicorn also providing additional value that I'm not aware of?

tomchristie · 2019-12-10T15:37:22Z

@gnat For many users, yes.

I think we'd probably want to spin the GunicornWorker out into a separately managed package at that point, and in general just recommend running uvicorn directly.

Gunicorn gives you some more advanced options wrt. signal handling and restarts, but most users probably don't actually need that.

sposs · 2020-06-22T08:30:07Z

Hi,
For me the status of this issue is problematic: on the one hand you tell users in the doc to use Uvicorn directly (which I did), and advertise the option --limit-max-requests (which I used for safety), but on the other hand, the master process does not restart the stopped processes, and it's recommended here to actually use Gunicorn. Which is it? Shouldn't the doc be changed to mention that the --limit-max-requests does not actually restart the stopped processes? In which case I believe a little more detail is needed to be able to manually restart them... Or, maybe the doc could indicate that this option should not be used and Gunicorn be used instead to provide the same functionality?

sposs · 2020-06-22T08:34:03Z

Actually, I should specify that I used supervisor to run the master process, with the following conf

[program:checklist_server]
command=uvicorn --workers 4 backend.asgi:application --limit-max-requests 1000
user=service
autostart=true
autorestart=true
redirect_stderr=true
stopasgroup=true
killasgroup=true

and if the child processes are stopped, as the master still runs, the service is not restarted...

diwu1989 · 2020-10-02T23:27:15Z

The current behavior is very non-obvious and not what gunicorn users expect.
Please change this to auto-restart.

cb109 · 2021-04-22T09:49:03Z

This behaviour is the opposite of what I would expect to happen. The --limit-max-requests mechanic seems helpful to avoid workers going stale e.g. with memory leaks or such. But what good is a uvicorn parent process with all worker processes dead and never restarted? This seems to be a misconception that should be fixed to restart workers automatically after they hit the request limit.

If that's not feasible or out of scope I'd vote to remove the flag altogether honestly and document how to achieve the same with running under gunicorn.

Kludex · 2021-09-28T10:28:55Z

I'll be working on this.

Kludex · 2022-05-15T18:00:50Z

I have worked on this. The PR was ready (#1205), but there was not enough motivation to review, and I lost the strength to continue with that PR.

Gunicorn is the way to go here.

gnat · 2022-05-15T21:26:03Z

With all due respect @Kludex you're awesome and we highly appreciate your effort on this, but this is not your ticket to close.

If you cannot continue your effort, this issue should go back to unassigned and/or be taken out of the 1.0 milestone, but far too many uvicorn users are encountering this and it still must be documented as an issue.

@tomchristie thoughts?

Kludex · 2022-05-16T04:17:56Z

We already document that gunicorn should be used in production if using multiple workers.

A PR is welcome to make it clear that the master process doesn't restart dead workers.

When I say I lost strength, I mean in trying to convince others, not in effort...

gnat · 2022-05-16T05:42:06Z

Sorry to hear that.

> We already document that gunicorn should be used in production if using multiple workers.

No offense intended, but based on how this was originally triaged, it sounds like uvicorn wasn't supposed to remain a single-worker test server into 1.0.

Before the solution was transformed into a full process manager, many folks would be quite happy with:

Workers are dead? (Or % of)
Gracefully exit Uvicorn so it can be restarted.

This is dead simple and would address the immediate issue. This would be acceptable to anyone running Uvicorn under:

systemd
docker
bash script that auto restarts uvicorn

This is the 90% solution, would be KISS, and we can revisit a full blown process manager in the future. We could go into 1.0 with confidence.
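
The "exit when the workers are dead" idea could be sketched roughly as follows. This is an illustration under stated assumptions, not code from Uvicorn or from any PR: `run_until_too_few_workers`, `EXIT_WORKERS_DEAD`, and the `min_alive` threshold are hypothetical names, and the workers are arbitrary child commands.

```python
import subprocess
import sys
import time

# Hypothetical exit status; any non-zero value tells an outer supervisor
# (systemd, docker, a bash loop) that the service needs restarting.
EXIT_WORKERS_DEAD = 1


def run_until_too_few_workers(worker_cmds, min_alive=1, poll_interval=0.05):
    """Start one child per command and return once fewer than `min_alive` are running."""
    workers = [subprocess.Popen(cmd) for cmd in worker_cmds]
    # poll() returns None while a child is still running.
    while sum(proc.poll() is None for proc in workers) >= min_alive:
        time.sleep(poll_interval)
    # Stop any survivors so the parent can exit cleanly.
    for proc in workers:
        if proc.poll() is None:
            proc.terminate()
        proc.wait()
    return EXIT_WORKERS_DEAD
```

A caller would pass the return value to `sys.exit()`, so that the parent process ends with a non-zero status and the outer supervisor restarts the whole service.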

Kludex · 2022-05-16T06:18:15Z

How is that "dead simple", and how can it not be solved using gunicorn? Single-worker test server?

You keep mentioning 1.0, but I was the one who added this issue to that milestone... There is no one else supporting this to achieve that milestone.

I'll even go further...

The way I see it, we have three possible paths:

deprecate workers feature, and motivate the gunicorn usage.
further develop workers feature (this issue).
don't do anything, document better.

As we discussed, cc @euri10 ..

mrob95 · 2024-02-25T12:17:35Z

Hi, I just hit this in production.

Due to a misconfigured caching policy our uvicorn workers were hitting memory limits and getting killed, then never recovering. Once the workers that were started on start-up ran out, the app hung indefinitely.

An easy repro for this is to put the following into a main.py:

# pip install fastapi uvicorn

from fastapi import FastAPI
import os
import signal

app = FastAPI()

@app.get("/")
def die():
    os.kill(os.getpid(), signal.SIGKILL)

Then run:

uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4

When the server gets hit, one of the workers dies. Once we run out of workers the server hangs:

while true; do echo "Sending request..."; curl http://localhost:8000; done
Sending request...
curl: (52) Empty reply from server
Sending request...
curl: (52) Empty reply from server
Sending request...
curl: (52) Empty reply from server
Sending request...
curl: (52) Empty reply from server
Sending request...
... (hangs indefinitely)

There are no indications in the server logs that anything went wrong.

I have fixed this problem by using gunicorn as a process manager as recommended here:

https://fastapi.tiangolo.com/deployment/server-workers/#gunicorn-with-uvicorn-workers

I'm surprised that a bug like this would be left in a web server implementation for 4+ years. Hanging without servicing requests or exiting is pretty much the worst thing a server can do.

If a robust fix is complicated to implement, I think it would be good to at least emphasise in the start-up logs (as the flask debug server does) that uvicorn alone is not intended for production usage.

gnat · 2024-02-25T13:05:54Z

@mrob95 the lines you need in your Uvicorn to do the job: #1942

> I'm surprised that a bug like this would be left in a web server implementation for 4+ years. Hanging without servicing requests or exiting is pretty much the worst thing a server can do.

Yup, it's bad.

silverjam · 2024-04-12T20:17:07Z

> [..]
> deprecate workers feature, and motivate the gunicorn usage.
> [..]
> don't do anything, document better.

FWIW we got led down this path too by a desire to use the --limit-concurrency feature to make sure we kept memory usage under control in our service and eventually ran into the issue with workers hanging mentioned by @mrob95 because we were also using --limit-max-requests.

I would recommend the following path forward to mitigate the issue:

Deprecate --workers and warn (in code and docs) that --limit-max-requests should not be used in combination with it (or explicitly disallow the combination of --limit-max-requests and --workers).
Document more explicitly a recommended approach for accessing Uvicorn specific options via Gunicorn (such as the concurrency limits) -- e.g. something like this StackOverflow post or whatever is more appropriate/idiomatic.
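
As a rough illustration of the first suggestion, the check below rejects the two options when they are used together. It is a sketch on a hypothetical `serve` command built with `argparse`, not Uvicorn's actual command-line code; only the flag names `--workers` and `--limit-max-requests` come from this thread.

```python
import argparse


def parse_args(argv):
    """Parse options for a hypothetical `serve` CLI and reject the unsafe combination."""
    parser = argparse.ArgumentParser(prog="serve")  # assumption: stand-in CLI, not uvicorn
    parser.add_argument("--workers", type=int, default=1)
    parser.add_argument("--limit-max-requests", type=int, default=None)
    args = parser.parse_args(argv)
    if args.workers > 1 and args.limit_max_requests is not None:
        # parser.error() prints the message to stderr and exits with status 2.
        parser.error(
            "--limit-max-requests cannot be combined with --workers > 1: "
            "workers that reach the limit are not restarted"
        )
    return args
```

With this check, `parse_args(["--workers", "4", "--limit-max-requests", "1000"])` exits with an error instead of starting a server whose workers would eventually all stop.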

Kludex · 2024-05-28T07:11:36Z

This will be available in 0.30.0.

This project supports the Gunicorn web server. Gunicorn's server design includes a primary "arbiter" process that spawns "worker" processes. Workers are child processes, each with their own running server. Workers are implemented as Python classes. Custom workers can be supplied.

This project also supports the Uvicorn web server. In the past, Uvicorn supplied workers for use with Gunicorn. The Uvicorn workers were not tested. The `uvicorn.workers` module was completely omitted from coverage measurement due to use of the coverage.py `include` setting to specify source files. Efforts were made to test the Uvicorn workers (encode/uvicorn#1834, encode/uvicorn#1995), but the workers were arbitrarily deprecated and moved to someone's personal project (encode/uvicorn#2302), instead of an Encode-managed project as would have been expected (encode/uvicorn#517 (comment)).

Rather than introducing a production dependency on a separate Uvicorn workers package that is not managed by Encode, this commit will add the Gunicorn workers directly to this project.

This commit will add the code from `uvicorn.workers` to a new module `inboard/gunicorn_workers.py`. The code will be preserved as it was prior to deprecation, with a copy of the Uvicorn license and necessary updates for compliance with the code quality settings in this project:

- Ruff [UP008](https://docs.astral.sh/ruff/rules/super-call-with-parameters/)
- Ruff [UP035](https://docs.astral.sh/ruff/rules/deprecated-import/)
- mypy `[attr-defined]` - Module "uvicorn.main" does not explicitly export attribute "Server"
- mypy `[import-untyped]` - `gunicorn.arbiter`
- mypy `[import-untyped]` - `gunicorn.workers.base`
- mypy `[misc]` - Class cannot subclass "Worker" (has type "Any")
- mypy `[type-arg]` - Missing type parameters for generic type "dict" (on `config_kwargs`)

This commit will also add tests of 100% of the Gunicorn worker code to a new module `tests/test_gunicorn_workers.py`. A test fixture starts a subprocess running Gunicorn with a Uvicorn worker and an ASGI app. The subprocess includes an instance of `httpx.Client` for HTTP requests to the Uvicorn worker's ASGI app, and saves its output to a temporary file for assertions on `stdout`/`stderr`. Tests can send operating system signals to the process.

The coverage.py configuration will be updated for subprocess test coverage measurement. Changes to coverage measurement include:

- Enable the required parallel mode (note that it is important to ensure the `.gitignore` ignores files named `.coverage.*` because many coverage files are generated when subprocesses are measured in parallel mode)
- Set the required `COVERAGE_PROCESS_START` environment variable
- Add the `coverage_enable_subprocess` package to invoke `coverage.process_startup`
- Combine coverage reports before reporting coverage
- Add instructions to `contributing.md` about how to omit subprocess tests

Related:

- https://github.com/encode/uvicorn/blob/4fd507718eb0313e2de66123e6737b054088f722/LICENSE.md
- https://github.com/encode/uvicorn/blob/4fd507718eb0313e2de66123e6737b054088f722/uvicorn/workers.py
- encode/uvicorn#517 (comment)
- encode/uvicorn#1834
- encode/uvicorn#1995
- encode/uvicorn#2302
- https://coverage.readthedocs.io/en/latest/subprocess.html
- https://docs.gunicorn.org/en/latest/design.html
- https://docs.gunicorn.org/en/latest/signals.html
- https://www.uvicorn.org/deployment/#gunicorn

tomchristie changed the title ~~Uvicorn stays alive when all workers are dead.~~ Master process should restart expired workers. Dec 10, 2019

tomchristie mentioned this issue Dec 18, 2019

[question] production configuration of gunicorn encode/starlette#642

Closed

jdavid mentioned this issue Dec 5, 2020

ASGI jdavid/django#2

Open

bkocis mentioned this issue Sep 28, 2021

Use gunicorn instead uvicorn in main fastapi/fastapi#3955

Closed

9 tasks

Kludex added this to the Version 1.0 milestone Sep 28, 2021

Kludex mentioned this issue Oct 2, 2021

Create ProcessManager #1205

Closed

Kludex closed this as completed May 15, 2022

Kludex removed this from the Version 1.0 milestone May 16, 2022

gnat mentioned this issue Mar 20, 2023

How to deploy starlette in production? encode/starlette#671

Closed

Kludex reopened this Mar 20, 2023

gnat mentioned this issue Apr 13, 2023

Simplified worker monitoring and restart, without gunicorn. #1942

Closed

br3ndonland mentioned this issue Jun 4, 2023

Test workers #1995

Closed

3 tasks

pseudotensor mentioned this issue May 26, 2024

Use gunicorn so dead workers restart unlike uvicorn h2oai/h2ogpt#1645

Merged

Kludex closed this as completed May 28, 2024

br3ndonland mentioned this issue Dec 22, 2024

Add and test Gunicorn workers br3ndonland/inboard#116

Merged
