Question: How to Gracefully Shutdown Workers? #190
Comments
After digging into the source code a bit more, it seems like the If so, then I suppose it'd be a matter of configuring Kubernetes to have a long enough
Hi, you might find this article helpful. https://ellispritchard.medium.com/graceful-shutdown-on-kubernetes-with-signals-erlang-otp-20-a22325e8ae98 |
Thanks @jeremyowensboggs, that works nicely for apps exposed via a Kubernetes Ingress or Service. Failing the health check probes automatically prevents more incoming traffic, which allows processes to drain. It's still unclear to me how to achieve a similar result with |
Just checking back in: is there a way to configure how long a job has to finish its work after the workers receive an exit/down signal, before it's forcefully terminated? |
Hi, no, that hasn't been built in to the library yet. |
I am open to a PR on this. |
Thanks @Ch4s3, I'll see if I can't cook one up one of these nights/weekends when I find some spare time. I think it'd be a great addition to the library. |
awesome, let us know. I'll try to keep an eye out |
Hello, this is partially an Elixir question (and please forgive me for my ignorance) and partially (I believe) a question particularly relevant to how this library is structured.
tl;dr: I'm looking for guidance on the best way to take a SIGTERM or System.stop() and then, in that general order: switch the disable_fetch config setting from false to true, live; let the FaktoryWorker.Job processes finish their work; and shut down the FaktoryWorker supervisor.
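The ordering in the tl;dr can be sketched with plain OTP, independent of this library. This is a minimal sketch under assumptions: ShutdownGate is a hypothetical helper (not part of faktory_worker), and it only helps if whatever fetches jobs checks ShutdownGate.draining?/0 before each fetch. Because supervisors stop children in reverse start order, listing the gate last among the children means its terminate/2 runs first on shutdown, before any worker is asked to stop.

```elixir
defmodule ShutdownGate do
  # Hypothetical helper for illustration; not part of faktory_worker.
  use GenServer

  @key {__MODULE__, :draining}

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  # A fetch loop would call this before asking the server for another job.
  def draining?, do: :persistent_term.get(@key, false)

  @impl true
  def init(_opts) do
    # Trap exits so the supervisor's :shutdown signal reaches terminate/2.
    Process.flag(:trap_exit, true)
    :persistent_term.put(@key, false)
    {:ok, nil}
  end

  @impl true
  def terminate(_reason, _state) do
    # Flip the flag as the very first step of shutdown.
    :persistent_term.put(@key, true)
    :ok
  end
end
```

In an application's supervision tree this would be the last entry in the children list, after the workers, so that it is the first child stopped when the VM handles SIGTERM or System.stop().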
Long version:
I run a Faktory client on Kubernetes and perform rolling updates where a new pod/container spins up and then Kubernetes sends a SIGTERM to the old pod/container telling it to shut down.
My goal is to have the old container gracefully finish in-flight jobs before shutting down. My jobs involve transferring large amounts of data. Arbitrarily killing/restarting during deploys can cause downstream problems (resumable streams are not always an option).
Excluding all of the Kubernetes configs to make graceful, rolling deployments happen, I'd like to know how to take a SIGTERM to my Elixir application and in turn disable the fetching of new jobs from Faktory, as well as trap the kill message in my FaktoryWorker.Job processes to give them enough time to finish.
All the examples I've found online for doing this relate to simple GenServers and I'm not positive how to translate them to this library.
For context:
This video shows my desired outcome, but for HTTP requests and GenServers. https://www.youtube.com/watch?v=cbCgB9F6RrM
This forum post talks about trapping exit messages, but I think this library hardcodes a :brutal_kill signal, which wouldn't work? https://elixirforum.com/t/graceful-shutdown-on-sigterm/23780/2
Any and all help is appreciated!
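To make the :brutal_kill concern concrete, here is a minimal sketch using only standard OTP. DrainDemo and DrainDemo.Worker are hypothetical stand-ins, not this library's modules, and whether the library really hardcodes :brutal_kill is my assumption from reading the source. The worker traps exits and pretends to finish in-flight work in terminate/2. With a numeric shutdown timeout in the child spec the supervisor waits for it; with :brutal_kill the process is killed and terminate/2 never runs.

```elixir
defmodule DrainDemo.Worker do
  # Hypothetical stand-in for a job process; not part of faktory_worker.
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    # Trapping exits lets the supervisor's :shutdown signal invoke terminate/2.
    Process.flag(:trap_exit, true)
    {:ok, %{report_to: Keyword.fetch!(opts, :report_to)}}
  end

  @impl true
  def terminate(reason, state) do
    # Simulate finishing in-flight work before exiting.
    Process.sleep(200)
    send(state.report_to, {:drained, reason})
    :ok
  end
end

defmodule DrainDemo do
  # Starts one worker under a supervisor with the given shutdown value,
  # stops the supervisor, and reports whether the worker got to drain.
  def run(shutdown) do
    child = %{
      id: DrainDemo.Worker,
      start: {DrainDemo.Worker, :start_link, [[report_to: self()]]},
      shutdown: shutdown
    }

    {:ok, sup} = Supervisor.start_link([child], strategy: :one_for_one)
    :ok = Supervisor.stop(sup)

    receive do
      {:drained, reason} -> {:drained, reason}
    after
      500 -> :killed
    end
  end
end
```

Calling DrainDemo.run(5_000) gives the worker up to five seconds to finish, while DrainDemo.run(:brutal_kill) skips terminate/2 entirely, which is why trapping exits alone would not help if the child spec uses :brutal_kill.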