Nomad Restart of Unhealthy Containers #876
@vrenjith Not yet, but this is on our roadmap.
Any updates on where in the roadmap this is?
Any news?
Hey, no update on this quite yet. We are refactoring the way we do Consul registrations in 0.5.X. This will make it easier to add new features like this.
Hi,
Related to #164. |
It would be great to add an option to restart some of the containers while leaving others in the failing state, and to add a timeout for the restart, for example during long-running operations when a container should not accept client connections but should stay active and be killed only after a deadline.
Here is a workaround. What I've done:
Consul check for the main app
With Python and python-consul that was quite simple. Any custom restart logic is possible here.
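A workaround along these lines can be sketched as follows. This is a minimal, hedged example, not the commenter's actual script: it assumes a local Consul agent on the default port, and the service and container names (as well as restarting via `docker restart`) are placeholders; any custom restart logic could go in their place.

```python
def has_critical_check(entry):
    """Return True if any Consul health check in this service entry is critical."""
    return any(check["Status"] == "critical" for check in entry.get("Checks", []))

def restart_if_unhealthy(service_name, container_name):
    """Poll Consul for the service's health and restart the container if any
    check is critical. Both names are hypothetical placeholders."""
    import subprocess
    import consul  # pip install python-consul

    c = consul.Consul()  # assumes a local Consul agent on 127.0.0.1:8500
    _index, entries = c.health.service(service_name)
    if any(has_critical_check(entry) for entry in entries):
        # Placeholder restart action; swap in whatever restart logic you need.
        subprocess.run(["docker", "restart", container_name], check=True)
        return True
    return False
```

Running this in a loop (or from cron) gives a crude external supervisor until Nomad supports restarting unhealthy allocations natively.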
+1
From @samart:
@dadgar Is there a way to manually resolve/restart an unhealthy allocation? I currently have one marked unhealthy, while it is perfectly responsive (as Consul also shows). When I run …, how can I get Nomad to re-evaluate the allocation status?
@tino Currently there is no way to restart a particular allocation. Further, I think the plan is showing that because you likely have count = 2 and max_parallel = 1: it will do one at a time, but it will replace all of them.
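The count/max_parallel combination described above looks roughly like this in a job file. This is a minimal sketch, not the poster's actual job; the job and group names are made up:

```hcl
job "example" {
  # Replace one allocation at a time during an update.
  update {
    max_parallel = 1
    stagger      = "10s"
  }

  group "app" {
    # Two allocations total, so a plan shows both being replaced,
    # even though only one is touched at any moment.
    count = 2
  }
}
```

With this configuration, a plan reports all allocations as affected because every one will eventually be replaced, just serially rather than at once.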
This function is critical! It would also be great to see restart limits for the whole cluster, to prevent a situation where a service is overloaded and can't handle all requests, but a massive restart might cause more problems, so you need to restart services one by one.
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues. |
Assuming that `nomad` is configured with a `check` block while registering with `consul`, so that Consul does a health check of the containers: in this scenario, if Consul reports that one of the containers is not healthy, will `nomad` restart/reschedule those containers?
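The kind of registration being asked about looks roughly like this in a Nomad task. This is a minimal sketch for context only; the task name, port label, and health-check path are hypothetical:

```hcl
task "web" {
  driver = "docker"

  service {
    name = "web"
    port = "http"

    # Consul performs this check against the running container.
    check {
      type     = "http"
      path     = "/health"
      interval = "10s"
      timeout  = "2s"
    }
  }
}
```

The question, then, is whether Nomad acts on the resulting Consul check status, or whether the status is only informational.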