KEP: New pod restartPolicy to restart the whole pod instead of just a container #2342
Conversation
/sig node
/ok-to-test
We have a wrong suggestion in the verify test.
Do you really want to restart all containers when any container exits?
We've discussed in the past the idea of a "keystone" or "primary" container which would govern the whole pod's lifecycle. For restart-all pods, this would be the one that would trigger the teardown. For jobs, this would be the one that exits, and everything else can be assumed to be complete.
@smarterclayton another case of desire for sub-pod orchestration
- "@smarterclayton"
approvers:
- "@liggitt"
- "@derekwaynecarr"
@thockin No. I agree, it may not be desirable to restart the pod if any container exits. I like the keystone or primary container approach. It would be better than my proposal here. Is there any consensus on this approach?
I definitely see a need for "exit all containers if primary container
exits: a Job with a sidecar.
So, that would be another reason to define a primary.
…On Thu, Jul 5, 2018 at 9:16 PM Amshuman K R ***@***.***> wrote:
@thockin <https://github.com/thockin> No. I agree, it may not be
desirable to restart the pod if any container exits. I like the *keystone*
or *primary* container approach. It would be better than my proposal
here. Is there any consensus on this approach?
keps/sig-node/draft-20180702-pod-restartpolicy-to-restart-whole-pod.md
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: If they are not already assigned, you can assign the PR to them by writing The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
This KEP seems to be the relevant one for primary/sidecar container feature.
@thockin @erictune I raised this requirement of re-executing init-containers when the primary or any non-sidecar container terminates on KEP 0008. But from the response I understand that such a requirement is currently not considered there. What would you recommend? Is there any chance this requirement can be added to KEP 0008? Or should I update this KEP as an additional requirement on KEP 0008?
/kind kep
/cc @kow3ns
@@ -0,0 +1,125 @@
---
kep-number: 0
Can you please update the KEP to the latest and increment that file?
@mattfarina Thanks for the review!
Actually, I am in favour of closing this KEP if KEP 0008 can be expanded to include the case of `RestartPolicyAlways`. That is, if there are some sidecar containers and one or more primary containers and the `restartPolicy` is `Always`, then if any of the primary containers terminates, the whole pod is restarted; not just the terminated container.
If the above requirement can be included in the scope of KEP 0008 then this KEP would be redundant.
@derekwaynecarr @dchen1107 out of curiosity, where does this stand in sig-node right now?
FYI, we've added this to the agenda for Monday's SIG Apps to discuss.
But both the `OnFailure` as well as the `Always` restart policies restart the individual containers in question and not the whole pod. This is, for the most part, desirable, even optimal.

However, there are scenarios (some documented in [this issue][issue]) where the many containers in the pod (including init containers) might be interlinked or inter-dependent in such a way as to require closer co-ordination when any one of its containers is restarted.
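For context, a minimal pod spec showing where the existing `restartPolicy` field sits (the pod name and image below are illustrative, not from the KEP):

```yaml
# Today, restartPolicy governs how the kubelet restarts *individual*
# containers; none of its values ever restarts the pod as a whole.
apiVersion: v1
kind: Pod
metadata:
  name: example-pod              # illustrative name
spec:
  restartPolicy: Always          # valid values: Always, OnFailure, Never
  containers:
  - name: app
    image: example.com/app:latest   # illustrative image
```

The KEP's proposal is about adding a value to this field that would change the restart scope from a single container to the whole pod.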
@amshuman-kr We discussed this in SIG Apps this week. The motivation for adding this was not completely understood. Could you please expand on user stories (see the template) that outline situations that this would be used for. This will help others understand the need and how this can help with it.
REMINDER: KEPs are moving to k/enhancements on November 30. Please attempt to merge this KEP before then to signal consensus. Any questions regarding this move should be directed to that thread and not asked on GitHub.
### Risks and Mitigations

The `restartPolicy` value `AlwaysPod` would be a new value for an existing field in the pod specification. So, the question of backward compatibility may not apply.
This would not be backwards compatible for old kubelets, so we can’t change restartPolicy in v1 of the api. Something like “restartAffects: Pod” or a lifecycle hook would be necessary. It’s really orthogonal to restart policy and could be used with OnFailure or Always.
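The `restartAffects` field suggested above does not exist in the Kubernetes API; this is only a sketch of the shape that comment describes, with an illustrative image name:

```yaml
# HYPOTHETICAL: restartAffects is not a real Kubernetes field; it sketches
# the alternative suggested in the comment above, where the restart scope
# is expressed separately from (orthogonally to) restartPolicy.
apiVersion: v1
kind: Pod
metadata:
  name: restart-scope-example     # illustrative name
spec:
  restartPolicy: OnFailure        # existing field, unchanged semantics
  restartAffects: Pod             # hypothetical: restart the whole pod,
                                  # not just the failed container
  containers:
  - name: app
    image: example.com/app:latest # illustrative image
```

Keeping the scope in a separate field would let old kubelets, which ignore unknown fields, fall back to per-container restarts instead of misinterpreting a new `restartPolicy` value.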
KEPs have moved to k/enhancements. Any questions regarding this move should be directed to that thread and not asked on GitHub.
@justaugustus: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Just to clarify: this PR was simply closed and not moved anywhere?
I have a use case where the initContainer creates a one-time-use token that the application in the regular container consumes at startup. If the regular container is restarted, it needs the initContainer to restart as well in order to generate a new token. This could solve my problem. @mattfarina How can I get this issue re-visited?
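An illustration of that one-time-token use case (all names, images, and paths here are hypothetical):

```yaml
# HYPOTHETICAL illustration: the init container writes a one-time token
# that the main container consumes at startup. If the kubelet restarts
# only the main container, the token is already spent and startup fails;
# a whole-pod restart would re-run the init container and mint a new one.
apiVersion: v1
kind: Pod
metadata:
  name: token-example              # hypothetical name
spec:
  restartPolicy: Always
  initContainers:
  - name: token-generator          # hypothetical
    image: example.com/token-gen   # hypothetical image
    volumeMounts:
    - name: token
      mountPath: /token            # writes the one-time token here
  containers:
  - name: app
    image: example.com/app         # hypothetical image
    volumeMounts:
    - name: token
      mountPath: /token            # reads the token at startup
      readOnly: true
  volumes:
  - name: token
    emptyDir: {}                   # survives container restarts, so the
                                   # stale token persists without a pod restart
```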
This happens to line up with the larger topic of pod and container lifecycle. If we are to pursue this idea we need a shepherd from sig-node who can make sure the overall lifecycle is considered.
I have a use case of a statefulset where, when one container fails, it takes hours to become healthy again, but if I restart the whole pod on failure it recovers within minutes.
Restarting the pod if a container fails would be helpful in cases where something is wrong at the Kubernetes level, not in the application. For example: kubernetes/kubernetes#105933.
But, regardless of the nature of the case, users are more likely to restart the pod than to look for how to solve the problem in the configuration or application, which is not good.
@amshuman-kr do you want to open an issue for this in k/enhancements?
@szh The discussions on container life-cycle enhancement were too fragmented and unfortunately, I didn't have the bandwidth to follow up across various semi-related discussions. Hence, I have abandoned this enhancement.
OK. I just added kubernetes/enhancements#3676 to track it.
A KEP for the PR that addresses the kubernetes issue #52345.