-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
generate systemd: do not set KillMode
#8889
Conversation
@giuseppe @msekletar PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
KillMode
KillMode
Marked as WIP as I want to perform some manual tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: giuseppe, rhatdan, vrothberg The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
`KillMode=none` has been deprecated in systemd and is now throwing big warnings when being used. Users have reported the issues upstream (see containers#8615) and on the mailing list. This deprecation was mainly motivated by an abusive use of third-party vendors causing all kinds of undesired side-effects. For instance, busy mounts that delay reboot. After talking to the systemd team, we came up with the following plan: **Short term**: we can use TimeoutStopSec and remove KillMode=none which will default to cgroup. **Long term**: we want to change the type to sdnotify. The plumbing for Podman is done but we need it for conmon. Once sdnotify is working, we can get rid of the pidfile handling etc. and let Podman handle it. Michal Seklatar came up with a nice idea that Podman increase the time out on demand. That's a much cleaner way than hard-coding the time out in the unit as suggest in the short-term solution. This change is executing the short-term plan and sets a minimum timeout of 60 seconds. User-specified timeouts are added to that. Fixes: containers#8615 Signed-off-by: Valentin Rothberg <[email protected]>
KillMode
KillMode
@containers/podman-maintainers PTAL |
/lgtm |
ExecStartPre=/bin/rm -f %t/jadda-jadda.pid %t/jadda-jadda.ctr-id | ||
ExecStart=/usr/bin/podman run --conmon-pidfile %t/jadda-jadda.pid --cidfile %t/jadda-jadda.ctr-id --cgroups=no-conmon -d --replace --name jadda-jadda --hostname hello-world awesome-image:latest command arg1 ... argN "foo=arg \"with \" space" | ||
ExecStop=/usr/bin/podman stop --ignore --cidfile %t/jadda-jadda.ctr-id -t 42 | ||
ExecStop=/usr/bin/podman stop --ignore --cidfile %t/jadda-jadda.ctr-id -t 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have TimeoutStopSec
now, what's the point of leaving ExecStop
then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want Podman to take care of stopping and to clean up properly. TimeoutStopSec
is the big hammer of systemd to nuke everything in the cgroup and may leave resources behind.
Currently with KillMode=none, if podman -t hangs, the command repeats leaving the process and its cgroup around. But if a stop/start command hangs, we should not start another one. Instead time it out properly via TimeoutStopSec set to the 2x of the timeout given to the managed podman action. Then if it expires, kill its cgroup all the way (KillMode=control-group is a default) before rerunning the same operation. Also note that using KilMode process is not recommended by systemd man pages. Increase the grace stop timeout defaults 10->42s to align it with: containers/podman#8889 Closes-bug: #1945791 Change-Id: Iefe861f91cefe2a9cf773cae98b2440566ae8b5e Signed-off-by: Bogdan Dobrelya <[email protected]>
* Update tripleo-ansible from branch 'master' to badd5a3c10f0af9e1c69a9b88064e743a36d4f53 - Wrap stopping podman -t with systemd timeouts Currently with KillMode=none, if podman -t hangs, the command repeats leaving the process and its cgroup around. But if a stop/start command hangs, we should not start another one. Instead time it out properly via TimeoutStopSec set to the 2x of the timeout given to the managed podman action. Then if it expires, kill its cgroup all the way (KillMode=control-group is a default) before rerunning the same operation. Also note that using KilMode process is not recommended by systemd man pages. Increase the grace stop timeout defaults 10->42s to align it with: containers/podman#8889 Closes-bug: #1945791 Change-Id: Iefe861f91cefe2a9cf773cae98b2440566ae8b5e Signed-off-by: Bogdan Dobrelya <[email protected]>
* Update tripleo-ansible from branch 'master' to fdfc59cd1d1024d132e6046f3ba4746fed23c8c2 - Remove execpost and align systemd timeouts Align systemd timeouts to podman[1]. Remove stop command with ExecStopPost command, because it won't help much to run same command twice if it's stuck. Set default time for stop command to 50 if not defined otherwise. In case it's stuck, systemd will nuke everything (TimeoutStopSec=60) 60 seconds = 50 seconds of trying to stop + 10 seconds for avoiding the race. [1] containers/podman#8889 Change-Id: Ife90b6c45b7f4a3fd829b60f9aa785fc580a31b8
Align systemd timeouts to podman[1]. Remove stop command with ExecStopPost command, because it won't help much to run same command twice if it's stuck. Set default time for stop command to 50 if not defined otherwise. In case it's stuck, systemd will nuke everything (TimeoutStopSec=60) 60 seconds = 50 seconds of trying to stop + 10 seconds for avoiding the race. [1] containers/podman#8889 Change-Id: Ife90b6c45b7f4a3fd829b60f9aa785fc580a31b8
Currently with KillMode=none, if podman -t hangs, the command repeats leaving the process and its cgroup around. But if a stop/start command hangs, we should not start another one. Instead time it out properly via TimeoutStopSec set to the 2x of the timeout given to the managed podman action. Then if it expires, kill its cgroup all the way (KillMode=control-group is a default) before rerunning the same operation. Also note that using KilMode process is not recommended by systemd man pages. Increase the grace stop timeout defaults 10->42s to align it with: containers/podman#8889 Closes-bug: #1945791 Change-Id: Iefe861f91cefe2a9cf773cae98b2440566ae8b5e Signed-off-by: Bogdan Dobrelya <[email protected]> (cherry picked from commit badd5a3)
Currently with KillMode=none, if podman -t hangs, the command repeats leaving the process and its cgroup around. But if a stop/start command hangs, we should not start another one. Instead time it out properly via TimeoutStopSec set to the 2x of the timeout given to the managed podman action. Then if it expires, kill its cgroup all the way (KillMode=control-group is a default) before rerunning the same operation. Also note that using KilMode process is not recommended by systemd man pages. Increase the grace stop timeout defaults 10->42s to align it with: containers/podman#8889 Closes-bug: #1945791 Change-Id: Iefe861f91cefe2a9cf773cae98b2440566ae8b5e Signed-off-by: Bogdan Dobrelya <[email protected]> (cherry picked from commit badd5a3)
Currently with KillMode=none, if podman -t hangs, the command repeats leaving the process and its cgroup around. But if a stop/start command hangs, we should not start another one. Instead time it out properly via TimeoutStopSec set to the 2x of the timeout given to the managed podman action. Then if it expires, kill its cgroup all the way (KillMode=control-group is a default) before rerunning the same operation. Also note that using KilMode process is not recommended by systemd man pages. Increase the grace stop timeout defaults 10->42s to align it with: containers/podman#8889 Closes-bug: #1945791 Change-Id: Iefe861f91cefe2a9cf773cae98b2440566ae8b5e Signed-off-by: Bogdan Dobrelya <[email protected]> (cherry picked from commit badd5a3)
Currently with KillMode=none, if podman -t hangs, the command repeats leaving the process and its cgroup around. But if a stop/start command hangs, we should not start another one. Instead time it out properly via TimeoutStopSec set to the 2x of the timeout given to the managed podman action. Then if it expires, kill its cgroup all the way (KillMode=control-group is a default) before rerunning the same operation. Also note that using KilMode process is not recommended by systemd man pages. Increase the grace stop timeout defaults 10->42s to align it with: containers/podman#8889 Closes-bug: #1945791 Change-Id: Iefe861f91cefe2a9cf773cae98b2440566ae8b5e Signed-off-by: Bogdan Dobrelya <[email protected]> (cherry picked from commit badd5a3c10f0af9e1c69a9b88064e743a36d4f53)
1) Currently with KillMode=none, if podman -t hangs, the command repeats leaving the process and its cgroup around. But if a stop/start command hangs, we should not start another one. Instead time it out properly via TimeoutStopSec set to the 2x of the timeout given to the managed podman action. Then if it expires, kill its cgroup all the way (KillMode=control-group is a default) before rerunning the same operation. Also note that using KilMode process is not recommended by systemd man pages. Increase the grace stop timeout defaults 10->42s to align it with: containers/podman#8889 2) Make KillMode configurable for service units Libvirt container is a special beast that needs custom killmode value for its tripleo-manager service unit. Add kill_mode for the container values that defaults to control-group. 3) Fix TimeoutStopSec to integer convertion Related: rhbz#2010135 Closes-bug: #1945791 Change-Id: Iefe861f91cefe2a9cf773cae98b2440566ae8b5e Signed-off-by: Bogdan Dobrelya <[email protected]> (cherry picked from commit badd5a3)
KillMode=none
has been deprecated in systemd and is now throwing bigwarnings when being used. Users have reported the issues upstream
(see #8615) and on the mailing list.
This deprecation was mainly motivated by an abusive use of third-party
vendors causing all kinds of undesired side-effects. For instance, busy
mounts that delay reboot.
After talking to the systemd team, we came up with the following plan:
Short term: we can use TimeoutStopSec and remove KillMode=none which
will default to cgroup.
Long term: we want to change the type to sdnotify. The plumbing for
Podman is done but we need it for conmon. Once sdnotify is working, we
can get rid of the pidfile handling etc. and let Podman handle it.
Michal Seklatar came up with a nice idea that Podman increase the time
out on demand. That's a much cleaner way than hard-coding the time out
in the unit as suggest in the short-term solution.
This change is executing the short-term plan and sets a minimum timeout
of 60 seconds. User-specified timeouts are added to that.
Fixes: #8615
Signed-off-by: Valentin Rothberg [email protected]