Active podman process blocks system reboot/shutdown #14531
Comments
Thanks for reaching out, @1player. I don't think there is much Podman can do. |
I agree, it is best to call |
I wrote this in bugzilla too:
|
@rhatdan, I don't think that would help in this scenario. If there's a container running that does not adhere to sigterm/stop etc. Then systemd is blocked on the process. We could think of a |
Well, it would exit with a SIGKILL after 10 seconds. Having a podman-shutdown.service might make some sense and do a |
Stop timeout is also user-configurable, so someone could theoretically have a container with a stop timeout of 90 seconds to ensure their container always has time to perform its safe shutdown routine, but that would still stall the system for 90 seconds on shutdown, potentially. |
I think the I have switched to |
Sorry for double posting, but please also note this comment of mine from https://bugzilla.redhat.com/show_bug.cgi?id=2081664#c2
Is this happening because of podman "refusing to clean up"? |
I would expect the tools to manage the containers and call
That may explain why the containers are still running: |
Is this issue still being tracked? This is quite an annoying bug; is there a workaround? |
The issue in 89luca89/distrobox#340 looks different than the one discussed here:
I do not know what distrobox does but it needs to exit from all exec sessions before. At the moment, I don't see how this relates to the initial bug here when the container ignores a signal and gets killed after a grace period. |
This is not limited to distrobox. podman exhibits the exact same behaviour. It seems that running some applications inside the container puts it in a state that podman/conmon refuses to stop it gracefully upon system shutdown. I run emacs and pretty much all my dev tools inside a distrobox container, and most times it hangs on shutdown, but sometimes it doesn't. I do not understand, as explained in #14531 (comment), why running Maybe it is caused by subshells spawned inside the container, which causes podman to refuse terminating it, hence the delay until SIGKILL is called. |
It is your container process that is not responding to the signal; AFAIK shells do not shut down on SIGTERM. |
Are you saying that this is a toolbox and distrobox bug, and not podman? |
Yes. What is podman/systemd supposed to do when your container process does not shut down on a normal stop signal, i.e. SIGTERM? The only thing to do is to wait and send SIGKILL after the timeout. You can change the stop signal and timeout with |
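The stop sequence described in the comment above (send the stop signal, wait for the timeout, then send SIGKILL) can be illustrated with a small Python sketch. This is not Podman's code. `stop_with_grace`, `demo`, and `CHILD_CODE` are hypothetical names made up for this example, and the child process only stands in for a container process that ignores SIGTERM.

```python
import signal
import subprocess
import sys

# Child that ignores SIGTERM, standing in for a container process
# (for example an interactive shell) that does not exit on the stop signal.
CHILD_CODE = (
    "import signal, time\n"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)


def stop_with_grace(proc, timeout):
    """Send SIGTERM, wait up to `timeout` seconds, then fall back to SIGKILL."""
    proc.send_signal(signal.SIGTERM)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # The process did not exit within the grace period: kill it forcefully.
        proc.kill()
        return proc.wait()


def demo(timeout=1.0):
    """Start the SIGTERM-ignoring child and stop it; returns its exit status."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE], stdout=subprocess.PIPE
    )
    proc.stdout.readline()  # wait until the child has installed its handler
    status = stop_with_grace(proc, timeout)
    proc.stdout.close()
    return status
```

On a POSIX system, calling `demo()` blocks for roughly the full grace period before the child is killed, which is the same kind of stall that shows up at shutdown when the timeout is long.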
Sorry for being obtuse, but then why podman just throws its hands in the air and says |
I think you have to stop all exec sessions first; not sure if podman stop should do that. @mheon might know better? |
Podman stop should do it. This is probably a distinct issue. Open a new bug with the full template filled out, please. |
Is it really a distinct issue? As I described above, this seems to be the cause of this problem. Podman refusing to stop a container because "it has active exec session", thus causing issues with toolbox, thus causing shutdown issues. There are no particular logs to see, except that upon shutdown, journalctl points out that Here's the gist of it: a podman container should always be able to be stopped, except in case of an unresponsive process, which I would expect |
Are you certain Podman is refusing to stop the container? That error message doesn't read as a stop error to me, but a cleanup error. The container should have exited at this point, Podman is just having trouble cleaning up after it. |
Given this, it definitely smells like a different issue. Podman is seemingly having trouble handling cleanup on containers as the system shuts down, which is distinct from this issue where Podman takes a long time to kill containers that refuse to gracefully exit, causing shutdown to hang. |
A friendly reminder that this issue had no activity for 30 days. |
@vrothberg @giuseppe @mheon Do any of the fixups made recently to deadlocks address this issue? |
I suspect not.
|
BTW, Fedora is supposed to shorten the timeout before unresponsive processes are SIGKILLed from 2 minutes down to 15 seconds, so if this is still open when that change ships, users won't notice anything during shutdown but containers will still be killed forcefully. As a big toolbox/distrobox user, I get this issue 4 out of every 5 times I reboot my workstation, and I don't keep any long running services inside the container. |
A friendly reminder that this issue had no activity for 30 days. |
This is still an issue and making life on Fedora Silverblue more painful than it needs to be. |
@1player can you share the exact systemd unit that you run Podman in? |
@vrothberg, you can test it with this container service on Silverblue. It takes 2 min to reboot/shutdown.
PS: This is the official syncthing container; I didn't add any volumes or published ports. Dockerfile: https://github.com/syncthing/syncthing/blob/main/Dockerfile The only way to reboot this systemd container service without waiting is to use
# autogenerated by Podman 4.3.1
# Tue Dec 6 16:27:12 +03 2022
[Unit]
Description=Podman syncthing-test.service
Documentation=man:podman-generate-systemd(1)
Wants=network-online.target
After=network-online.target
RequiresMountsFor=%t/containers
[Service]
Environment=PODMAN_SYSTEMD_UNIT=%n
Restart=no
TimeoutStopSec=70
ExecStartPre=/bin/rm \
-f %t/%n.ctr-id
ExecStart=/usr/bin/podman run \
--cidfile=%t/%n.ctr-id \
--cgroups=no-conmon \
--rm \
--sdnotify=conmon \
--replace \
--detach \
--name syncthing-test docker.io/syncthing/syncthing
ExecStop=/usr/bin/podman stop \
--ignore -t 10 \
--cidfile=%t/%n.ctr-id
ExecStopPost=/usr/bin/podman rm \
-f \
--ignore -t 10 \
--cidfile=%t/%n.ctr-id
Type=notify
NotifyAccess=all
[Install]
WantedBy=default.target |
Thanks for sharing, @queeup! I will take a look tomorrow. It's surprising to me as the stop-timeout is set to 10. So the container should - in theory - be killed after 10 seconds. |
I can reproduce |
The image ships a health check (see below) so Podman will run it on container start. But even a simple
"Healthcheck": {
"Test": [
"CMD-SHELL",
"nc -z 127.0.0.1 8384 || exit 1"
],
"Interval": 60000000000,
"Timeout": 10000000000
}, |
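The `Interval` and `Timeout` values in the health check config above are large integers. Assuming they are nanosecond counts, as is conventional for health check durations in image configs, a short Python sketch converts them to readable durations. `ns_to_timedelta` is a hypothetical helper written for this example.

```python
from datetime import timedelta


def ns_to_timedelta(nanoseconds):
    """Convert an integer nanosecond count to a timedelta (microsecond precision)."""
    return timedelta(microseconds=nanoseconds // 1000)


# Values copied from the health check config shown above.
interval = ns_to_timedelta(60000000000)
timeout = ns_to_timedelta(10000000000)
```

Under that assumption, the health check runs every 60 seconds and each run is allowed 10 seconds.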
I wish I had found more time to work on this bug. One thing I noticed while debugging is that we're stuck on stopping the transient health-check timer. I hope to find some time tomorrow. |
When stopping the transient systemd timer/unit which powers running health checks, make sure to ignore its dependencies. It turns out that we're otherwise running into a timeout when running a container in a systemd unit and reboot.
An alternative may be to further tweak some attributes/options when creating the timer/unit via systemd-run but it seems safe to just ignore the dependencies and stop.
[NO NEW TESTS NEEDED] - we don't yet have means to test reboots.
Fixes: containers#14531
Signed-off-by: Valentin Rothberg <[email protected]>
#16785 fixes the issue and will make it into Podman 4.4. |
Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind bug
Description
An active podman process is unable to be cleanly stopped by systemd reboot/shutdown, and thus has to be killed after the 2min grace period expires.
Steps to reproduce the issue:
podman run -it docker.io/library/busybox
sleep infinity
Describe the results you received:
Shutdown procedure hangs for ~2 minutes because podman can't be stopped. Then podman is killed and shutdown is complete.
Describe the results you expected:
The podman container to be cleanly terminated as the system shuts down.
Package info (e.g. output of rpm -q podman or apt list podman):
podman-4.1.0-1.fc36.x86_64
Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)
Yes
Additional environment details (AWS, VirtualBox, physical, etc.):
Experienced this issue on Fedora Workstation 36 and Fedora Silverblue 36.
Downstream bug reports: