simplify podman systemd generate - remove cidfile #13236
Comments
With ExecStop removed, TimeoutStopSec=70 is probably not needed either. The default KillMode=control-group should give the container and conmon a fair chance to terminate cleanly. A systemd-aware entry point can also send EXTEND_TIMEOUT_USEC if it so desires. |
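As a rough illustration of the EXTEND_TIMEOUT_USEC idea, here is a minimal Python sketch of how a systemd-aware entry point might ask for more time. It assumes the service runs with Type=notify and that systemd has exported NOTIFY_SOCKET; the function name extend_stop_timeout is hypothetical and not part of Podman or conmon.

```python
import os
import socket


def extend_stop_timeout(extra_usec: int) -> bool:
    """Ask systemd for more time by sending EXTEND_TIMEOUT_USEC.

    Returns False when NOTIFY_SOCKET is unset (not running under a
    notify-capable service manager), True once the datagram is sent.
    """
    notify_path = os.environ.get("NOTIFY_SOCKET")
    if not notify_path:
        return False
    # A leading "@" denotes a Linux abstract-namespace socket.
    if notify_path.startswith("@"):
        notify_path = "\0" + notify_path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(f"EXTEND_TIMEOUT_USEC={extra_usec}".encode(), notify_path)
    return True
```

The extension is not permanent: the process has to send the message again (or finish) before the requested interval runs out, so a long shutdown would call this repeatedly.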
Thanks for opening the issue, @grooverdan! Your suggestion sounds good to me. One thing we'd need is the |
Please be aware of the consequences. We had this before and it did not work correctly, #11315 There are a number of problems when we do not use
|
Very fair point, I didn't consider this yet. The Podman clean-up process created by conmon should probably run outside the unit's cgroup. |
Also see #11304 (comment) |
Right so, https://github.com/containers/conmon/blob/e2215a1c4c01c25f2fc1206ad4df012d10374b99/src/ctr_exit.c#L222 as the SIGTERM handler should So addressing the list:
|
I think this should be enough. @giuseppe WDYT? |
I don't think so. If you have a process that does not respond to SIGTERM, systemd will wait for the timeout and then send SIGKILL to the main PID, conmon. This will cause conmon to exit, but the container process will keep running AFAICT. The cleanup process will never be started. We also cannot use SendSIGKILL=no, because otherwise the process will never be terminated if SIGTERM is ignored
|
I concur. We need systemd to be able to nuke when needed. |
how can we do that? Create a new systemd scope? |
Do you think that could work? |
But this would be too late, no? conmon has to spawn the cleanup process, but if conmon is killed with SIGKILL this is not possible. |
could we move the cleanup to a |
If things go south, yes.
Then we need to communicate the ID of the container somewhere and would need the --cidfile again. It seems that it's not as straightforward as I'd wish it could be. In the end, it would "just" be a workaround for systemd. systemd would still reject the mainPID being sent by conmon. |
A friendly reminder that this issue had no activity for 30 days. |
@grooverdan @vrothberg What should we do with this issue? |
Is an early spawn of the cleanup process possible that activates when the parent process dies? I normally see cases of SIGCHLD but not the other way around.
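One Linux mechanism that matches the "other way around" case is the parent-death signal set with prctl(PR_SET_PDEATHSIG). The Python sketch below only shows how a pre-spawned watcher process could be armed so that it is signalled when its parent dies. Nothing in this thread says conmon works this way, and the names spawn_cleanup_watcher and _arm_parent_death_signal are hypothetical. Note that a watcher living in the unit's cgroup could still be killed by systemd along with everything else.

```python
import ctypes
import signal
import subprocess

PR_SET_PDEATHSIG = 1  # option value from <linux/prctl.h>

# Load libc before forking so the child does not have to.
_libc = ctypes.CDLL(None, use_errno=True)


def _arm_parent_death_signal(sig: int) -> None:
    """Runs in the child between fork and exec: request `sig` on parent death."""
    if _libc.prctl(PR_SET_PDEATHSIG, ctypes.c_ulong(sig), 0, 0, 0) != 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_SET_PDEATHSIG) failed")


def spawn_cleanup_watcher(argv, sig=signal.SIGTERM, **popen_kwargs):
    """Start `argv` early; the kernel sends it `sig` when the spawning thread dies.

    The setting survives execve() for ordinary (non-setuid) binaries.
    """
    return subprocess.Popen(
        argv,
        preexec_fn=lambda: _arm_parent_death_signal(int(sig)),
        **popen_kwargs,
    )
```

The watcher program itself would install a handler for the chosen signal and run the cleanup from there; that part is left out here.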
Doesn't But if this is all too hard/messy for little gain I guess we can just close this. |
I don't think we should change the The main challenge is to find a way to prevent the clean-up process from being killed by systemd. @msekletar, do you have suggestions? The problem in a nutshell:
Is there a way to wait for a child process of conmon until systemd would nuke everything? |
would the gain just be to not use an external file? If so, I agree we should probably close this issue since it seems there are no valid alternatives |
Another mechanism that might be useful is to store information with memfd_create() and sd_pid_notify_with_fds(). I don't know if that mechanism could be helpful for this issue, but I think it's worth mentioning. I tested it with a minimal C program and saw that file descriptors that were stored from ExecStart, ExecStop and ExecStopPost are all available to the next instance of ExecStart in case the service was explicitly restarted ( A sketchy idea: In case conmon would like to perform a cleanup, conmon could store its intention (and the container ID) in a memfd_create file and have it stored by systemd. If the service is restarted before the container cleanup has completed, the new ExecStart podman instance could retrieve the stored information (via sd_listen_fds_with_names()) and try to complete the cleanup. |
The gain, by moving to |
I think what @grooverdan suggests in the issue description is doable but requires changes in conmon. Conmon can't exit immediately after starting the cleanup process in the container cgroup. Instead it needs to wait for the cleanup process to finish, and only after that does it exit to signal systemd that the unit is no longer active. To avoid sending the kill signal to other processes running in the cgroup podman should generate unit files with For bonus points conmon could initiate the cleanup process, wait for it, and it can even extend the originally configured stop timeout using |
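The wait-then-exit behaviour described in this comment can be sketched in Python as below. conmon itself is written in C, so this only illustrates the control flow. It assumes the timeout extension meant here is the EXTEND_TIMEOUT_USEC notification mentioned earlier in the thread. The function name wait_for_cleanup is hypothetical, and notify stands for any callable that delivers a string to systemd's notification socket.

```python
import subprocess
from typing import Callable, Sequence


def wait_for_cleanup(
    cleanup_argv: Sequence[str],
    notify: Callable[[str], object],
    poll_seconds: float = 1.0,
    extend_usec: int = 5_000_000,
) -> int:
    """Start the cleanup process and block until it has finished.

    While it is still running, keep asking systemd to extend the stop
    timeout so the unit is not killed mid-cleanup. Returns the cleanup
    process's exit code; only then should the caller exit.
    """
    proc = subprocess.Popen(list(cleanup_argv))
    while True:
        try:
            return proc.wait(timeout=poll_seconds)
        except subprocess.TimeoutExpired:
            # Still cleaning up: request more time, then wait again.
            notify(f"EXTEND_TIMEOUT_USEC={extend_usec}")
```

Keeping extend_usec comfortably larger than poll_seconds means each request is renewed before the previous one lapses.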
Thanks for taking a look, @msekletar, and for the chat off GitHub. I agree that this is the only way. For this to work, we had to update both
@mheon, what are your thoughts? |
Podman killing Conmon is a safety measure to ensure that we have a clean slate for restarting the container - Conmon holds the container's ports open, and if it's not gone attempting a container restart via the cleanup process (as happens with containers with restart policy) is not possible. Of course, Conmon could potentially be written to clean up all open FDs instead to clear that conflict, but I somewhat suspect it's not the only one. Might be easier to add this to conmon-rs, which was always intended to be a longer-living service. |
@haircommander @saschagrunert WDYT? |
Bump on this one. The current state of podman-generated systemd unit files is problematic on Fedora 37 with podman 4.3.1 |
@mbach04, can you elaborate on what's problematic? |
Fedora 37, podman 4.3.1, clean install |
@mbach04, |
If the result of running I'm seeking to understand this as it appears like a very sensible thing to me. What's the delta between what gets created with just To expand on that, I believe |
We now recommend that users use Quadlet for running pods under systemd using Kubernetes YAML. We do not support them directly. |
Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind feature
Description
podman generate systemd creates an ExecStartPre to remove a cidfile, an ExecStart line that includes a cidfile, an ExecStop that uses it, and an ExecStopPost that removes it.
For a Type=notify with the MAINPID of conmon pushed (per comment #12778 (comment) / #9642), none of these usages of the cidfile are needed.
With Type=notify and an accurately communicated PID, and with conmon acting on all signals to shut down the container, all the cidfile-related directives can be removed.
Steps to reproduce the issue:
Describe the results you received:
Describe the results you expected:
Additional information you deem important (e.g. issue happens only occasionally):
Output of podman version:
Package info (e.g. output of rpm -q podman or apt list podman):
built from source (today)
Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)
Yes
Additional environment details (AWS, VirtualBox, physical, etc.):
Discovered investigating @eriksjolund's good use of systemd, podman and socket activation examples eriksjolund/mariadb-podman-socket-activation#1