libpod: stop containers with --restart=always #18267
Conversation
language, language
@vrothberg @mheon PTAL
LGTM
LGTM. I'd really, really like to see this behavior unambiguously documented in
Also:
...I think a similar fix will be needed for
Kill should work with this correctly. It correctly sets StoppedByUser, so I assume the fix in shouldRestart() applies to it as well. I am happy to touch up the man pages in a follow-up.
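For reference, a minimal runnable sketch of the kind of check being described here; the type and field names (container, restartPolicy, StoppedByUser) are simplified assumptions for illustration, not podman's actual libpod code:

```go
package main

import "fmt"

// Simplified stand-ins for libpod's container state; the real structures
// are far more involved. Field names here are assumptions for illustration.
type containerState struct {
	StoppedByUser bool
	ExitCode      int32
}

type container struct {
	restartPolicy string
	state         containerState
}

// shouldRestart sketches the check this fix relies on: a container that the
// user explicitly stopped or killed must not be restarted by the cleanup
// process, regardless of its --restart policy.
func (c *container) shouldRestart() bool {
	if c.state.StoppedByUser {
		return false
	}
	switch c.restartPolicy {
	case "always":
		return true
	case "on-failure":
		return c.state.ExitCode != 0
	default:
		return false
	}
}

func main() {
	c := &container{restartPolicy: "always"}
	c.state.StoppedByUser = true   // what podman stop/kill record for an explicit user stop
	fmt.Println(c.shouldRestart()) // false: cleanup will not start it again
}
```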
I did try it before reporting, and just tried it again (f3d453c1fea032db312d0794dcec076754317d3e):
$ bin/podman run -d --restart=always quay.io/libpod/testimage:20221018 date
1f4811e1e807232762bd620614b70ecb2bd713ec737efda20f1430f68e2e2846
$ bin/podman kill -a
$ bin/podman ps -a
...keeps showing "Exited (0) Less than a second ago"; once in a while shows "Up Less than a second"
Kill is not supposed to work on non-running containers. You have to test with podman kill <name>, then you see it exits with an error. --all is of course a NOP if the container is not running. If you try it often enough and hit the window where the container is really running, then it works:
$ podman kill test
Error: can only kill running containers. 7b6a2199d8e9d1e5148c23706bec7851ba748de23170bdd0a1f514dcf9669e1e is in state stopped: container state improper
$ podman kill test
Error: can only kill running containers. 7b6a2199d8e9d1e5148c23706bec7851ba748de23170bdd0a1f514dcf9669e1e is in state stopped: container state improper
$ podman kill test
test
$ podman ps
CONTAINER ID  IMAGE       COMMAND     CREATED     STATUS      PORTS       NAMES
This seems a little iffy: from my read, it can change a container that is already in the Exited state to Stopping, which does not seem correct. Is just setting StoppedByUser not sufficient?
Stopping is not set when the state is not correct, see the diff below.
libpod/container_internal.go
	// commands such as `podman ps` from progressing if the container lock
	// is held when busy-waiting for the container to be stopped.
	c.state.State = define.ContainerStateStopping
}
if err := c.save(); err != nil {
	return fmt.Errorf("saving container %s state before stopping: %w", c.ID(), err)
Maybe log this instead if cannotStopErr is not nil, so we don't lose the real error?
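A rough fragment-level sketch of what that suggestion could look like, reusing the names from the diff above (cannotStopErr, c.save()); this is an assumption about shape only, not the merged code:

```go
// Sketch only (assumed surrounding context): cannotStopErr already holds the
// error the caller should see, so a save failure is logged rather than
// returned, keeping the original error from being lost.
if err := c.save(); err != nil {
	saveErr := fmt.Errorf("saving container %s state before stopping: %w", c.ID(), err)
	if cannotStopErr == nil {
		return saveErr
	}
	// Keep returning cannotStopErr below; only record the save failure here.
	logrus.Error(saveErr)
}
```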
done
One suggestion, otherwise LGTM
tests are green, please merge
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: Luap99, vrothberg
Commit 1ab833f improved the situation but it is still not enough. If you run short-lived containers with --restart=always, podman is basically permanently restarting them. The only way to stop this is podman stop. However, podman stop does not do anything when the container is already in a not-running state. While this makes sense, we should still mark the container as explicitly stopped by the user.
Together with the change in shouldRestart(), which now checks for StoppedByUser, this makes sure the cleanup process is not going to start it back up again.
A simple reproducer is:
podman run --restart=always --name test -d alpine true
podman stop test
Then check if the container is still running. The behavior is very flaky; it took me about 20 podman stop tries before I finally hit the correct window where it was stopped permanently. With this patch it worked on the first try.
Fixes #18259
[NO NEW TESTS NEEDED] This is super flaky and hard to correctly test in CI. My ginkgo v2 work seems to trigger this in play kube tests so that should catch at least some regressions. Also this may be something that should be tested at podman test days by users (#17912).
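To make the mechanism concrete, here is a condensed sketch of the stop path the description talks about; the function shape, helper calls, and field names are assumptions for illustration, not the exact libpod code:

```go
// Condensed sketch (assumed names): stop always records StoppedByUser, even
// when the container is already down, so the cleanup process's
// shouldRestart() check will not bring it back up under --restart=always.
func (c *Container) stop(timeout uint) error {
	if c.state.State != define.ContainerStateRunning {
		// Already exited: there is nothing to stop, but remember that the
		// user explicitly asked for a stop before returning.
		c.state.StoppedByUser = true
		return c.save()
	}

	// Running: record the user stop, move to Stopping, then ask the OCI
	// runtime to take the container down.
	c.state.StoppedByUser = true
	c.state.State = define.ContainerStateStopping
	if err := c.save(); err != nil {
		return err
	}
	return c.ociRuntime.StopContainer(c, timeout)
}
```

The point of the first branch is the race described above: with short-lived containers the stop request usually arrives while the container is between restarts, so persisting StoppedByUser in that window is what finally breaks the restart loop.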
Does this PR introduce a user-facing change?