
podman rm --all not working with containers that have dependencies #18180

Closed
Luap99 opened this issue Apr 13, 2023 · 16 comments · Fixed by #18507
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments


Luap99 commented Apr 13, 2023

Issue Description

Try to use podman rm -fa when you have containers with dependencies, e.g. --network container:xxx or any other namespace flag that supports container:.

Steps to reproduce the issue

  1. podman run --name test1 -d alpine top
  2. podman run --network container:test1 -d alpine top
  3. podman rm -fa

Describe the results you received

a32636dc3e18c522d3f16340801ffb76d4e8acad8f02b32723ce91721dbc50a0
Error: container 8c9cbe82f02577e166187ec899afc6497e3bd551fae1b05df5a437b8ef90d081 has dependent containers which must be removed before it: a32636dc3e18c522d3f16340801ffb76d4e8acad8f02b32723ce91721dbc50a0: container already exists

Describe the results you expected

All containers removed without errors.

podman info output

tested with latest main

Podman in a container

No

Privileged Or Rootless

None

Upstream Latest Release

Yes

Additional environment details

No response

Additional information

This is causing issues in the e2e integration tests because we only run rm -fa as cleanup, so some processes are leaked.

@Luap99 Luap99 added the kind/bug Categorizes issue or PR as related to a bug. label Apr 13, 2023

vrothberg commented Apr 13, 2023

Ouch!

Can you point out where this fails in the e2e tests? I think this error should be fatal (i.e., let the e2e tests fail).


Luap99 commented Apr 13, 2023

I pushed some changes for it to #18163.
The problem is in (p *PodmanTestIntegration) Cleanup() where we only run pod rm -fa and rm -fa

@vrothberg

Excellent, nice work!


Luap99 commented Apr 13, 2023

I think we need to order the containers so that they are removed in the right order; it looks like we already handle this for pods (e19e0de)

According to the comment it should not be used for individual containers:

// Visit a node on the container graph and remove it, or set an error if it
// failed to remove. Only intended for use in pod removal; do *not* use when
// removing individual containers.
// All containers are assumed to be *UNLOCKED* on running this function.
// Container locks will be acquired as necessary.
// Pod and infraID are optional. If a pod is given it must be *LOCKED*.
func removeNode(ctx context.Context, node *containerNode, pod *Pod, force bool, timeout *uint, setError bool, ctrErrors map[string]error, ctrsVisited map[string]bool, ctrNamedVolumes map[string]*ContainerNamedVolume) {

I am not sure why.
@mheon Can you take a look please? This is pretty important to fix broken e2e tests.

@edsantiago

OMG if this gets fixed, can you add like actual error checking to cleanup? This has been a flake nightmare for years. #11597


Luap99 commented Apr 13, 2023

I already have it in my ginkgo update PR, so I could find the problematic ones.

Luap99 added a commit to Luap99/libpod that referenced this issue Apr 13, 2023
Only check exit codes last, otherwise in case of errors it will return
early and miss other commands.
Also explicitly stop before rm, since rm is not working in all cases (containers#18180).

Signed-off-by: Paul Holzinger <[email protected]>

mheon commented Apr 13, 2023

We could definitely restructure remove-all to do an ordered removal, but in the meantime, you could just add --depends to remove all dependencies and solve the errors that way?


mheon commented Apr 13, 2023

Alternatively, we could make the combination of --all and --force imply --depends because if you want every container removed, dependencies are included in "every container".


Luap99 commented Apr 13, 2023

I only see a --depend option and it is not working:

$ bin/podman run --name test1 -d alpine top
8d09a1feea5963ecacf888264abd24563ee9f6224b4f39e01f5294494a54385c
$ bin/podman run --network container:test1 -d alpine top
617d47aa6ae7e40abed4dd71cc617b98e2b56890a5f8db6b592376a6bd31c1fa
$ bin/podman rm -fa --depend
617d47aa6ae7e40abed4dd71cc617b98e2b56890a5f8db6b592376a6bd31c1fa
Error: container 8d09a1feea5963ecacf888264abd24563ee9f6224b4f39e01f5294494a54385c has dependent containers which must be removed before it: 617d47aa6ae7e40abed4dd71cc617b98e2b56890a5f8db6b592376a6bd31c1fa: container already exists
$ bin/podman ps
CONTAINER ID  IMAGE                            COMMAND     CREATED         STATUS         PORTS       NAMES
8d09a1feea59  docker.io/library/alpine:latest  top         33 seconds ago  Up 34 seconds              test1


mheon commented Apr 13, 2023

I checked, and it's already set by default on rm --all - so our actual bug is that --depend is broken.


mheon commented Apr 13, 2023

--depend works fine outside of rm -fa - interesting.


mheon commented Apr 13, 2023

Self-assigning, the code here is a bit of a mess and probably ought to be refactored into normal RemoveContainer.


Luap99 commented Apr 26, 2023

@mheon any chance you can finish this? I would love to get my ginkgo v2 PR in.

@edsantiago

@Luap99 as a stopgap, WDYT of just running two sequential podman rm -fas in cleanup?

Luap99 added a commit to Luap99/libpod that referenced this issue May 2, 2023
Only check exit codes last, otherwise in case of errors it will return
early and miss other commands.
Also explicitly stop before rm, since rm is not working in all cases (containers#18180).

Signed-off-by: Paul Holzinger <[email protected]>
Luap99 added a commit to Luap99/libpod that referenced this issue May 2, 2023
Add a workaround for containers#18180 so the ginkgo work can be merged without
being blocked by the issue. Please revert this commit when the issue
is fixed.

Signed-off-by: Paul Holzinger <[email protected]>

Luap99 commented May 3, 2023

@mheon When you open a PR please also revert: c4b9f4b


mheon commented May 3, 2023

Ack

edsantiago added a commit to edsantiago/libpod that referenced this issue May 10, 2023
Several tweaks to see if we can track down containers#17216, the unlinkat-ebusy
flake:

 - teardown(): if a cleanup command fails, display it and its
   output to the debug channel. This should never happen, but
   it can and does (see containers#18180, dependent containers). We
   need to know about it.

 - selinux tests: use unique pod names. This should help when
   scanning journal logs.

 - many tests: add "-f -t0" to "pod rm"

And, several unrelated changes caught by accident:
 - images-commit-with-comment test: was leaving a stray image
   behind. Clean it up, and make a few more readability tweaks

 - podman-remote-group-add test: add an explicit skip()
   when not remote. (Otherwise, test passes cleanly on
   podman local, which is misleading)

 - lots of container cleanup and/or adding "--rm" to run commands,
   to avoid leaving stray containers

Signed-off-by: Ed Santiago <[email protected]>
mheon added a commit to mheon/libpod that referenced this issue Jun 1, 2023
This reverts commit c4b9f4b.

This was a temporary workaround until a fix for containers#18180 landed.

Signed-off-by: Matthew Heon <[email protected]>
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 11, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 11, 2023