
remote: flake: podman rm (more than 1 container): no such container #7837

Closed
edsantiago opened this issue Sep 29, 2020 · 3 comments · Fixed by #7867

Comments

@edsantiago (Member) commented:

Seen this one at least twice in CI. Here's a reproducer, albeit not a great one:

# cat testme
#!/bin/bash

set -ex

while :;do
    podman-remote run    -v /myvol1 --name c1 alpine true
    podman-remote run -d -v /myvol2 --name c2 alpine true
    podman-remote wait c2
    podman-remote rm c1 c2
done

# T0=$SECONDS;./testme;T1=$SECONDS;echo;echo $((T1-T0))
...
+ podman-remote run -v /myvol1 --name c1 alpine true
+ podman-remote run -d -v /myvol2 --name c2 alpine true
c1e9e53e217d2c081044f63de0fe4f41a9537072680d6b1b14cbb344e9590711
+ podman-remote rm c1 c2
c1e9e53e217d2c081044f63de0fe4f41a9537072680d6b1b14cbb344e9590711
Error: no container with name or ID c1e9e53e217d2c081044f63de0fe4f41a9537072680d6b1b14cbb344e9590711 found: no such container

Will usually fail within 5 minutes, but once it took 20. The wait is not actually necessary, but for some reason it makes the script fail more quickly. I'm pretty sure the -v is necessary; I couldn't get failures without it (but maybe I just didn't wait long enough).

CI failures: today and Aug 28

@edsantiago edsantiago added flakes Flakes from Continuous Integration kind/bug Categorizes issue or PR as related to a bug. remote Problem is in podman-remote labels Sep 29, 2020
@vrothberg vrothberg self-assigned this Oct 1, 2020
@vrothberg vrothberg added the In Progress This issue is actively being worked by the assignee, please do not work on this at this time. label Oct 1, 2020
@vrothberg (Member) commented:

Thanks a ton for the reproducer, @edsantiago!

I'll take a look. This one has been quite annoying.

@vrothberg (Member) commented:

I can reproduce it. After an audit of the code, I suspected that the issue is not in the server-side removal but rather corrupted data on the client. So I started logging which containers the client wants to remove:

ERRO[0000] *** removing "c1c9a6f8cf12ce6252064aba9371c72ba9bb044b5b9e997cbfc87f4904f71875"
ERRO[0000] *** removing "c1c9a6f8cf12ce6252064aba9371c72ba9bb044b5b9e997cbfc87f4904f71875"

... and indeed the client tries to remove the same container twice, even though the two inputs should map to two different IDs (one per container).
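
For illustration only (this is not podman's actual lookup code, and the helper names and IDs below are made up), a minimal Go sketch of how a naive prefix-based name-or-ID lookup can resolve two different user inputs to the same container:

// collision.go: minimal sketch (not podman's code) of a naive name-or-ID
// lookup in which the name "c1" and an ID beginning with "c1" collide.
package main

import (
	"fmt"
	"strings"
)

type container struct {
	ID   string // full container ID (hypothetical values below)
	Name string
}

// naiveLookup returns the first container whose name matches exactly or
// whose ID starts with the given input -- the ambiguity behind the flake.
func naiveLookup(all []container, nameOrID string) *container {
	for i := range all {
		if all[i].Name == nameOrID || strings.HasPrefix(all[i].ID, nameOrID) {
			return &all[i]
		}
	}
	return nil
}

func main() {
	all := []container{
		{ID: "c1c9a6f8cf12ce6252064aba", Name: "c2"}, // ID happens to start with "c1"
		{ID: "9f31b2a0d4e7c3a1b5d6e8f0", Name: "c1"},
	}
	fmt.Println(naiveLookup(all, "c1").ID) // matches by ID prefix -> c1c9a6f8...
	fmt.Println(naiveLookup(all, "c2").ID) // matches by name      -> c1c9a6f8...
	// Both inputs resolve to the same container, hence the duplicate
	// "removing c1c9a6f8..." lines in the log above.
}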

Digging further.

@vrothberg (Member) commented:

Found it ...

vrothberg added a commit to vrothberg/libpod that referenced this issue Oct 1, 2020
Fix the lookup of containers and pods in the remote client.  User input
can refer to both names and IDs of containers and pods, so there is a
fair chance of collisions (e.g., a "c1" name with a "c1...." ID).

Those collisions are well handled (and battle tested) in the local
client, which directly uses the libpod backend.  Hence, the remote
client should not introduce its own lookup logic, to prevent bugs and
divergence between the local and the remote clients.  To prevent
collisions such as in containers#7837, do a container/pod inspect on the
user-provided input to find the corresponding ID, and then do full-ID
comparisons to avoid potential collisions with names.

Note that this has a cost that I am not entirely happy with.  Looking at
issue containers#7837, the collisions are happening when removing the two
containers.  Remote container removal is now very chatty with the server
as it first queries for all containers, then iterates over the provided
names or IDs and does a remote inspect to figure out the IDs and find a
matching container object.  However, remote removal could just pass the
names and IDs directly to the batch removal endpoint.  Querying for all
containers could be avoided if the batch removal endpoint removed all
containers when the given slice is empty.

In other words, the bug is fixed but there's room for performance
improvements.

Fixes: containers#7837
Signed-off-by: Valentin Rothberg <[email protected]>
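
As a rough illustration of the approach described in the commit message above (assumptions only: the types, the inspect callback, and the IDs below are made-up stand-ins, not podman's actual API), resolving each user-provided name or ID to its full ID first and then matching by exact ID avoids the prefix collision:

// resolve.go: minimal sketch of resolving user input to full IDs before
// matching, as the fix describes; not podman's real code.
package main

import "fmt"

type container struct {
	ID   string
	Name string
}

// inspectFn stands in for a remote inspect call returning the full ID
// for a user-provided name or (possibly abbreviated) ID.
type inspectFn func(nameOrID string) (string, error)

// resolveForRemoval maps user inputs to containers by exact full-ID
// comparison, avoiding the name-vs-ID-prefix ambiguity.
func resolveForRemoval(all []container, inputs []string, inspect inspectFn) ([]container, error) {
	var selected []container
	for _, in := range inputs {
		fullID, err := inspect(in)
		if err != nil {
			return nil, fmt.Errorf("looking up %q: %w", in, err)
		}
		for _, c := range all {
			if c.ID == fullID { // exact full-ID match only
				selected = append(selected, c)
				break
			}
		}
	}
	return selected, nil
}

func main() {
	all := []container{
		{ID: "c1c9a6f8cf12ce6252064aba", Name: "c2"},
		{ID: "9f31b2a0d4e7c3a1b5d6e8f0", Name: "c1"},
	}
	// Hypothetical inspect: resolve exact names first, otherwise treat the
	// input as an already-full ID.
	inspect := func(nameOrID string) (string, error) {
		for _, c := range all {
			if c.Name == nameOrID {
				return c.ID, nil
			}
		}
		return nameOrID, nil
	}
	selected, _ := resolveForRemoval(all, []string{"c1", "c2"}, inspect)
	for _, c := range selected {
		fmt.Println(c.ID) // two distinct IDs, one per container
	}
}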
mheon pushed a commit to mheon/libpod that referenced this issue Oct 14, 2020
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 22, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 22, 2023