
remote: flake: podman rm (more than 1 container): no such container #7837

Closed
edsantiago opened this issue Sep 29, 2020 · 3 comments · Fixed by #7867

Comments

@edsantiago (Member) commented:

Seen this one at least twice in CI. Here's a reproducer, albeit not a great one:

# cat testme
#!/bin/bash

set -ex

while :;do
    podman-remote run    -v /myvol1 --name c1 alpine true
    podman-remote run -d -v /myvol2 --name c2 alpine true
    podman-remote wait c2
    podman-remote rm c1 c2
done

# T0=$SECONDS;./testme;T1=$SECONDS;echo;echo $((T1-T0))
...
+ podman-remote run -v /myvol1 --name c1 alpine true
+ podman-remote run -d -v /myvol2 --name c2 alpine true
c1e9e53e217d2c081044f63de0fe4f41a9537072680d6b1b14cbb344e9590711
+ podman-remote rm c1 c2
c1e9e53e217d2c081044f63de0fe4f41a9537072680d6b1b14cbb344e9590711
Error: no container with name or ID c1e9e53e217d2c081044f63de0fe4f41a9537072680d6b1b14cbb344e9590711 found: no such container

Will usually fail within 5 minutes, but once it took 20. The wait is not actually necessary, but for some reason it makes the script fail more quickly. I'm pretty sure the -v is necessary; I couldn't get failures without it (but maybe I just didn't wait long enough).

CI failures: today and Aug 28

@edsantiago edsantiago added flakes Flakes from Continuous Integration kind/bug Categorizes issue or PR as related to a bug. remote Problem is in podman-remote labels Sep 29, 2020
@vrothberg vrothberg self-assigned this Oct 1, 2020
@vrothberg vrothberg added the In Progress This issue is actively being worked by the assignee, please do not work on this at this time. label Oct 1, 2020
@vrothberg (Member) commented:

Thanks a ton for the reproducer, @edsantiago!

I'll take a look. This one has been quite annoying.

@vrothberg (Member) commented:

I can reproduce it. After an audit of the code, I suspected that the issue is not in the server-side removal but rather corrupted data on the client. So I started logging which containers the client wants to remove:

ERRO[0000] *** removing "c1c9a6f8cf12ce6252064aba9371c72ba9bb044b5b9e997cbfc87f4904f71875"
ERRO[0000] *** removing "c1c9a6f8cf12ce6252064aba9371c72ba9bb044b5b9e997cbfc87f4904f71875"

... and indeed the client tries to remove the same container twice, even though the two inputs should map to two different IDs (one per container).
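
For illustration only (this is not podman's actual lookup code, and the helper names and IDs below are made up), a minimal Go sketch of how a naive prefix-based name-or-ID lookup can resolve two different user inputs to the same container:

// collision.go: minimal sketch (not podman's code) of a naive name-or-ID
// lookup in which the name "c1" and an ID beginning with "c1" collide.
package main

import (
	"fmt"
	"strings"
)

type container struct {
	ID   string // full container ID (hypothetical values below)
	Name string
}

// naiveLookup returns the first container whose name matches exactly or
// whose ID starts with the given input -- the ambiguity behind the flake.
func naiveLookup(all []container, nameOrID string) *container {
	for i := range all {
		if all[i].Name == nameOrID || strings.HasPrefix(all[i].ID, nameOrID) {
			return &all[i]
		}
	}
	return nil
}

func main() {
	all := []container{
		{ID: "c1c9a6f8cf12ce6252064aba", Name: "c2"}, // ID happens to start with "c1"
		{ID: "9f31b2a0d4e7c3a1b5d6e8f0", Name: "c1"},
	}
	fmt.Println(naiveLookup(all, "c1").ID) // matches by ID prefix -> c1c9a6f8...
	fmt.Println(naiveLookup(all, "c2").ID) // matches by name      -> c1c9a6f8...
	// Both inputs resolve to the same container, hence the duplicate
	// "removing c1c9a6f8..." lines in the log above.
}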

Digging further.

@vrothberg (Member) commented:

Found it ...

vrothberg added a commit to vrothberg/libpod that referenced this issue Oct 1, 2020
Fix the lookup of containers and pods in the remote client.  User input
can refer to both names and IDs of containers and pods, so there is a
fair chance of collisions (e.g., a "c1" name with a "c1...." ID).

Those collisions are well handled (and battle tested) in the local
client, which directly uses the libpod backend.  Hence, the remote
client should not introduce its own lookup logic, to prevent bugs and
divergence between the local and the remote clients.  To prevent
collisions such as in containers#7837, do a container/pod inspect on the
user-provided input to find the corresponding ID, and then do full-ID
comparisons to avoid potential collisions with names.

Note that this has a cost that I am not entirely happy with.  Looking at
issue containers#7837, the collisions are happening when removing the two
containers.  Remote container removal is now very chatty with the server
as it first queries for all containers, then iterates over the provided
names or IDs and does a remote inspect to figure out the IDs and find a
matching container object.  However, remote removal could just pass the
names and IDs directly to the batch removal endpoint.  Querying for all
containers could be avoided if the batch removal endpoint removed all
containers when the given slice is empty.

In other words, the bug is fixed but there's room for performance
improvements.

Fixes: containers#7837
Signed-off-by: Valentin Rothberg <[email protected]>
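
As a rough illustration of the approach described in the commit message above (assumptions only: the types, the inspect callback, and the IDs below are made-up stand-ins, not podman's actual API), resolving each user-provided name or ID to its full ID first and then matching by exact ID avoids the prefix collision:

// resolve.go: minimal sketch of resolving user input to full IDs before
// matching, as the fix describes; not podman's real code.
package main

import "fmt"

type container struct {
	ID   string
	Name string
}

// inspectFn stands in for a remote inspect call returning the full ID
// for a user-provided name or (possibly abbreviated) ID.
type inspectFn func(nameOrID string) (string, error)

// resolveForRemoval maps user inputs to containers by exact full-ID
// comparison, avoiding the name-vs-ID-prefix ambiguity.
func resolveForRemoval(all []container, inputs []string, inspect inspectFn) ([]container, error) {
	var selected []container
	for _, in := range inputs {
		fullID, err := inspect(in)
		if err != nil {
			return nil, fmt.Errorf("looking up %q: %w", in, err)
		}
		for _, c := range all {
			if c.ID == fullID { // exact full-ID match only
				selected = append(selected, c)
				break
			}
		}
	}
	return selected, nil
}

func main() {
	all := []container{
		{ID: "c1c9a6f8cf12ce6252064aba", Name: "c2"},
		{ID: "9f31b2a0d4e7c3a1b5d6e8f0", Name: "c1"},
	}
	// Hypothetical inspect: resolve exact names first, otherwise treat the
	// input as an already-full ID.
	inspect := func(nameOrID string) (string, error) {
		for _, c := range all {
			if c.Name == nameOrID {
				return c.ID, nil
			}
		}
		return nameOrID, nil
	}
	selected, _ := resolveForRemoval(all, []string{"c1", "c2"}, inspect)
	for _, c := range selected {
		fmt.Println(c.ID) // two distinct IDs, one per container
	}
}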
mheon pushed a commit to mheon/libpod that referenced this issue Oct 14, 2020
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 22, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 22, 2023