remote: flake: podman rm (more than 1 container): no such container #7837
Comments
Thanks a ton for the reproducer, @edsantiago! I'll take a look. This one has been quite annoying.
I can reproduce it. After an audit of the code, I suspected that the issue is not the server-side removal itself but corrupted data on the client, so I started logging which containers the client wants to remove:
... and indeed, the client tries to remove the same container twice, even though there should be two different IDs (one for each container). Digging further.
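To illustrate how a name/ID collision can make two different inputs resolve to the same container, here is a minimal, self-contained sketch; the `Container` type and `lookup` function are illustrative stand-ins, not podman's actual remote-client code:

```go
package main

import (
	"fmt"
	"strings"
)

// Container is an illustrative stand-in for the remote client's
// container object; it is not podman's actual type.
type Container struct {
	ID   string
	Name string
}

// lookup mimics a naive client-side resolution that matches user input
// against names and ID prefixes, the kind of logic that can collide.
func lookup(input string, ctrs []Container) *Container {
	for i, c := range ctrs {
		if c.Name == input || strings.HasPrefix(c.ID, input) {
			return &ctrs[i]
		}
	}
	return nil
}

func main() {
	ctrs := []Container{
		{ID: "c1a2b3d4e5f6", Name: "web"},
		{ID: "f6e5d4b3a2c1", Name: "c1"}, // name collides with the other container's ID prefix
	}
	// Removing both containers by name: "c1" matches the *other*
	// container's ID prefix before its own name is considered.
	for _, input := range []string{"web", "c1"} {
		if c := lookup(input, ctrs); c != nil {
			fmt.Printf("input %q resolves to ID %s\n", input, c.ID)
		}
	}
	// Both inputs resolve to c1a2b3d4e5f6: the same container twice,
	// so the second removal fails with "no such container".
}
```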
Found it ...
Fix the lookup of containers and pods in the remote client. User input can refer to both names and IDs of containers and pods, so there is a fair chance of collisions (e.g., a "c1" name with a "c1...." ID). Those collisions are well handled (and battle tested) in the local client, which uses the libpod backend directly. Hence, the remote client should not attempt to introduce its own logic, in order to prevent bugs and divergence between the local and the remote clients.

To prevent collisions such as in containers#7837, do a container/pod inspect on the user-provided input to find the corresponding ID, and then do full-ID comparisons to avoid potential collisions with names.

Note that this has a cost that I am not entirely happy with. Looking at issue containers#7837, the collisions happen when removing the two containers. Remote container removal is now very chatty with the server: it first queries for all containers, then iterates over the provided names or IDs and does a remote inspect to figure out the IDs and find a matching container object. However, remote removal could just pass the names and IDs directly to the batch-removal endpoint. Querying for all containers could also be avoided if the batch-removal endpoint removed everything when given an empty slice. In other words, the bug is fixed, but there is room for performance improvements.

Fixes: containers#7837
Signed-off-by: Valentin Rothberg <[email protected]>
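A minimal sketch of the strategy this commit describes, assuming a hypothetical `inspectContainer` helper in place of the real remote-inspect binding:

```go
package remote

import (
	"context"
	"fmt"
)

// Container and InspectData are illustrative stand-ins for the remote
// client's types; they are not podman's actual bindings.
type Container struct{ ID string }
type InspectData struct{ ID string }

// inspectContainer is a hypothetical helper standing in for a remote
// inspect call. The real service applies libpod's battle-tested name/ID
// resolution on the server side and returns the canonical full ID.
func inspectContainer(ctx context.Context, nameOrID string) (*InspectData, error) {
	return nil, fmt.Errorf("stub: would remotely inspect %q", nameOrID)
}

// resolveToRemove follows the commit's strategy: resolve each user input
// to a full ID via inspect, then match the listed containers by full-ID
// equality only, so a name can no longer collide with an ID prefix.
func resolveToRemove(ctx context.Context, namesOrIDs []string, all []Container) ([]Container, error) {
	var toRemove []Container
	for _, input := range namesOrIDs {
		data, err := inspectContainer(ctx, input)
		if err != nil {
			return nil, err
		}
		for _, c := range all {
			if c.ID == data.ID { // full-ID comparison, no prefix matching
				toRemove = append(toRemove, c)
				break
			}
		}
	}
	return toRemove, nil
}
```

This is where the cost mentioned above comes from: one extra inspect round-trip per name or ID, plus the up-front listing of all containers.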
Seen this one at least twice in CI. Here's a reproducer, albeit not a great one:
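(What follows is a minimal sketch in the spirit of that reproducer, assuming `podman-remote` against a running service; the container names, image, and volume path are illustrative, not the original script.)

```bash
#!/bin/bash
# Illustrative reconstruction of the kind of loop described below;
# names, image, and volume path are placeholders.
mkdir -p /tmp/reprovol
while :; do
    podman-remote run -d --name c1 -v /tmp/reprovol:/vol alpine top &
    podman-remote run -d --name c2 -v /tmp/reprovol:/vol alpine top &
    wait    # reportedly not needed, but seems to make it fail faster
    # removing more than one container per invocation is what flakes:
    podman-remote rm -f c1 c2 || exit 1
done
```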
Will usually fail within 5 minutes, but once it took 20. The `wait` is not actually necessary, but for some reason it makes the script fail more quickly. I'm pretty sure the `-v` is necessary; I couldn't get failures without it (but maybe I just didn't wait long enough).

CI failures: today, and Aug 28