remote: podman kill: loses exit code #9751
Comments
I just saw it again, f33:
I'm starting to think there's a more general race in podman-remote exit-status handling:
Trivial reproducer (assumes
$ cat Containerfile
FROM quay.io/libpod/nosuchimage:nosuchtag
RUN echo hi
$ while :;do ../bin/podman-remote build -t build_test --pull-never . && break;done
...runs for a minute or two, then
STEP 1: FROM quay.io/libpod/nosuchimage:nosuchtag
Error: error creating build container: pull policy is "never" but "quay.io/libpod/nosuchimage:nosuchtag" could not be found locally
STEP 1: FROM quay.io/libpod/nosuchimage:nosuchtag
$ <--- back to prompt, with exit status 0. This should never happen. My interpretation: the
Another one in sys: podman stop - unlock while waiting for timeout
And a few more: sys: podman stop - unlock while waiting for timeout
I'm having trouble keeping track of which ones I've already posted, so I may have missed some.
This is very strange, since podman and podman-remote use the same code path for kill. One would think this is a race condition between kill and inspect, but then it should be faster with local execution than with remote. It could be that kill kills the process inside the container, but the cleanup call does not happen until after the inspect. I would expect the same race on local as on remote. @mheon WDYT?
We do have this code for Docker; we could do the same for libpod. I'm not sure why we would not want to.
The test as written is inherently racy. The test should be rewritten to use
Alternatively, it needs something to force
@mheon this is more than just
That sounds like a separate bug? From where I stand, the original issue here with
If we changed the test to do podman kill The issue probably goes away, correct?
I'll submit a PR for that. (And will file the build problem as a new issue.)
Add 'podman wait' between kill & inspect. Fixes: containers#9751 Signed-off-by: Ed Santiago <[email protected]>
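For readers following along, the idea in that commit message can be sketched roughly as below. This is not the actual test code from the PR. The function name `kill_and_get_exit_code` and the `PODMAN` override variable are hypothetical, introduced only for illustration. The sketch assumes that `podman wait` blocks until the container has stopped, so that the following `podman inspect` sees the final state.

```shell
# Sketch only, not the project's test code.
# PODMAN is a hypothetical override, e.g. PODMAN=podman-remote.
PODMAN=${PODMAN:-podman}

# kill_and_get_exit_code is a hypothetical helper name.
# Usage: kill_and_get_exit_code CONTAINER
kill_and_get_exit_code() {
    cid=$1

    # Send the default kill signal to the container.
    "$PODMAN" kill "$cid" >/dev/null || return 1

    # Block until the container has actually stopped. Without this
    # step, the inspect below can race against container cleanup.
    "$PODMAN" wait "$cid" >/dev/null || return 1

    # Only now read the recorded exit code.
    "$PODMAN" inspect --format '{{.State.ExitCode}}' "$cid"
}
```

A caller would then do something like `rc=$(kill_and_get_exit_code "$cid")` and compare `rc` against the expected value, instead of inspecting immediately after the kill.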
New flake. Only three instances in two months, but that's enough to make me worry:
sys: podman stop - unlock while waiting for timeout
It's possible that the bug is in the test itself.