`podman stop $cid1 $cid2` successfully stops both but sometimes errors that `$cid2` does not exist #7384
Comments
Is the container actually still running? Or was it stopped and the error was bogus?
If the container crashed for some reason, I would expect this behaviour. You could attempt this without the `--rm` and then maybe we could examine the state of the container, along with the exit code.
There is a race that can cause this, between gathering the containers we need to work on, and working on the individual containers - if the container stops/is removed in the gap between those, we get an error. I think we can safely ignore those errors in some cases (e.g.
After we do the podman stop, are we checking to see if the container was stopped? And if the rm happened, then the container no longer exists, so it was stopped. We probably should handle this case and not report an error.
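The idea in the two comments above (treat a container that disappeared between listing and stopping as already stopped) could look roughly like the following. This is only a toy sketch, not Podman's code: `store`, `stopAll`, and the sentinel `ErrNoSuchCtr` are hypothetical names invented for illustration, and whether ignoring the error is safe for a given command is a judgment the thread leaves open.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoSuchCtr is a hypothetical sentinel error for this sketch;
// it is not Podman's actual error value.
var ErrNoSuchCtr = errors.New("no such container")

// store is a toy in-memory stand-in for the runtime's container state.
type store struct {
	running map[string]bool
}

// stop marks a container as stopped, or reports ErrNoSuchCtr if it is gone.
func (s *store) stop(id string) error {
	if _, ok := s.running[id]; !ok {
		return fmt.Errorf("stopping %s: %w", id, ErrNoSuchCtr)
	}
	s.running[id] = false
	return nil
}

// stopAll stops every listed container. A container that vanished after
// the list was gathered is treated as already stopped, not as a failure.
func stopAll(s *store, ids []string) []error {
	var errs []error
	for _, id := range ids {
		if err := s.stop(id); err != nil {
			if errors.Is(err, ErrNoSuchCtr) {
				continue // removed concurrently, so it is certainly stopped
			}
			errs = append(errs, err)
		}
	}
	return errs
}

func main() {
	s := &store{running: map[string]bool{"cid1": true, "cid2": true}}
	ids := []string{"cid1", "cid2"} // step 1: gather the containers
	delete(s.running, "cid2")       // another process removes cid2 in the gap
	errs := stopAll(s, ids)         // step 2: work on each container
	fmt.Println("errors:", len(errs))
}
```

Matching on a wrapped sentinel with `errors.Is` keeps every other failure visible, so only the "already gone" case is swallowed.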
yes, from the
yeah good idea, though I'd be hard pressed to see
I've run the same scenario without
Thanks for the report, @asottile! It really smells like an internal race caused by
Running the above reproducer, I've also seen
Do not perform a container clean up for containers configured for auto-removal (e.g., via `podman run --rm`). There is a small race window with the other process performing the removal where a clean up during podman-stop may fail since the container has already been removed and cleaned up. As the removing process will clean up the container, we don't have to do it during podman-stop.

Fixes: containers#7384
Signed-off-by: Valentin Rothberg <[email protected]>
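To illustrate the shape of the fix described in the commit message above, here is a minimal sketch under stated assumptions. It is not the actual patch: the `container` type, its `autoRemove` and `removed` fields, and the `cleanup` and `stopContainer` functions are all hypothetical names made up for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// container is a toy model; the field names are assumptions for this sketch.
type container struct {
	id         string
	autoRemove bool // started with `podman run --rm`
	removed    bool // another process has already removed the container
}

// errRemoved is a hypothetical error, not one of Podman's.
var errRemoved = errors.New("container already removed")

// cleanup fails when the container has already been removed elsewhere.
func cleanup(c *container) error {
	if c.removed {
		return fmt.Errorf("cleaning up %s: %w", c.id, errRemoved)
	}
	return nil
}

// stopContainer cleans up after stopping only if no other process will.
// The stop itself is elided in this sketch.
func stopContainer(c *container) error {
	if c.autoRemove {
		// The removing process performs the clean up, so skip it here
		// and avoid racing against the removal.
		return nil
	}
	return cleanup(c)
}

func main() {
	// The removal won the race before the stop path reached its clean up.
	c := &container{id: "cid2", autoRemove: true, removed: true}
	fmt.Println("stop succeeded:", stopContainer(c) == nil)
}
```

Containers without auto-removal still go through `cleanup`, so a genuine clean up failure there is reported as before.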
awesome, thanks for the fix on this -- I know how tricky finding and fixing these races can be!
Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind bug
Description
This one is pretty hard to track down and I haven't isolated a highly reproducible scenario; however, I am able to observe this about 10% of the time with my given workload.
The output below shows the failure scenario and the output of `podman container inspect $cid` for both of the containers just before running `podman stop $cid1 $cid2`.
It appears that sometimes when stopping both containers podman successfully stops both but then loses track of the second container, causing an error.
I can also reproduce this when running `podman stop $cid1` and `podman stop $cid2` separately.
Steps to reproduce the issue:
I'm still working on more reproducible results, but I'm currently running something similar to this in a loop and get an error roughly once every 10 runs. I will comment / edit if I find a more reproducible run that does not involve private components.
Describe the results you received:
`podman inspect $cid1` (right before running stop)
`podman inspect $cid2` (right before running stop)
output from `podman stop $cid1 $cid2` (exit status 125)
for example `'podman', 'stop', '295e59ca74f4486cf27dece22be6d7b16ef0d9e54e5053418fb7ebf910b20ac8', 'c8cd61b084c71ecb4cce00faf211482d0dc9b6024bae4950153b7cfb0c1803a0'`
note that I've only seen the second cid go "missing"
Describe the results you expected:
I expect there to not be an error
Additional information you deem important (e.g. issue happens only occasionally):
I've only been able to reproduce this about 10% of the time
Output of `podman version`:
Output of `podman info --debug`:
Package info (e.g. output of `rpm -q podman` or `apt list podman`):
Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?
Yes
Additional environment details (AWS, VirtualBox, physical, etc.):
This is in VirtualBox Version 6.0.20 r137117 (Qt5.6.2)