race: 'podman stop' does not always remove all podman mounts #5747

Closed · Opened Apr 7, 2020 by edsantiago · 6 comments · Fixed by #6483

Labels: locked - please file new issue/PR, stale-issue

@edsantiago (Member) commented Apr 7, 2020

One of the e2e tests, podman list running container in test/e2e/mount_test.go, occasionally flakes in CI. Basically, podman mount finds an active mountpoint even after the container has been stopped with podman stop. Rerunning podman mount a second later finds no mounts, so it's almost certainly a race: some cleanup isn't happening in time.

If this race condition is OK, that is, if it doesn't matter whether podman mount shows a dead mountpoint from a stopped container, then the e2e test must be fixed.

If this race condition is not OK, that is, if podman stop should guarantee that there are no mount points when it exits, then podman stop must be fixed.

Reproducer:

#!/bin/bash
#
# Reproducer: run a container, verify it shows up in 'podman mount',
# stop it, then immediately check whether the mount is gone. Loops
# until the race triggers.

set -e

T0=$SECONDS

while :; do
    cid=$(podman run -dt docker.io/library/alpine:latest top)
    # While running, the container must appear in the mount list.
    podman mount --notruncate | grep -q "$cid"
    podman stop "$cid" > /dev/null
    # Once stopped, it should be gone from the mount list, but
    # occasionally it isn't:
    m2=$(podman mount --notruncate)
    if [[ "$m2" =~ $cid ]]; then
        echo "FOO! Still mounted!"
        echo "$m2"
        echo "time = $(( $SECONDS - $T0 )) seconds"
        sleep 1
        echo
        echo "after sleep 1:"
        podman mount
        exit 1
    fi
    podman rm "$cid" >/dev/null
done

Sample run:

# /tmp/mtest
FOO! Still mounted!
2b473b127c369cd11da3c775780ea2e693e17511534c1e296980a35443d14a70 /var/lib/containers/storage/overlay/5798db64ca882ed888a403a5398d1bb5d020c8c0bab719ba67a71b13673b4773/merged
time = 96 seconds

after sleep 1:

The CI failure is on f30; I can reproduce with podman-1.8.0-4.fc30.

Problem still present in rawhide: podman-1.8.3-0.75.dev.gitf7dffed.fc33

@rhatdan (Member) commented Apr 7, 2020

Well, if the container is running, podman stop will stop the container; then conmon realizes that the container has stopped and execs podman container cleanup to clean up the container. This is where the race happens.
We could make podman stop wait for the container to enter the stopped state or disappear, but it could wait a very long time.
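
A minimal sketch of that bounded wait in Go (containerCleanedUp and waitForCleanup are hypothetical names for illustration, not libpod's actual API):

// Sketch of a bounded wait for cleanup; not libpod's real code.
package main

import (
	"errors"
	"fmt"
	"time"
)

// containerCleanedUp stands in for a real state query, e.g. "is the
// container's storage still mounted?"
func containerCleanedUp(id string) bool {
	return false // placeholder: always "not yet cleaned up"
}

// waitForCleanup polls until cleanup finishes or the timeout expires.
// The catch, as noted above: there is no good upper bound, so whatever
// timeout we pick is how long `podman stop` might hang.
func waitForCleanup(id string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		if containerCleanedUp(id) {
			return nil
		}
		time.Sleep(100 * time.Millisecond)
	}
	return errors.New("timed out waiting for container cleanup")
}

func main() {
	if err := waitForCleanup("2b473b127c36", 2*time.Second); err != nil {
		fmt.Println(err)
	}
}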

@mheon (Member) commented Apr 7, 2020

Hmmmm. I think that, right now, podman stop does not provide a guarantee that a container has been cleaned up immediately after exit (we instead wait for the cleanup process to do it). I'm not sure if this is desirable, though. If we want to add such a guarantee, it would be trivial to make podman stop call ctr.Cleanup() immediately before exiting.
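
A simplified sketch of the shape of that change, with stand-in types (Container, Stop, and Cleanup here only mimic the libpod methods discussed above; this is not the actual implementation):

package main

import "fmt"

// Container is a stand-in for libpod's container type.
type Container struct{ id string }

func (c *Container) Stop(timeout uint) error {
	fmt.Println("stopping", c.id)
	return nil
}

func (c *Container) Cleanup() error {
	fmt.Println("unmounting storage and tearing down network for", c.id)
	return nil
}

// stopContainer shows the proposed guarantee: Stop() returns only once
// the container is stopped, and Cleanup() then runs synchronously, so
// no mount outlives the `podman stop` command itself.
func stopContainer(ctr *Container, timeout uint) error {
	if err := ctr.Stop(timeout); err != nil {
		return err
	}
	return ctr.Cleanup()
}

func main() {
	_ = stopContainer(&Container{id: "2b473b127c36"}, 10)
}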

@rhatdan (Member) commented Apr 7, 2020

Wouldn't that give you a race as well, in that the container would still be running? Or at least podman container cleanup would fail.

@mheon (Member) commented Apr 7, 2020

I think we guarantee that the container is stopped after the Stop() API call - it's just that we don't actually verify that the cleanup has completed when podman stop exits.

@github-actions (bot) commented May 8, 2020

A friendly reminder that this issue had no activity for 30 days.

@mheon (Member) commented May 8, 2020

I'll self-assign this. It's fairly low priority, but hopefully I can get to it sometime in the next few weeks.

mheon self-assigned this May 8, 2020
mheon added a commit to mheon/libpod that referenced this issue Jun 3, 2020
The cleanup process was already running and ensuring that mounts
and networking configuration were cleaned up on container stop,
but this was async from the actual `podman stop` command, which
breaks some expectations - the container is still mounted at the
end of `podman stop` and will be cleaned up soon, but not
immediately. Fortunately, it's a trivial change to resolve this.

Fixes containers#5747

Signed-off-by: Matthew Heon <[email protected]>
github-actions bot added the 'locked - please file new issue/PR' label Sep 23, 2023
github-actions bot locked as resolved and limited conversation to collaborators Sep 23, 2023