podman fails to start container after reboot if using volume #4605

maage · 2019-12-01T19:58:46Z

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

I create volume and container to use it. I can run container multiple times without problems. But after I reboot, podman is stuck and does not start container fully or run any commands when 'podman run' is starting container.

Steps to reproduce the issue:

Setup:

Buy Raspberry pi 4, Install Ubuntu 19.10 64-bit (ubuntu-server)
https://ubuntu.com/download/raspberry-pi
add total_mem=3072 to /boot/firmware/usercfg.txt according to:
https://ubuntu.com/blog/roadmap-for-official-support-for-the-raspberry-pi-4
Connect USB SSD and move root/home to there using rsync method under LVM.
Update normally. Install haveged (to get more random bytes).

Linux ubuntu 5.3.0-1012-raspi2 #14-Ubuntu SMP Mon Nov 11 10:06:55 UTC 2019 aarch64 aarch64 aarch64 GNU/Linux

Build podman locally as there is no packages available at https://launchpad.net/~projectatomic/+archive/ubuntu/ppa.

Use https://podman.io/getting-started/installation as basis.

I've used this script:

$ cat build.sh 
#!/bin/bash

set -epux

# https://podman.io/getting-started/installation

# Packages good to have
sudo apt-get install btrfs-progs git iptables uidmap software-properties-common

# Development packages
sudo apt-get install make golang-go go-md2man libassuan-dev libbtrfs-dev libc6-dev libdevmapper-dev libglib2.0-dev libgpgme-dev libgpg-error-dev libprotobuf-dev libprotobuf-c-dev libseccomp-dev libselinux1-dev libsystemd-dev pkg-config libostree-dev libapparmor-dev

# Needed for testing
#sudo apt-get install docker.io
#sudo systemctl start docker || :
#sudo /usr/sbin/usermod -a -G docker $USER

export GOPATH=~/go

[ -d conmon ] || \
	git clone https://github.com/containers/conmon
pushd conmon
git pull --rebase
git status
make -j$(nproc)
sudo make podman PREFIX=/usr
/usr/libexec/podman/conmon --version
popd

[ -d $GOPATH/src/github.com/opencontainers/runc ] || \
	git clone https://github.com/opencontainers/runc.git $GOPATH/src/github.com/opencontainers/runc
pushd $GOPATH/src/github.com/opencontainers/runc
git pull --rebase
git status
make BUILDTAGS="seccomp selinux apparmor ambient"
sudo cp runc /usr/bin/runc
/usr/bin/runc --version
#make localtest
popd

[ -d $GOPATH/src/github.com/containernetworking/plugins ] || \
	git clone https://github.com/containernetworking/plugins.git $GOPATH/src/github.com/containernetworking/plugins
pushd $GOPATH/src/github.com/containernetworking/plugins
git pull --rebase
git status
./build_linux.sh
sudo mkdir -p /usr/libexec/cni
sudo cp bin/* /usr/libexec/cni
#./test_linux.sh
popd

sudo mkdir -p /etc/cni/net.d
[ -f /etc/cni/net.d/99-loopback.conf ] || \
	curl -qsSL https://raw.githubusercontent.com/containers/libpod/master/cni/87-podman-bridge.conflist | sudo tee /etc/cni/net.d/99-loopback.conf

sudo mkdir -p /etc/containers
[ -f /etc/containers/registries.conf ] || \
	sudo curl https://raw.githubusercontent.com/projectatomic/registries/master/registries.fedora -o /etc/containers/registries.conf
[ -f /etc/containers/policy.json ] || \
	sudo curl https://raw.githubusercontent.com/containers/skopeo/master/default-policy.json -o /etc/containers/policy.json

[ -d $GOPATH/src/github.com/containers/libpod ] || \
	git clone https://github.com/containers/libpod/ $GOPATH/src/github.com/containers/libpod
pushd $GOPATH/src/github.com/containers/libpod
git pull --rebase
git status
sudo make install BUILDTAGS="apparmor systemd selinux seccomp" PREFIX=/usr
popd

Now I should have latest podman built locally and recommended configs etc.

ubuntu@ubuntu:$ podman --version
podman version 1.6.4-dev
ubuntu@ubuntu:$ runc --version
runc version 1.0.0-rc9+dev
commit: 2186cfa3cd52b8e00b1de76db7859cacdf7b1f94
spec: 1.0.1-dev

Create and run container

$ cat tst-foo.sh 
#!/bin/bash

set -epux

for a in data; do
	podman volume create tst-"$a" || :
done
podman volume ls

podman rm foo || :

podman container run \
	--volume tst-data:/data \
	--network host \
	--log-level=trace \
	--name foo \
	debian:stretch-slim

Run container multiple times. Container starts and then stops automatically. And we are back to shell.

Like this:

ubuntu@ubuntu:~$ ./tst-foo.sh |& tee -a OK-1.log
+ for a in data
+ podman volume create tst-data
Error: volume with name tst-data already exists: volume already exists
+ :
+ podman volume ls
DRIVER   VOLUME NAME
local    tst-data
local    unifi-config
local    unifi-log
local    unifi-log2
local    unifi-run
local    unifi-work
+ podman rm foo
d2aa1241e7fc264c418f36fafebc86bbe0e531533a673b9749d4310a767b2a17
+ podman container run --volume tst-data:/data --network host --log-level=trace --name foo debian:stretch-slim
time="2019-12-01T21:08:55+02:00" level=debug msg="Reading configuration file \"/home/ubuntu/.config/containers/libpod.conf\""
time="2019-12-01T21:08:55+02:00" level=debug msg="Merged system config \"/home/ubuntu/.config/containers/libpod.conf\": &{{false false false true true true} 0 {
   [] [] []} /home/ubuntu/.local/share/containers/storage/volumes docker://  runc map[crun:[/usr/bin/crun /usr/local/bin/crun] runc:[/usr/bin/runc /usr/sbin/run
c /usr/local/bin/runc /usr/local/sbin/runc /sbin/runc /bin/runc /usr/lib/cri-o-runc/sbin/runc /run/current-system/sw/bin/runc]] [crun runc] [crun] [] [/usr/libe
xec/podman/conmon /usr/local/libexec/podman/conmon /usr/local/lib/podman/conmon /usr/bin/conmon /usr/sbin/conmon /usr/local/bin/conmon /usr/local/sbin/conmon /r
un/current-system/sw/bin/conmon] [PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin] cgroupfs  /home/ubuntu/.local/share/containers/storage/libp
od /run/user/1000/libpod/tmp -1 false /etc/cni/net.d/ [/usr/libexec/cni /usr/lib/cni /usr/local/lib/cni /opt/cni/bin]  []   k8s.gcr.io/pause:3.1 /pause true tru
e  2048 shm journald  ctrl-p,ctrl-q false}"
time="2019-12-01T21:08:55+02:00" level=debug msg="Using conmon: \"/usr/libexec/podman/conmon\""
time="2019-12-01T21:08:55+02:00" level=debug msg="Initializing boltdb state at /home/ubuntu/.local/share/containers/storage/libpod/bolt_state.db"
time="2019-12-01T21:08:55+02:00" level=debug msg="Using graph driver vfs"
time="2019-12-01T21:08:55+02:00" level=debug msg="Using graph root /home/ubuntu/.local/share/containers/storage"
time="2019-12-01T21:08:55+02:00" level=debug msg="Using run root /run/user/1000"
time="2019-12-01T21:08:55+02:00" level=debug msg="Using static dir /home/ubuntu/.local/share/containers/storage/libpod"
time="2019-12-01T21:08:55+02:00" level=debug msg="Using tmp dir /run/user/1000/libpod/tmp"
time="2019-12-01T21:08:55+02:00" level=debug msg="Using volume path /home/ubuntu/.local/share/containers/storage/volumes"
time="2019-12-01T21:08:55+02:00" level=debug msg="Set libpod namespace to \"\""
time="2019-12-01T21:08:55+02:00" level=debug msg="No store required. Not opening container store."
time="2019-12-01T21:08:55+02:00" level=debug msg="Initializing event backend journald"
time="2019-12-01T21:08:55+02:00" level=debug msg="using runtime \"/usr/bin/runc\""
time="2019-12-01T21:08:55+02:00" level=warning msg="Error initializing configured OCI runtime crun: no valid executable found for OCI runtime crun: invalid argument"
time="2019-12-01T21:08:55+02:00" level=info msg="running as rootless"
time="2019-12-01T21:08:55+02:00" level=debug msg="Reading configuration file \"/home/ubuntu/.config/containers/libpod.conf\""
time="2019-12-01T21:08:55+02:00" level=debug msg="Merged system config \"/home/ubuntu/.config/containers/libpod.conf\": &{{false false false true true true} 0 {   [] [] []} /home/ubuntu/.local/share/containers/storage/volumes docker://  runc map[crun:[/usr/bin/crun /usr/local/bin/crun] runc:[/usr/bin/runc /usr/sbin/runc /usr/local/bin/runc /usr/local/sbin/runc /sbin/runc /bin/runc /usr/lib/cri-o-runc/sbin/runc /run/current-system/sw/bin/runc]] [crun runc] [crun] [] [/usr/libexec/podman/conmon /usr/local/libexec/podman/conmon /usr/local/lib/podman/conmon /usr/bin/conmon /usr/sbin/conmon /usr/local/bin/conmon /usr/local/sbin/conmon /run/current-system/sw/bin/conmon] [PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin] cgroupfs  /home/ubuntu/.local/share/containers/storage/libpod /run/user/1000/libpod/tmp -1 false /etc/cni/net.d/ [/usr/libexec/cni /usr/lib/cni /usr/local/lib/cni /opt/cni/bin]  []   k8s.gcr.io/pause:3.1 /pause true true  2048 shm journald  ctrl-p,ctrl-q false}"
time="2019-12-01T21:08:55+02:00" level=debug msg="Using conmon: \"/usr/libexec/podman/conmon\""
time="2019-12-01T21:08:55+02:00" level=debug msg="Initializing boltdb state at /home/ubuntu/.local/share/containers/storage/libpod/bolt_state.db"
time="2019-12-01T21:08:55+02:00" level=debug msg="Using graph driver vfs"
time="2019-12-01T21:08:55+02:00" level=debug msg="Using graph root /home/ubuntu/.local/share/containers/storage"
time="2019-12-01T21:08:55+02:00" level=debug msg="Using run root /run/user/1000"
time="2019-12-01T21:08:55+02:00" level=debug msg="Using static dir /home/ubuntu/.local/share/containers/storage/libpod"
time="2019-12-01T21:08:55+02:00" level=debug msg="Using tmp dir /run/user/1000/libpod/tmp"
time="2019-12-01T21:08:55+02:00" level=debug msg="Using volume path /home/ubuntu/.local/share/containers/storage/volumes"
time="2019-12-01T21:08:55+02:00" level=debug msg="Set libpod namespace to \"\""
time="2019-12-01T21:08:55+02:00" level=debug msg="[graphdriver] trying provided driver \"vfs\""
time="2019-12-01T21:08:55+02:00" level=debug msg="Initializing event backend journald"
time="2019-12-01T21:08:55+02:00" level=warning msg="Error initializing configured OCI runtime crun: no valid executable found for OCI runtime crun: invalid argument"
time="2019-12-01T21:08:55+02:00" level=debug msg="using runtime \"/usr/bin/runc\""
time="2019-12-01T21:08:55+02:00" level=debug msg="parsed reference into \"[vfs@/home/ubuntu/.local/share/containers/storage+/run/user/1000]docker.io/library/debian:stretch-slim\""
time="2019-12-01T21:08:55+02:00" level=debug msg="parsed reference into \"[vfs@/home/ubuntu/.local/share/containers/storage+/run/user/1000]@cb936176655335077621a1b2e4c9091fd240fb2f1d8e05acee56357d1c93aa43\""
time="2019-12-01T21:08:55+02:00" level=debug msg="exporting opaque data as blob \"sha256:cb936176655335077621a1b2e4c9091fd240fb2f1d8e05acee56357d1c93aa43\""
time="2019-12-01T21:08:55+02:00" level=debug msg="User mount tst-data:/data options []"
time="2019-12-01T21:08:55+02:00" level=debug msg="Using host netmode"
time="2019-12-01T21:08:55+02:00" level=debug msg="setting container name foo"
time="2019-12-01T21:08:55+02:00" level=debug msg="created OCI spec and options for new container"
time="2019-12-01T21:08:55+02:00" level=debug msg="Allocated lock 1 for container 200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd"
time="2019-12-01T21:08:55+02:00" level=debug msg="parsed reference into \"[vfs@/home/ubuntu/.local/share/containers/storage+/run/user/1000]@cb936176655335077621a1b2e4c9091fd240fb2f1d8e05acee56357d1c93aa43\""
time="2019-12-01T21:08:55+02:00" level=debug msg="exporting opaque data as blob \"sha256:cb936176655335077621a1b2e4c9091fd240fb2f1d8e05acee56357d1c93aa43\""
time="2019-12-01T21:08:57+02:00" level=debug msg="created container \"200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd\""
time="2019-12-01T21:08:57+02:00" level=debug msg="container \"200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd\" has work directory \"/home/ubuntu/.local/share/containers/storage/vfs-containers/200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd/userdata\""
time="2019-12-01T21:08:57+02:00" level=debug msg="container \"200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd\" has run directory \"/run/user/1000/vfs-containers/200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd/userdata\""
time="2019-12-01T21:08:57+02:00" level=debug msg="New container created \"200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd\""
time="2019-12-01T21:08:57+02:00" level=debug msg="Not attaching to stdin"
time="2019-12-01T21:08:57+02:00" level=debug msg="mounted container \"200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd\" at \"/home/ubuntu/.local/share/containers/storage/vfs/dir/934e5055bbc5cdeaddfce615fbe6089db771a98f63fa4d090ac7f9a820263ea4\""
time="2019-12-01T21:08:57+02:00" level=debug msg="Created root filesystem for container 200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd at /home/ubuntu/.local/share/containers/storage/vfs/dir/934e5055bbc5cdeaddfce615fbe6089db771a98f63fa4d090ac7f9a820263ea4"
time="2019-12-01T21:08:57+02:00" level=debug msg="skipping loading default AppArmor profile (rootless mode)"
time="2019-12-01T21:08:57+02:00" level=debug msg="/etc/system-fips does not exist on host, not mounting FIPS mode secret"
time="2019-12-01T21:08:58+02:00" level=debug msg="Created OCI spec for container 200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd at /home/ubuntu/.local/share/containers/storage/vfs-containers/200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd/userdata/config.json"
time="2019-12-01T21:08:58+02:00" level=debug msg="running conmon: /usr/libexec/podman/conmon" args="[--api-version 1 -c 200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd -u 200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd -r /usr/bin/runc -b /home/ubuntu/.local/share/containers/storage/vfs-containers/200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd/userdata -p /run/user/1000/vfs-containers/200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd/userdata/pidfile -l k8s-file:/home/ubuntu/.local/share/containers/storage/vfs-containers/200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd/userdata/ctr.log --exit-dir /run/user/1000/libpod/tmp/exits --socket-dir-path /run/user/1000/libpod/tmp/socket --log-level trace --runtime-arg --log-format=json --runtime-arg --log --runtime-arg=/run/user/1000/vfs-containers/200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd/userdata/oci-log --conmon-pidfile /run/user/1000/vfs-containers/200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /home/ubuntu/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/1000 --exit-command-arg --log-level --exit-command-arg trace --exit-command-arg --cgroup-manager --exit-command-arg cgroupfs --exit-command-arg --tmpdir --exit-command-arg /run/user/1000/libpod/tmp --exit-command-arg --runtime --exit-command-arg runc --exit-command-arg --storage-driver --exit-command-arg vfs --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg container --exit-command-arg cleanup --exit-command-arg 200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd]"
time="2019-12-01T21:08:58+02:00" level=warning msg="Failed to add conmon to cgroupfs sandbox cgroup: error creating cgroup for memory: mkdir /sys/fs/cgroup/memory/libpod_parent: permission denied"
[conmon:d]: failed to write to /proc/self/oom_score_adj: Permission denied

time="2019-12-01T21:08:58+02:00" level=debug msg="Received: 3221"
time="2019-12-01T21:08:58+02:00" level=info msg="Got Conmon PID as 3210"
time="2019-12-01T21:08:58+02:00" level=debug msg="Created container 200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd in OCI runtime"
time="2019-12-01T21:08:58+02:00" level=debug msg="Attaching to container 200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd"
time="2019-12-01T21:08:58+02:00" level=debug msg="connecting to socket /run/user/1000/libpod/tmp/socket/200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd/attach"
time="2019-12-01T21:08:58+02:00" level=debug msg="Starting container 200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd with command [bash]"
time="2019-12-01T21:08:58+02:00" level=debug msg="Started container 200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd"
time="2019-12-01T21:08:58+02:00" level=debug msg="Enabling signal proxying"
ubuntu@ubuntu:~$

Reboot machine.

sudo systemctl reboot

Describe the results you received:

I started script as before boot, but it is stuck there. I can end program by C-c but for example 'podman ps' is also stuck when 'podman run' is stuck.

ubuntu@ubuntu:~$ ./tst-foo.sh |& tee -a FAIL-1.log
+ for a in data
+ podman volume create tst-data
Error: volume with name tst-data already exists: volume already exists
+ :
+ podman volume ls
DRIVER   VOLUME NAME
local    tst-data
+ podman rm foo
200381ae9755d1e5d6917a6f5d60d6d2c14b7db81247065b99fc16b6e1ab09dd
+ podman container run --volume tst-data:/data --network host --log-level=trace --name foo debian:stretch-slim
time="2019-12-01T21:11:35+02:00" level=debug msg="Reading configuration file \"/home/ubuntu/.config/containers/libpod.conf\""
time="2019-12-01T21:11:35+02:00" level=debug msg="Merged system config \"/home/ubuntu/.config/containers/libpod.conf\": &{{false false false true true true} 0 {   [] [] []} /home/ubuntu/.local/share/containers/storage/volumes docker://  runc map[crun:[/usr/bin/crun /usr/local/bin/crun] runc:[/usr/bin/runc /usr/sbin/runc /usr/local/bin/runc /usr/local/sbin/runc /sbin/runc /bin/runc /usr/lib/cri-o-runc/sbin/runc /run/current-system/sw/bin/runc]] [crun runc] [crun] [] [/usr/libexec/podman/conmon /usr/local/libexec/podman/conmon /usr/local/lib/podman/conmon /usr/bin/conmon /usr/sbin/conmon /usr/local/bin/conmon /usr/local/sbin/conmon /run/current-system/sw/bin/conmon] [PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin] cgroupfs  /home/ubuntu/.local/share/containers/storage/libpod /run/user/1000/libpod/tmp -1 false /etc/cni/net.d/ [/usr/libexec/cni /usr/lib/cni /usr/local/lib/cni /opt/cni/bin]  []   k8s.gcr.io/pause:3.1 /pause true true  2048 shm journald  ctrl-p,ctrl-q false}"
time="2019-12-01T21:11:35+02:00" level=debug msg="Using conmon: \"/usr/libexec/podman/conmon\""
time="2019-12-01T21:11:35+02:00" level=debug msg="Initializing boltdb state at /home/ubuntu/.local/share/containers/storage/libpod/bolt_state.db"
time="2019-12-01T21:11:35+02:00" level=debug msg="Using graph driver vfs"
time="2019-12-01T21:11:35+02:00" level=debug msg="Using graph root /home/ubuntu/.local/share/containers/storage"
time="2019-12-01T21:11:35+02:00" level=debug msg="Using run root /run/user/1000"
time="2019-12-01T21:11:35+02:00" level=debug msg="Using static dir /home/ubuntu/.local/share/containers/storage/libpod"
time="2019-12-01T21:11:35+02:00" level=debug msg="Using tmp dir /run/user/1000/libpod/tmp"
time="2019-12-01T21:11:35+02:00" level=debug msg="Using volume path /home/ubuntu/.local/share/containers/storage/volumes"
time="2019-12-01T21:11:35+02:00" level=debug msg="Set libpod namespace to \"\""
time="2019-12-01T21:11:35+02:00" level=debug msg="No store required. Not opening container store."
time="2019-12-01T21:11:35+02:00" level=debug msg="Initializing event backend journald"
time="2019-12-01T21:11:35+02:00" level=warning msg="Error initializing configured OCI runtime crun: no valid executable found for OCI runtime crun: invalid argument"
time="2019-12-01T21:11:35+02:00" level=debug msg="using runtime \"/usr/bin/runc\""
time="2019-12-01T21:11:35+02:00" level=info msg="running as rootless"
time="2019-12-01T21:11:35+02:00" level=debug msg="Reading configuration file \"/home/ubuntu/.config/containers/libpod.conf\""
time="2019-12-01T21:11:35+02:00" level=debug msg="Merged system config \"/home/ubuntu/.config/containers/libpod.conf\": &{{false false false true true true} 0 {   [] [] []} /home/ubuntu/.local/share/containers/storage/volumes docker://  runc map[crun:[/usr/bin/crun /usr/local/bin/crun] runc:[/usr/bin/runc /usr/sbin/runc /usr/local/bin/runc /usr/local/sbin/runc /sbin/runc /bin/runc /usr/lib/cri-o-runc/sbin/runc /run/current-system/sw/bin/runc]] [crun runc] [crun] [] [/usr/libexec/podman/conmon /usr/local/libexec/podman/conmon /usr/local/lib/podman/conmon /usr/bin/conmon /usr/sbin/conmon /usr/local/bin/conmon /usr/local/sbin/conmon /run/current-system/sw/bin/conmon] [PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin] cgroupfs  /home/ubuntu/.local/share/containers/storage/libpod /run/user/1000/libpod/tmp -1 false /etc/cni/net.d/ [/usr/libexec/cni /usr/lib/cni /usr/local/lib/cni /opt/cni/bin]  []   k8s.gcr.io/pause:3.1 /pause true true  2048 shm journald  ctrl-p,ctrl-q false}"
time="2019-12-01T21:11:35+02:00" level=debug msg="Using conmon: \"/usr/libexec/podman/conmon\""
time="2019-12-01T21:11:35+02:00" level=debug msg="Initializing boltdb state at /home/ubuntu/.local/share/containers/storage/libpod/bolt_state.db"
time="2019-12-01T21:11:35+02:00" level=debug msg="Using graph driver vfs"
time="2019-12-01T21:11:35+02:00" level=debug msg="Using graph root /home/ubuntu/.local/share/containers/storage"
time="2019-12-01T21:11:35+02:00" level=debug msg="Using run root /run/user/1000"
time="2019-12-01T21:11:35+02:00" level=debug msg="Using static dir /home/ubuntu/.local/share/containers/storage/libpod"
time="2019-12-01T21:11:35+02:00" level=debug msg="Using tmp dir /run/user/1000/libpod/tmp"
time="2019-12-01T21:11:35+02:00" level=debug msg="Using volume path /home/ubuntu/.local/share/containers/storage/volumes"
time="2019-12-01T21:11:35+02:00" level=debug msg="Set libpod namespace to \"\""
time="2019-12-01T21:11:35+02:00" level=debug msg="[graphdriver] trying provided driver \"vfs\""
time="2019-12-01T21:11:35+02:00" level=debug msg="Initializing event backend journald"
time="2019-12-01T21:11:35+02:00" level=debug msg="using runtime \"/usr/bin/runc\""
time="2019-12-01T21:11:35+02:00" level=warning msg="Error initializing configured OCI runtime crun: no valid executable found for OCI runtime crun: invalid argument"
time="2019-12-01T21:11:35+02:00" level=debug msg="parsed reference into \"[vfs@/home/ubuntu/.local/share/containers/storage+/run/user/1000]docker.io/library/debian:stretch-slim\""
time="2019-12-01T21:11:35+02:00" level=debug msg="parsed reference into \"[vfs@/home/ubuntu/.local/share/containers/storage+/run/user/1000]@cb936176655335077621a1b2e4c9091fd240fb2f1d8e05acee56357d1c93aa43\""
time="2019-12-01T21:11:35+02:00" level=debug msg="exporting opaque data as blob \"sha256:cb936176655335077621a1b2e4c9091fd240fb2f1d8e05acee56357d1c93aa43\""
time="2019-12-01T21:11:35+02:00" level=debug msg="User mount tst-data:/data options []"
time="2019-12-01T21:11:35+02:00" level=debug msg="Using host netmode"
time="2019-12-01T21:11:35+02:00" level=debug msg="setting container name foo"
time="2019-12-01T21:11:35+02:00" level=debug msg="created OCI spec and options for new container"
time="2019-12-01T21:11:35+02:00" level=debug msg="Allocated lock 0 for container 1f1428f5fdefdf5e01f534aecd2c338f41cc3c591c4bb93a7d25450c905b51d0"
time="2019-12-01T21:11:35+02:00" level=debug msg="parsed reference into \"[vfs@/home/ubuntu/.local/share/containers/storage+/run/user/1000]@cb936176655335077621a1b2e4c9091fd240fb2f1d8e05acee56357d1c93aa43\""
time="2019-12-01T21:11:35+02:00" level=debug msg="exporting opaque data as blob \"sha256:cb936176655335077621a1b2e4c9091fd240fb2f1d8e05acee56357d1c93aa43\""
time="2019-12-01T21:11:38+02:00" level=debug msg="created container \"1f1428f5fdefdf5e01f534aecd2c338f41cc3c591c4bb93a7d25450c905b51d0\""
time="2019-12-01T21:11:38+02:00" level=debug msg="container \"1f1428f5fdefdf5e01f534aecd2c338f41cc3c591c4bb93a7d25450c905b51d0\" has work directory \"/home/ubuntu/.local/share/containers/storage/vfs-containers/1f1428f5fdefdf5e01f534aecd2c338f41cc3c591c4bb93a7d25450c905b51d0/userdata\""
time="2019-12-01T21:11:38+02:00" level=debug msg="container \"1f1428f5fdefdf5e01f534aecd2c338f41cc3c591c4bb93a7d25450c905b51d0\" has run directory \"/run/user/1000/vfs-containers/1f1428f5fdefdf5e01f534aecd2c338f41cc3c591c4bb93a7d25450c905b51d0/userdata\""
time="2019-12-01T21:11:38+02:00" level=debug msg="New container created \"1f1428f5fdefdf5e01f534aecd2c338f41cc3c591c4bb93a7d25450c905b51d0\""
time="2019-12-01T21:11:38+02:00" level=debug msg="Not attaching to stdin"
time="2019-12-01T21:11:38+02:00" level=debug msg="mounted container \"1f1428f5fdefdf5e01f534aecd2c338f41cc3c591c4bb93a7d25450c905b51d0\" at \"/home/ubuntu/.local/share/containers/storage/vfs/dir/aeff9f245dcbf70bb36262adf92ae73300e03ab0b2dca53eaf4e372e03e8e003\""

Describe the results you expected:

I'd expect container to run and then go back to shell.

Additional information you deem important (e.g. issue happens only occasionally):

If I remove '--volume tst-data:/data' from 'podman container run' parameters it works as excpected. Volume is untouched.

If I change container from 'debian:stretch-lite' to 'alpine' issue stays. And further still to 'fedora' and no change.

Easiest way to see if this problem is active is to run 'podman ps' in another window and if it is stuck then there is problem.

I've tried this with running container by sudo and problem is still existing.

I guess '--network host' is not needed but it slightly simplifies testing. I needed to install 'slirp4netns'

Without --network host, last line of problematic run is like:

DEBU[0002] slirp4netns command: /usr/bin/slirp4netns --disable-host-loopback --mtu 65520 --enable-sandbox -c -e 3 -r 4 --netns-type=path /run/user/1000/netns/cni-6a764d17-c67b-ec75-8eff-f7ff48758698 tap0

When running under strace, last action is from main thread:

20:47:50.706478 execve("/usr/bin/podman", ["podman", "container", "run", "--volume", "tst-data:/data", "--network", "host", "--log-level=trace", "--name", "foo", "debian:stretch-slim"], 0xffffe54401f0 /* 28 vars */) = 0
...
20:48:01.015083 fchownat(AT_FDCWD, "/home/ubuntu/.local/share/containers/storage/vfs/dir/58fddb03aca3a4a8c1228c38ae21c36ef83617a3a572b348b3fc2ba2bd6efafc", 0, 0, 0) = 0
20:48:01.015241 openat(AT_FDCWD, "/home/ubuntu/.local/share/containers/storage/libpod/bolt_state.db", O_RDWR|O_CREAT|O_CLOEXEC, 0600) = 5
20:48:01.015387 epoll_ctl(4, EPOLL_CTL_ADD, 5, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=2889584192, u64=281473571327552}}) = -1 EPERM (Operation not permitted)
20:48:01.015497 epoll_ctl(4, EPOLL_CTL_DEL, 5, 0x40002359a8) = -1 EPERM (Operation not permitted)
20:48:01.015598 flock(5, LOCK_EX|LOCK_NB) = 0
20:48:01.015703 fstat(5, {st_mode=S_IFREG|0600, st_size=262144, ...}) = 0
20:48:01.015808 pread64(5, "\0\0\0\0\0\0\0\0\4\0\0\0\0\0\0\0\355\332\f\355\2\0\0\0\0\20\0\0\0\0\0\0\27\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\30\0\0\0\0\0\0\0(\0\0\0\0\0\0\0\370\0\0\0\0\0\0\0t\313\310\324 \354\364\355\0...\0", 4096, 0) = 4096
20:48:01.016013 fstat(5, {st_mode=S_IFREG|0600, st_size=262144, ...}) = 0
20:48:01.016118 mmap(NULL, 262144, PROT_READ, MAP_SHARED, 5, 0) = 0xffffac27f000
20:48:01.016280 madvise(0xffffac27f000, 262144, MADV_RANDOM) = 0
20:48:01.016503 munmap(0xffffac27f000, 262144) = 0
20:48:01.016632 flock(5, LOCK_UN)       = 0
20:48:01.016734 close(5)                = 0
20:48:01.016849 futex(0xffffac2bf048, FUTEX_WAIT, 2147485635, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)

Output of podman version:

(Version:            1.6.4-dev
RemoteAPI Version:  1
Go Version:         go1.12.10
Git Commit:         39c705e9405faa4d02b71165d05eec1e7bb44d93
Built:              Sun Dec  1 20:33:38 2019
OS/Arch:            linux/arm64

Output of podman info --debug:

debug:
  compiler: gc
  git commit: 39c705e9405faa4d02b71165d05eec1e7bb44d93
  go version: go1.12.10
  podman version: 1.6.4-dev
host:
  BuildahVersion: 1.11.5
  CgroupVersion: v1
  Conmon:
    package: Unknown
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.0.4-dev, commit: 42bce451a613f56ae9d6466ac89b3e41594fb93f'
  Distribution:
    distribution: ubuntu
    version: "19.10"
  IDMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  MemFree: 2299838464
  MemTotal: 2989670400
  OCIRuntime:
    name: runc
    package: Unknown
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc9+dev
      commit: 2186cfa3cd52b8e00b1de76db7859cacdf7b1f94
      spec: 1.0.1-dev
  SwapFree: 0
  SwapTotal: 0
  arch: arm64
  cpus: 4
  eventlogger: journald
  hostname: ubuntu
  kernel: 5.3.0-1012-raspi2
  os: linux
  rootless: true
  uptime: 21m 44.04s
registries:
  blocked: null
  insecure: null
  search:
  - docker.io
  - registry.fedoraproject.org
  - registry.access.redhat.com
store:
  ConfigFile: /home/ubuntu/.config/containers/storage.conf
  ContainerStore:
    number: 1
  GraphDriverName: vfs
  GraphOptions: {}
  GraphRoot: /home/ubuntu/.local/share/containers/storage
  GraphStatus: {}
  ImageStore:
    number: 17
  RunRoot: /run/user/1000
  VolumePath: /home/ubuntu/.local/share/containers/storage/volumes

Package info (e.g. output of rpm -q podman or apt list podman):

Not installed by package manager.

Additional environment details (AWS, VirtualBox, physical, etc.):

Raspberry pi 4
Linux ubuntu 5.3.0-1012-raspi2 #14-Ubuntu SMP Mon Nov 11 10:06:55 UTC 2019 aarch64 aarch64 aarch64 GNU/Linux

$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=19.10
DISTRIB_CODENAME=eoan
DISTRIB_DESCRIPTION="Ubuntu 19.10"

My current kernel commandline is

$ cat /proc/cmdline 
 coherent_pool=1M 8250.nr_uarts=1 cma=64M cma=256M  smsc95xx.macaddr=DC:A6:32:36:95:B8 vc_mem.mem_base=0x3ec00000 vc_mem.mem_size=0x40000000  net.ifnames=0 dwc_otg.lpm_enable=0 console=ttyS0,115200 console=tty1 root=/dev/mapper/vg_rpi4_00-root_t rootfstype=ext4 elevator=deadline rootwait cgroup_enable=cpuset cgroup_enable=memory cgroup_memory=1 LANG=C.UTF-8 vconsole.keymap=fi zswap.enabled=1 zswap.compressor=lz4 zswap.zpool=z3fold page_poison=1 pti=on slab_nomerge slub_debug=P vsyscall=none init_on_alloc=1 init_on_free=1 quiet splash

But I've seen this issue also when I've used default cmdline.

If I remove volume it always gives error about freeing lock.

ubuntu@ubuntu:~$ podman volume rm tst-data
Error: error freeing lock for volume tst-data: no such file or directory

Test script works again as expected until reboot.

Volume is just:

ubuntu@ubuntu:~$ ls -laR ./.local/share/containers/storage/volumes/tst-data
./.local/share/containers/storage/volumes/tst-data:
total 0
drwx------ 3 ubuntu ubuntu  19 Dec  1 21:38 .
drwx------ 8 ubuntu ubuntu 112 Dec  1 21:38 ..
drwxr-xr-x 2 ubuntu ubuntu   6 Dec  1 21:38 _data

./.local/share/containers/storage/volumes/tst-data/_data:
total 0
drwxr-xr-x 2 ubuntu ubuntu  6 Dec  1 21:38 .
drwx------ 3 ubuntu ubuntu 19 Dec  1 21:38 ..

I also tried to restore /run/usr/1000 after reboot, but it did not work.

The text was updated successfully, but these errors were encountered:

rhatdan · 2019-12-03T02:48:38Z

@mheon PTAL

mheon · 2019-12-03T03:57:00Z

Got it. Damn, damn, damn. Same issue as #4621

On system reboot, we're not correctly reacquiring locks for volumes, so if you delete/recreate a container, it can cause conflicts.

mheon · 2019-12-03T03:57:35Z

We are definitely going to need a new build for RHEL once I have a patch out for this.

mheon · 2019-12-03T03:59:26Z

(Just 8.1.1 and 8.2)

After a restart, pods and containers both run a refresh() function to prepare to run after a reboot. Until now, volumes have not had a similar function, because they had no per-boot setup to perform. Unfortunately, this was not noticed when in-memory locking was introduced to volumes. The refresh() routine is, among other things, responsible for ensuring that locks are reserved after a reboot, ensuring they cannot be taken by a freshly-created container, pod, or volume. If this reservation is not done, we can end up with two objects using the same lock, potentially needing to lock each other for some operations - classic recipe for deadlocks. Add a refresh() function to volumes to perform lock reservation and ensure it is called as part of overall refresh(). Fixes containers#4605 Fixes containers#4621 Signed-off-by: Matthew Heon <[email protected]>

mheon · 2019-12-03T04:12:20Z

#4624 should have a fix

After a restart, pods and containers both run a refresh() function to prepare to run after a reboot. Until now, volumes have not had a similar function, because they had no per-boot setup to perform. Unfortunately, this was not noticed when in-memory locking was introduced to volumes. The refresh() routine is, among other things, responsible for ensuring that locks are reserved after a reboot, ensuring they cannot be taken by a freshly-created container, pod, or volume. If this reservation is not done, we can end up with two objects using the same lock, potentially needing to lock each other for some operations - classic recipe for deadlocks. Add a refresh() function to volumes to perform lock reservation and ensure it is called as part of overall refresh(). Fixes containers#4605 Fixes containers#4621 Signed-off-by: Matthew Heon <[email protected]>

openshift-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Dec 1, 2019

rhatdan assigned mheon Dec 3, 2019

This was referenced Dec 3, 2019

Ensure volumes reacquire locks on state refresh #4624

Merged

Fedora 31: "podman start" hangs #4621

Closed

openshift-merge-robot closed this as completed in #4624 Dec 3, 2019

github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 23, 2023

github-actions bot locked as resolved and limited conversation to collaborators Sep 23, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

podman fails to start container after reboot if using volume #4605

podman fails to start container after reboot if using volume #4605

maage commented Dec 1, 2019

rhatdan commented Dec 3, 2019

mheon commented Dec 3, 2019

mheon commented Dec 3, 2019

mheon commented Dec 3, 2019

mheon commented Dec 3, 2019

podman fails to start container after reboot if using volume #4605

podman fails to start container after reboot if using volume #4605

Comments

maage commented Dec 1, 2019

rhatdan commented Dec 3, 2019

mheon commented Dec 3, 2019

mheon commented Dec 3, 2019

mheon commented Dec 3, 2019

mheon commented Dec 3, 2019