
On system (re)boot, podman locks while starting a pod from an autogenerated systemd file on F31 #5377

Closed
groovyman opened this issue Mar 3, 2020 · 2 comments
Labels: kind/bug, locked - please file new issue/PR


groovyman commented Mar 3, 2020

/kind bug

Description
I created two containers that run under two separate user accounts. After a successful podman build and podman run, both containers are running. Both containers are currently built on Fedora 31; the host also runs the F31 Server edition. The hardware (testbed) is a cheap J4205 server with 4 RAID disks, and the OS is configured to install updates by itself and reboot if needed. The containers:

  1. simple-app: runs under a user account that can be accessed from outside via ssh -Y user@chasmash -p 9222 ./startShellJavaSwingapp.sh, so I can access this desktop application from any desktop. On a LAN this application runs fast enough.

  2. web-application, more complex: this container runs a Perl/PHP FastCGI Apache web application that needs access to a PostgreSQL database running on the host. The Podman Dockerfile also allows generating an image that includes its own database, for development.

I can build and run both apps on the server. From time to time the server needs a reboot, so I have to integrate the Podman start/stop with systemd. Calling:

$ podman generate systemd --name kiki_p05 > container-kivi.service
$ cp container-kivi.service ~/.config/systemd/user/container-kivi.service
$ systemctl --user daemon-reload
$ systemctl --user enable container-kivi.service
$ systemctl --user start  container-kivi.service

does exactly what I want for both applications. Great job!

But when a reboot takes place, something strange happens. It seems that podman start blocks for some reason, because when I log in, podman tells me (kiki_p05 is the desired container):

[techlxoffice@chasmash ~]$ podman ps -a
CONTAINER ID  IMAGE                          COMMAND               CREATED       STATUS                   PORTS                        NAMES
016f2e02a570  localhost/kivi_prod_05:latest  /lib/systemd/syst...  32 hours ago  Created                  192.168.178.39:9190->80/tcp  kiki_p05
0925fa89f576  localhost/kivi_prod_04:latest  /lib/systemd/syst...  38 hours ago  Exited (0) 36 hours ago  192.168.178.39:9190->80/tcp  kiki_p04
583200deccf4  localhost/kivi_prod:latest     /lib/systemd/syst...  3 weeks ago   Created                  192.168.178.39:9190->80/tcp  funny_zhukovsky
[techlxoffice@chasmash ~]$ podman start kiki_p05

When you start kiki_p05 as shown above, the command simply blocks and does not return. Pressing Ctrl-C breaks out of the command. From time to time, when you issue the podman command again, it succeeds.

Well, systemctl gives no more information, because in the end it runs podman itself; it returns after some time while the spawned podman is still running (and maybe blocking other calls):

[techlxoffice@chasmash ~]$ systemctl --user start container-kivi.service
Job for container-kivi.service failed because a timeout was exceeded.
See "systemctl --user status container-kivi.service" and "journalctl --user -xe" for details.

[techlxoffice@chasmash ~]$ systemctl --user status container-kivi.service
● container-kivi.service - Podman container-kiki_p05.service
   Loaded: loaded (/mnt/raidSpace/Homes/techlxoffice/.config/systemd/user/container-kivi.service; enabled; vendor preset: enabled)
   Active: activating (start) since Tue 2020-03-03 08:18:10 CET; 10s ago
     Docs: man:podman-generate-systemd(1)
Cntrl PID: 1656 (podman)
    Tasks: 35 (limit: 18752)
   Memory: 84.8M
      CPU: 150ms
   CGroup: /user.slice/user-1001.slice/user@1001.service/container-kivi.service
           ├─1611 /usr/bin/podman start kiki_p05
           ├─1625 /usr/bin/slirp4netns --disable-host-loopback --mtu 65520 -c -e 3 -r 4 --netns-type=path /run/user/1001/netns/cni-88cf2e5f-cf3d-b489-9844-2cfb42ead18>
           ├─1628 containers-rootlessport
           ├─1644 /usr/bin/fuse-overlayfs -o lowerdir=/mnt/raidSpace/Homes/techlxoffice/.local/share/containers/storage/overlay/l/VPX562TR4OBJCDWG3H2HMRVR6N:/mnt/raid>
           └─1656 /usr/bin/podman start kiki_p05

Mär 03 08:18:10 chasmash systemd[1542]: Starting Podman container-kiki_p05.service...
Mär 03 08:19:40 chasmash systemd[1542]: container-kivi.service: Found left-over process 1644 (fuse-overlayfs) in control group while starting unit. Ignoring.
Mär 03 08:19:40 chasmash systemd[1542]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Mär 03 08:19:40 chasmash systemd[1542]: container-kivi.service: Found left-over process 1656 (podman) in control group while starting unit. Ignoring.
Mär 03 08:19:40 chasmash systemd[1542]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Mär 03 08:19:40 chasmash systemd[1542]: Starting Podman container-kiki_p05.service...

Anyway, I have also seen this behaviour before, and the simple application with the Java Swing application inside has the same problem. It cannot be started after a system reboot.

An idea:
Maybe you expect a logged-in user for each account. When podman is called by systemctl --user, there is no login on the console or via sshd. I found out that when I log in to the account of the simple application from the root account by calling su - simple, podman fails to run. A real login over ssh or a terminal does something that systemctl --user maybe forgets to do (just an idea).
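A possible explanation for the login dependency described above: systemd only keeps a user's service manager (and the runtime directory under /run/user/&lt;uid&gt;) alive while that user has an open session, unless "lingering" is enabled for the account. A minimal sketch of checking and enabling it, assuming the techlxoffice account from the logs above and root privileges:

```shell
# Show whether lingering is enabled for the account.
# "Linger=yes" means the user's systemd instance is started at boot,
# even without an interactive login.
loginctl show-user techlxoffice --property=Linger

# Enable lingering so that `systemctl --user` units for this account
# are started at boot and /run/user/<uid> stays available.
loginctl enable-linger techlxoffice
```

With lingering enabled, the user unit would be started at boot by the user's own systemd instance rather than only after a real login.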

Steps to reproduce the issue:

  1. create a technical user on a server

  2. as this technical user, do a podman build and a podman run, and create a systemd file with podman generate systemd --name kiki_p05. Copy the output into a file under ~/.config/systemd/user/, then reload and enable the new service

  3. perform a reboot and your container will no longer be running
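The steps above can be sketched as a shell session; the user, image, and container names are taken from the report and are otherwise illustrative:

```shell
# 1. Create the technical user (as root).
useradd -m techlxoffice

# 2. As that user: build the image, create the container, and
#    generate + install a systemd user unit for it.
podman build -t kivi_prod_05 .
podman run -d --name kiki_p05 localhost/kivi_prod_05:latest
mkdir -p ~/.config/systemd/user
podman generate systemd --name kiki_p05 \
  > ~/.config/systemd/user/container-kivi.service
systemctl --user daemon-reload
systemctl --user enable container-kivi.service

# 3. Reboot the host; afterwards the container sits in "Created"
#    instead of running.
sudo reboot
```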

Describe the results you received:
a container that is not running and has problems getting started

Describe the results you expected:
after a host reboot, a running container

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

[techlxoffice@chasmash ~]$ podman version
Version:            1.8.0
RemoteAPI Version:  1
Go Version:         go1.13.6
OS/Arch:            linux/amd64

Output of podman info --debug:

[techlxoffice@chasmash ~]$ podman info --debug
debug:
  compiler: gc
  git commit: ""
  go version: go1.13.6
  podman version: 1.8.0
host:
  BuildahVersion: 1.13.1
  CgroupVersion: v2
  Conmon:
    package: conmon-2.0.10-2.fc31.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.10, commit: 6b526d9888abb86b9e7de7dfdeec0da98ad32ee0'
  Distribution:
    distribution: fedora
    version: "31"
  IDMappings:
    gidmap:
    - container_id: 0
      host_id: 1001
      size: 1
    - container_id: 1
      host_id: 165536
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1001
      size: 1
    - container_id: 1
      host_id: 165536
      size: 65536
  MemFree: 15543816192
  MemTotal: 16427073536
  OCIRuntime:
    name: crun
    package: crun-0.12.2.1-1.fc31.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.12.2.1
      commit: cd7cea7114db5f6aa35fbb69fa307c19c2728a31
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  SwapFree: 8300523520
  SwapTotal: 8300523520
  arch: amd64
  cpus: 4
  eventlogger: journald
  hostname: chasmash
  kernel: 5.5.7-200.fc31.x86_64
  os: linux
  rootless: true
  slirp4netns:
    Executable: /usr/bin/slirp4netns
    Package: slirp4netns-0.4.0-20.1.dev.gitbbd6f25.fc31.x86_64
    Version: |-
      slirp4netns version 0.4.0-beta.3+dev
      commit: bbd6f25c70d5db2a1cd3bfb0416a8db99a75ed7e
  uptime: 42m 3.15s
registries:
  search:
  - docker.io
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
  - quay.io
store:
  ConfigFile: /mnt/raidSpace/Homes/techlxoffice/.config/containers/storage.conf
  ContainerStore:
    number: 3
  GraphDriverName: overlay
  GraphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-0.7.5-2.fc31.x86_64
      Version: |-
        fusermount3 version: 3.6.2
        fuse-overlayfs: version 0.7.5
        FUSE library version 3.6.2
        using FUSE kernel interface version 7.29
  GraphRoot: /mnt/raidSpace/Homes/techlxoffice/.local/share/containers/storage
  GraphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 171
  RunRoot: /run/user/1001/containers
  VolumePath: /mnt/raidSpace/Homes/techlxoffice/.local/share/containers/storage/volumes

Package info (e.g. output of rpm -q podman or apt list podman):

[techlxoffice@chasmash ~]$ rpm -q podman
podman-1.8.0-2.fc31.x86_64

The host is running the Fedora 31 Server edition (with auto-update).


groovyman commented Mar 3, 2020

A minor update (see also #5094): the services are not launched if I extend the systemd service file with dependencies on user@1001.service.

# container-kiki_p05.service
# autogenerated by Podman 1.8.0
# Sun Mar  1 23:30:21 CET 2020

[Unit]
Description=Podman container-kiki_p05.service
Documentation=man:podman-generate-systemd(1)
Requires=postgresql.service user@1001.service
After=network.target postgresql.target user@1001.service

[Service]
Restart=on-failure
ExecStart=/usr/bin/podman start kiki_p05
ExecStop=/usr/bin/podman stop -t 10 kiki_p05
PIDFile=/run/user/1001/containers/overlay-containers/016f2e02a570cf36a5822dea07bb3f14d5ab685fde636594085d58c834ec07ff/userdata/conmon.pid
KillMode=none
Type=forking

[Install]
WantedBy=multi-user.target
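A side note on the generated unit above (a sketch, not a verified fix): `multi-user.target` exists only in the system instance of systemd. In a `systemctl --user` session the usual install target is `default.target`, so a user unit under ~/.config/systemd/user/ would normally carry an [Install] section like:

```ini
[Install]
; Enable against the user instance's default target;
; multi-user.target is a system-instance target.
WantedBy=default.target
```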

@vrothberg (Member)

Closing, as we already discuss roughly the same issue in #5094.

The github-actions bot locked the issue as resolved and limited the conversation to collaborators on Sep 23, 2023.