Podman containers running in "Created" status #11478

Closed
GBBx opened this issue Sep 8, 2021 · 6 comments · Fixed by containers/common#759
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments


GBBx commented Sep 8, 2021

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

I think it is related to #9663 but when I follow the suggested solution it doesn't solve the problem for me.

Steps to reproduce the issue:

  1. Run podman play kube on a Kubernetes-style YAML file with one or more containers in rootless mode

  2. Wait a couple of days

  3. podman container ls --all shows the container(s) as "Created" and I cannot use podman exec or podman logs --follow

Describe the results you received:

The pod and its containers are "Created"

Describe the results you expected:

The pod and containers should stay in Up status.

Additional information you deem important (e.g. issue happens only occasionally):

The problem doesn't happen when I restart the pod or play the kubernetes file. It occurs later, seemingly by itself.

Output of podman version:

Version:      3.2.3
API Version:  3.2.3
Go Version:   go1.15.14
Built:        Wed Aug 11 21:36:54 2021
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.21.3
  cgroupControllers: []
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: conmon-2.0.29-1.el8.3.10.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.29, commit: '
  cpus: 4
  distribution:
    distribution: '"almalinux"'
    version: "8.4"
  eventLogger: file
  hostname: mymachine
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 795
      size: 1
    - container_id: 1
      host_id: 100000
      size: 10001
    uidmap:
    - container_id: 0
      host_id: 990
      size: 1
    - container_id: 1
      host_id: 100000
      size: 10001
  kernel: 4.18.0-305.12.1.el8_4.x86_64
  linkmode: dynamic
  memFree: 7432572928
  memTotal: 8347439104
  ociRuntime:
    name: crun
    package: crun-0.20.1-1.el8.3.7.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.20.1
      commit: 0d42f1109fd73548f44b01b3e84d04a279e99d2e
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    path: /tmp/podman-run-990/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.8-4.el8.7.14.x86_64
    version: |-
      slirp4netns version 1.1.8
      commit: d361001f495417b880f20329121e3aa431a8f90f
      libslirp: 4.3.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.1
  swapFree: 1724420096
  swapTotal: 2147479552
  uptime: 139h 24m 4.12s (Approximately 5.79 days)
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
store:
  configFile: /home/podman/.config/containers/storage.conf
  containerStore:
    number: 2
    paused: 0
    running: 0
    stopped: 2
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.5.0-1.el8.5.3.x86_64
      Version: |-
        fusermount3 version: 3.2.1
        fuse-overlayfs: version 1.5
        FUSE library version 3.2.1
        using FUSE kernel interface version 7.26
  graphRoot: /home/podman/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 2
  runRoot: /tmp/podman-run-990/containers
  volumePath: /home/podman/.local/share/containers/storage/volumes
version:
  APIVersion: 3.2.3
  Built: 1628710614
  BuiltTime: Wed Aug 11 21:36:54 2021
  GitCommit: ""
  GoVersion: go1.15.14
  OsArch: linux/amd64
  Version: 3.2.3

Package info (e.g. output of rpm -q podman or apt list podman):

podman-3.2.3-1.el8.1.6.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/master/troubleshooting.md)

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):

tmpfile:

podman info --debug | grep runRoot
  runRoot: /tmp/podman-run-990/containers
cat /etc/tmpfiles.d/podman.conf
# /tmp/podman-run-* directory can contain content for Podman containers that have run
# for many days. This following line prevents systemd from removing this content.
x /tmp/podman-run-*
x /tmp/containers-user-*
D! /run/podman 0700 root root
D! /var/lib/cni/networks
cat /usr/lib/tmpfiles.d/podman.conf
# /tmp/podman-run-* directory can contain content for Podman containers that have run
# for many days. This following line prevents systemd from removing this content.
x /tmp/podman-run-*
x /tmp/containers-user-*
D! /run/podman 0700 root root
D! /var/lib/cni/networks
@openshift-ci openshift-ci bot added the kind/bug Categorizes issue or PR as related to a bug. label Sep 8, 2021
Member

Luap99 commented Sep 8, 2021

Did you check whether there is still content in /tmp/podman-run-990/containers?
Can you also provide the output of podman --log-level debug ps?

Author

GBBx commented Sep 8, 2021

Hi @Luap99
yes, there is content there:

ls /tmp/podman-run-990/containers
auth.json  overlay/  overlay-containers/  overlay-layers/  overlay-locks/
INFO[0000] podman filtering at log level debug
DEBU[0000] Called ps.PersistentPreRunE(podman --log-level debug ps)
DEBU[0000] Merged system config "/usr/share/containers/containers.conf"
DEBU[0000] Merged system config "/home/podman/.config/containers/containers.conf"
DEBU[0000] Using conmon: "/usr/bin/conmon"
DEBU[0000] Initializing boltdb state at /home/podman/.local/share/containers/storage/libpod/bolt_state.db
DEBU[0000] Using graph driver overlay
DEBU[0000] Using graph root /home/podman/.local/share/containers/storage
DEBU[0000] Using run root /tmp/podman-run-990/containers
DEBU[0000] Using static dir /home/podman/.local/share/containers/storage/libpod
DEBU[0000] Using tmp dir /tmp/run-990/libpod/tmp
DEBU[0000] Using volume path /home/podman/.local/share/containers/storage/volumes
DEBU[0000] Set libpod namespace to ""
DEBU[0000] Not configuring container store
DEBU[0000] Initializing event backend file
DEBU[0000] configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument
DEBU[0000] configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument
DEBU[0000] Using OCI runtime "/usr/bin/crun"
DEBU[0000] Default CNI network name podman is unchangeable
INFO[0000] Setting parallel job count to 13
INFO[0000] podman filtering at log level debug
DEBU[0000] Called ps.PersistentPreRunE(podman --log-level debug ps)
DEBU[0000] overlay storage already configured with a mount-program
DEBU[0000] Merged system config "/usr/share/containers/containers.conf"
DEBU[0000] Merged system config "/home/podman/.config/containers/containers.conf"
DEBU[0000] overlay storage already configured with a mount-program
DEBU[0000] Using conmon: "/usr/bin/conmon"
DEBU[0000] Initializing boltdb state at /home/podman/.local/share/containers/storage/libpod/bolt_state.db
DEBU[0000] Overriding tmp dir "/tmp/podman-run-990/libpod/tmp" with "/tmp/run-990/libpod/tmp" from database
DEBU[0000] Using graph driver overlay
DEBU[0000] Using graph root /home/podman/.local/share/containers/storage
DEBU[0000] Using run root /tmp/podman-run-990/containers
DEBU[0000] Using static dir /home/podman/.local/share/containers/storage/libpod
DEBU[0000] Using tmp dir /tmp/run-990/libpod/tmp
DEBU[0000] Using volume path /home/podman/.local/share/containers/storage/volumes
DEBU[0000] overlay storage already configured with a mount-program
DEBU[0000] Set libpod namespace to ""
DEBU[0000] [graphdriver] trying provided driver "overlay"
DEBU[0000] overlay: mount_program=/usr/bin/fuse-overlayfs
DEBU[0000] backingFs=xfs, projectQuotaSupported=false, useNativeDiff=false, usingMetacopy=false
DEBU[0000] Initializing event backend file
DEBU[0000] configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument
DEBU[0000] configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument
DEBU[0000] Using OCI runtime "/usr/bin/crun"
DEBU[0000] Default CNI network name podman is unchangeable
INFO[0000] Setting parallel job count to 13
DEBU[0000] Failed to add podman to systemd sandbox cgroup: exec: "dbus-launch": executable file not found in $PATH
CONTAINER ID  IMAGE       COMMAND     CREATED     STATUS      PORTS       NAMES
DEBU[0000] Called ps.PersistentPostRunE(podman --log-level debug ps)

A little more info:

I run these pods with systemd:

[Service]
Type=forking
User=podman
Group=podman
WorkingDirectory=/home/podman
ExecStart=/bin/podman play kube ....

Although the problem does not occur when the user logs out, I also tried loginctl enable-linger as suggested here.
When I then tried to restart the service, it was inactive (dead). I had to run loginctl disable-linger before I could start it again.

Member

Luap99 commented Sep 8, 2021

DEBU[0000] Using tmp dir /tmp/run-990/libpod/tmp

This directory should be listed in the systemd tmpfiles config.
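
One way to check this by hand is to test each directory Podman reports against the x (exclude from cleaning) lines in the tmpfiles.d configuration. The following is a minimal sketch and not part of Podman or systemd: is_excluded is a hypothetical helper, and shell case matching only approximates how systemd-tmpfiles evaluates its globs.

```shell
#!/bin/sh
# Hypothetical helper (not part of Podman or systemd): succeeds if the given
# path is matched by an "x" line in any of the listed tmpfiles.d files.
# Assumption: shell glob matching is close enough to systemd-tmpfiles' own.
is_excluded() {
    path=$1
    shift
    for conf in "$@"; do
        [ -r "$conf" ] || continue
        while read -r type pattern rest; do
            [ "$type" = "x" ] || continue
            case $path in
                $pattern | $pattern/*) return 0 ;;
            esac
        done < "$conf"
    done
    return 1
}

# Directories taken from the output posted in this issue.
for dir in /tmp/podman-run-990/containers /tmp/run-990/libpod/tmp; do
    if is_excluded "$dir" /etc/tmpfiles.d/podman.conf /usr/lib/tmpfiles.d/podman.conf; then
        echo "$dir: excluded from cleanup"
    else
        echo "$dir: NOT excluded"
    fi
done
```

With the two podman.conf files shown in the report, only paths under /tmp/podman-run-* and /tmp/containers-user-* have a matching x line, so a directory under /tmp/run-990 would not be protected by them.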

Anyway, we do not recommend using play kube in a systemd unit. Systemd cannot track the container processes in this case, so your unit can behave strangely. @vrothberg Have you looked at play kube with systemd?

Member

vrothberg commented Sep 8, 2021

> @vrothberg Have you looked at play kube with systemd?

That's currently not supported. There are many things that can go wrong when using Podman in systemd, so the only officially supported way is using the unit generated via podman generate systemd.

A wild guess: try changing away from Type=forking. Systemd may choose any process as the main PID and likely guesses wrong.

Also, using User=podman and Group=podman can lead to issues. Depending on how the session is set up, Podman may use /tmp, which according to the logs is the case here, and systemd may clean up files there at specific times. @giuseppe may be able to elaborate more on that.
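
For reference, generating a unit with podman generate systemd could look like the sketch below. The pod name mypod and the wrapper function generate_pod_units are placeholders that do not appear in this issue. --files and --name are existing options of the command, but it is worth checking podman generate systemd --help for the installed version.

```shell
#!/bin/sh
# Hypothetical wrapper: write systemd unit files for an existing pod into the
# current directory. --files writes .service files instead of printing them;
# --name uses pod/container names rather than IDs in the unit names.
generate_pod_units() {
    podman generate systemd --files --name "$1"
}

# Example call (placeholder pod name; commented out so nothing runs on load):
# generate_pod_units mypod
```

For a rootless setup, the generated files would typically be copied to ~/.config/systemd/user/ and managed with systemctl --user rather than through a system unit with User= and Group=.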

Luap99 added a commit to Luap99/common that referenced this issue Sep 8, 2021
Podman should not use `/tmp/run-...`. The Podman PR#8241 changed the
path to `/tmp/podman-run-...` and added systemd tmpfile config to make
sure the path is not removed. However the tmpDir is set in c/common and
was never changed.

Fixes containers/podman#11478

Signed-off-by: Paul Holzinger <[email protected]>
Member

Luap99 commented Sep 8, 2021

I created containers/common#759. Podman should not use /tmp/run-...; it has to be /tmp/podman-run-...

Author

GBBx commented Sep 8, 2021

Thanks a lot for this insight.
But I wonder why the runRoot from podman info | grep runRoot differs from the tmp dir in podman --log-level debug ps.
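
In the debug output posted earlier, the run root and the libpod tmp dir are logged as two separate settings, and one line reports the tmp dir being overridden from the database. To see both values side by side, they can be pulled out of the debug log. This is a minimal sketch; show_podman_dirs is a hypothetical helper name, and it assumes the log wording matches the lines shown in this issue.

```shell
#!/bin/sh
# Hypothetical helper: read Podman debug output on stdin and print the
# distinct "run root" and "tmp dir" values that Podman reports using.
show_podman_dirs() {
    sed -n -e 's/.*Using run root /runRoot: /p' \
           -e 's/.*Using tmp dir /tmpDir: /p' | sort -u
}

# Debug messages go to stderr, hence 2>&1.
# Commented out so nothing runs on load:
# podman --log-level debug ps 2>&1 | show_podman_dirs
```

If the two printed paths sit under different /tmp prefixes, each one needs its own protection from periodic cleanup.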

Luap99 added a commit to Luap99/common that referenced this issue Sep 14, 2021
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 21, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 21, 2023