image healthchecks not being used (as rootless, user systemd service) #5680

Closed
aleks-mariusz opened this issue Mar 31, 2020 · 14 comments

Labels: kind/bug, locked - please file new issue/PR, stale-issue

Comments

@aleks-mariusz
Contributor

aleks-mariusz commented Mar 31, 2020

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

I am using a container from Docker Hub that has healthchecks defined in it.

The container is a DNS proxy (supporting DNS requests over HTTPS). At some point the container stops responding to requests and no longer works, for reasons unknown. Rather than try to diagnose it, since it fails regularly and predictably enough, I feel it's easier just to restart the container, which is exactly what healthchecks are designed to do. The healthcheck performs a query, so it should catch this exact situation.

However, I cannot for the life of me get this health-checking functionality to actually work, and I'm not entirely sure it is supported in rootless mode. It seems it should be. I've searched the docs for how it should use systemd timers to perform the check and, if necessary, restart the container, but there are few articles on using this facility in this scenario (mainly just this one), and none of them describe what I'm actually experiencing.

So either there's a defect here somewhere, or I'm simply doing something wrong (entirely possible).

Please help me figure out which it is :-)
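
For reference, here is roughly how I've been looking for the transient systemd timer unit that podman is (as I understand from the docs) supposed to create for the healthcheck; the unit naming (full container ID plus .timer) is my understanding, not something I've confirmed in the source:

doh@gw:~$ CID=$(podman inspect --format '{{.Id}}' cloudflared)
doh@gw:~$ systemctl --user list-timers --all | grep "$CID"

So far nothing shows up.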

Steps to reproduce the issue:

  1. Run the container (as a rootless process, then set it up as a systemd user service)
  2. Wait for it to fail
  3. Observe that it is not restarted, despite:
doh@gw:~$ podman healthcheck run cloudflared
unhealthy

Describe the results you received:

The container continues to run, but in a broken state :-(

Describe the results you expected:

The container should be restarted automatically.

Additional information you deem important (e.g. issue happens only occasionally):

The container seems to stop functioning after a few hours.

Here's what it looks like when it works:
$ nslookup
> set port=5053
> server 192.168.0.254
Default server: 192.168.0.254
Address: 192.168.0.254#5053
> www.google.co.uk
Server:         192.168.0.254
Address:        192.168.0.254#5053

Non-authoritative answer:
Name:   www.google.co.uk
Address: 216.58.210.195
...and when it stops working:
$ nslookup
> set port=5053
> server 192.168.0.254
Default server: 192.168.0.254
Address: 192.168.0.254#5053
> www.google.co.uk
;; connection timed out; no servers could be reached

Output of podman version:

Version:            1.8.2
RemoteAPI Version:  1
Go Version:         go1.10.1
OS/Arch:            linux/amd64

Output of `podman info --debug`:
doh@gw:~$ podman info --debug
debug:
  compiler: gc
  git commit: ""
  go version: go1.10.1
  podman version: 1.8.2
host:
  BuildahVersion: 1.14.3
  CgroupVersion: v1
  Conmon:
    package: 'conmon: /usr/libexec/podman/conmon'
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.0.14, commit: '
  Distribution:
    distribution: ubuntu
    version: "18.04"
  IDMappings:
    gidmap:
    - container_id: 0
      host_id: 1001
      size: 1
    - container_id: 1
      host_id: 231072
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1001
      size: 1
    - container_id: 1
      host_id: 231072
      size: 65536
  MemFree: 451981312
  MemTotal: 2083917824
  OCIRuntime:
    name: runc
    package: 'cri-o-runc: /usr/lib/cri-o-runc/sbin/runc'
    path: /usr/lib/cri-o-runc/sbin/runc
    version: 'runc version spec: 1.0.1-dev'
  SwapFree: 0
  SwapTotal: 0
  arch: amd64
  cpus: 2
  eventlogger: journald
  hostname: gw
  kernel: 5.3.0-42-generic
  os: linux
  rootless: true
  slirp4netns:
    Executable: /usr/bin/slirp4netns
    Package: 'slirp4netns: /usr/bin/slirp4netns'
    Version: |-
      slirp4netns version 0.4.3
      commit: unknown
  uptime: 25h 36m 46.45s (Approximately 1.04 days)
registries:
  search:
  - docker.io
  - quay.io
store:
  ConfigFile: /home/doh/.config/containers/storage.conf
  ContainerStore:
    number: 1
  GraphDriverName: vfs
  GraphOptions: {}
  GraphRoot: /home/doh/.local/share/containers/storage
  GraphStatus: {}
  ImageStore:
    number: 1
  RunRoot: /tmp/run-1001/containers
  VolumePath: /home/doh/.local/share/containers/storage/volumes

Package info (e.g. output of rpm -q podman or apt list podman):

doh@gw:~$ apt list podman
Listing... Done
podman/unknown,now 1.8.2~1 amd64 [installed]

Additional environment details (AWS, VirtualBox, physical, etc.):

This is running on a libvirt Ubuntu 18.04 LTS (Bionic) VM running Linux kernel 5.3.0-42-generic.

Output of `podman inspect`:
doh@gw:~$ podman inspect cloudflared
[
    {
        "Id": "7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8",
        "Created": "2020-03-30T00:28:16.607122061Z",
        "Path": "/usr/local/bin/cloudflared",
        "Args": [
            "proxy-dns"
        ],
        "State": {
            "OciVersion": "1.0.1-dev",
            "Status": "running",
            "Running": true,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 15213,
            "ConmonPid": 15202,
            "ExitCode": 0,
            "Error": "",
            "StartedAt": "2020-03-30T17:02:55.862429601Z",
            "FinishedAt": "2020-03-30T17:02:54.460635277Z",
            "Healthcheck": {
                "Status": "unhealthy",
                "FailingStreak": 4,
                "Log": [
                    {
                        "Start": "2020-03-30T00:39:17.920474591Z",
                        "End": "2020-03-30T00:39:18.576237185Z",
                        "ExitCode": 1,
                        "Output": ""
                    },
                    {
                        "Start": "2020-03-30T00:58:46.041322751Z",
                        "End": "2020-03-30T00:58:46.718745761Z",
                        "ExitCode": 1,
                        "Output": ""
                    },
                    {
                        "Start": "2020-03-30T01:00:49.616966334Z",
                        "End": "2020-03-30T01:00:50.243910965Z",
                        "ExitCode": 1,
                        "Output": ""
                    },
                    {
                        "Start": "2020-03-31T10:18:16.820936879Z",
                        "End": "2020-03-31T10:18:17.474132419Z",
                        "ExitCode": 1,
                        "Output": ""
                    }
                ]
            }
        },
        "Image": "599783cfa2c49f99cfc678b5948fef20c8d62362dfa1ba7773d005fa01b60854",
        "ImageName": "docker.io/crazymax/cloudflared:2020.3.1",
        "Rootfs": "",
        "Pod": "",
        "ResolvConfPath": "/tmp/run-1001/containers/vfs-containers/7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8/userdata/resolv.conf",
        "HostnamePath": "/tmp/run-1001/containers/vfs-containers/7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8/userdata/hostname",
        "HostsPath": "/tmp/run-1001/containers/vfs-containers/7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8/userdata/hosts",
        "StaticDir": "/home/doh/.local/share/containers/storage/vfs-containers/7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8/userdata",
        "OCIConfigPath": "/home/doh/.local/share/containers/storage/vfs-containers/7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8/userdata/config.json",
        "OCIRuntime": "runc",
        "LogPath": "/home/doh/.local/share/containers/storage/vfs-containers/7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8/userdata/ctr.log",
        "LogTag": "",
        "ConmonPidFile": "/tmp/run-1001/containers/vfs-containers/7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8/userdata/conmon.pid",
        "Name": "cloudflared",
        "RestartCount": 0,
        "Driver": "vfs",
        "MountLabel": "",
        "ProcessLabel": "",
        "AppArmorProfile": "",
        "EffectiveCaps": null,
        "BoundingCaps": [
            "CAP_CHOWN",
            "CAP_DAC_OVERRIDE",
            "CAP_FSETID",
            "CAP_FOWNER",
            "CAP_MKNOD",
            "CAP_NET_RAW",
            "CAP_SETGID",
            "CAP_SETUID",
            "CAP_SETFCAP",
            "CAP_SETPCAP",
            "CAP_NET_BIND_SERVICE",
            "CAP_SYS_CHROOT",
            "CAP_KILL",
            "CAP_AUDIT_WRITE"
        ],
        "ExecIDs": [],
        "GraphDriver": {
            "Name": "vfs",
            "Data": null
        },
        "Mounts": [],
        "Dependencies": [],
        "NetworkSettings": {
            "EndpointID": "",
            "Gateway": "",
            "IPAddress": "",
            "IPPrefixLen": 0,
            "IPv6Gateway": "",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "MacAddress": "",
            "Bridge": "",
            "SandboxID": "",
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": [
                {
                    "hostPort": 5053,
                    "containerPort": 5053,
                    "protocol": "udp",
                    "hostIP": ""
                },
                {
                    "hostPort": 49312,
                    "containerPort": 49312,
                    "protocol": "tcp",
                    "hostIP": ""
                }
            ],
            "SandboxKey": "/run/user/1001/netns/cni-6935db70-34c0-6cbb-e124-ae3547e181bf"
        },
        "ExitCommand": [
            "/usr/bin/podman",
            "--root",
            "/home/doh/.local/share/containers/storage",
            "--runroot",
            "/tmp/run-1001/containers",
            "--log-level",
            "error",
            "--cgroup-manager",
            "cgroupfs",
            "--tmpdir",
            "/tmp/run-1001/libpod/tmp",
            "--runtime",
            "runc",
            "--storage-driver",
            "vfs",
            "--events-backend",
            "journald",
            "container",
            "cleanup",
            "7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8"
        ],
        "Namespace": "",
        "IsInfra": false,
        "Config": {
            "Hostname": "7b2747477b9d",
            "Domainname": "",
            "User": "cloudflared",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                "TERM=xterm",
                "TUNNEL_DNS_PORT=5053",
                "container=podman",
                "TUNNEL_DNS_UPSTREAM=https://doh.opendns.com/dns-query",
                "TZ=Europe/London",
                "TUNNEL_METRICS=0.0.0.0:49312",
                "TUNNEL_DNS_ADDRESS=0.0.0.0",
                "HOSTNAME=7b2747477b9d",
                "HOME=/home/cloudflared"
            ],
            "Cmd": [
                "proxy-dns"
            ],
            "Image": "docker.io/crazymax/cloudflared:2020.3.1",
            "Volumes": null,
            "WorkingDir": "/",
            "Entrypoint": "/usr/local/bin/cloudflared",
            "OnBuild": null,
            "Labels": {
                "maintainer": "CrazyMax",
                "org.label-schema.build-date": "2020-03-28T22:11:37Z",
                "org.label-schema.description": "Cloudflared proxy-dns",
                "org.label-schema.name": "cloudflared",
                "org.label-schema.schema-version": "1.0",
                "org.label-schema.url": "https://github.com/crazy-max/docker-cloudflared",
                "org.label-schema.vcs-ref": "acb05c58",
                "org.label-schema.vcs-url": "https://github.com/crazy-max/docker-cloudflared",
                "org.label-schema.vendor": "CrazyMax",
                "org.label-schema.version": "2020.3.1"
            },
            "Annotations": {
                "io.container.manager": "libpod",
                "io.kubernetes.cri-o.Created": "2020-03-30T00:28:16.607122061Z",
                "io.kubernetes.cri-o.TTY": "false",
                "io.podman.annotations.autoremove": "FALSE",
                "io.podman.annotations.init": "FALSE",
                "io.podman.annotations.privileged": "FALSE",
                "io.podman.annotations.publish-all": "FALSE",
                "org.opencontainers.image.stopSignal": "15"
            },
            "StopSignal": 15,
            "Healthcheck": {
                "Test": [
                    "CMD-SHELL",
                    "CMD-SHELL dig +short @127.0.0.1 -p 5053 cloudflare.com A || exit 1"
                ],
                "StartPeriod": 30000000000,
                "Interval": 5000000000,
                "Timeout": 10000000000,
                "Retries": 3
            },
            "CreateCommand": [
                "podman",
                "run",
                "-d",
                "--name",
                "cloudflared",
                "-p",
                "5053:5053/udp",
                "-p",
                "49312:49312",
                "-e",
                "TZ=Europe/London",
                "-e",
                "TUNNEL_DNS_UPSTREAM=https://doh.opendns.com/dns-query",
                "--dns=1.1.1.1",
                "--restart=always",
                "--health-start-period=30s",
                "--health-timeout=10s",
                "--health-retries=3",
                "--health-interval=5s",
                "--health-cmd=CMD-SHELL dig +short @127.0.0.1 -p 5053 cloudflare.com A || exit 1",
                "docker.io/crazymax/cloudflared:2020.3.1"
            ]
        },
        "HostConfig": {
            "Binds": [],
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "k8s-file",
                "Config": null
            },
            "NetworkMode": "default",
            "PortBindings": {
                "49312/tcp": [
                    {
                        "HostIp": "",
                        "HostPort": "49312"
                    }
                ],
                "5053/udp": [
                    {
                        "HostIp": "",
                        "HostPort": "5053"
                    }
                ]
            },
            "RestartPolicy": {
                "Name": "always",
                "MaximumRetryCount": 0
            },
            "AutoRemove": false,
            "VolumeDriver": "",
            "VolumesFrom": null,
            "CapAdd": [],
            "CapDrop": [],
            "Dns": [
                "1.1.1.1"
            ],
            "DnsOptions": [],
            "DnsSearch": [],
            "ExtraHosts": [],
            "GroupAdd": [],
            "IpcMode": "",
            "Cgroup": "",
            "Cgroups": "default",
            "Links": null,
            "OomScoreAdj": 0,
            "PidMode": "",
            "Privileged": false,
            "PublishAllPorts": false,
            "ReadonlyRootfs": false,
            "SecurityOpt": [],
            "Tmpfs": {},
            "UTSMode": "",
            "UsernsMode": "",
            "ShmSize": 65536000,
            "Runtime": "oci",
            "ConsoleSize": [
                0,
                0
            ],
            "Isolation": "",
            "CpuShares": 1024,
            "Memory": 0,
            "NanoCpus": 0,
            "CgroupParent": "",
            "BlkioWeight": 0,
            "BlkioWeightDevice": null,
            "BlkioDeviceReadBps": null,
            "BlkioDeviceWriteBps": null,
            "BlkioDeviceReadIOps": null,
            "BlkioDeviceWriteIOps": null,
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "Devices": [],
            "DiskQuota": 0,
            "KernelMemory": 0,
            "MemoryReservation": 0,
            "MemorySwap": 0,
            "MemorySwappiness": 0,
            "OomKillDisable": false,
            "PidsLimit": 0,
            "Ulimits": [
                {
                    "Name": "RLIMIT_NOFILE",
                    "Soft": 1024,
                    "Hard": 1024
                }
            ],
            "CpuCount": 0,
            "CpuPercent": 0,
            "IOMaximumIOps": 0,
            "IOMaximumBandwidth": 0
        }
    }
]

As visible from the above inspect output, the container is currently in an unhealthy state; it is not, however, being restarted. It even lists when the healthchecks failed (4 instances, 3 of them from previous manual restarts).
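
One thing I notice in the Healthcheck block above: the test reads CMD-SHELL CMD-SHELL dig ..., i.e. the literal CMD-SHELL prefix I passed via --health-cmd got wrapped in a second, implicit CMD-SHELL. If my understanding is right that podman hands a plain --health-cmd string to /bin/sh -c as-is, the leading CMD-SHELL token would never resolve to a command and the trailing || exit 1 would always fire, making the check fail regardless of DNS. A sketch of the same run command without the prefix (untested on my side):

doh@gw:~$ podman run -d --name cloudflared -p 5053:5053/udp -p 49312:49312 \
      -e TZ=Europe/London -e TUNNEL_DNS_UPSTREAM=https://doh.opendns.com/dns-query \
      --dns=1.1.1.1 --restart=always \
      --health-start-period=30s --health-timeout=10s --health-retries=3 --health-interval=5s \
      --health-cmd='dig +short @127.0.0.1 -p 5053 cloudflare.com A || exit 1' \
      docker.io/crazymax/cloudflared:2020.3.1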

Additionally, here's the service unit file I created:
doh@gw:~$ cat .config/systemd/user/cloudflared.service 
# container-7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8.service
# autogenerated by Podman 1.8.2
# Mon Mar 30 00:30:21 UTC 2020

[Unit]
Description=Podman container-7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8.service
Documentation=man:podman-generate-systemd(1)
Wants=network.target
After=network-online.target

[Service]
Environment=PODMAN_SYSTEMD_UNIT=%n
Restart=always
ExecStart=/usr/bin/podman start 7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8
ExecStop=/usr/bin/podman stop -t 1 7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8
PIDFile=/tmp/run-1001/containers/vfs-containers/7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8/userdata/conmon.pid
KillMode=none
Type=forking

[Install]
WantedBy=multi-user.target default.target

And the output of `systemctl --user status cloudflared`:
doh@gw:~$ systemctl --user status cloudflared
* cloudflared.service - Podman container-7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8.service
   Loaded: loaded (/home/doh/.config/systemd/user/cloudflared.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2020-03-30 17:02:55 UTC; 17h ago
     Docs: man:podman-generate-systemd(1)
  Process: 15133 ExecStop=/usr/bin/podman stop -t 1 7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8 (code=exited, status=0/SUCCESS)
  Process: 15156 ExecStart=/usr/bin/podman start 7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8 (code=exited, status=0/SUCCESS)
 Main PID: 15202 (conmon)
   CGroup: /user.slice/user-1001.slice/[email protected]/cloudflared.service
           |-  949 /usr/bin/podman
           |-15182 /usr/bin/slirp4netns --disable-host-loopback --mtu 65520 --enable-sandbox -c -e 3 -r 4 --netns-type=path /run/user/1001/netns/cni-6935db70-34c0-6cbb-e124-ae3547e181bf tap0
           |-15202 /usr/libexec/podman/conmon --api-version 1 -c 7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8 -u 7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8 -r /usr/lib/cri-o-runc/sbin/runc -b /home/doh/.local/share/
           `-15213 /usr/local/bin/cloudflared proxy-dns

Mar 30 17:02:54 gw systemd[846]: Starting Podman container-7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8.service...
Mar 30 17:02:54 gw podman[15150]: 2020-03-30 17:02:54.752664328 +0000 UTC m=+0.283118567 container cleanup 7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8 (image=docker.io/crazymax/cloudflared:2020.3.1, name=cloudflared)
Mar 30 17:02:55 gw podman[15156]: time="2020-03-30T17:02:55Z" level=error msg="exit status 1"
Mar 30 17:02:55 gw podman[15156]: 2020-03-30 17:02:55.843932347 +0000 UTC m=+1.237275242 container init 7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8 (image=docker.io/crazymax/cloudflared:2020.3.1, name=cloudflared)
Mar 30 17:02:55 gw podman[15156]: time="2020-03-30T17:02:55Z" level=error msg="Unit 7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8.service not found."
Mar 30 17:02:55 gw podman[15156]: 2020-03-30 17:02:55.876783353 +0000 UTC m=+1.270126258 container start 7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8 (image=docker.io/crazymax/cloudflared:2020.3.1, name=cloudflared)
Mar 30 17:02:55 gw podman[15156]: 7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8
Mar 30 17:02:55 gw systemd[846]: Started Podman container-7b2747477b9d1ff57b64a09811d30d69afafa340ef00301b8357e2942666d4d8.service.
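
(For completeness, the unit above came from podman generate systemd; the sequence was roughly this, as a sketch:)

doh@gw:~$ podman generate systemd cloudflared > ~/.config/systemd/user/cloudflared.service
doh@gw:~$ systemctl --user daemon-reload
doh@gw:~$ systemctl --user enable --now cloudflared.service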
@openshift-ci-robot added the kind/bug label Mar 31, 2020
@mheon
Member

mheon commented Mar 31, 2020

@baude PTAL

@aleks-mariusz
Contributor Author

aleks-mariusz commented Apr 2, 2020

@stefanb2 @giuseppe per this PR, it is implied that this should be working; however, I can't replicate that PR's tests in my environment. Can you help me figure out what I'm missing, since I'm not seeing the same result?

(First, I removed any service files I had previously created with podman generate systemd.)

Starting with a fresh environment:
doh@gw:~$ ps ux 
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
doh        880  0.0  0.4  77184  8348 ?        Ss   Apr01   0:21 /lib/systemd/systemd --user
doh        883  0.0  0.1 109680  2408 ?        S    Apr01   0:00 (sd-pam)
doh       6529  0.0  0.1  49792  3872 ?        Ss   Apr01   0:00 /usr/bin/dbus-daemon --session --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
doh      25822  0.0  0.1 101788  3632 ?        S    10:12   0:00 sshd: doh@pts/1
doh      25823  0.0  0.1  18636  3516 pts/1    Ss   10:12   0:00 -bash
doh      26832  0.0  0.1  36696  3144 pts/1    R+   10:17   0:00 ps ux

I start podman. Note that I am not specifying a --healthcheck-command parameter (which now seems to be renamed --health-cmd), as the Dockerfile already defines one; besides, even specifying it did not change anything, so I don't think the lack of that parameter is the cause:

doh@gw:~$ podman run -d --name cloudflared -p 5053:5053/udp -p 49312:49312 -e TZ=Europe/London -e TUNNEL_DNS_UPSTREAM=https://doh.opendns.com/dns-query --dns=1.1.1.1 --restart=always docker.io/crazymax/cloudflared:2020.3.1
55309eaea449521306c07e4fd917908a2fa4b126b84cef5b5d5e11496edbdccb

Then I run the query performed in the PR:

doh@gw:~$ systemctl --user status 55309eaea449521306c07e4fd917908a2fa4b126b84cef5b5d5e11496edbdccb
Unit 55309eaea449521306c07e4fd917908a2fa4b126b84cef5b5d5e11496edbdccb.service could not be found.
doh@gw:~$ systemctl --user status 55309eaea449521306c07e4fd917908a2fa4b126b84cef5b5d5e11496edbdccb.timer
Unit 55309eaea449521306c07e4fd917908a2fa4b126b84cef5b5d5e11496edbdccb.timer could not be found.
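
Another sketch of a check: transient units live under /run/user/$UID/systemd/transient (the same path the failed scopes below are loaded from), so the healthcheck timer, if it had been created, should show up there:

doh@gw:~$ ls /run/user/$(id -u)/systemd/transient/ | grep 55309
doh@gw:~$

Nothing matching the container ID.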

The container is running, however:

doh@gw:~$ podman ps -a
CONTAINER ID  IMAGE                                    COMMAND    CREATED         STATUS             PORTS                                             NAMES
55309eaea449  docker.io/crazymax/cloudflared:2020.3.1  proxy-dns  12 minutes ago  Up 12 minutes ago  0.0.0.0:5053->5053/udp, 0.0.0.0:49312->49312/tcp  cloudflared

Also including what the output of a status query on the user session looks like:

doh@gw:~$ systemctl --user status
* gw
    State: degraded
     Jobs: 0 queued
   Failed: 2 units
    Since: Wed 2020-04-01 11:32:41 UTC; 22h ago
   CGroup: /user.slice/user-1001.slice/[email protected]
           |-init.scope
           | |-880 /lib/systemd/systemd --user
           | `-883 (sd-pam)
           `-dbus.service
             `-6529 /usr/bin/dbus-daemon --session --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only

Curiously, the state shows as degraded. Looking at the failed units:

doh@gw:~$ systemctl --user --failed
  UNIT               LOAD   ACTIVE SUB    DESCRIPTION                                                                                                                                                                                                           
* podman-26929.scope loaded failed failed podman-26929.scope                                                                                                                                                                                                    
* podman-pause.scope loaded failed failed podman-pause.scope                                                                                                                                                                                                    

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.

2 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.
doh@gw:~$ systemctl --user status podman-26929.scope
* podman-26929.scope
   Loaded: loaded (/run/user/1001/systemd/transient/podman-26929.scope; transient)
Transient: yes
   Active: failed (Result: resources)

Apr 02 10:17:35 gw systemd[880]: podman-26929.scope: Failed to add PIDs to scope's control group: Permission denied
Apr 02 10:17:35 gw systemd[880]: podman-26929.scope: Failed with result 'resources'.
Apr 02 10:17:35 gw systemd[880]: Failed to start podman-26929.scope.
doh@gw:~$ systemctl --user status podman-pause.scope
* podman-pause.scope
   Loaded: loaded (/run/user/1001/systemd/transient/podman-pause.scope; transient)
Transient: yes
   Active: failed (Result: resources)

Apr 02 10:17:37 gw systemd[880]: podman-pause.scope: Failed to add PIDs to scope's control group: Permission denied
Apr 02 10:17:37 gw systemd[880]: podman-pause.scope: Failed with result 'resources'.
Apr 02 10:17:37 gw systemd[880]: Failed to start podman-pause.scope.

I'm not sure whether these failed scopes are related to the issue.
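
(One more data point: podman info above reports CgroupVersion: v1. A quick sketch to confirm which cgroup hierarchy the host runs, based on my understanding that delegating scopes to a rootless user needs cgroups v2:

doh@gw:~$ stat -fc %T /sys/fs/cgroup/
tmpfs

tmpfs here indicates the legacy v1 hierarchy; cgroup2fs would indicate v2.)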

@aleks-mariusz
Contributor Author

aleks-mariusz commented Apr 2, 2020

Actually, those checks also didn't work on another system where the scope issue does not exist, so that's likely unrelated.

@aleks-mariusz
Contributor Author

@baude any chance you've been able to take a look, or can you provide any advice, please?

@mybigman

mybigman commented Apr 26, 2020

I have the same issue... the HEALTHCHECK defined in the Dockerfile is being ignored :/

@rhatdan
Member

rhatdan commented Apr 27, 2020

Are health checks only supported if the image is built with --format docker? I am not sure whether OCI supports healthchecks. If not, then perhaps that is why the health checks do not work? @baude WDYT?

@mheon
Member

mheon commented Apr 27, 2020

Healthchecks are indeed docker-only

@aleks-mariusz
Contributor Author

The image I'm using (in rootless mode) is a docker-format image; health checks are still being ignored.

@baude
Member

baude commented Apr 27, 2020

Correct, OCI does not support healthchecks ... we work around this in Podman by allowing users to define them. I don't think this is the problem, as @aleks-mariusz states his image is a docker-format image.
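
A quick way to check whether a given image actually carries healthcheck metadata (a sketch; docker-format images keep HEALTHCHECK, plain OCI images drop it):

$ podman image inspect docker.io/crazymax/cloudflared:2020.3.1 | grep -A 8 '"Healthcheck"'

If nothing comes back, the healthcheck was lost somewhere between build and pull.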

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented May 28, 2020

Is this still broken in podman 1.9.2?

@penn5

penn5 commented Jun 2, 2020

Looks like it. Tried with --format=docker, and it used the cache :hmm:. However, after nuking all images (with podman rmi -af) and building with --format=docker, it works! Perhaps I should file a new bug for caching between formats?
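
For anyone else hitting this, the sequence that worked for me looks roughly like this (image and container names are placeholders):

$ podman rmi -af
$ podman build --format=docker -t myimage .
$ podman run -d --name mytest myimage
$ podman healthcheck run mytest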

@penn5

penn5 commented Jun 2, 2020

See containers/buildah#2388

@rhatdan
Member

rhatdan commented Jun 9, 2020

Since this is a buildah issue, closing.

@rhatdan closed this as completed Jun 9, 2020
@github-actions bot added the locked - please file new issue/PR label Sep 23, 2023
@github-actions bot locked as resolved and limited conversation to collaborators Sep 23, 2023
7 participants