podman system service fails to start when run via systemd --user mode #9633

Closed
imperialguy opened this issue Mar 5, 2021 · 19 comments
Labels: kind/bug, locked - please file new issue/PR, stale-issue

Comments

@imperialguy

imperialguy commented Mar 5, 2021

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

podman system service -t 0 fails when run via systemd unit files

Following are the systemd unit files used for testing:

podman.socket

[Unit]
Description=Podman Socket

[Socket]
ListenStream=/tmp/test.sock
# For TCP, use instead: ListenStream=127.0.0.1:<random_available_port>

[Install]
WantedBy=sockets.target

podman.service

[Unit]
Description=Podman Service
Requires=podman.socket
After=podman.socket
StartLimitIntervalSec=0

[Service]
Type=notify
KillMode=process
Environment=LOGGING="--log-level=debug"
ExecStart=/usr/bin/podman $LOGGING system service -t 0
TimeoutStopSec=30

[Install]
WantedBy=multi-user.target

Steps to reproduce the issue:

  1. Create two systemd unit files - one for socket and one for service (like shown above).

  2. Load and run the systemd unit files in both user mode and root mode. Of course, use different socket locations/ports for unix/tcp respectively.

  3. The root mode runs just fine without any problems. The user mode fails irrespective of unix/tcp socket. Not sure what I am doing wrong.
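
For reference, steps 1 and 2 in user mode might look like the sketch below. It assumes the per-user unit directory ~/.config/systemd/user (the path that appears in the user-mode status output of this report); install_user_units is a hypothetical helper name, and the unit contents are copied from the report as-is, not a recommended configuration.

```shell
# Hypothetical helper: write the two unit files from this report into a
# per-user systemd directory. $1 optionally overrides the target directory.
install_user_units() {
    unit_dir="${1:-${XDG_CONFIG_HOME:-$HOME/.config}/systemd/user}"
    mkdir -p "$unit_dir" || return 1

    cat > "$unit_dir/podman.socket" <<'EOF'
[Unit]
Description=Podman Socket

[Socket]
ListenStream=/tmp/test.sock

[Install]
WantedBy=sockets.target
EOF

    cat > "$unit_dir/podman.service" <<'EOF'
[Unit]
Description=Podman Service
Requires=podman.socket
After=podman.socket
StartLimitIntervalSec=0

[Service]
Type=notify
KillMode=process
Environment=LOGGING="--log-level=debug"
ExecStart=/usr/bin/podman $LOGGING system service -t 0
TimeoutStopSec=30

[Install]
WantedBy=multi-user.target
EOF

    # Print where the files were written.
    printf '%s\n' "$unit_dir"
}

# Typical follow-up in a user session (not run here):
#   install_user_units
#   systemctl --user daemon-reload
#   systemctl --user enable --now podman.socket
#   systemctl --user start podman.service
```

For root mode the same files would go into a system unit directory and the systemctl calls would drop --user.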

Describe the results you received:

Results when run as root --> this is the happy path. both podman.service and podman.socket work just fine

[root@localhost ~]# systemctl status podman
● podman.service - Podman Service
   Loaded: loaded (/run/systemd/system/podman.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2021-03-05 08:31:22 EST; 11s ago
 Main PID: 507761 (podman)
    Tasks: 10 (limit: 48707)
   Memory: 50.2M
   CGroup: /system.slice/podman.service
           └─507761 /usr/bin/podman --log-level=debug system service -t 0

Mar 05 08:31:22 localhost.localdomain podman[507761]: time="2021-03-05T08:31:22-05:00" level=debug msg="Methods:   POST Path: /volumes/create"
Mar 05 08:31:22 localhost.localdomain podman[507761]: time="2021-03-05T08:31:22-05:00" level=debug msg="Methods:    GET Path: /v{version:[0-9][0-9.]*}/volumes/>
Mar 05 08:31:22 localhost.localdomain podman[507761]: time="2021-03-05T08:31:22-05:00" level=debug msg="Methods:    GET Path: /volumes/{name}"
Mar 05 08:31:22 localhost.localdomain podman[507761]: time="2021-03-05T08:31:22-05:00" level=debug msg="Methods: DELETE Path: /v{version:[0-9][0-9.]*}/volumes/>
Mar 05 08:31:22 localhost.localdomain podman[507761]: time="2021-03-05T08:31:22-05:00" level=debug msg="Methods: DELETE Path: /volumes/{name}"
Mar 05 08:31:22 localhost.localdomain podman[507761]: time="2021-03-05T08:31:22-05:00" level=debug msg="Methods:   POST Path: /v{version:[0-9][0-9.]*}/volumes/>
Mar 05 08:31:22 localhost.localdomain podman[507761]: time="2021-03-05T08:31:22-05:00" level=debug msg="Methods:   POST Path: /volumes/prune"
Mar 05 08:31:22 localhost.localdomain podman[507761]: time="2021-03-05T08:31:22-05:00" level=debug msg="Notify sent successfully"
Mar 05 08:31:22 localhost.localdomain podman[507761]: time="2021-03-05T08:31:22-05:00" level=debug msg="API Server idle for 0s"

[root@localhost ~]# systemctl status podman.socket
● podman.socket - Podman Socket
   Loaded: loaded (/run/systemd/system/podman.socket; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2021-03-05 08:31:22 EST; 14min ago
   Listen: 127.0.0.1:8081 (Stream)
    Tasks: 0 (limit: 48707)
   Memory: 0B
   CGroup: /system.slice/podman.socket

Mar 05 08:31:22 localhost.localdomain systemd[1]: Listening on Podman Socket.

Results when run as user --> this FAILS for podman.service, but podman.socket works fine

[testuser@localhost ~]$ systemctl --user status podman
● podman.service - Podman Service
   Loaded: loaded (/home/testuser/.config/systemd/user/podman.service; enabled; vendor preset: enabled)
   Active: failed (Result: timeout) since Fri 2021-03-05 08:23:31 EST; 9min ago
  Process: 507054 ExecStart=/usr/bin/podman $LOGGING system service -t 0 (code=exited, status=1/FAILURE)
 Main PID: 507054 (code=exited, status=1/FAILURE)

Mar 05 08:23:31 localhost.localdomain systemd[5841]: podman.service: start operation timed out. Terminating.
Mar 05 08:23:31 localhost.localdomain podman[507054]: time="2021-03-05T08:23:31-05:00" level=info msg="Received shutdown signal terminated, terminating!"
Mar 05 08:23:31 localhost.localdomain podman[507054]: time="2021-03-05T08:23:31-05:00" level=info msg="Invoking shutdown handler libpod"
Mar 05 08:23:31 localhost.localdomain podman[507054]: time="2021-03-05T08:23:31-05:00" level=info msg="Received shutdown signal terminated, terminating!"
Mar 05 08:23:31 localhost.localdomain podman[507054]: time="2021-03-05T08:23:31-05:00" level=info msg="Invoking shutdown handler server"
Mar 05 08:23:31 localhost.localdomain podman[507054]: time="2021-03-05T08:23:31-05:00" level=debug msg="APIServer.Shutdown ignored as Duration is UnlimitedService"
Mar 05 08:23:31 localhost.localdomain podman[507054]: time="2021-03-05T08:23:31-05:00" level=info msg="Invoking shutdown handler libpod"
Mar 05 08:23:31 localhost.localdomain systemd[5841]: podman.service: Main process exited, code=exited, status=1/FAILURE
Mar 05 08:23:31 localhost.localdomain systemd[5841]: podman.service: Failed with result 'timeout'.
Mar 05 08:23:31 localhost.localdomain systemd[5841]: Failed to start Podman Service.
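
A note on reading these logs: with Type=notify, systemd only treats the service as started once it receives READY=1 on the notification socket, and otherwise fails the start with result 'timeout'. The root-mode log contains podman's debug message "Notify sent successfully"; the user-mode excerpt only covers the shutdown, so it does not show whether that message was logged earlier. A minimal sketch for checking a fuller journal excerpt, assuming the message text stays as shown above (notify_was_sent is a hypothetical helper name):

```shell
# Hypothetical helper: report whether a journal excerpt (read from stdin)
# contains podman's debug message confirming the readiness notification.
notify_was_sent() {
    if grep -q 'msg="Notify sent successfully"'; then
        echo "notify sent"
    else
        echo "no notify message found"
        return 1
    fi
}

# Usage in a user session (not run here; the message is only logged
# with --log-level=debug):
#   journalctl --user -u podman.service --no-pager | notify_was_sent
```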

[testuser@localhost ~]$ systemctl --user status podman.socket
● podman.socket - Podman Socket
   Loaded: loaded (/home/testuser/.config/systemd/user/podman.socket; enabled; vendor preset: enabled)
   Active: active (listening) since Fri 2021-03-05 08:22:01 EST; 11min ago
   Listen: 127.0.0.1:8082 (Stream)
   CGroup: /user.slice/user-1000.slice/user@1000.service/podman.socket

Mar 05 08:22:01 localhost.localdomain systemd[5841]: Listening on Podman Socket.

Describe the results you expected:

Expected the user mode podman.service to run successfully.

Additional information you deem important (e.g. issue happens only occasionally):

As mentioned before, this happens only in systemd user mode irrespective of unix/tcp socket. I tested them both.

Output of podman version:

[testuser@localhost ~]$ podman version
Version:      3.0.1
API Version:  3.0.0
Go Version:   go1.14.12
Built:        Mon Feb 22 09:36:53 2021
OS/Arch:      linux/amd64

Output of podman info --debug:

[testuser@localhost ~]$ podman info --debug
host:
  arch: amd64
  buildahVersion: 1.19.4
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: conmon-2.0.26-3.el8.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.26, commit: c35ce4989f35168ff023617f1ea36554ae56d952'
  cpus: 4
  distribution:
    distribution: '"centos"'
    version: "8"
  eventLogger: journald
  hostname: localhost.localdomain
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 4.18.0-240.10.1.el8_3.x86_64
  linkmode: dynamic
  memFree: 3533475840
  memTotal: 8040083456
  ociRuntime:
    name: crun
    package: crun-0.18-2.el8.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.18
      commit: 808420efe3dc2b44d6db9f1a3fac8361dde42a95
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    selinuxEnabled: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.4-2.module_el8.3.0+475+c50ce30b.x86_64
    version: |-
      slirp4netns version 1.1.4
      commit: b66ffa8e262507e37fca689822d23430f3357fe8
      libslirp: 4.3.1
      SLIRP_CONFIG_VERSION_MAX: 3
  swapFree: 1320378368
  swapTotal: 2613047296
  uptime: 214h 52m 27.38s (Approximately 8.92 days)
registries:
  search:
  - registry.gitlab.com
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
  - docker.io
store:
  configFile: /home/testuser/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.1.2-3.module_el8.3.0+507+aa0970ae.x86_64
      Version: |-
        fuse-overlayfs: version 1.1.0
        FUSE library version 3.2.1
        using FUSE kernel interface version 7.26
  graphRoot: /home/testuser/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 8
  runRoot: /run/user/1000/containers
  volumePath: /home/testuser/.local/share/containers/storage/volumes
version:
  APIVersion: 3.0.0
  Built: 1614004613
  BuiltTime: Mon Feb 22 09:36:53 2021
  GitCommit: ""
  GoVersion: go1.14.12
  OsArch: linux/amd64
  Version: 3.0.1

Package info (e.g. output of rpm -q podman or apt list podman):

[testuser@localhost ~]$ rpm -q podman
podman-3.0.1-1.el8.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):
CentOS 8 VM

@openshift-ci-robot openshift-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Mar 5, 2021
@imperialguy imperialguy changed the title from "podman system service fails to start when used with systemd --user mode" to "podman system service fails to start when run via systemd --user mode" Mar 5, 2021
@mheon
Member

mheon commented Mar 5, 2021

@jwhonce Is Type=notify correct for podman system service + systemd?

@imperialguy
Author

I tried both notify and oneshot. Neither works. Again, this problem occurs only in systemd --user mode. Both notify and oneshot work just fine in root mode, which kind of defeats the purpose, because the whole point of using podman is that it supports rootless mode.

@rhatdan
Member

rhatdan commented Mar 5, 2021

We are shipping these with Fedora

$ cat /usr/lib/systemd/user/podman.socket
[Unit]
Description=Podman API Socket
Documentation=man:podman-system-service(1)

[Socket]
ListenStream=%t/podman/podman.sock
SocketMode=0660

[Install]
WantedBy=sockets.target

$ cat /usr/lib/systemd/user/podman.service
[Unit]
Description=Podman API Service
Requires=podman.socket
After=podman.socket
Documentation=man:podman-system-service(1)
StartLimitIntervalSec=0

[Service]
Type=exec
KillMode=process
Environment=LOGGING="--log-level=info"
ExecStart=/usr/bin/podman $LOGGING system service

Do these not work for you?
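
For readers comparing paths: %t in ListenStream is the systemd runtime-directory specifier, which is $XDG_RUNTIME_DIR for a user manager and /run for the system manager. That matches the remoteSocket path in the podman info output above. A small sketch of the resolution; podman_socket_path is a hypothetical helper, and the /run/user/<uid> fallback is an assumption for sessions where XDG_RUNTIME_DIR is unset:

```shell
# Hypothetical helper: print the path that ListenStream=%t/podman/podman.sock
# resolves to. Pass "system" for the system manager; anything else (or
# nothing) means the user manager.
podman_socket_path() {
    case "${1:-user}" in
        system) runtime_dir=/run ;;
        # Assumption: fall back to /run/user/<uid> if XDG_RUNTIME_DIR is unset.
        *)      runtime_dir="${XDG_RUNTIME_DIR:-/run/user/$(id -u)}" ;;
    esac
    printf '%s/podman/podman.sock\n' "$runtime_dir"
}

# Usage (not run here):
#   curl --unix-socket "$(podman_socket_path)" http://libpod/images/json
```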

@imperialguy
Author

imperialguy commented Mar 5, 2021

Aha, so the trick is Type=exec. That did it for CentOS 8.

But now there's a new problem, this time on archlinux. Using the same solution, i.e. Type=exec, here's what happens on archlinux:

If I run, e.g., curl --unix-socket /tmp/test.sock http://libpod/images/json

  • On archlinux (systemd --user mode) --> The systemd services show as up and running, but the curl is stuck indefinitely. It doesn't return anything.
  • On archlinux (systemd root mode) --> The systemd services show as up and running, and the curl returns responses just fine.

Not sure if this is specific to archlinux. I don't understand why though.

@rhatdan
Member

rhatdan commented Mar 5, 2021

Does podman-remote images work?

@imperialguy
Author

imperialguy commented Mar 5, 2021

Well, it works if I just run the standalone command podman system service -t 0 & and list images using podman-remote images, both for root and rootless users. The problem happens when I put that command into a systemd unit file, specifically for --user (i.e., rootless) mode on archlinux.

CASE 1: the below works (root user mode)

[Unit]
Description=Podman API Socket

[Socket]
ListenStream=/run/podman/podman.sock
SocketMode=0660

[Install]
WantedBy=sockets.target


[Unit]
Description=Podman API Service
Requires=podman.socket
After=podman.socket
StartLimitIntervalSec=0

[Service]
Type=exec
KillMode=process
Environment=LOGGING="--log-level=debug"
ExecStart=/usr/bin/podman $LOGGING system service
TimeoutStopSec=30

[Install]
WantedBy=multi-user.target

If I spin up the above unit files as root using systemctl start podman etc. and then run podman-remote images, that works and returns the proper response.

CASE 2: the below doesn't work (rootless mode)

[Unit]
Description=Podman API Socket

[Socket]
ListenStream=/run/user/1000/podman/podman.sock
SocketMode=0660

[Install]
WantedBy=sockets.target


[Unit]
Description=Podman API Service
Requires=podman.socket
After=podman.socket
StartLimitIntervalSec=0

[Service]
Type=exec
KillMode=process
Environment=LOGGING="--log-level=debug"
ExecStart=/usr/bin/podman $LOGGING system service
TimeoutStopSec=30

[Install]
WantedBy=multi-user.target

Now, if I spin up the above unit files in rootless user mode using systemctl --user start podman etc, and then run podman-remote images, it hangs indefinitely.

In both of the above cases, the systemctl status shows that both the podman socket and the service systemd units are active and running perfectly fine.

But, when you try to test it using the podman-remote images or any curl command for that matter, those commands indefinitely hang in --user (rootless) mode on archlinux. Surprisingly, I don't see this problem on CentOS 8.

Same applies to tcp sockets.
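
When reproducing this, it can help to bound the request so a hang shows up as a distinct failure. The sketch below uses curl's --max-time option and its documented exit codes (7 for a failed connection, 28 for a timeout); the helper names and the 5-second default are assumptions, and the URL is the one used earlier in this thread.

```shell
# Translate a curl exit status into a short diagnosis.
# 0, 7 and 28 are documented curl exit codes.
describe_curl_status() {
    case "$1" in
        0)  echo "ok" ;;
        7)  echo "connect failed" ;;
        28) echo "timed out" ;;
        *)  echo "curl error $1" ;;
    esac
}

# Hypothetical helper: query the API socket but give up after a few seconds
# instead of hanging. $1 = socket path, $2 = timeout in seconds (default 5).
probe_podman_api() {
    curl --silent --output /dev/null --max-time "${2:-5}" \
        --unix-socket "$1" http://libpod/images/json
    describe_curl_status "$?"
}

# Usage (not run here):
#   probe_podman_api /run/user/1000/podman/podman.sock
```

A "timed out" result with both units reported as active would correspond to the hang described above, while "connect failed" would point at the socket itself.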

@rhatdan
Member

rhatdan commented Mar 6, 2021

Do you see anything in the journal about the incoming connection? This might be a systemd issue.

@imperialguy
Author

imperialguy commented Mar 6, 2021

Actually, I am now seeing the issue on CentOS 8 too, and I tested it on CentOS 8.3 as well: same problem. Perhaps I got lucky last time with CentOS 8, which is even weirder.

As far as systemd goes, CentOS 8/8.3 use version 239, while archlinux uses version 247. The problem is exactly the same on all of them though: the rootless service is inaccessible when started via systemctl --user.

Attached is journalctl.log, an excerpt of the journalctl logs from when I make a curl request to the rootless service. I do see a lot of chatter from podman, but I'm not sure what to make of it. The curl still hangs on both tcp and unix sockets in rootless mode.
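
One way to cut through the debug chatter is to keep only lines at warning level or above. This is a rough sketch that relies on the level=... field podman prints in the excerpts in this thread; podman_log_problems is a hypothetical helper name, and empty output only means nothing was logged at those levels.

```shell
# Hypothetical helper: from a journal excerpt on stdin, keep only podman
# log lines at warning level or above. Prints nothing if there are none.
podman_log_problems() {
    grep -E 'level=(warning|error|fatal|panic)' || true
}

# Usage (not run here):
#   journalctl --user -u podman.service --no-pager | podman_log_problems
```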

@Krejza9

Krejza9 commented Mar 14, 2021

Hello this is same bug as #9280

@imperialguy
Author

They sure look similar. That one hasn't been resolved either.

@rhatdan
Member

rhatdan commented Mar 15, 2021

Let's concentrate on that bug.

@rhatdan rhatdan closed this as completed Mar 15, 2021
@imperialguy
Author

imperialguy commented Mar 20, 2021

Let's concentrate on that bug.

@rhatdan Not sure why this ticket was closed. They sure look similar, but I don't think they are exactly the same bug.

In this ticket's case, the services and sockets are running perfectly fine after switching to Type=exec. As I described in my comments from that point on, systemctl status shows them as active (running).

But, the problem is any kind of attempt to contact the service (using curl for e.g.) hangs indefinitely. You should take another look at this, separate from the other bug.

@rhatdan rhatdan reopened this Mar 23, 2021
@imperialguy
Author

Hello this is same bug as #9280

@rhatdan @Krejza9 Does #9928 also fix the issue described in the above comment about curl hanging indefinitely?

@Krejza9

Krejza9 commented Apr 4, 2021

Hello this is same bug as #9280

@rhatdan @Krejza9 Does #9928 also fix the issue described in the above comment about curl hanging indefinitely?

I can confirm #9928 works

[screenshot]

@imperialguy
Author

imperialguy commented Apr 4, 2021

@Krejza9 Just to clarify, the curl call in the screenshot you posted is tested against podman.sock that is running in rootless mode using systemctl --user, right?

@Krejza9

Krejza9 commented Apr 4, 2021

Exactly.
[screenshot]
If I run curl --no-buffer -XGET -s --unix-socket /home/intrapl/.config/podman.sock http://libpod/images/json | jq
you can see that "podman system service" is called correctly.
[screenshot]

@imperialguy
Author

@Krejza9 Thanks for confirming. I'll test it once it's made available in the next release.

@github-actions

github-actions bot commented May 5, 2021

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented May 5, 2021

I take it this is fixed in the main branch. Please reopen if I am mistaken.

@rhatdan rhatdan closed this as completed May 5, 2021
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 22, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 22, 2023