
Correct systemd configuration for rootless containers // mkdir /run/user/1001: permission denied #5094

Closed
groovyman opened this issue Feb 5, 2020 · 25 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@groovyman

groovyman commented Feb 5, 2020

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description
I am trying to control the start/stop of a rootless container with systemd. Yesterday Matthew was kind enough to help me with #5057. Then I checked some useful links.

My service-file looks like this:

[Unit]
Description=Kivitendo podman runner
Wants=postgres.service

[Service]
User=techlxoffice
Restart=always
ExecStart=/usr/bin/podman start -a funny_zhukovsky
ExecStop=/usr/bin/podman stop -t 10 funny_zhukovsky
StandardOutput=file:/mnt/raidSpace/Homes/techlxoffice/DataSpace/hostLog/podmankiviStdout.log
StandardError=file:/mnt/raidSpace/Homes/techlxoffice/DataSpace/hostLog/podmankiviStderr.log

[Install]
WantedBy=default.target
Alias=kivitendo.prod.service

When I run systemctl start/stop on a running system, systemd is able to start/stop the container without any problems. Please note that the container lives inside a user account (I call it a technical user, simply non-root).
As mentioned above, start/stop works from the console. Unfortunately it fails when the host system is restarted. podman start returns:

Error: could not get runtime: error creating tmpdir /run/user/1001/libpod/tmp: mkdir /run/user/1001: permission denied

I think podman start should be called later, if possible.

Steps to reproduce the issue:

  1. Configure the environment as described above and reboot.

Describe the results you received:
podman writes the error quoted below under "Additional information".

Describe the results you expected:
I would expect the same behaviour as when I run systemctl restart from my console.

Additional information you deem important (e.g. issue happens only occasionally):
I receive the following message in the output (as defined in the service file above).

Error: could not get runtime: error creating tmpdir /run/user/1001/libpod/tmp: mkdir /run/user/1001: permission denied

I guess the user session environment has not been set up correctly.
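That guess can be checked directly: rootless podman keeps its runtime state under $XDG_RUNTIME_DIR, which normally is /run/user/$UID and is created by systemd-logind when the user's session starts. A quick sketch one can run as the service user (the path layout is the usual systemd-logind convention, assumed here):

```shell
# If no session has been set up yet -- e.g. early during boot -- the
# runtime directory is missing and podman fails exactly as above.
uid="$(id -u)"
rundir="${XDG_RUNTIME_DIR:-/run/user/$uid}"
echo "expected runtime dir: $rundir"
if [ -d "$rundir" ]; then
  echo "runtime dir exists"
else
  echo "runtime dir missing - user session not set up yet"
fi
```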

Output of podman version:

rpm-4.15.1-1.fc31.x86_64

Output of podman info --debug:

[techlxoffice@chasmash hostLog]$ podman info --debug
debug:
  compiler: gc
  git commit: ""
  go version: go1.13.5
  podman version: 1.7.0
host:
  BuildahVersion: 1.12.0
  CgroupVersion: v2
  Conmon:
    package: conmon-2.0.10-2.fc31.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.10, commit: 6b526d9888abb86b9e7de7dfdeec0da98ad32ee0'
  Distribution:
    distribution: fedora
    version: "31"
  IDMappings:
    gidmap:
    - container_id: 0
      host_id: 1001
      size: 1
    - container_id: 1
      host_id: 165536
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1001
      size: 1
    - container_id: 1
      host_id: 165536
      size: 65536
  MemFree: 15643013120
  MemTotal: 16427343872
  OCIRuntime:
    name: crun
    package: crun-0.10.6-1.fc31.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.10.6
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  SwapFree: 8300523520
  SwapTotal: 8300523520
  arch: amd64
  cpus: 4
  eventlogger: journald
  hostname: chasmash
  kernel: 5.4.15-200.fc31.x86_64
  os: linux
  rootless: true
  slirp4netns:
    Executable: /usr/bin/slirp4netns
    Package: slirp4netns-0.4.0-20.1.dev.gitbbd6f25.fc31.x86_64
    Version: |-
      slirp4netns version 0.4.0-beta.3+dev
      commit: bbd6f25c70d5db2a1cd3bfb0416a8db99a75ed7e
  uptime: 9m 12.6s
registries:
  search:
  - docker.io
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
  - quay.io
store:
  ConfigFile: /mnt/raidSpace/Homes/techlxoffice/.config/containers/storage.conf
  ContainerStore:
    number: 1
  GraphDriverName: overlay
  GraphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-0.7.5-2.fc31.x86_64
      Version: |-
        fusermount3 version: 3.6.2
        fuse-overlayfs: version 0.7.5
        FUSE library version 3.6.2
        using FUSE kernel interface version 7.29
  GraphRoot: /mnt/raidSpace/Homes/techlxoffice/.local/share/containers/storage
  GraphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 109
  RunRoot: /run/user/1001/containers
  VolumePath: /mnt/raidSpace/Homes/techlxoffice/.local/share/containers/storage/volumes

Package info (e.g. output of rpm -q podman or apt list podman):

rpm-4.15.1-1.fc31.x86_64

Additional environment details (AWS, VirtualBox, physical, etc.):

@openshift-ci-robot openshift-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Feb 5, 2020
@groovyman
Author

groovyman commented Feb 5, 2020

As I mentioned in my previous post, the reboot of the host was still in progress while podman tried to launch the container, which is stored in the user's home directory. As a consequence, /run/user/1001 was not yet ready for podman.
In issue #5057 @mheon gave me a nice introduction to their process that checks whether a reboot has occurred.

So we need a condition that delays the podman start call until the user's home environment is ready. The solution is to add Requires/After on the user's [email protected] to the [Unit] section, and the Group to the [Service] section, as shown below:

[Unit]
Description=Kivitendo podman runner
Wants=postgres.service
[email protected]
[email protected]

[Service]
User=techlxoffice
Group=techlxoffice
Restart=always
ExecStart=/usr/bin/podman start -a funny_zhukovsky
ExecStop=/usr/bin/podman stop -t 10 funny_zhukovsky
StandardOutput=file:/mnt/raidSpace/Homes/techlxoffice/DataSpace/hostLog/podmankiviStdout.log
StandardError=file:/mnt/raidSpace/Homes/techlxoffice/DataSpace/hostLog/podmankiviStderr.log

[Install]
WantedBy=default.target

Nice work!
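As an aside, the same dependencies could also be supplied without editing the unit file itself, via a systemd drop-in (a sketch; the unit name kivitendo.service and the drop-in path are assumed):

```ini
# /etc/systemd/system/kivitendo.service.d/override.conf (assumed path)
[Unit]
[email protected]
[email protected]
```

After adding a drop-in, run systemctl daemon-reload so systemd picks it up.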

@vrothberg
Member

I am re-opening as this is something the generated service files should include. Thanks for the nice summary!

@vrothberg vrothberg reopened this Feb 5, 2020
@vrothberg vrothberg self-assigned this Feb 5, 2020
@groovyman
Author

groovyman commented Feb 10, 2020

Salut Valentin,

there is another thing you should check out. Each time the host reboots, the host database is (of course) restarted, but the web application inside the podman container comes back up with all its old TCP connections to the outside (host) world.

So /usr/bin/podman start -a funny_zhukovsky runs the Apache/Perl web application with all its old connections to the host database. The application then reports an error; I guess the passive endpoint of the postgres listener ends up closed, anything but a working socket.

When I restart the host database via systemctl restart postgresql.service, the podman-served Apache application recognizes (I think) the broken connection to the passive postgres listener and reconnects. So I would like to tell the container that the application should reconnect.

Thank you
Christian

@rhatdan
Member

rhatdan commented Feb 18, 2020

@groovyman is this something to be fixed in podman, or something the user would need to fix?

@groovyman
Author

groovyman commented Feb 18, 2020

This should be your stuff. I think podman freezes all socket connections on podman stop, regardless of whether the other side of the connection is inside the container (which is fine) or outside. I think you should disconnect the connections to the outside; it seems to me that they come back up in a stale state.

@rhatdan
Member

rhatdan commented Feb 18, 2020

So this is a container listening to processes on the outside via a Unix domain socket? If so, this would require SELinux separation to be disabled. Similarly, if it is connecting to a socket on the other side, it would need to be disabled.

@groovyman
Author

groovyman commented Mar 3, 2020

Salut Daniel,
no, it is a TCP socket! (postgres)

This is the systemd service file created by podman. The service is running as a user process under techlxoffice. The file has been stored in ~/.config/systemd/user/container-kivi.service. I successfully tested systemctl --user start/stop container-kivi.service.

# container-kiki_p05.service
# autogenerated by Podman 1.8.0
# Sun Mar  1 23:30:21 CET 2020

[Unit]
Description=Podman container-kiki_p05.service
Documentation=man:podman-generate-systemd(1)

[Service]
Restart=on-failure
ExecStart=/usr/bin/podman start kiki_p05
ExecStop=/usr/bin/podman stop -t 10 kiki_p05
PIDFile=/run/user/1001/containers/overlay-containers/016f2e02a570cf36a5822dea07bb3f14d5ab685fde636594085d58c834ec07ff/userdata/conmon.pid
KillMode=none
Type=forking

[Install]
WantedBy=multi-user.target

Do I have to add a dependency on the host postgres service?

Oops, there is a symlink to this file under ~/.config/systemd/user/multi-user.target.wants/container-kivi.service. Who created this?

@rhatdan
Member

rhatdan commented Mar 3, 2020

@vrothberg Any ideas?

@vrothberg
Member

Do I have to add a dependency on the host postgres service?

If your container needs the postgres service to be running, then you need to add this as a dependency.

Oops, there is a symlink to this file under ~/.config/systemd/user/multi-user.target.wants/container-kivi.service.

I assume systemd did.
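More precisely, that symlink is what systemctl --user enable creates: it links the unit into the .wants/ directory of whatever target the [Install] section's WantedBy= names, here multi-user.target. The effect can be sketched with plain shell against a temporary directory (paths assumed, nothing real is touched):

```shell
# Emulate what 'systemctl --user enable container-kivi.service' does for a
# unit declaring WantedBy=multi-user.target. A temp dir stands in for
# ~/.config/systemd/user.
unitdir="$(mktemp -d)"
touch "$unitdir/container-kivi.service"          # the unit file itself
mkdir -p "$unitdir/multi-user.target.wants"
ln -s "$unitdir/container-kivi.service" \
      "$unitdir/multi-user.target.wants/container-kivi.service"
ls "$unitdir/multi-user.target.wants/"           # -> container-kivi.service
```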

@groovyman
Author

uh yes!

[Unit]
Description=Podman container-kiki_p05.service
Documentation=man:podman-generate-systemd(1)
Requires=postgresql.service
After=network.target postgresql.service

Thank you!

@vrothberg vrothberg reopened this Mar 3, 2020
@vrothberg
Member

... is still outstanding (working on it atm)

@groovyman
Author

groovyman commented Mar 3, 2020

Ah, I see. This belongs to the issue I posted this morning (see my idea at the bottom of my error description). So systemctl should wait to start the service until the user level and the user account become available (see issue #5377).

As a consequence, I should add [email protected] as Requires and After to my systemd service file. That makes sense.

@groovyman
Author

I gave the change of the systemd service file a try. With [email protected] in the systemd file, the services are not started. Regardless of the change you have to implement, the system still fails to run my services (see #5377). There is another problem.

@groovyman
Author

I think the user@1001 service reference is wrong, because when I try:

[techlxoffice@chasmash ~]$ systemctl --user start container-kivi.service
Failed to start container-kivi.service: Unit [email protected] not found.

@vrothberg
Member

I believe that's not something we can control; systemd manages that on its own. I need to test a bit in a VM. There's a dedicated run-mount service for each user.

@groovyman
Author

Do I have to add a dependency on the host postgres service?

If your container needs the postgres service to be running, then you need to add this as a dependency.

Oops, there is a symlink to this file under ~/.config/systemd/user/multi-user.target.wants/container-kivi.service.

I assume systemd did.

Yes, and how do you access postgres.service from within a systemd user scope?

[techlxoffice@chasmash ~]$ systemctl --user start container-kivi.service
Failed to start container-kivi.service: Unit [email protected] not found.

@vrothberg
Member

Only root can see/alter them. Give me a couple more days to look into it :)

@vrothberg
Member

FWIW, I cannot reproduce on F31 rootless. In the root case, we had to add network dependencies, which is addressed with #5382.

Can you try run-user-$UID.mount instead of user@$UID.service and see if that works for you?
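For context, that unit name comes from systemd's path escaping: a mount unit is named after its mount point with '/' turned into '-' and the leading slash dropped, which systemd-escape -p --suffix=mount /run/user/1001 would also compute. A minimal sketch of the rule for the simple case (no characters needing further escaping):

```shell
# The mount unit for /run/user/$UID is named by replacing '/' with '-'
# and dropping the leading slash: /run/user/1001 -> run-user-1001.mount.
# Note the hyphens; "run-user$UID.mount" without the separator will not
# resolve to any unit.
uid=1001
path="/run/user/$uid"
unit="$(printf '%s' "${path#/}" | tr '/' '-').mount"
echo "$unit"    # run-user-1001.mount
```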

@groovyman
Author

groovyman commented Mar 5, 2020

  1. Yes, I am also running the service rootless.
  2. Sad to say, not successful:
[lohnbuchhaltung@chasmash ~]$ systemctl --user daemon-reload
[lohnbuchhaltung@chasmash ~]$ systemctl --user status container-lohnbuch
● container-lohnbuch.service - Podman container-lohnbuch_00.service
   Loaded: loaded (/mnt/raidSpace/Homes/lohnbuchhaltung/.config/systemd/user/container-lohnbuch.service; enabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:podman-generate-systemd(1)

Mar 05 15:13:28 chasmash systemd[5769]: /mnt/raidSpace/Homes/lohnbuchhaltung/.config/systemd/user/container-lohnbuch.service:5: Failed to add dependency on [email protected], ignoring: Invalid argument
Mar 05 15:13:28 chasmash systemd[5769]: /mnt/raidSpace/Homes/lohnbuchhaltung/.config/systemd/user/container-lohnbuch.service:6: Failed to add dependency on [email protected], ignoring: Invalid argument
Mar 05 15:14:24 chasmash systemd[5769]: /mnt/raidSpace/Homes/lohnbuchhaltung/.config/systemd/user/container-lohnbuch.service:5: Failed to add dependency on [email protected], ignoring: Invalid argument
Mar 05 15:14:24 chasmash systemd[5769]: /mnt/raidSpace/Homes/lohnbuchhaltung/.config/systemd/user/container-lohnbuch.service:6: Failed to add dependency on [email protected], ignoring: Invalid argument
Mar 05 15:15:21 chasmash systemd[5769]: /mnt/raidSpace/Homes/lohnbuchhaltung/.config/systemd/user/container-lohnbuch.service:5: Failed to add dependency on [email protected], ignoring: Invalid argument
Mar 05 15:15:21 chasmash systemd[5769]: /mnt/raidSpace/Homes/lohnbuchhaltung/.config/systemd/user/container-lohnbuch.service:6: Failed to add dependency on [email protected], ignoring: Invalid argument
Mar 05 15:22:42 chasmash systemd[5769]: /mnt/raidSpace/Homes/lohnbuchhaltung/.config/systemd/user/container-lohnbuch.service:5: Failed to add dependency on [email protected], ignoring: Invalid argument
Mar 05 15:22:42 chasmash systemd[5769]: /mnt/raidSpace/Homes/lohnbuchhaltung/.config/systemd/user/container-lohnbuch.service:6: Failed to add dependency on [email protected], ignoring: Invalid argument
Mar 05 15:23:18 chasmash systemd[5769]: /mnt/raidSpace/Homes/lohnbuchhaltung/.config/systemd/user/container-lohnbuch.service:5: Failed to add dependency on run-user$1002.mount, ignoring: Invalid argument
Mar 05 15:23:18 chasmash systemd[5769]: /mnt/raidSpace/Homes/lohnbuchhaltung/.config/systemd/user/container-lohnbuch.service:6: Failed to add dependency on run-user$1002.mount, ignoring: Invalid argument

The weird thing is that it did not forget the old dependencies.

I think the database dependency can be implemented by connecting to the DB listener in ExecStartPre=, but I would still expect a symbol here like [email protected].
I think it is time for a deeper look into the manuals; I should look for a known target.
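The ExecStartPre= idea mentioned above could look roughly like this in the [Service] section (a sketch; pg_isready ships with the postgresql client tools, and host/port are assumed):

```ini
[Service]
# Wait (up to ~30s) until the host database accepts connections before
# starting the container; fail the unit if it never comes up.
ExecStartPre=/bin/sh -c 'for i in $(seq 1 30); do pg_isready -h 127.0.0.1 -p 5432 && exit 0; sleep 1; done; exit 1'
ExecStart=/usr/bin/podman start lohnbuch_00
```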

@vrothberg
Member

@groovyman, can you try with a simpler service, maybe just running podman run -d alpine top? Once we have this running on your machine, we can tackle your more complex scenario.

@groovyman
Author

groovyman commented Mar 7, 2020

Yeah, I did not, because I had already described two of my containers (a more complex one and a simple one), and the simple one was already simple enough to play with. But I found a solution to this problem; simply take a look at the "first idea" section inside #5377.

Under "An idea" I described my impression that the user context had not been completely set up to run the container. I described the situation where I entered the technical account not via a login, but as root followed by su - techuser. When I stepped in this way (without a login), podman did not work at all. But when I log in via ssh, it works.

So the solution is to run (as root):

[root@chasmash ~]# loginctl enable-linger techlxoffice         
[root@chasmash ~]# loginctl enable-linger lohnbuchhaltung

to enable both account contexts so that podman can be invoked by systemd --user. I also changed the service a bit:

# container-lohnbuch_00.service
# autogenerated by Podman 1.8.0
# Mon Mar  2 01:00:56 CET 2020

[Unit]
Description=Podman container-lohnbuch_00.service
Documentation=man:podman-generate-systemd(1)
After=multi-user.target

[Service]
Restart=on-failure
ExecStart=/usr/bin/podman start lohnbuch_00
ExecStop=/usr/bin/podman stop -t 10 lohnbuch_00
PIDFile=/run/user/1002/containers/overlay-containers/185bb3e27555195312460d7c2964126bc2d51f57d2cfd4b83545cfc4a816d10e/userdata/conmon.pid
KillMode=none
Type=forking

[Install]
WantedBy=default.target

but the solution was loginctl!
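For reference, loginctl enable-linger records the flag as a file under /var/lib/systemd/linger/ (one file per user), which is what keeps the per-user systemd instance, and with it /run/user/$UID, alive without an interactive session. A small sketch of checking that flag, run here against a temporary stand-in directory so it needs no root:

```shell
# linger_enabled USER [STATEDIR] -- true if a linger flag file exists.
# STATEDIR defaults to systemd's real location, /var/lib/systemd/linger.
linger_enabled() {
  [ -e "${2:-/var/lib/systemd/linger}/$1" ]
}

# Demo against a temp dir (stands in for the real state directory):
d="$(mktemp -d)"
touch "$d/techlxoffice"                  # what 'enable-linger' would create
linger_enabled techlxoffice "$d" && echo "techlxoffice: linger on"
linger_enabled lohnbuchhaltung "$d" || echo "lohnbuchhaltung: linger off"
```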

@vrothberg
Member

#5427 will fix parts of the reported issues here as well.

@vrothberg
Member

With #5427 merged, I am closing this issue. Both system and user services are now reported to start correctly during boot.

@groovyman
Author

groovyman commented Mar 22, 2020

This issue is closed, but there was another problem I mentioned: stale connections from the host. I finally resolved that issue (it was not on your side). I was wondering why the connection from the web server inside my container to the database listener outside on the host was sometimes stale. I have now figured out that the database was sometimes started too early by systemd and was unable to bind to the eth0 device that the containerized application was addressing. So that was my fault. (The solution is here.)

Everything is working for me now; my accounting application runs reliably in a podman container. In build mode, the application

  • establishes a server, installing a bunch of packages,
  • pulls some program code from GitHub,
  • configures systemd inside with up to 3 services,

and finally completes its work when it is started (podman run).

Thanks to you all for your great job and your notable support!

@vrothberg
Member

@groovyman, thanks a lot for sharing! I am really happy it's working.

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 23, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 23, 2023