
Problem running rootless podman from a daemon user #6383

Closed
ck-schmidi opened this issue May 26, 2020 · 13 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. rootless stale-issue

Comments

@ck-schmidi

/kind bug

Description

We plan to run our Nagios monitoring checks from containers via Docker or Podman.
So far we don't have any issues with standard Docker, but we can't figure out a proper
solution for CentOS/Podman setups.

We want to run our plugins from Nagios via podman run.
For our test we simply configured podman run --rm hello-world as our plugin call. We got the following error message in
our monitoring system:

Error: could not get runtime: error generating default config from memory: cannot mkdir /run/user/0/libpod: mkdir /run/user/0/libpod: permission denied

So I tried to reproduce this error without running within Nagios.

Steps to reproduce the issue:

  1. While logged in as root, I switch the user with su nagios

  2. Running podman run --rm hello-world gives me the error I mentioned above.

Describe the results you received:

Running the podman command always gets me:

Error: could not get runtime: error generating default config from memory: cannot mkdir /run/user/0/libpod: mkdir /run/user/0/libpod: permission denied

Describe the results you expected:

Output should be:

Hello from Docker!
This message shows that your installation appears to be working correctly.
......

Additional information you deem important (e.g. issue happens only occasionally):

I already spent a lot of time searching for solutions online. Issue #5049 gave me some hints and also referred to the troubleshooting document, but it doesn't really help me.

The main difference is: when I switch the user via su - nagios, the proper
environment is populated and the podman run command works. When switching the user with su nagios (which is probably comparable to the nagios daemon call), the problem appears.

A good hint from this discussion was checking the XDG_RUNTIME_DIR environment variable:

With su - nagios the value is empty;
with su nagios the value is /run/user/0 (which is probably the problem).
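The behaviour described above can be illustrated with a small sketch. This is only an approximation of the lookup, not the actual libpod code: a set XDG_RUNTIME_DIR is trusted as-is, which is why the stale /run/user/0 left over from the root session causes the permission error after a plain `su`.

```shell
# Rough sketch (an approximation, NOT the real libpod code) of how
# rootless podman picks its runtime directory.
runtime_dir() {
    if [ -n "${XDG_RUNTIME_DIR:-}" ]; then
        echo "$XDG_RUNTIME_DIR"      # e.g. stale /run/user/0 after `su nagios`
    else
        echo "/run/user/$(id -u)"    # what pam_systemd provides on a full login
    fi
}
```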

Output of podman version:

With su - nagios it is:

Version:            1.6.4
RemoteAPI Version:  1
Go Version:         go1.13.4
OS/Arch:            linux/amd64

With su nagios it is (always the same error of course):

Error: could not get runtime: error generating default config from memory: cannot mkdir /run/user/0/libpod: mkdir /run/user/0/libpod: permission denied

Output of podman info --debug:

With su - nagios it is:

debug:                                                                                                                                                                                         
  compiler: gc                                                                                                                                                                                 
  git commit: ""                                                                                                                                                                               
  go version: go1.13.4                                                                                                                                                                         
  podman version: 1.6.4                                                                                                                                                                        
host:                                                                                                                                                                                          
  BuildahVersion: 1.12.0-dev                                                                                                                                                                   
  CgroupVersion: v1                                                                                                                                                                            
  Conmon:                                                                                                                                                                                      
    package: conmon-2.0.6-1.module_el8.1.0+298+41f9343a.x86_64                                                                                                                                 
    path: /usr/bin/conmon                                                                                                                                                                      
    version: 'conmon version 2.0.6, commit: 2721f230f94894671f141762bd0d1af2fb263239'                                                                                                          
  Distribution:                                                                                                                                                                                
    distribution: '"centos"'                                                                                                                                                                   
    version: "8"                                                                                                                                                                               
  IDMappings:                                                                                                                                                                                  
    gidmap:                                                                                                                                                                                    
    - container_id: 0                                                                                                                                                                          
      host_id: 1000                                                                                                                                                                            
      size: 1                                                                                                                                                                                  
    - container_id: 1                                                                                                                                                                          
      host_id: 100000                                                                                                                                                                          
      size: 65536                                                                                                                                                                              
    uidmap:                                                                                                                                                                                    
    - container_id: 0                                                                                                                                                                          
      host_id: 1000                                                                                                                                                                            
      size: 1                                                                                                                                                                                  
    - container_id: 1
      host_id: 100000
      size: 65536
  MemFree: 702074880
  MemTotal: 2035539968
  OCIRuntime:
    name: runc
    package: runc-1.0.0-64.rc9.module_el8.1.0+298+41f9343a.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.1-dev'
  SwapFree: 0
  SwapTotal: 0
  arch: amd64
  cpus: 1
  eventlogger: journald
  hostname: uat1
  kernel: 4.18.0-147.8.1.el8_1.x86_64
  os: linux
  rootless: true
  slirp4netns:
    Executable: /usr/bin/slirp4netns
    Package: slirp4netns-0.4.2-3.git21fdece.module_el8.1.0+298+41f9343a.x86_64
    Version: |-
      slirp4netns version 0.4.2+dev
      commit: 21fdece2737dc24ffa3f01a341b8a6854f8b13b4
  uptime: 144h 15m 41.41s (Approximately 6.00 days)
registries:
  blocked: null
  insecure: null
  search:
  - registry.access.redhat.com
  - registry.fedoraproject.org
  - registry.centos.org
  - docker.io
store:
  ConfigFile: /home/nagios/.config/containers/storage.conf
  ContainerStore:
    number: 3
  GraphDriverName: overlay
  GraphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-0.7.2-5.module_el8.1.0+298+41f9343a.x86_64
      Version: |-
        fuse-overlayfs: version 0.7.2
        FUSE library version 3.2.1
        using FUSE kernel interface version 7.26
  GraphRoot: /home/nagios/.local/share/containers/storage
  GraphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 1
  RunRoot: /tmp/run-1000
  VolumePath: /home/nagios/.local/share/containers/storage/volumes

With su nagios it is (always the same error of course):

Error: could not get runtime: error generating default config from memory: cannot mkdir /run/user/0/libpod: mkdir /run/user/0/libpod: permission denied

Package info (output of rpm -q podman):

podman-1.6.4-4.module_el8.1.0+298+41f9343a.x86_64
@openshift-ci-robot openshift-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label May 26, 2020
@rhatdan (Member) commented May 26, 2020

The problem is that you are not using a full user account. So XDG_RUNTIME_DIR is not enabled.

@rhatdan (Member) commented May 26, 2020

@giuseppe @mheon @vrothberg Can't we at least make the error message more helpful, since this is such a common issue that people hit?

@vrothberg (Member)

I think that's a good idea 👍 Let's see what our systemd friends come up with. I still hope we find a way to properly support that (or give guidelines).

@giuseppe (Member)

su nagios probably leaves XDG_RUNTIME_DIR set to the wrong value, so that is used.

Can you try su nagios printenv XDG_RUNTIME_DIR? What is the output? Any difference with su -l?

@ck-schmidi (Author) commented May 27, 2020

su nagios probably leaves XDG_RUNTIME_DIR set to the wrong value, so that is used.

Yes, that's the problem. It does not get changed.

Can you try su nagios printenv XDG_RUNTIME_DIR? What is the output? Any difference with su -l?

The result is /run/user/0, which is from the previously logged-in user.
When I use su -l, XDG_RUNTIME_DIR is not set anymore.

I know that everything works when doing a proper login with su -l, and that's actually not the real issue. I'm just trying to reproduce the error on the CLI to find out what the real issue could be and what might be a workaround.

Because the actual setup is the following:
we have a running nagios systemd service, which runs as the nagios daemon user.
So the systemd service runs the actual podman run ... command. It always gives me a 125 exit code, so I tried to reproduce this error outside of systemd. I thought using su nagios would give me more or less the same environment, but I'm actually not 100% sure.

So the real problem is rather running podman run from a systemd service, I guess.

Thank you for the quick responses!
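One common way to hand a daemon user a sane environment is a systemd drop-in for the service. A sketch under assumptions: the drop-in path, the unit name, and the UID 1000 are hypothetical and must match the actual setup; this thread does not confirm it as the fix.

```ini
# /etc/systemd/system/nagios.service.d/podman-env.conf  (hypothetical path)
[Service]
User=nagios
Environment=HOME=/home/nagios
# /run/user/1000 only exists while a session is active or lingering is
# enabled for the user (see: loginctl enable-linger nagios)
Environment=XDG_RUNTIME_DIR=/run/user/1000
```

After adding a drop-in, `systemctl daemon-reload` is needed for it to take effect.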

@giuseppe (Member)

are you trying to achieve something similar to #6400 ?

Could we close this issue as duplicate of #6400 ?

@ck-schmidi (Author) commented May 28, 2020

@giuseppe I don't think it's really a duplicate.
I now have some more insights and the real error message from Nagios.

The error message from podman run is actually:
Error: could not get runtime: error generating default config from memory: cannot stat /root/.config/containers/storage.conf: stat /root/.config/containers/storage.conf: permission denied

So my question is now: is there any other environment variable which leads to the reading of the wrong storage.conf file?

The executing user is not root, and the execution is done via a process which runs under systemd.

Some additional information:
the id command returns

uid=1000(nagios) gid=1000(nagios) groups=1000(nagios),1001(nagioscore) context=system_u:system_r:unconfined_service_t:s0 

in that context.
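For context, rootless podman derives the per-user storage.conf path from the environment. A minimal sketch of the lookup (an approximation, not the real containers/storage code): XDG_CONFIG_HOME wins if set, otherwise $HOME/.config is used, so a HOME that still points at /root sends podman to root's config.

```shell
# Approximation (NOT the real containers/storage code) of where
# rootless podman looks for the per-user storage.conf.
storage_conf() {
    echo "${XDG_CONFIG_HOME:-$HOME/.config}/containers/storage.conf"
}
```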

@giuseppe (Member)

So, my question is now: Is there any other environment varibale which leds to the reading of the wrong storage.conf file?

maybe HOME is still pointing to /root?

@ck-schmidi (Author) commented May 28, 2020

maybe HOME is still pointing to /root?

Yes, I just figured out that HOME is not set properly by systemd!

I found systemd/systemd#9652 which somehow describes a similar problem.

For us, this fix now gets podman running properly from systemd:

export XDG_RUNTIME_DIR=
export HOME=/home/`id -u -n`

I will investigate a little bit more into systemd, maybe I can find a proper solution.
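The two exports above can be wrapped into a small helper. A sketch, with assumptions: the function name is made up, `unset` is used instead of exporting an empty value, and getent resolves the real home directory from passwd instead of string-building a /home path, so users with non-standard home directories also work.

```shell
# fix_env: reset the variables that survive a plain `su` or a systemd
# service start (hypothetical helper, mirroring the two exports above).
fix_env() {
    unset XDG_RUNTIME_DIR
    # resolve the current user's real home directory from the passwd database
    HOME=$(getent passwd "$(id -u)" | cut -d: -f6)
    export HOME
}
# usage in the plugin call (hypothetical): fix_env && podman run --rm hello-world
```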

@giuseppe (Member)

@ck-schmidi could you give machinectl --uid $USER-UID shell a try and use that environment?

You may need to enable linger mode so that the containers are kept around when you terminate the user session.

@mheon mheon added the rootless label Jun 2, 2020
@github-actions bot commented Jul 3, 2020

A friendly reminder that this issue had no activity for 30 days.

@rhatdan (Member) commented Jul 6, 2020

I think we have a workaround; closing. Reopen if I am mistaken.

@rhatdan rhatdan closed this as completed Jul 6, 2020
@PavelSosin-320
@giuseppe This is exactly the same problem that has held me up for more than a month in #12247, #12264, and all my other issues. The only difference is that I use Fedora 34 Workstation with GNOME 4 and another method to switch users. This is exactly the same systemd environment hell! When a GNOME session is started by systemd, including the GNOME Terminal application, the systemd environment is assembled by different things: generators, services, systemd-pam, dbus-broker, etc. $USER, XDG_CONFIG_HOME, and XDG_DATA_HOME are added to the environment at different points in time. I tried to run crun as a systemd service and it doesn't work either, returning -1 instead of the 125 returned by podman. Actually, I saw somewhere that systemd doesn't promise that $HOME closely follows the user switch, because it can be needed for some systemd features like DynamicUser.
Unfortunately, the systemd user manager has only one target, default. I didn't find any way to synchronize the container's service with the systemd environment state: GNOME Terminal starts in parallel with the containers unit.
systemctl --user show-environment randomly doesn't contain HOME, XDG_CONFIG_HOME, XDG_DATA_HOME.
Even worse, HOME sometimes points to the previous user's home. Manual execution of systemctl --user daemon-reexec definitely breaks it.

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 21, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 21, 2023