Cannot enter container – crun: setrlimit RLIMIT_NPROC: Operation not permitted: OCI permission denied #1312

Closed
rugk opened this issue Jun 12, 2023 · 25 comments
Labels
1. Bug Something isn't working

Comments

@rugk

rugk commented Jun 12, 2023

Describe the bug
A clear and concise description of what the bug is. If possible, re-run the command(s) with --log-level debug and put the output here.

Steps how to reproduce the behaviour

  1. Uhm… just created a new Fedora 38 container (toolbox create)
  2. Deleted old Fedora 36 and 37 containers (toolbox rm fedora-toolbox-36 etc.)
  3. Rebooted (and applied some rpm-ostree updates by that)
  4. And then I could not enter the container with toolbox enter anymore.

Expected behaviour
I can enter my container, and it does not suddenly fail to start after a reboot… :(

Actual behaviour

$ toolbox enter                          
Error: failed to start container fedora-toolbox-38

Output of toolbox --version (v0.0.90+)

$ toolbox --version
toolbox version 0.0.99.4

Toolbox package info (rpm -q toolbox)

$ rpm -q toolbox
toolbox-0.0.99.4-1.fc38.x86_64

Output of podman version
e.g.,

$ podman version
Client:       Podman Engine
Version:      4.5.1
API Version:  4.5.1
Go Version:   go1.20.4
Built:        Fri May 26 19:58:48 2023
OS/Arch:      linux/amd64

Podman package info (rpm -q podman)

podman-4.5.1-1.fc38.x86_64

Info about your OS
Fedora Silverblue 38

$ rpm-ostree status -v
State: idle
AutomaticUpdates: stage; rpm-ostreed-automatic.timer: no runs since boot
Deployments:
● fedora:fedora/38/x86_64/silverblue (index: 0)
                  Version: 38.20230612.0 (2023-06-12T00:45:27Z)
[…]

Additional context

I mean, you had one job, and this failed… I mean, no offense, I am just sad…

$ toolbox enter -vv             
DEBU Running as real user ID 1000                 
DEBU Resolved absolute path to the executable as /usr/bin/toolbox 
DEBU Running on a cgroups v2 host                 
DEBU Looking for sub-GID and sub-UID ranges for user rugk 
DEBU TOOLBOX_PATH is /usr/bin/toolbox             
DEBU Migrating to newer Podman                    
DEBU Toolbox config directory is /var/home/rugk/.config/toolbox 
INFO[0000] podman filtering at log level debug          
DEBU[0000] Called version.PersistentPreRunE(podman --log-level debug version --format json) 
DEBU[0000] Using conmon: "/usr/bin/conmon"              
DEBU[0000] Initializing boltdb state at /var/home/rugk/.local/share/containers/storage/libpod/bolt_state.db 
DEBU[0000] Using graph driver overlay                   
DEBU[0000] Using graph root /var/home/rugk/.local/share/containers/storage 
DEBU[0000] Using run root /run/user/1000/containers     
DEBU[0000] Using static dir /var/home/rugk/.local/share/containers/storage/libpod 
DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp      
DEBU[0000] Using volume path /var/home/rugk/.local/share/containers/storage/volumes 
DEBU[0000] Using transient store: false                 
DEBU[0000] Not configuring container store              
DEBU[0000] Initializing event backend journald          
DEBU[0000] Configured OCI runtime runc initialization failed: no valid executable found for OCI runtime runc: invalid argument 
DEBU[0000] Configured OCI runtime krun initialization failed: no valid executable found for OCI runtime krun: invalid argument 
DEBU[0000] Configured OCI runtime ocijail initialization failed: no valid executable found for OCI runtime ocijail: invalid argument 
DEBU[0000] Configured OCI runtime crun-wasm initialization failed: no valid executable found for OCI runtime crun-wasm: invalid argument 
DEBU[0000] Configured OCI runtime runj initialization failed: no valid executable found for OCI runtime runj: invalid argument 
DEBU[0000] Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument 
DEBU[0000] Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument 
DEBU[0000] Configured OCI runtime youki initialization failed: no valid executable found for OCI runtime youki: invalid argument 
DEBU[0000] Using OCI runtime "/usr/bin/crun"            
INFO[0000] Setting parallel job count to 49             
DEBU[0000] Called version.PersistentPostRunE(podman --log-level debug version --format json) 
DEBU[0000] Shutting down engines                        
DEBU Current Podman version is 4.5.1              
DEBU Creating runtime directory /run/user/1000/toolbox 
DEBU Old Podman version is 4.5.1                  
DEBU Migration not needed: Podman version 4.5.1 is unchanged 
DEBU Setting up configuration                     
DEBU Setting up configuration: file /var/home/rugk/.config/containers/toolbox.conf not found 
DEBU Resolving container and image names          
DEBU Container: ''                                
DEBU Distribution (CLI): ''                       
DEBU Image (CLI): ''                              
DEBU Release (CLI): ''                            
DEBU Resolved container and image names           
DEBU Container: 'fedora-toolbox-38'               
DEBU Image: 'fedora-toolbox:38'                   
DEBU Release: '38'                                
DEBU Resolving container and image names          
DEBU Container: ''                                
DEBU Distribution (CLI): ''                       
DEBU Image (CLI): ''                              
DEBU Release (CLI): ''                            
DEBU Resolved container and image names           
DEBU Container: 'fedora-toolbox-38'               
DEBU Image: 'fedora-toolbox:38'                   
DEBU Release: '38'                                
DEBU Checking if container fedora-toolbox-38 exists 
INFO[0000] podman filtering at log level debug          
DEBU[0000] Called exists.PersistentPreRunE(podman --log-level debug container exists fedora-toolbox-38) 
DEBU[0000] Using conmon: "/usr/bin/conmon"              
DEBU[0000] Initializing boltdb state at /var/home/rugk/.local/share/containers/storage/libpod/bolt_state.db 
DEBU[0000] Using graph driver overlay                   
DEBU[0000] Using graph root /var/home/rugk/.local/share/containers/storage 
DEBU[0000] Using run root /run/user/1000/containers     
DEBU[0000] Using static dir /var/home/rugk/.local/share/containers/storage/libpod 
DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp      
DEBU[0000] Using volume path /var/home/rugk/.local/share/containers/storage/volumes 
DEBU[0000] Using transient store: false                 
DEBU[0000] [graphdriver] trying provided driver "overlay" 
DEBU[0000] overlay: mount_program=/usr/bin/fuse-overlayfs 
DEBU[0000] overlay: mount_program=/usr/bin/fuse-overlayfs 
DEBU[0000] backingFs=extfs, projectQuotaSupported=false, useNativeDiff=false, usingMetacopy=false 
DEBU[0000] Initializing event backend journald          
DEBU[0000] Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument 
DEBU[0000] Configured OCI runtime youki initialization failed: no valid executable found for OCI runtime youki: invalid argument 
DEBU[0000] Configured OCI runtime ocijail initialization failed: no valid executable found for OCI runtime ocijail: invalid argument 
DEBU[0000] Configured OCI runtime crun-wasm initialization failed: no valid executable found for OCI runtime crun-wasm: invalid argument 
DEBU[0000] Configured OCI runtime runc initialization failed: no valid executable found for OCI runtime runc: invalid argument 
DEBU[0000] Configured OCI runtime runj initialization failed: no valid executable found for OCI runtime runj: invalid argument 
DEBU[0000] Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument 
DEBU[0000] Configured OCI runtime krun initialization failed: no valid executable found for OCI runtime krun: invalid argument 
DEBU[0000] Using OCI runtime "/usr/bin/crun"            
INFO[0000] Setting parallel job count to 49             
DEBU[0000] Called exists.PersistentPostRunE(podman --log-level debug container exists fedora-toolbox-38) 
DEBU[0000] Shutting down engines                        
DEBU Inspecting mounts of container fedora-toolbox-38 
INFO[0000] podman filtering at log level debug          
DEBU[0000] Called inspect.PersistentPreRunE(podman --log-level debug inspect --format json --type container fedora-toolbox-38) 
DEBU[0000] Using conmon: "/usr/bin/conmon"              
DEBU[0000] Initializing boltdb state at /var/home/rugk/.local/share/containers/storage/libpod/bolt_state.db 
DEBU[0000] Using graph driver overlay                   
DEBU[0000] Using graph root /var/home/rugk/.local/share/containers/storage 
DEBU[0000] Using run root /run/user/1000/containers     
DEBU[0000] Using static dir /var/home/rugk/.local/share/containers/storage/libpod 
DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp      
DEBU[0000] Using volume path /var/home/rugk/.local/share/containers/storage/volumes 
DEBU[0000] Using transient store: false                 
DEBU[0000] [graphdriver] trying provided driver "overlay" 
DEBU[0000] overlay: mount_program=/usr/bin/fuse-overlayfs 
DEBU[0000] overlay: mount_program=/usr/bin/fuse-overlayfs 
DEBU[0000] backingFs=extfs, projectQuotaSupported=false, useNativeDiff=false, usingMetacopy=false 
DEBU[0000] Initializing event backend journald          
DEBU[0000] Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument 
DEBU[0000] Configured OCI runtime youki initialization failed: no valid executable found for OCI runtime youki: invalid argument 
DEBU[0000] Configured OCI runtime ocijail initialization failed: no valid executable found for OCI runtime ocijail: invalid argument 
DEBU[0000] Configured OCI runtime runj initialization failed: no valid executable found for OCI runtime runj: invalid argument 
DEBU[0000] Configured OCI runtime crun-wasm initialization failed: no valid executable found for OCI runtime crun-wasm: invalid argument 
DEBU[0000] Configured OCI runtime runc initialization failed: no valid executable found for OCI runtime runc: invalid argument 
DEBU[0000] Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument 
DEBU[0000] Configured OCI runtime krun initialization failed: no valid executable found for OCI runtime krun: invalid argument 
DEBU[0000] Using OCI runtime "/usr/bin/crun"            
INFO[0000] Setting parallel job count to 49             
DEBU[0000] Looking up image "997b52ccbf8544c42851a181e80bcd0f081eff8a879256b67d273a7e07f31f6f" in local containers storage 
DEBU[0000] Trying "997b52ccbf8544c42851a181e80bcd0f081eff8a879256b67d273a7e07f31f6f" ... 
DEBU[0000] parsed reference into "[overlay@/var/home/rugk/.local/share/containers/storage+/run/user/1000/containers:overlay.mount_program=/usr/bin/fuse-overlayfs,overlay.mount_program=/usr/bin/fuse-overlayfs]@997b52ccbf8544c42851a181e80bcd0f081eff8a879256b67d273a7e07f31f6f" 
DEBU[0000] Found image "997b52ccbf8544c42851a181e80bcd0f081eff8a879256b67d273a7e07f31f6f" as "997b52ccbf8544c42851a181e80bcd0f081eff8a879256b67d273a7e07f31f6f" in local containers storage 
DEBU[0000] Found image "997b52ccbf8544c42851a181e80bcd0f081eff8a879256b67d273a7e07f31f6f" as "997b52ccbf8544c42851a181e80bcd0f081eff8a879256b67d273a7e07f31f6f" in local containers storage ([overlay@/var/home/rugk/.local/share/containers/storage+/run/user/1000/containers:overlay.mount_program=/usr/bin/fuse-overlayfs,overlay.mount_program=/usr/bin/fuse-overlayfs]@997b52ccbf8544c42851a181e80bcd0f081eff8a879256b67d273a7e07f31f6f) 
DEBU[0000] Called inspect.PersistentPostRunE(podman --log-level debug inspect --format json --type container fedora-toolbox-38) 
DEBU[0000] Shutting down engines                        
DEBU Starting container fedora-toolbox-38         
Error: failed to start container fedora-toolbox-38

When I run podman manually:

$ podman start fedora-toolbox-38
Error: unable to start container "f8accc0c103a4fc741b8592de53010f8630d502d97a0050c367e2401cac1501f": crun: setrlimit `RLIMIT_NPROC`: Operation not permitted: OCI permission denied
$ toolbox list -i
IMAGE ID      IMAGE NAME                                    CREATED
215122d241c2  registry.fedoraproject.org/fedora-toolbox:36  13 months ago
90d416a5811e  registry.fedoraproject.org/fedora-toolbox:37  6 months ago
997b52ccbf85  registry.fedoraproject.org/fedora-toolbox:38  2 months ago
$ podman start --attach fedora-toolbox-38 
Error: unable to start container f8accc0c103a4fc741b8592de53010f8630d502d97a0050c367e2401cac1501f: crun: setrlimit `RLIMIT_NPROC`: Operation not permitted: OCI permission denied

Related issues

Search only turned up #1297, but I do not use VirtualBox.
I also tried containers/podman#14284 (comment) (again, although I don't use VirtualBox), and it kind of works, but it still throws errors (and is no solution, of course):

$ podman run --rm -it --privileged --group-add keep-groups fedora-toolbox:38 
WARN[0000] Error validating CNI config file /var/home/rugk/.config/cni/net.d/87-podman.conflist: [failed to find plugin "bridge" in path [/usr/local/libexec/cni /usr/libexec/cni /usr/local/lib/cni /usr/lib/cni /opt/cni/bin] failed to find plugin "portmap" in path [/usr/local/libexec/cni /usr/libexec/cni /usr/local/lib/cni /usr/lib/cni /opt/cni/bin] failed to find plugin "firewall" in path [/usr/local/libexec/cni /usr/libexec/cni /usr/local/lib/cni /usr/lib/cni /opt/cni/bin] failed to find plugin "tuning" in path [/usr/local/libexec/cni /usr/libexec/cni /usr/local/lib/cni /usr/lib/cni /opt/cni/bin]] 
[root@d3693d8a2176 /]# 

edit: forget the command line above, it starts a fresh container based on the …:38 image, so it's not my existing container instance…

I have no idea what config files these are…

$ cat /var/home/rugk/.config/cni/net.d/87-podman.conflist                  
{
  "cniVersion": "0.4.0",
  "name": "podman",
  "plugins": [
    {
      "type": "bridge",
      "bridge": "cni-podman0",
      "isGateway": true,
      "ipMasq": true,
      "hairpinMode": true,
      "ipam": {
        "type": "host-local",
        "routes": [{ "dst": "0.0.0.0/0" }],
        "ranges": [
          [
            {
              "subnet": "10.88.0.0/16",
              "gateway": "10.88.0.1"
            }
          ]
        ]
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      }
    },
    {
      "type": "firewall"
    },
    {
      "type": "tuning"
    }
  ]
}

Also found this file, which seems to be created at reboot time(?):

$ cat $XDG_CONFIG_HOME/toolbox/podman-system-migrate 
4.5.1
$ ls -la $XDG_CONFIG_HOME/toolbox/podman-system-migrate
-rw-r--r--. 1 rugk rugk 6 12. Jun 22:20 /var/home/rugk/.config/toolbox/podman-system-migrate
@rugk rugk added the 1. Bug Something isn't working label Jun 12, 2023
@rugk
Author

rugk commented Jun 12, 2023

Asked ChatGPT and got some debugging tips (that it may be about cgroups v2):

$ cat /sys/fs/cgroup/cgroup.controllers     
cpuset cpu io memory hugetlb pids rdma misc
$ ulimit -n
1024
$ mount | grep cgroup
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,seclabel,nsdelegate,memory_recursiveprot)

Still does not help… 🤔

@rugk
Author

rugk commented Jun 12, 2023

Also have this here:

$ cat $XDG_CONFIG_HOME/containers/storage.conf 
[storage]
driver = "overlay"
[storage.options]
mount_program = "/usr/bin/fuse-overlayfs"

@rugk
Author

rugk commented Jun 12, 2023

$ toolbox enter -v --log-level debug
DEBU Running as real user ID 1000                 
DEBU Resolved absolute path to the executable as /usr/bin/toolbox 
DEBU Running on a cgroups v2 host                 
DEBU Looking for sub-GID and sub-UID ranges for user rugk 
DEBU TOOLBOX_PATH is /usr/bin/toolbox             
DEBU Migrating to newer Podman                    
DEBU Toolbox config directory is /var/home/rugk/.config/toolbox 
DEBU Current Podman version is 4.5.1              
DEBU Creating runtime directory /run/user/1000/toolbox 
DEBU Old Podman version is 4.5.1                  
DEBU Migration not needed: Podman version 4.5.1 is unchanged 
DEBU Setting up configuration                     
DEBU Setting up configuration: file /var/home/rugk/.config/containers/toolbox.conf not found 
DEBU Resolving container and image names          
DEBU Container: ''                                
DEBU Distribution (CLI): ''                       
DEBU Image (CLI): ''                              
DEBU Release (CLI): ''                            
DEBU Resolved container and image names           
DEBU Container: 'fedora-toolbox-38'               
DEBU Image: 'fedora-toolbox:38'                   
DEBU Release: '38'                                
DEBU Resolving container and image names          
DEBU Container: ''                                
DEBU Distribution (CLI): ''                       
DEBU Image (CLI): ''                              
DEBU Release (CLI): ''                            
DEBU Resolved container and image names           
DEBU Container: 'fedora-toolbox-38'               
DEBU Image: 'fedora-toolbox:38'                   
DEBU Release: '38'                                
DEBU Checking if container fedora-toolbox-38 exists 
DEBU Inspecting mounts of container fedora-toolbox-38 
DEBU Starting container fedora-toolbox-38         
Error: failed to start container fedora-toolbox-38

I also found my old bug #500, and because I noticed that something was different there, I found #1260.

$ echo $XDG_RUNTIME_DIR
/run/user/1000

@debarshiray
Member

What was the crun version? Our CI ran successfully with crun-1.8.5-1.fc38.x86_64.

/cc @giuseppe

@mcatanzaro

I'm hitting this same problem with the exact same crun-1.8.5-1.fc38

@mcatanzaro

So I just discovered that one of my toolboxes is broken while the other is fine. I have two images:

$ toolbox list
IMAGE ID      IMAGE NAME                                    CREATED
901c633cace2  registry.fedoraproject.org/fedora-toolbox:38  3 months ago

CONTAINER ID  CONTAINER NAME     CREATED       STATUS   IMAGE NAME
a463cc1f5bfb  bst2               6 weeks ago   exited   registry.gitlab.com/freedesktop-sdk/infrastructure/freedesktop-sdk-docker-images/bst2:latest
f997e25b9494  fedora-toolbox-38  2 months ago  running  registry.fedoraproject.org/fedora-toolbox:38

The fedora-toolbox-38 is working without issue while the bst2 toolbox based on registry.gitlab.com/freedesktop-sdk/infrastructure/freedesktop-sdk-docker-images/bst2:latest is borked. However, in rugk's case it is actually the fedora-toolbox:38 image that is broken for him (but not for me). So we know this bug affects particular containers, but not particular images.

@giuseppe
Member

could you show the output of podman inspect for both containers? In particular I am interested in their ulimits: podman inspect $CTR_NAME --format "{{.HostConfig.Ulimits}}"

What are your current limits if you run ulimit -aH?

@mcatanzaro

$ podman inspect bst2 --format "{{.HostConfig.Ulimits}}"
[{RLIMIT_NOFILE 524288 524288} {RLIMIT_NPROC 256323 256323}]
$ podman inspect fedora-toolbox-38 --format "{{.HostConfig.Ulimits}}"
[]

No clue why only one container has these special limits set. I don't remember doing anything differently when I created them.

$ ulimit -aH
real-time non-blocking time  (microseconds, -R) unlimited
core file size              (blocks, -c) unlimited
data seg size               (kbytes, -d) unlimited
scheduling priority                 (-e) 0
file size                   (blocks, -f) unlimited
pending signals                     (-i) 256318
max locked memory           (kbytes, -l) 8192
max memory size             (kbytes, -m) unlimited
open files                          (-n) 524288
pipe size                (512 bytes, -p) 8
POSIX message queues         (bytes, -q) 819200
real-time priority                  (-r) 0
stack size                  (kbytes, -s) unlimited
cpu time                   (seconds, -t) unlimited
max user processes                  (-u) 256318
virtual memory              (kbytes, -v) unlimited
file locks                          (-x) unlimited
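
The two outputs above can be compared mechanically: the bst2 container stores an RLIMIT_NPROC of 256323, while the host's hard limit for max user processes is 256318, and an unprivileged process is not allowed to raise its hard limit above the current one. The following is only a sketch of that comparison. The function names stored_nproc and nproc_exceeds_host are hypothetical helpers, and the parsing assumes the "{NAME n n}" layout seen in the podman inspect output above, taking the last number as the hard limit. A stored value of "unlimited" is not handled.

```shell
#!/usr/bin/env bash
# Print the RLIMIT_NPROC value found in the text produced by:
#   podman inspect NAME --format "{{.HostConfig.Ulimits}}"
# Prints nothing when no RLIMIT_NPROC entry is present (e.g. for "[]").
# Assumption: entries look like "{RLIMIT_NPROC <n> <n>}".
stored_nproc() {
    printf '%s\n' "$1" \
        | sed -n 's/.*{RLIMIT_NPROC [0-9][0-9]* \([0-9][0-9]*\)}.*/\1/p'
}

# Exit status 0 when the stored limit is higher than the given hard limit.
#   $1: inspect output as above
#   $2: current hard limit, e.g. the output of `ulimit -Hu` in bash
nproc_exceeds_host() {
    local stored
    stored=$(stored_nproc "$1")
    [ -n "$stored" ] || return 1            # no RLIMIT_NPROC stored
    [ "$2" != "unlimited" ] || return 1     # nothing exceeds "unlimited"
    [ "$stored" -gt "$2" ]
}

# Possible usage on an affected host (requires podman):
#   ulimits=$(podman inspect bst2 --format '{{.HostConfig.Ulimits}}')
#   nproc_exceeds_host "$ulimits" "$(ulimit -Hu)" && echo "stored limit too high"
```

With the values from this comment, the check reports a problem for bst2 (256323 is greater than 256318) and none for fedora-toolbox-38, whose stored list is empty.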

@mcatanzaro

@akdev1l

akdev1l commented Jun 14, 2023

There was a podman change a while back in fedora that caused containers to have higher limits than the system allowed

it is probably the root cause for this, sorry don’t have time to link the issue

@debarshiray
Member

There was a podman change a while back in fedora that caused containers to have higher limits than the system allowed

Did you mean containers/podman#17681 ?

@RishabhSaini

I am hitting the same problem

@permezel

permezel commented Jul 5, 2023

I hit the same problem today. I had to reboot because the system was wedged. After the reboot, my Silverblue experience soured significantly. I have used it for less than one week.
All toolboxen are failing with:

crun: setrlimit `RLIMIT_NPROC`: Operation not permitted: OCI permission denied

My rlimits are:

% ulimit -aH
real-time non-blocking time  (microseconds, -R) unlimited
core file size              (blocks, -c) unlimited
data seg size               (kbytes, -d) unlimited
scheduling priority                 (-e) 0
file size                   (blocks, -f) unlimited
pending signals                     (-i) 255972
max locked memory           (kbytes, -l) 8192
max memory size             (kbytes, -m) unlimited
open files                          (-n) 524288
pipe size                (512 bytes, -p) 8
POSIX message queues         (bytes, -q) 819200
real-time priority                  (-r) 0
stack size                  (kbytes, -s) unlimited
cpu time                   (seconds, -t) unlimited
max user processes                  (-u) 255972
virtual memory              (kbytes, -v) unlimited
file locks                          (-x) unlimited
% rpm-ostree status
State: idle
Deployments:
● fedora:fedora/38/x86_64/sericea
                  Version: 38.20230629.0 (2023-06-29T00:41:03Z)
                   Commit: 1040d5b53d219f87582f2adf6258372e8897df4fa56fee80d52988e52b18e400
             GPGSignature: Valid signature by 6A51BBABBA3D5467B6171221809A8D7CEB10B464

  fedora:fedora/38/x86_64/sericea
                  Version: 38.1.6 (2023-04-13T19:03:58Z)
                   Commit: b943b26670918f09fb6ac49e5cdff5b937c47ca1810700003f2ccb7ba5b2ea4b
             GPGSignature: Valid signature by 6A51BBABBA3D5467B6171221809A8D7CEB10B464

@permezel

permezel commented Jul 5, 2023

Just did an update/reboot, hoping for a fix. nada.

% rpm-ostree status -v
State: idle
AutomaticUpdates: disabled
Deployments:
● fedora:fedora/38/x86_64/sericea (index: 0)
                  Version: 38.20230705.0 (2023-07-05T00:44:27Z)
                   Commit: fc15abf986d93720c2ba0f62f13a341c87bc097335c94750a3752fc8f05e06ca
                           ├─ repo-0 (2023-04-14T09:32:40Z)
                           ├─ repo-1 (2023-07-05T00:16:53Z)
                           └─ repo-2 (2023-07-05T00:19:08Z)
                   Staged: no
                StateRoot: fedora
             GPGSignature: 1 signature
                           Signature made Wed 05 Jul 2023 10:44:35 using RSA key ID 809A8D7CEB10B464
                           Good signature from "Fedora <[email protected]>"

  fedora:fedora/38/x86_64/sericea (index: 1)
                  Version: 38.20230629.0 (2023-06-29T00:41:03Z)
                   Commit: 1040d5b53d219f87582f2adf6258372e8897df4fa56fee80d52988e52b18e400
                           ├─ repo-0 (2023-04-14T09:32:40Z)
                           ├─ repo-1 (2023-06-29T00:16:11Z)
                           └─ repo-2 (2023-06-29T00:18:14Z)
                StateRoot: fedora
             GPGSignature: 1 signature
                           Signature made Thu 29 Jun 2023 10:41:08 using RSA key ID 809A8D7CEB10B464
                           Good signature from "Fedora <[email protected]>"
% podman start fc38
Error: unable to start container "32fa51a92b65bd70a1c60c84b8d2e27da5d4e5f5fa54f7ed2636023a041a1ba4": crun: setrlimit `RLIMIT_NPROC`: Operation not permitted: OCI permission denied

@permezel

permezel commented Jul 5, 2023

The problem is that RLIMIT_NPROC from the running system is captured in the podman config for the toolbox.
The ulimit -u varies from boot to boot. No idea why.
If you reboot and the limit is now less than what was captured, you are out of luck.

One can readily recreate this as follows:

% ulimit -u
255972
% toolbox create wtf1
Created container: wtf1
Enter with: toolbox enter wtf1
% TERM=xterm toolbox enter wtf1
⬢[dap@toolbox dap]$ exit
logout
% (ulimit -u 255971; toolbox enter wtf1)
Error: failed to invoke command /bin/bash in container wtf1
Error: failed to invoke command /bin/bash in container wtf1
% podman stop wtf1
wtf1
% (ulimit -u 255971; podman start wtf1)
Error: unable to start container "643ed835a9fa7ccbae229403acffa9167663f9c42348f96d189519d493da920f": crun: setrlimit `RLIMIT_NPROC`: Operation not permitted: OCI permission denied
% podman start wtf1
wtf1

@permezel

permezel commented Jul 6, 2023

I have "recovered" from this issue, by using this.
My current ulimit is 200000, which is less than the ~256000 I get at boot time.
I artificially reduce it to 100000 (ulimit -u 100000). I then follow the steps to recreate the containers, resulting in them capturing the 100000 setting.
When I start the recreated containers, my limit will be greater than 100000, so podman start reducing it to 100000 works.

Eventually someone will fix this and it will make it into my rpm-ostree and hopefully I will not have to save/restore/recreate the toolboxen again.
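
The recovery described above relies on lowering the limit only for the command that creates the container. A minimal sketch of that idea follows, assuming bash and assuming (as reported in this thread for Podman 4.5.1) that the limit in effect at creation time is what ends up stored in the container. The helper name with_nproc_limit is hypothetical.

```shell
#!/usr/bin/env bash
# Run a command with a lowered "max user processes" limit.
# The parentheses start a subshell, so the lowered limit applies only to
# that command and the calling shell keeps its original limit.
#   $1: limit to apply; remaining arguments: the command to run
with_nproc_limit() {
    local limit=$1
    shift
    ( ulimit -u "$limit" && "$@" )
}

# Possible usage, mirroring the steps above (requires toolbox):
#   with_nproc_limit 100000 toolbox create fedora-toolbox-38
```

Because a lowered hard limit cannot be raised again without privileges, doing this in a subshell avoids having to open a new login session afterwards.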

@rugk
Author

rugk commented Jul 23, 2023

Okay found downstream issue at Fedora Silverblue, as it seems to affect many. It is fedora-silverblue/issue-tracker#460

It says this issue has been reported upstream to podman and merged in containers/podman#18714 / containers/podman#18721, but that PR is kinda old, why does it still fail?

Ah, because of the release cycles: v4.6.0 of podman seems to fix the bug (search for ulimit), and that seems to have been released just a few days ago.

@c0n5um3r

c0n5um3r commented Aug 3, 2023

Same problem here.

My rlimits:

> ulimit -aH
Maximum size of core files created                              (kB, -c) unlimited
Maximum size of a process’s data segment                        (kB, -d) unlimited
Control of maximum nice priority                                    (-e) 0
Maximum size of files created by the shell                      (kB, -f) unlimited
Maximum number of pending signals                                   (-i) 63103
Maximum size that may be locked into memory                     (kB, -l) 8192
Maximum resident set size                                       (kB, -m) unlimited
Maximum number of open file descriptors                             (-n) 1048576
Maximum bytes in POSIX message queues                           (kB, -q) 800
Maximum realtime scheduling priority                                (-r) 0
Maximum stack size                                              (kB, -s) unlimited
Maximum amount of CPU time in seconds                      (seconds, -t) unlimited
Maximum number of processes available to current user               (-u) 63103
Maximum amount of virtual memory available to each process      (kB, -v) unlimited
Maximum contiguous realtime CPU time                                (-y) unlimited

It is actually the second time I'm having this issue.
Last time I just recreated the containers, but when it appeared again I started worrying.

@Sorunome

soru is still getting this issue, even with podman v4.6.1

@akdev1l

akdev1l commented Aug 20, 2023

based on this: containers/podman#19634 (comment)

I created a script to migrate old/broken toolboxes to new ones. This error will keep persisting on older containers regardless of having an up to date podman version because the limits were previously set in the container spec by older versions of podman. Newer versions don't have this behaviour but the only way to fix this is to recreate the container with a newer podman version, just upgrading won't solve it.

Anyway, this is a script that will export, reimport and recreate new toolboxes based on the ones you have currently on your system:

podman ps -a --format json \
    | jq -r '.[] | select(.Labels."com.github.containers.toolbox").Names[0]' \
    | xargs -tI{} \
        bash -c 'podman export {} -o {}.tar.gz && podman import {}.tar.gz {}-image && toolbox create --image localhost/{}-image {}-new'

the manual steps are as follows:

podman export -o $container_name.tar.gz $container_name
podman import $container_name.tar.gz $container_name-image
toolbox create --image $container_name-image $container_name-new

The new containers will be named with a suffix -new to avoid conflicts with older toolboxes, the script will also produce tarballs on PWD corresponding to each toolbox on the system

@rugk
Author

rugk commented Aug 21, 2023

The new containers will be named with a suffix -new to avoid conflicts with older toolboxes

A hint so people do not need to waste time on searching: You can then simply rename the container like this:

$ podman rename ${containerName}-new ${containerName}
$ podman rename fedora-toolbox-38-new fedora-toolbox-38

@rugk
Author

rugk commented Aug 21, 2023

Though it somehow does not work:

$ podman ps -a --format json \
    | jq -r '.[] | select(.Labels."com.github.containers.toolbox").Names[0]' \
    | xargs -tI{} \
        bash -c 'podman export {} -o {}.tar.gz && podman import {}.tar.gz {}-image && toolbox create --image localhost/{}-image {}-new'
bash -c 'podman export fedora-toolbox-38 -o fedora-toolbox-38.tar.gz && podman import fedora-toolbox-38.tar.gz fedora-toolbox-38-image && toolbox create --image localhost/fedora-toolbox-38-image fedora-toolbox-38-new'
Getting image source signatures
Copying blob 2b3919c89420 done  
Copying config 788d5882a5 done  
Writing manifest to image destination
sha256:788d5882a50575f328a8b85f3f92a2ad488b4ffc18949a4bdc20742ef2736bf2
Created container: fedora-toolbox-38-new
Enter with: toolbox enter fedora-toolbox-38-new
$ podman rm fedora-toolbox-38                                        
fedora-toolbox-38
$ podman rename fedora-toolbox-38-new fedora-toolbox-38 
$ toolbox enter
Error: failed to initialize container fedora-toolbox-38
$ podman --version                                     
podman version 4.6.0
$ podman inspect fedora-toolbox-38 --format "{{.HostConfig.Ulimits}}"
[{RLIMIT_NOFILE 524288 524288} {RLIMIT_NPROC 60983 60983}]

And the issue has supposedly been fixed in v4.6.0 of podman, so why is it happening?
And here comes the thing: the issue only re-occurs once you rename the container back (without the -new suffix) and actually try to run it with toolbox. Before you do that, the ulimit is apparently not set?

$ podman ps -a --format json \
    | jq -r '.[] | select(.Labels."com.github.containers.toolbox").Names[0]' \
    | xargs -tI{} \
        bash -c 'podman export {} -o {}.tar.gz && podman import {}.tar.gz {}-image && toolbox create --image localhost/{}-image {}-new'
bash -c 'podman export fedora-toolbox-38 -o fedora-toolbox-38.tar.gz && podman import fedora-toolbox-38.tar.gz fedora-toolbox-38-image && toolbox create --image localhost/fedora-toolbox-38-image fedora-toolbox-38-new'
Getting image source signatures
Copying blob 8a5a751b18cf done  
Copying config aabcd2283e done  
Writing manifest to image destination
sha256:aabcd2283e30f618f8d72d2b5363bc0d467698eebd8e3f2031452c164765ba9b
Created container: fedora-toolbox-38-new
Enter with: toolbox enter fedora-toolbox-38-new
$ podman inspect fedora-toolbox-38-new --format "{{.HostConfig.Ulimits}}"
[]
$ toolbox enter fedora-toolbox-38-new 
Error: invalid entry point PID of container fedora-toolbox-38-new
$ podman inspect fedora-toolbox-38-new --format "{{.HostConfig.Ulimits}}"
[{RLIMIT_NOFILE 524288 524288} {RLIMIT_NPROC 60983 60983}]

Note, though, that the error toolbox enter fedora-toolbox-38-new shows here differs from the one shown above for the renamed container ("failed to initialize container"). Maybe there is another error related to renaming the container, so you should really rename it back?
The original issue, however, is still there?

@akdev1l

akdev1l commented Aug 21, 2023

The error relates to the entrypoint of the resulting image not being correct (mm toolbox is supposed to set itself as entrypoint, it is possibly expecting entrypoint as /bin/bash but it is toolbox as this image was already prepped by toolbox)

quite unfortunate - this worked in my limited testing - you could confirm my theory by inspecting the entrypoint of the image

@rugk
Author

rugk commented Aug 21, 2023

Yeah after cloning and running podman inspect the entrypoint is "Entrypoint": "", so well…

@debarshiray
Member

Duplicate of containers/podman#19634

@debarshiray debarshiray marked this as a duplicate of containers/podman#19634 Sep 28, 2023