
'podman start <toolbx-container>' fails with 'setrlimit RLIMIT_NPROC: Operation not permitted: OCI permission denied' #19634

Closed
debarshiray opened this issue Aug 15, 2023 · 35 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@debarshiray
Member

Issue Description

Preexisting Toolbx containers can no longer be started after a dnf update on Fedora 38 Workstation.

Highlights from the dnf update:

  • crun-1.8.5-1.fc38.x86_64 to crun-1.8.6-1.fc38.x86_64
  • podman-5:4.5.1-1.fc38.x86_64 to podman-5:4.6.0-1.fc38.x86_64

Toolbx containers are interactive command-line environments that are meant to be long-lasting pet containers. Therefore, it's important that containers created by older versions of the tools can be used with newer versions.

If necessary, I am happy to change the configuration with which new Toolbx containers are created, but we would need a sufficient migration window for users with pre-existing older containers.

Here's an attempt to podman start a container created with toolbox create and the older version of the Podman stack:

$ podman --log-level debug start --attach fedora-toolbox-38
INFO[0000] podman filtering at log level debug          
DEBU[0000] Called start.PersistentPreRunE(podman --log-level debug start --attach fedora-toolbox-38) 
DEBU[0000] Using conmon: "/usr/bin/conmon"              
DEBU[0000] Initializing boltdb state at /home/rishi/.local/share/containers/storage/libpod/bolt_state.db 
DEBU[0000] Using graph driver overlay                   
DEBU[0000] Using graph root /home/rishi/.local/share/containers/storage 
DEBU[0000] Using run root /run/user/1000/containers     
DEBU[0000] Using static dir /home/rishi/.local/share/containers/storage/libpod 
DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp      
DEBU[0000] Using volume path /home/rishi/.local/share/containers/storage/volumes 
DEBU[0000] Using transient store: false                 
DEBU[0000] [graphdriver] trying provided driver "overlay" 
DEBU[0000] Cached value indicated that overlay is supported 
DEBU[0000] Cached value indicated that overlay is supported 
DEBU[0000] Cached value indicated that metacopy is not being used 
DEBU[0000] Cached value indicated that native-diff is usable 
DEBU[0000] backingFs=extfs, projectQuotaSupported=false, useNativeDiff=true, usingMetacopy=false 
DEBU[0000] Initializing event backend journald          
DEBU[0000] Configured OCI runtime runj initialization failed: no valid executable found for OCI runtime runj: invalid argument 
DEBU[0000] Configured OCI runtime krun initialization failed: no valid executable found for OCI runtime krun: invalid argument 
DEBU[0000] Configured OCI runtime ocijail initialization failed: no valid executable found for OCI runtime ocijail: invalid argument 
DEBU[0000] Configured OCI runtime runc initialization failed: no valid executable found for OCI runtime runc: invalid argument 
DEBU[0000] Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument 
DEBU[0000] Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument 
DEBU[0000] Configured OCI runtime youki initialization failed: no valid executable found for OCI runtime youki: invalid argument 
DEBU[0000] Configured OCI runtime crun-wasm initialization failed: no valid executable found for OCI runtime crun-wasm: invalid argument 
DEBU[0000] Using OCI runtime "/usr/bin/crun"            
INFO[0000] Setting parallel job count to 49             
INFO[0000] Received shutdown.Stop(), terminating!        PID=69527
DEBU[0000] Enabling signal proxying                     
DEBU[0000] Cached value indicated that idmapped mounts for overlay are not supported 
DEBU[0000] Check for idmapped mounts support            
DEBU[0000] overlay: mount_data=lowerdir=/home/rishi/.local/share/containers/storage/overlay/l/B6GL45D333VZL42EE6M67UZ4I4:/home/rishi/.local/share/containers/storage/overlay/l/B6GL45D333VZL42EE6M67UZ4I4/../diff1:/home/rishi/.local/share/containers/storage/overlay/l/I3MT3GFV2QVA2ZJ2PCN7UZRZTZ:/home/rishi/.local/share/containers/storage/overlay/l/2KCBRARJTGBTMSG6OQ6A6YW643,upperdir=/home/rishi/.local/share/containers/storage/overlay/9e84edd327b8c418cf3ef92f62edcaa54df1499d3529eb323fb75151a2590ca9/diff,workdir=/home/rishi/.local/share/containers/storage/overlay/9e84edd327b8c418cf3ef92f62edcaa54df1499d3529eb323fb75151a2590ca9/work,,userxattr,context="system_u:object_r:container_file_t:s0:c1022,c1023" 
DEBU[0000] Mounted container "ef7f514e176ea59f18603055c5cbd776be2a8fdb2dcd1fceb481da5c4bf51b99" at "/home/rishi/.local/share/containers/storage/overlay/9e84edd327b8c418cf3ef92f62edcaa54df1499d3529eb323fb75151a2590ca9/merged" 
DEBU[0000] Created root filesystem for container ef7f514e176ea59f18603055c5cbd776be2a8fdb2dcd1fceb481da5c4bf51b99 at /home/rishi/.local/share/containers/storage/overlay/9e84edd327b8c418cf3ef92f62edcaa54df1499d3529eb323fb75151a2590ca9/merged 
DEBU[0000] Not modifying container ef7f514e176ea59f18603055c5cbd776be2a8fdb2dcd1fceb481da5c4bf51b99 /etc/passwd 
DEBU[0000] Not modifying container ef7f514e176ea59f18603055c5cbd776be2a8fdb2dcd1fceb481da5c4bf51b99 /etc/group 
DEBU[0000] /etc/system-fips does not exist on host, not mounting FIPS mode subscription 
DEBU[0000] Setting Cgroups for container ef7f514e176ea59f18603055c5cbd776be2a8fdb2dcd1fceb481da5c4bf51b99 to user.slice:libpod:ef7f514e176ea59f18603055c5cbd776be2a8fdb2dcd1fceb481da5c4bf51b99 
DEBU[0000] Set root propagation to "rslave"             
DEBU[0000] reading hooks from /usr/share/containers/oci/hooks.d 
DEBU[0000] Workdir "/" resolved to host path "/home/rishi/.local/share/containers/storage/overlay/9e84edd327b8c418cf3ef92f62edcaa54df1499d3529eb323fb75151a2590ca9/merged" 
DEBU[0000] Created OCI spec for container ef7f514e176ea59f18603055c5cbd776be2a8fdb2dcd1fceb481da5c4bf51b99 at /home/rishi/.local/share/containers/storage/overlay-containers/ef7f514e176ea59f18603055c5cbd776be2a8fdb2dcd1fceb481da5c4bf51b99/userdata/config.json 
DEBU[0000] /usr/bin/conmon messages will be logged to syslog 
DEBU[0000] running conmon: /usr/bin/conmon               args="[--api-version 1 -c ef7f514e176ea59f18603055c5cbd776be2a8fdb2dcd1fceb481da5c4bf51b99 -u ef7f514e176ea59f18603055c5cbd776be2a8fdb2dcd1fceb481da5c4bf51b99 -r /usr/bin/crun -b /home/rishi/.local/share/containers/storage/overlay-containers/ef7f514e176ea59f18603055c5cbd776be2a8fdb2dcd1fceb481da5c4bf51b99/userdata -p /run/user/1000/containers/overlay-containers/ef7f514e176ea59f18603055c5cbd776be2a8fdb2dcd1fceb481da5c4bf51b99/userdata/pidfile -n fedora-toolbox-38 --exit-dir /run/user/1000/libpod/tmp/exits --full-attach -s -l journald --log-level debug --syslog --conmon-pidfile /run/user/1000/containers/overlay-containers/ef7f514e176ea59f18603055c5cbd776be2a8fdb2dcd1fceb481da5c4bf51b99/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /home/rishi/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/1000/containers --exit-command-arg --log-level --exit-command-arg debug --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /run/user/1000/libpod/tmp --exit-command-arg --network-config-dir --exit-command-arg  --exit-command-arg --network-backend --exit-command-arg netavark --exit-command-arg --volumepath --exit-command-arg /home/rishi/.local/share/containers/storage/volumes --exit-command-arg --db-backend --exit-command-arg boltdb --exit-command-arg --transient-store=false --exit-command-arg --runtime --exit-command-arg crun --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg --syslog --exit-command-arg container --exit-command-arg cleanup --exit-command-arg ef7f514e176ea59f18603055c5cbd776be2a8fdb2dcd1fceb481da5c4bf51b99]"
INFO[0000] Running conmon under slice user.slice and unitName libpod-conmon-ef7f514e176ea59f18603055c5cbd776be2a8fdb2dcd1fceb481da5c4bf51b99.scope 
[conmon:d]: failed to write to /proc/self/oom_score_adj: Permission denied

DEBU[0000] Received: -1                                 
DEBU[0000] Cleaning up container ef7f514e176ea59f18603055c5cbd776be2a8fdb2dcd1fceb481da5c4bf51b99 
DEBU[0000] Network is already cleaned up, skipping...   
DEBU[0000] Unmounted container "ef7f514e176ea59f18603055c5cbd776be2a8fdb2dcd1fceb481da5c4bf51b99" 
Error: unable to start container ef7f514e176ea59f18603055c5cbd776be2a8fdb2dcd1fceb481da5c4bf51b99: crun: [conmon:d]: failed to write to /proc/self/oom_score_adj: Permission denied

setrlimit `RLIMIT_NPROC`: Operation not permitted: OCI permission denied
DEBU[0000] Shutting down engines                        

As far as I can make out, Toolbx containers created with the new version of the Podman stack can be started with it.

Steps to reproduce the issue

  1. Create a Toolbx container with toolbox create using crun-1.8.5-1.fc38.x86_64, podman-5:4.5.1-1.fc38.x86_64, etc. on Fedora 38 Workstation

  2. dnf update to crun-1.8.6-1.fc38.x86_64 and podman-5:4.6.0-1.fc38.x86_64

  3. Reboot

  4. Try podman start ...

Describe the results you received

podman start ... fails with:

...
setrlimit `RLIMIT_NPROC`: Operation not permitted: OCI permission denied

Describe the results you expected

podman start should succeed.

podman info output

host:
  arch: amd64
  buildahVersion: 1.31.0
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.7-2.fc38.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.7, commit: '
  cpuUtilization:
    idlePercent: 99.17
    systemPercent: 0.26
    userPercent: 0.56
  cpus: 16
  databaseBackend: boltdb
  distribution:
    distribution: fedora
    variant: workstation
    version: "38"
  eventLogger: journald
  freeLocks: 2041
  hostname: topinka
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
  kernel: 6.4.10-200.fc38.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 20824010752
  memTotal: 33536196608
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.7.0-1.fc38.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.7.0
    package: netavark-1.7.0-1.fc38.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.7.0
  ociRuntime:
    name: crun
    package: crun-1.8.6-1.fc38.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.8.6
      commit: 73f759f4a39769f60990e7d225f561b4f4f06bcf
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20230625.g32660ce-1.fc38.x86_64
    version: |
      pasta 0^20230625.g32660ce-1.fc38.x86_64
      Copyright Red Hat
      GNU Affero GPL version 3 or later <https://www.gnu.org/licenses/agpl-3.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-12.fc38.x86_64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 8589930496
  swapTotal: 8589930496
  uptime: 6h 45m 6.00s (Approximately 0.25 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /home/rishi/.config/containers/storage.conf
  containerStore:
    number: 7
    paused: 0
    running: 0
    stopped: 7
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/rishi/.local/share/containers/storage
  graphRootAllocated: 1695606808576
  graphRootUsed: 285421572096
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 10
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/rishi/.local/share/containers/storage/volumes
version:
  APIVersion: 4.6.0
  Built: 1689942206
  BuiltTime: Fri Jul 21 14:23:26 2023
  GitCommit: ""
  GoVersion: go1.20.6
  Os: linux
  OsArch: linux/amd64
  Version: 4.6.0


### Podman in a container

No

### Privileged Or Rootless

Rootless

### Upstream Latest Release

No

### Additional environment details

Fedora 38 Workstation

### Additional information

Toolbx containers are interactive command-line environments that are meant to be long-lasting pet containers.  Therefore, it's important that containers created by older versions of the tools can be used with newer versions.
@debarshiray added the kind/bug label Aug 15, 2023
@acheong08

Also failed to clone:

acheong@insignificantv5 ~ [125]> podman --log-level debug container clone dev dev1
INFO[0000] podman filtering at log level debug          
DEBU[0000] Called clone.PersistentPreRunE(podman --log-level debug container clone dev dev1) 
DEBU[0000] Using conmon: "/usr/bin/conmon"              
DEBU[0000] Initializing boltdb state at /var/home/acheong/.local/share/containers/storage/libpod/bolt_state.db 
DEBU[0000] Using graph driver overlay                   
DEBU[0000] Using graph root /var/home/acheong/.local/share/containers/storage 
DEBU[0000] Using run root /run/user/1000/containers     
DEBU[0000] Using static dir /var/home/acheong/.local/share/containers/storage/libpod 
DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp      
DEBU[0000] Using volume path /var/home/acheong/.local/share/containers/storage/volumes 
DEBU[0000] Using transient store: false                 
DEBU[0000] [graphdriver] trying provided driver "overlay" 
DEBU[0000] Cached value indicated that overlay is supported 
DEBU[0000] Cached value indicated that overlay is supported 
DEBU[0000] Cached value indicated that metacopy is not being used 
DEBU[0000] Cached value indicated that native-diff is usable 
DEBU[0000] backingFs=btrfs, projectQuotaSupported=false, useNativeDiff=true, usingMetacopy=false 
DEBU[0000] Initializing event backend journald          
DEBU[0000] Configured OCI runtime runj initialization failed: no valid executable found for OCI runtime runj: invalid argument 
DEBU[0000] Configured OCI runtime youki initialization failed: no valid executable found for OCI runtime youki: invalid argument 
DEBU[0000] Configured OCI runtime krun initialization failed: no valid executable found for OCI runtime krun: invalid argument 
DEBU[0000] Configured OCI runtime crun-wasm initialization failed: no valid executable found for OCI runtime crun-wasm: invalid argument 
DEBU[0000] Configured OCI runtime runc initialization failed: no valid executable found for OCI runtime runc: invalid argument 
DEBU[0000] Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument 
DEBU[0000] Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument 
DEBU[0000] Configured OCI runtime ocijail initialization failed: no valid executable found for OCI runtime ocijail: invalid argument 
DEBU[0000] Using OCI runtime "/usr/bin/crun"            
INFO[0000] Setting parallel job count to 25             
DEBU[0000] Looking up image "997b52ccbf8544c42851a181e80bcd0f081eff8a879256b67d273a7e07f31f6f" in local containers storage 
DEBU[0000] Trying "997b52ccbf8544c42851a181e80bcd0f081eff8a879256b67d273a7e07f31f6f" ... 
DEBU[0000] parsed reference into "[overlay@/var/home/acheong/.local/share/containers/storage+/run/user/1000/containers]@997b52ccbf8544c42851a181e80bcd0f081eff8a879256b67d273a7e07f31f6f" 
DEBU[0000] Found image "997b52ccbf8544c42851a181e80bcd0f081eff8a879256b67d273a7e07f31f6f" as "997b52ccbf8544c42851a181e80bcd0f081eff8a879256b67d273a7e07f31f6f" in local containers storage 
DEBU[0000] Found image "997b52ccbf8544c42851a181e80bcd0f081eff8a879256b67d273a7e07f31f6f" as "997b52ccbf8544c42851a181e80bcd0f081eff8a879256b67d273a7e07f31f6f" in local containers storage ([overlay@/var/home/acheong/.local/share/containers/storage+/run/user/1000/containers]@997b52ccbf8544c42851a181e80bcd0f081eff8a879256b67d273a7e07f31f6f) 
DEBU[0000] Inspecting image 997b52ccbf8544c42851a181e80bcd0f081eff8a879256b67d273a7e07f31f6f 
DEBU[0000] exporting opaque data as blob "sha256:997b52ccbf8544c42851a181e80bcd0f081eff8a879256b67d273a7e07f31f6f" 
DEBU[0000] exporting opaque data as blob "sha256:997b52ccbf8544c42851a181e80bcd0f081eff8a879256b67d273a7e07f31f6f" 
DEBU[0000] exporting opaque data as blob "sha256:997b52ccbf8544c42851a181e80bcd0f081eff8a879256b67d273a7e07f31f6f" 
DEBU[0000] Inspecting image 997b52ccbf8544c42851a181e80bcd0f081eff8a879256b67d273a7e07f31f6f 
DEBU[0000] Inspecting image 997b52ccbf8544c42851a181e80bcd0f081eff8a879256b67d273a7e07f31f6f 
Error: invalid config provided: cannot set shmsize when running in the {host } IPC Namespace
DEBU[0000] Shutting down engines   

@acheong08

This comment was marked as off-topic.

@debarshiray
Member Author

level=debug msg="Running as real user ID 0"
level=debug msg="Resolved absolute path to the executable as /usr/bin/toolbox"
level=debug msg="TOOLBOX_PATH is /usr/bin/toolbox"
level=debug msg="Migrating to newer Podman"

[...]

level=debug msg="Removing password for user acheong"
level=debug msg="Removing password for user root"
level=debug msg="Configuring RPM to ignore bind mounts"
level=debug msg="Setting up daily ticker"
level=debug msg="Setting up watches for file system events"
level=debug msg="Finished initializing container"
level=debug msg="Creating runtime directory /run/user/1000/toolbox"
level=debug msg="Creating initialization stamp /run/user/1000/toolbox/container-initialized-3448"
level=debug msg="Listening to file system and ticker events"
level=warning msg="Failed to run updatedb(8): updatedb(1) not found"

This seems to be the logs from a successful podman start --attach ... of a Toolbx container. Did you post it by mistake? :)

@edsantiago
Member

Looks like #18696, and from what I can tell it looks like the only solution is to recreate the container.

@Luap99
Member

Luap99 commented Aug 16, 2023

Yes, you need to check ulimit -u before and after the reboot. Podman 4.5 and earlier hard-code the ulimit at create time, so if it changes to a lower value then it will not work and subsequent starts will fail.

The fix #18721 makes it so that we now apply the ulimit at start time, so it should work all the time, but that requires the container to be created with 4.6 or newer. Going forward you should not see this issue again.
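
As a rough check (the container name fedora-toolbox-38 is only an example), you can compare the host's current hard limit with the limit that was recorded in the container's configuration:

$ ulimit -H -u
$ podman inspect --type container --format '{{ .HostConfig.Ulimits }}' fedora-toolbox-38

If the stored RLIMIT_NPROC value is higher than the current hard limit, starting the container is likely to fail as described above.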

@Luap99 closed this as completed Aug 16, 2023
@debarshiray
Member Author

it looks like the only solution is to recreate the container.

Sadly, as I mentioned in the report, that's a deal breaker for Toolbx containers. :/

There's a Fedora 39 Change to treat the Toolbx stack as a release blocker. I think one of the test criteria is about preexisting containers continuing to work.

@debarshiray
Member Author

Yes, you need to check ulimit -u before and after the reboot. Podman 4.5 and earlier hard-code the ulimit at create time, so if it changes to a lower value then it will not work and subsequent starts will fail.

The fix #18721 makes it so that we now apply the ulimit at start time, so it should work all the time, but that requires the container to be created with 4.6 or newer. Going forward you should not see this issue again.

That's not going to work for Toolbx. Can you please re-open?

@abbra

abbra commented Aug 16, 2023

Backward compatibility with existing data (container instances are data) is important to preserve during upgrade. Podman is effectively asking customers to kill the existing data in order to upgrade. This is pretty bad and should be avoided if possible.

@Luap99
Member

Luap99 commented Aug 16, 2023

There is nothing we can do here realistically: the ulimits were added to the container spec at create time with 4.5 and earlier, so it is now impossible to tell how they were set (by a user or just as the default). So once we start the container, the runtime will try to apply the configured ulimit, and if your limits were higher at create time, this will fail.

So you either recreate the container or make sure the ulimit does not change (i.e. in /etc/security/limits.conf).

The fact that it was implemented like that is unfortunate, but it worked that way for years without issue. Just recently, it seems the nproc limit was lowered, resulting in this bug. As said, going forward this is fixed: containers created with 4.6 should not run into it.
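
For reference, pinning the limit would mean something along these lines in /etc/security/limits.conf or a drop-in under /etc/security/limits.d/ (the value is only a placeholder and would need to match the nproc limit the host had when the container was created):

*          hard    nproc           <previous-limit>
*          soft    nproc           <previous-limit>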

@debarshiray
Member Author

There is nothing we can do here realistically: the ulimits were added to the container spec at create time with 4.5 and earlier, so it is now impossible to tell how they were set (by a user or just as the default).

Toolbx containers always have the com.github.containers.toolbox label, and Toolbx doesn't offer a way to set the ulimits through its configuration file or command-line. Can't that be used for detection?
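
For example, something like this (a sketch; the container name is hypothetical) would be enough to tell a Toolbx container apart:

$ podman inspect --type container --format '{{ index .Config.Labels "com.github.containers.toolbox" }}' fedora-toolbox-38

A non-empty result would indicate a Toolbx container.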

So once we start the container, the runtime will try to apply the configured ulimit, and if your limits were higher at create time, this will fail.

So you either recreate the container or make sure the ulimit does not change (i.e. in /etc/security/limits.conf).

Asking people to recreate a Toolbx container is a deal breaker. Maybe the podman RPM can have the necessary values in /etc/security/limits.conf to work around this for a couple of years until we can assume that very few people have containers created with older Podman versions?

@debarshiray
Member Author

I see that there's a maze of issues and pull requests related to this problem, but I am unable to figure out the root cause for this Podman change. Was there a pressing issue that couldn't be fixed in any other way?

I am asking, because one way to address this can be to (temporarily?) revert the Podman change and offer a way to create new containers that don't have the ulimits-in-the-container-spec problem. Then, after sufficient time has passed and we can assume that most pet containers out there are new enough to not have this problem, we can restore the Podman change.

@edsantiago
Member

I'm by no means an expert here, and trust that someone will correct me if what I say below is wrong.

revert the Podman change

This is not a Podman change. Clarification: the issue you are seeing has nothing to do with any podman changes. It has to do with the fact that your system rebooted with RLIMIT_NPROC != what it was before the reboot. Prior to 4.6, podman stored ulimits as part of container metadata. After your reboot, it doesn't matter whether you run podman 4.6 or 4.5: any attempt to restart the container is going to fail because podman can't set that ulimit.

The 4.6 podman change, IIUC, is that on container creation podman will no longer store those limits unless explicitly requested. So this problem shouldn't recur on future reboots. Obviously that does not time-travel back and help already-created containers.

Perhaps there's some sort of sysctl you can use to restore NPROC back to the prior value?

@acheong08

I tried a few things that didn't work:

$ cat /etc/security/limits.conf

*          hard    nproc           62703
*          soft    nproc           62703

The same thing is in /etc/security/limits.d/20-nproc.conf and /etc/security/limits.d/90-nproc.conf

@acheong08

acheong08 commented Aug 16, 2023

This worked:

$ podman export $CONTAINER_NAME -o output.tar
$ podman import output.tar $NEW_IMAGE_NAME
$ podman container rm $CONTAINER_NAME
$ toolbox create --image localhost/$NEW_IMAGE_NAME -i $CONTAINER_NAME

@acheong08

acheong08 commented Aug 16, 2023

Podman is effectively asking customers to kill the existing data in order to upgrade. This is pretty bad and should be avoided if possible.

No data loss

@mjg

mjg commented Aug 16, 2023

it looks like the only solution is to recreate the container.

Sadly, as I mentioned in the report, that's a deal breaker for Toolbx containers. :/

There's a Fedora 39 Change to treat the Toolbx stack as a release blocker. I think one of the test criteria is about preexisting containers continuing to work.

Well, who brokered the deal with whom? Podman containers are certainly not long-term data storage. Toolbx uses (and advertises) them for a specific purpose, and changes quite a few of the standard podman options when it creates containers. Is there no way Toolbx could reset the ulimit for an existing container? After all, it would be "fair" to ask users to operate on Toolbx containers using Toolbx (rather than podman, if it fails), and Toolbx is able to recognize Toolbx containers as such (i.e. distinguish them from non-Toolbx containers).

Maybe, as a middle ground, podman could offer an option to migrate old containers (by resetting the ulimit setting or clearing it), and leave it for Toolbx (or the user) to decide when and if they use it?

@rhatdan
Member

rhatdan commented Aug 16, 2023

Do we know what value changed? Can it be changed back in limits.conf?

@ra0e

ra0e commented Aug 16, 2023

Backward compatibility with existing data (container instances are data) is important to preserve during upgrade. Podman is effectively asking customers to kill the existing data in order to upgrade. This is pretty bad and should be avoided if possible.

At least a warning and prompt should be shown. However, the normal use case should not be to have long-lived persistent data inside a podman container.

@acheong08

acheong08 commented Aug 17, 2023

Do we know what value changed?

ulimit -u (nproc)

Can it be changed back in limits.conf?

#19634 (comment)

Maybe, as a middle ground, podman could offer an option to migrate old containers

#19634 (comment)

Since no data is lost, I suppose it counts as migration?

@rhatdan
Member

rhatdan commented Aug 17, 2023

After you made those changes, did you fully log out and log back in to make sure your login process had those settings?

$ ulimit -a
real-time non-blocking time (microseconds, -R) unlimited
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 256005
max locked memory (kbytes, -l) 8192
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 256005
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

Podman is not doing anything special other than attempting to set the ulimits for the container.

@acheong08

After you made those changes, did you fully log out and log back in to make sure your login process had those settings?

Yup

It's not getting applied for some reason but that has nothing to do with podman.

@rhatdan
Member

rhatdan commented Aug 17, 2023

If the ulimit -a call is not showing the change, then there is nothing podman can fix.

@ikke-t

ikke-t commented Aug 19, 2023

Hmmm, I was hit with this too today, and tried @acheong08's trick above to export the image and create a new one. I believe the -i parameter is extra there. But anyhow, the "rescued" one doesn't start:

export CONTAINER_NAME=fedora-toolbox-38
export NEW_IMAGE_NAME=temp-38
podman export $CONTAINER_NAME -o output.tar
podman import output.tar $NEW_IMAGE_NAME
podman container rm $CONTAINER_NAME
echo toolbox create --image $NEW_IMAGE_NAME  $CONTAINER_NAME
$ toolbox list
IMAGE ID      IMAGE NAME                                    CREATED
e7bc47ab5f1d  registry.access.redhat.com/ubi9/toolbox:9.2   2 months ago
997b52ccbf85  registry.fedoraproject.org/fedora-toolbox:38  4 months ago

CONTAINER ID  CONTAINER NAME     CREATED        STATUS  IMAGE NAME
295541a3d3ef  fedora-toolbox-38  6 minutes ago  exited  localhost/temp-38:latest
376457f1c700  rhel9              2 months ago   exited  registry.access.redhat.com/ubi9/toolbox:9.2

$ toolbox enter
Error: invalid entry point PID of container fedora-toolbox-38
$ podman inspect fedora-toolbox-38|grep -i pid
               "Pid": 0,
          "ConmonPidFile": "/run/user/1017/overlay-containers/295541a3d3ef06c9ae0a1f9cbf270ff62d54ffc53ecf486dd029456886072f85/userdata/conmon.pid",
          "PidFile": "/run/user/1017/overlay-containers/295541a3d3ef06c9ae0a1f9cbf270ff62d54ffc53ecf486dd029456886072f85/userdata/pidfile",
                    "--pid",
               "PidMode": "host",
               "PidsLimit": 2048,

@gptlang

gptlang commented Aug 20, 2023

echo toolbox create --image $NEW_IMAGE_NAME $CONTAINER_NAME

Is the echo intentional?

toolbox enter

Try podman start $CONTAINER_NAME -ai first and see if that works

@ikke-t

ikke-t commented Aug 21, 2023

@gptlang no, the echo was just for me to see that the parameters are right. I copied the wrong line here. The command was run without it.

In the meanwhile, I got a tip to specify the ulimit in /etc/security/limits.d/50-podman-ulimits.conf. That didn't help either. Now it says:

$ toolbox create --image localhost/temp-38 fedora-toolbox-38     
Created container: fedora-toolbox-38
Enter with: toolbox enter
$ toolbox enter                                             
Error: invalid entry point PID of container fedora-toolbox-38
$ cat /etc/security/limits.d/50-podman-ulimits.conf                                                                                   1 ↵
ikke hard nproc 63075
$ ulimit -u                                                                                                                         127 ↵
63075
$ podman ps -a 
CONTAINER ID  IMAGE                                            COMMAND               CREATED             STATUS                         PORTS                   NAMES
cb1bf187344b  localhost/temp-38:latest                         toolbox --log-lev...  About a minute ago  Exited (1) About a minute ago                          fedora-toolbox-38
$ podman start -ai fedora-toolbox-38 
level=debug msg="Running as real user ID 0"
level=debug msg="Resolved absolute path to the executable as /usr/bin/toolbox"
level=debug msg="TOOLBOX_PATH is /usr/bin/toolbox"
level=debug msg="Migrating to newer Podman"
level=debug msg="Setting up configuration"
level=debug msg="Setting up configuration: file /etc/containers/toolbox.conf not found"
level=debug msg="Setting up configuration: file /root/.config/containers/toolbox.conf not found"
level=debug msg="Resolving container and image names"
level=debug msg="Container: ''"
level=debug msg="Distribution (CLI): ''"
level=debug msg="Image (CLI): ''"
level=debug msg="Release (CLI): ''"
level=debug msg="Resolved container and image names"
level=debug msg="Container: 'fedora-toolbox-38'"
level=debug msg="Image: 'fedora-toolbox:38'"
level=debug msg="Release: '38'"
level=debug msg="Creating /run/.toolboxenv"
level=debug msg="Monitoring host"
level=debug msg="Path /run/host/etc exists"
level=debug msg="Resolved /etc/localtime to /run/host/usr/share/zoneinfo/Europe/Helsinki"
level=debug msg="Creating regular file /etc/machine-id"
level=debug msg="Binding /etc/machine-id to /run/host/etc/machine-id"
mount: /etc/machine-id: must be superuser to use mount.
       dmesg(1) may have more information after failed mount system call.
Error: failed to bind /etc/machine-id to /run/host/etc/machine-id

@debarshiray
Member Author

debarshiray commented Aug 23, 2023

I'm by no means an expert here, and trust that someone will correct me if what I say below is wrong.

revert the Podman change

This is not a Podman change. Clarification: the issue you are seeing has nothing to do with any podman changes. It has to do with the fact that your system rebooted with RLIMIT_NPROC != what it was before the reboot. Prior to 4.6, podman stored ulimits as part of container metadata. After your reboot, it doesn't matter whether you run podman 4.6 or 4.5: any attempt to restart the container is going to fail because podman can't set that ulimit.

Okay! Thanks for summarizing that so clearly.

To verify, I got rid of gvisor-tap-vsock, and downgraded podman and crun to their previous versions on my Fedora 38 Workstation, and after a reboot my older containers still don't run. That seems to confirm what you said above.

I have no trivial way to find out what the ulimits used to be on a traditional package-based Linux distribution. I suppose I could use my Fedora Silverblue machine to figure that out. For what it's worth, the current values are:

$ ulimit -H -u
127703
$ ulimit -S -u
127703

... and:

$ podman inspect --type container --format ' {{ .HostConfig.Ulimits }}' fedora-toolbox-38
 [{RLIMIT_NOFILE 524288 524288} {RLIMIT_NPROC 127704 127704}]

The 4.6 podman change, IIUC, is that on container creation podman will no longer store those limits unless explicitly requested. So this problem shouldn't recur on future reboots. Obviously that does not time-travel back and help already-created containers.

I do think there's something Podman can do to avoid the problem for Toolbx containers. See my comment above.

It's easy to identify a Toolbx container, and for those Podman could handle the failure to set the ulimits more softly.

Perhaps there's some sort of sysctl you can use to restore NPROC back to the prior value?

Sadly, that won't help the (surprisingly large number of) Toolbx users out there.

For a lot of people the Toolbx environment is their primary interactive command-line shell. It's unsettling when that stops working suddenly.

On top of that, if it's not trivially obvious to people like us, who are paid to work on this full-time, how to restore the old ulimit values or what caused them to change in the first place, then imagine how hard it would be for a random user out there to find the workaround.

@debarshiray
Member Author

I tried to explain this before. However, since folks are trying to, somewhat emphatically, state the opposite, I will risk repeating myself by responding to:

Podman is effectively asking customers to kill the existing data in order to upgrade. This is pretty bad and should be avoided if possible.

No data loss

... and this:

Backward compatibility with existing data (container instances are data) is important to preserve during upgrade. Podman is effectively asking customers to kill the existing data in order to upgrade. This is pretty bad and should be avoided if possible.

At least a warning and prompt should be shown. However, the normal use case should not be to have long-lived persistent data inside a podman container.

Arguing over whether there's data loss or not gets close to playing with semantics. Of course, there's nothing catastrophic like rm -rf $HOME going on here, but that's not a requirement for a problem to be serious.

It's also not about pointing fingers at anyone. We need to find a way forward to recover from this problem. Maybe it requires reverting whatever changed the ulimits? Maybe it requires something else in the stack to handle those changes more gracefully?

Toolbx containers are by definition long-lived pet containers for continued interactive use, not short-lived service containers. Many people use them as their development environment, and some even as their primary interactive command-line shell. It's not fun if, out of the blue, the CLI shell of your choice (e.g., Bash or Z shell) refuses to offer a prompt or your chosen editor (e.g., Emacs or Vim) refuses to start. Note that this problem doesn't just affect unstable development distributions like Fedora Rawhide, but also stable ones where such things are not expected to happen.

Sometimes, the loss of a development environment can be a big loss, even if it can be salvaged, because time is a factor. We shouldn't be designing operating systems where users need to factor in the possibility that their CLI shell may suddenly refuse to work.

At a time when different groups of people are trying to ship OSTree based OSes, from Endless OS to the different Fedora variants to GNOME OS, the stability of Toolbx environments is crucial.

@debarshiray
Member Author

After you made those changes, did you fully log out and log back in to make sure your login process had those settings?

Yes, I rebooted after my sudo dnf update.

$ ulimit -a
[...]
max user processes (-u) 256005
[...]

I get:

$ ulimit -H -u
127703
$ ulimit -S -u
127703

... and:

$ podman inspect --type container --format ' {{ .HostConfig.Ulimits }}' fedora-toolbox-38
 [{RLIMIT_NOFILE 524288 524288} {RLIMIT_NPROC 127704 127704}]

I wonder why you have 256005.

Podman is not doing anything special other than attempting to set the ulimits for the container.

Maybe it shouldn't do that for containers with the com.github.containers.toolbox label?

@debarshiray
Member Author

it looks like the only solution is to recreate the container.

Sadly, as I mentioned in the report, that's a deal breaker for Toolbx containers. :/
There's a Fedora 39 Change to treat the Toolbx stack as a release blocker. I think one of the test criteria is about preexisting containers continuing to work.

Well, who brokered the deal with whom?

Umm... I am unsure about what you mean, but maybe:
https://docs.fedoraproject.org/en-US/program_management/pgm_guide/changes/

Toolbx uses (and advertises) them for a specific purpose, and changes quite a few of the standard podman options when it creates containers. Is there no way Toolbx could reset the ulimit for an existing container?

I don't know of any way to do that other than significantly side-stepping podman start ... within Toolbx.

Toolbx actually uses podman create --ulimit host ... to always reflect whatever ulimits the host has or something reasonably close. See:

$ podman inspect --type container --format ' {{ .Config.CreateCommand }}' fedora-toolbox-38
...
...
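
In other words, the create call looks roughly like this (heavily trimmed; Toolbx passes many more flags, and the image reference is just the usual Fedora 38 one):

$ podman create --ulimit host --name fedora-toolbox-38 ... registry.fedoraproject.org/fedora-toolbox:38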

After all, it would be "fair" to ask users to operate on Toolbx containers using Toolbx (rather than podman, if it fails), and Toolbx is able to recognize Toolbx containers as such (i.e. distinguish them from non-Toolbx containers).

Like I said before, we shouldn't be asking users to do anything. We need to find a way where the existing tools are able to sort it out on their own.

Maybe, as a middle ground, podman could offer an option to migrate old containers (by resetting the ulimit setting or clearing it), and leave it for Toolbx (or the user) to decide when and if they use it?

If podman system migrate could do that, then it will just work.

@edsantiago
Member

@debarshiray it's clear that you have deep concerns about this situation -- but it's equally clear that this is complex and will not be resolved by commenting on a closed github issue.

If you think this is critical enough for PM to intervene, please file a BZ and try to escalate.

If you think this is something Toolbx can address via special case, I encourage you to look into that option.

Or perhaps if there is a simple recipe for users to solve this on their own, you could add a solution here, make it the last comment in the thread, and we can lock the issue. Then affected people can websearch, find this, scroll down, and be happy.

I'm not sure what other options are available. Thank you for your concern and for understanding.

@debarshiray
Member Author

@debarshiray it's clear that you have deep concerns about this situation -- but it's equally clear that this is complex and will not be resolved by commenting on a closed github issue.

If you think this is critical enough for PM to intervene, please file a BZ and try to escalate.

Sadly, I am also concerned about this approach of closing the issue in a hurry without any meaningful discussion or understanding; and then saying that there's no point commenting on a closed github issue; and that I should escalate. It's odd.

I also don't see why this problem is particularly complex. I offered one mechanism that Podman could use to avoid this problem, and I never heard back. Maybe it is complex, but I have no idea why.

If you think this is something Toolbx can address via special case, I encourage you to look into that option.

I don't know of any way to do that other than significantly side-stepping podman start ... within Toolbx.

@gptlang

gptlang commented Aug 23, 2023

Arguing over whether there's data loss or not gets close to playing with semantics. Of course, there's nothing catastrophic like rm -rf $HOME going on here, but that's not a requirement for a problem to be serious.

This is not about semantics. By "no data loss", I meant that nothing installed in the old podman/toolbox containers (not even packages and such) is lost. All you do is export the data and import it again with the updated config.

I panicked when toolbox stopped working because it took some effort to set up all my development stuff (vscode, build tools, miscellaneous utilities), which would take me a few hours to replace. I also had an important podman container where I kept WIP projects not uploaded anywhere.

I tried this solution and it got back my containers just fine. As far as I'm aware, not a single file was lost.

@gptlang

gptlang commented Aug 23, 2023

It would be nice if podman provided a way to automatically migrate from a deprecated config to a working one.

@gptlang

gptlang commented Aug 23, 2023

If you think this is something Toolbx can address via special case, I encourage you to look into that option.

What does this have to do with toolbox? It uses podman under the hood and this issue affects all containers. There is nothing special about toolbox that requires a different solution.

@baude
Member

baude commented Aug 23, 2023

I think everyone has a responsibility here. I am going to stop allowing comments on this issue temporarily and discuss this with the team.

@containers locked and limited conversation to collaborators Aug 23, 2023