Reboot after ostree update causing "error setting rlimit type 6: operation not permitted" #7466

Closed
wdouglascampbell opened this issue Aug 27, 2020 · 10 comments
Labels: kind/bug, locked - please file new issue/PR

Comments

@wdouglascampbell

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

I am running Fedora CoreOS. When the system auto-updates to a new release and reboots, the rootless containers that I run as services configured to start at boot do not start. Instead, I see errors like the following in the journalctl logs.

error setting rlimit type 6: operation not permitted
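
For readers decoding this message: the number in "rlimit type 6" is a resource-limit index, not a Podman error code. The following is a minimal sketch, assuming the Linux generic numbering used on x86_64 (the architecture reported in this issue). The table and the helper `explain_rlimit_error` are illustrative names, not part of Podman or runc.

```python
import re

# Resource numbers from the Linux generic ABI (as used on x86_64).
# Assumption: other architectures may number some of these differently.
LINUX_RLIMIT_NAMES = {
    0: "RLIMIT_CPU",
    1: "RLIMIT_FSIZE",
    2: "RLIMIT_DATA",
    3: "RLIMIT_STACK",
    4: "RLIMIT_CORE",
    5: "RLIMIT_RSS",
    6: "RLIMIT_NPROC",
    7: "RLIMIT_NOFILE",
    8: "RLIMIT_MEMLOCK",
    9: "RLIMIT_AS",
    10: "RLIMIT_LOCKS",
    11: "RLIMIT_SIGPENDING",
    12: "RLIMIT_MSGQUEUE",
    13: "RLIMIT_NICE",
    14: "RLIMIT_RTPRIO",
    15: "RLIMIT_RTTIME",
}


def explain_rlimit_error(message):
    """Return the RLIMIT_* name referenced in an error line, or None."""
    match = re.search(r"rlimit type (\d+)", message)
    if match is None:
        return None
    # Unknown numbers yield None rather than a guess.
    return LINUX_RLIMIT_NAMES.get(int(match.group(1)))


if __name__ == "__main__":
    line = "error setting rlimit type 6: operation not permitted"
    print(explain_rlimit_error(line))
```

Under that numbering, type 6 is `RLIMIT_NPROC`, the per-user limit on the number of processes. "Operation not permitted" is what an unprivileged process typically gets when it asks for a limit above its current hard limit.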

Steps to reproduce the issue:

  1. Allow Fedora CoreOS to auto-update when a new release comes out

  2. After the reboot has occurred, observe that none of my rootless podman containers are running.

Describe the results you received:

All containers have failed to start and cannot be started after rebooting from an auto-upgrade.

Describe the results you expected:

Containers configured to be started by systemctl service will start upon boot.

Additional information you deem important (e.g. issue happens only occasionally):

After this occurs, I must recreate the containers to get them to start again.

Output of podman version:

Version:            1.9.3
RemoteAPI Version:  1
Go Version:         go1.14.2
OS/Arch:            linux/amd64

Output of podman info --debug:

debug:
  compiler: gc
  gitCommit: ""
  goVersion: go1.14.2
  podmanVersion: 1.9.3
host:
  arch: amd64
  buildahVersion: 1.14.9
  cgroupVersion: v1
  conmon:
    package: conmon-2.0.19-1.fc32.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.19, commit: 5dce9767526ed27f177a8fa3f281889ad509fea7'
  cpus: 12
  distribution:
    distribution: fedora
    version: "32"
  eventLogger: file
  hostname: coreos.l2d.biz
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.7.12-200.fc32.x86_64
  memFree: 43138428928
  memTotal: 67493195776
  ociRuntime:
    name: runc
    package: runc-1.0.0-144.dev.gite6555cc.fc32.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc10+dev
      commit: fbdbaf85ecbc0e077f336c03062710435607dbf1
      spec: 1.0.1-dev
  os: linux
  rootless: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.4-1.fc32.x86_64
    version: |-
      slirp4netns version 1.1.4
      commit: b66ffa8e262507e37fca689822d23430f3357fe8
      libslirp: 4.3.1
      SLIRP_CONFIG_VERSION_MAX: 2
  swapFree: 0
  swapTotal: 0
  uptime: 3h 47m 35.23s (Approximately 0.12 days)
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
  - docker.io
store:
  configFile: /var/home/core/.config/containers/storage.conf
  containerStore:
    number: 18
    paused: 0
    running: 18
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.1.2-1.fc32.x86_64
      Version: |-
        fusermount3 version: 3.9.1
        fuse-overlayfs: version 1.1.0
        FUSE library version 3.9.1
        using FUSE kernel interface version 7.31
  graphRoot: /var/home/core/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 62
  runRoot: /run/user/1000/containers
  volumePath: /var/home/core/.local/share/containers/storage/volumes

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?

No. I believe the latest version of Podman is 2.0.5 but Fedora CoreOS latest stable is running Podman 1.9.3. I don't know if it is possible for me to update to a newer version under Fedora CoreOS.

Additional environment details (AWS, VirtualBox, physical, etc.):

physical

@openshift-ci-robot added the kind/bug label Aug 27, 2020
@rhatdan
Member

rhatdan commented Aug 27, 2020

@dustymabe When is fedora coreos getting updated?

@dustymabe
Contributor

> @dustymabe When is fedora coreos getting updated?

Tracking it over here: coreos/fedora-coreos-tracker#575

Currently blocked on #7441

@dustymabe
Contributor

> Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?
>
> No. I believe the latest version of Podman is 2.0.5 but Fedora CoreOS latest stable is running Podman 1.9.3. I don't know if it is possible for me to update to a newer version under Fedora CoreOS.

You can test with podman 2.x by using the next stream. next currently has podman 2.0.4.

@dustymabe
Contributor

@rhatdan - if we could get the fix for #7441 backported to the 2.0.5 rpm in fedora for f32 then we would drop the freeze on podman 1.9.3 I think.

@wdouglascampbell
Author

> You can test with podman 2.x by using the next stream. next currently has podman 2.0.4.

Thanks @dustymabe. I'm hesitant to try switching to the next stream for fear that I might mess up my production environment. I cannot tell from the instructions whether that fear is warranted or not but I am willing to wait in hopes that the freeze on podman 1.9.3 will end soon.

@dustymabe
Contributor

dustymabe commented Aug 27, 2020

Hey @wdouglascampbell - your fears are warranted, but I would encourage you to run at least some testing nodes so that you know when breakage is coming to your production cluster and can help us fix the problems before they hit stable for everyone.

Right now (08/27/2020) testing and next only differ in podman version, nothing else.

@wdouglascampbell
Author

> Hey @wdouglascampbell - your fears are warranted, but I would encourage you to run at least some testing nodes so that you know when breakage is coming to your production cluster and can help us fix the problems before they hit stable for everyone.
>
> Right now (08/27/2020) testing and next only differ in podman version, nothing else.

I will keep that in mind and see what I can do about setting up a testing node. Thanks!

@rhatdan
Member

rhatdan commented Sep 10, 2020

@dustymabe podman 2.0.6 should have the fix.
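
To check whether a given node already carries the release named above, the `Version:` value from `podman version` can be compared against 2.0.6. This is a minimal sketch, assuming plain dotted numeric version strings (no `-dev` or `-rc` suffixes). `parse_version` and `has_fix` are hypothetical helper names, not Podman APIs.

```python
def parse_version(text):
    """Turn '2.0.6' into (2, 0, 6) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


# Release named in this thread as containing the fix.
FIXED_IN = parse_version("2.0.6")


def has_fix(podman_version):
    """True if the given Podman version is 2.0.6 or newer."""
    return parse_version(podman_version) >= FIXED_IN


if __name__ == "__main__":
    for version in ("1.9.3", "2.0.4", "2.0.6"):
        print(version, has_fix(version))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating "2.0.10" as older than "2.0.6".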

@rhatdan closed this as completed Sep 10, 2020
@dustymabe
Contributor

FYI podman 2.0.6 is in testing and next. Coming to FCOS stable soon.

@wdouglascampbell
Author

Just FYI: now that podman 2.0.6 is in FCOS stable, I re-tested, and so far so good. A reboot worked. Thanks!

@github-actions bot added the locked - please file new issue/PR label Sep 22, 2023
@github-actions bot locked as resolved and limited conversation to collaborators Sep 22, 2023