"metacopy check" insufficient? all "sudo podman __" commands failing on ubuntu 18.04 kernel 4.15 #9363

Closed
buck2202 opened this issue Feb 14, 2021 · 12 comments
Labels: In Progress, kind/bug, locked - please file new issue/PR, Packaging

Comments

@buck2202

buck2202 commented Feb 14, 2021

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

On a clean installation of podman 3, all commands fail when executed as root.

Steps to reproduce the issue:

$ sudo podman version
Error: Cannot connect to the Podman socket, make sure there is a Podman REST API service running.: failed to mount overlay for metacopy check with "nodev,metacopy=on" options: invalid argument

$ dmesg|tail
[ 1927.646068] overlayfs: unrecognized mount option "metacopy=on" or missing value

Describe the results you received:
From the podman error message, it seems like there is a test to determine kernel support for "metacopy", but it fails with an untrapped error on systems that don't offer that option?

Reverting mountopt = "nodev,metacopy=on" (v3) in /etc/containers/storage.conf to mountopt = "nodev" (as it installed in 2.2) avoids the issue.
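
The workaround above can be scripted. This is a minimal sketch, assuming a sed that accepts `-i.bak` (GNU and BSD both do) and that `metacopy=on` follows another option in the `mountopt` value, as in the shipped `nodev,metacopy=on`. The function name `strip_metacopy` is made up for illustration.

```shell
# strip_metacopy CONF
# Removes ",metacopy=on" from the active (uncommented) mountopt line of
# CONF and keeps a backup copy at CONF.bak. Other lines are untouched.
strip_metacopy() {
    sed -i.bak -e '/^[[:space:]]*mountopt[[:space:]]*=/ s/,metacopy=on//' "$1"
}
```

Run as root, the call would be `strip_metacopy /etc/containers/storage.conf`.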

Describe the results you expected:
sudo podman <command> should work, or at least print an explicit "metacopy support not present, manually edit storage.conf" message.

Additional information you deem important (e.g. issue happens only occasionally):
Root is still required for checkpointing, and checkpointing with overlayfs via criu has been broken in Ubuntu kernels since somewhere in the 5.0.0-2x range. 4.15.x will soon be the only still-maintained, working LTS Ubuntu kernel for overlayfs checkpoints (4.4.x has a few months left). From what I can gather, metacopy was added in 4.19.

Output of podman version:

$ podman version
Version:      3.0.0
API Version:  3.0.0
Go Version:   go1.15.2
Built:        Thu Jan  1 00:00:00 1970
OS/Arch:      linux/amd64

Output of podman info --debug:

$ podman info --debug
host:
  arch: amd64
  buildahVersion: 1.19.2
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: 'conmon: /usr/libexec/podman/conmon'
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.0.26, commit: '
  cpus: 1
  distribution:
    distribution: ubuntu
    version: "18.04"
  eventLogger: journald
  hostname: mw-podman-base-bionic
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1003
      size: 1
    - container_id: 1
      host_id: 296608
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1002
      size: 1
    - container_id: 1
      host_id: 296608
      size: 65536
  kernel: 4.15.0-1092-gcp
  linkmode: dynamic
  memFree: 96141312
  memTotal: 608616448
  ociRuntime:
    name: crun
    package: 'crun: /usr/bin/crun'
    path: /usr/bin/crun
    version: |-
      crun version 0.17.6-58ef-dirty
      commit: fd582c529489c0738e7039cbc036781d1d039014
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1002/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true

Package info (e.g. output of rpm -q podman or apt list podman):

$ apt list podman
Listing... Done
podman/unknown,now 100:3.0.0-1 amd64 [installed]

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):
google cloud ubuntu-1804 image, ga (4.15.x) kernel

@openshift-ci-robot openshift-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Feb 14, 2021
@RaphaelPour

RaphaelPour commented Feb 15, 2021

I can confirm this bug.

I noticed that my 18.04 CI job failed at podman login. With --log-level=debug set, the 18.04 job diverged from the 20.04 one, starting with the following log messages:

time="2021-02-15T12:44:37+01:00" level=debug msg="overlay test mount with multiple lowers succeeded"
time="2021-02-15T12:44:37+01:00" level=info msg="overlay test mount did not indicate whether or not metacopy is being used: failed to mount overlay for metacopy check with \"nodev,metacopy=on\" options: invalid argument"
Error: Cannot connect to the Podman socket, make sure there is a Podman REST API service running.: failed to mount overlay for metacopy check with "nodev,metacopy=on" options: invalid argument

@rhatdan
Member

rhatdan commented Feb 15, 2021

This change would at least make this a warning rather than a hard failure.
containers/storage#819
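
To illustrate the warn-and-continue behaviour described here, the sketch below falls back to the same options without `metacopy=on` when a probe fails. It is not the code from containers/storage#819. `pick_mountopt` and its probe argument are hypothetical names, and the probe stands in for whatever test mount the storage library performs.

```shell
# pick_mountopt PROBE OPTS
# PROBE is any command or function that exits 0 when the kernel accepts
# the mount options passed as its first argument (hypothetical helper).
# Prints the options to use on stdout; the warning goes to stderr.
pick_mountopt() {
    if "$1" "$2"; then
        printf '%s\n' "$2"
    else
        printf 'warning: kernel rejected "%s", continuing without metacopy\n' "$2" >&2
        printf '%s\n' "$2" | sed -e 's/,metacopy=on//'
    fi
}
```

The design point is that an unsupported optional feature downgrades the configuration instead of aborting every command.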

@mheon
Member

mheon commented Feb 15, 2021

@buck2202 Are you deliberately trying to run remote Podman?

It looks like Podman is trying to connect remotely there, which could be another serious bug if you did not intend that.

@buck2202
Author

buck2202 commented Feb 15, 2021

@mheon no, I'm not trying to run remotely. You say that because of the REST API reference in the error message? Or the "remoteSocket" section of podman info?

As root, after editing storage.conf to mountopt = "nodev":

$ sudo podman info --debug
host:
  arch: amd64
  buildahVersion: 1.19.2
  cgroupManager: systemd
  cgroupVersion: v1
  conmon:
    package: 'conmon: /usr/libexec/podman/conmon'
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.0.26, commit: '
  cpus: 1
  distribution:
    distribution: ubuntu
    version: "18.04"
  eventLogger: journald
  hostname: mw-podman-base-bionic
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 4.15.0-1092-gcp
  linkmode: dynamic
  memFree: 90628096
  memTotal: 608616448
  ociRuntime:
    name: runc
    package: 'runc: /usr/sbin/runc'
    path: /usr/sbin/runc
    version: 'runc version spec: 1.0.1-dev'
  os: linux
  remoteSocket:
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: true
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    selinuxEnabled: false
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 0
  swapTotal: 0
  uptime: 5m 6.14s
registries:
  search:
  - docker.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev
  graphRoot: /var/lib/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 4
  runRoot: /run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 3.0.0
  Built: 0
  BuiltTime: Thu Jan  1 00:00:00 1970
  GitCommit: ""
  GoVersion: go1.15.2
  OsArch: linux/amd64
  Version: 3.0.0

edit: and debug-level output of the failure with mountopt = "nodev,metacopy=on"

$ sudo podman --log-level=debug version
INFO[0000] podman filtering at log level debug          
DEBU[0000] Called version.PersistentPreRunE(podman --log-level=debug version) 
DEBU[0000] Reading configuration file "/usr/share/containers/containers.conf" 
DEBU[0000] Merged system config "/usr/share/containers/containers.conf": &{Containers:{Devices:[] Volumes:[] ApparmorProfile:containers-default-0.33.4 Annotations:[] CgroupNS:host Cgroups:enabled DefaultCapabilities:[CHOWN DAC_OVERRIDE FOWNER FSETID KILL NET_BIND_SERVICE SETFCAP SETGID SETPCAP SETUID SYS_CHROOT] DefaultSysctls:[net.ipv4.ping_group_range=0 0] DefaultUlimits:[nproc=32768:32768] DefaultMountsFile: DNSServers:[] DNSOptions:[] DNSSearches:[] EnableKeyring:true EnableLabeling:false Env:[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin TERM=xterm] EnvHost:false HTTPProxy:true Init:false InitPath: IPCNS:private LogDriver:k8s-file LogSizeMax:-1 NetNS:bridge NoHosts:false PidsLimit:2048 PidNS:private SeccompProfile:/usr/share/containers/seccomp.json ShmSize:65536k TZ: Umask:0022 UTSNS:private UserNS:host UserNSSize:65536} Engine:{ImageBuildFormat:oci CgroupCheck:false CgroupManager:systemd ConmonEnvVars:[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin] ConmonPath:[/usr/libexec/podman/conmon /usr/local/libexec/podman/conmon /usr/local/lib/podman/conmon /usr/bin/conmon /usr/sbin/conmon /usr/local/bin/conmon /usr/local/sbin/conmon /run/current-system/sw/bin/conmon] DetachKeys:ctrl-p,ctrl-q EnablePortReservation:true Env:[] EventsLogFilePath:/run/libpod/events/events.log EventsLogger:journald HooksDir:[/usr/share/containers/oci/hooks.d] ImageDefaultTransport:docker:// InfraCommand: InfraImage:k8s.gcr.io/pause:3.2 InitPath:/usr/libexec/podman/catatonit LockType:shm MultiImageArchive:false Namespace: NetworkCmdPath: NetworkCmdOptions:[] NoPivotRoot:false NumLocks:2048 OCIRuntime:crun OCIRuntimes:map[crun:[/usr/bin/crun /usr/sbin/crun /usr/local/bin/crun /usr/local/sbin/crun /sbin/crun /bin/crun /run/current-system/sw/bin/crun] kata:[/usr/bin/kata-runtime /usr/sbin/kata-runtime /usr/local/bin/kata-runtime /usr/local/sbin/kata-runtime /sbin/kata-runtime /bin/kata-runtime /usr/bin/kata-qemu /usr/bin/kata-fc] runc:[/usr/bin/runc /usr/sbin/runc /usr/local/bin/runc /usr/local/sbin/runc /sbin/runc /bin/runc /usr/lib/cri-o-runc/sbin/runc /run/current-system/sw/bin/runc]] PullPolicy:missing Remote:false RemoteURI: RemoteIdentity: ActiveService: ServiceDestinations:map[] RuntimePath:[] RuntimeSupportsJSON:[crun runc] RuntimeSupportsNoCgroups:[crun] RuntimeSupportsKVM:[kata kata-runtime kata-qemu kata-fc] SetOptions:{StorageConfigRunRootSet:false StorageConfigGraphRootSet:false StorageConfigGraphDriverNameSet:false StaticDirSet:false VolumePathSet:false TmpDirSet:false} SignaturePolicyPath:/etc/containers/policy.json SDNotify:false StateType:3 StaticDir:/var/lib/containers/storage/libpod StopTimeout:10 TmpDir:/run/libpod VolumePath:/var/lib/containers/storage/volumes VolumePlugins:map[]} Network:{CNIPluginDirs:[/usr/libexec/cni /usr/lib/cni /usr/local/lib/cni /opt/cni/bin] DefaultNetwork:podman NetworkConfigDir:/etc/cni/net.d/}}
DEBU[0000] Reading configuration file "/etc/containers/containers.conf" 
DEBU[0000] Merged system config "/etc/containers/containers.conf": &{Containers:{Devices:[] Volumes:[] ApparmorProfile:containers-default-0.33.4 Annotations:[] CgroupNS:host Cgroups:enabled DefaultCapabilities:[CHOWN DAC_OVERRIDE FOWNER FSETID KILL NET_BIND_SERVICE SETFCAP SETGID SETPCAP SETUID SYS_CHROOT] DefaultSysctls:[net.ipv4.ping_group_range=0 0] DefaultUlimits:[nproc=32768:32768] DefaultMountsFile: DNSServers:[] DNSOptions:[] DNSSearches:[] EnableKeyring:true EnableLabeling:false Env:[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin TERM=xterm] EnvHost:false HTTPProxy:true Init:false InitPath: IPCNS:private LogDriver:k8s-file LogSizeMax:-1 NetNS:bridge NoHosts:false PidsLimit:2048 PidNS:private SeccompProfile:/usr/share/containers/seccomp.json ShmSize:65536k TZ: Umask:0022 UTSNS:private UserNS:host UserNSSize:65536} Engine:{ImageBuildFormat:oci CgroupCheck:false CgroupManager:systemd ConmonEnvVars:[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin] ConmonPath:[/usr/libexec/podman/conmon /usr/local/libexec/podman/conmon /usr/local/lib/podman/conmon /usr/bin/conmon /usr/sbin/conmon /usr/local/bin/conmon /usr/local/sbin/conmon /run/current-system/sw/bin/conmon] DetachKeys:ctrl-p,ctrl-q EnablePortReservation:true Env:[] EventsLogFilePath:/run/libpod/events/events.log EventsLogger:journald HooksDir:[/usr/share/containers/oci/hooks.d] ImageDefaultTransport:docker:// InfraCommand: InfraImage:k8s.gcr.io/pause:3.2 InitPath:/usr/libexec/podman/catatonit LockType:shm MultiImageArchive:false Namespace: NetworkCmdPath: NetworkCmdOptions:[] NoPivotRoot:false NumLocks:2048 OCIRuntime:runc OCIRuntimes:map[crun:[/usr/bin/crun /usr/sbin/crun /usr/local/bin/crun /usr/local/sbin/crun /sbin/crun /bin/crun /run/current-system/sw/bin/crun] kata:[/usr/bin/kata-runtime /usr/sbin/kata-runtime /usr/local/bin/kata-runtime /usr/local/sbin/kata-runtime /sbin/kata-runtime /bin/kata-runtime /usr/bin/kata-qemu /usr/bin/kata-fc] runc:[/usr/bin/runc /usr/sbin/runc /usr/local/bin/runc /usr/local/sbin/runc /sbin/runc /bin/runc /usr/lib/cri-o-runc/sbin/runc]] PullPolicy:missing Remote:false RemoteURI: RemoteIdentity: ActiveService: ServiceDestinations:map[] RuntimePath:[] RuntimeSupportsJSON:[crun runc] RuntimeSupportsNoCgroups:[crun] RuntimeSupportsKVM:[kata kata-runtime kata-qemu kata-fc] SetOptions:{StorageConfigRunRootSet:false StorageConfigGraphRootSet:false StorageConfigGraphDriverNameSet:false StaticDirSet:false VolumePathSet:false TmpDirSet:false} SignaturePolicyPath:/etc/containers/policy.json SDNotify:false StateType:3 StaticDir:/var/lib/containers/storage/libpod StopTimeout:10 TmpDir:/run/libpod VolumePath:/var/lib/containers/storage/volumes VolumePlugins:map[]} Network:{CNIPluginDirs:[/usr/libexec/cni /usr/lib/cni /usr/local/lib/cni /opt/cni/bin] DefaultNetwork:podman NetworkConfigDir:/etc/cni/net.d/}}
DEBU[0000] Using conmon: "/usr/libexec/podman/conmon"   
DEBU[0000] Initializing boltdb state at /var/lib/containers/storage/libpod/bolt_state.db 
DEBU[0000] Using graph driver overlay                   
DEBU[0000] Using graph root /var/lib/containers/storage 
DEBU[0000] Using run root /run/containers/storage       
DEBU[0000] Using static dir /var/lib/containers/storage/libpod 
DEBU[0000] Using tmp dir /run/libpod                    
DEBU[0000] Using volume path /var/lib/containers/storage/volumes 
DEBU[0000] Set libpod namespace to ""                   
DEBU[0000] [graphdriver] trying provided driver "overlay" 
DEBU[0000] cached value indicated that overlay is supported 
INFO[0000] overlay test mount did not indicate whether or not metacopy is being used: failed to mount overlay for metacopy check with "nodev,metacopy=on" options: invalid argument 
Error: Cannot connect to the Podman socket, make sure there is a Podman REST API service running.: failed to mount overlay for metacopy check with "nodev,metacopy=on" options: invalid argument

If you're concerned about the REST reference in the error message, it only seems to appear when the error is reported. Maybe an inapplicable string is just being prepended to the error returned from the mount test?

@mheon
Member

mheon commented Feb 15, 2021

@vrothberg found the issue there - looks like it was just an issue with the message displayed, which is a relief. Metacopy issue should be separate.

@lsm5
Member

lsm5 commented Feb 17, 2021

I will update the containers-common package to remove metacopy=on for ubuntu 18.04.

@lsm5 lsm5 self-assigned this Feb 17, 2021
@lsm5 lsm5 added In Progress This issue is actively being worked by the assignee, please do not work on this at this time. Packaging Bug is in a Podman package labels Feb 17, 2021
@buck2202
Author

@lsm5 not to make it too complicated, but there are 18.04 kernels that will support metacopy.

The 18.04 GA kernel (original release, supported for full LTS lifetime) is 4.15, which doesn't support it. The 18.04 HWE kernel (six month rolling release) is currently up to (I think) 5.4, which should support it fine.

I don't know the exact kernel version where support would start, but if you're adding a specific check on 18.04, you might just make it 18.04 AND a 4.x kernel.
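
A version test along those lines could look like the sketch below. It assumes the 4.19 threshold mentioned earlier in this issue and compares only the major and minor numbers of a `uname -r` style string, so it would misjudge a distribution kernel that backports the feature. `kernel_supports_metacopy` is a made-up name.

```shell
# kernel_supports_metacopy RELEASE
# Exits 0 when RELEASE (for example "$(uname -r)") is 4.19 or newer.
kernel_supports_metacopy() {
    kmajor=${1%%.*}           # "4.15.0-1092-gcp" -> "4"
    krest=${1#*.}             # -> "15.0-1092-gcp"
    kminor=${krest%%[!0-9]*}  # -> "15"
    [ "$kmajor" -gt 4 ] || { [ "$kmajor" -eq 4 ] && [ "$kminor" -ge 19 ]; }
}
```

For example, `kernel_supports_metacopy "$(uname -r)" || echo "drop metacopy=on"` would flag the 4.15 GA kernel but not the 5.4 HWE kernel.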

@lsm5
Member

lsm5 commented Feb 22, 2021

> @lsm5 not to make it too complicated, but there are 18.04 kernels that will support metacopy.
>
> The 18.04 GA kernel (original release, supported for full LTS lifetime) is 4.15, which doesn't support it. The 18.04 HWE kernel (six month rolling release) is currently up to (I think) 5.4, which should support it fine.
>
> I don't know the exact kernel version where support would start, but if you're adding a specific check on 18.04, you might just make it 18.04 AND a 4.x kernel.

Well, the kernel check would have to happen on the machine at install time, which means writing a post-install scriptlet, I guess, and I haven't done that before on the Debian side. Is it a deal-breaker to drop metacopy=on for all of Ubuntu 18.04?
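
One way such a scriptlet could choose the value at install time is to look for the overlay module's `metacopy` parameter in sysfs instead of parsing the kernel version. This is only a sketch: `default_mountopt` is a hypothetical name, the parameter path is assumed to be `/sys/module/overlay/parameters/metacopy`, and the check also comes up empty if the overlay module is not loaded when the package is installed.

```shell
# default_mountopt [PARAM_DIR]
# Prints the mountopt value to write into storage.conf. PARAM_DIR
# defaults to the overlay module's sysfs parameter directory (assumed
# path) and can be overridden for testing.
default_mountopt() {
    if [ -e "${1:-/sys/module/overlay/parameters}/metacopy" ]; then
        printf '%s\n' 'nodev,metacopy=on'
    else
        printf '%s\n' 'nodev'
    fi
}
```

When the parameter cannot be found, the function falls back to `nodev`, the value the 2.2 package installed.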

@vrothberg
Member

I think we can comment it out unconditionally on 18.04. Users can enable it if needed.

I think we can close. Please reopen if I am mistaken.

@sebastian-philipp

I can still reproduce this error on ubuntu 18.04:

Running command: /usr/bin/podman pull quay.ceph.io/ceph-ci/ceph:93810c6685e4f10b9d264637bb89f7e1d896da71
/usr/bin/podman: stderr Error: failed to mount overlay for metacopy check with "nodev,metacopy=on" options: invalid argument

This blocks ceph/ceph#39494

@mheon
Member

mheon commented Mar 2, 2021

Is there a reason you need to support this? We're in the process of deprecating upstream support for 18.04.

@sebastian-philipp

Do you have a supported major version that continues to support 18.04? Ceph Octopus is supposed to be EOL in mid-2022, and 18.04 is EOL in 2028.

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 22, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 22, 2023

8 participants