Podman fails to run in rootless container (OKD v3.11) #7861

Closed
nickyfoster opened this issue Oct 1, 2020 · 11 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.


@nickyfoster

BUG REPORT

/kind bug

Description

When I try to build an image with Podman inside an unprivileged container in an OpenShift OKD cluster, I get:
Error processing tar file(exit status 1): there might not be enough IDs available in the namespace (requested 0:42 for /etc/shadow): lchown /etc/shadow: invalid argument
Running podman system migrate does not help; the problem persists.
I have also tried the solutions from #3421, #6667, #3890, and #2393, but none of them helped.
Please help me solve this problem.
Thank you all in advance.

Steps to reproduce the issue:

  1. Dockerfile
FROM docker.io/openshift/jenkins-agent-maven-35-centos7:v3.11

USER root

# install ansible
RUN yum install -y python3-pip && yum clean all
RUN pip3 install ansible --user
RUN pip3 install kubernetes --user
RUN pip3 install openshift --user


RUN echo 'jenkins:200000:2000' > /etc/subuid   && \
  echo 'jenkins:200000:2000' > /etc/subgid   && \
  echo 'jenkins:x:1001:0:Default Application User:/home/jenkins:/sbin/nologin' > /etc/passwd   && \
  echo 'root:x:0:0:root:/root:/bin/bash' > /etc/passwd   && \
  chown 1001:1001 /home/jenkins/.local


RUN yum -y update; yum -y reinstall shadow-utils; yum -y install podman fuse-overlayfs; rm -rf /var/cache /var/log/dnf* /var/log/yum.*
ADD https://raw.githubusercontent.com/containers/libpod/master/contrib/podmanimage/stable/containers.conf /etc/containers/

RUN chmod 644 /etc/containers/containers.conf; sed -i -e 's|^#mount_program|mount_program|g' -e '/additionalimage.*/a "/var/lib/shared",' -e 's|^mountopt[[:space:]]*=.*$|mountopt = "nodev,fsync=0"|g' /etc/containers/storage.conf
RUN mkdir -p /var/lib/shared/overlay-images /var/lib/shared/overlay-layers /var/lib/shared/vfs-images /var/lib/shared/vfs-layers; touch /var/lib/shared/overlay-images/images.lock; touch /var/lib/shared/overlay-layers/layers.lock; touch /var/lib/shared/vfs-images/images.lock; touch /var/lib/shared/vfs-layers/layers.lock

ENV _CONTAINERS_USERNS_CONFIGURED=""


USER 1001

  1. Pod Template
apiVersion: "v1"
kind: "Pod"
metadata:
  annotations: {}
  labels:
    jenkins: "slave"
    jenkins/maven: "true"
  name: "maven-q71hq"
spec:
  containers:
  - args:
    - "********"
    - "maven-q71hq"
    env:
    - name: "JENKINS_SECRET"
      value: "********"
    - name: "http_proxy"
    - name: "no_proxy"
    - name: "JENKINS_TUNNEL"
      value: "172.30.252.54:50000"
    - name: "https_proxy"
    - name: "GIT_COMMITTER_NAME"
      value: "jenkins"
    - name: "JENKINS_AGENT_WORKDIR"
      value: "/tmp"
    - name: "USER"
      value: "jenkins"
    - name: "JENKINS_AGENT_NAME"
      value: "maven-q71hq"
    - name: "JENKINS_NAME"
      value: "maven-q71hq"
    - name: "JENKINS_URL"
      value: "http://172.30.138.30:80/"
    - name: "HOME"
      value: "/home/jenkins"
    image: "example.com/jenkins-agent-gradle-centos7-dockerless"
    imagePullPolicy: "Always"
    name: "jnlp"
    resources:
      limits: {}
      requests: {}
    securityContext:
      privileged: false
    tty: false
    volumeMounts:
    - mountPath: "/sys/fs/cgroup"
      name: "volume-1"
      readOnly: false
    - mountPath: "/tmp"
      name: "workspace-volume"
      readOnly: false
    workingDir: "/tmp"
  imagePullSecrets:
  - name: "okd-nexus3"
  nodeSelector: {}
  restartPolicy: "Never"
  serviceAccount: "jenkins"
  volumes:
  - name: "volume-1"
    persistentVolumeClaim:
      claimName: "podman-pvc"
      readOnly: true
  - emptyDir:
      medium: ""
    name: "workspace-volume"
  1. Run podman run ubuntu inside the pod.

Describe the results you received:
When I try to build an image inside a Jenkins pipeline, I get the following result:

+ podman build . -t example.com/image-test-build:trunk-12

STEP 1: FROM alpine:3.7

Getting image source signatures
Copying blob sha256:5d20c808ce198565ff70b3ed23a991dd49afac45dece63474b27ce6ed036adc6
Copying config sha256:6d1ef012b5674ad8a127ecfa9b5e6f5178d171b90ee462846974177fd9bdd39f
Writing manifest to image destination
Storing signatures
Error: error creating build container: The following failures happened while trying to pull image specified by "alpine:3.7" based on search registries in /etc/containers/registries.conf:
* "localhost/alpine:3.7": Error initializing source docker://localhost/alpine:3.7: error pinging docker registry localhost: Get https://localhost/v2/: dial tcp [::1]:443: connect: connection refused
* "registry.access.redhat.com/alpine:3.7": Error initializing source docker://registry.access.redhat.com/alpine:3.7: Error reading manifest 3.7 in registry.access.redhat.com/alpine: name unknown: Repo not found
* "registry.redhat.io/alpine:3.7": Error initializing source docker://registry.redhat.io/alpine:3.7: unable to retrieve auth token: invalid username/password: unauthorized: Please login to the Red Hat Registry using your Customer Portal credentials. Further instructions can be found here: https://access.redhat.com/RegistryAuthentication
* "docker.io/library/alpine:3.7": Error committing the finished image: error adding layer with blob "sha256:5d20c808ce198565ff70b3ed23a991dd49afac45dece63474b27ce6ed036adc6": Error processing tar file(exit status 1): there might not be enough IDs available in the namespace (requested 0:42 for /etc/shadow): lchown /etc/shadow: invalid argument

Describe the results you expected:
The image is built successfully.

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

bash-4.2$ podman version
Version:            1.6.4
RemoteAPI Version:  1
Go Version:         go1.12.12
OS/Arch:            linux/amd64

Output of podman info --debug:

bash-4.2$ podman info --debug
debug:
  compiler: gc
  git commit: ""
  go version: go1.12.12
  podman version: 1.6.4
host:
  BuildahVersion: 1.12.0-dev
  CgroupVersion: v1
  Conmon:
    package: conmon-2.0.8-1.el7.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.8, commit: f85c8b1ce77b73bcd48b2d802396321217008762'
  Distribution:
    distribution: '"centos"'
    version: "7"
  IDMappings:
    gidmap:
    - container_id: 0
      host_id: 0
      size: 1
    uidmap:
    - container_id: 0
      host_id: 1001
      size: 1
  MemFree: 2776309760
  MemTotal: 32943480832
  OCIRuntime:
    name: runc
    package: runc-1.0.0-67.rc10.el7_8.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.1-dev'
  SwapFree: 0
  SwapTotal: 0
  arch: amd64
  cpus: 8
  eventlogger: file
  hostname: maven-q71hq
  kernel: 3.10.0-1127.el7.x86_64
  os: linux
  rootless: true
  slirp4netns:
    Executable: /usr/bin/slirp4netns
    Package: slirp4netns-0.4.3-4.el7_8.x86_64
    Version: |-
      slirp4netns version 0.4.3
      commit: 2244b9b6461afeccad1678fac3d6e478c28b4ad6
  uptime: 64h 36m 54.86s (Approximately 2.67 days)
registries:
  blocked: null
  insecure: null
  search:
  - registry.access.redhat.com
  - registry.redhat.io
  - docker.io
store:
  ConfigFile: /home/jenkins/.config/containers/storage.conf
  ContainerStore:
    number: 0
  GraphDriverName: overlay
  GraphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-0.7.2-6.el7_8.x86_64
      Version: |-
        fuse-overlayfs: version 0.7.2
        FUSE library version 3.6.1
        using FUSE kernel interface version 7.29
  GraphRoot: /home/jenkins/.local/share/containers/storage
  GraphStatus:
    Backing Filesystem: overlayfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 0
  RunRoot: /tmp/run-1001/containers
  VolumePath: /home/jenkins/.local/share/containers/storage/volumes
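
The IDMappings section of the output above lists a single mapped ID for both UID and GID (size: 1), while the failing layer requests 0:42 for /etc/shadow. The sketch below assumes only the general user-namespace rule that a container ID is usable when it falls inside some mapped range [container_id, container_id + size). It is not Podman's own code, and the function names are hypothetical.

```python
from typing import Dict, List

# Mappings copied from the `podman info --debug` output above.
UIDMAP = [{"container_id": 0, "host_id": 1001, "size": 1}]
GIDMAP = [{"container_id": 0, "host_id": 0, "size": 1}]


def is_mapped(container_id: int, mappings: List[Dict[str, int]]) -> bool:
    """Return True if container_id falls inside any mapped range."""
    return any(
        m["container_id"] <= container_id < m["container_id"] + m["size"]
        for m in mappings
    )


def can_chown(uid: int, gid: int) -> bool:
    """A chown to uid:gid needs both IDs to exist in the namespace."""
    return is_mapped(uid, UIDMAP) and is_mapped(gid, GIDMAP)


print(can_chown(0, 0))   # True: only ID 0 is mapped on each side
print(can_chown(0, 42))  # False: GID 42 is outside the single-ID mapping
```

With a wider range, such as the 200000:2000 entries written to /etc/subuid and /etc/subgid in the Dockerfile, GID 42 would fall inside the mapping. The output above shows that those ranges are not in effect in this pod.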

Package info (e.g. output of rpm -q podman or apt list podman):

bash-4.2$ rpm -q podman
podman-1.6.4-18.el7_8.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?

Yes, I compiled a 1.8.x version, but the result is the same.

Additional environment details (AWS, VirtualBox, physical, etc.):
Host:

[root@node02 ~]# cat /etc/os-release 
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
@vrothberg
Member

@rhatdan @giuseppe PTAL

I am not sure Podman can run in a rootless container.

@nickyfoster
Author

nickyfoster commented Oct 1, 2020

If it is not possible to run Podman inside a rootless container, maybe you can recommend a solution for my case.
For me, it is acceptable to use root inside the container, as long as it is isolated from the host system.

@giuseppe
Member

giuseppe commented Oct 1, 2020

I think you should at least add the CAP_SETUID/CAP_SETGID capabilities, or try the ignore_chown_errors storage option.
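
One way to see whether the process inside the pod actually holds those two capabilities is to decode the CapEff mask from /proc/self/status. This is a minimal sketch, assuming a Linux /proc filesystem and the standard capability numbers from linux/capability.h (CAP_SETGID = 6, CAP_SETUID = 7). The function names and the sample mask are hypothetical.

```python
from typing import Optional

CAP_SETGID = 6  # bit positions as defined in linux/capability.h
CAP_SETUID = 7


def parse_cap_eff(status_text: str) -> int:
    """Extract the effective capability mask from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split()[1], 16)
    raise ValueError("CapEff line not found")


def can_switch_ids(status_text: str) -> bool:
    """True if both CAP_SETUID and CAP_SETGID are in the effective set."""
    mask = parse_cap_eff(status_text)
    return bool((mask >> CAP_SETUID) & 1) and bool((mask >> CAP_SETGID) & 1)


def current_process_can_switch_ids() -> Optional[bool]:
    """Check the running process; None if /proc is not readable."""
    try:
        with open("/proc/self/status") as f:
            return can_switch_ids(f.read())
    except OSError:
        return None


# Hypothetical sample mask, used only to exercise the parser.
print(can_switch_ids("CapEff:\t00000000a80425fb\n"))  # True
print(can_switch_ids("CapEff:\t0000000000000000\n"))  # False
```

Running current_process_can_switch_ids() as the jenkins user inside the pod would show whether the pod's security settings left these capabilities in place.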

@nickyfoster
Author

@giuseppe
Alright, thanks, I'll try that and let you know.

I also get this error:

+ podman build . -t example.com/test-build:trunk-123
cannot clone: Invalid argument
user namespaces are not enabled in /proc/sys/user/max_user_namespaces
Error: could not get runtime: cannot re-exec process

when I run my pipeline.
However, if I then delete the pod manually and run the pipeline again, the error disappears.
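
The message above points at /proc/sys/user/max_user_namespaces. A value of 0 in that file means the kernel will not create new user namespaces. The following is a minimal sketch for reading that setting from inside the pod or on a node; the function names are hypothetical, and it says nothing about why the value might differ between runs.

```python
from pathlib import Path
from typing import Optional

SYSCTL_PATH = "/proc/sys/user/max_user_namespaces"


def is_enabled_value(text: str) -> bool:
    """A limit of 0 disables user namespace creation; any positive limit allows it."""
    return int(text.strip()) > 0


def user_namespaces_enabled(path: str = SYSCTL_PATH) -> Optional[bool]:
    """Return True/False from the sysctl file, or None if it cannot be read."""
    try:
        text = Path(path).read_text()
    except OSError:
        return None
    return is_enabled_value(text)


print(user_namespaces_enabled())
```

Comparing the result on the node where the pipeline fails with the node where it succeeds may help narrow down the intermittent behaviour.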

@nickyfoster
Author

What am I doing wrong here?

bash-4.2$ podman build . -t example.com/test-build:trunk-123 --storage-opt "ignore_chown_errors=true"
Error: could not get runtime: overlay: Unknown option ignore_chown_errors

@rhatdan
Member

rhatdan commented Oct 1, 2020

Podman will not run within a rootless container without a lot of work and a great deal of privilege. It most likely will not work on a version of OpenShift as old as 3.11 either.

@nickyfoster
Author

@rhatdan, which version of OKD do you recommend?

@rhatdan
Member

rhatdan commented Oct 1, 2020

Well, I would suggest a more current version, 4.6 or later. We are working on supporting user namespaces from within CRI-O now, which might help a little with this.

@nickyfoster
Author

Alright then! I'll upgrade my cluster to a newer version and then experiment with Podman once more.

@rhatdan
Member

rhatdan commented Oct 1, 2020

Currently, you will need to run the container as root with the default capabilities.

@nickyfoster
Author

Got it, thanks @rhatdan!

I'm closing the ticket now.
If I have any further questions, is it possible to reopen the ticket, or should I just ask here?
