This repository has been archived by the owner on May 12, 2021. It is now read-only.

Add podman docs #565

Closed
wants to merge 8 commits into from

Conversation

jodh-intel
Contributor

Add a doc explaining how to run Kata Containers rootless using podman.

This replaces PR #553.

@jodh-intel jodh-intel added the wip Work in Progress (PR incomplete - needs more work or rework) label Oct 11, 2019
@jodh-intel jodh-intel requested a review from a team as a code owner October 11, 2019 08:46
@jodh-intel jodh-intel force-pushed the add-podman-docs branch 3 times, most recently from 3bc78ec to 7c27eb7 Compare October 11, 2019 09:03
@devimc devimc left a comment

Thanks @jodh-intel for fixing this. I can't run Kata Containers with podman; any thoughts? Thanks again.

@jodh-intel
Contributor Author

Hi @devimc - agreed: it appears to be broken hence the wip label. I'm currently digging but can't spend much more time on this today.

@devimc

devimc commented Oct 11, 2019

@jodh-intel don't worry, I'll take a look too

@devimc

devimc commented Oct 11, 2019

@jodh-intel I fixed the network issue: you have to update podman and slirp4netns.

podman version 1.6.2-dev
slirp4netns version 0.4.1+dev

but now I get the following error:

$ podman run --net=slirp4netns --rm --runtime=/usr/local/bin/kata-runtime alpine date
Error: mkdir /var/lib/vc/uuid: permission denied: OCI runtime permission denied error

@klynnrif klynnrif left a comment

Scrubbed for grammar, spelling, and voice. A few suggestions. Thanks!

| ----------------|:-------:|---------------------|
| Podman | WIP | [see here](https://github.com/containers/libpod/blob/master/install.md)
| `slirp4netns` | 0.4.0 | [see here](https://github.com/rootless-containers/slirp4netns#quick-start)
| Kata Containers | WIP | [see here](https://github.com/kata-containers/documentation/blob/master/install/README.md)

On these three, you might want to be more specific on the link descriptions (just a suggestion) Example:
see the libpod installation instructions
see the quick start
see the Kata Containers installation guides

Or, whatever description makes the most sense.


If SELinux is installed and enabled, it needs to be disabled with the
following command (Kata Containers
[does not support SELinux](https://github.com/kata-containers/documentation/blob/master/Limitations.md#selinux-support)).

Lines 69-71 suggested rewrite:
Kata Containers does not support SELinux. If you have installed and enabled SELinux, you need to disable it using the following command:

Contributor

I'd say "currently does not support SELinux" because I think at some point it probably would.

$ [ -e ~/.config/containers/libpod.conf ] || install -D /usr/share/containers/libpod.conf ~/.config/containers/libpod.conf
```

By default the `tmp_dir` in `libpod.conf` is set to `/var/run/libpod`, however

Add a comma after "default"
By default, the ....

Contributor

On my Fedora system this was already set to /run/user/1000. I'd just reword a bit: "Check that the tmp_dir in libpod.conf is set to /run/user/$(id -u). If not, change it with the following command."
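
A minimal sketch of the check suggested above, for anyone following along. It assumes `libpod.conf` contains a line of the form `tmp_dir = "/path"`; the helper names `libpod_tmp_dir` and `tmp_dir_matches` are hypothetical and not part of podman.

```bash
# Print the tmp_dir value from a libpod.conf-style file.
# Assumption: the setting is written as: tmp_dir = "/some/path"
libpod_tmp_dir() {
    local conf="$1"
    sed -n 's/^[[:space:]]*tmp_dir[[:space:]]*=[[:space:]]*"\(.*\)"[[:space:]]*$/\1/p' "$conf" | head -n 1
}

# Succeed only if tmp_dir in the given file equals the expected directory.
tmp_dir_matches() {
    local conf="$1" expected="$2"
    [ "$(libpod_tmp_dir "$conf")" = "$expected" ]
}
```

It could be used as `tmp_dir_matches ~/.config/containers/libpod.conf "/run/user/$(id -u)" || echo "tmp_dir needs changing"`.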


## Appendix: Possible Errors

If you are building from source you might encounter the following errors.

Add a comma after "source"
If you are building from source, you might...

[agent](https://github.com/kata-containers/agent/commit/a78e8cfda627cc350dc9d9ca9b969ebb642030c3)
and
[runtime](https://github.com/kata-containers/runtime/commit/cfedb06a19135e2ab4f18203a4f3147cdc3a4980)
code. This would probably only occur if building latest from source, since the

Suggested rewrite:
This typically only occurs if building...

@amshinde
Member

@jodh-intel I have raised a PR against your repo to include some changes that I came across while setting up rootless Kata with podman: jodh-intel#1

@jodh-intel
Contributor Author

@amshinde - this still fails for me on Ubuntu 18.04:

$ podman run --log-level=debug -ti  --runtime=/usr/local/bin/kata-runtime busybox
WARN[0000] Failed to add conmon to cgroupfs sandbox cgroup: error creating cgroup for cpu: mkdir /sys/fs/cgroup/cpu/libpod_parent: permission denied
DEBU[0000] Received: -1
DEBU[0000] Cleaning up container e1dcd0442c7028d70e4a4574d741629b028f81ddca7c169aa176174b1c6fb30e
DEBU[0000] Tearing down network namespace at /run/user/1000/netns/cni-da32eef2-c338-1558-bdb6-1a8847dd2213 for container e1dcd0442c7028d70e4a4574d741629b028f81ddca7c169aa176174b1c6fb30e
DEBU[0000] unmounted container "e1dcd0442c7028d70e4a4574d741629b028f81ddca7c169aa176174b1c6fb30e"
DEBU[0000] ExitCode msg: "network device mode not determined correctly. mount sysfs in caller: oci runtime error"
ERRO[0000] Network device mode not determined correctly. Mount sysfs in caller: OCI runtime error

@amshinde
Member

@jodh-intel Ahh I think you are using a kernel older than 4.13.0.
There is a limitation today for older kernels, for which there is a PR open in podman:
containers/podman#3831

We should document this limitation until that is merged. Meanwhile can you give this a try on a newer kernel?
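
A minimal sketch of how the kernel requirement could be checked before trying rootless Kata. The helper names `version_at_least` and `kernel_base_version` are made up for illustration, the 4.13.0 threshold is the one mentioned above (treat it as an assumption until the limitation is documented), and `sort -V` assumes GNU coreutils.

```bash
# Succeed if version $1 is greater than or equal to version $2.
# Relies on version-aware sorting (`sort -V`): the smaller version sorts first.
version_at_least() {
    local have="$1" want="$2"
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

# Strip the distro suffix from a kernel release string,
# e.g. "4.15.0-72-generic" becomes "4.15.0".
kernel_base_version() {
    printf '%s\n' "${1%%-*}"
}
```

For example: `version_at_least "$(kernel_base_version "$(uname -r)")" 4.13.0 || echo "kernel may be too old for rootless networking"`.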

@jodh-intel
Contributor Author

@amshinde - I've recreated on Ubuntu 19.04 with a 5.0.0-36 kernel. It still doesn't work. I think it makes sense for you to take over this doc PR as you clearly have the magic touch (or I lack it :)

I'm happy to expand the docs once we understand the precise requirements more clearly.

/cc @egernst.

@amshinde
Member

@jodh-intel That's a bummer. Can you post what error you were seeing with the newer kernel?

@jodh-intel
Contributor Author

@amshinde - here you go...

  • default runtime works:

    $ podman --log-level debug run --net=none -ti busybox
  • Kata runtime with networking fails:

    $ podman --log-level debug run --runtime=/usr/local/bin/kata-runtime -ti busybox
        :
    DEBU[0000] running conmon: /usr/libexec/podman/conmon    args="[--api-version 1 -c 37d1dab3bcd6851c6a34473253a47d715028cc6382c3f90339dd35a246afbecb -u 37d1dab3bcd6851c6a34473253a47d715028cc6382c3f90339dd35a246afbecb -r /usr/local/bin/kata-runtime -b /home/james/.local/share/containers/storage/vfs-containers/37d1dab3bcd6851c6a34473253a47d715028cc6382c3f90339dd35a246afbecb/userdata -p /run/user/1000/vfs-containers/37d1dab3bcd6851c6a34473253a47d715028cc6382c3f90339dd35a246afbecb/userdata/pidfile -l k8s-file:/home/james/.local/share/containers/storage/vfs-containers/37d1dab3bcd6851c6a34473253a47d715028cc6382c3f90339dd35a246afbecb/userdata/ctr.log --exit-dir /run/user/1000/libpod/tmp/exits --socket-dir-path /run/user/1000/libpod/tmp/socket --log-level debug --syslog -t --conmon-pidfile /run/user/1000/vfs-containers/37d1dab3bcd6851c6a34473253a47d715028cc6382c3f90339dd35a246afbecb/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /home/james/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/1000 --exit-command-arg --log-level --exit-command-arg debug --exit-command-arg --cgroup-manager --exit-command-arg cgroupfs --exit-command-arg --tmpdir --exit-command-arg /run/user/1000/libpod/tmp --exit-command-arg --runtime --exit-command-arg /usr/local/bin/kata-runtime --exit-command-arg --storage-driver --exit-command-arg vfs --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg container --exit-command-arg cleanup --exit-command-arg 37d1dab3bcd6851c6a34473253a47d715028cc6382c3f90339dd35a246afbecb]"
    DEBU[0001] Received: -1
    DEBU[0001] Cleaning up container 37d1dab3bcd6851c6a34473253a47d715028cc6382c3f90339dd35a246afbecb
    DEBU[0001] Tearing down network namespace at /run/user/1000/netns/cni-91ac925c-0fba-7679-d039-721062664646 for container 37d1dab3bcd6851c6a34473253a47d715028cc6382c3f90339dd35a246afbecb
    DEBU[0001] unmounted container "37d1dab3bcd6851c6a34473253a47d715028cc6382c3f90339dd35a246afbecb"
    DEBU[0001] ExitCode msg: "rpc error: code = internal desc = could not add route dest()/gw(10.0.2.2)/dev(tap0): network is unreachable: oci runtime error"
    ERRO[0001] rpc error: code = Internal desc = Could not add route dest()/gw(10.0.2.2)/dev(tap0): network is unreachable: OCI runtime error
  • Kata runtime without networking fails:

    $ podman --log-level debug run --runtime=/usr/local/bin/kata-runtime --net=none -ti busybox
       :
    DEBU[0000] running conmon: /usr/libexec/podman/conmon    args="[--api-version 1 -c b274528759ac8596a39c84d2aedd2f4c4ee5de1729821315a9fd38be0f455cbf -u b274528759ac8596a39c84d2aedd2f4c4ee5de1729821315a9fd38be0f455cbf -r /usr/local/bin/kata-runtime -b /home/james/.local/share/containers/storage/vfs-containers/b274528759ac8596a39c84d2aedd2f4c4ee5de1729821315a9fd38be0f455cbf/userdata -p /run/user/1000/vfs-containers/b274528759ac8596a39c84d2aedd2f4c4ee5de1729821315a9fd38be0f455cbf/userdata/pidfile -l k8s-file:/home/james/.local/share/containers/storage/vfs-containers/b274528759ac8596a39c84d2aedd2f4c4ee5de1729821315a9fd38be0f455cbf/userdata/ctr.log --exit-dir /run/user/1000/libpod/tmp/exits --socket-dir-path /run/user/1000/libpod/tmp/socket --log-level debug --syslog -t --conmon-pidfile /run/user/1000/vfs-containers/b274528759ac8596a39c84d2aedd2f4c4ee5de1729821315a9fd38be0f455cbf/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /home/james/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/1000 --exit-command-arg --log-level --exit-command-arg debug --exit-command-arg --cgroup-manager --exit-command-arg cgroupfs --exit-command-arg --tmpdir --exit-command-arg /run/user/1000/libpod/tmp --exit-command-arg --runtime --exit-command-arg /usr/local/bin/kata-runtime --exit-command-arg --storage-driver --exit-command-arg vfs --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg container --exit-command-arg cleanup --exit-command-arg b274528759ac8596a39c84d2aedd2f4c4ee5de1729821315a9fd38be0f455cbf]"
    DEBU[0000] Received: -1
    DEBU[0000] Cleaning up container b274528759ac8596a39c84d2aedd2f4c4ee5de1729821315a9fd38be0f455cbf
    DEBU[0000] unmounted container "b274528759ac8596a39c84d2aedd2f4c4ee5de1729821315a9fd38be0f455cbf"
    DEBU[0000] ExitCode msg: "mkdir /var/run/netns: permission denied: oci runtime permission denied error"
    ERRO[0000] mkdir /var/run/netns: permission denied: OCI runtime permission denied error

@jodh-intel
Contributor Author

@amshinde
Member

@jodh-intel Thanks, I'll try to reproduce this on a Fedora machine today. I have tried this on Ubuntu on two different Azure instances and could not replicate it.

@amshinde
Member

@jodh-intel If you have the setup handy, can you let me know if you see this log in your runs:
https://github.com/kata-containers/runtime/blob/bcb38548f9498a06ca50b160b40cc5327c7e6e65/pkg/rootless/rootless.go#L99
I want to see if running rootless is detected correctly on your platform.

@amshinde
Member

amshinde commented Dec 3, 2019

So, I have tried the steps on two different Ubuntu instances on Azure cloud as well as a Fedora 30 VM using ccloudvm. I have been able to run a rootless container with Kata on all of them.

@jodh-intel The net=none bug is something that needs to be fixed in podman and is not a very interesting case for rootless.
The other bug that you mentioned with networking could be due to using an older kata-agent in the image. I have seen those errors in the past with rooted containers when using an older Kata image.
I can only speculate at this point without seeing any detailed logs.

As far as I have tested with the specified versions and the latest instructions in here, this should work.
It would be great if @devimc and @jodh-intel could give it another shot.

I also feel that we should try to minimise any extra steps required for rootless setup.
First, disabling vhost-net for rootless, is something that could be done by the runtime itself imho.
I'll raise a PR for this.

Second, changing the permissions on the Kata image. I have changed the instructions slightly so that they change the group ownership of the image to the kvm group and add rw permissions for the group, since the user is required to be part of that group. What if we did this by default?
How about having our packaging scripts do this, so that the OBS packages and static tarballs ship the image (or the /usr/share/kata-containers directory) with kvm group ownership?
cc @egernst for the security implications of doing so.
I think in the long run, it will be good to have Kata run out of the box, instead of having these extra setup steps that need sudo. Would like more input on this.
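
As a rough illustration of the image permission step described above (a sketch only: `grant_group_rw` is a hypothetical helper, and the actual image file name under /usr/share/kata-containers is not given in this thread):

```bash
# Give a group read/write access to a file by changing its
# group owner and then adding group rw to its mode.
grant_group_rw() {
    local group="$1" path="$2"
    chgrp "$group" "$path" && chmod g+rw "$path"
}
```

On a real system this would need root for files under /usr/share/kata-containers, with `kvm` as the group and the image file as the path.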

Lastly, we do need to disable SELinux. This causes issues not only for rootless containers but also for rooted ones. We should evaluate what is required to make Kata work with SELinux.
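
Before changing anything, it may help to see what state SELinux is actually in. The sketch below assumes the standard `getenforce` tool; `selinux_mode` is a hypothetical helper name, and it reports `absent` on distros that do not ship the SELinux tools.

```bash
# Print the SELinux mode reported by getenforce
# (Enforcing, Permissive or Disabled), or "absent"
# when the getenforce tool is not installed.
selinux_mode() {
    if command -v getenforce >/dev/null 2>&1; then
        getenforce
    else
        echo "absent"
    fi
}
```

For example: `[ "$(selinux_mode)" = "Enforcing" ] && echo "SELinux is enforcing; Kata may fail to start"`.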

@jodh-intel @egernst Can you please review this doc again so that we can merge it?

>
> If installing Podman with a package manager, there is usually no need to
> install `slirp4netns` separately.
> You will need to install Kata Containers from source today for rootless support.
Contributor

I installed using kata-deploy using the 1.10.0-alpha1 tag so this is not technically true. You could rephrase to "You will need to install Kata Containers from source today for rootless support, or you could use kata-deploy with a 1.10.0-alpha1 tag."

| ----------------|:-------------:|---------------------|
| Podman | 1.6.2 | [see here](https://github.com/containers/libpod/blob/master/install.md)
| `slirp4netns` | 0.4.0 | [see here](https://github.com/rootless-containers/slirp4netns#quick-start)
| Kata Containers | 1.10.0-alpha1 | [see here](https://github.com/kata-containers/documentation/blob/master/install/README.md)
Contributor

I think listing the supported hypervisors here makes a lot of sense. We know this probably won't work with ACRN but should work with qemu and firecracker right? Maybe list those as two rows. I'm not sure if the version of the hypervisor matters but here is one way to do it.

Kata Containers 1.10.0-alpha1
Qemu Hypervisor xxxxversion
Firecracker Hypervisor xxxxversion

Member

Since we have verified this with qemu itself, I have added that to the doc.


### Add user to KVM group

If running a KVM based hypervisor, add the user running the workload to the KVM group:
Contributor

The way this was worded was confusing to me because I thought it was asking if I was already in a VM trying to enable this. I'd reword to something like "Currently only KVM based hypervisors for rootless Podman are supported. To enable rootless support, the user running the workload needs to be added to the KVM group"
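
A small sketch of how a user could confirm the group membership this section asks for. `in_group_list` is a hypothetical helper; it only inspects the list of group names it is given, so the caller decides where that list comes from (for example `id -nG`).

```bash
# Succeed if group name $1 appears as a whole word in the
# space-separated list of group names $2.
in_group_list() {
    local group="$1" list="$2"
    printf '%s\n' "$list" | tr ' ' '\n' | grep -qx -- "$group"
}
```

For example: `in_group_list kvm "$(id -nG)" || echo "current user is not in the kvm group"`. Note that a newly added group usually only takes effect after logging in again.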

>
> `kvm` should be the group owning the device node `/dev/kvm` on most distros.
> Make sure permissions on `/dev/kvm` are as shown below:
> ```
Contributor

My permissions didn't exactly match this and were crw-rw-rw-. I suggest rewording to "Make sure the minimum permissions on /dev/kvm are as shown below".
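
To make the "minimum permissions" idea concrete, here is a sketch that accepts any mode granting the group read and write, so both crw-rw---- and crw-rw-rw- pass. `mode_has_group_rw` is a hypothetical helper that works on the mode string printed by `ls -l`; it uses bash substring expansion.

```bash
# Succeed if a mode string as printed by `ls -l`
# (for example "crw-rw----") grants the group read and write.
# Characters 5 and 6 of the string are the group r and w bits.
mode_has_group_rw() {
    local mode="$1"
    [ "${mode:4:2}" = "rw" ]
}
```

For example: `mode_has_group_rw "$(ls -l /dev/kvm | cut -d' ' -f1)" || echo "/dev/kvm is not group read/write"`.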

Member

Fixed

gabi beyer and others added 7 commits December 18, 2019 13:12
Documentation for running Kata Containers with Podman as a non
privileged user

Fixes: kata-containers#540

Signed-off-by: gabi beyer <[email protected]>
Changes:

- Spaced out the document for clarity.
- Folded long lines.
- Removed unnecessary numbers in section names.
- Improved formatting.

Signed-off-by: James O. D. Hunt <[email protected]>
Apply the remaining review feedback to the podman doc.

Signed-off-by: James O. D. Hunt <[email protected]>
Disabling SELinux should not be undertaken lightly and users should
understand the implications before doing so, hence add a warning.

Signed-off-by: James O. D. Hunt <[email protected]>
Remove a redundant blank line.

Signed-off-by: James O. D. Hunt <[email protected]>
Fix some obvious code errors in the podman doc.

Signed-off-by: James O. D. Hunt <[email protected]>
Update required versions for podman and Kata required for running
rootless.

rootless: Make sure /dev/kvm has group ownership as kvm

rootless: Add explanation for var XDG_RUNTIME_DIR
Add a note to explicitly check if this variable is set.

Signed-off-by: Archana Shinde <[email protected]>
@jodh-intel jodh-intel force-pushed the add-podman-docs branch 4 times, most recently from 40f452f to 7ff4418 Compare December 18, 2019 15:36
@jodh-intel
Contributor Author

@amshinde - I've cleaned up the doc and re-tested with:

  • F30
  • distro-packaged podman and slirp4netns.
  • latest runtime (from git).

... but I still get EPERM:

$ podman run --log-level=debug --runtime=/usr/local/bin/kata-runtime -ti busybox sh

    :

DEBU[0000] running conmon: /usr/bin/conmon               args="[--api-version 1 -c ee1b59cfd1cf68783d5eab922795b37a0b0b7f9034e266cab691e4fe62582917 -u ee1b59cfd1cf68783d5eab922795b37a0b0b7f903
4e266cab691e4fe62582917 -r /usr/local/bin/kata-runtime -b /home/james/.local/share/containers/storage/overlay-containers/ee1b59cfd1cf68783d5eab922795b37a0b0b7f9034e266cab691e4fe62582917/userda
ta -p /run/user/1000/overlay-containers/ee1b59cfd1cf68783d5eab922795b37a0b0b7f9034e266cab691e4fe62582917/userdata/pidfile -l k8s-file:/home/james/.local/share/containers/storage/overlay-contai
ners/ee1b59cfd1cf68783d5eab922795b37a0b0b7f9034e266cab691e4fe62582917/userdata/ctr.log --exit-dir /run/user/1000/libpod/tmp/exits --socket-dir-path /run/user/1000/libpod/tmp/socket --log-level
 debug --syslog -t --conmon-pidfile /run/user/1000/overlay-containers/ee1b59cfd1cf68783d5eab922795b37a0b0b7f9034e266cab691e4fe62582917/userdata/conmon.pid --exit-command /usr/bin/podman --exit
-command-arg --root --exit-command-arg /home/james/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/1000 --exit-command-arg --log-level --exit-command-
arg debug --exit-command-arg --cgroup-manager --exit-command-arg cgroupfs --exit-command-arg --tmpdir --exit-command-arg /run/user/1000/libpod/tmp --exit-command-arg --runtime --exit-command-a
rg /usr/local/bin/kata-runtime --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --storage-opt --exit-command-arg overlay.mount_program=/bin/fuse-overlayfs --ex
it-command-arg --events-backend --exit-command-arg journald --exit-command-arg container --exit-command-arg cleanup --exit-command-arg ee1b59cfd1cf68783d5eab922795b37a0b0b7f9034e266cab691e4fe6
2582917]"
DEBU[0002] Received: -1
DEBU[0002] Cleaning up container ee1b59cfd1cf68783d5eab922795b37a0b0b7f9034e266cab691e4fe62582917
DEBU[0002] Tearing down network namespace at /run/user/1000/netns/cni-15e74547-ddaa-52f2-071c-b6fdcd9fb19a for container ee1b59cfd1cf68783d5eab922795b37a0b0b7f9034e266cab691e4fe62582917
DEBU[0002] unmounted container "ee1b59cfd1cf68783d5eab922795b37a0b0b7f9034e266cab691e4fe62582917"
DEBU[0002] ExitCode msg: "rpc error: code = internal desc = could not run process: container_linux.go:346: starting container process caused \"process_linux.go:449: container init caused \\\"r
ootfs_linux.go:58: mounting \\\\\\\"/run/user/1000/run/kata-containers/shared/containers/ee1b59cfd1cf68783d5eab922795b37a0b0b7f9034e266cab691e4fe62582917-77ffebc66c56f81e-resolv.conf\\\\\\\" t
o rootfs \\\\\\\"/run/user/1000/run/kata-containers/shared/containers/ee1b59cfd1cf68783d5eab922795b37a0b0b7f9034e266cab691e4fe62582917/rootfs\\\\\\\" at \\\\\\\"/run/user/1000/run/kata-contain
ers/shared/containers/ee1b59cfd1cf68783d5eab922795b37a0b0b7f9034e266cab691e4fe62582917/rootfs/etc/resolv.conf\\\\\\\" caused \\\\\\\"open /run/user/1000/run/kata-containers/shared/containers/e
e1b59cfd1cf68783d5eab922795b37a0b0b7f9034e266cab691e4fe62582917/rootfs/etc/resolv.conf: operation not permitted\\\\\\\"\\\"\": oci runtime permission denied error"
ERRO[0002] rpc error: code = Internal desc = Could not run process: container_linux.go:346: starting container process caused "process_linux.go:449: container init caused \"rootfs_linux.go:58:
 mounting \\\"/run/user/1000/run/kata-containers/shared/containers/ee1b59cfd1cf68783d5eab922795b37a0b0b7f9034e266cab691e4fe62582917-77ffebc66c56f81e-resolv.conf\\\" to rootfs \\\"/run/user/100
0/run/kata-containers/shared/containers/ee1b59cfd1cf68783d5eab922795b37a0b0b7f9034e266cab691e4fe62582917/rootfs\\\" at \\\"/run/user/1000/run/kata-containers/shared/containers/ee1b59cfd1cf6878
3d5eab922795b37a0b0b7f9034e266cab691e4fe62582917/rootfs/etc/resolv.conf\\\" caused \\\"open /run/user/1000/run/kata-containers/shared/containers/ee1b59cfd1cf68783d5eab922795b37a0b0b7f9034e266c
ab691e4fe62582917/rootfs/etc/resolv.conf: operation not permitted\\\"\"": OCI runtime permission denied error

I can't spend any more time on this today but it would be useful to know what results you see atm.

@amshinde
Member

@jodh-intel Interesting, you are getting a permission denied error while mounting on /run/user/1000/run/kata-containers/shared/containers.
Can you check if you have rwx permissions for /run/user/1000?
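
One way to run that check from the shell, as a sketch (`dir_is_rwx` is a hypothetical helper; it tests access for the user running it, which is what matters for rootless):

```bash
# Succeed if the path is a directory that the current user
# can read, write and search (execute).
dir_is_rwx() {
    local dir="$1"
    [ -d "$dir" ] && [ -r "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]
}
```

For example: `dir_is_rwx "/run/user/$(id -u)" || echo "missing rwx on the runtime directory"`.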

@jodh-intel
Contributor Author

I can create a container on 18.04 using the latest runtime, but only without networking. With networking, it gives:

$ podman run --log-level=debug --rm --runtime=/usr/local/bin/kata-runtime -ti alpine sh

        :

DEBU[0000] running conmon: /usr/bin/conmon               args="[--api-version 1 -c eda58c3bd4a21fa8e625026d7a35a7814eaae58fca51dbabce697d08a134a35d -u eda58c3bd4a21fa8e625026d7a35a7814eaae58fca51dbabce697d08a134a35d -r /usr/local/bin/kata-runtime -b /home/james/.local/share/containers/storage/vfs-containers/eda58c3bd4a21fa8e625026d7a35a7814eaae58fca51dbabce697d08a134a35d/userdata -p /run/user/1000/vfs-containers/eda58c3bd4a21fa8e625026d7a35a7814eaae58fca51dbabce697d08a134a35d/userdata/pidfile -l k8s-file:/home/james/.local/share/containers/storage/vfs-containers/eda58c3bd4a21fa8e625026d7a35a7814eaae58fca51dbabce697d08a134a35d/userdata/ctr.log --exit-dir /run/user/1000/libpod/tmp/exits --socket-dir-path /run/user/1000/libpod/tmp/socket --log-level debug --syslog -t --conmon-pidfile /run/user/1000/vfs-containers/eda58c3bd4a21fa8e625026d7a35a7814eaae58fca51dbabce697d08a134a35d/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /home/james/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/1000 --exit-command-arg --log-level --exit-command-arg debug --exit-command-arg --cgroup-manager --exit-command-arg cgroupfs --exit-command-arg --tmpdir --exit-command-arg /run/user/1000/libpod/tmp --exit-command-arg --runtime --exit-command-arg /usr/local/bin/kata-runtime --exit-command-arg --storage-driver --exit-command-arg vfs --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg container --exit-command-arg cleanup --exit-command-arg --rm --exit-command-arg eda58c3bd4a21fa8e625026d7a35a7814eaae58fca51dbabce697d08a134a35d]"
WARN[0000] Failed to add conmon to cgroupfs sandbox cgroup: error creating cgroup for memory: mkdir /sys/fs/cgroup/memory/libpod_parent: permission denied
DEBU[0000] Received: -1
DEBU[0000] Cleaning up container eda58c3bd4a21fa8e625026d7a35a7814eaae58fca51dbabce697d08a134a35d
DEBU[0000] Tearing down network namespace at /run/user/1000/netns/cni-547e148d-9fb4-a148-3d62-0bafb869236a for container eda58c3bd4a21fa8e625026d7a35a7814eaae58fca51dbabce697d08a134a35d
DEBU[0000] unmounted container "eda58c3bd4a21fa8e625026d7a35a7814eaae58fca51dbabce697d08a134a35d"
DEBU[0000] Cleaning up container eda58c3bd4a21fa8e625026d7a35a7814eaae58fca51dbabce697d08a134a35d
DEBU[0000] Network is already cleaned up, skipping...
DEBU[0000] Container eda58c3bd4a21fa8e625026d7a35a7814eaae58fca51dbabce697d08a134a35d storage is already unmounted, skipping...
DEBU[0000] Container eda58c3bd4a21fa8e625026d7a35a7814eaae58fca51dbabce697d08a134a35d storage is already unmounted, skipping...
DEBU[0000] ExitCode msg: "network device mode not determined correctly. mount sysfs in caller: oci runtime error"
ERRO[0000] Network device mode not determined correctly. Mount sysfs in caller: OCI runtime error

Attempting the same on F30 fails in all scenarios, even after a chmod -R 777 frenzy:

$ chmod 777 -R /run/user/1000
$ podman run --log-level=debug --runtime=/usr/local/bin/kata-runtime -ti busybox sh

        :


DEBU[0000] running conmon: /usr/bin/conmon               args="[--api-version 1 -c a0d1af0363001f2b2d3fdfc4332a09ea94a708c56854d752a761a4a6579b176a -u a0d1af0363001f2b2d3fdfc4332a09ea94a708c56
854d752a761a4a6579b176a -r /usr/local/bin/kata-runtime -b /home/james/.local/share/containers/storage/overlay-containers/a0d1af0363001f2b2d3fdfc4332a09ea94a708c56854d752a761a4a6579b176a/userda
ta -p /run/user/1000/overlay-containers/a0d1af0363001f2b2d3fdfc4332a09ea94a708c56854d752a761a4a6579b176a/userdata/pidfile -l k8s-file:/home/james/.local/share/containers/storage/overlay-contai
ners/a0d1af0363001f2b2d3fdfc4332a09ea94a708c56854d752a761a4a6579b176a/userdata/ctr.log --exit-dir /run/user/1000/libpod/tmp/exits --socket-dir-path /run/user/1000/libpod/tmp/socket --log-level
 debug --syslog -t --conmon-pidfile /run/user/1000/overlay-containers/a0d1af0363001f2b2d3fdfc4332a09ea94a708c56854d752a761a4a6579b176a/userdata/conmon.pid --exit-command /usr/bin/podman --exit
-command-arg --root --exit-command-arg /home/james/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/1000 --exit-command-arg --log-level --exit-command-
arg debug --exit-command-arg --cgroup-manager --exit-command-arg cgroupfs --exit-command-arg --tmpdir --exit-command-arg /run/user/1000/libpod/tmp --exit-command-arg --runtime --exit-command-a
rg /usr/local/bin/kata-runtime --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --storage-opt --exit-command-arg overlay.mount_program=/bin/fuse-overlayfs --ex
it-command-arg --events-backend --exit-command-arg journald --exit-command-arg container --exit-command-arg cleanup --exit-command-arg a0d1af0363001f2b2d3fdfc4332a09ea94a708c56854d752a761a4a65
79b176a]"
DEBU[0001] Received: -1
DEBU[0001] Cleaning up container a0d1af0363001f2b2d3fdfc4332a09ea94a708c56854d752a761a4a6579b176a
DEBU[0001] Tearing down network namespace at /run/user/1000/netns/cni-60233ed6-2bb6-aca1-3362-d26e46d693de for container a0d1af0363001f2b2d3fdfc4332a09ea94a708c56854d752a761a4a6579b176a
DEBU[0001] unmounted container "a0d1af0363001f2b2d3fdfc4332a09ea94a708c56854d752a761a4a6579b176a"
DEBU[0001] ExitCode msg: "rpc error: code = internal desc = could not run process: container_linux.go:346: starting container process caused \"process_linux.go:449: container init caused \\\"r
ootfs_linux.go:58: mounting \\\\\\\"/run/user/1000/run/kata-containers/shared/containers/a0d1af0363001f2b2d3fdfc4332a09ea94a708c56854d752a761a4a6579b176a-2a39472b6ed3d748-resolv.conf\\\\\\\" t
o rootfs \\\\\\\"/run/user/1000/run/kata-containers/shared/containers/a0d1af0363001f2b2d3fdfc4332a09ea94a708c56854d752a761a4a6579b176a/rootfs\\\\\\\" at \\\\\\\"/run/user/1000/run/kata-contain
ers/shared/containers/a0d1af0363001f2b2d3fdfc4332a09ea94a708c56854d752a761a4a6579b176a/rootfs/etc/resolv.conf\\\\\\\" caused \\\\\\\"open /run/user/1000/run/kata-containers/shared/containers/a
0d1af0363001f2b2d3fdfc4332a09ea94a708c56854d752a761a4a6579b176a/rootfs/etc/resolv.conf: operation not permitted\\\\\\\"\\\"\": oci runtime permission denied error"
ERRO[0001] rpc error: code = Internal desc = Could not run process: container_linux.go:346: starting container process caused "process_linux.go:449: container init caused \"rootfs_linux.go:58:
 mounting \\\"/run/user/1000/run/kata-containers/shared/containers/a0d1af0363001f2b2d3fdfc4332a09ea94a708c56854d752a761a4a6579b176a-2a39472b6ed3d748-resolv.conf\\\" to rootfs \\\"/run/user/100
0/run/kata-containers/shared/containers/a0d1af0363001f2b2d3fdfc4332a09ea94a708c56854d752a761a4a6579b176a/rootfs\\\" at \\\"/run/user/1000/run/kata-containers/shared/containers/a0d1af0363001f2b
2d3fdfc4332a09ea94a708c56854d752a761a4a6579b176a/rootfs/etc/resolv.conf\\\" caused \\\"open /run/user/1000/run/kata-containers/shared/containers/a0d1af0363001f2b2d3fdfc4332a09ea94a708c56854d75
2a761a4a6579b176a/rootfs/etc/resolv.conf: operation not permitted\\\"\"": OCI runtime permission denied error

@amshinde - looking back over this PR, it appears you've only tested rootless under Azure? If so, please could you try on a baremetal or local VM environment as these are where I see problems.

- Removed `--enable-man` when building shadow tools.
- Corrected invalid command (`addgroup` -> `groupadd`).
- Set trusted group to `kvm`.
- Added a distro-agnostic check for installing subuid support.
- Fixed some bash commands.
- Fixed typos and spelling mistakes.
- Fixed permissions on files.
- Fixed SELinux checks for non-SELinux distros.

Signed-off-by: James O. D. Hunt <[email protected]>
@amshinde
Member

@jodh-intel I had tried this on Ubuntu 18.04 on Azure and on a local Fedora 30 VM using ccloudvm.
Your network error seems to be caused by an older kernel:
It comes from here:
https://github.com/kata-containers/runtime/blob/dc05d7dbbf6457a41a86cce1dd41caee2ba199cc/virtcontainers/network.go#L1215

Using a newer kernel should solve that issue.

I am really not sure why you are having issues on Fedora. I would need access to a machine to debug this. Perhaps you can set this up on a machine on the company network?
If you don't have one, perhaps @grahamwhaley can lend one of the machines Eric gave him.

@eadamsintel
Contributor

As an FYI I verified this works on Fedora 30 bare metal and Clear Linux as well. I had to build podman and slirp4netns from source for Clear Linux. I am working with the Clear team to get podman and slirp4netns added to a bundle but it will take some time.

@jodh-intel
Contributor Author

@amshinde - The original doc (which I'm simply supposed to be updating on this PR btw :) states the host kernel needs to be 4.14+. Bionic uses a 4.15.0-72 kernel so please can you check in a non-cloud environment?

@jodh-intel
Contributor Author

Progress! I don't know exactly what the problem was, but it was related to permissions inside (?) the cached docker images podman was using.

I noticed that the following worked on F30:

$ podman run --log-level=debug --runtime=/usr/local/bin/kata-runtime --net=none -ti --rootfs ~/tmp/bundle/rootfs sh

I then zapped my images (should have backed these up first, but... ):

$ podman rmi -f alpine
$ podman rmi -f busybox

... and I can now run the following (with networking) for example on F30:

$ podman run --log-level=debug --runtime=/usr/local/bin/kata-runtime -ti alpine sh
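
The isolation steps above (try without networking first, then with it) can be sketched as a small script. This is an illustration, not a tested procedure: the runtime path and image name are copied from the commands in this thread, while `PODMAN`, `try_run` and `diagnose` are hypothetical names introduced here, and the printed conclusions are only rough hints.

```shell
#!/usr/bin/env bash
# Sketch only. RUNTIME and IMAGE default to the values used in this thread;
# PODMAN is a hypothetical override so the logic can be tried without podman.
PODMAN="${PODMAN:-podman}"
RUNTIME="${RUNTIME:-/usr/local/bin/kata-runtime}"
IMAGE="${IMAGE:-alpine}"

# Run a trivial command in a container; extra arguments go to "podman run".
try_run() {
    "$PODMAN" run --rm --runtime="$RUNTIME" "$@" "$IMAGE" true
}

# Compare a run without networking against a run with the default network.
diagnose() {
    if ! try_run --net=none; then
        echo "fails even without networking: suspect the image or storage"
    elif ! try_run; then
        echo "works with --net=none only: suspect the network setup"
    else
        echo "works with networking"
    fi
}

if command -v "$PODMAN" >/dev/null 2>&1; then
    diagnose
else
    echo "$PODMAN not found; skipping"
fi
```

Adding `--log-level=debug` to the failing case, as in the commands in this comment, gives the detailed output needed for a bug report.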

However, I still cannot get networking to work on Ubuntu 18.04 (even after zapping the podman images):

$ podman run --log-level=debug --rm --runtime=/usr/local/bin/kata-runtime -ti alpine sh

        :

DEBU[0004] /usr/bin/conmon messages will be logged to syslog
DEBU[0004] running conmon: /usr/bin/conmon               args="[--api-version 1 -c b9a23d5668025039d1dde04972f2f3fec39e6df1206954cd7c29243e819af0b0 -u b9a23d5668025039d1dde04972f2f3fec39e6df1206954cd7c29243e819af0b0 -r /usr/local/bin/kata-runtime -b /home/james/.local/share/containers/storage/vfs-containers/b9a23d5668025039d1dde04972f2f3fec39e6df1206954cd7c29243e819af0b0/userdata -p /run/user/1000/vfs-containers/b9a23d5668025039d1dde04972f2f3fec39e6df1206954cd7c29243e819af0b0/userdata/pidfile -l k8s-file:/home/james/.local/share/containers/storage/vfs-containers/b9a23d5668025039d1dde04972f2f3fec39e6df1206954cd7c29243e819af0b0/userdata/ctr.log --exit-dir /run/user/1000/libpod/tmp/exits --socket-dir-path /run/user/1000/libpod/tmp/socket --log-level debug --syslog -t --conmon-pidfile /run/user/1000/vfs-containers/b9a23d5668025039d1dde04972f2f3fec39e6df1206954cd7c29243e819af0b0/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /home/james/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/1000 --exit-command-arg --log-level --exit-command-arg debug --exit-command-arg --cgroup-manager --exit-command-arg cgroupfs --exit-command-arg --tmpdir --exit-command-arg /run/user/1000/libpod/tmp --exit-command-arg --runtime --exit-command-arg /usr/local/bin/kata-runtime --exit-command-arg --storage-driver --exit-command-arg vfs --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg container --exit-command-arg cleanup --exit-command-arg --rm --exit-command-arg b9a23d5668025039d1dde04972f2f3fec39e6df1206954cd7c29243e819af0b0]"
WARN[0004] Failed to add conmon to cgroupfs sandbox cgroup: error creating cgroup for cpu: mkdir /sys/fs/cgroup/cpu/libpod_parent: permission denied
DEBU[0004] Received: -1
DEBU[0005] Cleaning up container b9a23d5668025039d1dde04972f2f3fec39e6df1206954cd7c29243e819af0b0
DEBU[0005] Tearing down network namespace at /run/user/1000/netns/cni-c0684446-1bfc-afb0-ad34-ea10dd16f93e for container b9a23d5668025039d1dde04972f2f3fec39e6df1206954cd7c29243e819af0b0
DEBU[0005] unmounted container "b9a23d5668025039d1dde04972f2f3fec39e6df1206954cd7c29243e819af0b0"
DEBU[0005] Cleaning up container b9a23d5668025039d1dde04972f2f3fec39e6df1206954cd7c29243e819af0b0
DEBU[0005] Network is already cleaned up, skipping...
DEBU[0005] Container b9a23d5668025039d1dde04972f2f3fec39e6df1206954cd7c29243e819af0b0 storage is already unmounted, skipping...
DEBU[0005] Container b9a23d5668025039d1dde04972f2f3fec39e6df1206954cd7c29243e819af0b0 storage is already unmounted, skipping...
DEBU[0005] ExitCode msg: "network device mode not determined correctly. mount sysfs in caller: oci runtime error"
ERRO[0005] Network device mode not determined correctly. Mount sysfs in caller: OCI runtime error

I was hoping to make this document executable, but since this can currently only work (I think) on a very limited range of distro versions, I'm not sure that's viable at this stage:

  • Ubuntu 18.04 (only - no OBS packages for newer versions, but that work is in progress).
  • Fedora 30 (F31 contains Kata packages, but they do not seem to be fully functional - see "install: Update Fedora instructions", #578).

@amshinde
Member

@jodh-intel Can you send me your kernel config?

@jodh-intel
Contributor Author

@amshinde - I'm using vanilla F30. Sent you the kernel config.

@nitkon
Contributor

nitkon commented Mar 31, 2020

@jodh-intel: How did you get rid of the network issue? I am hitting the same problem on CoreOS. It works with --net=none.

[core@localhost ~]$ podman run -it --runtime=/home/nitesh/static-kata-artifacts/kata-runtime ubuntu:18.04 /bin/bash

Error: rpc error: code = Internal desc = Could not add fe80::a04f:37ff:fecd:36a9/64 to interface &{{2 1500 1000 eth0 a2:4f:37:cd:36:a9 broadcast|multicast 4098 0 0 <nil>  0xc0002063b8 0 0xc000093a00 ether <nil> down 0 0 0 []}}: operation not supported: OCI runtime error

@GabyCT
Collaborator

GabyCT commented Apr 7, 2020

@jodh-intel what is the status of this doc? We currently have a podman CI running with Kata. Does this doc need more improvements? Thanks

@fidencio
Member

Adding myself and @c3d here.
This document has to be updated: SELinux is now supported (for F32+ and Kata 1.11.0-rc0+).

In any case, a final review of this one would be good, considering the current scenarios (this is almost a note to self).

@jodh-intel
Contributor Author

Hi @fidencio - I'm currently updating this PR. It's probably going to be split into distro-specific instructions so that it fits the current install/docker/* model we're using.

@jodh-intel
Contributor Author

Closing as we have now released 2.0. We'd like to interoperate with podman for 2.x but cannot until that project supports the shimv2 architecture.

@jodh-intel jodh-intel closed this Oct 21, 2020

9 participants