Don't always enable rootless mode in userns #1837

AkihiroSuda · 2018-07-03T03:58:57Z

Problem

In #1688 we broke "Docker-in-LXD":

$ lxc launch ubuntu:18.04 foo -c security.nesting=true
$ lxc shell foo
foo# curl -fsSL get.docker.com -o get-docker.sh && sh get-docker.sh
foo# exit
$ lxc file push /usr/local/sbin/runc foo/usr/bin/docker-runc
$ lxc shell foo
foo# cat /proc/self/uid_map
(we are in userns here)
foo# docker run -it --rm busybox
docker: Error response from daemon: OCI runtime create failed: cannot specify gid= mount options for unmapped gid in rootless containers: unknown.

This happens because runc enables "rootless mode" whenever it runs in a user namespace, but "Docker-in-LXD" does not expect runc to enable rootless mode.

What "rootless mode" does actually

For "Docker-in-LXD", we need none of them, because runc is already executed in userns and cgroups is also available.

In #1688, we enabled the "rootless mode" in userns so as to support rootless img/buildkit/buildah/containerd/docker/podman, but actually we only need 1, 2, and 3 for these usecases.

Proposal

Step 1: fix Docker-in-LXD regression (PR: #1833 / Closed)

Change isRootless as follows:

func isRootless(context *cli.Context) (bool, error) {
  if context != nil {
  ...
  }
  // Fall back to $USER: any non-empty value other than "root" means rootless.
  u := os.Getenv("USER")
  return u != "" && u != "root", nil
}

When runc is executed in userns via Docker-in-LXD, isRootless() returns false, because LXD would set $USER to "root".
When runc is executed in userns via rootless img/buildkit/buildah/containerd/docker/podman, isRootless() returns true, because we don't change environment variables after unsharing the userns and mapping UID=0 to the current user.
When runc is executed as a regular user in the initial namespace, isRootless() returns true.
When runc is executed as root in the initial namespace, isRootless() returns false.

Corner cases:

When runc is executed in userns via Docker-in-rootless-Docker ("rootless dind") as root in the container, isRootless() returns false, and runc is unlikely to work. The dockerd in the container would need to specify runc --rootless explicitly in this case. (And the user would probably need to launch dockerd with --rootless explicitly.)
When runc is executed in "rootless dind" as a non-root user in the container, isRootless() returns true, and runc is likely to work.
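The cases listed for Step 1 can be exercised with a small standalone sketch. This is not runc's code: rootlessFromUser is a hypothetical helper that takes the value of $USER as a parameter instead of reading the environment, and "alice" is a made-up user name.

```go
package main

import "fmt"

// rootlessFromUser mirrors the $USER-based fallback proposed in Step 1.
// Hypothetical helper: it takes $USER as an argument so the cases can be
// checked without modifying the real environment.
func rootlessFromUser(user string) bool {
	return user != "" && user != "root"
}

func main() {
	cases := []struct {
		scenario string
		user     string // value of $USER seen by runc ("alice" is made up)
	}{
		{"Docker-in-LXD (userns, $USER=root)", "root"},
		{"rootless tool (userns, $USER preserved)", "alice"},
		{"regular user, initial namespace", "alice"},
		{"root, initial namespace", "root"},
		{"$USER unset", ""},
	}
	for _, c := range cases {
		fmt.Printf("%-42s rootless=%v\n", c.scenario, rootlessFromUser(c.user))
	}
}
```

Note that an unset $USER is treated the same as root by this check, so a caller that clears the environment would have to pass --rootless explicitly.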

Step 2: refactor the rootless mode: (PR: #1862)

Honor $XDG_RUNTIME_DIR

Probably, this needs to be honored when u := os.Getenv("USER"); u != "" && u != "root".
(Note that we shouldn't check the UID in the current namespace, because we still want to honor $XDG_RUNTIME_DIR after unsharing the userns and mapping UID=0 to the current user.)

Or maybe we can always honor this variable, but that potentially breaks compatibility when runc is executed as UID=0, $USER=root, $XDG_RUNTIME_DIR=/run/user/0.
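A minimal sketch of the first option, where the variable is honored only when $USER is set and is not root. The helper name stateRoot, the /run/runc default, and the runc subdirectory under $XDG_RUNTIME_DIR are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// defaultRoot is assumed to be the state directory used when
// $XDG_RUNTIME_DIR is not honored.
const defaultRoot = "/run/runc"

// stateRoot is a hypothetical helper: it picks the state directory from
// the values of $USER and $XDG_RUNTIME_DIR, passed in as arguments.
// It deliberately does not look at the UID, so it keeps working after
// UID=0 has been mapped to the current user inside a userns.
func stateRoot(user, xdgRuntimeDir string) string {
	if user != "" && user != "root" && xdgRuntimeDir != "" {
		return filepath.Join(xdgRuntimeDir, "runc")
	}
	return defaultRoot
}

func main() {
	fmt.Println(stateRoot("alice", "/run/user/1000")) // honored
	fmt.Println(stateRoot("root", "/run/user/0"))     // ignored: $USER is root
	fmt.Println(stateRoot("alice", ""))               // ignored: variable unset
}
```

With this rule the UID=0, $USER=root, $XDG_RUNTIME_DIR=/run/user/0 case keeps using the default directory, which avoids the compatibility concern of the "always honor" alternative.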

Switch the cgroup manager to libcontainer.RootlessCgroupfs

We should detect cgroup availability explicitly, probably by trying mkdir /sys/fs/cgroup/foo/bar or something similar. I guess the overhead is negligible.
Or just remove libcontainer.RootlessCgroupfs manager and ignore all errors.
We can just safely use libcontainer.RootlessCgroupfs when we are not the root in the initial namespace.

Disable cgroup-specific features such as runc ps and OOM notification

We could implement runc ps without using cgroups. (in another PR in future)

Disable runc checkpoint and runc restore

I'm not familiar with CRIU, but I guess we only need to disable them when runc is executed with a non-zero UID, regardless of whether we are in the initial namespace or in a userns.

Make sure config.json contains userns and id mappings if euid != 0

Make sure config.json does not contain uid= and gid= for mounts

Write "deny" to /proc/$PID/setgroups if single-entry mapping is specified

We need to apply these only when runc is executed with a non-zero UID (TODO: check capabilities instead?), regardless of whether we are in the initial namespace or in a userns.
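As an illustration of the mount check, here is a rough sketch that scans a mount's option list for uid= and gid= entries. findIDMountOptions is a hypothetical helper, not the validator runc ships, and the option list in main is a made-up example.

```go
package main

import (
	"fmt"
	"strings"
)

// findIDMountOptions is a hypothetical helper: it returns the mount
// options that set an explicit uid= or gid=, which are the options the
// rootless validation is concerned with.
func findIDMountOptions(opts []string) []string {
	var found []string
	for _, opt := range opts {
		if strings.HasPrefix(opt, "uid=") || strings.HasPrefix(opt, "gid=") {
			found = append(found, opt)
		}
	}
	return found
}

func main() {
	// Made-up option list for illustration.
	opts := []string{"nosuid", "noexec", "mode=0620", "gid=5"}
	if bad := findIDMountOptions(opts); len(bad) > 0 {
		fmt.Println("mount options to reject or verify:", bad)
	}
}
```

A real check would also need to decide whether the referenced ID is mapped in the container's user namespace, which is what the error message quoted under "Problem" refers to.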


AkihiroSuda · 2018-07-03T04:02:41Z

cc @cyphar @brauner @jessfraz @giuseppe @williammartin @teddyking @julz

AkihiroSuda · 2018-07-05T03:56:05Z

Step 1: fix Docker-in-LXD regression (PR: #1833)
Step 2: refactor the rootless mode

If #1833 is not mergeable, maybe we should skip Step 1 and go to Step 2 directly

( UPDATE: POC for Step 2 is ready: https://github.com/AkihiroSuda/runc/commits/decompose-rootless )

danail-branekov · 2018-07-05T14:17:25Z

@AkihiroSuda
With #1833 updated, your POC branch looks good with CF Garden: our acceptance tests are green.

AkihiroSuda · 2018-08-10T07:13:38Z

Closed the "Step 1" PR and opened the "Step 2" PR: #1862

dhiltonp · 2018-09-27T05:37:42Z

I've been working on getting docker on ChromeOS.

I merged #1862 into master (2adb837), and I'm up and running!

Thanks for your work, @cyphar and @AkihiroSuda!

AkihiroSuda mentioned this issue Jul 3, 2018

rootless: fix Docker-in-LXD regression #1833

Closed

This was referenced Jul 3, 2018

add support for --rootless containerd/go-runc#43

Merged

rootless: optional support for generating config with subuid map #1692

Closed

AkihiroSuda mentioned this issue Jul 5, 2018

Proposal: allow running dockerd as an unprivileged user (aka rootless mode) moby/moby#37375

Closed

AkihiroSuda mentioned this issue Jul 24, 2018

Rootless build: cannot specify gid= mount options for unmapped gid in rootless containers containers/podman#1147

Closed

AkihiroSuda mentioned this issue Aug 10, 2018

Disable rootless mode except RootlessCgMgr when executed as the root in userns (fix Docker-in-LXD regression) #1862

Merged

kolyshkin mentioned this issue Sep 11, 2018

Can't restore containers moby/moby#35691

Closed

mrunalp closed this as completed in #1862 Oct 16, 2018
