Don't always enable rootless mode in userns #1837
Comments
If #1833 is not mergeable, maybe we should skip Step 1 and go to Step 2 directly. (UPDATE: POC for Step 2 is ready: https://github.com/AkihiroSuda/runc/commits/decompose-rootless)
@AkihiroSuda
Closed the "Step 1" PR and opened the "Step 2" PR: #1862
I've been working on getting docker on ChromeOS. I merged #1862 into master (2adb837), and I'm up and running! Thanks for your work, @cyphar and @AkihiroSuda!
Problem
In #1688 we broke "Docker-in-LXD":
This happens because runc enables the "rootless mode" whenever it runs in a user namespace, but "Docker-in-LXD" does not expect runc to enable it.
What "rootless mode" actually does
1. Honors $XDG_RUNTIME_DIR
2. Uses the libcontainer.RootlessCgroupfs cgroup manager
3. Disables runc ps and OOM notification
4. Disables runc checkpoint and runc restore
5. Requires that config.json contains userns and ID mappings if euid != 0
6. Requires that config.json does not contain uid= and gid= for mounts
7. Writes "deny" to /proc/$PID/setgroups if a single-entry mapping is specified
For "Docker-in-LXD", we need none of them, because runc is already executed in a userns and cgroups are also available.
In #1688, we enabled the "rootless mode" in userns to support rootless img/buildkit/buildah/containerd/docker/podman, but these use cases actually need only 1, 2, and 3.
Proposal
Step 1: fix Docker-in-LXD regression (PR: #1833 / Closed)
Change isRootless as follows:
- isRootless() returns false, because LXD would set $USER to "root".
- isRootless() returns true, because we don't change environment variables after unsharing the userns and mapping UID=0 to the current user.
- isRootless() returns true
- isRootless() returns false
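A minimal sketch of a check that would produce the results listed above. This is not the code from #1833: the $USER condition is the one quoted later in this issue, while the euid != 0 clause and the name isRootlessSketch are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// isRootlessSketch is a hypothetical reconstruction, not the code from #1833.
// It returns true when the effective UID is non-zero (assumed clause), or when
// $USER names a non-root user. $USER is left untouched by unsharing a userns
// and mapping UID=0 to the current user, so that case is still detected.
func isRootlessSketch(euid int, user string) bool {
	if euid != 0 {
		return true
	}
	return user != "" && user != "root"
}

func main() {
	fmt.Println(isRootlessSketch(os.Geteuid(), os.Getenv("USER")))
}
```

With these inputs, UID 0 with $USER=root (what LXD sets) gives false, and UID 0 with $USER set to an ordinary user name gives true.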
Corner cases:
- isRootless() returns false, and is unlikely to work. The dockerd in the container would need to specify runc --rootless explicitly in this case. (And the user would probably need to launch dockerd with --rootless explicitly.)
- isRootless() returns true and is likely to work.
Step 2: refactor the rootless mode (PR: #1862)
$XDG_RUNTIME_DIR probably needs to be honored when u := os.Getenv("USER"); u != "" && u != "root". (Note that we shouldn't check the UID in the current namespace, because we still want to honor $XDG_RUNTIME_DIR after unsharing the userns and mapping UID=0 to the current user.)
Or maybe we can always honor this variable, but that potentially breaks compatibility when runc is executed as UID=0 with $USER=root and $XDG_RUNTIME_DIR=/run/user/0.
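A rough sketch of the first option, where $XDG_RUNTIME_DIR is honored only when $USER is set and is not "root". The helper name stateRoot, the defaultRoot constant, and the "runc" subdirectory are assumptions for illustration, not taken from either PR.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// defaultRoot is an assumed default state directory, used for illustration.
const defaultRoot = "/run/runc"

// stateRoot is a hypothetical helper. It honors xdgRuntimeDir only when user
// is non-empty and not "root", mirroring the condition quoted in the issue.
// It deliberately ignores the UID in the current namespace.
func stateRoot(user, xdgRuntimeDir string) string {
	if xdgRuntimeDir != "" && user != "" && user != "root" {
		return filepath.Join(xdgRuntimeDir, "runc")
	}
	return defaultRoot
}

func main() {
	fmt.Println(stateRoot(os.Getenv("USER"), os.Getenv("XDG_RUNTIME_DIR")))
}
```

Under this rule the UID=0, $USER=root, $XDG_RUNTIME_DIR=/run/user/0 combination keeps using the default directory, which is the compatibility concern raised above.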
We should detect cgroup availability explicitly, probably by trying mkdir /sys/fs/cgroup/foo/bar or something similar. I guess the overhead is negligible.
Or just remove the libcontainer.RootlessCgroupfs manager and ignore all errors.
We can just safely use libcontainer.RootlessCgroupfs when we are not the root in the initial namespace.
We could implement runc ps without using cgroups (in another PR in the future).
I'm not familiar with CRIU, but I guess we only need to disable them when runc is executed as non-zero UID, regardless of whether we are in the initial namespace or in a userns.
We need to disable them only when runc is executed as non-zero UID (TODO: check capabilities instead?), regardless of whether we are in the initial namespace or in a userns.