Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix stdio permission error in userns container #4478

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

lifubang
Copy link
Member

@lifubang lifubang commented Oct 26, 2024

Fix #4475

If the user in the container is different from current user, we need
to change the owner of stdio before we enter the user namespace, or
else we may can't access stdio in the container.

Copy link
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this test case does not reproduce the original issue (which is not being able to write to write to /dev/stderr).

What touch does is it changes the atime and mtime of /dev/stderr (using utimensat(2)) and this is what fails here as far as I can see.

@lifubang
Copy link
Member Author

lifubang commented Oct 27, 2024

It looks like this test case does not reproduce the original issue (which is not being able to write to write to /dev/stderr).

What touch does is it changes the atime and mtime of /dev/stderr (using utimensat(2)) and this is what fails here as far as I can see.

Changed, thanks. If we revert the first commit, the fail msg looks as follows:

 ✗ runc check stdio permission in userns [terminal=false]
   (in test file tests/integration/userns.bats, line 57)
     `[[ "$out" = "errormsg" ]]' failed
   runc spec (status=0):
   
   skipping file /home: cannot remap user 65534:65534 -> -1:-1
   runc start test_busybox (status=0):
   
   sh: can't create /dev/stderr: Permission denied
   --- teardown ---

@lifubang lifubang force-pushed the fix-stdiopermission-userns branch 3 times, most recently from 2b7dda7 to 7b9586a Compare October 27, 2024 01:19
Copy link
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it looks like it works with runc run but doesn't work with runc create + runc start. Why?

In case of runc create, setupIO is called with detach=true, which (together with config.Terminal=false) results in runc init using whatever stdin/stdout/stderr runc create has. In the case of this test stdout and stderr is pointing to log:

	__runc create test_busybox >log 2>&1

So, you end up with runc init which has fd 1 and 2 opened to something like /tmp/bats-run-NxJCXf/runc.mipFpg/bundle/log.

Next, runc start tells runc init to go ahead and run the binary. Which can't write to log because it belongs to another user. I'm a bit puzzled by why the error message is "can't create /dev/stderr: Permission denied" (this is probably shell trying to recreate the file), but what the code in this PR does is it changes the host file permissions to 666 (you can check that by adding ls -l log to the test case). Which is not good.

Now, if you change echo errormsg > /dev/stderr to echo errormsg >&2 (which tells sh to redirect echo output to fd 2), it starts working again.

So, it looks like neither the test case nor the fix is correct.

@lifubang
Copy link
Member Author

So it looks like it works with runc run but doesn't work with runc create + runc start. Why?

runc run Can work it is because you don't run it with -d, if without -d, runc run will not exist, and using setupProcessPipes to pipe the containers' stdio to os's stdio, you should know runc run process is not in user ns, it can access stdio.

runc run -d == runc create + runc start.

@lifubang
Copy link
Member Author

So, it looks like neither the test case nor the fix is correct.

Here is my detailed test:

root@iZj6cgggwb62cxurec74geZ:/opt/bb# # runc run can work
root@iZj6cgggwb62cxurec74geZ:/opt/bb# /root/go/src/github.com/opencontainers/runc/runc run test
errorlog
root@iZj6cgggwb62cxurec74geZ:/opt/bb# # runc run -d can't work
root@iZj6cgggwb62cxurec74geZ:/opt/bb# /root/go/src/github.com/opencontainers/runc/runc run -d test
root@iZj6cgggwb62cxurec74geZ:/opt/bb# sh: can't create /dev/stderr: Permission denied

root@iZj6cgggwb62cxurec74geZ:/opt/bb# /root/go/src/github.com/opencontainers/runc/runc delete test
root@iZj6cgggwb62cxurec74geZ:/opt/bb# # runc create & start can't work
root@iZj6cgggwb62cxurec74geZ:/opt/bb# /root/go/src/github.com/opencontainers/runc/runc create test
root@iZj6cgggwb62cxurec74geZ:/opt/bb# /root/go/src/github.com/opencontainers/runc/runc start test
root@iZj6cgggwb62cxurec74geZ:/opt/bb# sh: can't create /dev/stderr: Permission denied

root@iZj6cgggwb62cxurec74geZ:/opt/bb# /root/go/src/github.com/opencontainers/runc/runc delete test
root@iZj6cgggwb62cxurec74geZ:/opt/bb#

After using my patch, both runc run -d and runc create + start can work:

root@iZj6cgggwb62cxurec74geZ:/opt/bb# /root/go/src/github.com/opencontainers/runc/runc run -d test
root@iZj6cgggwb62cxurec74geZ:/opt/bb# errorlog

root@iZj6cgggwb62cxurec74geZ:/opt/bb# 
root@iZj6cgggwb62cxurec74geZ:/opt/bb# /root/go/src/github.com/opencontainers/runc/runc delete test
root@iZj6cgggwb62cxurec74geZ:/opt/bb# /root/go/src/github.com/opencontainers/runc/runc create test
root@iZj6cgggwb62cxurec74geZ:/opt/bb# /root/go/src/github.com/opencontainers/runc/runc start test
root@iZj6cgggwb62cxurec74geZ:/opt/bb# errorlog

root@iZj6cgggwb62cxurec74geZ:/opt/bb# /root/go/src/github.com/opencontainers/runc/runc delete test
root@iZj6cgggwb62cxurec74geZ:/opt/bb#

config.json:

{ "ociVersion": "1.0.2-dev", "process": { "terminal": false, "user": { "uid": 0, "gid": 0 }, "args": [ "sh", "-c", "echo errorlog > /dev/stderr" ], "env": [ "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin", "TERM=xterm" ], "cwd": "/", "capabilities": { "bounding": [ "CAP_AUDIT_WRITE", "CAP_KILL", "CAP_NET_BIND_SERVICE" ], "effective": [ "CAP_AUDIT_WRITE", "CAP_KILL", "CAP_NET_BIND_SERVICE" ], "permitted": [ "CAP_AUDIT_WRITE", "CAP_KILL", "CAP_NET_BIND_SERVICE" ], "ambient": [ "CAP_AUDIT_WRITE", "CAP_KILL", "CAP_NET_BIND_SERVICE" ] }, "rlimits": [ { "type": "RLIMIT_NOFILE", "hard": 1024, "soft": 1024 } ], "noNewPrivileges": true }, "root": { "path": "rootfs", "readonly": true }, "hostname": "runc", "mounts": [ { "destination": "/proc", "type": "proc", "source": "proc" }, { "destination": "/dev", "type": "tmpfs", "source": "tmpfs", "options": [ "nosuid", "strictatime", "mode=755", "size=65536k" ] }, { "destination": "/dev/pts", "type": "devpts", "source": "devpts", "options": [ "nosuid", "noexec", "newinstance", "ptmxmode=0666", "mode=0620", "gid=5" ] }, { "destination": "/dev/shm", "type": "tmpfs", "source": "shm", "options": [ "nosuid", "noexec", "nodev", "mode=1777", "size=65536k" ] }, { "destination": "/dev/mqueue", "type": "mqueue", "source": "mqueue", "options": [ "nosuid", "noexec", "nodev" ] }, { "destination": "/sys", "type": "sysfs", "source": "sysfs", "options": [ "nosuid", "noexec", "nodev", "ro" ] }, { "destination": "/sys/fs/cgroup", "type": "cgroup", "source": "cgroup", "options": [ "nosuid", "noexec", "nodev", "relatime", "ro" ] } ], "linux": { "resources": { "devices": [ { "allow": false, "access": "rwm" } ] }, "namespaces": [ { "type": "pid" }, { "type": "network" }, { "type": "ipc" }, { "type": "uts" }, { "type": "mount" }, { "type": "cgroup" }, { "type": "user" } ], "uidMappings": [{"hostID": 100000, "containerID": 0, "size": 65534}], "gidMappings": [{"hostID": 100000, "containerID": 0, "size": 65534}], "maskedPaths": [ "/proc/acpi", "/proc/asound", "/proc/kcore", "/proc/keys", "/proc/latency_stats", "/proc/timer_list", "/proc/timer_stats", "/proc/sched_debug", "/sys/firmware", "/proc/scsi" ], "readonlyPaths": [ "/proc/bus", "/proc/fs", "/proc/irq", "/proc/sys", "/proc/sysrq-trigger" ] } }

@lifubang
Copy link
Member Author

but what the code in this PR does is it changes the host file permissions to 666

This is what my patch does, because stdout and stderr refers to the host file fd now.

But this was also what my concern about, we have changed the host file's permission silently, it will cause some security issues?
Maybe this issue(#4475) is not a bug, I think if someone what to use user ns container, they should ensure the mapped user have the suitable access to the host file? And we can check this access in runc, if there is no access, we can just only raise an error?

@lifubang
Copy link
Member Author

we have changed the host file's permission silently

But this change is not in the container, it's in the runc parent process.

Copy link
Member

@cyphar cyphar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't like that we are changing host file permissions -- file passthrough stdio is already kind of dangerous (and should be avoided if possible), but this will cause a container using a file as stdio to silently change the ownership such that any user on the entire system can read and write to the file. This seems like something that is going to lead to security issues.

IMHO, starting a container with file passthrough to a file that the container process doesn't have access to is a configuration bug. Since we opened stdio as a privileged user before creating the container and dropping privileges, pid1 can already read/write to stdio (DAC permissions aren't checked during read/write). The issue is when a container process tries to re-open /proc/1/fd/.... IMHO it's not really our job to fix this -- they should configure the stdio file to have the permissions they like.

If we really need to do this, then I think we should change the owner instead. At least that way we won't allow any user to read the file, but this could cause issues with a monitor process being unable to read from the file (imagine there is a cron job that reads the output file -- once the file has its owner changed it can't read from it anymore). But at least the security story makes a little more sense than blindly doing chmod 0666.

@lifubang lifubang force-pushed the fix-stdiopermission-userns branch 4 times, most recently from 0888fdb to f0ce623 Compare October 28, 2024 03:12
@lifubang
Copy link
Member Author

It will be fixed in containerd, it’s not very urgent to fix it in runc, let’s remove the 1.2.1 milestone.

@lifubang lifubang removed this from the 1.2.1 milestone Oct 28, 2024
Comment on lines 523 to 533
// fixStdioPermissions fixes the permissions of STDIO to the specified user.
func fixStdioPermissions(uid int) error {
var null unix.Stat_t
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please move it back to init_linux.go? This will make the patch shorter and review simpler.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this function is needed in both parent and init process, so I think it should be in container_linux.go.
And I have updated the PR, so it really should be moved to container_linux.go.

if err := fixStdioPermissions(execUser); err != nil {
if err := fixStdioPermissions(execUser.Uid); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this here if we have already done it in the parent?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are different cases (in some we should do this from the parent, in some others from the child), maybe we should make calls to fixStdioPermissions conditional.

Copy link
Member Author

@lifubang lifubang Oct 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are different cases (in some we should do this from the parent, in some others from the child), maybe we should make calls to fixStdioPermissions conditional.

Since I considered that Root in userNS != User in userNS, so I think it still need in init process.
But after I add a test case for it, I find this is also a bug in userns container.
So I updated this PR to move fixStdioPermissions from init process to parent process.
Edit:
So I updated this PR to change chown from root user in the container to the real user in the container.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks your comment to find another bug.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see this test: @test "check stdio permission for non-root user in userns [terminal=false]"

@lifubang lifubang force-pushed the fix-stdiopermission-userns branch 6 times, most recently from d3df716 to 0d08500 Compare October 29, 2024 13:44
@rata
Copy link
Member

rata commented Nov 5, 2024

This indeed fixes the issue with the nginx image too. Do the rejects still apply to the last version? It seems @kolyshkin might not, and it is not changing permissions to "others" now? I can review in the next few days if no one still objects to this.

@lifubang
Copy link
Member Author

lifubang commented Nov 6, 2024

This indeed fixes the issue with the nginx image too.

Yes, as I mentioned in #4478 (comment), this should be fixed by users, and @fuweid has fixed it in containerd(containerd/containerd#10906). But for runc direct users, we also need to fix this issue in runc.

This PR fixes about two issues, I'll split this PR once I have a time, and I'll describe the issue with more details.

@rata
Copy link
Member

rata commented Nov 7, 2024

I was aware of the containerd PR, but thanks anyways :)

Sure, feel free to ping me for review when you open the PRs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

can't start nginx image with runc and user namespaces (works with crun)
4 participants