Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MS_BIND|MS_REC to the mount of /proc #17574

Closed
wants to merge 1 commit into from

Conversation

bsilver8192
Copy link
Contributor

@bsilver8192 bsilver8192 commented Feb 24, 2023

Despite the usual meanings of these flags, this isn't actually a bind mount for proc, which doesn't appear to be documented anywhere. EDIT: This is wrong, it is a bind mount, which means this doesn't work.

Otherwise it fails if there are any other mounts under the host /proc. I think this is due to the behavior documented on the mount(2) manpage for bind mounting:

EINVAL In an unprivileged mount namespace (i.e., a mount namespace
owned by a user namespace that was created by an unprivileged user), a
bind mount operation (MS_BIND) was attempted without specifying
(MS_REC), which would have revealed the filesystem tree underneath
one of the submounts of the directory being bound.

I ended up in this scenario with docker and --gpus all (with some additional flags to make it work even without the --gpus all).

Despite the usual meanings of these flags, this isn't actually a bind
mount for `proc`, which doesn't appear to be documented anywhere.

Otherwise it fails if there are any other mounts under the host `/proc`.
I think this is due to the behavior documented on the `mount(2)`
manpage for bind mounting:

> EINVAL In an unprivileged mount namespace (i.e., a mount namespace
> owned by a user namespace that was created by an unprivileged user), a
> bind  mount  operation  (MS_BIND) was attempted without specifying
> (MS_REC), which would have revealed the filesystem tree underneath
> one of the submounts of the directory being bound.

I ended up in this scenario with docker and `--gpus all` (with some
additional flags to make it work even without the `--gpus all`).
@sgowroji sgowroji added team-Local-Exec Issues and PRs for the Execution (Local) team awaiting-user-response Awaiting a response from the author labels Feb 24, 2023
@sgowroji
Copy link
Member

Hi @bsilver8192, Can you please fix the above buildkite errors. Thanks!

@bsilver8192
Copy link
Contributor Author

bsilver8192 commented Feb 24, 2023

For anybody else from the future: if your container is privileged, mkdir /tmp/proc2 && sudo mount -t proc proc /tmp/proc2 will avoid the need for this. If your container is unprivileged, this is attempting to violate a deliberate security guarantee of the kernel, so hopefully it's impossible. You can also bind-mount the host proc (with docker this is --volume /proc:/host_proc), which will unmask any masked directories, which likely has security implications.

Looks like there's ongoing kernel-side work for a "/proc/pid-only filesystem" kind of approach which would provide another route to implement this.

The tests fail because I completely broke it, and there's no way to do what I was hoping for. My comments in the code about this not being a bind mount are wrong, it's a bind mount so it exposes host PIDs and that breaks stuff.

I can work around this by mounting a fully-unmasked proc anywhere in the container (because the container is already privileged for other reasons), but that's not really something I think should be automated in bazel because it relies on running bazel with effectively root privileges...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting-user-response Awaiting a response from the author team-Local-Exec Issues and PRs for the Execution (Local) team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants