Nested container can't start #168
Comments
The NVIDIA Container CLI ensures that only the proc paths for the devices requested are mounted into the container. Since you mention GKE: did you install the NVIDIA Container Runtime there, or are you launching a pod using their device plugin?
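A quick way to observe this from inside a GPU container is to list what the runtime has mounted under `/proc/driver/nvidia`. This is a minimal diagnostic sketch, assuming the NVIDIA driver's standard procfs layout; the exact entries vary by driver and toolkit version:

```sh
# Inside a container started via the NVIDIA runtime:
# show the per-device procfs paths the runtime has overmounted
grep ' /proc/driver/nvidia' /proc/mounts

# compare with the GPU entries the driver itself exposes
ls /proc/driver/nvidia/gpus/
```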
Thanks @elezar for the explanation! Regarding GKE, we followed https://cloud.google.com/kubernetes-engine/docs/how-to/gpus; we didn't dig deeper into what is configured on the VMs, and we didn't do anything specific on them.
@easeway the default GKE installation does not use the NVIDIA Container Toolkit, which would explain the different experience there. We are working on aligning things better across the cloud providers, including better support for nested containers.
@elezar Thanks! I'm looking forward to it!
For reference, I ran into this same problem while trying to use Bazel's linux-sandbox. Unfortunately I don't have a solution, but here's some info about what's happening that might help. I think the problem is that the kernel enforces this (from …). Even though this isn't technically a bind mount, it has the same effect, so I can see how it makes sense to enforce the restriction. I can't find any documentation about it, though. opencontainers/runc#1658 (comment) (and other discussion in that bug) is the best discussion I could find of the history behind this limitation. That discussion links to some people saying that a fresh mount (which both this project and Bazel attempt) works, but that does not seem to be true with the kernel versions I tried.
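A quick check for whether a given environment is affected is to look for submounts stacked on top of `/proc`, since overmounts there are what make a fresh procfs mount fail in a new namespace. A minimal sketch, assuming util-linux is available:

```sh
# In the outer container: list any mounts stacked on top of /proc.
# Anything beyond /proc itself (e.g. /proc/driver/nvidia) means a nested
# runtime's fresh `mount -t proc` is likely to be refused.
findmnt --submounts /proc

# equivalent without findmnt:
awk '$2 ~ "^/proc/" {print $2}' /proc/mounts
```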
made a PR to submit a patch to bazel. @bsilver8192 |
I didn't put something here because it didn't work in the end, but I attempted the same approach as @Ryang20718 at bazelbuild/bazel#17574, and concluded it was fundamentally broken and wouldn't work (sorry for the duplicate work). bazelbuild/bazel#17574 (comment) has some of my thoughts on workarounds.
1. Issue or feature description

On an AWS EKS `g4dn-xlarge` node, inside a privileged container requesting a GPU resource, a nested container failed with the error `mount "proc" to "/proc": Operation not permitted`.

2. Steps to reproduce the issue

- Create an EKS cluster with `g4dn-xlarge` nodes and the proper k8s labels on the nodes;
- Install `runc` in the Pod's container (`apt-get install runc`);
- Write an OCI spec and start a nested container with `runc`.

To reproduce this issue, using `unshare` and `mount -N` may be simpler than writing a full OCI spec; a rough sketch follows.
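One possible variant of that reproduction, sketched under the assumption that util-linux `unshare` is available in the Pod's container and that a user namespace stands in for the nested runtime's namespace setup:

```sh
# Create new user + mount + PID namespaces, then attempt a fresh procfs
# mount, roughly what a nested runc does when setting up /proc.
unshare --user --map-root-user --mount --pid --fork \
  sh -c 'mount -t proc proc /proc'
# On an affected node this is expected to fail with EPERM, because /proc
# in the outer container carries overmounts under /proc/driver/nvidia.
```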
3. Root cause
The reason for the `mount "proc" to "/proc": Operation not permitted` error is that the NVIDIA container runtime creates the following mountpoints in the outer container:

- `/proc/driver/nvidia/gpus/BUS/...`
- `/proc/driver/nvidia`

After unmounting these mountpoints, the nested container can be started without issue (a workaround sketch follows).
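A hedged sketch of that workaround, run in the outer container before starting the nested one; note that the exact mount targets vary per GPU, and unmounting them also removes whatever per-device isolation the runtime set up:

```sh
# Unmount everything stacked under /proc/driver/nvidia, deepest paths
# first, so a nested runtime can mount a fresh procfs again.
awk '$2 ~ "^/proc/driver/nvidia" {print $2}' /proc/mounts \
  | sort -r \
  | xargs -r -n1 umount
```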
4. Thoughts
Not sure why the NVIDIA container runtime creates mountpoints under `/proc`. Based on observation, without the mountpoints, files like `/proc/driver/nvidia/gpus/...` and `/proc/driver/nvidia` are still visible and accessible to the Pod. Is that for isolation purposes, in case there are multiple GPU devices on the system, so that the Pod only sees the allocated device?

We also experimented on GKE, which doesn't have this issue; we don't see the mountpoints on `/proc` there.