Defense in Depth - User Namespaces #228
Comments
Since the subject of Linux user namespaces is very tricky, I'll dump here what I have understood so far. Hopefully this will help in the review process, or when we want to backtrack in case we've made a mistake.

Linux User Namespaces

References:
Linux User Namespaces were introduced in Linux kernel 3.8. They look similar to
Let's demystify them:
User namespaces are more than just namespaces for UIDs and GIDs. They are also a
All users (unless restricted by system configuration) can create a user
Examples:
This mapping is available through
For user namespaces with empty mappings, we need to have some things in mind:
This mapping is writable only by processes with sufficient rights, and only
I think that the simplest mapping that can exist is just assigning a container
Once a mapping exists, then:
Rootless Podman and Linux User Namespaces

References:
Let's see how rootless Podman deals with user namespaces. When Podman creates a new user namespace, it needs to assign a UID mapping to
Basically, they define the range of host UIDs (subordinate UIDs) that a user has
If there are no
That is, the root in the container maps to the user outside the container, which
Essentially, the root user in the container maps to the user outside the
Podman has several options to control the mapping (see
In the first case, we see that the container root maps to the user it started
In the second case, we notice something weird. The root of the
To make translation easier, one can check the UID mapping from the parent
In the above examples, we see that either the root or the user within the
Problems with insufficient UID/GID mappings will occur either when pulling an
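The translation between UIDs inside a user namespace and UIDs in the parent namespace can be sketched in a few lines of Python. This is only an illustration: it assumes the three-column layout (namespace start, parent start, length) that the kernel documents for `/proc/<pid>/uid_map` in user_namespaces(7), and the sample values (host UID 1000, subordinate range starting at 100000 with 65536 IDs) are hypothetical. The helper names `parse_uid_map` and `to_parent_uid` are made up for this sketch.

```python
from typing import List, Optional, Tuple

# Each entry is (ns_start, parent_start, length), mirroring one uid_map line.
UidMapEntries = List[Tuple[int, int, int]]


def parse_uid_map(text: str) -> UidMapEntries:
    """Parse uid_map-style text into (ns_start, parent_start, length) triples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        ns_start, parent_start, length = (int(field) for field in fields)
        entries.append((ns_start, parent_start, length))
    return entries


def to_parent_uid(ns_uid: int, entries: UidMapEntries) -> Optional[int]:
    """Translate a UID inside the namespace to the parent namespace's UID.

    Returns None when the UID is not covered by any mapping entry.
    """
    for ns_start, parent_start, length in entries:
        if ns_start <= ns_uid < ns_start + length:
            return parent_start + (ns_uid - ns_start)
    return None


# Hypothetical sample: namespace root maps to host UID 1000, and namespace
# UIDs 1..65536 map to an assumed subordinate range starting at 100000.
SAMPLE_UID_MAP = """\
         0       1000          1
         1     100000      65536
"""

entries = parse_uid_map(SAMPLE_UID_MAP)
```

On a Linux host, passing the contents of `/proc/self/uid_map` to `parse_uid_map` instead of the sample string should give the mapping of the current process. A `None` result from `to_parent_uid` marks a UID with no mapping, which is the situation where insufficient mappings start causing problems.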
Dangerzone and Linux User Namespaces

Now that we've seen how Linux User Namespaces work, and how Podman handles them, let's see how Dangerzone should handle them.

Requirements

We'll start with some requirements and how we can cover them for Dangerzone:

1. The user IDs within the Dangerzone container should not map to any user in the host

The reason is that we don't want any container escape to have any effect on the host. The escaped user should effectively be treated as
The best way to achieve this is to use
Podman's implementation can be found here: https://github.com/containers/podman/blob/67c533b85a80fd40228bedbca89a61912ca8a9a5/pkg/util/utils.go#L404. Basically, what Podman does is:
2. The files/folders mounted to the Dangerzone container should be accessible by UID/GID 1000 (
When we run our Dangerzone environments through dev_scripts/env.py, we use the Podman flag `--userns keep-id`. This option maps the UID in the host to the *same* UID in the container. This way, the container can access mounted files from the host.

The reason this works is that the user within the container has UID 1000, and the user in the host *typically* has UID 1000 as well. This setup can break, though, if the user on the host has a different UID. For instance, the UID of the GitHub Actions user that runs our CI command is 1001.

To fix this, we need to always map the host user UID (whatever that is) to container UID 1000. We can achieve this with the following mapping:

    1000:0:1         # Map container UID 1000 to subordinate UID 0
                     # (sub UID 0 = owner of the user ns = host user UID)
    0:1:1000         # Map container UIDs 0-999 to subordinate UIDs 1-1000
    1001:1001:64536  # Map container UIDs 1001-65536 to subordinate UIDs 1001-65536

Refs #228
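As a rough check of the mapping in this commit message, the sketch below rebuilds the three triples for an arbitrary target container UID and confirms that subordinate UID 0 (the host user) lands on container UID 1000. The total of 65537 IDs (the host user plus 65536 subordinate UIDs) is an assumption inferred from the numbers in the mapping, and the helper names `build_uidmap` and `container_uid_for` are hypothetical. The triples are rendered in the `container:intermediate:length` form that Podman's `--uidmap` option accepts.

```python
from typing import List, Optional, Tuple

# Each triple is (container_start, intermediate_start, length).
UidMap = List[Tuple[int, int, int]]


def build_uidmap(target_uid: int = 1000, total_ids: int = 65537) -> UidMap:
    """Build a mapping that places intermediate UID 0 on container target_uid.

    total_ids is assumed to be 1 (the host user) + 65536 subordinate UIDs.
    """
    if not 0 <= target_uid < total_ids:
        raise ValueError("target_uid must fall within the available ID range")
    # The host user (intermediate UID 0) becomes the target container UID.
    uidmap = [(target_uid, 0, 1)]
    # Container UIDs below the target shift up by one on the intermediate side.
    if target_uid > 0:
        uidmap.append((0, 1, target_uid))
    # Everything above the target maps one-to-one.
    remaining = total_ids - target_uid - 1
    if remaining > 0:
        uidmap.append((target_uid + 1, target_uid + 1, remaining))
    return uidmap


def container_uid_for(intermediate_uid: int, uidmap: UidMap) -> Optional[int]:
    """Return the container UID that an intermediate UID is mapped to, if any."""
    for container_start, intermediate_start, length in uidmap:
        if intermediate_start <= intermediate_uid < intermediate_start + length:
            return container_start + (intermediate_uid - intermediate_start)
    return None


uidmap = build_uidmap()
# Render the triples in the container:intermediate:length form.
flags = [f"--uidmap={c}:{i}:{n}" for c, i, n in uidmap]
```

The last triple starts at 1001 and has length 64536, so it covers UIDs 1001 through 65536 inclusive. Because the host user is always intermediate UID 0 regardless of its real UID, the same triples should apply whether the host UID is 1000 or 1001.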
We can close this issue once we merge #590, since gVisor will run rootless, and the host user will not be mapped to the inner container. As a bonus, we will remove the
This wraps the existing container image inside a gVisor-based sandbox. gVisor is an open-source OCI-compliant container runtime. It is a userspace reimplementation of the Linux kernel in a memory-safe language. It works by creating a sandboxed environment in which regular Linux applications run, but their system calls are intercepted by gVisor. gVisor then redirects these system calls and reinterprets them in its own kernel. This means the host Linux kernel is isolated from the sandboxed application, thereby providing protection against Linux container escape attacks. It also uses `seccomp-bpf` to provide a secondary layer of defense against container escapes. Even if its userspace kernel gets compromised, attackers would have to additionally have a Linux container escape vector, and that exploit would have to fit within the restricted `seccomp-bpf` rules that gVisor adds on itself. Fixes #126 Fixes #224 Fixes #225 Fixes #228
Parent issue: #221
User namespaces are very important, since they ensure that:
By ensuring that the user within the container (`dangerzone`, UID 1000) maps to a non-existing user outside the container, we make an attacker's job significantly harder. The current situation is:

`--userns keep-id`, which makes the `dangerzone` user within the container have the same UID as the user outside the container.

Linux
`x > 1000` outside the container) before starting the container.
`x > 1000` outside the container.
`podman` and specify the mapping for the container.

Windows/macOS
Test Podman Desktop and check if it uses user namespaces.
Further reading: