From ce19b8d167caccb0685e52ec96f0a0543398f52b Mon Sep 17 00:00:00 2001
From: Aleksa Sarai
Date: Sat, 28 May 2016 15:02:35 +1000
Subject: [PATCH 1/2] *: add support for cgroup namespace

The cgroup namespace is a new kernel feature available in 4.6+ that
allows a container to isolate its cgroup hierarchy. This currently only
allows for hiding information from /proc/self/cgroup, and mounting
cgroupfs as an unprivileged user. In the future, this namespace may
allow for subtree management by a container.

Signed-off-by: Aleksa Sarai
---
 config-linux.md        | 16 ++++++++++------
 config.md              |  6 ++++++
 schema/defs-linux.json |  3 ++-
 specs-go/config.go     |  2 ++
 4 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/config-linux.md b/config-linux.md
index 1a313ed3f..e4e4f829f 100644
--- a/config-linux.md
+++ b/config-linux.md
@@ -27,12 +27,13 @@ Namespaces are specified as an array of entries inside the `namespaces` root fie
 The following parameters can be specified to setup namespaces:
 
 * **`type`** *(string, required)* - namespace type. The following namespaces types are supported:
-    * **`pid`** processes inside the container will only be able to see other processes inside the same container
-    * **`network`** the container will have its own network stack
-    * **`mount`** the container will have an isolated mount table
-    * **`ipc`** processes inside the container will only be able to communicate to other processes inside the same container via system level IPC
-    * **`uts`** the container will be able to have its own hostname and domain name
-    * **`user`** the container will be able to remap user and group IDs from the host to local users and groups within the container
+    * **`pid`** processes inside the container will only be able to see other processes inside the same container.
+    * **`network`** the container will have its own network stack.
+    * **`mount`** the container will have an isolated mount table.
+    * **`ipc`** processes inside the container will only be able to communicate to other processes inside the same container via system level IPC.
+    * **`uts`** the container will be able to have its own hostname and domain name.
+    * **`user`** the container will be able to remap user and group IDs from the host to local users and groups within the container.
+    * **`cgroup`** the container will have an isolated view of the cgroup hierarchy.
 
 * **`path`** *(string, optional)* - path to namespace file in the [runtime mount namespace](glossary.md#runtime-namespace)
 
@@ -62,6 +63,9 @@ Also, when a path is specified, a runtime MUST assume that the setup for that pa
         },
         {
             "type": "user"
+        },
+        {
+            "type": "cgroup"
         }
     ]
 ```
diff --git a/config.md b/config.md
index bd42af426..d0ab4ca57 100644
--- a/config.md
+++ b/config.md
@@ -643,6 +643,12 @@ Here is a full example `config.json` for reference.
             },
             {
                 "type": "mount"
+            },
+            {
+                "type": "user"
+            },
+            {
+                "type": "cgroup"
             }
         ],
         "maskedPaths": [
diff --git a/schema/defs-linux.json b/schema/defs-linux.json
index e77fe92a9..ea02361c0 100644
--- a/schema/defs-linux.json
+++ b/schema/defs-linux.json
@@ -224,7 +224,8 @@
                 "network",
                 "uts",
                 "ipc",
-                "user"
+                "user",
+                "cgroup"
             ]
         },
         "NamespaceReference": {
diff --git a/specs-go/config.go b/specs-go/config.go
index 9b54f14ad..7f4ceb862 100644
--- a/specs-go/config.go
+++ b/specs-go/config.go
@@ -169,6 +169,8 @@ const (
 	UTSNamespace = "uts"
 	// UserNamespace for isolating user and group IDs
 	UserNamespace = "user"
+	// CgroupNamespace for isolating cgroup hierarchies
+	CgroupNamespace = "cgroup"
 )
 
 // IDMapping specifies UID/GID mappings

From d514aad1bc3997d3e72168562177757c91e919fd Mon Sep 17 00:00:00 2001
From: Aleksa Sarai
Date: Fri, 27 May 2016 21:59:21 +1000
Subject: [PATCH 2/2] runtime: lifecycle: environment must match config.json

Make it clear that if a runtime cannot set up an environment that
*precisely* matches the config.json provided, it must generate an error.
This is important because not doing this can cause security issues.

Signed-off-by: Aleksa Sarai
---
 runtime.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/runtime.md b/runtime.md
index 79ab96fce..e9cf6e410 100644
--- a/runtime.md
+++ b/runtime.md
@@ -34,6 +34,7 @@ See [Query State](#query-state) for information on retrieving the state of a con
 The lifecycle describes the timeline of events that happen from when a container is created to when it ceases to exist.
 1. OCI compliant runtime's `create` command is invoked with a reference to the location of the bundle and a unique identifier.
 2. The container's runtime environment MUST be created according to the configuration in [`config.json`](config.md).
+    If the runtime is unable to create the environment specified in the [`config.json`](config.md), it MUST generate an error.
     While the resources requested in the [`config.json`](config.md) MUST be created, the user-specified code (from [`process`](config.md#process-configuration) MUST NOT be run at this time.
     Any updates to `config.json` after this step MUST NOT affect the container.
 3. Once the container is created additional actions MAY be performed based on the features the runtime chooses to support.