Use a separate process for forking
Spawning container processes is delegated to a separate 'forker'
process. The runtime communicates with this process through a Unix
domain socket.

With this change, Northstar's runtime can run in multithreaded mode
without the risk of the libc deadlock that forking from a
multithreaded process can trigger.

  ┌───────────┐    ┌────────┐   ┌────────────────────────────┐
  │ Northstar ├────┤ Forker │   │ Container A                │
  │  Runtime  │    └────┬───┘   │ ┌──────┐ ┌───────────────┐ │
  └───────────┘         ├───────┼►│ Init ├─┤ Application A │ │
                        │       │ └──────┘ └───────────────┘ │
                        │       └────────────────────────────┘
                        │
                        │       ┌────────────────────────────┐
                        │       │ Container B                │
                        │       │ ┌──────┐ ┌───────────────┐ │
                        ├───────┼►│ Init ├─┤ Application B │ │
                        │       │ └──────┘ └───────────────┘ │
                        │       └────────────────────────────┘
                        ▼
                       ...

Consequently, the 'forker' process itself must be single-threaded.
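
As an illustration of the runtime/forker link, here is a minimal sketch of a request/reply exchange over a Unix domain socket pair, using only the standard library. It is not Northstar's actual protocol: the length-prefixed framing, the `created:` reply and all function names are assumptions made for this example, and a thread stands in for the forker because `std` offers no `fork`.

```rust
use std::io::{self, Read, Write};
use std::os::unix::net::UnixStream;
use std::thread;

/// Write one frame: a 4-byte big-endian length followed by the payload.
fn send_frame(stream: &mut UnixStream, payload: &[u8]) -> io::Result<()> {
    if payload.len() > u32::MAX as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "payload too large"));
    }
    stream.write_all(&(payload.len() as u32).to_be_bytes())?;
    stream.write_all(payload)
}

/// Read one frame. A closed peer surfaces as `ErrorKind::UnexpectedEof`,
/// which is the point where a runtime could decide to bail out.
fn recv_frame(stream: &mut UnixStream) -> io::Result<Vec<u8>> {
    let mut len = [0u8; 4];
    stream.read_exact(&mut len)?;
    let mut payload = vec![0u8; u32::from_be_bytes(len) as usize];
    stream.read_exact(&mut payload)?;
    Ok(payload)
}

/// Stand-in for the forker loop: answer each request until the peer hangs up.
/// A real forker would fork here; this sketch only sends back a reply.
fn serve(mut stream: UnixStream) {
    while let Ok(request) = recv_frame(&mut stream) {
        let mut reply = b"created:".to_vec();
        reply.extend_from_slice(&request);
        if send_frame(&mut stream, &reply).is_err() {
            break;
        }
    }
}

/// Runtime side: send one request and wait for the reply.
fn request(stream: &mut UnixStream, name: &str) -> io::Result<String> {
    send_frame(stream, name.as_bytes())?;
    let reply = recv_frame(stream)?;
    String::from_utf8(reply).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}

/// One full round trip against a forker stand-in running on a thread.
fn round_trip(name: &str) -> io::Result<String> {
    let (mut runtime_end, forker_end) = UnixStream::pair()?;
    let forker = thread::spawn(move || serve(forker_end));
    let reply = request(&mut runtime_end, name)?;
    drop(runtime_end); // hang up so the stand-in loop ends
    forker.join().expect("forker stand-in panicked");
    Ok(reply)
}
```

Calling `round_trip("hello-world")` yields the reply produced by the stand-in. Framing every message with its length keeps message boundaries intact on a stream socket, which is one common way to structure such a link.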

Additionally, Init processes handle requests to start new processes
inside the container. This is a prerequisite to #454.

Additional details
------------------
- Northstar version is bumped to 0.7.0-dev
- Panic if the forker process exits unexpectedly

    If the forker process dies for whatever reason, the runtime cannot
    recover and bails out.

- Do not limit the number of threads of the runtime in demo main
- Parallel loading of NPKs from disk

    The loading of NPKs from disk is slow and blocking. Spawn a thread
    for each NPK in order to speed up the boring parsing of the NPK
    headers.

- Replace manifest IO Pipe with Log

    The 'pipe' option for the container output is removed from the
    manifest. The option 'log' is renamed to 'pipe'. A new option
    'discard' is added for the output.

- Refactor container IO handling

    When the container IO configuration in the manifest indicates that
    either `stdout` or `stderr` is to be 'piped', a socket is used to
    receive the output from the container. On the other side, the
    runtime uses an `async` task to forward the incoming output from
    the socket to the runtime log.

- Pipes are removed and replaced with sockets
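
The "Parallel loading of NPKs from disk" item in the list above can be sketched as follows. This is a hedged illustration, not the code in this commit: `load_all` and its `load` callback are hypothetical names, and the callback stands in for whatever parses an NPK header.

```rust
use std::io;
use std::path::{Path, PathBuf};
use std::thread;

/// Run `load` for every path on its own thread and return the results in
/// input order. `load` is a placeholder for the NPK header parser.
fn load_all<T, F>(paths: &[PathBuf], load: F) -> Vec<io::Result<T>>
where
    T: Send,
    F: Fn(&Path) -> io::Result<T> + Sync,
{
    thread::scope(|scope| {
        // Spawn everything first so that the blocking loads overlap...
        let handles: Vec<_> = paths
            .iter()
            .map(|path| {
                let load = &load;
                scope.spawn(move || load(path.as_path()))
            })
            .collect();
        // ...then join in order. A panicking loader becomes an error entry.
        handles
            .into_iter()
            .map(|handle| {
                handle.join().unwrap_or_else(|_| {
                    Err(io::Error::new(io::ErrorKind::Other, "loader panicked"))
                })
            })
            .collect()
    })
}
```

For example, `load_all(&paths, |p| std::fs::read(p))` reads every file concurrently and keeps one `io::Result` per path, so a single broken NPK does not hide the others. Scoped threads are used here so the callback can borrow from the caller.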

Co-authored-by: Felix Obenhuber <[email protected]>
Co-authored-by: Alfonso Ros <[email protected]>
Alfonso Ros and Felix Obenhuber committed Feb 22, 2022
1 parent e784428 commit 34d12c6
Showing 70 changed files with 3,238 additions and 2,765 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -106,7 +106,7 @@ jobs:
uses: actions-rs/cargo@v1
with:
command: test
-args: --all-features
+args: --all-features -- --test-threads=1

doc:
name: Documentation
10 changes: 7 additions & 3 deletions Cargo.lock

2 changes: 1 addition & 1 deletion README.md
@@ -264,7 +264,7 @@ kernel configuration with the `CONFIG_` entries in the `check_conf.sh` script.

### Container launch sequence

-**TODO**: <br/><img src="images/container-startup.png" class="inline" width=600/>
+<br/><img src="images/container-startup.png" class="inline" width=600/>

### Manifest Format

Binary file removed doc/diagrams/container_startup.png
Binary file not shown.
33 changes: 0 additions & 33 deletions doc/diagrams/container_startup.puml

This file was deleted.

6 changes: 2 additions & 4 deletions examples/console/manifest.yaml
@@ -5,10 +5,8 @@ console: true
uid: 1000
gid: 1000
io:
-stdout:
-log:
-level: DEBUG
-tag: console
+stdout: pipe
+stderr: pipe
mounts:
/dev:
type: dev
6 changes: 2 additions & 4 deletions examples/cpueater/manifest.yaml
@@ -24,7 +24,5 @@ mounts:
type: bind
host: /system
io:
-stdout:
-log:
-level: DEBUG
-tag: cpueater
+stdout: pipe
+stderr: pipe
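
The manifests in this commit set each output stream to either `pipe` or `discard`. Below is a minimal sketch of how such a setting might drive forwarding of container output to a log. The `Output` enum, the `forward` function and the "read and drop" handling of `discard` are assumptions for illustration, and a synchronous reader stands in for the `async` task mentioned in the commit message.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// The two output settings that appear in the example manifests.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Output {
    /// Forward the stream to the runtime log.
    Pipe,
    /// Drop the stream.
    Discard,
}

impl std::str::FromStr for Output {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "pipe" => Ok(Output::Pipe),
            "discard" => Ok(Output::Discard),
            other => Err(format!("unknown output setting: {}", other)),
        }
    }
}

/// Drain `source` line by line. With `Pipe`, each line is handed to `log`,
/// prefixed with the container name; with `Discard`, lines are read and
/// dropped. Returns the number of lines forwarded.
fn forward<R: Read, L: FnMut(String)>(
    source: R,
    mode: Output,
    container: &str,
    mut log: L,
) -> io::Result<usize> {
    let mut forwarded = 0;
    for line in BufReader::new(source).lines() {
        let line = line?;
        if mode == Output::Pipe {
            log(format!("{}: {}", container, line));
            forwarded += 1;
        }
    }
    Ok(forwarded)
}
```

`source` can be the runtime's end of a `UnixStream`, and `log` any closure, for instance one that calls `log::debug!`. Parsing the setting through `FromStr` means an unsupported value such as the former `log` option is rejected up front.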
2 changes: 1 addition & 1 deletion examples/cpueater/src/main.rs
@@ -1,7 +1,7 @@
use std::env::var;

fn main() {
-let version = var("VERSION").expect("Failed to read VERSION");
+let version = var("NORTHSTAR_VERSION").expect("Failed to read NORTHSTAR_VERSION");
let threads = var("THREADS")
.expect("Failed to read THREADS")
.parse::<i32>()
8 changes: 3 additions & 5 deletions examples/crashing/manifest.yaml
@@ -5,6 +5,9 @@ uid: 1000
gid: 1000
env:
RUST_BACKTRACE: 1
+io:
+stdout: pipe
+stderr: discard
mounts:
/dev:
type: dev
@@ -19,8 +22,3 @@ mounts:
/system:
type: bind
host: /system
-io:
-stdout:
-log:
-level: DEBUG
-tag: crashing
6 changes: 2 additions & 4 deletions examples/hello-ferris/manifest.yaml
@@ -37,7 +37,5 @@ mounts:
dir: /
options: noexec,nodev,nosuid
io:
-stdout:
-log:
-level: DEBUG
-tag: ferris
+stdout: pipe
+stderr: pipe
6 changes: 2 additions & 4 deletions examples/hello-resource/manifest.yaml
@@ -23,7 +23,5 @@ mounts:
type: bind
host: /system
io:
-stdout:
-log:
-level: DEBUG
-tag: hello
+stdout: pipe
+stderr: pipe
6 changes: 2 additions & 4 deletions examples/hello-world/manifest.yaml
@@ -6,10 +6,8 @@ gid: 1000
env:
HELLO: northstar
io:
-stdout:
-log:
-level: DEBUG
-tag: hello
+stdout: pipe
+stderr: pipe
mounts:
/dev:
type: dev
10 changes: 3 additions & 7 deletions examples/hello-world/src/main.rs
@@ -1,13 +1,9 @@
fn main() {
-let hello = std::env::var("HELLO").unwrap_or_else(|_| "unknown".into());
-let version = std::env::var("VERSION").unwrap_or_else(|_| "unknown".into());
+let hello = std::env::var("NORTHSTAR_CONTAINER").unwrap_or_else(|_| "unknown".into());

-println!("Hello again {} from version {}!", hello, version);
+println!("Hello again {}!", hello);
for i in 0..u64::MAX {
-println!(
-"...and hello again #{} {} from version {}...",
-i, hello, version
-);
+println!("...and hello again #{} {} ...", i, hello);
std::thread::sleep(std::time::Duration::from_secs(1));
}
}
12 changes: 3 additions & 9 deletions examples/inspect/manifest.yaml
@@ -1,17 +1,11 @@
-name: inspect
+name: inspect
version: 0.0.1
init: /inspect
uid: 1000
gid: 1000
io:
-stdout:
-log:
-level: DEBUG
-tag: inspect
-stderr:
-log:
-level: WARN
-tag: inspect
+stdout: pipe
+stderr: discard
mounts:
/dev:
type: dev
6 changes: 2 additions & 4 deletions examples/memeater/manifest.yaml
@@ -23,7 +23,5 @@ mounts:
type: bind
host: /system
io:
-stdout:
-log:
-level: DEBUG
-tag: memeater
+stdout: pipe
+stderr: pipe
6 changes: 2 additions & 4 deletions examples/persistence/manifest.yaml
@@ -20,7 +20,5 @@ mounts:
type: bind
host: /system
io:
-stdout:
-log:
-level: DEBUG
-tag: persistence
+stdout: pipe
+stderr: pipe
8 changes: 3 additions & 5 deletions examples/seccomp/manifest.yaml
@@ -18,10 +18,8 @@ mounts:
type: bind
host: /system
io:
-stdout:
-log:
-level: DEBUG
-tag: seccomp
+stdout: pipe
+stderr: pipe
seccomp:
profile:
-default
+default
Binary file modified images/container-startup.png
62 changes: 49 additions & 13 deletions images/container-startup.puml
@@ -1,33 +1,69 @@
@startuml container_startup

create Client
activate Client

create Runtime
activate Runtime
Runtime -> Runtime: Check and Mount container

create Forker
Runtime -> Forker: Fork
activate Forker

Client -> Runtime: Connect: Hello
Client <- Runtime: ConnectAck
Client -> Runtime: Start container
Runtime -> Runtime: Check and mount container(s)
Runtime -> Runtime: Open PTY

Runtime -> Forker: Create container

create Trampoline
Runtime -> Trampoline: Fork
Forker -> Trampoline: Fork
activate Trampoline
Trampoline -> Trampoline: Create PID namespace

create Init
Trampoline -> Init: Fork
activate Init
Trampoline -> Runtime: Init PID
Init -> Init: Mount, Chroot, UID / GID,\ndrop privileges, file descriptors

Trampoline -> Forker: Forked init with PID
destroy Trampoline
Runtime -> Runtime: Wait for Trampoline exit (waitpid)
Init -> Init: Wait for run signal (Condition::wait)

Forker -> Forker: reap Trampoline

Forker -> Runtime: Created init with PID

Runtime -> Runtime: Configure cgroups
Runtime -> Init: Signal run (Condition::notify)
Runtime -> Runtime: Wait for execve (Condition::wait)
Init -> Init: Mount, Chroot, UID / GID,\ndrop privileges, file descriptors
Runtime -> Runtime: Configure debug
Runtime -> Runtime: Configure PTY forward

Runtime -> Forker: Exec container
Forker -> Init: Exec Container
create Container
Init -> Container: Fork
activate Container
Forker <- Init: Exec
Runtime <- Forker: Exec
Client <- Runtime: Started
Client <- Runtime: Notification: Started

Init -> Init: Wait for container to exit (waitpid)
Container -> Container: Setup PTY
Container -> Container: Set seccomp filter
Container -> : Execve(..)
Runtime -> Runtime: Condition pipe closed: Container is started
note left: Condition pipe is CLOEXEC
Container -> Init: Exit
...
Container -> Init: SIGCHLD
destroy Container
Init -> Runtime: Exit
Runtime -> Runtime: Read exit status from pipe or waitpid on pid of init

Init -> Init: waitpid: Exit status of container
Init -> Forker: Container exit status
destroy Init

Forker -> Runtime: Container exit status
Runtime -> Runtime: Stop PTY thread
Runtime -> Runtime: Destroy cgroups
Client <- Runtime: Notification: Exit

@enduml
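
The Runtime/Forker exchange in the sequence diagram above can be read as a small state machine. The sketch below models only that link and is an illustration, not Northstar's implementation: the enum names, the payload types and the strict ordering check are assumptions derived from the arrow labels.

```rust
/// Messages on the Runtime <-> Forker link, named after the diagram arrows.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Message {
    CreateContainer,           // Runtime -> Forker: Create container
    CreatedInit { pid: u32 },  // Forker -> Runtime: Created init with PID
    ExecContainer,             // Runtime -> Forker: Exec container
    ExecAck,                   // Forker -> Runtime: Exec
    ExitStatus { code: i32 },  // Forker -> Runtime: Container exit status
}

/// Container lifecycle as seen by the runtime.
#[derive(Debug, Clone, PartialEq, Eq)]
enum State {
    Idle,
    Creating,
    Created { pid: u32 }, // cgroups, debug and PTY forwarding are set up here
    Starting { pid: u32 },
    Running { pid: u32 },
    Exited { code: i32 },
}

/// Advance the lifecycle by one message, rejecting out-of-order messages.
fn step(state: State, message: Message) -> Result<State, String> {
    match (state, message) {
        (State::Idle, Message::CreateContainer) => Ok(State::Creating),
        (State::Creating, Message::CreatedInit { pid }) => Ok(State::Created { pid }),
        (State::Created { pid }, Message::ExecContainer) => Ok(State::Starting { pid }),
        (State::Starting { pid }, Message::ExecAck) => Ok(State::Running { pid }),
        (State::Running { .. }, Message::ExitStatus { code }) => Ok(State::Exited { code }),
        (state, message) => Err(format!("unexpected {:?} in state {:?}", message, state)),
    }
}

/// Feed a whole message sequence through the state machine.
fn run(messages: Vec<Message>) -> Result<State, String> {
    messages.into_iter().try_fold(State::Idle, step)
}
```

Feeding the five messages in diagram order ends in `State::Exited`, while a sequence that starts with `ExecContainer` is rejected. Splitting creation from exec gives the runtime a point at which init exists but the application has not started.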
2 changes: 1 addition & 1 deletion main/Cargo.toml
@@ -16,7 +16,7 @@ clap = { version = "3.1.0", features = ["derive"] }
log = "0.4.14"
nix = "0.23.0"
northstar = { path = "../northstar", features = ["runtime"] }
-tokio = { version = "1.17.0", features = ["rt", "macros", "signal"] }
+tokio = { version = "1.17.0", features = ["rt-multi-thread", "macros", "signal"] }
toml = "0.5.8"

[target.'cfg(not(target_os = "android"))'.dependencies]