feat(sandbox): add -Zsandbox unstable flag #66

weihanglo · 2024-10-19T20:00:51Z

What does this PR try to resolve?

This is an experiment for rust-lang/rust-project-goals#108.

An unstable [sandbox] config table is added behind the -Zsandbox unstable flag:

sandbox.runner: Target platform sandboxed build scripts built for
sandbox.target: Runner that executes artifacts of build scripts

Here is an example .cargo/config.toml:

[sandbox]
target = "wasm32-wasip1"
runner = [
  "wasmtime",
  "--wasi",
  "inherit-env=y",
  "--dir",
  "/path/to/my/package",
]

Also an example main.rs:

use std::path::PathBuf;
use std::time::SystemTime;
use std::time::UNIX_EPOCH;

fn main() {
    println!("cargo:warning=Hello from sandbox wasm");
    // Access to Cargo internal environment variables
    //
    // Require `--env` or `-S inherit-env=y`
    let cargo_env = std::env::var("CARGO").unwrap();
    println!("cargo:warning=CARGO={cargo_env}");

    // Access to inherited enviroment variables
    //
    // Require `-S inherit-env=y`
    let shell = std::env::var("SHELL").unwrap();
    println!("cargo:warning=SHELL={shell}");

    // Files access under a given directory
    //
    // Require `--dir $CARGO_MANIFEST_DIR`
    let ts_file = PathBuf::from(std::env::var("CARGO_MANIFEST_DIR").unwrap())
        .join("target")
        .join("wasi-timestamp");
    let timestamp = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap()
        .as_millis();
    std::fs::write(&ts_file, format!("{timestamp}").as_bytes()).unwrap();
    let ts = std::fs::read_to_string(ts_file).unwrap();
    println!("cargo:warning=current timestamp is {ts}");

    // Spawn process
    //
    // Unsupported 😭😭😭
    std::process::Command::new(cargo_env).status().unwrap();
}

What did we achieve in this experiment?

As you can see, we can easily swap to any sandbox runner with a custom target.

We use wasm32-wasip1 above as an example,
as it is the one with smaller footprint, very cross-platform, and pretty popular in the Rust community.
However, it turns out that `wasm32-wasip1 doesn't support POSIX process spawning.
Use cases of process spawning in build scripts are essential, such as

Calling pkg-config to find system libraries.
Invoking compilers to build non Rust code, e.g., C compiler, Protobuf compiler.
Invoking cargo metadata to get crate metadata.
Retrieving version control information via git.

According to the design axioms,

Restrcting process spawning is the top one axiom.
We made that with wasm32-wasip1,
but have no way to opt-out.
This contradicts to the "ensuring -sys crates be built" axiom.

As a result, it is unlikely to use wasm32-wasip1 as a default sandbox environment with this experiment.

Other possibilities

Have talked to some other folks, there are some potential route we could take if we chose wasi as a default sandbox environment.

The offical `build-rs` crate to the rescue

The Cargo team recently adopted the build-rs crate.
It is going to be the official crate providing API for writing build scripts.
We could take the advantage of it, telling everyone instead of using std::process::Command, use something like build_rs::Command. So that Cargo could have a full control over how a build script spawning processes.

While it sounds ideal, this doesn't help the current situation because

It doesn't help old crates.
It's hard to convince people to change their build scripts for trying an unstable feature.
Using build-rs in build scripts may increase build time; people may refuse to use.
I myself tend to have a big switch to turn on and off, which is easier to try a new feature.

A Cargo-flavored wasi standard library

There was a discussion in the GSoC "sandboxed proc-macro with wasm" about shipping a custom verion of the standard library for sandboxed wasm. For Cargo's build script, we could potentially ship a wasm32-wasi-cargo target. The std in this custom target could intercept any exec call or process spawning. Then it calls back to the host process (which is Cargo in our case) to determine how to handle process execution. The host process could either reject, or run the external program and post back the result.

This idea sounds pretty hacky and need more investigations of the communication mechanism between the host process and the wasm runtime. Perhaps via sockets, The WebAssembly Component Model, other host function call mechanism. There is also WASIX project which supprots fork/exec though it is currently not a WASI standard not even a proposal.

Continue with other more mature sandbox runtime choices

Since one of major design space is the user interafce of sandbox configuration,
we could leave off sandbox runtimes and explore more on the configuration side.

We could, for example, use docker or eBPF as a temporary default runtime, and explore how the configuration should look like.
We may want to take a look at the configuration of Cackle-rs as a starting point. By doing so, we wouldn't block on waiting for wasm runtime to being more mature.

* `sandbox.runner`: Target platform sandboxed build scripts built for * `sandbox.target`: Runner that executes artifacts of build scripts Example config: ```toml [sandbox] target = "wasm32-wasip1" runner = [ "wasmtime", "--wasi", "inherit-env=y", "--dir", "/path/to/my/package", ] ```

Add support for `[sandbox]` to specify a target platform and a custom build script runner. To help the external runner finds and executes the executable produced by a build script, it is assumed that only one executable was produced.

smoelius · 2024-10-31T17:40:16Z

Hi, @weihanglo. I learned of this project from the the October project goals update.

I wanted to make you aware of a related project build-wrap that we have been working on at Trail of Bits.

build-wrap's goals are similar those of this PR, though build-wrap relies on the OS to achieve them. I hope it can serve as a source of ideas or at least inspiration. Cheers.

weihanglo · 2024-10-31T20:30:31Z

Hello @smoelius!
Thank you for sharing! I believe I have seen this awesome project but at some point I forgot. Perhaps lacking of Windows support might be the reason I didn't consider it in the place, yet it is still a thing worth further exploration. I'll look into it :)

alexcrichton · 2024-10-31T21:52:57Z

👋 If you and/or Cargo folks are interested I'd be happy to chat in-depth about WASI/WebAssembly/etc things. I can also dump info here if that helps too!

As a perhaps tl;dr; my overall read of the situation here (please correct me if I'm wrong) is that if WASI targets supported std::process::Command it would largely solve the concerns here. If that's true then it's correct that WASI has no support for that today, but I also personally think it would be reasonable to strive to support this in WASI. This certainly isn't the first project (nor will it be the last) to want to support spawning processes. I personally think it's viable to write a WASI interface for spawning processes in such a way that can get agreement in the upstream WASI proposal itself. To be clear though this would be a long process, not a quick fix. It'd require a "champion" of sorts to help the proposal through the phases of WASI which I could certainly be able to help out with though.

In any case though I'm also happy to attend a meeting if you or others would find that helpful too!

RalfJung · 2024-11-01T16:16:18Z

If we just let a wasm script spawn processes that then run outside the sandbox, that seems pointless. So I don't see how the wasm-based approach here is supposed to work.

weihanglo · 2024-11-01T18:10:01Z

If we just let a wasm script spawn processes that then run outside the sandbox, that seems pointless. So I don't see how the wasm-based approach here is supposed to work.

Those spawn processes still run inside the sandbox I believe, just like other sockets and HTTP requests wasi proposals. The wasi runtime needs to provide an interface for granting/limiting what can be spawn. It is pretty hard to design a proper interface for that though.

RalfJung · 2024-11-01T18:25:03Z

That requires a sandbox that can run actual binaries. If we have a sandbox that can run actual binaries (not just wasm code), we may as well build the build script the usual way and run it inside the sandbox. So at that point the use of wasm seems a bit pointless.

jeffparsons · 2024-11-01T21:19:52Z

If we just let a wasm script spawn processes that then run outside the sandbox, that seems pointless.

As a WASI user, the way I'd imagine this working would be that all process spawning is denied by default, and there would be embedder-defined mechanisms for allow-listing spawning, e.g., specific external processes, possibly even with specific patterns of arguments.

I don't have a lot of experience with writing interesting Rust build scripts, but I'd imagine this approach would cover the vast majority of *-sys crates etc. that just need to link to some system-provided C libraries.

Edit: to clarify, the advantage of this over, say, defining Cargo-specific Component Model interfaces, is that build scripts can still be written in Plain Old Rust, and as long as they're only calling out to a set of well-known external programs that are trusted-ish by Rust anyway, then they'll Just Work inside a Wasm sandbox. Any build scripts that try to do anything more exotic than that (e.g call Google's protoc) would fail. So then maybe we could have a mechanism for users to say "I'm okay with this crate's build script calling this external program" but deny by default in "paranoid mode".

alexcrichton · 2024-11-02T04:28:30Z

Personally I feel there's nuance that might be worth exploring here. Should arbitrary executables be allowed by default? Probably not. If the executables were themselves wasm, then probably so! As @jeffparsons mentions I'd also imagine a sort of allow-list style system which could also be in place. Alternatively if this isn't a security sandbox and is instead just a "hermetic build" style sandbox then it's perhaps totally ok to execute whatever so long as the build system can track the dependencies of the "whatever". Furthermore if the goal is to precompile build scripts to wasm then allowing arbitrary executables is totally fine.

Basically if there is room to explore here (which I'd also completely understand if there isn't appetite for that or energy to explore) then I think it could be worth digging in to this more on the wasm side without immediately bouncing off on initial concerns

RalfJung · 2024-11-02T08:16:02Z

IMO security is one of the main motivations for this -- being able to do check-builds of a project and open it in an IDE without having to worry about it exfiltrating secrets from my machine, or mutating any part of the filesystem outside the project folder. As someone who regularly tries to understand third-party libraries, which requires exploring their sourcecode, I would quite appreciate if that wouldn't require me to do manual sandboxing. :) (Note that in many cases I do not run these libraries nor their test suites, so I don't have to trust the generated code.)

Sorry if this is going off-topic, I don't know what is the right place for the "general project discussion" here.

weihanglo added 2 commits October 19, 2024 15:05

feat(sandbox): add -Zsandbox unstable flag

65865e8

weihanglo mentioned this pull request Oct 19, 2024

Explore sandboxed build scripts rust-lang/rust-project-goals#108

Open

10 tasks

weihanglo force-pushed the custom-sandbox-runner-target branch from e0889df to 4ca4cd0 Compare October 19, 2024 20:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(sandbox): add -Zsandbox unstable flag #66

feat(sandbox): add -Zsandbox unstable flag #66

weihanglo commented Oct 19, 2024 •

edited

Loading

smoelius commented Oct 31, 2024

weihanglo commented Oct 31, 2024

alexcrichton commented Oct 31, 2024

RalfJung commented Nov 1, 2024

weihanglo commented Nov 1, 2024

RalfJung commented Nov 1, 2024

jeffparsons commented Nov 1, 2024 •

edited

Loading

alexcrichton commented Nov 2, 2024

RalfJung commented Nov 2, 2024

feat(sandbox): add -Zsandbox unstable flag #66

Are you sure you want to change the base?

feat(sandbox): add -Zsandbox unstable flag #66

Conversation

weihanglo commented Oct 19, 2024 • edited Loading

What does this PR try to resolve?

What did we achieve in this experiment?

Other possibilities

The offical build-rs crate to the rescue

A Cargo-flavored wasi standard library

Continue with other more mature sandbox runtime choices

smoelius commented Oct 31, 2024

weihanglo commented Oct 31, 2024

alexcrichton commented Oct 31, 2024

RalfJung commented Nov 1, 2024

weihanglo commented Nov 1, 2024

RalfJung commented Nov 1, 2024

jeffparsons commented Nov 1, 2024 • edited Loading

alexcrichton commented Nov 2, 2024

RalfJung commented Nov 2, 2024

weihanglo commented Oct 19, 2024 •

edited

Loading

The offical `build-rs` crate to the rescue

jeffparsons commented Nov 1, 2024 •

edited

Loading