Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compiled wasm32-wasip2 component from simple code requires excessive WASI interfaces #133235

Open
ifsheldon opened this issue Nov 20, 2024 · 24 comments
Labels
C-discussion Category: Discussion or questions that doesn't represent real issues. O-wasi Operating system: Wasi, Webassembly System Interface T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@ifsheldon
Copy link

ifsheldon commented Nov 20, 2024

During my learning and experiments of wasip2, I tried this simple code and compiled it to a wasip2 component:

wit_bindgen::generate!({
    // the name of the world in the `*.wit` input file
    world: "formatter",
});

struct Formatter;

impl Guest for Formatter {
    fn format_str(a: String, b: String) -> String {
        let s = format!("{} + {}", a, b);
        print(s.as_str());
        s
    }
}

export!(Formatter);

and

// format.wit
package component:formatter;

world formatter {
    import print: func(s: string);
    export format-str: func(a: string, b: string) -> string; 
}

The host code, which runs the component, is:

use wasmtime::component::*;
use wasmtime::{Engine, Store};
use wasmtime_wasi::{WasiCtx, WasiImpl, WasiView};
use anyhow::Result;
// reference: https://docs.rs/wasmtime/latest/wasmtime/component/bindgen_examples/_0_hello_world/index.html
// reference: https://docs.wasmtime.dev/examples-rust-wasi.html


bindgen!({
    path: "../implementation/wit/format.wit",
    world: "formatter",
});

struct MyState {
    // These two are required basically as a standard way to enable the impl of WasiView
    wasi_ctx: WasiCtx,
    table: ResourceTable,
}

impl WasiView for MyState {
    fn table(&mut self) -> &mut ResourceTable {
        &mut self.table
    }
    fn ctx(&mut self) -> &mut WasiCtx {
        &mut self.wasi_ctx
    }
}

impl FormatterImports for MyState {
    fn print(&mut self, s: String) {
        println!("{}", s);
    }
}

/// copied from wasmtime_wasi::type_annotate, which is a private function
fn type_annotate<T: WasiView, F>(val: F) -> F
where
    F: Fn(&mut T) -> WasiImpl<&mut T>,
{
    val
}

fn main() -> Result<()> {
    let engine = Engine::default();
    let component = Component::from_file(
        &engine,
        "../implementation/target/wasm32-wasip2/release/implementation.wasm",
    )?;

    let mut linker = Linker::new(&engine);

    let ctx = wasmtime_wasi::WasiCtxBuilder::new().build();
    let state = MyState {
        wasi_ctx: ctx,
        table: ResourceTable::new(),
    };
    let mut store = Store::new(&engine, state);
    Formatter::add_to_linker(&mut linker, |s| s)?;
    
    // Note: 
    // The below block is copied from `wasmtime_wasi::add_to_linker_sync`.
    // why a "format!()" in the implementation needs `sync::filesystem::types`, `sync::io::streams`, `cli::exit`, `cli::environment`, `cli::stdin`, `cli::stdout`, `cli::stderr`?
    {
        let l = &mut linker;
        let closure = type_annotate::<MyState, _>(|t| WasiImpl(t));
        wasmtime_wasi::bindings::sync::filesystem::types::add_to_linker_get_host(l, closure)?;
        wasmtime_wasi::bindings::filesystem::preopens::add_to_linker_get_host(l, closure)?;
        wasmtime_wasi::bindings::io::error::add_to_linker_get_host(l, closure)?;
        wasmtime_wasi::bindings::sync::io::streams::add_to_linker_get_host(l, closure)?;
        wasmtime_wasi::bindings::cli::exit::add_to_linker_get_host(l, closure)?;
        wasmtime_wasi::bindings::cli::environment::add_to_linker_get_host(l, closure)?;
        wasmtime_wasi::bindings::cli::stdin::add_to_linker_get_host(l, closure)?;
        wasmtime_wasi::bindings::cli::stdout::add_to_linker_get_host(l, closure)?;
        wasmtime_wasi::bindings::cli::stderr::add_to_linker_get_host(l, closure)?;
    }

    let bindings = Formatter::instantiate(&mut store, &component, &linker)?;
    let result = bindings.call_format_str(&mut store, "a", "b")?;
    println!("format_str: {}", result);
    Ok(())
}

The block with a note binds many interfaces to avoid runtime errors that says something like
component imports instance wasi:cli/[email protected], but a matching implementation was not found in the linker. Removing any one of the lines in the block will result in a runtime error.

I expect this compiled component requires none of these WASI interfaces to run, since it has nothing to do with io, cli, etc. Binding these unnecessary interfaces may raise security concerns.

The full minimized code is here.

As a kind person pointed out on ByteAlliance Zulip, these interfaces are required by std.

Probably there's a way to minimize or prune the interface requirements in the compilation? I think rustc has all the information of which effects are used by any one of functions/macros that is used by a user.

At the very lease, I think we should document these requirements somewhere, so there are no hidden/dark interface dependencies that are not specified and unknown in WIT files.

Meta

rustc --version --verbose:

rustc 1.82.0 (f6e511eec 2024-10-15)
binary: rustc
commit-hash: f6e511eec7342f59a25f7c0534f1dbea00d01b14
commit-date: 2024-10-15
host: aarch64-apple-darwin
release: 1.82.0
LLVM version: 19.1.1
@ifsheldon ifsheldon added the C-bug Category: This is a bug. label Nov 20, 2024
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Nov 20, 2024
@jieyouxu jieyouxu added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. O-wasm Target: WASM (WebAssembly), http://webassembly.org/ and removed T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Nov 20, 2024
@jieyouxu
Copy link
Member

jieyouxu commented Nov 20, 2024

I'm not a wasi/wasm expert by any means, but doesn't your example in the repo use wit_bindgen which probably has non-trivial proc-macros? Is there a more minimal repro that only uses rustc or cargo but with only std?

EDIT: Nevermind I read the zulip thread, so this is T-libs

This feels weird to me, why does format!() need to access the environment? To avoid these runtime errors, it turns out the following interfaces need to be linked to the linker:
sync::filesystem::types
sync::io::streams
cli::exit
cli::environment
cli::stdin
cli::stdout
cli::stderr
Is it expected and why? Or, is it a bug?

@jieyouxu

This comment has been minimized.

@jieyouxu jieyouxu added T-libs Relevant to the library team, which will review and decide on the PR/issue. C-discussion Category: Discussion or questions that doesn't represent real issues. O-wasi Operating system: Wasi, Webassembly System Interface and removed C-bug Category: This is a bug. needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. O-wasm Target: WASM (WebAssembly), http://webassembly.org/ labels Nov 20, 2024
@jieyouxu
Copy link
Member

Ah right we have a wasi ping group now
@rustbot ping wasi

@rustbot
Copy link
Collaborator

rustbot commented Nov 20, 2024

Hey WASI notification group! This issue or PR could use some WASI-specific guidance.
Could one of you weigh in? Thanks <3

(In case it's useful, here are some instructions for tackling these sorts of
issues).

cc @alexcrichton @burakemir @juntyr

@juntyr
Copy link
Contributor

juntyr commented Nov 20, 2024

My guess would be that format is pulling in the entire panic machinery which includes a lot more string formatting. You could try compiling the wasm binary and std with panic immediate abort (I don’t know the exact flags by heart) and see if that removes those extraneous environmental dependencies.

@jieyouxu
Copy link
Member

Based on rust-lang/wg-cargo-std-aware#29 it might be something like cargo ... -Zbuild-std -Zbuild-std-features=panic_immediate_abort (I haven't tried this)

@juntyr
Copy link
Contributor

juntyr commented Nov 20, 2024

Based on rust-lang/wg-cargo-std-aware#29 it might be something like cargo ... -Zbuild-std -Zbuild-std-features=panic_immediate_abort (I haven't tried this)

Those look correct - you might also need RUSTFLAGS=-C panic=abort

@ifsheldon
Copy link
Author

Alright, I changed my build command to RUSTFLAGS="-C panic=abort" cargo build --target wasm32-wasip2 --release -Zbuild-std -Zbuild-std-features=panic_immediate_abort but I got this error

error[E0152]: duplicate lang item in crate `core`: `sized`
  |
  = note: the lang item is first defined in crate `core` (which `implementation` depends on)
  = note: first definition in `core` loaded from /Users/zhiqiu/offline_code/opensource/wit_issue/implementation/target/wasm32-wasip2/release/deps/libcore-55d3ecbce88cf86e.rlib, /Users/zhiqiu/offline_code/opensource/wit_issue/implementation/target/wasm32-wasip2/release/deps/libcore-55d3ecbce88cf86e.rmeta
  = note: second definition in `core` loaded from /Users/zhiqiu/.rustup/toolchains/nightly-2024-11-19-aarch64-apple-darwin/lib/rustlib/wasm32-wasip2/lib/libcore-65e53cbf30438bae.rlib

This is out of my mind and I don't know if it's another solvable issue or bug....... Kind of reminds me of the horrible building errors in C.

I think you can reproduce this in this branch of my repo.

@bjorn3
Copy link
Member

bjorn3 commented Nov 21, 2024

Maybe using -Zbuild-std=core,std instead of -Zbuild-std works? Also you don't need -Cpanic=abort, that is already the default for wasm32-wasip2.

@ifsheldon
Copy link
Author

Maybe using -Zbuild-std=core,std instead of -Zbuild-std works? Also you don't need -Cpanic=abort, that is already the default for wasm32-wasip2.

@bjorn3 Thanks, I tried this
cargo build --target wasm32-wasip2 --release -Zbuild-std="core,std" -Zbuild-std-features=panic_immediate_abort and cargo build --target wasm32-wasip2 --release -Zbuild-std="core,std". But I still got

error[E0152]: duplicate lang item in crate `core`: `sized`
  |
  = note: the lang item is first defined in crate `core` (which `implementation` depends on)
  = note: first definition in `core` loaded from /Users/zhiqiu/offline_code/opensource/wit_issue/implementation/target/wasm32-wasip2/release/deps/libcore-65b8a47a5d02d391.rlib, /Users/zhiqiu/offline_code/opensource/wit_issue/implementation/target/wasm32-wasip2/release/deps/libcore-65b8a47a5d02d391.rmeta
  = note: second definition in `core` loaded from /Users/zhiqiu/.rustup/toolchains/nightly-2024-11-19-aarch64-apple-darwin/lib/rustlib/wasm32-wasip2/lib/libcore-65e53cbf30438bae.rlib

For more information about this error, try `rustc --explain E0152`

And I don't quite understand the explanation given by rustc --explain E0152. Could you try my code in the repo and see whether you can reproduce this?

@alexcrichton
Copy link
Member

I believe the command you want is:

cargo +nightly build -Zbuild-std=std,panic_abort -Zbuild-std-features=panic_immediate_abort --target wasm32-wasip2 --release

Notably panic_abort needs to be passed to -Zbuild-std. After that I get:

$ wasm-tools component wit ./target/wasm32-wasip2/release/wat.wasm
package root:component;

world root {
  import print: func(s: string);

  export format-str: func(a: string, b: string) -> string;
}

which I believe is what you want

@ifsheldon
Copy link
Author

ifsheldon commented Nov 21, 2024

@alexcrichton Wow! That works like a charm. Thanks a lot!

As a Rust application developer and a WASI newbie, that long command has a lot to take in.

I have a few questions (not trying to push anything but for discussion):

  • Why isn't this profile or set of settings the default for wasm32-wasip2 target?
    As @bjorn3 mentioned, -Cpanic=abort is a default for wasm32-wasip2 target. So why don't we do the same? I know almost nothing about these -C and -Z flags, though.
  • What if we indeed need some of these interfaces, will rustc automatically/secretly add some interface requirements for us? which might be good and/or bad.
    For example, if I indeed use std::env::var("PATH"), with such a long command, will rustc add an import of cli::environment for us? I guess so.
  • If rustc will add WIT imports for us automatically, is it really a desired behavior?
    IMHO, I think this, to some degree, undermines the promise WASIp2 gives. I would rather prefer rustc to throw a hard error when std or some functions from std need some interfaces but I forgot to specify imports for them in the component's WIT file.

Do you guys think those questions are hard to answer? I might just write up a blog or actually try to solve these issues if they are not like rabbit-hole hard.

@bjorn3
Copy link
Member

bjorn3 commented Nov 21, 2024

Why isn't this profile or set of settings the default for wasm32-wasip2 target?
As @bjorn3 mentioned, -Cpanic=abort is a default for wasm32-wasip2 target. So why don't we do the same? I know almost nothing about these -C and -Z flags, though.

-Cpanic=abort and panic_immediate_abort are not the same. -Cpanic=abort will run the panic handler/hook when panicking to print the panic message before it aborts, while panic_immediate_abort replaces every panic!() with std::process::abort().

What if we indeed need some of these interfaces, will rustc automatically/secretly add some interface requirements for us? which might be good and/or bad.
For example, if I indeed use std::env::var("PATH"), with such a long command, will rustc add an import of cli::environment for us? I guess so.

Yes, if you use any libstd api that needs those interfaces, they will automatically be imported.

If rustc will add WIT imports for us automatically, is it really a desired behavior?
IMHO, I think this, to some degree, undermines the promise WASIp2 gives. I would rather prefer rustc to throw a hard error when std or some functions from std need some interfaces but I forgot to specify imports for them in the component's WIT file.

The standard library has what is effectively1 a call to:

bindgen!({
    path: "/path/to/wasip2/cli/imports.wat",
    world: "wasi:cli/imports",
});

which causes all interfaces in the wasi:cli/imports world to be imported. wasm-components-ld (the linker for wasm32-wasip2) merges all imported interfaces of all object files that get included in the final executable and strips all functions that end up never being used anywhere and then uses the result as the "world" for the wasm component. Note that at worlds are pretty much a fiction presented by the .wit text format. The binary wasm component format only deals with imported and exported interfaces.

Footnotes

  1. technically it is wasi-libc which embeds the same metadata as this macro invocation would embed in the crate, but given that libstd depends on wasi-libc, this is equivalent in terms of the user observable behavior.

@ifsheldon
Copy link
Author

Thst's.......... a lot of information and clears out most of the mist in my mind. Thanks a lot! @bjorn3

I'd like to ask a few more questions, if you don't mind.

-Cpanic=abort and panic_immediate_abort are not the same. -Cpanic=abort will run the panic handler/hook when panicking to print the panic message before it aborts, while panic_immediate_abort replaces every panic!() with std::process::abort().

This partially explains why panic_immediate_abort is not a default for wasm32-wasip2. So, as far as I understand, rustc is trying to enable helpful behaviors (i.e., run the panic handler/hook when panicking to print the panic message before it aborts) when it compiles the component and add (seemingly) excessive interface requirements.

Yes, if you use any libstd api that needs those interfaces, they will automatically be imported.

Take std::env::var("PATH") as an example, you mean if I use this and only this, the whole set of wasi:cli/imports will be imported, right? If so, that seems a bit "excessive" to me. IMHO, I'd prefer it to only import wasi:cli/environment, like really "pay for what you need".

Taking a step back, I think the information you provided is really valuable and useful for understanding how WASI generally works in Rust. Can you please put that in the rustc book? I could make a PR, but I think it's not appropriate for me to do that.

@bjorn3
Copy link
Member

bjorn3 commented Nov 24, 2024

Take std::env::var("PATH") as an example, you mean if I use this and only this, the whole set of wasi:cli/imports will be imported, right? If so, that seems a bit "excessive" to me. IMHO, I'd prefer it to only import wasi:cli/environment, like really "pay for what you need".

It only imports the get-environment method of the wasi_cli/environment interface (+ other interfaces used by the panic handler if it is possible to panic).

@ifsheldon
Copy link
Author

OK, the problem is conceptually clear to me. Thanks!

So, the behavior of the default panic handler is documented here

The default hook, which is registered at startup, prints a message to standard error and generates a backtrace if requested. This behavior can be customized using the set_hook function.

I tried set_hook function, but it still brings in a lot of unnecessary interfaces. See this commit ifsheldon/wit_interface_issue@ee637c7

Image

Maybe I should use std::panic::always_abort?

@bjorn3
Copy link
Member

bjorn3 commented Nov 25, 2024

I tried set_hook function, but it still brings in a lot of unnecessary interfaces. See this commit ifsheldon/wit_interface_issue@ee637c7

If a panic happens before the set_hook() call, libstd's panic handler would still call the default panic hook. And even if that doesn't happen, the PANIC_HOOK static is initialized with the default panic hook as value and the compiler isn't smart enough to optimize this away.

Maybe I should use std::panic::always_abort?

That won't have any effect. It only sets a flag and doesn't enable any optimization opportunities.

@ifsheldon
Copy link
Author

OK then. Are we stuck? I don't know anything else to set the panic handler globally. I found #[no_std] and #[panic_handler] when I googled how to set it, but I barely know Embedding Rust and I think probably Embedding Rust is not a good fit.

@bjorn3
Copy link
Member

bjorn3 commented Nov 25, 2024

The easiest solution is to provide the full wasip2 interface to the wasm component, so using wasmtime_wasi::add_to_linker_sync instead of wasmtime_wasi::bindings::cli::environment::add_to_linker_get_host. For that you will have to implement wasmtime_wasi::WasiView for your MyState type.

@ifsheldon
Copy link
Author

By "stuck" I mean stuck in solving this issue, not making this program run. I did implement wasmtime_wasi::WasiView and add wasmtime_wasi::add_to_linker_sync in my original code, after all. But the issue is why a simple program requires more interfaces that the programmer intended to grant and how to fix it. This issue about solving excessiveness, not making the host run a guest.

We see in the case of format!() that we can just use cargo +nightly build -Zbuild-std=std,panic_abort -Zbuild-std-features=panic_immediate_abort --target wasm32-wasip2 --release.

But in the case of std::env::var("PATH"), cargo +nightly build -Zbuild-std=std,panic_abort -Zbuild-std-features=panic_immediate_abort --target wasm32-wasip2 --release is not enough to prune interface requirements. It still brings in these interface requirements:

sync::filesystem::types
sync::io::streams
cli::exit
cli::environment
cli::stdin
cli::stdout
cli::stderr

I want to know if there's a way to configure panic handler globally ahead of compilation, so the panic handler does not brings in excessive interface requirements other than cli::environment.

@bjorn3
Copy link
Member

bjorn3 commented Nov 26, 2024

It is not possible to replace the panic handler when using libstd. There can only be a single panic handler in the whole program and libstd always defines a panic handler.

@ifsheldon
Copy link
Author

It is not possible to replace the panic handler when using libstd.

OK........ So, it seems we have to turn to #[no_std]?

I think it is definitely an issue to be resolved because:

No ways to replace the panic handler when using libstd -> No ways to prune interfaces requirements when a component uses libstd -> A host must grant all interfaces required by the panic handler of libstd to run an untrustable guest -> Potentially compromised security guarantees of WASIp2

@bjorn3
Copy link
Member

bjorn3 commented Nov 27, 2024

The default permissions WasiCtxBuilder::new() grants the guest are access to a randomness source and access to the host clock and even those can be disabled if you want. Everything else you need to explicitly grant permission. Accessing a directory? You have to use builder.preopened_dir(). Accessing the network? You have to make use builder.socket_addr_check() with a closure that doesn't always return false. Exposing env vars to the guest? You have to use builder.env(), builder.envs() or builder.inherit_env().

@ifsheldon
Copy link
Author

OK, I see your points. The host can provide random stubs even if a component requires excessive interfaces.

My feeling is that there is some inconsistency: Conceptually, a host opts in capabilities that it chooses to offer to a component, e.g., implementing FormatterImports and Formatter::add_to_linker(). The reality is that a host basically has to offer sync::filesystem::types etc for components using libstd (basically equivalent to all components) to run. These interfaces are in fact not opt-in-able, but virtually compulsory for all hosts. Hosts can provide random stubs, though.

Besides, I did some search, I guess for custom panic handler, we'll have to wait for #59222 and #32837 ...........

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-discussion Category: Discussion or questions that doesn't represent real issues. O-wasi Operating system: Wasi, Webassembly System Interface T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

6 participants