Compilation to WASM? #722

Open
RReverser opened this issue May 2, 2019 · 138 comments
Labels
A-meta: Not about any part of Miri per se, but about shaping the environment to make something in/with Miri
C-proposal: Category: a proposal for something we might want to do, or maybe not; details still being worked out

Comments

@RReverser

RReverser commented May 2, 2019

Miri maintainer note: this is a fun project, but not something we currently intend to support officially. To keep maintenance manageable, Miri only supports running on platforms that rustc supports running on.

Compiling the whole Rustc to WASM is a pretty big undertaking for many reasons.

However, Miri doesn't need an actual codegen and many other parts of the whole Rustc, so I wonder how realistic it would be to compile it and the pieces it depends on to WASM instead? Are there any obvious blockers?

Mostly opening this to gauge interest and estimate complexity, as I believe there is an interest in running Rust directly in the browser on playground-like websites.

P.S. Despite what I said in the first sentence, this was actually done for Clang a while ago - https://tbfleming.github.io/cib/ - which includes LLVM compiled to WASM that, in turn, generates more WASM dynamically during runtime. In theory, it should be possible to do the same for Rust, especially since they share LLVM, but for now having just an interpreter could already be an interesting starting goal.

@oli-obk
Contributor

oli-obk commented May 3, 2019

I think a first step would be to create a new codegen backend which doesn't actually do any codegen and just dumps the metadata. That way we should be able to build a rustc which doesn't depend on llvm or other C code.

@bjorn3
Member

bjorn3 commented May 3, 2019

I think a first step would be to create a new codegen backend which doesn't actually do any codegen and just dumps the metadata.

I had created one in the past but it bitrotted and it was unused, so I removed it in rust-lang/rust#58847. If you copy https://github.com/bjorn3/rustc_codegen_cranelift/blob/11d816c/src/lib.rs#L182-L190 into provide and provide_extern it should work. (Also change target_features_whitelist to contain the same as cg_llvm https://github.com/rust-lang/rust/blob/2d401fb4dc89eaef5b8f31330636094f9c26b4c4/src/librustc_codegen_llvm/llvm_util.rs#L249, otherwise stdsimd won't compile.)

Another thing necessary is replacing the dlopen for loading the codegen backend with a regular extern crate. E.g., replace the match at https://github.com/rust-lang/rust/blob/08bfe16129b0621bc90184f8704523d4929695ef/src/librustc_interface/util.rs#L271 with _ => || Box::new(MetadataOnlyCodegenBackend) or whatever you call the backend.
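
A minimal sketch of the idea in the comments above, assuming a simplified stand-in trait. The `CodegenBackend` trait, the `MetadataOnlyCodegenBackend` body, and `get_codegen_backend` shown here are hypothetical and far smaller than the real rustc items; the point is only that the backend is picked statically instead of being loaded with dlopen.

```rust
/// Hypothetical stand-in for rustc's backend trait; the real one has more methods.
trait CodegenBackend {
    fn name(&self) -> &'static str;
    /// Returns the bytes that would be written to the output file.
    fn codegen_crate(&self, crate_name: &str) -> Vec<u8>;
}

/// Backend that performs no code generation and only emits (fake) metadata.
struct MetadataOnlyCodegenBackend;

impl CodegenBackend for MetadataOnlyCodegenBackend {
    fn name(&self) -> &'static str {
        "metadata-only"
    }

    fn codegen_crate(&self, crate_name: &str) -> Vec<u8> {
        // A real backend would serialize the crate metadata here.
        format!("metadata for {}", crate_name).into_bytes()
    }
}

/// Statically linked replacement for a dlopen-based loader:
/// every requested backend name maps to the one backend compiled in.
fn get_codegen_backend(_requested: &str) -> Box<dyn CodegenBackend> {
    Box::new(MetadataOnlyCodegenBackend)
}

fn main() {
    let backend = get_codegen_backend("llvm");
    let out = backend.codegen_crate("example");
    println!("{}: {} bytes", backend.name(), out.len());
}
```

Because nothing is loaded at run time, the resulting binary has no dependency on dynamic loading, which is what a wasm build needs.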

@bjorn3
Member

bjorn3 commented May 3, 2019

I want to do the same for https://github.com/bjorn3/rustc_codegen_cranelift/, but I want it to pass the rustc test suite first and cranelift doesn't support wasm output yet.

@RalfJung RalfJung added C-project Category: a larger project is being tracked here, usually with checkmarks for individual steps A-meta Not about any part of Miri per se, but about shaping the environment to make something in/with Miri labels May 3, 2019
@RalfJung
Member

RalfJung commented May 3, 2019

Intriguing. :) I should add one warning though: Miri isn't a fast interpreter. It's really slow. So I don't think it is actually a good environment to use to run code, I see it as more useful for debugging and testing.

But don't let me stop you! I just felt I should give you fair warning. And if ideas like this lead to people making Miri lightning fast while maintaining all the UB checking, I'll be even happier. :D

@RReverser
Author

Miri isn't a fast interpreter. It's really slow. So I don't think it is actually a good environment to use to run code, I see it as more useful for debugging and testing.

That's understandable, but I think it ought to be good enough for typical playground snippets :)

@RReverser
Author

I think a first step would be to create a new codegen backend which doesn't actually do any codegen and just dumps the metadata. That way we should be able to build a rustc which doesn't depend on llvm or other C code.

I guess that's one way, although I was wondering if Miri actually needs the main rustc crate or maybe it could be possible to depend only on some of the finer-grained rustc_* crates and avoid including codegen altogether?

@bjorn3
Member

bjorn3 commented May 3, 2019

The codegen backend is necessary for rustc_driver to work. Using a dummy codegen backend (<100LOC mostly copyable from the code I mentioned in #722 (comment)) is a lot easier than duplicating all the things rustc_driver does (>>1000LOC).

@RReverser
Author

Fair enough.

@bjorn3
Member

bjorn3 commented May 4, 2019

I am currently trying to compile rustc for wasm (https://github.com/bjorn3/rust/tree/compile_rustc_for_wasm), but I am hitting a compiler bug: rust-lang/rust#60540.

@RReverser
Author

@bjorn3 I've rebased your branch onto master, updated deps and fixed cfg's from target_env to target_os - you can check it out at https://github.com/RReverser/rust/tree/compile_rustc_for_wasm.
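
For readers unfamiliar with the cfg change: WASI targets report `target_os = "wasi"`, so platform-specific code is selected on `target_os`. A small sketch, where the function name `platform_note` is made up for illustration:

```rust
/// Hypothetical helper: reports which code path was compiled in.
#[cfg(target_os = "wasi")]
fn platform_note() -> &'static str {
    "wasi"
}

#[cfg(not(target_os = "wasi"))]
fn platform_note() -> &'static str {
    "native host"
}

fn main() {
    // `cfg!` evaluates the same predicate as a compile-time boolean.
    println!("{} (wasi = {})", platform_note(), cfg!(target_os = "wasi"));
}
```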

Eventually it compiled successfully, but then ran into the same runtime validation issue with invalid code generated by Rust.

However, I recompiled in release mode and then it passed validation!

That got me thinking it should work now, but running the generated file with wasmtime or wasmer now seems to just hang. Some infinite loop somewhere perhaps?

@RReverser
Author

RReverser commented May 11, 2019

@bjorn3 Oh... maybe it was just taking that long (especially the compilation part). I've now tried wasmer with --backend singlepass instead, and it actually worked!

$ ./wasmer run target/wasm32-unknown-wasi/release/rustc_binary.wasm --backend singlepass
Usage: rustc [OPTIONS] INPUT

Options:
    -h, --help          Display this message
        --cfg SPEC      Configure the compilation environment
    -L [KIND=]PATH      Add a directory to the library search path. The
                        optional KIND can be one of dependency, crate, native,
                        framework or all (the default).
    -l [KIND=]NAME      Link the generated crate(s) to the specified native
                        library NAME. The optional KIND can be one of
                        static, dylib, or framework. If omitted, dylib is
                        assumed.
        --crate-type [bin|lib|rlib|dylib|cdylib|staticlib|proc-macro]
                        Comma separated list of types of crates
                        for the compiler to emit
        --crate-name NAME
                        Specify the name of the crate being built
        --edition 2015|2018
                        Specify which edition of the compiler to use when
                        compiling code.
        --emit [asm|llvm-bc|llvm-ir|obj|metadata|link|dep-info|mir]
                        Comma separated list of types of output for the
                        compiler to emit
        --print [crate-name|file-names|sysroot|cfg|target-list|target-cpus|target-features|relocation-models|code-models|tls-models|target-spec-json|native-static-libs]
                        Comma separated list of compiler information to print
                        on stdout
    -g                  Equivalent to -C debuginfo=2
    -O                  Equivalent to -C opt-level=2
    -o FILENAME         Write output to <filename>
        --out-dir DIR   Write output to compiler-chosen filename in <dir>
        --explain OPT   Provide a detailed explanation of an error message
        --test          Build a test harness
        --target TARGET Target triple for which the code is compiled
    -W, --warn OPT      Set lint warnings
    -A, --allow OPT     Set lint allowed
    -D, --deny OPT      Set lint denied
    -F, --forbid OPT    Set lint forbidden
        --cap-lints LEVEL
                        Set the most restrictive lint level. More restrictive
                        lints are capped at this level
    -C, --codegen OPT[=VALUE]
                        Set a codegen option
    -V, --version       Print version info and exit
    -v, --verbose       Use verbose output

Additional help:
    -C help             Print codegen options
    -W help             Print 'lint' options and default settings
    -Z help             Print unstable compiler options
    --help -v           Print the full set of options rustc accepts

@bjorn3
Member

bjorn3 commented May 11, 2019

Oh... maybe it's just been taking so long (especially the compilation part).

Yes, it takes several minutes to compile it using wasmtime with cranelift as backend.

However, I recompiled in release mode and then it passed validation!

🎉 🎉 🎉

and it has actually worked!

I tried actually compiling something, but it errors with:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Custom { kind: Other, error: StringError("operation not supported on wasm yet") }', src/libcore/result.rs:999:5
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.

I am currently trying to figure out where it errors.

@bjorn3
Member

bjorn3 commented May 11, 2019

Places needing patching:

  • librustc_interface/util.rs: spawn_thread_pool must not spawn a thread.
  • librustc/session/filesearch.rs: get_or_default_sysroot must not be called, as it needs std::env::current_exe. I am now passing an explicit --sysroot.
  • librustc/session/mod.rs: build_session_ local working_dir is created using std::env::current_dir.
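
The patches listed above share one pattern: avoid a std call that WASI does not support. A rough sketch of the first two, assuming made-up helper names (`run_compiler`, `resolve_sysroot`) and an assumed `<sysroot>/bin/rustc` layout; none of this is the actual rustc code:

```rust
use std::io;
use std::path::PathBuf;

/// Hypothetical stand-in for `spawn_thread_pool`: on WASI the closure runs
/// on the current thread, elsewhere it runs on a freshly spawned one.
fn run_compiler<R: Send + 'static>(f: impl FnOnce() -> R + Send + 'static) -> R {
    if cfg!(target_os = "wasi") {
        f()
    } else {
        std::thread::spawn(f).join().expect("compiler thread panicked")
    }
}

/// Hypothetical stand-in for `get_or_default_sysroot`: prefer an explicit
/// `--sysroot` and only fall back to `current_exe` when none was given.
fn resolve_sysroot(explicit: Option<PathBuf>) -> io::Result<PathBuf> {
    match explicit {
        Some(path) => Ok(path),
        None => {
            // Assumed layout: <sysroot>/bin/rustc
            let exe = std::env::current_exe()?;
            exe.parent()
                .and_then(|bin| bin.parent())
                .map(|root| root.to_path_buf())
                .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no sysroot"))
        }
    }
}

fn main() {
    let answer = run_compiler(|| 6 * 7);
    let sysroot = resolve_sysroot(Some(PathBuf::from("/opt/rust"))).unwrap();
    println!("{} {}", answer, sysroot.display());
}
```

The working-directory case could be handled the same way, by taking the directory from an explicit argument instead of calling `std::env::current_dir`.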

Edit: pushed bjorn3/rust@15f980f (based on @RReverser's branch).

Now it errors at

thread 'main' panicked at 'unknown codegen backend llvm', src/librustc_interface/util.rs:277:18

Which is expected, as I had to remove the codegen backend dynamic loader. Will try to get https://github.com/bjorn3/rustc_codegen_cranelift to work with it.

bjorn3 added a commit to rust-lang/rustc_codegen_cranelift that referenced this issue May 11, 2019
@RReverser
Author

Places needing patching

FWIW, previously (before even filing this issue) I tried compiling rustc with Emscripten instead, which should, in theory, reduce the number of places to patch, as it supports a bit more than WASI does. I didn't get very far though, because I tried to build a completely unpatched rustc, and a few things still didn't compile and probably needed fixes similar to those in your branch.

Which is expected, as I had to remove the codegen backend dynamic loader. Will try to get bjorn3/rustc_codegen_cranelift to work with it.

I thought the plan was to build it without any codegen, just with miri? Or do you want to build an actual full rustc?

@bjorn3
Member

bjorn3 commented May 11, 2019

I thought the plan was to build it without any codegen, just with miri? Or do you want to build an actual full rustc?

I want them both. :) I currently have rustc_codegen_cranelift hooked up, but rustc gives an error before rustc_codegen_cranelift can do actual codegen: can't find crate for `std` . Supporting miri will need that error to be fixed too.

@RReverser
Author

can't find crate for std

Yeah for that I think you'll need to do the proper build (via x.py build) to build all components. I haven't had much luck with that yet due to failures in other crates which probably also need to be patched similarly to rustc itself.

@bjorn3
Member

bjorn3 commented May 11, 2019

Yeah for that I think you'll need to do the proper build (via x.py build) to build all components.

Seems like it doesn't even reach the rustc version check for the libraries. I added --sysroot $(rustc --print sysroot) and disabled the rustc version check, but it still gives the same error.

@bjorn3
Member

bjorn3 commented May 11, 2019

Switching from wasmer to wasmtime fixed it. It even got to the beginning of codegen.

Edit: filed wasmerio/wasmer#434.

@bjorn3
Member

bjorn3 commented May 11, 2019

I am currently working on making miri compile for wasi, which this issue was actually about.

@bjorn3
Member

bjorn3 commented May 11, 2019

It seems to trap while calling the ecx.run. :(

error while processing main module ../../target/wasm32-unknown-wasi/release/rustc_binary.wasm: Instantiation error: Trap occurred while invoking start function: wasm trap at 0x2a881eb82

I pushed the wip stuff to my branch.

@RReverser
Author

@bjorn3 Left a comment on your MIRI commit on your branch.

@RReverser
Author

@bjorn3 But also, I'm not sure why rustc is now depending on miri... shouldn't it be the other way around? (like in non-WASI version)

@bjorn3
Member

bjorn3 commented May 12, 2019

I did that to avoid recompiling every rustc crate, which is slow, and to avoid copying all the files into the directory layout rustc expects of a sysroot.

@RReverser
Author

I'm not sure I understand what you're saying... neither should be affected by which crate you compile as an entry point.

I've changed my local copy of Rust & MIRI to do just that, and got miri.wasm successfully, but yeah, also hitting some trap.

@bjorn3
Member

bjorn3 commented May 12, 2019

I meant that I had already compiled all crates in rust/target. When switching to miri as crate root, I would have to recompile all crates into miri/target. By keeping rustc-binary as crate root, I could reuse rust/target as target dir.

@bjorn3
Member

bjorn3 commented Jan 25, 2024

Opened rust-lang/rust#120348 to upstream most of the build system changes, pushed rust-lang/rustc_codegen_cranelift@7d3b293 to upstream a cg_clif change and managed to somewhat reduce the diff on the compile_rustc_for_wasm13 branch. The second commit on that branch uses the wasm32-wasi-preview1-threads target to further reduce the diff, but requires wasi-threads, which browser_wasi_shim currently doesn't support.

@bjorn3
Member

bjorn3 commented Feb 21, 2024

rust-lang/rust#120348 has been merged. Opened rust-lang/rust#121392 to make patching away dlopen usage easier. I rebased the compile_rustc_for_wasm13 branch and removed a couple of unnecessary changes.

Edit: rust-lang/rust#121392 has been merged too. Rebased.

@bjorn3
Member

bjorn3 commented Feb 22, 2024

image

@jeff-hykin

image

Got a link to the html/js for this?
This looks incredible even with all the caveats I'm sure it has!

@bjorn3
Member

bjorn3 commented Feb 23, 2024

I took https://github.com/bjorn3/browser_wasi_shim/blob/main/examples/rustc.html and made it invoke the miri I built from the second to last commit of the compile_rustc_for_wasm13 branch instead.

@jeff-hykin

jeff-hykin commented Feb 23, 2024

For others trying to run it, thats

@Nadrieril
Member

Somehow no one pasted this here: https://garriga.dev/rubri/

Rubri is a proof-of-concept wrapper for Miri that runs in the browser.

cc @LyonSyonII

@bjorn3
Member

bjorn3 commented Sep 25, 2024

New version available at the compile_rustc_for_wasm16 branch of https://github.com/bjorn3/rust. @oligamiq created a PR to use @whitequark's LLVM fork, so it should now be able to build for wasm, though I haven't tested linking yet. It also currently requires using wasm32-wasip1-threads, which doesn't work with browser_wasi_shim yet. You can still use the cg_clif backend with wasm32-wasip1 without threads instead by omitting --config config.llvm.toml when running ./x.py install.

@whitequark
Member

FWIW, I'm happy to see this effort and will support it from my side as much as I'm able to. Let me know if you need any further adjustments LLVM-side (though I'm sorta waiting on consensus on handling pthreads shims in wasi-libc at the moment).

@bjorn3
Member

bjorn3 commented Sep 25, 2024

The pthread shims thing (or adding wasip1-threads support to browser_wasi_shim) is the biggest blocker for getting this working in the browser. For complete end-to-end execution in the browser, lld would also need to be supported. I haven't checked if it already works though. I would need to rebuild LLVM again to check. And I would probably need to patch the Rust standard library to support a custom host call for spawning processes, so that rustc can actually invoke lld.

@oligamiq

Regarding browser_wasi_shim's support for wasm32-wasip1-threads, it is currently under development. We're just one step away from the final stage.

rustc creates thread 1, and from there it generates 4 threads. However, after closing threads 2–5, it throws in thread 1, which causes the main wasm to not terminate. In order to terminate the currently running wasm along with its threads, the plan is to close all child threads when a throw occurs and make this observable. I plan to finish this tomorrow.

mpsc::channel is working properly. When outputting 1–1000 from both the main thread and the threads, the output alternates to some extent, so the functionality should be at a sufficient level.
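
A test of that shape could look like the sketch below. It is not the actual test program; the function name `interleave` and the worker count are assumptions. Worker threads and the calling thread all send 1 to 1000 over one `mpsc::channel`, and the receiver collects the messages in arrival order.

```rust
use std::sync::mpsc;
use std::thread;

/// Sends 1..=1000 from `workers` spawned threads plus the calling thread,
/// then returns every (sender id, value) pair in arrival order.
fn interleave(workers: usize) -> Vec<(usize, u32)> {
    let (tx, rx) = mpsc::channel();

    let handles: Vec<_> = (1..=workers)
        .map(|id| {
            let tx = tx.clone();
            thread::spawn(move || {
                for n in 1..=1000u32 {
                    tx.send((id, n)).expect("receiver dropped");
                }
            })
        })
        .collect();

    // Sender id 0 is the calling (main) thread.
    for n in 1..=1000u32 {
        tx.send((0, n)).expect("receiver dropped");
    }
    drop(tx); // drop the last sender so the receiving iterator can finish

    for handle in handles {
        handle.join().expect("worker panicked");
    }
    rx.iter().collect()
}

fn main() {
    let received = interleave(4);
    println!("received {} messages", received.len());
}
```

Inspecting the sender ids in the returned vector shows how much the threads actually interleave on a given runtime.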

However, since no thread pool has been created and files are accessible from multiple workers, the speed has significantly dropped...

I feel bad for Bjorn, since I kept working after submitting the pull request, at the point where I made the file system accessible from multiple workers. I'm currently working on a different branch:
https://github.com/oligamiq/browser_wasi_shim/tree/wasi_multi_threads_rustc

Note: requires SharedArrayBuffer.

@bjorn3
Member

bjorn3 commented Sep 25, 2024

I hadn't looked at your browser_wasi_shim PR yet as I assumed it to be under heavy development. Let me know when it is ready for review/ready for me to try it out.

@whitequark
Member

For complete end-to-end execution in the browser, lld would also need to be supported. I haven't checked if it already works though.

I've looked into that. Ideally, LLD would be built as a library and linked into the rustc binary. LLD already has a function you can call to run its main routine in-process, so you should probably use that instead of a hostcall; this has the advantage that the resulting rustc is runtime-agnostic. (I would be happy to ship yowasp-rustc for example! It would be great for firmware development.)

@bjorn3
Member

bjorn3 commented Sep 25, 2024

Ideally, LLD would be built as a library and linked into the rustc binary.

Do you have any pointers on how to do that?

I would be happy to ship yowasp-rustc for example! It would be great for firmware development.

Be aware that rustc.wasm is pretty large at 134MB. And it needs another 72MB for the standard library compiled to wasm32-wasip1-threads.

@whitequark
Member

Do you have any pointers on how to do that?

How familiar are you with LLVM? You're looking to call lld_main. If you can link to the right object, it seems self-explanatory after that.

Be aware that rustc.wasm is pretty large at 134MB. And it needs another 72MB for the standard library compiled to wasm32-wasip1-threads.

That sounds about right. I would expect LLD to add ~50M more, so ~250M before compression, maybe ~100M after. In my view this is still worth it since the object can be cached indefinitely at the client. The distribution infrastructure I'm using supports such large packages just fine.

Have you seen my FPGA toolchain integration with VS Code? The packages used in the demo are a ~50M download in total (from memory), which is a little high but is fine if you have broadband. ~100M of rustc isn't such a stretch, and I was planning on shipping yowasp-clang that would be of a similar size, anyway.

@bjorn3
Member

bjorn3 commented Sep 25, 2024

How familiar are you with LLVM? You're looking to call lld_main. If you can link to the right object, it seems self-explanatory after that.

Is enabling building of the lld standalone tool enough to build whatever static library I need to link against for lld_main to work?

@whitequark
Member

Is enabling building of the lld standalone tool enough to build whatever static library I need to link against for lld_main to work?

I think so, since the lld executable is a really thin wrapper (which is, iirc, autogenerated) over the function I linked to. But I would personally still scour the build system to see which target it is that does it.

@whitequark
Member

So after looking into it, I think you should probably build the lldWasm library and invoke lld::wasm::link, provided you can add arbitrary C++ (which IIRC you can in rustc). The lld_main I linked to earlier doesn't seem to be properly exported, and it also drags in ELF, COFF, and MachO linkers that you may not care about at all.
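
To illustrate the design choice discussed here (in-process linking versus spawning a linker process), a rough sketch follows. Everything in it is hypothetical: the `Linker` enum, the `InProcessLink` signature, and `fake_wasm_link` are made up, and a real build would presumably reach `lld::wasm::link` through a C++ shim that is not shown.

```rust
use std::path::PathBuf;
use std::process::Command;

/// Assumed signature for an in-process linker entry point: takes the
/// argument list and returns true on success.
type InProcessLink = fn(&[String]) -> bool;

/// Two hypothetical ways for the compiler to run the linker.
#[allow(dead_code)]
enum Linker {
    /// Linker compiled into the same module; independent of the runtime.
    InProcess(InProcessLink),
    /// External program; requires the host to support spawning processes.
    Subprocess(PathBuf),
}

impl Linker {
    fn link(&self, args: &[String]) -> Result<(), String> {
        match self {
            Linker::InProcess(entry) => {
                if entry(args) {
                    Ok(())
                } else {
                    Err("in-process link failed".to_string())
                }
            }
            Linker::Subprocess(program) => {
                let status = Command::new(program)
                    .args(args)
                    .status()
                    .map_err(|e| format!("cannot spawn linker: {}", e))?;
                if status.success() {
                    Ok(())
                } else {
                    Err(format!("linker exited with {}", status))
                }
            }
        }
    }
}

/// Fake linker used only to make the sketch runnable: it "succeeds"
/// when an output flag is present.
fn fake_wasm_link(args: &[String]) -> bool {
    args.iter().any(|a| a == "-o")
}

fn main() {
    let linker = Linker::InProcess(fake_wasm_link);
    let args: Vec<String> = ["main.o", "-o", "main.wasm"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    println!("{:?}", linker.link(&args));
}
```

The `InProcess` variant needs nothing from the host beyond what the compiler already uses, which is the runtime-agnostic property mentioned earlier in the thread.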

@bjorn3
Member

bjorn3 commented Sep 26, 2024

Managed to get linking using lld working: bjorn3/rust#8

@x0k

x0k commented Oct 12, 2024

Somehow no one pasted this here: https://garriga.dev/rubri/

I also use miri (and a few other solutions) here to run various programming languages in the browser: https://x0k.github.io/ppp/editor

@whitequark
Member

whitequark commented Oct 12, 2024

And you can use the YoWASP VS Code toolchain if you want a full editor in your browser when running programming language compilers.

@oligamiq

https://github.com/oligamiq/rubrc
A demo project in which rustc compiles code in the browser.
https://oligamiq.github.io/rubrc/

@bjorn3
The embedded linker doesn't work unless it's wasm32-wasip1. Is there any way to resolve this?

Also, regarding the environment variable WASI_SYSROOT, it is required by cc, so it cannot be removed. Although the compilation can still proceed without it, unresolved imports such as __cxa_uncaught_exceptions, __cxa_decrement_exception_refcount, and __cxa_thread_atexit remain, preventing the wasm from running. For this reason, I wrote code that treats its absence as a compilation error.

@bjorn3
Member

bjorn3 commented Nov 25, 2024

The embedded linker doesn't work unless it's wasm32-wasip1. Is there any way to resolve this?

wasm32-wasip2 uses wasm-component-ld, which is a wrapper around lld. It should be possible to integrate it directly into rustc, though depending on how exactly wasm-component-ld is designed, it may require patching wasm-component-ld itself. As for native targets, it is harder than conventional cross-compilation as those generally depend on gcc or clang wrapping the linker.

@oligamiq

For native targets, it is harder than conventional cross-compilation as those generally depend on gcc or clang wrapping the linker.

LLVM's ELF, COFF, and Mach-O linkers should be included as well, but is there no way to make use of them?

@bjorn3
Member

bjorn3 commented Nov 27, 2024

You can use lld for non-wasm targets with rustc.wasm, but on Unix targets you don't normally directly invoke the linker. Instead what you invoke is gcc or clang, which in turn invoke the linker with the right arguments. Rustc doesn't (yet) support directly invoking the linker for Unix targets as it doesn't know all the arguments that are necessary to successfully link that gcc or clang would pass to the linker.

@oligamiq

Oh, it seems I may need to pull in Clang or contribute to the Rustc code.
