WebAssembly backend for rustc #33205
Comments
Look at how C’s Seeing how early-stage wasm still is, I don’t think there’s much point in pondering such issues. EDIT: regarding relooping, it might be much harder to discern the original control-flow constructs from the MIR CFG once we run the transformation passes. |
I'm very excited about this! I think compiling directly to wasm is interesting for a language like rust for several reasons:
Basically, an option to compile directly to wasm should give a big boost in compile times, for some loss of throughput but hopefully not too much, and with hopefully not too much effort. I am of course signing up to do the work on the binaryen side.
Practically speaking, emitting wasm for binaryen means emitting an IR whose types are i32, i64, f32, f64, and all the usual math operations on those (note: no odd things like i33 which LLVM has, nor i1 for that matter), but little else. I imagine rustc emits something similar to that when emitting LLVM IR? Then splitting off LLVM vs wasm could be done close to where LLVM is currently emitted in rustc, perhaps?
As @brson and @nagisa mentioned, control flow is structured in wasm and binaryen: loops, blocks, breaks, ifs, etc., which is a concern. I think as a first step here, I'll add support for basic blocks/control flow graphs as input to binaryen - the exact same IR as usual, except with basic blocks that end in branches, etc., much the same as LLVM's input would be. And binaryen will create structured control flow from that, much as emscripten does on LLVM cfgs now ("relooping", which can handle arbitrary cfgs), as @brson said.
Aside from that, there are a bunch of LLVM features we'll need to add special support for, like invoke/landingpad. Currently wasm doesn't have native support for those things, however, we know we can implement them (with some overhead) in an emulated manner (that's what emscripten does for asm.js), while waiting for native support. So we can just add that support to binaryen as needed. But I'm not sure of the full list of such features that rust needs?
Binaryen is a C++ project. I assume rust will want a C API to communicate with? Then a second step (after cfgs in binaryen) for me could be to add a C API to binaryen.
My main question for the rust side is whether emitting something at the wasm IR level (+ basic blocks; as described above) is practical for your project? And overall thoughts on the other details I wrote?
Sorry for the long comment here, I just think this could be a very cool project, hopefully it makes sense for us to do :) |
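For readers who want a concrete picture of the "wasm IR plus basic blocks" input described in this comment, here is a minimal sketch in Rust. It is not binaryen's API or rustc's MIR: every name (`ValType`, `Inst`, `Terminator`, `BasicBlock`, `Function`, `reachable`, `example_loop`) is hypothetical, and the instruction set is cut down to a few operators purely for illustration.

```rust
/// The four wasm value types mentioned in the thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ValType { I32, I64, F32, F64 }

/// Straight-line instructions (hypothetical, heavily reduced subset).
#[derive(Debug, Clone, PartialEq)]
pub enum Inst {
    I32Const(i32),
    I32Add,
    LocalGet(u32),
    LocalSet(u32),
}

/// Index of a block inside `Function::blocks`.
pub type BlockId = usize;

/// How a basic block ends: the only place control flow lives.
#[derive(Debug, Clone, PartialEq)]
pub enum Terminator {
    Br(BlockId),
    /// Pops an i32 condition; nonzero goes to `then_block`.
    CondBr { then_block: BlockId, else_block: BlockId },
    Return,
}

pub struct BasicBlock {
    pub body: Vec<Inst>,
    pub term: Terminator,
}

pub struct Function {
    pub locals: Vec<ValType>,
    pub blocks: Vec<BasicBlock>, // block 0 is the entry block
}

impl Terminator {
    /// Blocks this terminator may jump to.
    pub fn successors(&self) -> Vec<BlockId> {
        match self {
            Terminator::Br(target) => vec![*target],
            Terminator::CondBr { then_block, else_block } => vec![*then_block, *else_block],
            Terminator::Return => Vec::new(),
        }
    }
}

/// Depth-first walk from the entry block; `result[i]` is true if block `i`
/// can be reached. Assumes every branch target is a valid index.
pub fn reachable(func: &Function) -> Vec<bool> {
    let mut seen = vec![false; func.blocks.len()];
    if func.blocks.is_empty() {
        return seen;
    }
    let mut stack = vec![0];
    while let Some(id) = stack.pop() {
        if seen[id] {
            continue;
        }
        seen[id] = true;
        stack.extend(func.blocks[id].term.successors());
    }
    seen
}

/// A small loop-shaped CFG with one dead block (block 4).
pub fn example_loop() -> Function {
    let block = |body: Vec<Inst>, term: Terminator| BasicBlock { body, term };
    Function {
        locals: vec![ValType::I32],
        blocks: vec![
            block(vec![Inst::I32Const(0), Inst::LocalSet(0)], Terminator::Br(1)),
            block(vec![Inst::LocalGet(0)], Terminator::CondBr { then_block: 2, else_block: 3 }),
            block(
                vec![Inst::LocalGet(0), Inst::I32Const(1), Inst::I32Add, Inst::LocalSet(0)],
                Terminator::Br(1),
            ),
            block(vec![], Terminator::Return),
            block(vec![], Terminator::Return),
        ],
    }
}
```

The point of the shape is that block bodies are already ordinary wasm-level instructions, so a relooping pass would only need to replace the terminators with structured constructs.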
As long as the C API is exhaustive. LLVM C API is suitable for small projects but is very lacking otherwise and we end up doing some of the binding ourselves anyway.
Yes, this is certainly something rust needs. That being said, unwinding in Rust is pretty much always the cold path (unlike, say, C++), so, personally, it is fine with me if it’s not very performant. I’m not sure what else wasm lacks that we use, but a few things that are more likely to come up:
An interesting idea by @eddyb was to have a CFG wrapper over plain wasm AST: basically CFG blocks with control flow and the blocks would contain regular wasm. Something to consider.
Rust’s booleans are defined to only have two valid representations (1 and 0), and we could use i1 if that was available. That being said, it is also not a problem to represent them as an i8 (like we do in LLVM most of the time). |
Understood. Yeah, I'll make sure the C API provides everything so that Rust doesn't need any extra binding work.
Thanks for the list! Ok, nothing here sounds like a problem. Wasm already has support for some of those (like abort, ctz, etc.), and I can add compiler-rt type support for the others (most are in emscripten's runtime support libraries anyhow, in JS, which can be linked with automatically, at the cost of some runtime overhead - we can start there and optimize later). Note though that wasm doesn't have threads yet, so atomics don't matter, and we can only start with single-threaded code. But threads + atomics are on the wasm roadmap.
Yeah, that's exactly what I've been thinking too. That way there isn't another special IR, just some additions to the IR for the ends of basic blocks. Then relooping replaces just those as it goes to structured control flow, but everything else is already in the final IR.
Note that wasm doesn't have i8 either - only i32, i64, f32, f64. |
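Since wasm values come only in i32, i64, f32 and f64, a backend has to widen Rust's narrower types when they live in locals. The following is a rough sketch of one such mapping; `SrcType`, `lower` and the helper functions are hypothetical names, and the choice to put `bool`, 8-bit and 16-bit integers in an i32 is an assumption for illustration, not something the thread settles.

```rust
/// The four wasm value types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ValType { I32, I64, F32, F64 }

/// A few Rust scalar types a backend would need to lower (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SrcType { Bool, I8, U8, I16, U16, I32, U32, I64, U64, F32, F64 }

/// Pick the wasm type that holds a value of the given Rust type.
/// Signedness is not part of the wasm type; it is chosen per operation.
pub fn lower(ty: SrcType) -> ValType {
    match ty {
        SrcType::Bool
        | SrcType::I8 | SrcType::U8
        | SrcType::I16 | SrcType::U16
        | SrcType::I32 | SrcType::U32 => ValType::I32,
        SrcType::I64 | SrcType::U64 => ValType::I64,
        SrcType::F32 => ValType::F32,
        SrcType::F64 => ValType::F64,
    }
}

/// A Rust bool is always 0 or 1, so widening it is a plain cast.
pub fn bool_to_i32(b: bool) -> i32 {
    b as i32
}

/// Signed narrow integers are sign-extended into the i32.
pub fn i8_to_i32(v: i8) -> i32 {
    v as i32
}

/// Unsigned narrow integers are zero-extended into the i32.
pub fn u8_to_i32(v: u8) -> i32 {
    v as i32
}
```

For instance, `lower(SrcType::Bool)` gives `ValType::I32`, and `i8_to_i32(-1)` stays `-1` while `u8_to_i32(255)` stays `255`.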
@kripken Does it at least have byte-level memory access? |
@eddyb: Yes, while locals are i32, i64, f32, f64, the load and store operations can load 8, 16, 32 or 64 bits, and can also load signed or unsigned values for integers. So e.g. |
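To make the narrow load and store behavior concrete, here is a small Rust model of a byte-addressed linear memory. It is a sketch, not real wasm or binaryen code: the function names are chosen to echo wasm's narrow load/store operators, out-of-bounds access returns `None` here where wasm would trap, and the little-endian byte order matches wasm's memory model.

```rust
/// Load one byte and zero-extend it to i32.
pub fn load8_u(mem: &[u8], addr: usize) -> Option<i32> {
    mem.get(addr).map(|&b| b as i32)
}

/// Load one byte and sign-extend it to i32.
pub fn load8_s(mem: &[u8], addr: usize) -> Option<i32> {
    mem.get(addr).map(|&b| b as i8 as i32)
}

/// Load two little-endian bytes and zero-extend to i32.
pub fn load16_u(mem: &[u8], addr: usize) -> Option<i32> {
    let bytes = mem.get(addr..addr.checked_add(2)?)?;
    Some(u16::from_le_bytes([bytes[0], bytes[1]]) as i32)
}

/// Load two little-endian bytes and sign-extend to i32.
pub fn load16_s(mem: &[u8], addr: usize) -> Option<i32> {
    let bytes = mem.get(addr..addr.checked_add(2)?)?;
    Some(i16::from_le_bytes([bytes[0], bytes[1]]) as i32)
}

/// Store the low 8 bits of an i32; the upper bits are discarded.
pub fn store8(mem: &mut [u8], addr: usize, value: i32) -> Option<()> {
    *mem.get_mut(addr)? = value as u8;
    Some(())
}
```

With this model a `u8` field can be read through `load8_u` and an `i8` field through `load8_s`, even though the resulting local is an i32 in both cases.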
A heads up: I have the beginnings of a WAsm stack floating about. Thus far, I've got it parsing the text format and interpreting the results. I'm going through the reference test cases, implementing stuff as I go. Might possibly be of some use for testing purposes. |
An initial proposal for a C API for binaryen is at WebAssembly/binaryen#427. Feedback is very welcome. Note that this is not the basic-block + relooping API yet. That will be a second step. |
There is now what should be a sufficient C API for binaryen, including relooping/arbitrary CFGs as inputs. Also wrote some docs. |
Out of curiosity, what's the status of mir2wasm? I tried to build it but unfortunately it seems to be broken at the moment. I'm interested in contributing, but I probably only have a handful of hours a week to spare so I want to try to make the most of the time I have. It looks like mir2wasm is forked from miri, but the two code bases seem to have diverged rather significantly. |
@eholk I have a small patch lying around to make it compile on a recent nightly. I will submit it tomorrow. |
Any progress on this? I'm in love with the Rust type system and going back to JavaScript/ES6 just isn't enough. |
@sunjay It's at https://github.com/brson/mir2wasm (which has been linked above too). |
For a more directed, but a lot more incomplete, alternative to mir2wasm, see cyano. |
@ticki Is there a way to actually use cyano at the moment? Also, it will only target JavaScript, not WebAssembly, right? |
It targets JS, right, and apparently the HEAD commit is broken. I'll fix it. |
Oops, I didn't realize this was specifically about mir2wasm. I've opened another issue to track the progress on the LLVM-based wasm backend. |
cc #36339, a PR for this. |
Now I'm confused, didn't you start this issue specifically with LLVM in mind? Edit: paragraph that made me think that:
|
@benaryorg The MIR->LLVM backend is an enabler for MIR->anything because it is much simpler than the old AST->LLVM backend. It is the rust translation pass done right. So now we can take the lessons from mirtrans to make other backends that target other IRs, and hopefully factor out the common parts. It should get increasingly easier over time to create backends for Rust that don't target LLVM. |
@brson I think this issue can be closed now, seeing how the WASM backend landed in stable half a year ago? |
I'm going to close this in favor of #44006, and we've also got a |
With the new MIR->LLVM backend being all but done, and being cleaner than the old trans, we can seriously think about writing yet more backends. A relatively important, and relatively easy, target to translate to is WebAssembly.
The right starting strategy per @eddyb is to fork miri and prototype the new backend out of tree.
One interesting point that @kripken brought up recently is that wasm is an AST with control flow, whereas MIR is just a graph of blocks. This means that somebody has to "reloop" the control flow back out of the graph. Most likely binaryen, the C++ tool that consumes wasm, will just additionally accept a basic-block form of wasm and do the conversion itself.
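To illustrate why a graph of blocks can always be expressed with structured control flow, here is a sketch in Rust of the naive fallback: a single loop that dispatches on a label variable. This is not binaryen's relooper, which tries to recover real loops and ifs; the function names are hypothetical, and the CFG in the comment is an invented example.

```rust
/// Sum the integers below `n`, written as this four-block CFG:
///   bb0: goto bb1            (locals start at zero, as wasm locals do)
///   bb1: if i < n goto bb2 else goto bb3
///   bb2: acc += i; i += 1; goto bb1
///   bb3: return acc
/// Each block becomes one match arm; every branch becomes "set label".
pub fn sum_below_dispatch(n: u32) -> u32 {
    let mut i = 0u32;
    let mut acc = 0u32;
    let mut label = 0u8;
    loop {
        match label {
            0 => label = 1,
            1 => label = if i < n { 2 } else { 3 },
            2 => {
                acc = acc.wrapping_add(i);
                i += 1;
                label = 1;
            }
            3 => return acc,
            _ => unreachable!("no such block"),
        }
    }
}

/// The same function after the loop has been recovered from the graph,
/// which is the kind of output a relooping pass aims for.
pub fn sum_below_structured(n: u32) -> u32 {
    let mut i = 0u32;
    let mut acc = 0u32;
    while i < n {
        acc = acc.wrapping_add(i);
        i += 1;
    }
    acc
}
```

The dispatch form works for any graph but pays for a label check on every block transition, which is why recovering the `while` loop is worth the effort.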
@kripken is interested in helping with this, so I'm setting up this issue as a place for interested parties to discuss.
cc @tsion @nagisa