Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port the JS WASI polyfill to wasi-common #520

Closed
sunfishcode opened this issue May 18, 2019 · 19 comments
Closed

Port the JS WASI polyfill to wasi-common #520

sunfishcode opened this issue May 18, 2019 · 19 comments
Labels
help wanted wasi:impl Issues pertaining to WASI implementation in Wasmtime

Comments

@sunfishcode
Copy link
Member

Wasmtime has a JS polyfill implementation of WASI:

https://github.com/CraneStation/wasmtime/tree/polyfill/wasmtime-wasi/js-polyfill

which works by compiling wasmtime's C implementation of WASI with Emscripten, and then adding a bit of support code on top of that.

We should port the support code to work on top of wasi-common, compiled with wasm32-unknown-emscripten. In theory, all the big pieces are in place -- wasi-common provides the same functionality as Wasmtime's C implementation of WASI, Emscripten can provide the same underlying support in both cases, and the support code should roughly be the same. So the work here is in rewiring everything up in wasi.js, and porting polyfill.c to Rust, probably splitting out that EM_ASM into a standalone JS file.

I've marked this "help wanted" -- if someone's interested in doing this port, that'd be great (and I can help mentor), but it'd even help just to have more ideas for packaging, binding, or other ways we could make the polyfill better. The current support code and shell.html is super minimal.

@dingxiangfei2009
Copy link

@sunfishcode I would like to indicate my interest in helping here.

I have a question here. I have been experimenting with file descriptor syscalls in the current js-polyfill. Since WASM has yet to support 64-bit integers, WASI guests cannot make direct calls to functions like __wasi_fd_seek because __wasi_filedelta_t is 64-bit. So how would one make these calls? Currently I have a shim on top of these functions and split these 64-bit arguments up into two halves. It works, albeit being ugly.

@dingxiangfei2009
Copy link

Okay, so I want to check my understanding with you here. In order to implement the same polyfill in wasi-common, we need to cross-compile wasmtime-wasi crate in https://github.com/CraneStation/wasmtime. Especially important is this module https://github.com/CraneStation/wasmtime/blob/5164994ce1c6e76af482906357f2ad3e07ddad35/wasmtime-wasi/src/syscalls.rs#L20. We need to swap out the references to VM with emscripten implementation.

@sunfishcode
Copy link
Member Author

Concerning WASI's use of 64-bit ints: in the long term, JS engines will support this via bignums. In the short term, we're hoping that shims can be sufficient. The shims may get a little ugly, but I don't know of any better solution, short of changing the WASI APIs, which would also be ugly.

The wasmtime-wasi crate is just glue to connect the WASI implementation with Wasmtime's runtime data structure, so I don't think you need it.

I think the thing to start with is compiling wasi-common with wasm32-unknown-emscripten. That should give you a library containing C symbols prefixed with wasi_common_, such as wasi_common_fd_read. These are generated by a macro, wasi_common_cbindgen, so they should have C-like API compatible with WASI's external C API, except for the addition of the WasiCtx and memory arguments to functions which need access to extra state.

The next step is to start adapting the current polyfill's wasi.js to call that library. I think the main things to do are:

  • change from ___wasi_* functions to wasi_common_* functions (or maybe _wasi_common_*, because Emscripten prepends _ to external names?)
  • add WasiCtx and memory arguments as needed (see the signatures in src/hostcalls in wasi-common)
  • adapt polyfill.c. This may require adding a new init function to replace fd_table_init, to allocate the WasiCtx and memory.

@dingxiangfei2009
Copy link

@sunfishcode Thank you very much for the clear instruction.

If I understand correctly, we are relying on the nix crate for the syscalls. Unfortunately nix @ 0.14.1 has yet to support cfg(target_os="emscripten"). I will prepare a patch to nix for that first, if you don't mind.

@kubkon
Copy link
Member

kubkon commented Jul 29, 2019

Hey @dingxiangfei2009, some (if not most) of our syscalls impl does indeed depend on the nix crate, although we are now working towards reusing as much Rust's libstd as possible. This way, libstd will take care of a lot of platform-specific tricky details for us :-)

@dingxiangfei2009
Copy link

dingxiangfei2009 commented Jul 31, 2019

@kubkon I see. I assume that CraneStation/wasi-common#16 is related as well, so I will watch that list of functions.

Also, would you mind giving some hints on which function can reuse libstd? I have, for example, investigated host calls in fs.rs, but it would be demanding to switch to libstd since they are handling raw file descriptors while libstd does not accept raw file descriptors in most cases.

@kubkon
Copy link
Member

kubkon commented Jul 31, 2019

@kubkon I see. I assume that CraneStation/wasi-common#16 is related as well, so I will watch that list of functions.

Also, would you mind giving some hints on which function can reuse libstd? I have, for example, investigated host calls in fs.rs, but it would be demanding to switch to libstd since they are handling raw file descriptors while libstd does not accept raw file descriptors in most cases.

That's true, but as it turns out, we don't need to operate on raw descriptors at all times anymore. Instead, it's preferable to use std::fs::File which is essentially a wrapper around a raw FD or Windows Handle. And so, to provide a nice example, fd_datasync previously did a call to either nix::unistd::fdatasync or nix::unistd::fsync depending on your *nix platform, whereas now it uses std::fs::File::sync_data which internally, on a *nix platform calls one of the above for us, and, as an added bonus, it also performs the correct syscall on Windows.

So all in all, the current trend is to move away from relying on raw descriptor manipulation as much as possible (of course, sometimes, like in the case of path manipulation, it might not be possible).

I hope that made it a bit clearer. If not, fire away and I'll try to explain it more! :-)

@dingxiangfei2009
Copy link

dingxiangfei2009 commented Aug 3, 2019

@kubkon Thanks for the explanation, and I managed to get some findings after poking around. I have looked deeper into src/hostcalls_impl module and I have not find a syscall that can be switched over. Some of them do not have windows support even, where unimplemented! was filled in at poll_oneoff, clock_res_get, etc.

Access mode of file descriptors look promising and OpenOptions could be providing the necessary information. However, a File does not expose them after it is created.

Maybe I am overlooking something. Would you mind giving a direction here, like exactly which syscall can be "simplified" with libstd?

Speaking of overlooking stuff, I did find that fd_filestat_set_size can be adapted to File::set_len instead. Sorry! Anyway, I am eager to hear suggestions about other "simplification" opportunities!

@dingxiangfei2009
Copy link

Also, since emscripten target appears to be compatible with linux target anyway, wouldn't it be easier to just patch nix to support target_os="emscripten"? Of cause, I could be very wrong here about this and please correct me here if so.

@kubkon
Copy link
Member

kubkon commented Aug 3, 2019

@dingxiangfei2009 patching nix is a good idea in and of itself, that's for sure, so please feel free to do so -- it'll be a very worthwhile contribution, I'm sure of it :-) As far as our use of nix in wasi-common is concerned, perhaps it'll help if I outline what is where in the project (I probably should've done that in the first place, apologies!):

All in all, I think that your plan of patching nix first is a good idea (let me know if you need any help with that!). Afterwards, we can start figuring out how to enable emscripten target in wasi-common and slowly start porting over things mentioned by @sunfishcode. How's that sound everyone @dingxiangfei2009 @sunfishcode? :-)

@kubkon kubkon transferred this issue from CraneStation/wasi-common Nov 8, 2019
@kubkon kubkon added help wanted wasi:impl Issues pertaining to WASI implementation in Wasmtime labels Nov 8, 2019
@kubkon
Copy link
Member

kubkon commented Nov 24, 2019

After putting in some thought into this and doing a little experiment with nix, patching up the nix crate so that it compiles for the wasm32-unknown-emscripten target is not such a trivial task after all as I originally assumed. It is doable, however it will not be immediately testable which IMHO makes this task infinitely more tedious and convoluted than originally planned. Then, even after it is patched, it will probably take some (long?) time for it to make it to a release, and I'd really like to have wasi-common compile to Emscripten sooner rather than later.

My proposition here is that we remove nix as a dependency altogether and write our own wrappers around the libc functions - after all, we've already kinda started giving up on nix anyhow when @marmistrz started refactoring fd_readdir on Linux and implementing it on Windows. Besides, we don't actually need that much of nix's machinery; we're only ever using a couple of wrappers which should be trivially wrapped by us. I'm happy to do the heavy-lifting here myself, but would appreciate code review and later help in refactoring bits here and there. As an added bonus of this approach, we should be able to wrap up all uses of libc functions we currently use in wasi-common, making the implementation so much cleaner rather than like now, when some use raw libc and some rely on nix.

cc @sunfishcode @marmistrz @peterhuene

@kubkon
Copy link
Member

kubkon commented Nov 24, 2019

Actually, I'd appreciate your feedback @alexcrichton on this as well.

@marmistrz
Copy link
Contributor

If nix is an obstacle, then I think dropping the dependency on nix is a sound decision, especially that we have a slightly different understanding of unsafe than the nix guys.

Could we possibly keep the Rusty wrappers in a separate crate, similar to winx, and publish it on crates.io? I propose the name: younix!

@kubkon
Copy link
Member

kubkon commented Nov 24, 2019

I agree on keeping the wrappers in a separate crate :-)

@alexcrichton
Copy link
Member

This sounds like a reasonable solution to me, it's likely inevitable that there's a sys/* style hierarchy as libstd has, and emscripten would be "Just Another Platform" that can be targeted.

@sunfishcode
Copy link
Member Author

Me too 🙂

@kubkon
Copy link
Member

kubkon commented Nov 25, 2019

It’s settled then! I’ve already started prepping some ground work and should be able to deliver a working (not necessarily final mind you) solution, i.e., wasi-common nix-free, soon-ish.

@marmistrz, how about yanix for a name? Yet another nix crate... 😉

@marmistrz
Copy link
Contributor

marmistrz commented Nov 26, 2019

@kubkon yanix is missing the wordplay (and I love wordplays! :)), but that's fine for me too.

@sunfishcode
Copy link
Member Author

Closing this old issue, as the JS polyfill referenced here is no longer in use.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted wasi:impl Issues pertaining to WASI implementation in Wasmtime
Projects
None yet
Development

No branches or pull requests

5 participants