preliminary wasm32 support for git-pack #735

Byron · 2023-02-14T10:02:58Z

The goal of this PR is to make accelerated pack resolution available in WASM. This includes the following steps:

an iterator of entries that were decoded from a pack (the pack decoding is excluded here)
resolve thin-packs on the fly
create an index for accelerated pack resolution
run pack resolution with user-definable code (it will see the decoded object in full to do whatever it needs to)
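The last step above could look roughly like the following. This is a minimal sketch and not the git-pack API: `Entry`, `resolve_entries` and the `suffix` stand-in for real delta instructions are all hypothetical, and thin-pack handling and the index are left out. It only shows already-decoded entries flowing through a resolver that hands each full object to caller-supplied code, without any IO.

```rust
// Hypothetical types for illustration; these are NOT the git-pack API.

/// A decoded pack entry: either a full object or a delta against an earlier entry.
#[derive(Debug, Clone, PartialEq)]
pub enum Entry {
    Base { data: Vec<u8> },
    /// `base_index` points at an earlier entry; `suffix` is a simplified
    /// stand-in for real delta instructions (it is appended to the base).
    Delta { base_index: usize, suffix: Vec<u8> },
}

/// Resolve entries in order and pass each fully resolved object to `inspect`.
/// Returns the number of resolved objects.
pub fn resolve_entries<I, F>(entries: I, mut inspect: F) -> Result<usize, String>
where
    I: IntoIterator<Item = Entry>,
    F: FnMut(usize, &[u8]),
{
    let mut resolved: Vec<Vec<u8>> = Vec::new();
    for (index, entry) in entries.into_iter().enumerate() {
        let object = match entry {
            Entry::Base { data } => data,
            Entry::Delta { base_index, suffix } => {
                let base = resolved
                    .get(base_index)
                    .ok_or_else(|| format!("entry {index} refers to unknown base {base_index}"))?;
                let mut out = base.clone();
                out.extend_from_slice(&suffix);
                out
            }
        };
        // User-definable code sees the decoded object in full.
        inspect(index, &object);
        resolved.push(object);
    }
    Ok(resolved.len())
}
```

Because the resolver only needs an iterator and a closure, nothing in it depends on files, locks or threads, which is what makes this shape a candidate for wasm32 targets.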

From there it should be possible to have a WASM server receive a pack assuming it has a way to decode the pack into entries.
Maybe the decoding step must be possible here too for convenience.

Tasks

CI validation of WASM support (or the lack of it)
Crates that work already and should keep working

It seems it's best to extract parts into their own crate

Tasks for Extraction

All of what follows should compile for the wasm32-unknown-unknown target

Research

Blockers: everything with IO, namely

git-lock and git-tempfile
libc seems to support only wasm32-unknown-emscripten

To circumvent, some crates might have to be split. Problem here is type ownership - WASM compatible crates probably shouldn't own the type in question so must provide pure functions with a lot of parameters or contexts.
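To make the ownership problem more concrete, here is a minimal sketch of the "pure function with a context" shape. `ResolveContext` and `entry_bytes` are hypothetical names chosen for illustration and do not exist in gitoxide; the assumption is that the IO-capable crate keeps owning the real type and lends its data to the WASM-compatible code.

```rust
/// Hypothetical context: everything the IO-free function needs,
/// borrowed from whichever crate owns the actual pack type.
pub struct ResolveContext<'a> {
    /// Raw bytes of the pack, already loaded by the caller.
    pub pack_data: &'a [u8],
    /// Start offset of each entry within `pack_data`, in ascending order.
    pub entry_offsets: &'a [usize],
}

/// Pure function: return the bytes of entry `index`.
/// It owns nothing and performs no IO, so it has no need for files or locks.
pub fn entry_bytes<'a>(ctx: &ResolveContext<'a>, index: usize) -> Option<&'a [u8]> {
    let start = *ctx.entry_offsets.get(index)?;
    let end = ctx
        .entry_offsets
        .get(index + 1)
        .copied()
        .unwrap_or(ctx.pack_data.len());
    ctx.pack_data.get(start..end)
}
```

The cost is visible in the signature: every piece of state the function needs has to be passed in, which is the "lot of parameters or contexts" trade-off mentioned above.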

There might be duplication of documentation unless these are just referring to each other. It's still a slightly strange setup to have WASM in different crates, but inclusion seems easier to handle than exclusion.

Learnings

Using wasm32-wasi generally seems to produce better error messages as it supports more out of the box. This can pinpoint locations where incompatible crates are being used.

Interesting Reads

relationship between std and asm
Could WASI help?

jeffparsons · 2023-02-14T23:40:11Z

@Byron How closely have you been following Wasm/WASI developments?

WASI "preview 2" is coming soon, and it changes a lot; in particular, it is being rebased on top of the WebAssembly Component Model. If you are only writing programs that target wasm32-wasi then you won't need to care much about this (other than knowing that when rustc learns about preview 2, you'll be able to do a lot more out of the box, e.g. networking), but if you want to be able to build independently-deployable Wasm modules for different bits of Gitoxide, then waiting for preview 2 to land and the surrounding tooling to mature might be worthwhile.

What are the main use cases you have in mind? I may be able to sketch a concrete example of what I'm talking about.

Byron · 2023-02-15T06:44:49Z

Thanks so much for chiming in @jeffparsons, it's much appreciated!

I have updated the PR description to be more informative. Thus far I thought the target must be wasm32-unknown-unknown because the crate has to integrate with other crates that compile to that target as well.

I am looking forward to hearing how to best do that. I am definitely very green here and only have a minimal understanding of what needs to be done.

Thank you

Byron · 2023-02-15T12:30:55Z

It looks like wasi (as vendor) and unknown (as vendor) don't make much of a difference with the prerequisites of the respective git-pack types, which is very helpful. That way it should be possible to check for target_arch throughout the git-pack crate to opt in the parts of the code that can already work.
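As a rough illustration of checking target_arch, the sketch below gates one function on the architecture and leaves another available everywhere. The function names are made up for this example and are not taken from git-pack.

```rust
/// Always available: pure computation with no file-system access.
pub fn entry_count(offsets: &[u64]) -> usize {
    offsets.len()
}

/// Only compiled when NOT targeting wasm32, because it touches the file system.
#[cfg(not(target_arch = "wasm32"))]
pub fn pack_len_on_disk(path: &std::path::Path) -> std::io::Result<u64> {
    Ok(std::fs::metadata(path)?.len())
}

/// Lets callers ask at compile time whether the file-based parts are included.
pub const fn has_file_io() -> bool {
    cfg!(not(target_arch = "wasm32"))
}
```

Gating on target_arch covers both wasm32-unknown-unknown and wasm32-wasi at once, which matches the observation that the vendor part of the triple makes little difference here.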

For now failure is allowed as no work was done, but this should confirm the crate can at least be compiled to that target. We try different targets, including WASI, for good measure, and already build crates that are naturally working.

…n-unknown

…wasm32. It's a breaking change because we also start using the `dep:` syntax for declaring references to optional dependencies, which will prevent them from being automatically available as features. Besides that, it adds the `wasm` feature toggle to allow compiling to `wasm32` targets.
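For readers unfamiliar with the `dep:` syntax, a manifest fragment along these lines shows why it is a breaking change. The dependency and the `io` feature are placeholders, and what the real `wasm` feature enables is not stated here, so it is left empty in this sketch.

```toml
[dependencies]
# Placeholder optional dependency, for illustration only.
some-io-crate = { version = "1", optional = true }

[features]
# Referring to the dependency as `dep:some-io-crate` stops Cargo from creating
# an implicit `some-io-crate` feature. Downstream users who previously enabled
# that implicit feature must switch to an explicitly declared one such as `io`.
io = ["dep:some-io-crate"]
# Feature toggle named in the commit message; its contents are an assumption.
wasm = []
```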

jeffparsons · 2023-02-17T11:14:32Z

I still don't have a great understanding of how you're intending for this to be used (e.g. what crates it needs to integrate with, the shape of that integration, what assumptions about the world those other crates make), so to start I'll just summarize the main ways I imagine Gitoxide being used with Wasm and my best understanding of what each would look like, and maybe we can explore from there.

Hopefully there's something helpful in here... 🤞

(1) Gitoxide is consumed as a crate by other Rust code, targeting core Wasm

Actual compilation target might be wasm32-unknown-unknown, wasm32-unknown-emscripten, or wasm32-wasi, but you don't make any assumptions other than what's supported by wasm32-unknown-unknown — i.e. core Wasm, and nothing else. (The other targets are supersets of this one.)

Any IO your own code tries to perform will (IIRC) either error or panic; the parts of std that deal with IO are stubbed out just enough to allow programs to compile. This means that the IO has to be handled by something other than Gitoxide. E.g. Gitoxide might accept something that is Read, but otherwise leave it up to the program using Gitoxide to figure out how to provide that.
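A minimal sketch of the "accept something that is Read" idea: the function below parses the 12-byte pack header (the `PACK` signature, then a big-endian version and object count) from any reader. `read_pack_header` is a hypothetical helper written for this example, not an existing Gitoxide function.

```rust
use std::io::{self, Read};

/// Hypothetical helper: read and validate the 12-byte pack header from any `Read`.
/// Returns `(version, number_of_objects)`.
pub fn read_pack_header<R: Read>(mut input: R) -> io::Result<(u32, u32)> {
    let mut header = [0u8; 12];
    input.read_exact(&mut header)?;
    if !header.starts_with(b"PACK") {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "missing PACK signature",
        ));
    }
    // Both fields are stored in network (big-endian) byte order.
    let version = u32::from_be_bytes([header[4], header[5], header[6], header[7]]);
    let num_objects = u32::from_be_bytes([header[8], header[9], header[10], header[11]]);
    Ok((version, num_objects))
}
```

On a native target the caller might pass a `std::fs::File`, while on wasm32-unknown-unknown it could pass a `&[u8]` that the host filled in, since byte slices implement `Read` as well. The library code stays the same in both cases.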

The other aspect of IO that has to be handled outside of Gitoxide here is calling to/from the host (and other stuff run by the Wasm host, e.g. JavaScript). There are things like wasm-bindgen that allow you to effectively layer an ABI on top of wasm32-unknown-unknown by generating code on the Rust side and JavaScript side, but I see these as a stopgap that won't need to exist for much longer.

If this is something you want to support, there's no harm in doing it, because support for everything else (Emscripten, WASI, etc.) can be additive.

(2) Gitoxide is consumed by JavaScript code via Emscripten on the web

Compilation target is wasm32-unknown-emscripten. This builds upon (1).

This is the one I know the least about. I also think of this as mostly a stopgap, but I don't know enough about what it has to offer to dismiss it out of hand. I imagine that once the WebAssembly Component Model (see below) matures, all the things Emscripten does for Wasm can slowly be split out into WebAssembly Components.

(3) Gitoxide is consumed as a crate by other Rust code, targeting WASI

Compilation target is wasm32-wasi. This builds upon (1).

Now you can actually use IO features from std, albeit not all of them right now. Rustc's current wasm32-wasi target is based on wasi_snapshot_preview1, which is soon to be superseded by wasi_snapshot_preview2. preview2 rebases WASI on top of the WebAssembly Component Model, and is not compatible with preview1. (Although there is a polyfill in the works to ease the transition.)

In this scenario, however, you don't care that everything will change under the hood when rustc moves over to targeting preview2. All you care is that a few more things in std will magically start working. (Sockets? Threads? — Whatever is ready when they ship preview2.)

(4) Gitoxide is consumed as a WebAssembly component

Compilation target is wasm32-wasi. You can't actually do this ergonomically yet, but should be able to Soon™. This builds upon (3).

In this final scenario you are supporting people who want to use Gitoxide from their language of choice, and who are either embedding a Wasm runtime in their program or shipping their own program as a Wasm module (possibly a Wasm Component) to be run in some other runtime, e.g., Wasmtime or a web browser.

You would presumably rely on WASI for IO (via std), but could also choose to ship a "no IO" version too if you wanted to do that for some reason. (WASI builds on top of Wasm Components, which build on top of "core Wasm".)

You would probably define your Wasm Component's interface using the WIT language, and then use something like cargo-component to define and build your Wasm Components in Rust, and something like jco if you want to help people use it on the web.

Wasm and the Component Model are defined in a really neatly layered and modular way, which lets you polyfill/virtualize a lot of things. E.g. if someone wants to use your code in some non-WASI environment, they could use one of the tools available for linking multiple Wasm components together to produce a single Core Wasm binary that, e.g., assumes the presence of Emscripten, or some other virtual system environment altogether. So not everyone needs to jump on supporting WASI or even the Component Model immediately for it to be useful. You could start writing WebAssembly Components very soon, and run them in web browsers years before those web browsers decide they want to introduce native support for WebAssembly Components.

Conclusion

If I were making this decision and I didn't have a specific reason to support Emscripten, then I would probably skip that sort of thing entirely. I would start out supporting just (1) and (3), and then as soon as rustc starts targeting preview2 I would start defining a Wasm Component (or collection of them with dependencies much like the Gitoxide crates) that people could either use directly in a runtime that supports them, or "repackage" to support environments like the web. By this point Wasm Component registries will exist, and you could publish your Components to those and/or have people download them from GitHub Releases or whatever.

On the other hand, if you do have some existing Emscripten based project you'd like to support, or just want to have a web based demo up and running sooner rather than later, supporting Emscripten might not actually be much work — I'm just not familiar enough with it to make any meaningful comment.

Byron · 2023-02-17T11:30:33Z

❤️🙏

I have linked this analysis from the tracking ticket as well, to be sure to have it should I venture further down this path. This isn't planned for now, but I am sure it will eventually happen. In 2024, maybe, I should have a need for running more code in WASM as well.

Byron force-pushed the git-pack-wasm branch 7 times, most recently from 994717e to f5e89fe Compare February 14, 2023 10:19

Byron force-pushed the git-pack-wasm branch from f5e89fe to ee56ed9 Compare February 15, 2023 11:45

Byron changed the title ~~preliminary WASM support for git-pack~~ preliminary wasm32 support for git-pack Feb 15, 2023

Byron force-pushed the git-pack-wasm branch 3 times, most recently from 455af8b to 4f213b4 Compare February 15, 2023 12:27

Byron force-pushed the git-pack-wasm branch 6 times, most recently from 5f079a3 to f151dda Compare February 15, 2023 16:55

Byron mentioned this pull request Feb 15, 2023

WASM support #463

Open

17 tasks

Byron added 3 commits February 15, 2023 19:04

CI validates WASM support

0d4b804

feat: add wasm feature toggle to allow compilation to wasm32-unknow…

f0e40ec

…n-unknown

Byron force-pushed the git-pack-wasm branch from f151dda to 6c4c196 Compare February 15, 2023 19:07

Byron merged commit 4bc19d1 into main Feb 15, 2023

Byron deleted the git-pack-wasm branch February 15, 2023 19:33
