Initial Wasm runner implementation #173

Merged: 13 commits into VivekPanyam:main from wasm-runner, Oct 12, 2023

Conversation

@leizaf (Contributor) commented Oct 5, 2023

Still untested, but I might have a very basic proof of concept ready. Currently it requires the user to define and export buffers to transfer tensors to/from wasm. The alternative is to pass pointers and lengths manually, which I think is more complicated and doesn't allow for multiple returns. I used color_eyre temporarily for quick error handling; let me know what error handling system you are using and I can refactor to that. Also, I'm not super sure how to test the runner; I have a half-completed test written.

@leizaf changed the title from "Wasm runner" to "[WIP] Wasm runner" on Oct 5, 2023
@VivekPanyam (Owner) left a comment

This is a good start. Here are a few high level comments:

Currently it requires the user to define and export buffers to transfer tensors to/from wasm. The alternative is to pass pointers and lengths manually, which I think is more complicated, and doesn't allow for multiple returns.

Unfortunately, the current approach is not particularly flexible as many things cannot be dynamic (e.g. number of return tensors, number of inputs, shapes of tensors, etc.).

I'd recommend reading about the WebAssembly Component Model and then taking a look at wit-bindgen.

The component model allows you to define more sophisticated interface types and defines a canonical ABI so that Wasm modules implemented in several languages can communicate in a consistent way. This is somewhat similar to the interface definitions we were talking about in #164.

We generally want to support the same infer interface as the rest of Carton (the input is an arbitrary number of named Tensors and the output is an arbitrary number of named Tensors). This should be possible using WIT.

Can you try to get a prototype working using the component model/WIT? wasmtime has support for WIT so you shouldn't have to change runtimes.
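
For reference, host-side usage of wasmtime's component support looks roughly like the sketch below. This is not code from this PR: the WIT path, the world name (model), and the generated Model/call_infer names are assumptions based on wasmtime's bindgen conventions, and exact signatures vary across wasmtime versions.

use wasmtime::component::{bindgen, Component, Linker};
use wasmtime::{Config, Engine, Store};

// Generate host bindings from a WIT world named `model` (path and world name
// are assumptions for this sketch).
bindgen!({
    world: "model",
    path: "wit",
});

fn main() -> anyhow::Result<()> {
    // The component model has to be enabled explicitly.
    let mut config = Config::new();
    config.wasm_component_model(true);
    let engine = Engine::new(&config)?;

    // Load a Wasm *component* (not a plain core module).
    let component = Component::from_file(&engine, "model.wasm")?;
    let linker = Linker::new(&engine);
    let mut store = Store::new(&engine, ());

    // `Model::instantiate` and `call_infer` follow bindgen's naming
    // conventions; an empty input list is passed just to show the call shape.
    let (model, _instance) = Model::instantiate(&mut store, &component, &linker)?;
    let _outputs = model.call_infer(&mut store, &[])?;
    Ok(())
}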

@leizaf (Contributor, author) commented Oct 5, 2023

@VivekPanyam What do you think of the .wit for tensors I drafted up? And infer could just be:

infer: func(in: list<tuple<string, tensor>>) -> list<tuple<string, tensor>>

or just a list of tensors.

I did consider using the component model initially, but I wasn't sure how developed the tooling around it is yet.

@VivekPanyam (Owner) commented

That infer signature looks good to me and the .wit file looks good too!

One thing to note in the interface is that since you're using list<u8> or list<string>, we actually don't need strides.

In the future, ideally we'd return an address/pointer into Wasm memory for the buffer field (along with strides). That'll help avoid an extra copy in cases where the model's output isn't contiguous/doesn't have "standard" strides.

We could just make buffer a u64 or something and treat it as an offset into Wasm linear memory, but then we'd have to explicitly handle lifetimes. We can explore that as an optimization later and just stick with list for now.
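
For reference, an interface along these lines lowers to host-side Rust types roughly like the following. This is a sketch: the variant and field names mirror the bindings used later in this PR (TensorNumeric, Dtype, buffer/dtype/shape), but the exact generated code may differ.

// Sketch of the host-side Rust types the WIT tensor interface lowers to.

/// Numeric dtypes supported by the interface.
pub enum Dtype {
    Float,
    Double,
    I8,
    I16,
    I32,
    I64,
    U8,
    U16,
    U32,
    U64,
}

/// A numeric tensor: raw bytes plus dtype and shape. Because the buffer is a
/// WIT `list<u8>`, the data is contiguous and no strides are needed.
pub struct TensorNumeric {
    pub buffer: Vec<u8>,
    pub dtype: Dtype,
    pub shape: Vec<u64>,
}

/// A string tensor, kept separate since strings are variable-length.
pub struct TensorString {
    pub buffer: Vec<String>,
    pub shape: Vec<u64>,
}

/// What crosses the infer boundary in either direction.
pub enum Tensor {
    Numeric(TensorNumeric),
    String(TensorString),
}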

@leizaf (Contributor, author) commented Oct 6, 2023

@VivekPanyam What's the best way to copylessly create a Tensor from Vec<T>? Also, I added two methods to TensorStorage; are those alright?

@VivekPanyam (Owner) left a comment

@VivekPanyam What's the best way to copylessly create a Tensor from Vec<T>?

Does the WIT interface require you to get a Vec<T> as output or is there a way to get a slice? That would let you copy out of Wasm memory directly into a new Tensor.

As far as I'm aware, we need to do at least one copy on the output path (to copy out of Wasm linear memory into something else). If we can make it exactly one, that would be ideal (i.e. without an intermediate Vec).

If you don't see a way to do this, don't worry about it and we can optimize later.

(I added one other comment to answer your other question, but I didn't review the whole PR since it looks like it's still in progress)
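
To sketch what "exactly one copy" could look like on the output path: copy straight from the lifted bytes into an already-allocated tensor buffer, with no intermediate Vec. The helper below is hypothetical and purely illustrative; it is not part of this PR.

/// Copy raw bytes (as lifted out of Wasm linear memory) into an
/// already-allocated destination slice in a single pass.
/// Hypothetical helper for illustration only.
fn copy_bytes_into<T: Copy>(dst: &mut [T], bytes: &[u8]) {
    let elem = std::mem::size_of::<T>();
    assert_eq!(bytes.len(), dst.len() * elem, "buffer size mismatch");
    for (i, out) in dst.iter_mut().enumerate() {
        // read_unaligned avoids assuming the byte buffer is aligned for T.
        let src = bytes[i * elem..].as_ptr() as *const T;
        *out = unsafe { src.read_unaligned() };
    }
}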

@leizaf (Contributor, author) commented Oct 7, 2023

Finally got the runner working! Here is a recap of everything:

Summary

This PR adds two sub-modules, carton-runner-wasm and carton-wasm-interface. The former implements the runner and specifies the components a model is required to implement. The latter is basically empty, but I'd like it to contain guest-side implementations and conversions between Candle and Burn tensors. The motivation for this is that working with the raw component types is quite a rough experience.

Test Coverage

Host-side conversions between Carton and component tensors are covered for f32, u32, i32, and string, and passing, so I imagine they are working for all types. There is a somewhat messy test for WASMModelInstance which works for a basic model. The actual runner main.rs isn't covered, but I assume it's working since it's pretty simple.

Todo?

  • Guest side type conversions in carton-wasm-interface
  • Reduce number of copies
  • Return pointer directly from wasm

Comments

Does the WIT interface require you to get a Vec as output or is there a way to get a slice? That would let you copy out of Wasm memory directly into a new Tensor.

Wasmtime will automatically copy the return into host memory via the Lift trait, and vice versa with the Lower trait. Since list translates to Vec, you get a Vec<u8> back. So currently each infer call does two copies per variable: carton -> component -> wasm. As you mentioned, and I found out the hard way, there are some caveats to copyless construction of Vecs, so this might be difficult.

In the future, ideally we'd return an address/pointer into Wasm memory for the buffer field (along with strides). That'll help avoid an extra copy in cases where the model's output isn't contiguous/doesn't have "standard" strides.

For handling the lifetime, I think Rc-ing the previous output and holding it until the next infer call would probably suffice. That, or introducing a callback to free it. The user has to implement the infer function though, so I'm not sure how that would work.
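
Concretely, a guest-side sketch of the keep-the-previous-output-alive idea could look like this (hypothetical, not in this PR; it uses a thread-local slot rather than Rc, which amounts to the same thing in single-threaded Wasm):

use std::cell::RefCell;

thread_local! {
    // Buffers returned from the previous infer call. They stay alive (and
    // their pointers stay valid in linear memory) until the next call.
    static LAST_OUTPUT: RefCell<Option<Vec<Vec<u8>>>> = RefCell::new(None);
}

/// Hypothetical guest-side helper: hand back (pointer, length) pairs for the
/// new outputs while freeing the previous call's outputs.
fn publish_outputs(outputs: Vec<Vec<u8>>) -> Vec<(u32, u32)> {
    LAST_OUTPUT.with(|slot| {
        let ptrs = outputs
            .iter()
            .map(|buf| (buf.as_ptr() as u32, buf.len() as u32))
            .collect();
        // Replacing the slot drops the previous call's outputs here.
        *slot.borrow_mut() = Some(outputs);
        ptrs
    })
}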

What are your thoughts? Is this mergeable (after some clean up) yet?

@VivekPanyam (Owner) left a comment

Finally got the runner working!

This is great. Thanks for spending time on it!

Wasmtime will automatically copy the return into host memory via the Lift trait, vice versa with the Lower trait. Since list translates to vec you would get Vec<u8> back. So currently each infer call does 2 copies per variable: carton -> component -> wasm.

Makes sense. General thoughts that require no action:

Is it possible/straightforward for us to implement Lift for Tensor (and use it easily)? It looks like wasmtime implements Lift for several types that can be built from a WIT list (i.e. it's not a 1:1 mapping from list to Vec). I haven't looked at this in depth, so maybe it doesn't actually give us what we want. It seems like implementing Lift might require messing with wasmtime implementation details, so maybe it's not worth it (definitely not in this PR, at least). We can explore this more if we find that it actually matters for performance in use cases we see.

The TODOs sound reasonable overall.

Is this mergeable yet?

Almost! We need a couple of other things to get to a runner we can release and deploy. Here are a few things to figure out:

  • Two options for backwards compatibility:
    • Have a policy of not maintaining runner compatibility until the first time it shows up on the docs website (and we can mark it as experimental in the runner's readme until that happens). If that makes sense to you, add a README.md to source/carton-runner-wasm that says in bold at the top that the runner is currently experimental and that, while it's experimental, models created with it may not work in the future.
    • The other option is to confirm that the interfaces are something we're reasonably happy with. We can easily change the implementation by publishing new versions of the runner in the nightly builds, but I'd like to make sure we don't foresee immediate breaking changes to the interface (e.g. the .wit file). Not a huge deal if we need to make a breaking change after releasing (because of how Carton does versioning of runners and the models they create), but I'd rather not make a breaking change immediately (because in theory that means we still need to keep the old runner binary available for all platforms into the future). Some of the TODOs above make it seem like we might change the .wit file relatively soon.

I'd recommend marking it as experimental.

Finally, we need to add a binary that builds a release (example) and a complete test (example), and add it to CI.

The latter is basically empty, but I'd like it to contain guest side implementations and conversions between Candle and Burn tensors. The motivation for this is that working with the raw component types is quite a rough experience.

That makes sense. I'd recommend removing it for now since it's an empty crate and then you can add it back when you start implementing it.

I added a few comments inline; most of them are pretty simple fixes. The big changes that need to happen are the release-building binary and the end-to-end test I mentioned above. Nice work!

source/carton-runner-wasm/src/main.rs (outdated review thread, resolved)
        .unwrap();
    }
    RequestData::Seal { tensors } => {
        todo!()

@VivekPanyam (Owner):
If you're marking the runner as experimental, this is fine. Otherwise we want to pass this through to the Wasm code.

        .unwrap();
    }
    RequestData::InferWithHandle { handle, .. } => {
        todo!()

@VivekPanyam (Owner):
Same as above. If you're marking the runner as experimental, this is fine. Otherwise we want to pass this through to the Wasm code.


impl Into<CartonTensor> for TensorNumeric {
    fn into(self) -> CartonTensor {
        match self.dtype {

@VivekPanyam (Owner):
Might be helpful to use the for_each_numeric_carton_type! macro here from the carton-macros crate

    type Error = Report;

    fn try_from(value: CartonTensor) -> Result<Self> {
        Ok(match value {

@VivekPanyam (Owner):
Might be helpful to use the for_each_carton_type! macro

source/carton-runner-wasm/tests/test_model/model.wasm (outdated review thread, resolved)

world model {
    use types.{tensor};
    export infer: func(in: list<tuple<string, tensor>>) -> list<tuple<string, tensor>>;

@VivekPanyam (Owner):
We need to add seal and infer_with_handle. Not necessary in this PR if you're marking the runner as experimental.

source/carton-runner-wasm/Cargo.toml (outdated review thread, resolved)
source/carton-runner-wasm/src/lib.rs (outdated review thread, resolved)
use carton_runner_wasm::WASMModelInstance;

#[test]
fn test_model_instance() {

@VivekPanyam (Owner):
General comment on what this is testing

@leizaf mentioned this pull request Oct 7, 2023
@leizaf (Contributor, author) commented Oct 11, 2023

Finally, we need to add a binary that builds a release (example) and a complete test (example), and add it to CI.

@VivekPanyam Done, and I implemented most of the suggestions you made. What's the best way to make it so the wasm runner is ignored when targeting wasm/wasi?

@VivekPanyam (Owner) left a comment

Nice work!

What's the best way to make it so the wasm runner is ignored when targeting wasm/wasi?

None of the runners are built for wasm/wasi in CI at the moment, so there's nothing to do here.

I added comments on spots where you could use the for_each_carton_type macros, but there's no need to change them in this PR (just for future reference).

I assume the reason you didn't use them in those spots, but did use them in other places, is that you were trying to return a value and it didn't work. If you want to return from within one of those macros, you currently need an explicit return (as in the example in one of my comments). This is a little counterintuitive; I should probably include it in the docstrings for the macros (or we should modify the macro implementations to make this easier).

Comment below if you want to change something; otherwise I'll let CI run and then merge!

Thanks again for working on this!

Comment on lines +38 to +65
match self.dtype {
    Dtype::Float => {
        copy_to_storage(CartonStorage::<f32>::new(self.shape), &self.buffer).into()
    }
    Dtype::Double => {
        copy_to_storage(CartonStorage::<f64>::new(self.shape), &self.buffer).into()
    }
    Dtype::I8 => copy_to_storage(CartonStorage::<i8>::new(self.shape), &self.buffer).into(),
    Dtype::I16 => {
        copy_to_storage(CartonStorage::<i16>::new(self.shape), &self.buffer).into()
    }
    Dtype::I32 => {
        copy_to_storage(CartonStorage::<i32>::new(self.shape), &self.buffer).into()
    }
    Dtype::I64 => {
        copy_to_storage(CartonStorage::<i64>::new(self.shape), &self.buffer).into()
    }
    Dtype::U8 => copy_to_storage(CartonStorage::<u8>::new(self.shape), &self.buffer).into(),
    Dtype::U16 => {
        copy_to_storage(CartonStorage::<u16>::new(self.shape), &self.buffer).into()
    }
    Dtype::U32 => {
        copy_to_storage(CartonStorage::<u32>::new(self.shape), &self.buffer).into()
    }
    Dtype::U64 => {
        copy_to_storage(CartonStorage::<u64>::new(self.shape), &self.buffer).into()
    }
}

@VivekPanyam (Owner):
A more concise option using a macro might be

for_each_numeric_carton_type! {
    match self.dtype {
        $(Dtype::$CartonType => {
            return copy_to_storage(CartonStorage::<$RustType>::new(self.shape), &self.buffer).into()
        })*
    }
}

Comment on lines +84 to +97
Ok(match value {
    CartonTensor::Float(t) => WasmTensor::Numeric(t.into()),
    CartonTensor::Double(t) => WasmTensor::Numeric(t.into()),
    CartonTensor::I8(t) => WasmTensor::Numeric(t.into()),
    CartonTensor::I16(t) => WasmTensor::Numeric(t.into()),
    CartonTensor::I32(t) => WasmTensor::Numeric(t.into()),
    CartonTensor::I64(t) => WasmTensor::Numeric(t.into()),
    CartonTensor::U8(t) => WasmTensor::Numeric(t.into()),
    CartonTensor::U16(t) => WasmTensor::Numeric(t.into()),
    CartonTensor::U32(t) => WasmTensor::Numeric(t.into()),
    CartonTensor::U64(t) => WasmTensor::Numeric(t.into()),
    CartonTensor::String(t) => WasmTensor::String(t.into()),
    CartonTensor::NestedTensor(_) => return Err(eyre!("Nested tensors are not supported")),
})

@VivekPanyam (Owner):
See my note on for_each_numeric_carton_type! above. You should be able to use for_each_carton_type! here

Comment on lines +147 to +198
for_each_numeric_carton_type! {
    $(
        paste::item! {
            #[test]
            fn [< $TypeStr "_tensor_carton_to_wasm" >]() {
                let storage = CartonStorage::<$RustType>::new(vec![3]);
                let carton_tensor = CartonTensor::$CartonType(
                    copy_to_storage(
                        storage,
                        slice_to_bytes(
                            &[1.0 as $RustType, 2.0 as $RustType, 3.0 as $RustType]
                        )
                    )
                );
                let wasm_tensor = WasmTensor::try_from(carton_tensor).unwrap();
                match wasm_tensor {
                    WasmTensor::Numeric(tensor_numeric) => {
                        assert_eq!(
                            tensor_numeric.buffer,
                            slice_to_bytes(&[1.0 as $RustType, 2.0 as $RustType, 3.0 as $RustType])
                        );
                    }
                    _ => {
                        panic!(concat!("Expected WasmTensor::Numeric variant"));
                    }
                }
            }

            #[test]
            fn [< $TypeStr "_tensor_wasm_to_carton" >]() {
                let buffer = slice_to_bytes(&[1.0 as $RustType, 2.0 as $RustType, 3.0 as $RustType]);
                let tensor = WasmTensor::Numeric(TensorNumeric {
                    buffer: buffer.to_vec(),
                    dtype: Dtype::$CartonType,
                    shape: vec![3],
                });
                let carton_tensor: CartonTensor = tensor.into();
                match carton_tensor {
                    CartonTensor::$CartonType(storage) => {
                        assert_eq!(
                            storage.view().as_slice().unwrap(),
                            &[1.0 as $RustType, 2.0 as $RustType, 3.0 as $RustType]
                        );
                    }
                    _ => {
                        panic!(concat!("Expected CartonTensor::", stringify!($CartonType), " variant"));
                    }
                }
            }
        }
    )*
}

@VivekPanyam (Owner):
Nice! I just expected a single test for each direction that tested all the types. It looks like you got separate tests working! Hopefully figuring out how to do it didn't take too much time.

@VivekPanyam (Owner) commented

(Looks like the formatting check failed. You need to run cargo fmt.)

@VivekPanyam (Owner) commented Oct 11, 2023

Once you run cargo fmt and update the PR, can you:

  1. Change the PR title to remove the [WIP] and add a few words (e.g. "Initial implementation of Wasm runner")
  2. Add a comment with what you want the commit message to be for the merged PR (normally, this is automatically set to original PR description, but that probably doesn't make sense in this case)
  3. Mark the PR as "ready for review" (edit: I did this)

Thanks!

@VivekPanyam marked this pull request as ready for review on Oct 11, 2023, 23:27
@leizaf changed the title from "[WIP] Wasm runner" to "Initial Wasm runner implementation" on Oct 11, 2023
@VivekPanyam marked this pull request as draft on Oct 11, 2023, 23:35
@VivekPanyam marked this pull request as ready for review on Oct 11, 2023, 23:35
@leizaf (Contributor, author) commented Oct 11, 2023

Appreciate the comments on using the macro. I couldn't figure it out initially, and I also didn't know you could match with macros like that. I don't want to restart CI, so I can throw those changes into the next PR, or you could refactor them as well. I'll add a more detailed description in a bit.

@leizaf (Contributor, author) commented Oct 12, 2023

Description

This PR adds a WASM runner, which can run WASM models compiled against the interface (subject to change; see #175) defined in ../carton-runner-wasm/wit/lib.wit. The existing implementation is still unoptimized, requiring two copies per tensor moved to/from WASM. An example of compiling a compatible model can be found in carton-runner-wasm/tests/test_model.

Limitations

  • Only the wasm32-unknown-unknown target has been tested and confirmed working.
  • Only infer is supported for now.
  • Packing only supports a single .wasm file and no other artifacts.
  • No WebGPU, and probably not for a while.

Test Coverage

All type conversions from Carton to WASM and vice versa are fully covered. Pack, Load, and Infer are covered in pack.rs.

TODOs

Track in #164

@VivekPanyam merged commit 0aef525 into VivekPanyam:main on Oct 12, 2023
@VivekPanyam (Owner) commented

Merged! Nice work :) 🎉

@leizaf mentioned this pull request Oct 12, 2023
@leizaf deleted the wasm-runner branch on October 12, 2023, 01:25
VivekPanyam added a commit that referenced this pull request Oct 14, 2023
Although #173 included several dependency changes, it did not include an
updated `Cargo.lock` file. This PR updates the lock file and adds a
check to CI to ensure that lock files match manifest changes.

### Test plan

CI