Initial Wasm runner implementation #173
Conversation
This is a good start. Here are a few high level comments:
Currently it requires the user to define and export buffers to transfer tensors to/from wasm. The alternative is to pass pointers and lengths manually, which I think is more complicated, and doesn't allow for multiple returns.
Unfortunately, the current approach is not particularly flexible as many things cannot be dynamic (e.g. number of return tensors, number of inputs, shapes of tensors, etc.).
I'd recommend reading about the WebAssembly Component Model and then taking a look at wit-bindgen.
The component model allows you to define more sophisticated interface types and defines a canonical ABI so that Wasm modules implemented in several languages can communicate in a consistent way. This is somewhat similar to the interface definitions we were talking about in #164.
We generally want to support the same `infer` interface as the rest of Carton (the input is an arbitrary number of named Tensors and the output is an arbitrary number of named Tensors). This should be possible using WIT.
Can you try to get a prototype working using the component model/WIT? wasmtime has support for WIT so you shouldn't have to change runtimes.
@VivekPanyam What do you think of the `.wit` for tensors I drafted up? And `infer` could just be […], or just a list of tensors. I did consider using the component model initially, but I wasn't sure how developed the tooling around it is yet.
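For reference, the overall shape of such a tensors `.wit` might look like the sketch below. The `infer` signature matches the one that appears later in this thread; the `dtype` enum and record field names are inferred from the Rust code in this PR and should be treated as an illustration, not the exact draft.

```wit
// Hypothetical sketch of a tensors interface; the record layout is an
// illustration based on the Rust types in this PR, not the exact draft.
interface types {
    enum dtype { float, double, i8, i16, i32, i64, u8, u16, u32, u64 }

    record tensor-numeric {
        dtype: dtype,
        shape: list<u64>,
        buffer: list<u8>,
    }
}

world model {
    use types.{tensor};
    // Named tensors in, named tensors out, matching Carton's infer interface.
    export infer: func(in: list<tuple<string, tensor>>) -> list<tuple<string, tensor>>;
}
```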
One thing to note in the interface is that since you're using […]. In the future, ideally we'd return an address/pointer into Wasm memory for the […]. We could just make […].
@VivekPanyam What's the best way to copylessly create a Tensor from `Vec<T>`?
> @VivekPanyam What's the best way to copylessly create a Tensor from `Vec<T>`?

Does the WIT interface require you to get a `Vec<T>` as output, or is there a way to get a slice? That would let you copy out of Wasm memory directly into a new `Tensor`.

As far as I'm aware, we need to do at least one copy on the output path (to copy out of Wasm linear memory into something else). If we can make it exactly one, that would be ideal (i.e. without an intermediate `Vec`).

If you don't see a way to do this, don't worry about it and we can optimize later.
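As a concrete illustration of the "exactly one copy" idea, here's a minimal standalone sketch: a byte buffer as it would sit in Wasm linear memory is decoded straight into a typed buffer in a single pass. The function name and the use of `Vec<f32>` as a stand-in for the real `Tensor` storage are assumptions for illustration.

```rust
// Sketch of a one-copy output path: decode a little-endian byte buffer
// (as lifted out of Wasm linear memory) directly into a typed buffer.
// `Vec<f32>` stands in for the real Tensor storage type here.
fn bytes_to_f32(bytes: &[u8]) -> Vec<f32> {
    assert_eq!(bytes.len() % 4, 0, "buffer length must be a multiple of 4");
    bytes
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes(c.try_into().unwrap()))
        .collect()
}

fn main() {
    // Bytes laid out the way they would be in Wasm linear memory.
    let bytes: Vec<u8> = [1.0f32, 2.0, 3.0]
        .iter()
        .flat_map(|v| v.to_le_bytes())
        .collect();
    assert_eq!(bytes_to_f32(&bytes), vec![1.0, 2.0, 3.0]);
}
```

If the host can borrow a slice of guest memory instead of receiving an owned `Vec<u8>`, this decode step becomes the only copy on the output path.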
(I added one other comment to answer your other question, but I didn't review the whole PR since it looks like it's still in progress)
Finally got the runner working! Here is a recap of everything:

### Summary

This PR adds 2 sub-modules, […]

### Test Coverage

Host side conversions between carton and component tensors are covered for f32, u32, i32, and string, and are passing, so I imagine they are working for all types. There is a somewhat messy test for […]

### Todo

[…]
### Comments

Wasmtime will automatically copy the return into host memory via the `Lift` trait, and vice versa with the `Lower` trait. Since `list` translates to vec you would get `Vec<u8>` back. So currently each infer call does 2 copies per variable: carton -> component -> wasm.

For handling the lifetime I think […]

What are your thoughts? Is this mergeable (after some clean up) yet?
> Finally got the runner working!

This is great. Thanks for spending time on it!

> Wasmtime will automatically copy the return into host memory via the `Lift` trait, vice versa with the `Lower` trait. Since `list` translates to vec you would get `Vec<u8>` back. So currently each infer call does 2 copies per variable: carton -> component -> wasm.
Makes sense. General thoughts that require no action:
Is it possible/straightforward for us to implement `Lift` for `Tensor` (and use it easily)? It looks like wasmtime implements `Lift` for several types that can be built from a wit `list` (i.e. it's not a 1:1 mapping from `list` to `Vec`). I haven't looked at this in depth, so maybe it doesn't actually give us what we want. It seems like implementing `Lift` might require messing with wasmtime implementation details, so maybe it's not worth it (definitely not in this PR at least). We can explore this more if we find that this actually matters for performance in use cases we see.
The TODOs sound reasonable overall.
> Is this mergeable yet?
Almost! We need a couple other things to get to a runner we can release and deploy. Here are a few things to figure out:
- Two options with backwards compatibility:
  - Have a policy of not maintaining runner compatibility until the first time it shows up on the docs website (and we can mark it as experimental in the runner's readme until that happens). If that makes sense to you, add a `README.md` to `source/carton-runner-wasm` that says in bold at the top that the runner is currently experimental and, while it's experimental, models created with it may not work in the future.
  - The other option is to confirm that the interfaces are something we're reasonably happy with. We can easily change the implementation by publishing new versions of the runner in the nightly builds, but I'd like to make sure we don't foresee immediate breaking changes to the interface (e.g. the `.wit` file). Not a huge deal if we need to make a breaking change after releasing (because of how Carton does versioning of runners and the models they create), but I'd rather not make a breaking change immediately (because in theory that means we still need to keep the old runner binary available for all platforms into the future). Some of the TODOs above make it seem like we might change the `.wit` file relatively soon.

I'd recommend marking it as experimental.
Finally, we need to add a binary that builds a release (example), a complete test (example), and add the runner to CI.
> The latter is basically empty, but I'd like it to contain guest side implementations and conversions between Candle and Burn tensors. The motivation for this is that working with the raw component types is quite a rough experience.
That makes sense. I'd recommend removing it for now since it's an empty crate and then you can add it back when you start implementing it.
I added a few comments inline; most of them are pretty simple fixes. The big changes that need to happen are the release-building binary and the end-to-end test I mentioned above. Nice work!
        .unwrap();
    }
    RequestData::Seal { tensors } => {
        todo!()
If you're marking the runner as experimental, this is fine. Otherwise we want to pass this through to the Wasm code.
        .unwrap();
    }
    RequestData::InferWithHandle { handle, .. } => {
        todo!()
Same as above. If you're marking the runner as experimental, this is fine. Otherwise we want to pass this through to the Wasm code.
impl Into<CartonTensor> for TensorNumeric {
    fn into(self) -> CartonTensor {
        match self.dtype {
Might be helpful to use the `for_each_numeric_carton_type!` macro here from the `carton-macros` crate.
    type Error = Report;

    fn try_from(value: CartonTensor) -> Result<Self> {
        Ok(match value {
Might be helpful to use the `for_each_carton_type!` macro.
world model {
    use types.{tensor};
    export infer: func(in: list<tuple<string, tensor>>) -> list<tuple<string, tensor>>;
We need to add `seal` and `infer_with_handle`. Not necessary in this PR if you're marking the runner as experimental.
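One possible shape for those additions is sketched below; the handle representation (a plain `u64`) and the exact signatures are assumptions, not something decided in this PR.

```wit
// Hypothetical extension of the world with seal/infer-with-handle;
// the handle type and signatures are assumptions for illustration.
world model {
    use types.{tensor};
    export infer: func(in: list<tuple<string, tensor>>) -> list<tuple<string, tensor>>;
    // Seal a set of named inputs ahead of time and get back an opaque handle.
    export seal: func(in: list<tuple<string, tensor>>) -> u64;
    // Run inference against previously sealed inputs.
    export infer-with-handle: func(handle: u64) -> list<tuple<string, tensor>>;
}
```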
use carton_runner_wasm::WASMModelInstance;

#[test]
fn test_model_instance() {
General comment on what this is testing
@VivekPanyam Done, and implemented most of the suggestions you made. What's the best way to make it so the wasm runner is ignored when targeting wasm/wasi?
Nice work!
> What's the best way to make it so the wasm runner is ignored when targeting wasm/wasi?
None of the runners are built for wasm/wasi in CI at the moment so nothing to do here.
I added comments on spots where you could use the `for_each_carton_type` macros, but no need to change them in this PR (just for future reference).

I assume the reason you didn't use them in those spots, but used them in other places, is because you were trying to return a value and it didn't work. If you want to return from within one of those macros, you currently need to explicitly `return` (as in the example in one of my comments). This is a little counterintuitive and I should probably include it in the docstrings for the macros (or we should modify the macro implementations to make this easier).
Comment below if you want to change something, otherwise I'll let CI run and then merge!
Thanks again for working on this!
match self.dtype {
    Dtype::Float => {
        copy_to_storage(CartonStorage::<f32>::new(self.shape), &self.buffer).into()
    }
    Dtype::Double => {
        copy_to_storage(CartonStorage::<f64>::new(self.shape), &self.buffer).into()
    }
    Dtype::I8 => copy_to_storage(CartonStorage::<i8>::new(self.shape), &self.buffer).into(),
    Dtype::I16 => {
        copy_to_storage(CartonStorage::<i16>::new(self.shape), &self.buffer).into()
    }
    Dtype::I32 => {
        copy_to_storage(CartonStorage::<i32>::new(self.shape), &self.buffer).into()
    }
    Dtype::I64 => {
        copy_to_storage(CartonStorage::<i64>::new(self.shape), &self.buffer).into()
    }
    Dtype::U8 => copy_to_storage(CartonStorage::<u8>::new(self.shape), &self.buffer).into(),
    Dtype::U16 => {
        copy_to_storage(CartonStorage::<u16>::new(self.shape), &self.buffer).into()
    }
    Dtype::U32 => {
        copy_to_storage(CartonStorage::<u32>::new(self.shape), &self.buffer).into()
    }
    Dtype::U64 => {
        copy_to_storage(CartonStorage::<u64>::new(self.shape), &self.buffer).into()
    }
}
A more concise option using a macro might be
for_each_numeric_carton_type! {
match self.dtype {
$(Dtype::$CartonType => {
return copy_to_storage(CartonStorage::<$RustType>::new(self.shape), &self.buffer).into()
})*
}
}
Ok(match value {
    CartonTensor::Float(t) => WasmTensor::Numeric(t.into()),
    CartonTensor::Double(t) => WasmTensor::Numeric(t.into()),
    CartonTensor::I8(t) => WasmTensor::Numeric(t.into()),
    CartonTensor::I16(t) => WasmTensor::Numeric(t.into()),
    CartonTensor::I32(t) => WasmTensor::Numeric(t.into()),
    CartonTensor::I64(t) => WasmTensor::Numeric(t.into()),
    CartonTensor::U8(t) => WasmTensor::Numeric(t.into()),
    CartonTensor::U16(t) => WasmTensor::Numeric(t.into()),
    CartonTensor::U32(t) => WasmTensor::Numeric(t.into()),
    CartonTensor::U64(t) => WasmTensor::Numeric(t.into()),
    CartonTensor::String(t) => WasmTensor::String(t.into()),
    CartonTensor::NestedTensor(_) => return Err(eyre!("Nested tensors are not supported")),
})
See my note on `for_each_numeric_carton_type!` above. You should be able to use `for_each_carton_type!` here.
for_each_numeric_carton_type! {
    $(
        paste::item! {
            #[test]
            fn [< $TypeStr "_tensor_carton_to_wasm" >]() {
                let storage = CartonStorage::<$RustType>::new(vec![3]);
                let carton_tensor = CartonTensor::$CartonType(
                    copy_to_storage(
                        storage,
                        slice_to_bytes(
                            &[1.0 as $RustType, 2.0 as $RustType, 3.0 as $RustType]
                        )
                    )
                );
                let wasm_tensor = WasmTensor::try_from(carton_tensor).unwrap();
                match wasm_tensor {
                    WasmTensor::Numeric(tensor_numeric) => {
                        assert_eq!(
                            tensor_numeric.buffer,
                            slice_to_bytes(&[1.0 as $RustType, 2.0 as $RustType, 3.0 as $RustType])
                        );
                    }
                    _ => {
                        panic!("Expected WasmTensor::Numeric variant");
                    }
                }
            }

            #[test]
            fn [< $TypeStr "_tensor_wasm_to_carton" >]() {
                let buffer = slice_to_bytes(&[1.0 as $RustType, 2.0 as $RustType, 3.0 as $RustType]);
                let tensor = WasmTensor::Numeric(TensorNumeric {
                    buffer: buffer.to_vec(),
                    dtype: Dtype::$CartonType,
                    shape: vec![3],
                });
                let carton_tensor: CartonTensor = tensor.into();
                match carton_tensor {
                    CartonTensor::$CartonType(storage) => {
                        assert_eq!(
                            storage.view().as_slice().unwrap(),
                            &[1.0 as $RustType, 2.0 as $RustType, 3.0 as $RustType]
                        );
                    }
                    _ => {
                        panic!(concat!("Expected CartonTensor::", stringify!($CartonType), " variant"));
                    }
                }
            }
        }
    )*
}
Nice! I just expected a single test for each direction that tested all the types. It looks like you got separate tests working! Hopefully figuring out how to do it didn't take too much time.
(Looks like the formatting check failed. You need to run […])

Once you run […]
Thanks!
Appreciate the comments on using the macro. I couldn't figure it out initially, and I also didn't know you could match with macros like that. I don't want to restart CI, so I can throw those changes in the next PR, or you could refactor them as well. I'll add a more detailed description in a bit.
### Description

This PR adds a WASM runner, which can run WASM models compiled using the interface (subject to change, see #175) defined in […]

### Limitations
### Test Coverage

All type conversions from Carton to WASM and vice versa are fully covered. Pack, Load, and Infer are covered in pack.rs.

### TODOs

Track in #164
Merged! Nice work :) 🎉
Although #173 included several dependency changes, it did not include an updated `Cargo.lock` file. This PR updates the lock file and adds a check to CI to ensure that lock files match manifest changes.

### Test plan

CI
Still untested, but I might have a very basic proof of concept ready. Currently it requires the user to define and export buffers to transfer tensors to/from wasm. The alternative is to pass pointers and lengths manually, which I think is more complicated and doesn't allow for multiple returns. I used color_eyre temporarily for quick error handling; let me know what error handling system you are using and I can refactor to that. Also, I'm not super sure how to test the runner; I have a half-completed test written.