feature: add "named models" #36
cc: @geekbeast, @roee88
This also updates Wasmtime to use the latest version of the wasi-nn spec instead of an older commit.
A global scope would not be virtualizable. The way `--dir` and `--listenfd` work is that they pass (what are conceptually) handles into the running program, associating names with them only as a compatibility layer for existing code and command-line argument-passing schemes. One option would be to have the registry be dynamic and referenced by handle. Another option would be to make the registry a "link-time authority", meaning you can have a function like
I've come to a similar conclusion as @geekbeast and I have discussed this.

And, given that names would not be globally scoped, do we even need

@sunfishcode, I'm not sure I understand exactly what you mean by the two options you bring up at the end. My feeling now, after some thought, is that the best way to provide the name-model mapping is "out-of-band": i.e., forget about
@sunfishcode My understanding is that the registry being proposed here is definitely not global in scope. While there are some common inference models that hosts may choose to expose, they can do so by modifying their initialization of the dynamic registry at the host level. Do you have any objections to a fully dynamic registry whose contents are completely controlled by host policy? That's actually what I implemented in bytecodealliance/wasmtime#6134 (caveat: there is a bug with the lifetime that I'm fixing up at the moment).
@sunfishcode Also, a quick clarifying question that I suspect I know the answer to: if two identical Wasm guest programs are run, any visible shared state across those two programs would be considered to break virtualization, correct?
The driving reason behind named models is that model compilation is expensive. I have seen a 24x inference cost for model compilation in some testing. A simpler solution here may be to expose an `import_compiled_model(...)` that takes a model that is ready to be executed. This may not be compatible with all frameworks; in particular, this is a current open issue in TensorFlow (tensorflow/tensorflow#55520).
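The amortization argument above can be illustrated with a small sketch: compile once on first use, then reuse the compiled artifact across invocations. This is a toy stand-in, not a wasi-nn or backend API; `compile`, `CompiledModel`, and `Cache` are hypothetical names, and real backends (e.g., OpenVINO) perform the expensive compilation internally.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for an expensively compiled model artifact.
struct CompiledModel {
    weights_len: usize,
}

// Stand-in for the expensive compilation step named models aim to amortize.
fn compile(bytes: &[u8]) -> CompiledModel {
    CompiledModel { weights_len: bytes.len() }
}

// A name-keyed cache: compile on first lookup, reuse thereafter.
struct Cache {
    compiled: HashMap<String, CompiledModel>,
    compiles: usize, // counts how often the expensive step actually ran
}

impl Cache {
    fn new() -> Self {
        Cache { compiled: HashMap::new(), compiles: 0 }
    }

    fn get_or_compile(&mut self, name: &str, bytes: &[u8]) -> &CompiledModel {
        if !self.compiled.contains_key(name) {
            self.compiles += 1;
            let model = compile(bytes);
            self.compiled.insert(name.to_string(), model);
        }
        &self.compiled[name]
    }
}

fn main() {
    let mut cache = Cache::new();
    let model_bytes = vec![0u8; 1024];
    // Five "instances" ask for the same named model; it compiles once.
    for _ in 0..5 {
        let m = cache.get_or_compile("mobilenet", &model_bytes);
        assert_eq!(m.weights_len, 1024);
    }
    assert_eq!(cache.compiles, 1);
    println!("compiles = {}", cache.compiles);
}
```

In a FaaS setting, each short-lived instance hitting the cache instead of recompiling is where the reported multiple-fold cost disappears.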
I would think even a locally scoped `load_named` method is useful, as it solves the problem of maintaining state information across calls in the FaaS use case.
This sounds great! We look forward to implementing this in WasmEdge! |
Hello, this is hydai from WasmEdge. I'd like to clarify something about the section you mentioned:
It seems like this would load directly from a host path, rather than utilizing a guest path mapped via a pre-open. For instance, let's consider a model named

I'm curious if it's possible to use a guest path instead. Here's an example:

```shell
# Host
/host_path/NM.model

# Map it with a pre-open
wasm_runtime --dir /guest_path/NM.model:/host_path/NM.model wasi-nn.wasm

# Within the guest environment
call register_named: func(name: string, path: string) -> expected<error>
```

prior to invoking `load_named()`. The reason behind this question is a need for container integration:
Hey @hydai, thanks for the feedback. The idea I was going for with this change is that the model bytes should not even need to be visible to the Wasm guest at all. The issue with the current
Hi @abrown, gotcha! I would like to delve deeper into "compiling model bytes into other forms." Two aspects require clarification:
Another thing: do we need to add
I don't think we want this "precompilation" step to be guest-visible. If it were, it would expose differences between backends that compile and those that don't. And if we hide the compilation like we do, each host implementation is free to optimize when and how they do this step.
I don't think it would fit in the wasi-nn bindings because this compilation I'm talking about is a host-side thing.
@geekbeast and I discussed this (I think in a meeting) as we were thinking through #38. (You might want to look at that PR as well.) I am pretty reticent to add more surface area to the wasi-nn API unless we absolutely need it. More API surface means more to implement, more to maintain, more chances to make a mistake. So we agreed to hold off on deregistering models until a user demands it; I think we're expecting most users to clean up the wasi-nn state by just exiting the Wasm instance. If this does become an issue for someone, though, it seems like a reasonable addition.
Ok, this feature has now been added by #38. |
* wasi-nn: add [named models]

This change adds a way to retrieve preloaded ML models (i.e., "graphs" in wasi-nn terms) from a registry. The wasi-nn specification includes a new function, `load_by_name`, that can be used to access these models more efficiently than before; previously, a user's only option was to read/download/etc. all of the bytes of an ML model and pass them to the `load` function.

[named models]: WebAssembly/wasi-nn#36

In Wasmtime's implementation of wasi-nn, we call the registry that holds the models a `GraphRegistry`. We include a simplistic `InMemoryRegistry` for use in the Wasmtime CLI (more on this later), but the idea is that production use will involve some more complex caching and thus a new implementation of a registry, a `Box<dyn GraphRegistry>`, passed into the wasi-nn context. Note that, because we now must be able to `clone` a graph out of the registry and into the "used graphs" table, the OpenVINO `BackendGraph` is updated to be easier to copy around.

To allow experimentation with this "preload a named model" functionality, this change also adds a new Wasmtime CLI flag: `--graph <encoding>:<host dir>`. Wasmtime CLI users can now preload a model from a directory; the directory `basename` is used as the model name. Loading models from a directory is probably not desired in Wasmtime embeddings, so it is cordoned off into a separate `BackendFromDir` extension trait.

* wasi-nn: add "named model" test

Add a new example crate which loads a model by name and performs image classification. It uses the same MobileNet model as the existing test but a new version of the Rust bindings. The new crate is built and run with the new CLI flag in the `ci/run-wasi-nn-example.sh` script.

prtest:full

* review: rename `--graph` to `--wasi-nn-graph`
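The registry design described in the commit message can be sketched in a few lines of plain Rust. This is a simplified illustration only: the names `GraphRegistry` and `InMemoryRegistry` come from the commit message above, but the actual Wasmtime types, signatures, and graph representation differ.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Toy graph type: wrapping the model bytes in an `Arc` makes cloning cheap,
// mirroring the commit's point that `BackendGraph` was made easier to copy
// so graphs can be cloned out of the registry into a "used graphs" table.
#[derive(Clone)]
struct Graph(Arc<Vec<u8>>);

// The registry abstraction: production embeddings could supply their own
// caching implementation behind a `Box<dyn GraphRegistry>`.
trait GraphRegistry {
    fn get(&self, name: &str) -> Option<Graph>;
}

// A simplistic in-memory registry, as used here by the Wasmtime CLI sketch.
struct InMemoryRegistry(HashMap<String, Graph>);

impl InMemoryRegistry {
    fn new() -> Self {
        InMemoryRegistry(HashMap::new())
    }
    fn register(&mut self, name: &str, bytes: Vec<u8>) {
        self.0.insert(name.to_string(), Graph(Arc::new(bytes)));
    }
}

impl GraphRegistry for InMemoryRegistry {
    fn get(&self, name: &str) -> Option<Graph> {
        // Clone the (cheap) handle out of the registry for the caller.
        self.0.get(name).cloned()
    }
}

fn main() {
    let mut registry = InMemoryRegistry::new();
    registry.register("mobilenet", vec![1, 2, 3]);

    // A host could pass any registry implementation into the wasi-nn context.
    let boxed: Box<dyn GraphRegistry> = Box::new(registry);
    let graph = boxed.get("mobilenet").expect("model was preloaded");
    assert_eq!(graph.0.len(), 3);
    assert!(boxed.get("unknown").is_none());
    println!("ok");
}
```

The trait-object indirection is what lets the CLI's simple in-memory map and a production cache share one interface.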
Currently the wasi-nn API only allows loading ML models from their byte-serialized format (i.e., using `load`). This can be problematic for several reasons. I would like to propose "named models" as a way of solving these issues.

Other WASI proposals, such as wasi-filesystem and wasi-sockets, provide a way of creating pre-instantiation resources that are then available to the Wasm module once instantiated (see, e.g., the `--dir` and `--listenfd` flags on the Wasmtime CLI). If a similar idea were available to wasi-nn, users could specify models before instantiation and these could be shared across instances. This sharing could only happen, however, if the models are "named."

Spec changes
To support this in the specification, one would need the ability to load a model using only a name and (possibly) the ability to load a model from bytes and name it. This way there could be some symmetry between the host and guest functionality. I think this could be supported by adding the following functions:

Obviously the ability to load a "named model" for all instances running in a host is up for debate: perhaps the available scope of that `name` should only be the Wasm instance itself or some host-specified neighborhood. I included the most controversial version, global scope, to see what people think. I also think the host may want to implement some way to limit the resources consumed by `wasi-nn`; this is a host implementation concern, discussed below.

Host engine changes
Though this repository is the spec repository and is primarily concerned with the Wasm-visible API, I think it would be valuable to discuss what changes this might imply for an engine implementing wasi-nn. Here are some suggestions:
The engine might want to limit the resources available to a wasi-nn-using module: this could take the form of limiting the number of models loaded via `load` or `load_named`, limiting the size of the models loaded (somehow), etc. One could imagine a flag like `--nn-max-models` to do something like this. (I would also think it would be great to have a generic way to limit any WASI API, if anyone has thoughts on that.)

The engine would likely want a way to preload some models to avoid `load`-ing them repeatedly in new Wasm instances. One could imagine a flag like `--nn-preload <name>:<encoding>:<path>` to tell the engine both the name of the model and how to load it. All modules instantiated by that engine would have the models available for retrieval with `get_named`.
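A flag value in the proposed `--nn-preload <name>:<encoding>:<path>` shape is straightforward for an engine to parse. The sketch below is hypothetical glue code, not part of any real CLI; the `mobilenet`/`openvino` values are illustrative, and a real implementation would also validate the encoding and path.

```rust
// Parse a `<name>:<encoding>:<path>` preload specification into its parts.
// `splitn(3, ':')` keeps any further colons inside the path component.
fn parse_preload(value: &str) -> Result<(String, String, String), String> {
    let mut parts = value.splitn(3, ':');
    match (parts.next(), parts.next(), parts.next()) {
        (Some(name), Some(encoding), Some(path))
            if !name.is_empty() && !encoding.is_empty() && !path.is_empty() =>
        {
            Ok((name.to_string(), encoding.to_string(), path.to_string()))
        }
        _ => Err(format!("expected <name>:<encoding>:<path>, got `{}`", value)),
    }
}

fn main() {
    let (name, encoding, path) =
        parse_preload("mobilenet:openvino:/models/mobilenet").unwrap();
    assert_eq!(name, "mobilenet");
    assert_eq!(encoding, "openvino");
    assert_eq!(path, "/models/mobilenet");

    // Malformed values are rejected rather than silently misparsed.
    assert!(parse_preload("missing-parts").is_err());
    println!("{} {} {}", name, encoding, path);
}
```

The engine would then load the model at `path` with the backend selected by `encoding` and register it under `name` for later retrieval by guests.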