WebAssembly / Web runtime (both for wasm-simd and WebGPU) #3497
Comments
cc: @mcr229 or @digantdesai regarding running XNNPACK via wasm
Also cc: @mergennachin
I've talked with @digantdesai about this before. I think for XNNPACK he mentioned it should just be plug and play. I've been wanting to try out wasm for some time now, just haven't had the bandwidth.
I also wonder about the fusion capabilities of ExecuTorch :) Does it allow Inductor-codegen'd fused kernels (e.g. quant/dequant fused directly into the flash attention kernel, with positional embedding computation also fused into that kernel)?

Another interesting backend is webgpu/wgpu: https://github.com/huggingface/ratchet - or even raw wgpu/wgsl shaders could in theory be a compilation target for fused kernels.

But even if ExecuTorch does not support wild codegen/fusions, it'd still be good to have it as a baseline, with comparisons against ort-web, tflite-tfjs, tvm-wasm, and ggml compiled to wasm. This should show roughly where all these frameworks stand (especially if compiling is relatively doable).
And given that currently PyTorch does not have its own inference wasm/WebGPU story, having executorch compiled to wasm-simd might be a nice baseline to have (especially if it's minimalistic and relatively simple to compile) |
I suspect much of the core should be compilable with the Emscripten C++ compiler. Probably not the optimized operators, though, and I'm not too sure about backends/xnnpack.
Maybe the best first step would be adding some sort of GitHub Actions CI job that compiles it with Emscripten... (even if no tests exercising the build exist so far)
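Such a CI job could be sketched as a small build script. Everything below is an assumption, not a verified recipe: the CMake option names (including `EXECUTORCH_BUILD_XNNPACK`) and repository layout are illustrative, and `emcmake` comes from an installed, activated Emscripten SDK.

```shell
#!/usr/bin/env sh
# Hypothetical sketch: cross-compile ExecuTorch's core to WebAssembly with Emscripten.
# Assumes the Emscripten SDK is installed and activated (provides emcmake/emcc).
set -e

git clone --depth 1 https://github.com/pytorch/executorch.git
cd executorch

# emcmake wraps cmake so it uses the Emscripten toolchain file.
# The build option below is illustrative; check ExecuTorch's CMakeLists for real flags.
emcmake cmake -B cmake-out-wasm \
  -DCMAKE_BUILD_TYPE=Release \
  -DEXECUTORCH_BUILD_XNNPACK=ON   # hypothetical; XNNPACK ships wasm-simd kernels

cmake --build cmake-out-wasm -j4
```

In a GitHub Actions workflow these commands would run after an emsdk setup step; even a compile-only job like this would catch portability regressions long before any wasm tests exist.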
It should be, given a bunch of WASM[SIMD] kernels. I haven't tried it myself, though. IIRC there isn't any CI for that on github/xnnpack either.
XNNPACK is also known to compile (and maybe even be tested) for wasm/simd, so this should somehow be achievable... I don't know if any compact backend library/project exists for WebGPU kernels.
I'm wondering if ExecuTorch can be compiled for a WebAssembly target? As far as I understand, XNNPACK exists for wasm-simd, so theoretically at least the CPU path can be done? (e.g. to be compared with tflite+tfjs, ort-web, and tvm-wasm, at least for some popular models like MobileNets)
(This is especially interesting if strong fusion/codegen can be done to produce fused wasm-simd code/fused WebGPU programs - although maybe this is an ask for Inductor)