[Web] WebGPU backend fails to load some model due to exception during initialization inside transpose optimizer #15869
Comments
FYI, loading still fails even after conversion to .ort format.
I will take a look.
The most likely reason is that the VAE encoder graph has operators that are not yet supported by the WebGPU execution provider, e.g.,
The operator coverage is a problem, but that should not cause the model loading failure. After debugging the issue, I found that the problem is in the transpose optimizer.
I need to dig deeper into the source code. I am debugging it.
### Description

Because of #15618, the default allocator changed to the device allocator, which will be the GPU allocator instead of the CPU allocator. The transpose optimizer expects to read data from initializers, so a CPU allocator is required there. This change fixes the transpose optimizer on GPU EPs.

Fixes the issue referred to in #15869, #15796.
@gegogi This issue should have been fixed by the PR mentioned above. Please help validate whether it works. Thanks.
Could you publish the latest nightly npm build? I tried to build onnxruntime myself but couldn't figure out compilation errors related to a protobuf version mismatch. It seems the project has protobuf as a submodule but is trying to include headers from a system directory that have different signatures.
Please try [email protected].
This appears to be fixed for me when running this example: https://gist.github.com/gabrielgrant/cb3e072dec5a416b4fc24f18ae902fb7 ...but, despite using
@gegogi are you able to confirm this is fixed? (This should be in a release now.)
ONNX Runtime Web depends on C++ code for session, graph, and model execution, which is compiled into WebAssembly. In short, ONNX Runtime Web always needs to load WebAssembly, no matter which backend you use. However, you don't always have to set … The issue related to "Transpose" is already fixed, so let me close the issue.
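As a rough illustration of that point (a sketch, not code from this thread: the model path and the CDN-style `wasmPaths` URL are placeholders, and depending on the onnxruntime-web version the WebGPU-enabled build may be exposed as `onnxruntime-web/webgpu` or the `ort.webgpu` bundle), the WebAssembly artifacts are still fetched and initialized even when a session only requests the `webgpu` execution provider:

```js
import * as ort from 'onnxruntime-web';

// Optional: tell the runtime where to fetch the .wasm artifacts from.
// (Placeholder URL; by default they resolve relative to the bundled script.)
ort.env.wasm.wasmPaths = 'https://example.com/onnxruntime-web/dist/';

// Even with only the WebGPU EP requested, session, graph, and model handling
// still run in the WebAssembly core, so the .wasm files are downloaded anyway.
const session = await ort.InferenceSession.create('model.onnx', {
  executionProviders: ['webgpu'],
});
console.log('loaded with inputs:', session.inputNames);
```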
### Describe the issue
I am trying to load a model on the WebGPU backend.
I could load the model downloaded from:
https://github.com/onnx/models/blob/main/vision/classification/mobilenet/model/mobilenetv2-12.onnx
But I couldn't load the following model:
https://huggingface.co/runwayml/stable-diffusion-v1-5/tree/onnx/vae_encoder
Both models can be loaded using Python onnxruntime.
### To reproduce
Download the model from:
https://huggingface.co/runwayml/stable-diffusion-v1-5/tree/onnx/vae_encoder
and run the following code:
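(The original code snippet is not preserved in this thread; below is a minimal reproduction sketch under assumed details: the local path `vae_encoder/model.onnx` and the `webgpu` execution provider are illustrative, not the reporter's exact code.)

```js
// Minimal repro sketch (assumed, not the reporter's original snippet).
// Depending on the onnxruntime-web version, the WebGPU-enabled build may need
// to be imported from 'onnxruntime-web/webgpu' instead.
import * as ort from 'onnxruntime-web';

async function main() {
  try {
    // Session creation is where the reported failure occurs: an exception is
    // thrown from the transpose optimizer during initialization.
    const session = await ort.InferenceSession.create('vae_encoder/model.onnx', {
      executionProviders: ['webgpu'],
    });
    console.log('model loaded, inputs:', session.inputNames);
  } catch (e) {
    console.error('failed to load model:', e);
  }
}

main();
```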
### Urgency

No response

### ONNX Runtime Installation

Released Package

### ONNX Runtime Version or Commit ID

[email protected]

### Execution Provider

Other / Unknown