
[Web] WebGPU backend fails to load some model due to exception during initialization inside transpose optimizer #15869

Closed
gegogi opened this issue May 9, 2023 · 9 comments
Labels
model:transformer issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc. platform:web issues related to ONNX Runtime web; typically submitted using template

Comments

gegogi commented May 9, 2023

Describe the issue

I am trying to load a model using the WebGPU backend.
I could load the model downloaded from:
https://github.com/onnx/models/blob/main/vision/classification/mobilenet/model/mobilenetv2-12.onnx
But I couldn't load the following model:
https://huggingface.co/runwayml/stable-diffusion-v1-5/tree/onnx/vae_encoder
Both models can be loaded using Python onnxruntime.

To reproduce

Download the model from:
https://huggingface.co/runwayml/stable-diffusion-v1-5/tree/onnx/vae_encoder
and run the following code:

const ort = require('onnxruntime-web/webgpu');

async function main() {
    const modelPath = './models/sd15_vae_encoder_model.onnx';
    // Loading fails here with the WebGPU EP, while the same model loads in Python.
    const session = await ort.InferenceSession.create(modelPath, { executionProviders: ['webgpu'] });
    console.log(session.inputNames); // confirms the model loaded
}

main();

Urgency

No response

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

[email protected]

Execution Provider

Other / Unknown

@gegogi gegogi added the platform:web issues related to ONNX Runtime web; typically submitted using template label May 9, 2023
@github-actions github-actions bot added the model:transformer issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc. label May 9, 2023

gegogi commented May 9, 2023

FYI, loading still fails even after conversion to .ort format.
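
For reference, the .ort conversion can be done with the converter that ships in the onnxruntime Python package (assuming the standard tool; the file name matches the repro above):

python -m onnxruntime.tools.convert_onnx_models_to_ort sd15_vae_encoder_model.onnx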

fs-eire commented May 9, 2023

I will take a look

visheratin (Contributor) commented:

The most likely reason is that the VAE encoder graph has operators that are not yet supported by the WebGPU execution provider, e.g., InstanceNormalization, Slice, Reshape.

fs-eire commented May 16, 2023

The operator coverage is a problem, but that should not cause the model loading failure. After debugging the issue, I found that the problem is in the transpose optimizer.

 C:\a\_work\1\s\onnxruntime\core\optimizer\transpose_optimizer\optimizer_api_impl.cc:280 virtual std::vector<uint8_t> onnxruntime::ApiTensor::Data() const [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : 
Yi @ ort.webgpu.min.js:6

Need to dig deeper into the source code. I am debugging it.

@fs-eire fs-eire changed the title [Web] WebGPU backend cannot load some models that Python runtime can load. [Web] WebGPU backend fails to load some model due to exception during initialization inside transpose optimizer May 16, 2023
fs-eire added a commit that referenced this issue May 19, 2023
### Description
Because of #15618, the default allocator changed to the device allocator, which will be GPU instead of CPU. The transpose optimizer expects to read data from initializers, so a CPU allocator is required there.

This change fixes the transpose optimizer on the GPU EP.

Fixes the issue referred to in #15869, #15796.
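
To make the failure mode concrete, here is a hypothetical JavaScript sketch (names are illustrative only, not ONNX Runtime internals): raw tensor bytes can only be read out of host memory, so an optimizer that inspects initializer data must request a CPU allocator rather than the session's default device allocator.

// Hypothetical sketch; not real ONNX Runtime APIs.
// Raw tensor bytes are only readable from host (CPU) memory.
function readTensorBytes(allocatorDevice, bytes) {
    if (allocatorDevice !== 'cpu') {
        // Corresponds to the INVALID_ARGUMENT error in the log above.
        throw new Error('INVALID_ARGUMENT: tensor data must reside in CPU memory');
    }
    return bytes;
}

const data = new Uint8Array([1, 2, 3, 4]);
// Before the fix: the default allocator is now the device (GPU) allocator.
// readTensorBytes('gpu', data);  // throws during session initialization
// After the fix: the optimizer explicitly requests a CPU allocator.
readTensorBytes('cpu', data);     // OK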

fs-eire commented May 22, 2023

@gegogi This issue should have been fixed by the PR mentioned above. Please help validate that it works. Thanks!

gegogi commented May 24, 2023

Could you publish the latest nightly npm build? I tried to build onnxruntime myself but couldn't figure out the compilation errors related to a protobuf version mismatch. It seems the project includes protobuf as a submodule but is trying to include headers from the system directory, which have different signatures.

fs-eire commented May 25, 2023

Please try [email protected]

gabrielgrant commented May 24, 2024

This appears to be fixed for me when running this example: https://gist.github.com/gabrielgrant/cb3e072dec5a416b4fc24f18ae902fb7

...but despite using ort.webgpu.min.js and only specifying executionProviders: ['webgpu'], it still demands that ort.env.wasm.wasmPaths be set, so it's not entirely clear to me whether it's actually using the WebGPU backend instead of WASM. (Is the WASM bundle just needed as a fallback for kernels not yet implemented in WebGPU?)

@gegogi are you able to confirm this is fixed? (this should be in a release now)

@fs-eire:

  1. Can you confirm that the gist example I've put together tests the issue correctly?
  2. Are you confident enough that fix transpose optimizer on GPU EP #15988 fixes this to close this issue?

fs-eire commented May 24, 2024

ONNX Runtime Web depends on C++ code for session, graph, and model execution, which is compiled into WebAssembly. In short, ONNX Runtime Web always needs to load WebAssembly, no matter whether you use the webgpu or wasm (CPU) EP.

However, you don't always have to set ort.env.wasm.wasmPaths. If it is not set, the runtime will try to load the .wasm files from the "current folder" (relative to the URL of the JavaScript file that is currently running); the flag just offers a way to customize the path.
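
As an illustration, here is a minimal sketch of overriding that path (the URL below is a placeholder, not taken from this thread):

const ort = require('onnxruntime-web/webgpu');

// Point the runtime at the directory hosting the .wasm files; by default
// it looks next to the running script. Replace the URL with your own host.
ort.env.wasm.wasmPaths = 'https://example.com/onnxruntime-web/dist/';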

The issue related to "Transpose" is already fixed, so let me close this issue.

@fs-eire fs-eire closed this as completed May 24, 2024