[Web] WebGPU supported operator tracking #15952

Closed
fs-eire opened this issue May 15, 2023 · 4 comments
Labels
ep:WebGPU ort-web webgpu provider model:transformer issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc. platform:web issues related to ONNX Runtime web; typically submitted using template

Comments

@fs-eire
Contributor

fs-eire commented May 15, 2023

This issue is for tracking WebGPU operators. It includes the following info:

  • a list of supported operators
  • a list of WIP operator implementations
  • info about problems, correctness, or performance specific to a particular operator.

Supported operators

https://github.com/microsoft/onnxruntime/blob/main/js/web/docs/webgpu-operators.md

Operators currently in progress

ops needed for segment anything:

| OpType | Assigned To | Comments | PR |
| --- | --- | --- | --- |
| ArgMax.float | @guschmue | | #16882 |
| Cast.bool | @fs-eire | | |
| Equal.float | @fs-eire | | |
| Einsum.float | @sajandhy | | |
| Gather.float | @dakenf | | #16855 |
| LayerNormalization.float | @dakenf | | #16830 |
| Not.bool | @jchen351 | | #16891 |
| Softmax.float | @guschmue | | #16882 |

assuming above ops are implemented, ops missing for segment anything encoder:
(offline script would replace int64 that is not supported in webgpu with int32)

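| OpType | Assigned To | Comments | PR |
| --- | --- | --- | --- |
| Concat.int32 | | | |
| Einsum.float | | | |
| Gather.int32 | | | |
| Pad.float | | | |
| Slice.int32 | | | |
| Sub.int32 | | | |
| Transpose.int32 | | | |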
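
The "ops missing" lists in this issue follow one pattern: take the OpType.dtype pairs a model uses, apply the int64 to int32 substitution, and subtract what is already supported. The actual offline script is not shown in this issue, so the following is only a minimal sketch of that bookkeeping. The function names (`parse_key`, `missing_ops`) and the sample data are hypothetical.

```python
from typing import Iterable, List, Tuple

OpKey = Tuple[str, str]  # (op_type, dtype), e.g. ("Gather", "int32")

# Assumption: the offline script rewrites int64 tensors to int32,
# so an int64 requirement is treated as an int32 requirement.
DTYPE_REWRITES = {"int64": "int32"}


def parse_key(text: str) -> OpKey:
    """Split 'Gather.int64' into ('Gather', 'int32') after dtype rewriting."""
    op_type, _, dtype = text.partition(".")
    return op_type, DTYPE_REWRITES.get(dtype, dtype)


def missing_ops(model_ops: Iterable[str], supported: Iterable[str]) -> List[str]:
    """Return the sorted OpType.dtype entries a model needs that are unsupported."""
    supported_keys = {parse_key(s) for s in supported}
    needed_keys = {parse_key(m) for m in model_ops}
    return sorted(f"{op}.{dtype}" for op, dtype in needed_keys - supported_keys)


if __name__ == "__main__":
    # Toy data for illustration only, not taken from a real model.
    model_ops = ["Gather.int64", "Gather.float", "Concat.int64", "Softmax.float"]
    supported = ["Gather.float", "Softmax.float"]
    print(missing_ops(model_ops, supported))
```

Keying on the (op type, dtype) pair rather than the op type alone matters here, because an operator can be supported for float but still missing for int32.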

assuming above ops are implemented, ops missing for t5 encoder:
(offline script would replace int64 that is not supported in webgpu with int32)

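| OpType | Assigned To | Comments | PR |
| --- | --- | --- | --- |
| Abs.int32 | | | |
| Add.int32 | | | |
| ConstantOfShape.int32 | | | |
| Greater.int32 | | | |
| Less.int32 | | | |
| Log.float | | | |
| Min.int32 | | | |
| Mul.int32 | | | |
| Range.int32 | | | |
| Reshape.int32 | | | |
| Shape.int32 | | | |
| Sub.int32 | | | |
| Where.bool | | | |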

assuming above ops are implemented, ops missing for t5 decoder:
(offline script would replace int64 that is not supported in webgpu with int32)

| OpType | Assigned To | Comments | PR |
| --- | --- | --- | --- |
| If.bool | | | |
| LessOrEqual.int32 | | | |
| Tile.int32 | | | |

assuming above ops are implemented, ops missing for dolly-v2-3b:
(offline script would replace int64, which is not supported in webgpu, with int32; this assumes fp32 for now, which is not going to work)

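| OpType | Assigned To | Comments | PR |
| --- | --- | --- | --- |
| Div.int32 | | | |
| GatherElements.float | | | |
| Mul.int32 | | | |
| Neg.int32 | | | |
| Slice.bool | | | |
| Slice.int32 | | | |
| Tile.float | | | |
| Transpose.float | | | |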

Failing operators

Operators that need to be optimized

| OpType | Assigned To | Comments |
| --- | --- | --- |
| FusedConv | @guschmue | need to fuse conv and activation |
| Conv | TBD | optimize the one-time filter transpose at init |
| FusedMatmul | TBD | |
| FusedGemm | TBD | |
@fs-eire fs-eire added the platform:web issues related to ONNX Runtime web; typically submitted using template label May 15, 2023
@fs-eire fs-eire self-assigned this May 15, 2023
This was referenced May 15, 2023
@github-actions github-actions bot added the model:transformer issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc. label Jun 27, 2023
@guschmue guschmue added the ep:WebGPU ort-web webgpu provider label Jul 5, 2023
@guschmue
Contributor

we moved this tracking to a spreadsheet, closing this one.

@gabrielgrant

@guschmue is there a public link we can use to stay apprised of the status of these operations?

@fs-eire
Contributor Author

fs-eire commented Dec 6, 2023

> @guschmue is there a public link we can use to stay apprised of the status of these operations?

We have implemented most (if not all) of the operators required by a selected list of popular models. Some operators (control flow and int64 operators, especially the tensor-shape-related ones) will not be implemented for WebGPU because doing so would not help model performance. We may still add support for new operators when necessary, but for now we are shifting our main effort away from implementing new operators and toward improving the engine and per-kernel shaders, in order to improve ort-web performance on WebGPU.

The following link can be used to check the latest operator support:
https://github.com/microsoft/onnxruntime/blob/main/js/web/docs/webgpu-operators.md

@gabrielgrant

This is great news, thanks for the update @fs-eire

If others are curious about the implementation of these (as I was), they're here: https://github.com/microsoft/onnxruntime/tree/main/js/web/lib/wasm/jsep/webgpu/ops
