
[WebNN EP] Decompose Concat with input number > 4 for CPU backend #18930

Merged · 1 commit · Dec 29, 2023

Conversation

@Honry (Contributor) commented Dec 26, 2023:

The WebNN XNNPack backend only supports Concat with at most 4 inputs, so this change decomposes a Concat with more than 4 inputs into multiple WebNN concat ops.

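The decomposition described above can be sketched as follows. This is an illustrative Python sketch, not the actual C++ EP code: the first 4 inputs are concatenated, then the running result is folded together with up to 3 more inputs per step until all inputs are consumed, so every emitted concat takes at most 4 operands.

```python
MAX_INPUTS = 4  # XNNPack's per-concat input limit

def decompose_concat(inputs, concat):
    """Decompose a concat over `inputs` into a chain of concats that each
    take at most MAX_INPUTS operands. `concat` is the backend op."""
    if len(inputs) <= MAX_INPUTS:
        return concat(inputs)
    # Concat the first 4 inputs, then repeatedly fold the running result
    # together with up to 3 more inputs until all are consumed.
    result = concat(inputs[:MAX_INPUTS])
    i = MAX_INPUTS
    while i < len(inputs):
        group = [result] + inputs[i:i + MAX_INPUTS - 1]
        result = concat(group)
        i += MAX_INPUTS - 1
    return result
```

Note that this linear chaining re-copies the growing intermediate result on every step, which is exactly the complexity concern raised in the review below the merge.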
@Honry (Contributor, Author) commented Dec 26, 2023:

@fdwr, @guschmue, PTAL, thanks!

@guschmue guschmue added the ep:WebNN WebNN execution provider label Dec 28, 2023
@guschmue (Contributor) commented:
/azp run ONNX Runtime Web CI Pipeline


Azure Pipelines successfully started running 1 pipeline(s).

@guschmue (Contributor) commented:

/azp run Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,Windows ARM64 QNN CI Pipeline,Windows CPU CI Pipeline

@guschmue (Contributor) commented:

/azp run Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,onnxruntime-binary-size-checks-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,Windows x64 QNN CI Pipeline


Azure Pipelines successfully started running 9 pipeline(s).


Azure Pipelines successfully started running 7 pipeline(s).

@guschmue (Contributor) commented:

/azp run ONNX Runtime Web CI Pipeline


Azure Pipelines successfully started running 1 pipeline(s).

@guschmue guschmue merged commit 96d1f32 into microsoft:main Dec 29, 2023
62 of 70 checks passed
@fdwr (Contributor) left a comment:

This is temporary, right? I'm surprised that XNNPack doesn't have a higher limit, like 16/256/.../65536. This approach reminds me of growing a std::vector with linear reallocation: because all the existing elements are copied again on each step, a push_back in a loop actually yields higher-than-linear time complexity (which is why most implementations use a 1.5x or 2x growth factor to avoid this). So models that have 128 concatenated inputs will experience n^2 time o_o.

We definitely don't expect WebNN callers to duplicate this code when calling CPU, and so either XNNPack should handle > 4 inputs directly, or the Chromium WebNN interface should do it (because anything ORT layer can handle, surely the WebNN front-end can directly handle).
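For illustration only (this alternative is not part of the PR): a balanced, tree-style reduction avoids the quadratic copying described above. Each round concatenates groups of at most 4, so every element is copied only O(log n) times, giving O(n log n) total work instead of O(n²) for the linear chain.

```python
MAX_INPUTS = 4  # XNNPack's per-concat input limit

def tree_concat(inputs, concat):
    """Concat `inputs` via rounds of group-wise reduction, each concat
    taking at most MAX_INPUTS operands (balanced-tree alternative to
    linear chaining)."""
    level = list(inputs)
    while len(level) > MAX_INPUTS:
        # One reduction round: fold consecutive groups of up to 4 tensors.
        level = [concat(level[j:j + MAX_INPUTS])
                 for j in range(0, len(level), MAX_INPUTS)]
    return concat(level) if len(level) > 1 else level[0]
```

With 128 inputs this needs 4 rounds (128 → 32 → 8 → 2 → 1), and no intermediate tensor participates in more than a handful of concats.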

@Honry (Contributor, Author) commented Jan 17, 2024:

cc/ @huningxin, hope you could address @fdwr's comment.

@huningxin commented:

@Honry , feel free to open a Chromium issue for WebNN XNNPACK backend. We'll seek feedback from XNNPACK developers and Chromium developers to decide where to implement this feature. Thanks!

@Honry (Contributor, Author) commented Jan 17, 2024:

> @Honry , feel free to open a Chromium issue for WebNN XNNPACK backend. We'll seek feedback from XNNPACK developers and Chromium developers to decide where to implement this feature. Thanks!

Sure. Will do that.

@Honry (Contributor, Author) commented Jan 17, 2024:

Issue created at https://bugs.chromium.org/p/chromium/issues/detail?id=1519119.
