[js/web] JSEP Gather OP #16855
FYI: it passes all tests but does not work correctly with the StableDiffusion text encoder. I will update the PR once I fix it. |
Can you run To update the document? |
/azp run ONNX Runtime Web CI Pipeline |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run ONNX Runtime Web CI Pipeline |
No commit pushedDate could be found for PR 16855 in repo microsoft/onnxruntime |
the ci pipeline is nagging:
|
/azp run ONNX Runtime Web CI Pipeline |
No commit pushedDate could be found for PR 16855 in repo microsoft/onnxruntime |
/azp run ONNX Runtime Web CI Pipeline |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run ONNX Runtime Web CI Pipeline |
Azure Pipelines successfully started running 1 pipeline(s). |
sorry, lint still not happy. |
The first one did not help, but the last one seems to have fixed it. |
/azp run ONNX Runtime Web CI Pipeline |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,Windows ARM64 QNN CI Pipeline,Windows CPU CI Pipeline |
No commit pushedDate could be found for PR 16855 in repo microsoft/onnxruntime |
/azp run Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,onnxruntime-binary-size-checks-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed |
Azure Pipelines successfully started running 6 pipeline(s). |
/azp run Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,Windows ARM64 QNN CI Pipeline,Windows CPU CI Pipeline |
Azure Pipelines successfully started running 9 pipeline(s). |
/azp run ONNX Runtime Web CI Pipeline |
No commit pushedDate could be found for PR 16855 in repo microsoft/onnxruntime |
/azp run ONNX Runtime Web CI Pipeline |
No commit pushedDate could be found for PR 16855 in repo microsoft/onnxruntime |
/azp run ONNX Runtime Web CI Pipeline |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,Windows ARM64 QNN CI Pipeline,Windows CPU CI Pipeline |
Azure Pipelines successfully started running 9 pipeline(s). |
/azp run Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,onnxruntime-binary-size-checks-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed |
Azure Pipelines successfully started running 6 pipeline(s). |
You should know that there is some kind of issue with this op: when I run it with StableDiffusion, it throws the error "MatMul dimensions do not match". Before the recent pull from the main branch it was throwing a different error (with the Reshape op). With the 64-bit types commented out, as in
using AllSupportedSize =
    TypeList<
        float,
        // double,
        // int64_t,
        // uint64_t,
        int32_t,
        uint32_t>;
everything works fine. Also, the number of MemCopy (from and to host) ops roughly doubles when that code is not commented out, from ~1500 to ~3500. I guess that's because none of the binary and unary ops support int64.
Do you know an easy way to print every node's output to the browser console, so I can compare them and find the issue faster? |
Added a Gather op that works with both i32 and i64 indices, assuming that the index values fit in the i32 range. The assumption is safe because it's not possible to allocate a buffer larger than 2 GB for inputs. It treats all input tensor data as u32 words, copying one word per element for 32-bit types and two words per element for i64, u64, and double. Co-authored-by: Guenther Schmuelling
Description
Added a Gather op that works with both i32 and i64 indices, assuming that the index values fit in the i32 range. The assumption is safe because it's not possible to allocate a buffer larger than 2 GB for inputs.
It treats all input tensor data as u32 words, copying one word per element for 32-bit types and two words per element for i64, u64, and double.
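To make the approach in the description concrete, here is a minimal CPU-side sketch in TypeScript. It is not the WebGPU kernel from this PR: the function name `gatherWords` and all of its parameters are hypothetical, it only gathers along axis 0, and it assumes a little-endian host so that the low 32-bit word of an i64 index comes first.

```typescript
// Hypothetical sketch: Gather along axis 0 with the data viewed as u32 words.
function gatherWords(
  data: Uint32Array,        // input tensor bytes viewed as 32-bit words
  wordsPerElement: 1 | 2,   // 1 for 32-bit types, 2 for i64, u64 and double
  innerSize: number,        // number of elements in one slice along axis 0
  axisDim: number,          // size of the gathered axis
  indices: Int32Array,      // index buffer viewed as i32 words
  wordsPerIndex: 1 | 2,     // 1 for i32 indices, 2 for i64 indices
): Uint32Array {
  const count = indices.length / wordsPerIndex;
  const sliceWords = innerSize * wordsPerElement;
  const out = new Uint32Array(count * sliceWords);
  for (let i = 0; i < count; i++) {
    // For i64 indices only the low word is read (little-endian assumption);
    // this is valid only while the index value fits in the i32 range.
    let idx = indices[i * wordsPerIndex];
    if (idx < 0) idx += axisDim; // ONNX Gather accepts negative indices
    if (idx < 0 || idx >= axisDim) throw new RangeError(`index ${idx} out of range`);
    // Copy the whole slice word by word; the element type never matters.
    out.set(data.subarray(idx * sliceWords, (idx + 1) * sliceWords), i * sliceWords);
  }
  return out;
}

// Usage: a [3, 2] tensor of doubles gathered with i64 indices [2, 0].
const doubles = new Float64Array([1.5, 2.5, 3.5, 4.5, 5.5, 6.5]);
const i64Indices = new Int32Array([2, 0, 0, 0]); // two i64 values as (low, high) words
const gathered = gatherWords(new Uint32Array(doubles.buffer), 2, 2, 3, i64Indices, 2);
console.log(Array.from(new Float64Array(gathered.buffer))); // [5.5, 6.5, 1.5, 2.5]
```

Because the copy works on raw words, the same loop serves every supported element type; only `wordsPerElement` changes.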