
Enable QNN HTP support for Node #20576

Merged 19 commits into microsoft:main on May 9, 2024

Conversation

@joncamp (Contributor) commented May 6, 2024

Description

Add support for using Onnx Runtime's QNN execution provider with Node.

Motivation and Context

Onnx Runtime supports the QNN HTP, but does not expose it to Node.js. This adds baseline support for using the Onnx Runtime with Node.

Note that this does not update the officially distributed node packages. It simply patches `onnxruntime.dll` so that 'qnn' can be used as an execution provider.

Testing was done using the existing onnxruntime-node package. The `onnxruntime.dll` and `onnxruntime_binding.node` in `node_modules\onnxruntime-node\bin\napi-v3\win32\arm64` were swapped for the newly built versions, then the various QNN DLLs and .so files were placed next to `onnxruntime.dll`. Testing was performed on a variety of models and applications, but the easiest test is to modify the [node quickstart example](https://github.com/microsoft/onnxruntime-inference-examples/tree/main/js/quick-start_onnxruntime-node).
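As a minimal sketch of what the modified quickstart test looks like: the PR lets the string 'qnn' be passed in `executionProviders` when creating an `InferenceSession` with onnxruntime-node. The fallback ordering below is an assumption for illustration, not something the PR mandates.

```typescript
// Hedged sketch: with the patched binaries in place, 'qnn' can be requested
// as an execution provider. Listing 'cpu' after it is an assumed fallback for
// environments where QNN cannot be used.
function buildSessionOptions(useQnn: boolean): { executionProviders: string[] } {
  const executionProviders = useQnn ? ['qnn', 'cpu'] : ['cpu'];
  return { executionProviders };
}

// Usage (requires the swapped-in binaries described above):
//   const ort = require('onnxruntime-node');
//   const session = await ort.InferenceSession.create('model.onnx', buildSessionOptions(true));
```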

@joncamp (Contributor, Author) commented May 6, 2024 via email

@jywu-msft requested a review from fs-eire May 6, 2024 16:35
@jywu-msft (Member)

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline

@jywu-msft (Member)

/azp run Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Android CI Pipeline


Azure Pipelines successfully started running 10 pipeline(s).


Azure Pipelines successfully started running 9 pipeline(s).

@fs-eire (Contributor) commented May 6, 2024

I have a few questions regarding this QNN HTP feature:

  • Does this feature only work (or is it only planned to be supported) on Windows/arm64? Will other OS/CPU architectures also need this feature?
  • Is the QNN EP statically linked (i.e. included in onnxruntime.dll, like DML) or dynamically linked (i.e. in onnxruntime_provider_xxx.dll, like CUDA)?
  • Is it compatible with CPU/DML if a build enables them all? Specifically:
    • If it is running on a non-Qualcomm device (not sure if this scenario exists; please let me know if I made a wrong assumption) or an environment where QNN is not supported, can it still load and run other EPs?
    • Does the binary work with other non-CPU EPs, like DML? (We know CUDA has an issue where it does not work with DML in one build.)
  • Do you want to include QNN HTP support in onnxruntime-node by default? If so:
    • Do we already have a build pipeline for release artifacts in the "Zip-*" pipeline?

@jywu-msft (Member)

> I have a few questions regarding this QNN HTP feature: […]

I can answer some of these questions.

  1. win/arm64 for now; maybe other platforms later, need to see what those are. (QNN itself runs on win/arm64, win/x64, linux/x64, and Android.)
  2. The QNN EP is statically linked into onnxruntime.dll (that wouldn't change anytime soon).
  3. QNN will not run on non-Qualcomm hardware.
  4. Compatible with CPU. DML it should be (but not sure how extensively it's been tested).
  5. I think this PR just enables building from source, but it would be nice to eventually support this in a more official manner (pipelines, default options, etc.).
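Given the answer that this PR targets win/arm64 only, a caller could gate the 'qnn' EP request on the current platform. A minimal sketch, assuming Node's `process.platform` / `process.arch` string values; the helper name is hypothetical:

```typescript
// Hedged sketch: QNN HTP support in this PR is win/arm64 only, so only
// request the 'qnn' EP on that platform combination.
function mayUseQnn(platform: string, arch: string): boolean {
  return platform === 'win32' && arch === 'arm64';
}

// Example: mayUseQnn(process.platform, process.arch)
```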

@mindest added the label ep:QNN (issues related to QNN execution provider) May 7, 2024
@hans00 (Contributor) commented May 7, 2024

>   • QNN will not run on non-Qualcomm hardware

QNN CPU might be OK, but in my testing, performance is worse than XNNPACK + CPU EP.

P.S. QNN CPU seems to be a wrapper of XNNPACK.

@jywu-msft (Member)

/azp run Linux OpenVINO CI Pipeline


Azure Pipelines successfully started running 1 pipeline(s).

@jywu-msft (Member)

@joncamp fyi, https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1371836&view=logs&j=90af55dc-07cc-5abc-02f4-1cf38a060872&t=7ee52c27-1516-5118-a868-3d2d34beb196

lib/wasm/session-options.ts:121:31 - error TS2339: Property 'preferredLayout' does not exist on type 'QnnExecutionProviderOption'.

121 if (qnnOptions?.preferredLayout) {
~~~~~~~~~~~~~~~

Found 1 error in lib/wasm/session-options.ts:121
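The TS2339 error above says `QnnExecutionProviderOption` has no `preferredLayout` field. A hedged sketch of one possible shape of a fix (the actual change made in the PR may differ): declare the field as optional so the existing guard type-checks.

```typescript
// Assumed interface shape; the real onnxruntime-web definition may differ.
interface QnnExecutionProviderOption {
  readonly name: 'qnn';
  preferredLayout?: 'NCHW' | 'NHWC'; // adding this optional field resolves TS2339
}

// Mirrors the guard from lib/wasm/session-options.ts:121.
function resolveLayout(qnnOptions?: QnnExecutionProviderOption): string {
  if (qnnOptions?.preferredLayout) {
    return qnnOptions.preferredLayout;
  }
  return 'NCHW'; // assumed default when no layout is specified
}
```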

@jywu-msft (Member)

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline


Azure Pipelines successfully started running 10 pipeline(s).

@jywu-msft (Member)

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline


Azure Pipelines successfully started running 10 pipeline(s).

@jywu-msft (Member)

/azp run Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Android CI Pipeline, Linux OpenVINO CI Pipeline


Azure Pipelines successfully started running 10 pipeline(s).

@jywu-msft merged commit 768c793 into microsoft:main May 9, 2024
78 checks passed
poweiw pushed a commit to poweiw/onnxruntime that referenced this pull request Jun 25, 2024
6 participants