forked from microsoft/onnxruntime
Pull main branch #3
Merged
Conversation
Build iOS training package in packaging pipeline. Refactor iOS packaging pipeline to build different package variants in parallel.
### Description Comment out the ORT-Nightly feed in NuGet.config to see whether that satisfies the Secure Supply Chain Analysis CI step. Add info to the readme on manually adding the feed and using it.
### Description onnxruntime-Win2022-GPU-dml-A10 is using VS2022. ### Motivation and Context 1. Upgrade from VS2019 to VS2022 to fix the prefast issue.
### Description Fix a wrong URL in the documentation, as mentioned in issue #16678. ### Motivation and Context Better documentation.
### Description Support the Pad op for the WebNN EP. It aims to support three modes (constant, reflect, and edge). For now, only constant can be tested with Chrome Canary. ### Motivation and Context Support more models, like SD1.5-VAE-encode.
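The three modes map directly onto ONNX `Pad` node configurations. As a hedged illustration (not the EP's code; the graph and names below are made up), here is what a constant-mode `Pad` model looks like when built with the `onnx` Python helpers; since opset 11, `pads` and the optional constant value are inputs rather than attributes:

```python
# Illustrative sketch of a constant-mode Pad model (assumed names/shapes).
import onnx
from onnx import TensorProto, helper

# pads layout for a rank-2 input: [dim0_begin, dim1_begin, dim0_end, dim1_end]
pads = helper.make_tensor("pads", TensorProto.INT64, [4], [0, 1, 0, 1])
value = helper.make_tensor("value", TensorProto.FLOAT, [], [0.0])

node = helper.make_node(
    "Pad",
    inputs=["x", "pads", "value"],
    outputs=["y"],
    mode="constant",  # the EP also aims to map "reflect" and "edge"
)

graph = helper.make_graph(
    [node],
    "pad_example",
    [helper.make_tensor_value_info("x", TensorProto.FLOAT, [2, 3])],
    [helper.make_tensor_value_info("y", TensorProto.FLOAT, [2, 5])],
    initializer=[pads, value],
)
onnx.checker.check_model(helper.make_model(graph))
```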
Otherwise, an unsupported version of gtest/gmock will be found at /opt/conda/include for ROCm builds. Though this issue was initially found for ROCm builds, it is generic: onnxruntime requires a specific version of googletest and should not rely on locating googletest using find_package. The ROCm error was:

```
In file included from /opt/conda/include/gmock/gmock-spec-builders.h:75,
                 from /opt/conda/include/gmock/gmock-generated-function-mockers.h:47,
                 from /opt/conda/include/gmock/gmock-function-mocker.h:39,
                 from /opt/conda/include/gmock/gmock.h:61,
                 from /stage/onnxruntime/onnxruntime/test/util/test_utils.cc:17:
/opt/conda/include/gmock/gmock-matchers.h: In instantiation of ‘bool testing::internal::PointwiseMatcher<TupleMatcher, RhsContainer>::Impl<LhsContainer>::MatchAndExplain(LhsContainer, testing::MatchResultListener*) const [with LhsContainer = const gsl::span<const float>&; TupleMatcher = testing::internal::FloatingEq2Matcher<float>; RhsContainer = gsl::span<const float>]’:
/opt/conda/include/gmock/gmock-matchers.h:2303:10:   required from here
/opt/conda/include/gmock/gmock-matchers.h:2312:48: error: no type named ‘const_iterator’ in ‘testing::internal::PointwiseMatcher<testing::internal::FloatingEq2Matcher<float>, gsl::span<const float> >::Impl<const gsl::span<const float>&>::LhsStlContainer’ {aka ‘class gsl::span<const float>’}
```
### Description - Updates the default QNN SDK to 2.12 for CI pipelines - Adds a disabled InstanceNormalization test for regression on QNN SDK 2.12 - Cleans up logs for unsupported ops. ### Motivation and Context Test with the latest QNN SDK.
This pull request contains a few changes: 1. Adds support for string OrtValues. 2. Fixes the training minimal build (broken by #16601) by putting custom op registration behind #ifdefs. 3. Fixes the iOS pod package generation (also broken by #16601) by explicitly providing the paths to be copied during pod creation.
- Add `#ifndef __APPLE__` to skip lines that cause an EXC_BAD_INSTRUCTION error. - Fix a floatToHalf/doubleToHalf conversion issue and add tests.
### Description A [previous PR](#16531) added a temporary directory to save the model optimizations after loading a model into an `InferenceSession`. Many models with an external data file, however, require that file to be in the same directory as the ONNX model file. Because the model was saved in a temporary directory while the data stayed in another directory, loading the model from the temporary directory raised a `FileNotFoundError`. This PR fixes the error by saving the external data file in the same directory as the optimized model, as sketched below. ### Motivation and Context This PR fixes a bug with using a temporary directory while running the optimizer on models that have an external data file.
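As a hedged sketch of the constraint involved (file names are made up; this is not the optimizer's code), ONNX resolves an external-data `location` relative to the model file, so the data must be written into the same temporary directory:

```python
# Minimal sketch: saving external data next to the model keeps the relative
# `location` reference valid when loading from the temporary directory.
import os
import tempfile

import onnx

model = onnx.load("model.onnx")  # hypothetical source model

with tempfile.TemporaryDirectory() as tmp_dir:
    opt_path = os.path.join(tmp_dir, "optimized.onnx")
    onnx.save_model(
        model,
        opt_path,
        save_as_external_data=True,
        all_tensors_to_one_file=True,
        location="optimized.onnx.data",  # resolved relative to opt_path
    )
    reloaded = onnx.load(opt_path)  # works: the data file sits beside the model
```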
### Description torch.norm is deprecated, as mentioned in issue #16751. This PR replaces the calls to torch.norm with the alternatives suggested by the torch documentation (examples below).
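For reference, a hedged sketch of the kind of substitution the torch documentation recommends (the actual call sites are in the PR; these lines are illustrative):

```python
import torch

x = torch.randn(4, 5)

# Deprecated: torch.norm(x) computes the 2-norm over all elements.
flat_norm = torch.linalg.vector_norm(x)
# Deprecated: torch.norm(x, p=2, dim=1) computes per-row 2-norms.
row_norms = torch.linalg.vector_norm(x, ord=2, dim=1)
# Deprecated: torch.norm(x, p="fro") computes the matrix Frobenius norm.
fro_norm = torch.linalg.matrix_norm(x, ord="fro")
```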
### Description Add op support for LayerNorm, Asin, and Sign. Enable QDQ node unit support for the Sin op. --------- Co-authored-by: Adrian Lizarraga <[email protected]>
### Description 1) Added Sequence and Map convenience APIs to create input sequences and maps and to visit the outputs. 2) Addressed an OrtValue design issue when values are created on top of managed memory and the OrtValues are used for sequence and map creation: we should retain the original managed instances that keep the memory pinned. We opt to keep track of those and dispose of them within the OrtValue instance that represents a Map or a Sequence. 3) Set `LangVersion` to default per [MS Versioning Docs.](https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/configure-language-version) ### Motivation and Context 1) When writing code examples, use of the Map and Sequence APIs proved cumbersome. 2) It is a bug that we should address, as the managed memory can be moved by the GC and lead to intermittent crashes. 3) Make use of the latest features of C#.
### Description The Java API currently only supports fp16 output tensors, which it automatically casts to floats on the way out. This PR adds support for creating fp16 and bf16 tensors (from `java.nio.Buffer` objects or as the output of models; creation from Java short arrays is not supported), along with efficient methods for casting `FloatBuffer` into `ShortBuffer` filled with fp16 or bf16 values and vice versa. The fp16 conversions use a trick to pull in the efficient conversion methods added to Java 20, falling back to ports of the MLAS methods otherwise. The Java 20 methods can be special-cased by the C2 JIT compiler to emit the single instruction on x86 and ARM that converts fp32<->fp16, or the vectorized versions thereof, so they should be quite a bit faster than the ported MLAS ones. ### Motivation and Context fp16 and bf16 are increasingly popular formats and we've had several requests for this functionality. Fixes #7003. cc @yuslepukhin @cassiebreviu --------- Co-authored-by: Scott McKay <[email protected]>
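For intuition, here is a hedged NumPy sketch (illustrative only, not the Java code; the Java 20 methods referenced are presumably `Float.floatToFloat16`/`Float.float16ToFloat`) of the simpler bf16 case of such bit-level conversions: bf16 keeps the top 16 bits of an fp32 value, with round-to-nearest-even on the dropped bits.

```python
import numpy as np

def float_to_bf16_bits(x: np.ndarray) -> np.ndarray:
    """Truncate fp32 to bf16 bit patterns with round-to-nearest-even.

    NaN/inf handling is omitted for brevity.
    """
    bits = x.astype(np.float32).view(np.uint32)
    rounded = bits + np.uint32(0x7FFF) + ((bits >> 16) & 1)
    return (rounded >> 16).astype(np.uint16)

def bf16_bits_to_float(b: np.ndarray) -> np.ndarray:
    # Widening is exact: just restore the low 16 bits as zeros.
    return (b.astype(np.uint32) << 16).view(np.float32)

x = np.array([1.0, 3.14159, -0.5], dtype=np.float32)
print(bf16_bits_to_float(float_to_bf16_bits(x)))  # bf16-rounded values
```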
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #16789 * __->__ #16788 This change fixes the N802 lint errors by renaming the test cases to use snake case.
…6789) Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #16789 Bump ruff to 0.0.278 and fix new lint errors. I added noqa to all existing RUF012 errors, which require mutable class variables to be annotated with `ClassVar` (see the example below), as well as to all PERF issues. Signed-off-by: Justin Chu <[email protected]>
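An illustrative example of what RUF012 asks for (hypothetical class; the PR suppressed existing occurrences with noqa rather than annotating them):

```python
from typing import ClassVar

class Registry:
    # Without the annotation, ruff's RUF012 flags this mutable class attribute.
    entries: ClassVar[list[str]] = []  # intentionally shared across instances
```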
### Description Allocating a new GPUBuffer in every session.run is not efficient. We should allocate only in the first run and reuse those buffers in subsequent runs (a conceptual sketch follows). ### Motivation and Context This PR is for performance: mobilenetv2 improves from 12.9 ms to 9.58 ms.
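A conceptual sketch of the reuse strategy (plain Python, illustrative only; the real implementation lives in the WebGPU backend): keep released buffers in a free list keyed by size, and only allocate when no buffer of that size is available.

```python
from collections import defaultdict

class BufferPool:
    """Size-bucketed pool: allocate on first use, reuse afterwards."""

    def __init__(self, allocate):
        self._allocate = allocate          # stand-in for GPU buffer creation
        self._free = defaultdict(list)     # size -> released buffers

    def acquire(self, size: int):
        free_list = self._free[size]
        return free_list.pop() if free_list else self._allocate(size)

    def release(self, size: int, buffer) -> None:
        self._free[size].append(buffer)    # keep it for the next run

pool = BufferPool(allocate=bytearray)
buf = pool.acquire(1024)          # first run: allocates
pool.release(1024, buf)
assert pool.acquire(1024) is buf  # later runs: reuses the same buffer
```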
### Description Changes allow downloading a prebuilt protoc compiler when building the WebAssembly version on macOS. Otherwise the build tries to build a js/wasm version of protoc and throws an error while executing it: "protoc.js permission denied". ### Motivation and Context I need to switch between my main working computer and a PC to make changes to the WebAssembly build. I would like not to do that anymore.
…Tensors (#16787) Add compile guards to gate functionality based on MIGRAPHX_STREAM_SYNC, adding the following: - Remove excess hipStreamSynchronize on the null stream for CopyTensor calls. - Add a proper stream-synchronized CopyTensorAsync call for the DeviceToHost case. Without this change, subsequent CopyTensorAsync() calls fail for cards that don't use pinned memory, causing hipMemcpy() calls to occur before certain kernel operations. ![image](https://github.com/microsoft/onnxruntime/assets/107195283/4915c18a-fb2d-40c9-a50e-a7c6613c324b) becomes ![image](https://github.com/microsoft/onnxruntime/assets/107195283/f661acf4-e2af-4c9a-b26a-30fca339cf1d) --------- Co-authored-by: Ted Themistokleous <[email protected]>
Bumps [word-wrap](https://github.com/jonschlinkert/word-wrap) from 1.2.3 to 1.2.4. Release highlights for 1.2.4: remove default indent (jonschlinkert/word-wrap#24), fixes for CVE-2023-26115 (jonschlinkert/word-wrap#33, jonschlinkert/word-wrap#41), and a publish workflow (jonschlinkert/word-wrap#42). Full changelog: https://github.com/jonschlinkert/word-wrap/compare/1.2.3...1.2.4 Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself, and will merge it once it's up-to-date and CI passes, as requested by @fs-eire. Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [word-wrap](https://github.com/jonschlinkert/word-wrap) from 1.2.3 to 1.2.4 (release notes identical to the previous entry). Dependabot will merge this PR once CI passes, as requested by @fs-eire. Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…16784) ### Description 1. Use the pool with VS2022. 2. Upgrade System.Memory to 4.5.5. ### Motivation and Context Solves the build error when using VS2022: `[Failure] Msbuild failed when processing the file 'D:\a\_work\1\s\csharp\src\Microsoft.ML.OnnxRuntime\Microsoft.ML.OnnxRuntime.csproj' with message: Method not found: 'System.ReadOnlySpan`1<Char> Microsoft.IO.Path.GetFileName(System.ReadOnlySpan`1<Char>)'` Ref: https://stackoverflow.com/questions/73399777/azure-build-failing-due-to-method-not-found-system-readonlyspan1char-micros
…sts (#16820) ### Description Disable two large-index tests due to limited GPU memory. The following two tests recently started failing because there is not enough GPU memory; it is unclear what other programs are also using the GPU. Disable them for now to unblock the required CI.

```
1: [  FAILED  ] 2 tests, listed below:
1: [  FAILED  ] CrossEntropyTest.SoftmaxCrossEntropyLossInternal_LargeSizeTensorUInt64Index
1: [  FAILED  ] CrossEntropyTest.SoftmaxCrossEntropyLossInternalGrad_LargeSizeTensorUInt64Index

2023-07-23T02:15:39.7559251Z 1: [ RUN      ] CrossEntropyTest.SoftmaxCrossEntropyLossInternal_LargeSizeTensorUInt64Index
2023-07-23T02:16:53.0904576Z 1: 2023-07-23 02:16:53.089586592 [E:onnxruntime:SoftmaxCrossEntropyLossInternal, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running SoftmaxCrossEntropyLossInternal node. Name:'node1' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 4294973440
2023-07-23T02:16:53.0906087Z 1: /onnxruntime_src/onnxruntime/test/providers/base_tester.cc:323: Failure
2023-07-23T02:16:53.0906698Z 1: Expected equality of these values:
2023-07-23T02:16:53.0907086Z 1:   expect_result
2023-07-23T02:16:53.0907564Z 1:     Which is: 4-byte object <00-00 00-00>
2023-07-23T02:16:53.0973055Z 1:   ExpectResult::kExpectFailure
2023-07-23T02:16:53.0973984Z 1:     Which is: 4-byte object <01-00 00-00>
2023-07-23T02:16:53.0975375Z 1: Run failed but expected success: Non-zero status code returned while running SoftmaxCrossEntropyLossInternal node. Name:'node1' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 4294973440
2023-07-23T02:16:53.0976483Z 1: Google Test trace:
2023-07-23T02:16:53.0976818Z 1: /onnxruntime_src/onnxruntime/test/common/random_generator.h:49: ORT test random seed: 8910
2023-07-23T02:16:53.0977229Z 1: /onnxruntime_src/onnxruntime/test/common/random_generator.h:49: ORT test random seed: 8910
2023-07-23T02:16:53.0977639Z 1: /onnxruntime_src/onnxruntime/test/common/random_generator.h:49: ORT test random seed: 2345
2023-07-23T02:16:53.0978035Z 1: /onnxruntime_src/onnxruntime/test/common/random_generator.h:49: ORT test random seed: 5678
2023-07-23T02:16:53.0978441Z 1: /onnxruntime_src/onnxruntime/test/common/random_generator.h:49: ORT test random seed: 1234
2023-07-23T02:16:53.1303810Z 1: /onnxruntime_src/orttraining/orttraining/test/training_ops/cuda/cross_entropy_test.cc:443: Failure
2023-07-23T02:16:53.1304644Z 1: Expected equality of these values:
2023-07-23T02:16:53.1304974Z 1:   ret.first
2023-07-23T02:16:53.1305685Z 1:     Which is: 4-byte object <04-00 00-00>
2023-07-23T02:16:53.1306030Z 1:   COMPARE_RESULT::SUCCESS
2023-07-23T02:16:53.1306414Z 1:     Which is: 4-byte object <00-00 00-00>
2023-07-23T02:16:53.1306754Z 1: Unsupported compare with CompareOrtValueNumerals.
(the same comparison failure and Google Test trace repeat once more)
2023-07-23T02:16:53.4476437Z 1: [  FAILED  ] CrossEntropyTest.SoftmaxCrossEntropyLossInternal_LargeSizeTensorUInt64Index (73692 ms)
```
The current TRT EP can support models that have nested control-flow ops (multiple levels of subgraphs). However, it fails in a case where a subgraph uses an outer-scope value defined several levels up in the top-level graph; in this case, the outer-scope value is an input of the top-level graph. The outer-scope values are not properly handled during the TRT EP's subgraph reconstruction stage, and `graph.resolve()` fails. The way ORT gets capability from EPs is bottom-up, meaning the innermost subgraph gets handled first. The TRT EP reconstructs each subgraph level by level, and the following modifications fix the outer-scope-values issue: - `SetGraphOuterScopeValuesAndInputs()` and `SetAllGraphInputs()` are added to handle outer-scope values and add them as graph inputs where needed, to make `graph.resolve()` happy. - Use `GetNodeArgIncludingParentGraphs` so that when creating the fused TRT node for some subgraphs in `Graph::CreateFusedSubGraphNode()`, it can get the NodeArgs for outer-scope values from the top-level graph. This PR fixes #16217.
- Set `KERNEL_EXPLORER_TEST_USE_CUPY=1` to replace numpy with cupy in the kernel explorer tests (see the sketch below). CPU utilization with `KERNEL_EXPLORER_TEST_USE_CUPY=0`: ![image](https://github.com/microsoft/onnxruntime/assets/94887879/91724b78-0b4e-4cbd-ad88-83cad9976472) CPU utilization with `KERNEL_EXPLORER_TEST_USE_CUPY=1`: ![image](https://github.com/microsoft/onnxruntime/assets/94887879/58239911-667c-4d5f-bb78-deca60d0266f) - Use `Bash@3`. - Update the shell script.
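A hedged sketch of the array-module switch the flag implies (illustrative; the actual wiring is in the kernel explorer test code). CuPy mirrors the NumPy API, so the same test code can generate data on the GPU and reduce CPU load:

```python
import os

if os.environ.get("KERNEL_EXPLORER_TEST_USE_CUPY") == "1":
    import cupy as xp   # arrays live on the GPU
else:
    import numpy as xp  # default CPU path

a = xp.random.rand(1024, 1024).astype(xp.float32)
b = xp.random.rand(1024, 1024).astype(xp.float32)
c = a @ b               # runs on the GPU when xp is cupy
```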
### Description Fix the following missing-field-initializers build error:

```
/builds/devtechproviz/dl/ort-builder/onnxruntime/onnxruntime/python/onnxruntime_pybind_state.cc:388:14: error: missing initializer for member 'OrtTensorRTProviderOptionsV2::trt_cuda_graph_enable' [-Werror=missing-field-initializers]
  388 |     0};
      |
```
…n's when Inputs Are Not Compatible (#16753) Sometimes the ONNX exporter generates rank- or shape-dependent sub-graphs, so errors can occur when running the ONNX model with different inputs. This PR ([78e736d](78e736d)) addresses this problem by - exporting, if needed, multiple ONNX models with different inputs for the same GraphModule; - implementing a naive mechanism to determine whether an existing ONNX model (and the associated InferenceSession) can be reused. In the second commit [b5a9b5f](b5a9b5f), this PR also enables dynamic shapes in DORT by - passing dynamic_shapes=True to the exporter (see how DEFAULT_DYNAMIC_BACKEND is created); - calling torch._dynamo.optimize(dynamic_ort_aot, dynamic=True) (see how dynamic_ort_aot is created, and the usage sketch below).
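A usage sketch of the dynamic-shape entry point mentioned above (`dynamic_ort_aot` is the backend this PR builds; the stand-in below only shows the calling convention):

```python
import torch

def fn(x):
    return torch.nn.functional.relu(x) + 1.0

backend = "eager"  # stand-in for the PR's `dynamic_ort_aot` backend

# dynamic=True asks Dynamo not to specialize on input shapes, so one
# compiled artifact can serve inputs of different sizes.
compiled = torch._dynamo.optimize(backend, dynamic=True)(fn)

out_a = compiled(torch.randn(2, 8))
out_b = compiled(torch.randn(5, 3))  # reuses the shape-generic graph
```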
### Fix slice upstream - (MatMul) [ShapeInferenceError] Incompatible dimensions

```
2023-07-22 14:58:16.918478478 [I:onnxruntime:Default, constant_sharing.cc:256 ApplyImpl] Total shared scalar initializer count: 10
2023-07-22 14:58:16.919494252 [W:onnxruntime:Default, graph.cc:108 MergeShapeInfo] Error merging shape info for output. 'onnx::Cast_424' source:{-1,31,-1,-1} target:{-1,32,-1,-1}. Falling back to lenient merge.
2023-07-22 14:58:16.921014114 [W:onnxruntime:Default, graph.cc:108 MergeShapeInfo] Error merging shape info for output. 'onnx::MatMul_425' source:{-1,31,-1,-1} target:{-1,32,-1,-1}. Falling back to lenient merge.
Traceback (most recent call last):
  File "examples/onnxruntime/training/language-modeling/run_clm.py", line 594, in <module>
    main()
  File "examples/onnxruntime/training/language-modeling/run_clm.py", line 542, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/bert_ort/pengwa/optimum/optimum/onnxruntime/trainer.py", line 454, in train
    return inner_training_loop(
  File "/bert_ort/pengwa/optimum/optimum/onnxruntime/trainer.py", line 755, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/transformers/trainer.py", line 2735, in training_step
    loss = self.compute_loss(model, inputs)
  File "/bert_ort/pengwa/optimum/optimum/onnxruntime/trainer.py", line 363, in compute_loss
    return model_with_loss(dict_inputs, return_outputs)
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
    ret_val = func(*args, **kwargs)
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 1724, in forward
    loss = self.module(*inputs, **kwargs)
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_utils.py", line 384, in _forward
    return ortmodule._torch_module.forward(*inputs, **kwargs)
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_utils.py", line 364, in _forward
    return torch_module_ort._execution_manager(torch_module_ort.is_training()).forward(*inputs, **kwargs)
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_training_manager.py", line 345, in forward
    self._fallback_manager.handle_exception(
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_fallback.py", line 157, in handle_exception
    raise exception
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_training_manager.py", line 280, in forward
    self._build_graph(graph_transformer_config)
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_logger.py", line 218, in wrapper
    result = func(graph_execution_manager, *args, **kwargs)
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_training_manager.py", line 360, in _build_graph
    super()._build_graph(graph_transformer_config)
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_graph_execution_manager.py", line 186, in _build_graph
    self._graph_builder.build(config)
RuntimeError: /bert_ort/pengwa/onnxruntime/orttraining/orttraining/python/orttraining_pybind_state.cc:823 onnxruntime::python::addObjectMethodsForTraining(pybind11::module&, onnxruntime::python::ExecutionProviderRegistrationFn)::<lambda(onnxruntime::training::OrtModuleGraphBuilder*, const onnxruntime::training::TrainingGraphTransformerConfiguration&)> [ONNXRuntimeError] : 1 : FAIL : Node (MatMul_403) Op (MatMul) [ShapeInferenceError] Incompatible dimensions
```

The code mistakenly used an `axis` attribute for the `Slice` op; change it to use the `axes` inputs instead (see the sketch below).
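An illustrative sketch of the fix (assumed tensor names; since opset 10, `Slice` takes `starts`/`ends`/`axes`/`steps` as inputs, and an `axes` attribute is invalid):

```python
from onnx import helper

# Wrong for opset >= 10: there is no `axes` attribute on Slice.
# helper.make_node("Slice", ["data", "starts", "ends"], ["out"], axes=[1])

# Correct: supply axes as the fourth input.
node = helper.make_node(
    "Slice",
    inputs=["data", "starts", "ends", "axes"],
    outputs=["out"],
)
```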
### Description ### Motivation and Context It should be reverted when VS2022 is upgraded to 17.7 or above. ### Verification https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=331401&view=logs&j=7517abfd-115a-5c61-78a0-7ba3c9e3a88d
### Description Update run_CIs_for_external_pr.py to skip checks that already passed (sketched below).
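A hedged sketch of the skip logic (field names are assumptions, not the script's actual schema): only re-trigger pipelines whose latest run did not already succeed.

```python
checks = [
    {"name": "Linux CPU CI", "conclusion": "success"},
    {"name": "Windows GPU CI", "conclusion": "failure"},
    {"name": "MacOS CI", "conclusion": None},  # never ran
]

to_rerun = [c["name"] for c in checks if c["conclusion"] != "success"]
print(to_rerun)  # ['Windows GPU CI', 'MacOS CI']
```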
### Description ### Motivation and Context Continue upgrading to VS2022 ### Verification https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=331377&view=results N.B. In practice, SDLNativeRules@3 doesn't support VS2019.