forked from microsoft/onnxruntime
pull ORT main #2
Merged
- Upgrade from Python 3.6 to 3.8 in the packaging pipeline.
- Raise the `build.py` minimum required Python version.
#16506 causes almost every translation unit on Linux to complain:

```
[1175/1235] Building CXX object CMakeFiles/onnxruntime_test_all.dir/home/guangyunhan/onnxruntime/orttraining/orttraining/test/training_ops/cuda/softmax_test.cc.o
In file included from /home/guangyunhan/onnxruntime/include/onnxruntime/core/framework/float16.h:18,
                 from /home/guangyunhan/onnxruntime/include/onnxruntime/core/framework/data_types.h:17,
                 from /home/guangyunhan/onnxruntime/include/onnxruntime/core/framework/tensor.h:17,
                 from /home/guangyunhan/onnxruntime/onnxruntime/test/common/tensor_op_test_utils.h:16,
                 from /home/guangyunhan/onnxruntime/onnxruntime/test/providers/compare_provider_test_utils.h:7,
                 from /home/guangyunhan/onnxruntime/orttraining/orttraining/test/training_ops/cuda/softmax_test.cc:4:
/home/guangyunhan/onnxruntime/include/onnxruntime/core/session/onnxruntime_float16.h: In instantiation of ‘static constexpr uint16_t onnxruntime_float16::Float16Impl<Derived>::ToUint16Impl(float) [with Derived = onnxruntime::MLFloat16; uint16_t = short unsigned int]’:
/home/guangyunhan/onnxruntime/include/onnxruntime/core/framework/float16.h:42:66: required from here
/home/guangyunhan/onnxruntime/include/onnxruntime/core/session/onnxruntime_float16.h:241:7: note: ‘union onnxruntime_float16::detail::float32_bits’ has no user-provided default constructor
  241 | union float32_bits {
      |       ^~~~~~~~~~~~
/home/guangyunhan/onnxruntime/include/onnxruntime/core/session/onnxruntime_float16.h:242:16: note: and the implicitly-defined constructor does not initialize ‘unsigned int onnxruntime_float16::detail::float32_bits::u’
  242 |   unsigned int u;
      |                ^
```

This PR silences the compiler warning.
…nce (#16658)

### Description
MAUI test app with tooling to add a model and generated or provided input test data. The app loads the model and validates the output. It can also run a specified number of iterations to provide basic performance information.

### Motivation and Context
Primarily to make it easier to test an arbitrary model on iOS. A MAUI app allows testing on all platforms.

Co-authored-by: Edward Chen <[email protected]>
Allow the whole pipeline to be parameterized with a unary elementwise functor.
### Description
Replace the constructor `MLFloat16()` with the public member function `FromBits()` in `onnxruntime/core/providers/cann/cann_common.cc`.

### Motivation and Context
PR [#16506](#16506) made the public constructor `MLFloat16(uint16_t x)` private and added a public function `MLFloat16::FromBits(uint16_t x)` in `include/onnxruntime/core/framework/float16.h`, which broke the CANN CI. This PR aligns the CANN code with the modified `MLFloat16` class.
GemmSoftmaxGemmTunable occasionally breaks with large numerical error. The root cause is that CK's strided batched GEMM has larger error under a specific initialization distribution (`multinormal_distribution`). The Generic (Gemm1 + Softmax + Gemm2) implementation is one instance of GemmSoftmaxGemmTunable, and its Gemm1 and Gemm2 are TunableOps when tuning is enabled. In some cases GemmSoftmaxGemmTunable selects the Generic implementation while Gemm1 or Gemm2 selects the CK implementation, so the result of GemmSoftmaxGemmTunable is affected by CK.
- Loosen the tolerance.
- Add `GemmSoftmaxGemmPermuteGenericNestedTunable` to test the Generic implementation with tuning enabled.
…16720) There are several global configs used by DORT.

```py
DEFAULT_ONNX_EXPORTER_OPTIONS = torch.onnx._internal.exporter.ResolvedExportOptions(
    torch.onnx._internal.exporter.ExportOptions()
)

# TODO(wechi): This line must generate result identical to the call of
# _create_onnx_supports_op_overload_table(...) inside
# create_onnx_friendly_decomposition_table(...) in
# torch/onnx/_internal/fx/decomposition_table.py.
_SUPPORT_DICT = torch.onnx._internal.fx.decomposition_table._create_onnx_supports_op_overload_table(
    DEFAULT_ONNX_EXPORTER_OPTIONS.onnx_registry
)  # type: ignore

_EXTRA_SUPPORT_DICT: Dict[str, Any] = {
    "getattr": None,
    "_operator.getitem": None,
}

DORT_DECOMPOSITION_TABLE = DEFAULT_ONNX_EXPORTER_OPTIONS.decomposition_table
```

All of these except `_EXTRA_SUPPORT_DICT` are deduced from the ONNX exporter's options. As there are many ways to configure the exporter's options, we decided to move these variables into `OrtBackend`'s `__init__` so that constructing an `OrtBackend` becomes more flexible (especially for enabling dynamic shapes or not).
### Description
Replace the offending bitwise `operator |` with if() logic for ARM.
### Description
Fix some issues found in GPT-NeoX graph fusion:
(1) GPT-NeoX uses float16 weights. The step of running onnxruntime with opt_level==1 uses the CPU provider. Since most operators do not have fp16 implementations in the CPU EP, extra Cast nodes are added to upcast to fp32.
(2) When an Add is shared by two LayerNormalization children, SkipLayerNormalization fusion might produce an invalid graph.
(3) Reshape fusion might be missed, since some parts only check for initializers but not Constant nodes.

This PR adds a check for whether the model uses FP16, outputs a warning when use_gpu is not True, and uses the GPU provider for graph optimization when use_gpu=True.
### Description
- Fixes support for ArgMin/ArgMax on the QNN CPU and HTP backends.
- Adds Q/DQ node unit selection logic.
- Handles casting the int64 output to uint32 when necessary.
- Adds unit tests for ArgMax/ArgMin.

### Motivation and Context
QNN EP did not actually support ArgMin/ArgMax. Unit tests revealed that the existing translation was not sufficient to support these ops.
### Description
This change upgrades a number of dependencies. There are two motivations for this change:
- fix the security issue reported by dependabot (protobufjs Prototype Pollution vulnerability, GHSA-h755-8qp9-cq85)
- resolve the requirement of using ONNX IR_VERSION 9 (#16638)

This requires:
- upgrading protobufjs to v7.2.4
- upgrading the library 'onnx-proto' to consume the latest ONNX release (v1.14.0)

Problems:
- protobufjs v7.2.4 depends on long.js v5, which does not work well with TypeScript (CommonJS).
- onnx-proto depends on this fix landing in a new release of long.js.
- long.js is in maintenance mode, and it takes longer than expected to get new changes in.

Solutions:
- use a patch script in `preprepare` to copy type declarations so that long.js works with TypeScript (CommonJS)
- generate the ONNX protobuf JS/TS files and put them under the js/web/lib/onnxjs/ort-schema/protobuf folder
- remove 'onnx-proto' from the dependencies
- apply fixes to the generated onnx.d.ts
Set the WebNN EP minimum supported opset to 7, as ONNX Runtime currently only guarantees support for models stamped with opset 7 or above.
### Description
This PR includes changes to the documentation in the _readmeOV.rst_ file and to the dockerfile, enabling ORT to be built with the latest OpenVINO 2023.0.0.

### Motivation and Context
Modified the dockerfile to incorporate the latest version of OpenVINO (2023.0.0) for building ONNX Runtime. The changes in this PR aim to improve the overall user experience by providing accurate and up-to-date documentation while leveraging the latest OpenVINO 2023.0.0.
It gives up to a 7.5% improvement in the LLaMA 7B case.