[RLlib](deps): Bump onnxruntime from 1.8.0 to 1.9.0 in /python/requirements/rllib #3

dependabot · 2021-11-13T19:41:21Z

Bumps onnxruntime from 1.8.0 to 1.9.0.

Release notes

ONNX Runtime v1.9.0

Announcements

GCC version < 7 is no longer supported

CMAKE_SYSTEM_PROCESSOR needs to be set when cross-compiling on Linux because pytorch cpuinfo was introduced as a dependency for ARM big.LITTLE support. Set it to the output of uname -m on your target device.
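As a rough illustration of the note above, the value can be read on the target device with a few lines of Python. This is only a sketch: `cmake_processor_flag` is a hypothetical helper name, not part of ONNX Runtime, and it assumes Python is available on the target device.

```python
import platform


def cmake_processor_flag(machine=None):
    """Build the -D definition for CMAKE_SYSTEM_PROCESSOR.

    `machine` defaults to platform.machine(), which on Linux reports the
    same string as `uname -m` (for example "aarch64" or "x86_64").
    """
    if machine is None:
        machine = platform.machine()
    if not machine:
        raise ValueError("could not determine the machine type")
    return f"-DCMAKE_SYSTEM_PROCESSOR={machine}"


# Hypothetical usage: run on the target device, then pass the resulting
# definition to the cross-compiling CMake invocation on the build host.
# print(cmake_processor_flag())
```

Passing the machine string explicitly, as in `cmake_processor_flag("aarch64")`, avoids needing Python on the target when the `uname -m` output is already known.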

General

ONNX 1.10 support

opset 15

ONNX IR 8 (SparseTensor type, model local functionprotos, Optional type not yet fully supported this release)

Improved documentation of C/C++ APIs

IBM Power support

WinML - DLL dependency fix supports learning models on Windows 8.1

Support for sub-building onnxruntime-extensions and statically linking into onnxruntime binary for custom builds

Add --_use_extensions option to run models with custom operators implemented in onnxruntime-extensions

APIs

Registration of a custom allocator for sharing between multiple sessions. (See RegisterAllocator and UnregisterAllocator APIs in onnxruntime_c_api.h)

SessionOptionsAppendExecutionProvider_TensorRT API is deprecated; use SessionOptionsAppendExecutionProvider_TensorRT_V2

New APIs: SessionOptionsAppendExecutionProvider_TensorRT_V2, CreateTensorRTProviderOptions, UpdateTensorRTProviderOptions, GetTensorRTProviderOptionsAsString, ReleaseTensorRTProviderOptions, EnableOrtCustomOps, RegisterAllocator, UnregisterAllocator, IsSparseTensor, CreateSparseTensorAsOrtValue, FillSparseTensorCoo, FillSparseTensorCsr, FillSparseTensorBlockSparse, CreateSparseTensorWithValuesAsOrtValue, UseCooIndices, UseCsrIndices, UseBlockSparseIndices, GetSparseTensorFormat, GetSparseTensorValuesTypeAndShape, GetSparseTensorValues, GetSparseTensorIndicesTypeShape, GetSparseTensorIndices

Performance and quantization

Performance improvement on ARM

Added S8S8 (signed int8, signed int8) matmul kernel. This avoids extending uint8 to int16, for better performance on ARM64 without the dot-product instruction

Expanded GEMM udot kernel to 8x8 accumulator

Added sgemm and qgemm optimized kernels for ARM64EC

Operator improvements

Improved performance for quantized operators: DynamicQuantizeLSTM, QLinearAvgPool

Added new quantized operator QGemm for quantizing Gemm directly

Fused HardSigmoid and Conv

Quantization tool - subgraph support

Transformers tool improvements

Fused Attention for BART encoder and Megatron GPT-2

Integrated mixed precision ONNX conversion and parity test for GPT-2

Updated graph fusion for embed layer normalization for BERT

Improved symbolic shape inference for operators: Attention, EmbedLayerNormalization, Einsum and Reciprocal

Packages

Official ORT GPU packages (except Python) now include both CUDA and TensorRT Execution Providers.

Python packages will be updated next release. Please note that EPs should be explicitly registered to ensure the correct provider is used.
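A minimal sketch of explicit provider registration from Python follows. `choose_providers`, `create_session`, and the preference order are illustrative assumptions rather than anything from the release notes; `onnxruntime.get_available_providers()` and the `providers` argument of `InferenceSession` are the only ONNX Runtime calls used.

```python
# Illustrative preference order (an assumption): try TensorRT, then CUDA, then CPU.
DEFAULT_PREFERENCE = (
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
)


def choose_providers(available, preferred=DEFAULT_PREFERENCE):
    """Return the preferred providers that are actually available, in order."""
    chosen = [name for name in preferred if name in available]
    if not chosen:
        raise RuntimeError(f"none of {preferred} found in {available}")
    return chosen


def create_session(model_path):
    """Open a session with an explicit provider list (requires onnxruntime)."""
    # Imported lazily so that choose_providers() has no hard dependency.
    import onnxruntime as ort

    providers = choose_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

Listing the providers explicitly makes the fallback order visible in the calling code instead of depending on package defaults.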

GPU packages are built with CUDA 11.4 and should be compatible with 11.x on systems with the minimum required driver version. See: CUDA minor version compatibility

PyPI

ORT + DirectML Python packages now available: onnxruntime-directml

GPU package can be used on both CPU-only and GPU machines

NuGet

C#: Added support for using netstandard2.0 as a target framework

Windows symbol (PDB) files are no longer included in the NuGet package, reducing the size of the binary NuGet package by 85%. To download them, please see the artifacts below on GitHub.

Execution Providers

CUDA EP

... (truncated)

Commits

4daa14b Fixes to rel-1.9.0 to compile and pass for AMD ROCm (#9144)
66b3c31 Final round cherry-picks to 1.9.0 (#9133)
b73bc79 Add a pipeline for audio ops (#9102)
83dc225 Second round cherry-pick to rel-1.9.0 (#9062)
f202cf3 First round cherry-pick to rel-1.9.0 (#9019)
6fbd0a8 Change cmake_cuda_architectures to double quotes (#8990)
5ae4c54 Fix bug for validating GPU packages (#8997)
a30d9f5 fix windows gpu pipelines that use cuda 10.2 (training, reduced_ops and 10.2 ...
4505243 [js/web] WebAssembly profiling (#8932)
0193490 ReduceMin - add int64 cuda kernel support for opset12/13 (#8966)
Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

@dependabot rebase will rebase this PR
@dependabot recreate will recreate this PR, overwriting any edits that have been made to it
@dependabot merge will merge this PR after your CI passes on it
@dependabot squash and merge will squash and merge this PR after your CI passes on it
@dependabot cancel merge will cancel a previously requested merge and block automerging
@dependabot reopen will reopen this PR if it is closed
@dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
@dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
@dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
@dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [onnxruntime](https://github.com/microsoft/onnxruntime) from 1.8.0 to 1.9.0.

- [Release notes](https://github.com/microsoft/onnxruntime/releases)
- [Changelog](https://github.com/microsoft/onnxruntime/blob/master/docs/ReleaseManagement.md)
- [Commits](microsoft/onnxruntime@v1.8.0...v1.9.0)

---
updated-dependencies:
- dependency-name: onnxruntime
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
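The `semver-minor` label in the commit metadata can be checked by hand. The sketch below is not Dependabot's implementation; `parse_version` and `classify_bump` are hypothetical helpers that assume plain three-part `MAJOR.MINOR.PATCH` version strings.

```python
def parse_version(text):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of three integers."""
    parts = tuple(int(part) for part in text.split("."))
    if len(parts) != 3:
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {text!r}")
    return parts


def classify_bump(old, new):
    """Name the first version component that differs between old and new."""
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        raise ValueError(f"{new} is not an upgrade from {old}")
    for label, before, after in zip(("major", "minor", "patch"), old_v, new_v):
        if before != after:
            return label


print(classify_bump("1.8.0", "1.9.0"))  # prints "minor"
```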

dependabot · 2021-12-11T08:12:31Z

Superseded by #11.


…ray-project#23821)

This PR refactors `LazyBlockList` in service of out-of-band serialization (see [mono-PR](ray-project#22616)) and is a precursor to an execution plan refactor (PR #2) and adding the actual out-of-band serialization APIs (PR #3). The following is included in this refactor:

1. `ReadTask`s are now a first-class concept, replacing calls;
2. read stage progress tracking is consolidated into `LazyBlockList._get_blocks_with_metadata()`, and more of the read task complexity, e.g. the read remote function, was pushed into `LazyBlockList` to make `ray.data.read_datasource()` simpler;
3. we are a bit smarter about how we progressively launch tasks and fetch and cache metadata, including fetching the metadata for read tasks in `.iter_blocks_with_metadata()` instead of relying on the pre-read task metadata (which will be less accurate), and we also fix some small bugs in the lazy ramp-up around progressive metadata fetching.

(1) is the most important item for supporting out-of-band serialization and fundamentally changes the `LazyBlockList` data model. This is required since we need to be able to reference the underlying read tasks when rewriting read stages during optimization and when serializing the lineage of the Dataset. See the [mono-PR](ray-project#22616) for more context.

Other changes:

1. Changed the stats actor to a global named actor singleton in order to obviate the need for serializing the actor handle with the Dataset stats; without this, we were encountering serialization failures.

We encountered a SIGSEGV when running the Python test `python/ray/tests/test_failure_2.py::test_list_named_actors_timeout`. The stack is:

```
#0 0x00007fffed30f393 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&) () from /lib64/libstdc++.so.6
#1 0x00007fffee707649 in ray::RayLog::GetLoggerName() () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#2 0x00007fffee70aa90 in ray::SpdLogMessage::Flush() () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#3 0x00007fffee70af28 in ray::RayLog::~RayLog() () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#4 0x00007fffee2b570d in ray::asio::testing::(anonymous namespace)::DelayManager::Init() [clone .constprop.0] () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#5 0x00007fffedd0d95a in _GLOBAL__sub_I_asio_chaos.cc () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#6 0x00007ffff7fe282a in call_init.part () from /lib64/ld-linux-x86-64.so.2
#7 0x00007ffff7fe2931 in _dl_init () from /lib64/ld-linux-x86-64.so.2
#8 0x00007ffff7fe674c in dl_open_worker () from /lib64/ld-linux-x86-64.so.2
#9 0x00007ffff7b82e79 in _dl_catch_exception () from /lib64/libc.so.6
#10 0x00007ffff7fe5ffe in _dl_open () from /lib64/ld-linux-x86-64.so.2
#11 0x00007ffff7d5f39c in dlopen_doit () from /lib64/libdl.so.2
#12 0x00007ffff7b82e79 in _dl_catch_exception () from /lib64/libc.so.6
#13 0x00007ffff7b82f13 in _dl_catch_error () from /lib64/libc.so.6
#14 0x00007ffff7d5fb09 in _dlerror_run () from /lib64/libdl.so.2
#15 0x00007ffff7d5f42a in dlopen@@GLIBC_2.2.5 () from /lib64/libdl.so.2
#16 0x00007fffef04d330 in py_dl_open (self=<optimized out>, args=<optimized out>) at /tmp/python-build.20220507135524.257789/Python-3.7.11/Modules/_ctypes/callproc.c:1369
```

The root cause is that when `_raylet.so` is loaded, `static DelayManager _delay_manager` is initialized and `RAY_LOG(ERROR) << "RAY_testing_asio_delay_us is set to " << delay_env;` is executed. However, the static variables declared in `logging.cc` are not initialized yet (in this case, `std::string RayLog::logger_name_ = "ray_log_sink"`). It's better not to rely on the initialization order of static variables in different compilation units, because that order is not guaranteed.

I propose to change all `RAY_LOG`s to `std::cerr` in `DelayManager::Init()`.

The crash happens in Ant's internal codebase. I'm not sure why this test case passes in the community version, though.

BTW, I've tried different approaches:

1. Using a static local variable in `get_delay_us` and removing the global variable. This doesn't work because `init()` needs to access the variable as well.
2. Defining the global variable as type `std::unique_ptr<DelayManager>` and initializing it in `get_delay_us`. This works, but it requires a lock to be thread-safe.


dependabot bot added the dependencies Pull requests that update a dependency file label Nov 13, 2021

dependabot bot closed this Dec 11, 2021

dependabot bot deleted the dependabot/pip/python/requirements/rllib/onnxruntime-1.9.0 branch December 11, 2021 08:12

simonsays1980 pushed a commit that referenced this pull request Feb 5, 2022

[RLlib] Preparatory PR for multi-agent multi-GPU learner (alpha-star …

d5bfb7b

…style) #3 (ray-project#21652)

simonsays1980 pushed a commit that referenced this pull request Mar 10, 2023

[Datasets] Streaming executor fixes #3 (ray-project#32836)

4cc3a53

simonsays1980 pushed a commit that referenced this pull request Dec 24, 2023

[RLlib] New ConnectorV2 API #3: Introduce actual ConnectorV2 API. (r…

bd555a0

…ay-project#41074) (ray-project#41212)
