v3.2.1
This is a patch release containing the following changes to v3.2:
- Fixed a potential `SEGFAULT` issue when oneDNN primitives are created in parallel (0a6202f)
- Replaced deprecated SYCL API `get_pointer` with `get_multi_ptr` (fdbff45, 51ed43b)
- Fixed an error in device indices detection for persistent cache (25575c2)
- Improved benchdnn performance results accuracy for Graph API (9dfe343)
- Fixed an issue with the profiling API not respecting the `ONEDNN_EXPERIMENTAL_PROFILING` build knob. This behavior manifests as an apparent memory leak when oneDNN primitives are executed on a queue with profiling enabled (8d796ef, 51a8f7a, 2ca2938)
- Fixed a correctness issue in resampling primitive with binary and/or sum post-op on Intel CPUs (65ccd25, 4a0e087, f333bb8)
- Fixed a correctness issue in int8 matmul with zero-points for processors with Intel AVX2 and Intel DL Boost instructions support (ec0b2ee, 6d2e567)
- Fixed a correctness issue in fp32 batched matmul with transposed source tensor on processors with Intel AVX-512 instruction set support (36f355e)
- Fixed a correctness issue in matmul and inner product with post-ops on processors with Intel AVX2 and Intel DL Boost with fp16 and bfloat16 instruction set support (b76d4ca)
- Fixed a potential out-of-bounds issue during GPU kernel creation (190a9b2)
- Updated build system to use TBB-provided CMake config file when available (4011219)