v3.2.1

@vpirogov vpirogov released this 04 Aug 00:54
· 14 commits to rls-v3.2 since this release

This is a patch release containing the following changes to v3.2:

  • Fixed a potential SEGFAULT when oneDNN primitives are created in parallel (0a6202f)
  • Replaced deprecated SYCL API get_pointer with get_multi_ptr (fdbff45, 51ed43b)
  • Fixed an error in device indices detection for persistent cache (25575c2)
  • Improved the accuracy of benchdnn performance results for Graph API (9dfe343)
  • Fixed an issue with the profiling API not respecting the ONEDNN_EXPERIMENTAL_PROFILING build knob. This behavior manifested as an apparent memory leak when oneDNN primitives were executed on a queue with profiling enabled (8d796ef, 51a8f7a, 2ca2938)
  • Fixed a correctness issue in resampling primitive with binary and/or sum post-op on Intel CPUs (65ccd25, 4a0e087, f333bb8)
  • Fixed a correctness issue in int8 matmul with zero-points for processors with Intel AVX2 and Intel DL Boost instructions support (ec0b2ee, 6d2e567)
  • Fixed a correctness issue in fp32 batched matmul with transposed source tensor on processors with Intel AVX-512 instruction set support (36f355e)
  • Fixed a correctness issue in matmul and inner product with post-ops on processors with Intel AVX2 and Intel DL Boost with fp16 and bfloat16 instruction set support (b76d4ca)
  • Fixed a potential out-of-bounds issue during GPU kernel creation (190a9b2)
  • Updated build system to use TBB-provided CMake config file when available (4011219)