Releases: intel/mlir-extensions
IMEX v0.3
Preview release of Intel® Extension for MLIR (IMEX)
Fixes / Improvements
Highlights
- XeTile Dialect: The XeTile dialect supports the tile-based programming model and decomposes the GEMM kernel into large, pre-defined tile sizes at the subgroup and workgroup levels.
- XeGPU Dialect: The XeGPU dialect models Xe instructions like DPAS and 2D block load/store.
- Lowering from subgroup-level XeTile to XeGPU VC Mode.
- Lowering from XeGPU to SPIR-V:
  - XeGPU VC Mode to VC Intrinsics (covers all XeGPU ops)
  - XeGPU to GenISA Intrinsics & Joint Matrix (ops supported: create_nd_descriptor, update_nd_offset, load_nd, store_nd, dpas)
- Dialect/Op, conversion & integration test cases for XeTile & XeGPU.
- High performance end-to-end GEMM code example based on SPIR-V dialect.
- RFC summarizing XeTile & XeGPU design
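
The tile-based decomposition named in the highlights can be illustrated outside MLIR. The following is a minimal plain-Python sketch, not IMEX or XeTile code: the function `tiled_gemm` and its `tile_m`, `tile_n`, and `tile_k` parameters are hypothetical names chosen for illustration, and the tiny tile sizes stand in for the much larger pre-defined tiles a real kernel would use. It only shows the general idea of computing a GEMM one output tile at a time while stepping through the reduction dimension in tile-sized chunks.

```python
def tiled_gemm(a, b, tile_m=2, tile_n=2, tile_k=2):
    """Compute C = A x B tile by tile (illustrative sketch, not IMEX code)."""
    m, k = len(a), len(a[0])
    k_b, n = len(b), len(b[0])
    if k != k_b:
        raise ValueError("inner dimensions of A and B must match")

    c = [[0.0] * n for _ in range(m)]

    # Walk over output tiles of size tile_m x tile_n.
    for i0 in range(0, m, tile_m):
        for j0 in range(0, n, tile_n):
            # Step through the reduction dimension in chunks of tile_k.
            for k0 in range(0, k, tile_k):
                # One tile-sized multiply-accumulate; min() handles
                # edge tiles when a dimension is not a tile multiple.
                for i in range(i0, min(i0 + tile_m, m)):
                    for j in range(j0, min(j0 + tile_n, n)):
                        acc = c[i][j]
                        for p in range(k0, min(k0 + tile_k, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
result = tiled_gemm(a, b)
```

In this sketch the innermost three loops play the role of a single tile-level multiply-accumulate. On hardware, that step would map to a matrix instruction such as DPAS rather than scalar loops.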
List of all changes
Dependencies revisions
| Project | Revision |
|---|---|
| LLVM Project | 49af650 |
Supported system configurations
- Ubuntu 22.04 LTS
- x86 CPU
- Intel® Data Center GPU Max Series
Supported data types for GPU
- FP32
- FP16
- BF16
- I32
- I16
- I8
Limitations
- GEMM end-to-end test cases support only the FP16 & BF16 data types.
- IMEX v0.3 does not support TF32 & FP64.
- When the input program uses multiple tile_mma ops, a tensor used as the A matrix in one tile_mma cannot be used as the B matrix in another.
- The XeTile fusion use case is not fully supported.
- XeGPU does not support the shared local memory (SLM) use case.
Dependencies for GPU execution
GPU execution supports two wrapper libraries for interacting with the GPU: the Level Zero wrapper and the SYCL wrapper.
- oneAPI Level Zero: https://github.com/oneapi-src/level-zero (required for both the Level Zero and SYCL wrappers)
- Intel® oneAPI DPC++/C++ Compiler: https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html#dpcpp-cpp (required for the SYCL wrapper)
IMEX v0.2
Preview release of Intel® Extension for MLIR (IMEX)
Fixes / Improvements
Highlights
- BF16 support
- Performance benchmarks for popular models: elementwise, reduction, softmax, transpose, kloop
- Caching for Level Zero/SYCL runtime compilation
- Fixed a GPU memory leak
- Switched the Level Zero runtime to a synchronous queue to fix sporadic issues with GPU tests
List of all changes
Dependencies revisions
| Project | Revision |
|---|---|
| LLVM Project | d2a559f |
Supported configurations
- Ubuntu 20.04 LTS
- x86 CPU
- Intel® Data Center GPU Flex Series or Intel® Data Center GPU Max Series
Supported data types for GPU
- FP32
- FP16
- BF16
- I32
- I16
- I8
Dependencies for GPU execution
GPU execution supports two wrapper libraries for interacting with the GPU: the Level Zero wrapper and the SYCL wrapper.
- oneAPI Level Zero: https://github.com/oneapi-src/level-zero (required for both the Level Zero and SYCL wrappers)
- Intel® oneAPI DPC++/C++ Compiler: https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html#dpcpp-cpp (required for the SYCL wrapper)
IMEX v0.1
Preview release of Intel® Extension for MLIR (IMEX)
This release focuses on functionally enabling deep learning workloads with statically shaped input tensors, written in upstream MLIR entry dialects (Linalg and other lower-level dialects), on select Intel GPUs. Development is test-case driven and does not cover all ops in the MLIR entry dialects.
Supported configurations
- Ubuntu 20.04 LTS
- x86 CPU
- Intel® Data Center GPU Flex Series or Intel® Data Center GPU Max Series
Supported data types for GPU
- FP32
Dependencies for GPU execution
GPU execution supports two wrapper libraries for interacting with the GPU: the Level Zero wrapper and the SYCL wrapper.
- oneAPI Level Zero: https://github.com/oneapi-src/level-zero (required for both the Level Zero and SYCL wrappers)
- Intel® oneAPI DPC++/C++ Compiler: https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html#dpcpp-cpp (required for the SYCL wrapper)
Tested DL workloads
- ResNet-50 inference
- MobileNet-v3 inference
- MNIST MLP
Tested DL operations
- Elementwise
- Convolution
- GEMM
- Reduction
- Data movement
- Padding