Commit
Miopen dialect opt step13: apply index diff maps in `threadwise_copy` and `threadwise_copy_v2`. (#226)

* Apply index diff map in miopen.threadwise_copy_v2 for loads.
* Remove unused code.
* Apply index diff map in miopen.threadwise_copy_v2 for stores.
* Remove unused code.
* Factor out common logic.
* Fix clang-format.
* Start to use populateLayeredIndices to compute lower-level coordinates.
* Remove composedSource/DestTransform and rename some variables.
* Fix unit tests.
* Consolidate default lengths of SmallVector instances.
* Make loopIVsPerAccessOrder an argument rather than a captured object.
* Supply coordinate transformation metadata to the index diff map lambda.
* Split source / dest coordinate transformation specifications.
* Populate coord transform attributes for source vectors.
* Amend unit tests.
* Start to use the metadata and remove inputType from the lambda.
* Fix clang-format.
* Revise computeIndexDiffMap interface. Output two vectors:
  a) lower index diff.
  b) lower index updated.
* Move logic to where it's truly needed.
* Start to progressively apply index diff maps.
* Extract common logic to a lambda.
* Remove unused code.
* Rename some variables.
* Reorder dim_access_order. Experimental commit.
* Change default lengths of SmallVector instances.
* Switch between legacy and new approach.
* Carve out lambda from threadwise_copy_v2 to a function.
* Carve out lambda from threadwise_copy_v2 to a function.
  populateLayeredIndices ->
  - populateLayeredIndicesWithAffineMap
  - populateLayeredIndicesWithIndexDiffMap
* Populate initial upper and lower indices for index diff map computation.
* Add logic to cope with incomplete metadata.
* Adopt index diff map logic in threadwise_copy. Disabled by default.
* Experimental commit to test 5->3 transfer.
* Change comments.
* Supply UnMerge parameters for matrix C write out logic.
* Supply transform metadata for users of subview op.
* Fix clang-format.
* Proper implementation of index diff maps logic. F_infinite algorithm.
* Fix one unit test.
* XXX FIXME disable a test for conv2d_bwd_data. Tame check-mlir for now. This needs to be studied.
* Fix clang-format.
* Consider Slice in computeIndexDiffMap. Fix Embed parameters in bwd_data.
* XXX. HACKS for bwd_data.
  - Consider Slice in computeIndexDiffMap.
  - Fix Embed parameters in bwd_data.
  - Populate a fake identity map to prevent it from being optimized.
  - Hack a unit test.
* Revert "XXX FIXME disable a test for conv2d_bwd_data." This reverts commit 734c407.
* Embed affine maps within the metadata of transformations and use them. Avoid the identity map being optimized away by MLIR when it's embedded as a part of memref type. Fix unit tests. Remove those XXX hacks for conv2d_bwd_data.
* Use constantFold whenever possible.
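
For readers unfamiliar with the term, the sketch below illustrates the general idea of an index diff map for a Merge-style transformation: given the current lower (multi-dimensional) index and a step in the merged upper index, it returns both the lower index diff and the updated lower index, which mirrors the two outputs mentioned for the revised `computeIndexDiffMap` interface. This is not the code from this commit. The helper name `mergeIndexDiff`, the `Index` alias, and the restriction to non-negative steps are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Index = std::vector<int64_t>;

// Hypothetical helper (not the commit's computeIndexDiffMap).
// lengths:   sizes of the lower dimensions that are merged into one upper dim.
// lowerIdx:  current lower index, with 0 <= lowerIdx[i] < lengths[i] for i > 0.
// upperDiff: non-negative step applied to the merged upper index.
// Returns {lower index diff, updated lower index}.
std::pair<Index, Index> mergeIndexDiff(const Index &lengths,
                                       const Index &lowerIdx,
                                       int64_t upperDiff) {
  assert(!lengths.empty() && lengths.size() == lowerIdx.size());
  assert(upperDiff >= 0);
  const std::size_t n = lengths.size();

  // Split the upper step into per-dimension steps. When upperDiff is a
  // compile-time constant, these values can be folded ahead of time.
  Index step(n, 0);
  int64_t rest = upperDiff;
  for (std::size_t i = n - 1; i > 0; --i) {
    step[i] = rest % lengths[i];
    rest /= lengths[i];
  }
  step[0] = rest; // the outermost dimension absorbs whatever is left

  // Add the steps to the current lower index, propagating carries from the
  // innermost dimension outwards. Only add/compare/subtract is needed here.
  Index lowerDiff(n, 0), lowerUpdated(n, 0);
  int64_t carry = 0;
  for (std::size_t i = n; i-- > 0;) {
    int64_t v = lowerIdx[i] + step[i] + carry;
    carry = 0;
    if (i > 0 && v >= lengths[i]) {
      v -= lengths[i];
      carry = 1;
    }
    lowerUpdated[i] = v;
    lowerDiff[i] = v - lowerIdx[i];
  }
  return {lowerDiff, lowerUpdated};
}
```

With `lengths = {2, 3, 4}`, a current lower index of `{0, 1, 3}` (linear offset 7) and an upper step of 6, the updated lower index is `{1, 0, 1}` (linear offset 13) and the lower index diff is `{1, -1, -2}`. The point of the design is that the per-step work avoids recomputing the full division and modulo chain on the absolute index.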