#17455 uncovered several problems with tensor.extract_slice ops that consume the results of dequantization-like linalg.generic ops. See this gist for an MLIR example.
What should happen
The dequantization + ExtractSliceOp + consumer should be placed in the same dispatch AND bufferization should convert the ExtractSliceOp into a view instead of allocating an entirely new high-bitwidth tensor.
dequantization + ExtractSliceOp + consumer
-> (make the slice contiguous via a transpose)
dequantization + transpose + ExtractSliceOp + consumer
-> (move the transpose before the dequant)
transpose + dequantization + ExtractSliceOp + consumer
-> (clone the dequant and ExtractSliceOp into the consumer's dispatch)
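The rewrite above relies on the dequantization being elementwise, so it commutes with a transpose. The sketch below checks that on plain Python lists. It is only an illustration, not IREE code: the helper names are hypothetical, and it assumes a single per-tensor scale and zero point (a real dequant with per-group scales would need its scales permuted as well).

```python
def dequantize(q, scale, zero_point):
    """Elementwise dequantization of a 2-D list (per-tensor scale is an assumption)."""
    return [[(v - zero_point) * scale for v in row] for row in q]


def transpose(m):
    """Swap the two dimensions of a 2-D list."""
    return [list(col) for col in zip(*m)]


def slice_inner(m, start, size):
    """Slice along the innermost dimension: columns [start, start + size)."""
    return [row[start:start + size] for row in m]


def slice_outer(m, start, size):
    """Slice along the outermost dimension: rows [start, start + size)."""
    return m[start:start + size]


q = [[1, 2, 3, 4], [5, 6, 7, 8]]
scale, zero_point = 0.5, 1

# Original form: dequantization + slice on the innermost dimension.
original = slice_inner(dequantize(q, scale, zero_point), 1, 2)

# Rewritten form: transpose + dequantization + slice on the outermost dimension.
rewritten = slice_outer(dequantize(transpose(q), scale, zero_point), 1, 2)

# Both forms hold the same values; the rewritten one is just laid out transposed.
assert transpose(rewritten) == original
```

The consumer still has to account for the transposed layout; the sketch only shows that the values match.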
What is currently happening
Although dequantization + ExtractSliceOp + consumer get placed in the same dispatch, bufferization cannot handle the tensor.extract_slice because it extracts along the innermost dim. Full logs
Problems
Add transpose before ExtractSliceOp #17574: transpose the extract_slice so that it extracts along the outermost dimension, which bufferization can handle better. However, the new transpose gets fused with the dequant, so the dequant is no longer cloned.
Bubble the transpose above the dequantization ops so that it doesn't get in the way of dequant + extract being cloned into dispatches.
The transpose gets fused with the dequant op, which prevents cloning the dequantization into the dispatch (IR example).
Prevent the transpose from being fused with the dequant,
OR propagate the slice before the dequant (hard with multiple slices).
tensor.extract_slice ops shouldn't be unconditionally cloned into dispatch regions. PR #17638 (conditionally clone extract_slice) looks at only cloning when the result is contiguous.
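To make the contiguity condition concrete, here is a minimal sketch of one way to decide whether a unit-stride slice of a row-major tensor occupies a single contiguous block. It is not the check implemented in #17638; `slice_is_contiguous` and `flat_offsets` are hypothetical helpers, and the shapes are made up for illustration.

```python
from itertools import product


def slice_is_contiguous(shape, sizes):
    """True if a unit-stride slice of `sizes` from a row-major tensor of
    `shape` covers one contiguous block of memory."""
    if len(shape) != len(sizes):
        raise ValueError("shape and sizes must have the same rank")
    # Leading dims of size 1 only select an offset; skip them.
    k = 0
    while k < len(sizes) and sizes[k] == 1:
        k += 1
    # Dim k may be partial, but every dim after it must be taken in full.
    return all(s == d for s, d in zip(sizes[k + 1:], shape[k + 1:]))


def flat_offsets(shape, offsets, sizes):
    """Brute-force linear offsets of every element in the slice (row-major)."""
    strides, acc = [], 1
    for d in reversed(shape):
        strides.append(acc)
        acc *= d
    strides.reverse()
    return [
        sum((o + i) * s for o, i, s in zip(offsets, idx, strides))
        for idx in product(*(range(n) for n in sizes))
    ]


# Slicing the innermost dim of a 4x8 tensor: elements are scattered across rows.
inner = flat_offsets((4, 8), (0, 2), (4, 2))
# Slicing the outermost dim: elements form one consecutive run.
outer = flat_offsets((4, 8), (1, 0), (2, 8))

assert not slice_is_contiguous((4, 8), (4, 2))
assert slice_is_contiguous((4, 8), (2, 8))
```

The innermost-dim slice touches offsets 2, 3, 10, 11, and so on, which is why it cannot become a simple view, while the outermost-dim slice is one run of 16 consecutive elements.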
IanWood1 changed the title from "[FLOW] Dequantization + Extract Slice Problems" to "Dequantization + Extract Slice Problems" on Jun 11, 2024