Dequantization + Extract Slice Problems #17642

Closed
4 tasks
IanWood1 opened this issue Jun 11, 2024 · 1 comment

Comments

@IanWood1
Contributor

IanWood1 commented Jun 11, 2024

#17455 uncovered several problems with tensor.extract_slice ops that consume the results of dequantization-like linalg.generic ops. See this gist for an MLIR example.

What should happen

The dequantization + ExtractSliceOp + consumer should be placed in the same dispatch AND bufferization should convert the ExtractSliceOp into a view instead of allocating an entirely new high-bitwidth tensor.

dequantization + ExtractSliceOp + consumer
-> (make slice contiguous via transpose)
dequantization + transpose + ExtractSliceOp + consumer
-> (move transpose before dequant)
transpose + dequantization + ExtractSliceOp + consumer
-> (clone dequant and ExtractSliceOp into consumer)
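
To illustrate why the rewrite above is legal, here is a small pure-Python sketch (not IREE code). It assumes the dequantization is purely elementwise, so it commutes with a transpose, and that a slice of the innermost dimension becomes a slice of the outermost dimension once the operand is transposed. The helper names (`dequant`, `transpose`, `slice_inner`, `slice_outer`) and the scale/zero-point scheme are made up for the example.

```python
# Hypothetical stand-ins for the ops discussed above; none of these names come from IREE.

def dequant(q, scale=0.5, zero_point=8):
    """Elementwise dequantization of a 2-D list of ints (assumed scheme)."""
    return [[(v - zero_point) * scale for v in row] for row in q]

def transpose(m):
    """Swap the two dimensions of a 2-D list."""
    return [list(col) for col in zip(*m)]

def slice_inner(m, start, size):
    """Slice along the innermost (last) dimension."""
    return [row[start:start + size] for row in m]

def slice_outer(m, start, size):
    """Slice along the outermost (first) dimension."""
    return m[start:start + size]

q = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]

# Original order: dequant, then slice the innermost dimension.
original = slice_inner(dequant(q), 1, 2)

# Rewritten order: transpose first, dequant, then slice the outermost dimension.
rewritten = slice_outer(dequant(transpose(q)), 1, 2)

# The two agree up to the layout change introduced by the transpose.
assert transpose(rewritten) == original
```

The consumer then has to read the slice in its transposed layout (or the transpose has to be folded into the consumer's indexing), which this sketch does not model.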

What is currently happening

Although dequantization + ExtractSliceOp + consumer gets placed in the same dispatch, bufferization cannot handle the tensor.extract_slice because it extracts along the innermost dim. Full logs
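
As a rough illustration of why the innermost-dim slice is the awkward case, the sketch below checks whether a unit-stride slice of a row-major tensor covers a single contiguous block of memory. It is an assumption-laden toy, not the check used by IREE or by bufferization; `is_contiguous_slice` is a hypothetical helper, and it assumes static shapes and unit strides.

```python
def is_contiguous_slice(shape, sizes):
    """Return True if a unit-stride slice with the given sizes, taken from a
    row-major tensor of the given shape, occupies one contiguous block.

    Offsets do not matter for contiguity, so they are not parameters here.
    """
    seen_partial = False
    # Walk from the innermost dimension to the outermost one.
    for dim, size in zip(reversed(shape), reversed(sizes)):
        # Once an inner dimension is only partially covered, every
        # dimension outside it must have size 1 to stay contiguous.
        if seen_partial and size != 1:
            return False
        if size != dim:
            seen_partial = True
    return True

# Slicing the innermost dimension of a 4x8 tensor leaves gaps between rows.
inner = is_contiguous_slice(shape=(4, 8), sizes=(4, 2))

# Slicing the outermost dimension keeps whole rows, so the block is contiguous.
outer = is_contiguous_slice(shape=(4, 8), sizes=(2, 8))

assert inner is False
assert outer is True
```

Under this model, transposing so that the slice falls on the outermost dimension turns the first case into the second.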

Problems

  • #17574 (Add transpose before ExtractSliceOp): transpose so that the extract_slice extracts along the outermost dimension, which bufferization can handle better. However, the new transpose gets fused with the dequant, so the dequant is no longer cloned.
  • Bubble the transpose above dequantization ops so that it doesn't get in the way of dequant + extract being cloned into dispatches.
  • The transpose gets fused with the dequant op, which prevents cloning the dequantization into the dispatch (IR example). Options:
    1. Prevent the transpose from being fused with the dequant
    2. OR propagate the slice before the dequant (hard with multiple slices)
  • tensor.extract_slice ops shouldn't be unconditionally cloned into dispatch regions. #17638 (conditionally clone extract_slice) looks at cloning only when the result is contiguous.
IanWood1 changed the title from "[FLOW] Dequantization + Extract Slice Problems" to "Dequantization + Extract Slice Problems" on Jun 11, 2024
@IanWood1
Contributor Author

Issue resolved with #18332
