Using rank-reducing subtensor/subviews #5385

Closed
hanhanW opened this issue Apr 9, 2021 · 12 comments
Labels: codegen (Shared code generation infrastructure and dialects)


hanhanW commented Apr 9, 2021

The subtensor and subview ops support rank-reducing behavior, e.g.:

  func @main(%arg0: tensor<1x4xf32>) -> tensor<4xf32> {
    %0 = subtensor %arg0[0, 0] [1, 4] [1, 1] : tensor<1x4xf32> to tensor<4xf32>
    return %0 : tensor<4xf32>
  }

This can fold some tensor_reshape ops away.
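
For instance, a rough sketch of the folding this enables (the linalg.tensor_reshape reassociation syntax is assumed from the 2021-era op):

  // Before: a non-rank-reducing subtensor followed by a collapsing reshape.
  %0 = subtensor %arg0[0, 0] [1, 4] [1, 1] : tensor<1x4xf32> to tensor<1x4xf32>
  %1 = linalg.tensor_reshape %0 [affine_map<(d0, d1) -> (d0, d1)>]
      : tensor<1x4xf32> into tensor<4xf32>
  // After: the rank-reducing subtensor yields tensor<4xf32> directly and the
  // reshape folds away.
  %1 = subtensor %arg0[0, 0] [1, 4] [1, 1] : tensor<1x4xf32> to tensor<4xf32>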

I also found that there are some issues in the canonicalization pass: some tensor.cast ops get created, raising errors like `error: 'tensor.cast' op operand type 'tensor<1x513xi16>' and result type 'tensor<513xi16>' are cast incompatible`.

  1. They should be tensor_reshape ops, not tensor.cast ops.
  2. They should be folded into the subtensor op (sketched below).
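
Concretely, for the error above, a sketch of the desired fold (assuming a source of type tensor<2x513xi16>, as in the example further down):

  // Currently produced (invalid: the ranks differ, so the cast does not verify):
  //   %0 = subtensor %arg0[0, 0] [1, 513] [1, 1] : tensor<2x513xi16> to tensor<1x513xi16>
  //   %1 = tensor.cast %0 : tensor<1x513xi16> to tensor<513xi16>
  // Desired: fold the rank reduction into the subtensor itself.
  %0 = subtensor %arg0[0, 0] [1, 513] [1, 1] : tensor<2x513xi16> to tensor<513xi16>
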
hanhanW added the codegen label on Apr 9, 2021

hanhanW commented Apr 12, 2021

I did some digging into this; it looks like the subtensor op has a canonicalization pattern that handles the case where the types do not match.

It fails in my case because the source type and result type do not have the same rank. Dumping the IR from the pattern shows:

  // Original op:
  %2449 = subtensor %2377[%2447, %c0] [1, 512] [1, 1] : tensor<1x512xf32> to tensor<512xf32>
  // Rewritten into a subtensor plus a rank-changing (hence invalid) cast:
  %2448 = subtensor %2377[%2447, 0] [1, 512] [1, 1] : tensor<1x512xf32> to tensor<1x512xf32>
  %2449 = tensor.cast %2448 : tensor<1x512xf32> to tensor<512xf32>

@MaheshRavishankar suggested that it should be a tensor_reshape, but I don't think we should replace it with linalg.tensor_reshape, and there is no tensor.reshape op. This is definitely a bug in the pattern because it creates an invalid cast op. I actually hit a similar issue before: https://llvm.discourse.group/t/lack-of-support-when-lowering-mhlo-reduce-to-linalg/2674

The issue in step one is that we lack a tensor_reshape-like operation. I tried tensor::CastOp, and it complains that operand type 'tensor<1xf32>' and result type 'tensor<f32>' are cast incompatible. One workaround is to cast it twice, but I don't think this is the way to go.
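
For illustration, such a double cast would have to go through an unranked tensor, since tensor.cast allows ranked-to-unranked and unranked-to-ranked casts but not rank-changing ranked-to-ranked casts (a sketch):

  %0 = tensor.cast %arg0 : tensor<1xf32> to tensor<*xf32>
  %1 = tensor.cast %0 : tensor<*xf32> to tensor<f32>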

cc @nicolasvasilache

Side question: I cannot trigger the pattern with a simpler example. I ran `mlir-opt -canonicalize a.mlir`, but it seems the pattern is never entered; it did not dump anything. Is there a way to trigger it?

module  {
  func @dynamic_slice(%arg0: tensor<2x513xi16>, %arg1: tensor<i64>, %arg2: tensor<i64>) -> tensor<513xi16> {
    %c1_i64 = constant 1 : i64
    %c0_i64 = constant 0 : i64
    %0 = tensor.extract %arg1[] : tensor<i64>
    %1 = cmpi slt, %0, %c1_i64 : i64
    %2 = select %1, %0, %c1_i64 : i64
    %3 = cmpi sgt, %2, %c0_i64 : i64
    %4 = select %3, %2, %c0_i64 : i64
    %5 = index_cast %4 : i64 to index
    %6 = tensor.extract %arg2[] : tensor<i64>
    %7 = cmpi slt, %6, %c0_i64 : i64
    %8 = select %7, %6, %c0_i64 : i64
    %9 = cmpi sgt, %8, %c0_i64 : i64
    %10 = select %9, %8, %c0_i64 : i64
    %11 = index_cast %10 : i64 to index
    %12 = subtensor %arg0[%5, %11] [1, 513] [1, 1] : tensor<2x513xi16> to tensor<513xi16>
    return %12 : tensor<513xi16>
  }
}

MaheshRavishankar commented

Definitely, doing this as a cast is invalid. (You hit the same error message, but the issue is different.)

From my experimentation, this is all coming from concatenate and slice operations that are inserting/taking unit-dim slices. The way to go is to have the rank-reducing versions of these generated to begin with; then you could insert a linalg.tensor_reshape for those (see the sketch below).
The rank-reducing versions are not handled properly in IREE, and there are some canonicalization patterns missing in different places to make these work correctly. Here are the WIP changes to core that I plan to work on this week: MaheshRavishankar/llvm-project@1eae450. I can assign this bug to myself; it will take a few steps to plumb this through.
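
For reference, a sketch of that direction on a unit-dim slice (%t and %i stand for an arbitrary source tensor and index; the linalg.tensor_reshape reassociation syntax is assumed from the 2021-era op):

  // Rank-reducing slice generated up front:
  %s = subtensor %t[%i, 0] [1, 512] [1, 1] : tensor<4x512xf32> to tensor<512xf32>
  // If a consumer needs the unit dimension back, expand it with a reshape:
  %r = linalg.tensor_reshape %s [affine_map<(d0, d1) -> (d0, d1)>]
      : tensor<512xf32> into tensor<1x512xf32>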

MaheshRavishankar self-assigned this on Apr 12, 2021

nicolasvasilache commented Apr 12, 2021 via email


nicolasvasilache commented Apr 12, 2021 via email


hanhanW commented Apr 12, 2021

Thanks @MaheshRavishankar, my case is actually coming from mhlo.dynamic-slice. The pattern landed earlier today: tensorflow/mlir-hlo@a3fc99e

Re @nicolasvasilache: I was trying to say that linalg.tensor_reshape lives in the Linalg dialect, so I don't think we should replace tensor.cast with it directly, and there is no tensor.reshape. Having a reshape op in the tensor dialect sounds really good to me.

> Is it mhlo slice or something else? (IIRC I killed linalg.slice a while back.)

I think it is mhlo.slice, and it will be lowered to a subtensor op.

> Still it seems like a core canonicalization issue that does not look at whether the cast will be valid?

Yes, you are right. It unconditionally creates a cast op when the operand type and result type mismatch.
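
For context, a unit-dim mhlo.slice and the subtensor it currently lowers to would look roughly like this (a sketch; the attribute values are illustrative):

  %0 = "mhlo.slice"(%arg0) {start_indices = dense<[0, 0]> : tensor<2xi64>,
                            limit_indices = dense<[1, 513]> : tensor<2xi64>,
                            strides = dense<[1, 1]> : tensor<2xi64>}
      : (tensor<2x513xi16>) -> tensor<1x513xi16>
  // A rank-reducing lowering would yield tensor<513xi16> directly.
  %1 = subtensor %arg0[0, 0] [1, 513] [1, 1] : tensor<2x513xi16> to tensor<1x513xi16>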


nicolasvasilache commented Apr 12, 2021 via email


pifon2a commented Apr 12, 2021 via email

MaheshRavishankar commented

> Do you mind involving some folks from IREE too, or we could discuss in the codegen meeting tomorrow (9 AM PDT).

I think the issue of combining linalg.tensor_reshape and linalg.reshape is orthogonal. If memref.reshape is essentially linalg.reshape but in the memref dialect, then that's fine.

hanhanW added a commit to hanhanW/iree that referenced this issue Apr 19, 2021
Currently, mhlo.pad is lowered to linalg.pad_tensor and then to `linalg.init_tensor + linalg.fill + subtensor_insert`. The init_tensor op produces a dynamic shape even if the shape is static. This leads to a `tensor.cast` op being added and relies on further patterns to fix it up. This is not needed for static shapes and hits some issues related to iree-org#5385.
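
For reference, the static-shape form of that lowering looks roughly like this (a sketch; %cst is the pad value, %src the unpadded input, and the linalg.fill operand order follows the 2021-era tensor form, which is an assumption):

  %init = linalg.init_tensor [4, 6] : tensor<4x6xf32>
  %fill = linalg.fill(%init, %cst) : tensor<4x6xf32>, f32 -> tensor<4x6xf32>
  %res = subtensor_insert %src into %fill[1, 1] [2, 4] [1, 1]
      : tensor<2x4xf32> into tensor<4x6xf32>
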
hanhanW added a commit that referenced this issue Apr 19, 2021
…tic (#5516)

MaheshRavishankar commented

Fixed by dbd43a7

MaheshRavishankar commented

Reopening (closed wrong bug)

MaheshRavishankar commented

Should be fixed by this commit llvm/llvm-project@41849a9.

@hanhanW please verify and close after the change comes in through the integrate process.

MaheshRavishankar commented

Closing this. Rank-reducing subtensor/subtensor_insert support is implemented now.
