
LowerToBackendContract: Explicitly error out on unimplemented operator #1947

Merged 2 commits into llvm:main on Mar 20, 2023

Conversation

mgehre-amd (Contributor)

As discussed on Discord [0], this PR emits more direct errors like unsupported by backend contract: Unimplemented operator 'aten.clamp.Tensor' instead of diagnosing unknown ranks or missing value conversion.

[0] https://discordapp.com/channels/636084430946959380/742573221882364009/1083826291301679225
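The actual change lives in the C++ LowerToBackendContract pass; purely as a sketch of the behavior described above, here is a minimal Python model (IMPLEMENTED_OPS, check_operator, and the listed op subset are hypothetical names, not Torch-MLIR APIs):

```python
from typing import Optional

# Hypothetical stand-in for the set of operators that have ODS definitions.
IMPLEMENTED_OPS = {"aten.add.Tensor", "aten.mul.Tensor"}

def check_operator(op_name: str) -> Optional[str]:
    """Return the direct error message for an unimplemented operator,
    mirroring the diagnostic text this PR emits, or None if supported."""
    if op_name not in IMPLEMENTED_OPS:
        return ("unsupported by backend contract: "
                f"Unimplemented operator '{op_name}'")
    return None
```

With a check of this shape, the pass fails fast and names the offending operator, instead of surfacing a later diagnostic about unknown ranks or missing value conversion.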

@mgehre-amd mgehre-amd requested a review from ramiro050 March 16, 2023 13:11
@mgehre-amd mgehre-amd force-pushed the mgehre.unsupported_op branch from bb69a9f to e8b9d10 on March 16, 2023 15:10
@mgehre-amd mgehre-amd force-pushed the mgehre.unsupported_op branch from e8b9d10 to bb80407 on March 17, 2023 08:35
@mgehre-amd (Contributor, Author) commented Mar 17, 2023

I'm a bit surprised by the CI failure. It turns out that
./tools/e2e_test.sh -f NormalizeModule_basic -c lazy_tensor_core fails on this PR with error: unsupported by backend contract: Unimplemented operator 'aten.clamp_min.Tensor', but it passes without my PR.

(Same for LTC on ElementwiseClampMaxModule_basic, fails with aten.clamp_max.Tensor)

I don't quite understand how it could be passing before when the operator is not implemented?

@ramiro050 (Collaborator)

That's very strange. Let me take a look and see what I find.

@ramiro050 (Collaborator)

> I don't quite understand how it could be passing before when the operator is not implemented?

I think I know what's happening. The LTC reference backend in Torch-MLIR creates an MLIR graph that is expected to satisfy the backend contract, but then executes things using the TorchScript LTC backend. In other words, the MLIR graph is not compiled further down and executed.

// We don't have a way to execute a computation based on the generated MLIR,
// so we'll fallback to the implementation used by the TS LTC backend.

This is why the tests were passing even though there is no ODS for aten.clamp_max.Tensor.

In reality, the torch.operator is not the erroneous part. We expect to use torch.operators to represent custom ops in the future. The cases we want to error out on are when torch.operators don't satisfy the guarantees of the backend contract: value semantics, dtype, and shapes for all tensors.

In the case of LTC, the clamp tests create the op

%4 = torch.operator "aten.clamp_min.Tensor"(%3, %arg0) : (!torch.vtensor<[3,1],f32>, !torch.vtensor<[],f64>) -> !torch.vtensor<[3,1],f32>

which does satisfy the backend contract. The check should be modified to use the checkType helper to make sure the outputs of torch.operator satisfy the backend contract.
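As an illustrative sketch of that rule (the real helper is checkType inside the C++ pass; TensorType and satisfies_backend_contract are made-up names for this model), a torch.operator would be allowed through as long as every result type has value semantics plus a known dtype and sizes:

```python
from dataclasses import dataclass

@dataclass
class TensorType:
    has_value_semantics: bool  # !torch.vtensor rather than !torch.tensor
    has_dtype: bool            # element type is known, e.g. f32
    has_sizes: bool            # shape is known, e.g. [3,1]

def satisfies_backend_contract(result_types: list) -> bool:
    """A torch.operator is acceptable iff all of its results meet the
    backend contract: value semantics, dtype, and sizes."""
    return all(t.has_value_semantics and t.has_dtype and t.has_sizes
               for t in result_types)

# The clamp example's result, !torch.vtensor<[3,1],f32>: all three hold.
clamp_min_result = TensorType(True, True, True)
```

Under this model, the LTC-generated clamp op above is accepted, while a torch.operator returning an unranked or dtype-less tensor is rejected.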

But only reject torch.operator when results are invalid.
Otherwise it might be a custom op that the backend supports.
@mgehre-amd (Contributor, Author)

> The check should be modified to use the checkType helper to make sure the outputs of torch.operator satisfy the backend contract.

Thanks, that's it. I updated the PR and now it passes :-)

@ramiro050 (Collaborator) left a comment:

Great, thanks!

@mgehre-amd mgehre-amd merged commit aa5bcb3 into llvm:main Mar 20, 2023
@mgehre-amd mgehre-amd deleted the mgehre.unsupported_op branch March 21, 2023 12:27
gpetters94 pushed a commit to gpetters94/mlir-npcomp that referenced this pull request on May 10, 2023 (llvm#1947):

* LowerToBackendContract: Explicitly error out on unimplemented operator

But only reject torch.operator when results are invalid.
Otherwise it might be a custom op that the backend supports.