[RFC] Custom ops support #1462

Closed

silvasean opened this issue Oct 5, 2022 · 10 comments
@silvasean (Contributor)

Hey folks, custom ops have been a common feature request, and I'd like to share a design that we've been thinking about.

Let's talk first about the functionality we want to unlock for users. Generally speaking, custom ops are a way to "tunnel" information from the frontend to the backend. This can be something like a custom fusion, or something more advanced, like a representation of quantization or sparsity. Note that ops can also carry various types of information, such as !torch.int, !torch.string, etc. (which would generally need to be constrained to statically known values to be useful in the compiler). So an op like my.custom_quantize %foo, %quantization_info could actually carry a fairly elaborate representation of quantization/sparsity/etc. in the %quantization_info string (see the sketch below).
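For concreteness, here is a minimal, hypothetical sketch of how a frontend might define such an op in PyTorch so that it flows into the importer. The `my` namespace and op name are made up for illustration; torch.library is PyTorch's standard mechanism for registering new ops from Python.

import torch
from torch import Tensor

# Hypothetical namespace and op; the string operand is how extra
# information (e.g. quantization parameters) tunnels to the backend.
lib = torch.library.Library("my", "DEF")
lib.define("custom_quantize(Tensor self, str quantization_info) -> Tensor")

def custom_quantize_cpu(self: Tensor, quantization_info: str) -> Tensor:
    # Eager-mode reference behavior only; a backend would pattern-match
    # the op and interpret `quantization_info` itself.
    return self

lib.impl("custom_quantize", custom_quantize_cpu, "CPU")

def f(x: Tensor) -> Tensor:
    return torch.ops.my.custom_quantize(x, "scheme=int8,per_channel")

# The op appears in the TorchScript IR as my::custom_quantize, with the
# string available as a constant.
scripted = torch.jit.script(f)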

The key requirement here is the desire to support new ops without recompiling Torch-MLIR. We've had some initial efforts in the custom op direction, but they have always required recompiling Torch-MLIR. Put another way, the goal is to generalize the current way of handling ops so that there is no "hardcoded C++" for each op. By "support new ops" we really mean lowering to the backend contract (see architecture.md). Lowering to Linalg/etc. is generally out of scope of this effort, since a different set of technology is involved.

In our longer-term roadmap (see slide) of aligning with TorchDynamo and upstream PyTorch, we will directly import into the backend contract, so all of this work on custom ops will be obsoleted. But the indication is that this longer-term roadmap will take 1-2 years to fully pan out, while we want to unlock this user requirement in a shorter timeframe (ideally months).


When lowering to the backend contract, there are 4 main responsibilities (see architecture.md):

  • All tensors have been converted to value semantics.
  • All tensors have at least a known number of dimensions (i.e. rank), and ideally also have a precise size for each dimension.
  • All tensors have a known dtype.
  • Certain ops have been decomposed to make them easier to handle (this is configurable).

Let's go over these one by one:

  1. Conversion to value semantics. This is a very tricky conversion (see the MaximizeValueSemantics pass and its test). From a design perspective, I don't see any option other than to require that all custom ops be value-semantic. This seems like it will be enough for the use cases that the community has presented so far.
  2. Shape inference. We already do this with the Shape Library. The shape functions are written in a Python script that is then used to generate the shape functions as MLIR. Since the shape functions are represented as MLIR snippets, we can (conceptually) already load new shape functions from a .mlir file on the side without recompilation. So there is no big architectural change here.
  3. Dtype inference. This currently happens in a hardcoded C++ pass RefineTypes. This pass needs to be rewritten to not require recompiling to support new ops. We can follow the same design as the shape library -- reify, abstractly interpret, drop. This will be a total rewrite of this component.
  4. Decompositions. We currently have thousands of lines of decompositions in DecomposeComplexOps. We should be able to follow a similar approach to the shape library, and basically just replace aten.foo ... with call @aten.foo$decomposition .... PyTorch already maintains many such decompositions (here and here), so this will also allow unifying efforts with those upstream, which is a big win by itself.

The overall design that we are left with is that we need to generalize the shape library infrastructure to encompass dtype refinement rules and decompositions. The system could look something like this, which is a generalization of shape_lib_gen.py:

def aten〇tanh〡shape(self: List[int]) -> List[int]:
    return upstream_shape_functions.unary(self)

def aten〇tanh〡dtype(self_shape: List[int], self_dtype: int) -> int:
    return dtype_functions.unary(self_shape, self_dtype)

# This is just an example -- we would probably not want this decomposition
# for tanh. Decompositions would be optional, but if provided would be
# available for use.
def aten〇tanh〡decomposition(self: torch.Tensor) -> torch.Tensor:
    exp, expn = torch.exp(self), torch.exp(-self)
    return (exp - expn) / (exp + expn)
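As a quick sanity check (not part of the proposal itself), the example decomposition can be verified against torch.tanh in eager mode, since tanh(x) = (e^x - e^-x) / (e^x + e^-x):

import torch

def tanh_decomposition(self: torch.Tensor) -> torch.Tensor:
    exp, expn = torch.exp(self), torch.exp(-self)
    return (exp - expn) / (exp + expn)

x = torch.randn(16)
assert torch.allclose(tanh_decomposition(x), torch.tanh(x), atol=1e-6)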

We would process this and generate a .mlir file similar to the current shape library, which would be usable to service the goals listed above. When users want to support custom ops, they will write such functions and use tools provided by Torch-MLIR to import them into a .mlir file, which is then made available to the passes that need it.


The main work items here are:

  • Generalize shape library testing framework to work with dtype functions and decompositions
  • Migrate PyTorch’s type promotion logic (currently hardcoded in RefineTypes) to Python so that dtype functions can use it (see the sketch after this list)
  • Generalize (and potentially improve) the abstract interpretation code from the shape library to also work with dtype calculations and decompositions
  • Create an interface for users to load their custom ops: $ torch-mlir-opt -pass-pipeline='torchscript-module-to-torch-backend-pipeline{extra-library=my_custom_ops.mlir}'
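As an illustration of the type promotion work item, here is a minimal sketch of what a Python-level promotion helper could look like. The helper name is hypothetical; torch.promote_types is an existing PyTorch API exposing the promotion lattice that a port of the RefineTypes logic could mirror:

import torch

def promote_dtypes(lhs: torch.dtype, rhs: torch.dtype) -> torch.dtype:
    # Defer to PyTorch's own promotion lattice rather than re-hardcoding
    # the rules in C++.
    return torch.promote_types(lhs, rhs)

assert promote_dtypes(torch.int32, torch.float32) == torch.float32
assert promote_dtypes(torch.bool, torch.int64) == torch.int64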

Thoughts or feedback?

@powderluv (Collaborator)

I assume we will also add a Python wrapper to point to the custom ops from Python?

@ramiro050 (Collaborator)

> I assume we will also add a Python wrapper to point to the custom ops from Python?

Yeah, the Torch-MLIR Python API will have a way to specify a custom-ops file.
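Hypothetically, that could look something like this (the extra_library parameter name here is just an illustration, not a committed API, and the module reuses the made-up my.custom_quantize op from above):

import torch
import torch_mlir

class MyModule(torch.nn.Module):
    def forward(self, x):
        return torch.ops.my.custom_quantize(x, "scheme=int8,per_channel")

# Assumed parameter: point the compiler at the abstract interpretation
# library generated for the custom ops.
compiled = torch_mlir.compile(
    MyModule(), torch.ones(3, 4),
    output_type="torch",
    extra_library="my_custom_ops.mlir",
)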

ramiro050 added a commit that referenced this issue Feb 14, 2023
…1865)

The original design for the dtype functions outlined in
#1462 was unable to properly
handle ops that take optional tensors as an input when the optional
tensor has a value of None. By the time the op gets imported into
torch-mlir, if an optional value is None, all information about the
original type is lost from the op type signature, preventing
torch-mlir from knowing if a value of None was from an optional tensor
or not, which was crucial in the original design since each tensor
argument must be turned into two separate arguments for the dtype
function.

This commit changes the interface to dtype functions such that each
tensor turns into a tuple of two ints, the first representing the rank
of the tensor and the second the dtype of the tensor. Since now there
is a one-to-one correspondence between the operands of an op and the
operands of its dtype function, there is no ambiguity about which
operand of the op corresponds with which operand of the dtype
function.

To test the implementation, this commit defines dtype functions for
the convolution ops, all of which take one optional tensor as an
argument.
@makslevental (Collaborator)

> Create an interface for users to load their custom ops: $ torch-mlir-opt -pass-pipeline='torchscript-module-to-torch-backend-pipeline{extra-library=my_custom_ops.mlir}'

Another way to implement this is to allow users to provide these shape and dtype functions in the same module as their func.func @forward(.... I show a minimal implementation in #1959.

@ramiro050 (Collaborator)

> Another way to implement this is to allow users to provide these shape and dtype functions in the same module as their func.func @forward(.... I show a minimal implementation in #1959.

I have a couple of questions. The first is about the user experience: how would the interface work for users to add their dtype functions to their module? Ideally we want to avoid users having to directly modify an MLIR module at the Python level to attach more MLIR.

The second question is about how things are expected to work internally in torch-mlir. Carrying the shape+dtype functions around in the module means that every pass in torch-mlir will also spend time applying to them, which would increase lowering time (it might not be a concern in practice, but it's something to think about). The second issue I foresee is that shape+dtype functions are not required to satisfy the backend contract; carrying them in the module would mean special-casing the LowerToBackendContract pass to not consider those functions. Lastly, carrying around many shape+dtype functions would make reading MLIR dumps harder, since passes that have nothing to do with type refinement would also dump them, worsening the debugging experience.

What downsides of the currently proposed design are you trying to solve with this new design?

@makslevental (Collaborator) commented Mar 22, 2023

> What downsides of the currently proposed design are you trying to solve with this new design?

There is no current design - there is only an idea for an API.

> Create an interface for users to load their custom ops: $ torch-mlir-opt -pass-pipeline='torchscript-module-to-torch-backend-pipeline{extra-library=my_custom_ops.mlir}'

I can certainly envisage how that would be implemented, but it's not here yet.

Next:

> Ideally we want to avoid users having to directly modify an MLIR module at the Python level to attach more MLIR

There's some premise/assumption implicit here that I don't understand/agree with. There are simultaneous mentions of the Python API:

> Torch-MLIR's only supported "stable" interface is the torch_mlir.compile

and the proposal here (re: going through torch-mlir-opt). So which is it? Is torch-mlir Python-first or CLI-first? Regardless, my answer is: we have IR so that you can pass textual source between passes (instead of requiring transformation of one single AST), i.e., to decouple passes for exactly this kind of thing: run a pass, fiddle, run another pass, etc. Whether that fiddling happens through the Python bindings or by handwriting a my_custom_ops.mlir is irrelevant (how would people create this my_custom_ops.mlir if not through Python anyway...). Indeed, for custom ops (wrt shape/dtype), my PR would be moot if the implementation of this API existed today, because I would just generate those shape functions using the Python bindings and print(module), and then pass that to extra-library.

Next:

> The second issue I foresee is that shape+dtype functions are not required to satisfy the backend contract.

But they are, through TensorStaticInfoCastOp; because even though torch.operator carries a non-valsem type (and thus doesn't trip LowerToBackendContract.cpp#L91), tensor_static_info_casts are generated around it, and those thwart passing the backend contract.

Finally:

> Carrying the shape+dtype functions around in the module means that every pass in torch-mlir will also spend time applying to them, which would increase lowering time (it might not be a concern in practice, but it's something to think about)... Lastly, carrying around many shape+dtype functions would make reading MLIR dumps harder...

Sure, but we should weigh the costs/benefits of increased compile times and decreased legibility against today's users' needs wrt lowering models that include currently unsupported ops.

silvasean mentioned this issue Mar 22, 2023
@silvasean (Contributor, Author)

Hey folks, it looks like the main stakeholders here are happy with pushing this feature through as we have specced here. I created another issue to help shepherd this work to completion at a detailed work level. Let's move further discussion there: #1963

ramiro050 added a commit that referenced this issue Mar 23, 2023
…1965)

(Same commit message as the Feb 14 commit above, except that here the implementation is tested by defining a dtype function for the convolution op, which takes one optional tensor as an argument.)
ramiro050 added a commit to ramiro050/torch-mlir that referenced this issue May 9, 2023
commit bafe339
Author: Ramiro Leal-Cavazos <[email protected]>
Date:   Mon May 8 21:26:56 2023 +0000

    Add dtype functions for aten.atan and prims.squeeze

commit bebf695
Author: Ramiro Leal-Cavazos <[email protected]>
Date:   Mon May 8 21:26:10 2023 +0000

    Remove duplicate code from merge with main

commit 0d11895
Author: Ramiro Leal-Cavazos <[email protected]>
Date:   Fri May 5 21:39:02 2023 +0000

    Update LLVM tag

commit 73d5c07
Merge: 899d8bc eaaaeb6
Author: Ramiro Leal-Cavazos <[email protected]>
Date:   Fri May 5 21:30:09 2023 +0000

    Merge remote-tracking branch 'upstream/main' into merge-main

commit 899d8bc
Author: Ramiro Leal-Cavazos <[email protected]>
Date:   Mon Mar 13 21:39:14 2023 +0000

    Add dtype functions for `aten.ge.Tensor` and `aten.le.Tensor`

commit f58f9c2
Merge: ce7abf4 4912c39
Author: Ramiro Leal-Cavazos <[email protected]>
Date:   Mon Mar 13 21:32:00 2023 +0000

    Merge branch 'main' into merge-main

commit ce7abf4
Author: Jiahao Li <[email protected]>
Date:   Wed Feb 22 06:54:41 2023 +0800

    Add dtype functions for ops that take dtype from 2nd operand (llvm#1891)

commit 63945a2
Author: Ramiro Leal-Cavazos <[email protected]>
Date:   Mon Feb 13 17:56:09 2023 -0800

    Change dtype functions interface to take ints tuple for each tensor (llvm#1865)

    The original design for the dtype functions outlined in
    llvm#1462 was unable to properly
    handle ops that take optional tensors as an input when the optional
    tensor has a value of None. By the time the op gets imported into
    torch-mlir, if an optional value is None, all information about the
    original type is lost from the op type signature, preventing
    torch-mlir from knowing if a value of None was from an optional tensor
    or not, which was crucial in the original design since each tensor
    argument must be turned into two separate arguments for the dtype
    function.

    This commit changes the interface to dtype functions such that each
    tensor turns into a tuple of two ints, the first representing the rank
    of the tensor and the second the dtype of the tensor. Since now there
    is a one-to-one correspondence between the operands of an op and the
    operands of its dtype function, there is no ambiguity about which
    operand of the op corresponds with which operand of the dtype
    function.

    To test the implementation, this commit defines dtype functions for
    the convolution ops, all of which take one optional tensor as an
    argument.

commit 981ac88
Author: Ramiro Leal-Cavazos <[email protected]>
Date:   Wed Feb 1 22:30:27 2023 +0000

    Add dtype functions for two tensor promotion ops (llvm#1831)

    This commit adds dtype functions for ops in RefineTypes under the
    category of "Promote the two dtypes". The only ops not added here are
    convolution ops, since they take an optional tensor argument, and the
    dtype pipeline currently does not correctly handle that case. I will
    add a follow up patch fixing this.

    This commit also adds two helper functions that perform a very
    thorough testing of dtype functions. The helper function
    `_check_two_tensor_op` is able to independently test invalid input
    dtypes and invalid output dtypes.

    Lastly, this commit also XFAILs "MobilenetV3Module_basic".

commit 83d4e89
Author: Jiahao Li <[email protected]>
Date:   Sat Jan 21 02:39:41 2023 +0800

    Add dtype functions for floating point ops (llvm#1813)

commit 8cae5ba
Author: Ramiro Leal-Cavazos <[email protected]>
Date:   Mon Jan 16 14:32:23 2023 -0800

    Add dtype functions for comparison ops (llvm#1806)

    This commit adds dtype functions for comparison ops that always return
    a tensor of dtype `i1`.

commit 5b77c15
Author: Ramiro Leal-Cavazos <[email protected]>
Date:   Mon Jan 16 20:27:49 2023 +0000

    Add CI to `dtype-functions-staging` branch

commit ac94ba2
Author: Ramiro Leal-Cavazos <[email protected]>
Date:   Thu Jan 12 22:41:04 2023 +0000

    Move dtype functions into their own section in lib gen file

    In order to easily keep track of the dtype functions that have been
    moved to `abstract_interp_lib_gen.py` and make it easier to add new
    ones, this commit groups all the dtype functions together, rather than
    having them interspersed between the shape functions.
gpetters94 pushed a commit to gpetters94/mlir-npcomp that referenced this issue May 10, 2023 (…lvm#1965)
@igozali commented Jun 28, 2023

Apologies for the very silly question, but I'm wondering where I can find the rationale/RFC/discussion for using the unicode character 〇 in the function names above?

@powderluv (Collaborator)

It's a cust〇m 〇p :)

@powderluv (Collaborator)

And I think this can be closed now that it is implemented.

@ramiro050 (Collaborator)

@igozali, there is some information about why we use the character here:

> Note the use of `〇` as a separator since `.` or `::` aren't legal in a Python identifier.

and the rationale for using the entire operator name for each function here.

In short, we wanted to keep the full operator name with namespace information in a way that would then be easily parsable by the Python script that generates the MLIR snippets, so that in MLIR it is easy to know which abstract interpretation function goes with which op. If we had resorted to using normal characters (letters and _), there would be several ambiguous cases, because some ops have _ in their names (see the sketch below).
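For illustration, a small sketch (not the actual torch-mlir script) of why an unambiguous separator makes recovering the op name trivial:

def parse_abstract_interp_name(fn_name: str):
    # "aten〇tanh〡shape" -> ("aten.tanh", "shape"); neither 〇 nor 〡 can
    # appear in an op name, so the split is unambiguous even for ops
    # whose names contain `_` or `.`.
    qualified_name, kind = fn_name.split("〡")
    return qualified_name.replace("〇", "."), kind

assert parse_abstract_interp_name("aten〇tanh〡shape") == ("aten.tanh", "shape")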
