Decompose torch.slice_scatter #1622

Merged
merged 5 commits into from
Nov 23, 2022

tanyokwok
Collaborator

No description provided.

Contributor

@silvasean silvasean left a comment

You can delete this code as well:

class ConvertAtenSliceScatterOp

Also, do you need to update XFAIL set for MHLO?

lib/Dialect/Torch/Transforms/DecomposeComplexOps.cpp Outdated
@tanyokwok tanyokwok force-pushed the tanyo/slice_scatter branch 2 times, most recently from 1b9cde3 to 3532359 Compare November 22, 2022 10:17
@tanyokwok
Collaborator Author

tanyokwok commented Nov 22, 2022

> You can delete this code as well:

The linalg_on_tensors backend core dumps while running the related e2e tests if I delete it from torch-mlir/lib/Conversion/TorchToLinalg/DataMovement.cpp.

> Also, do you need to update XFAIL set for MHLO?

No XFAIL entries will be updated, because some additional torch ops still can't be lowered correctly.

@ramiro050
Collaborator

> The linalg_on_tensors backend core dumps while running the related e2e tests if I delete it from torch-mlir/lib/Conversion/TorchToLinalg/DataMovement.cpp.

That's very strange. Can you open an issue with the backtrace generated? I can take a look

@tanyokwok
Collaborator Author

tanyokwok commented Nov 23, 2022

> Can you open an issue with the backtrace generated?

@ramiro050 I can't get a more detailed stack trace, but you can reproduce the crash with the branch tanyo/slice_scatter_stage.

The crash stack:

Running SliceScatterModule_basic...

Thread 1 "python" hit Catchpoint 1 (signal SIGABRT), __pthread_kill_implementation (no_tid=0, signo=6, threadid=139922885132736) at ./nptl/pthread_kill.c:44
44      ./nptl/pthread_kill.c: No such file or directory.
(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=139922885132736) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=139922885132736) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=139922885132736, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007f4255e17476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007f4255dfd7f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007f42558a642a in forward ()
#6  0x00007f425589fe2e in ?? () from /lib/x86_64-linux-gnu/libffi.so.8
#7  0x00007f425589c493 in ?? () from /lib/x86_64-linux-gnu/libffi.so.8
#8  0x00007f4255afb451 in ?? () from /usr/lib/python3.10/lib-dynload/_ctypes.cpython-310-x86_64-linux-gnu.so
#9  0x00007f4255b04ce2 in ?? () from /usr/lib/python3.10/lib-dynload/_ctypes.cpython-310-x86_64-linux-gnu.so
#10 0x0000555efe90730b in _PyObject_MakeTpCall ()
#11 0x0000555efe8ff610 in _PyEval_EvalFrameDefault ()
#12 0x0000555efe91e80e in ?? ()
#13 0x0000555efe8fb8c4 in _PyEval_EvalFrameDefault ()
#14 0x0000555efe910fbc in _PyFunction_Vectorcall ()
#15 0x0000555efe8fb8c4 in _PyEval_EvalFrameDefault ()
#16 0x0000555efe910fbc in _PyFunction_Vectorcall ()
#17 0x0000555efe8f95c9 in _PyEval_EvalFrameDefault ()
#18 0x0000555efe910fbc in _PyFunction_Vectorcall ()
#19 0x0000555efe8f9483 in _PyEval_EvalFrameDefault ()
#20 0x0000555efe910fbc in _PyFunction_Vectorcall ()
#21 0x0000555efe8f9483 in _PyEval_EvalFrameDefault ()
#22 0x0000555efe910fbc in _PyFunction_Vectorcall ()
#23 0x0000555efe8f9483 in _PyEval_EvalFrameDefault ()
#24 0x0000555efe910fbc in _PyFunction_Vectorcall ()
#25 0x0000555efe8f9483 in _PyEval_EvalFrameDefault ()
#26 0x0000555efe8f5cc6 in ?? ()
#27 0x0000555efe9eaeb6 in PyEval_EvalCode ()
#28 0x0000555efe9f04bd in ?? ()
#29 0x0000555efe911219 in ?? ()
#30 0x0000555efe8f9483 in _PyEval_EvalFrameDefault ()
#31 0x0000555efe910fbc in _PyFunction_Vectorcall ()
#32 0x0000555efe8f9483 in _PyEval_EvalFrameDefault ()
#33 0x0000555efe910fbc in _PyFunction_Vectorcall ()
#34 0x0000555efea08f59 in ?? ()
#35 0x0000555efea079d8 in Py_RunMain ()
#36 0x0000555efe9dde6d in Py_BytesMain ()
#37 0x00007f4255dfed90 in __libc_start_call_main (main=main@entry=0x555efe9dde30, argc=argc@entry=8, argv=argv@entry=0x7fff9c321f58) at ../sysdeps/nptl/libc_start_call_main.h:58
#38 0x00007f4255dfee40 in __libc_start_main_impl (main=0x555efe9dde30, argc=8, argv=0x7fff9c321f58, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fff9c321f48)
    at ../csu/libc-start.c:392
#39 0x0000555efe9ddd65 in _start ()

@tanyokwok tanyokwok merged commit f3f2f10 into main Nov 23, 2022
@tanyokwok tanyokwok deleted the tanyo/slice_scatter branch November 23, 2022 10:14
tanyokwok pushed a commit to pai-disc/torch-mlir that referenced this pull request Nov 24, 2022
* Decompose torch.slice_scatter

* fix compilation error

* update file check

* fix ci

* fix i64 torch.tensor dtype
@ramiro050
Collaborator

> No XFAIL entries will be updated, because some additional torch ops still can't be lowered correctly.

@tanyokwok, there is currently no e2e test that checks that this decomposition is correct. Every time support is added for an op (especially as a decomposition, since it affects every backend), the PR should contain passing e2e tests. Can you revert this commit and add it back once it is passing on at least one of the three backends?
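
As an illustration of the kind of check such a test performs, here is a minimal pure-Python sketch. It is not the decomposition from this PR and not torch-mlir code. It assumes a 1-D list, non-negative start and end, and a positive step, and both function names are hypothetical. The first function states the slice_scatter semantics directly; the second rebuilds the same result from an index range plus element-wise writes, and the two are compared.

```python
def slice_scatter_reference(base, src, start, end, step):
    """Reference semantics: copy `base`, then overwrite base[start:end:step] with `src`."""
    out = list(base)
    # Guard explicitly: a step-1 slice assignment on a list would otherwise resize it.
    if len(src) != len(out[start:end:step]):
        raise ValueError("src length does not match the slice length")
    out[start:end:step] = src
    return out


def slice_scatter_decomposed(base, src, start, end, step):
    """Same result from simpler steps: build the index range, then write element by element."""
    indices = range(start, min(end, len(base)), step)  # clamp `end` like slicing does
    if len(src) != len(indices):
        raise ValueError("src length does not match the slice length")
    out = list(base)
    for i, value in zip(indices, src):
        out[i] = value
    return out


# Hypothetical cases: strided slice, full overwrite, and an `end` past the list length.
cases = [
    ([0] * 8, [1, 2, 3], 1, 7, 2),
    ([0] * 8, [1, 2, 3, 4, 5, 6, 7, 8], 0, 8, 1),
    ([0] * 6, [9, 9], 4, 100, 1),
]
for base, src, start, end, step in cases:
    expected = slice_scatter_reference(base, src, start, end, step)
    actual = slice_scatter_decomposed(base, src, start, end, step)
    assert actual == expected, (base, src, start, end, step)
```

A real e2e test would compare the compiled module against PyTorch on tensors rather than lists; the sketch only shows the reference-versus-rewrite comparison.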

@tanyokwok
Collaborator Author

tanyokwok commented Nov 29, 2022

@ramiro050 The decomposition passed all the SliceScatter e2e tests in eager_mode.

I think that is sufficient to show the decomposition is correct in those cases. The EagerModeTestConfig maps each input torch.Tensor to a TorchMLIRTensor, which is defined in:
https://github.com/llvm/torch-mlir/blob/main/python/torch_mlir/eager_mode/torch_mlir_tensor.py#L51.

TorchMLIRTensor uses __torch_dispatch__ to run the TorchMLIR-compiled module, which is executed by torch_mlir_e2e_test.eager_backends.refbackend (where the decompositions happen).

@ramiro050
Collaborator

> @ramiro050 The decomposition passed all the SliceScatter e2e tests in eager_mode.
>
> I think that is sufficient to show the decomposition is correct in those cases. The EagerModeTestConfig maps each input torch.Tensor to a TorchMLIRTensor, which is defined in: https://github.com/llvm/torch-mlir/blob/main/python/torch_mlir/eager_mode/torch_mlir_tensor.py#L51.
>
> TorchMLIRTensor uses __torch_dispatch__ to run the TorchMLIR-compiled module, which is executed by torch_mlir_e2e_test.eager_backends.refbackend (where the decompositions happen).

The eager mode tests have two limitations that make it difficult to rely on them for correctness. The first is that they only use static shapes in the IR, so dynamic shape support in implementations is not tested. The second is that if the torch-mlir compilation fails, eager mode falls back to executing on conventional PyTorch, so your decomposition could be failing and the tests would still pass. You can try this out by having your decomposition return failure() at the start and running the tests.

Proper testing of a decomposition should involve having passing tests with dynamic shapes on at least one of the three backends: linalg, tosa, mhlo.
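
To make the second limitation concrete, here is a minimal pure-Python sketch of how a silent fallback can hide a failing compiled path. It does not use torch-mlir or PyTorch; `run_with_fallback`, `broken_compiled_add`, and `eager_add` are hypothetical stand-ins for the eager-mode harness, a failing compilation, and conventional PyTorch execution.

```python
def run_with_fallback(compiled_fn, eager_fn, *args):
    """Try the compiled path; on any failure, silently use the eager path instead."""
    try:
        return compiled_fn(*args), "compiled"
    except Exception:
        return eager_fn(*args), "fallback"


def eager_add(a, b):
    # Stand-in for the conventional (always working) execution path.
    return a + b


def broken_compiled_add(a, b):
    # Stand-in for a decomposition or lowering that fails outright.
    raise RuntimeError("lowering failed")


result, path = run_with_fallback(broken_compiled_add, eager_add, 2, 3)

# A test that only checks the value passes even though the compiled path never ran.
assert result == 5
print(result, path)  # prints "5 fallback"
```

A test that also asserted `path == "compiled"` would catch the failure, which is roughly what running on a backend without a fallback achieves.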

tanyokwok pushed a commit that referenced this pull request Nov 30, 2022
@tanyokwok
Collaborator Author

@ramiro050 Thanks for the reminder. The revert was created in #1659.

tanyokwok pushed a commit that referenced this pull request Nov 30, 2022
tanyokwok pushed a commit to pai-disc/torch-mlir that referenced this pull request Dec 5, 2022
zzpmiracle pushed a commit to pai-disc/torch-mlir that referenced this pull request Dec 29, 2022
tanyokwok pushed a commit to pai-disc/torch-mlir that referenced this pull request Feb 2, 2023
JamesTheZ pushed a commit to pai-disc/torch-mlir that referenced this pull request Jul 19, 2023
* Fix float width
* Fix divide_floor & export promoteTypes api (#9)
* To comply with the old pytorch versions
* Add native_dropout_backward & native_layer_norm_backward decomposition (#15)
* Add native_dropout and related ops pattern (llvm#1211)
* [MHLO] fix dot general contract
* Fix batch_norm, div.Tensor_mode and folder (#21)
* Reimplement linear lowering
* Reimplement 2-D rhs for mutmul
* Add torchdynamo
* Decompose torch.slice_scatter (llvm#1622)
* Fix i64 torch.tensor dtype
* Add more mhlo basic converters
* Alleviate softmax datatype check (#24)
* Fix decompose native_batch_norm (#27)
* Support group_norm lowering (#25)
* Decompose torch.ones/zeros (#28)
* Fix softmax output type
* Fix gather
* Fix some decompose patterns
* Not check assert at runtime (#31)
* Fix bool tensor attr conversion bug (#32)
* Fix mlirDenseElementsAttrBoolGet
JamesTheZ added a commit to pai-disc/torch-mlir that referenced this pull request Jul 19, 2023

Co-Authored-By: ZHENG, Zhen <[email protected]>
JamesTheZ added a commit to pai-disc/torch-mlir that referenced this pull request Jul 25, 2023
* Rewrite mhlo with stablehlo after rebase.
* Fix BAZEL building error of multiple definition.
* Fix float width
* Fix divide_floor & export promoteTypes api (#9)
* To comply with the old pytorch versions
* Add native_dropout_backward & native_layer_norm_backward decomposition (#15)
* Add native_dropout and related ops pattern (llvm#1211)
* [MHLO] fix dot general contract
* Fix batch_norm, div.Tensor_mode and folder (#21)
* Reimplement linear lowering
* Reimplement 2-D rhs for mutmul
* Add torchdynamo
* Decompose torch.slice_scatter (llvm#1622)
* Fix i64 torch.tensor dtype
* Add more mhlo basic converters
* Alleviate softmax datatype check (#24)
* Fix decompose native_batch_norm (#27)
* Support group_norm lowering (#25)
* Decompose torch.ones/zeros (#28)
* Fix softmax output type
* Fix gather
* Fix some decompose patterns
* Not check assert at runtime (#31)
* Fix bool tensor attr conversion bug (#32)
* Fix mlirDenseElementsAttrBoolGet

---------

Co-authored-by: ZHENG, Zhen <[email protected]>
JamesTheZ added a commit to pai-disc/torch-mlir that referenced this pull request Jul 27, 2023
* Rewrite mhlo with stablehlo after rebase.
* Fix BAZEL building error of multiple definition.
* Fix float width
* Fix divide_floor & export promoteTypes api (#9)
* To comply with the old pytorch versions
* Add native_dropout_backward & native_layer_norm_backward decomposition (#15)
* Add native_dropout and related ops pattern (llvm#1211)
* [MHLO] fix dot general contract
* Fix batch_norm, div.Tensor_mode and folder (#21)
* Reimplement linear lowering
* Reimplement 2-D rhs for mutmul
* Add torchdynamo
* Decompose torch.slice_scatter (llvm#1622)
* Fix i64 torch.tensor dtype
* Add more mhlo basic converters
* Alleviate softmax datatype check (#24)
* Fix decompose native_batch_norm (#27)
* Support group_norm lowering (#25)
* Decompose torch.ones/zeros (#28)
* Fix softmax output type
* Fix gather
* Fix some decompose patterns
* Not check assert at runtime (#31)
* Fix bool tensor attr conversion bug (#32)
* Fix mlirDenseElementsAttrBoolGet