[IR] Function definition/function call support in Taichi IR #602
Comments
@yuanming-hu Should we have distinct FrontendFuncBodyStmt and FuncBodyStmt? |
I think so. We will need Before starting the implementation, we should thoroughly discuss the implementation plan. Introducing functions to Taichi IR is a lot of work and needs careful consideration. |
Great! I'm adding FuncBodyStmt, with |
This basically makes sense to me! Before we introduce the new IR, as you mentioned (#612 (comment)), we'd better consider where to store the definitions of the functions. Currently, the Taichi IR only has a kernel body (which is a
Pre-mature thoughts
I suggest we create a
New IR to be introduced:
(after AST lowering, )
Future steps:
More considerations:
|
You may also want
Currently I'm only doing FuncCallStmt, so return values aren't worrying me yet. |
That makes sense - let me think about it. (Taking a bath before it's too late. Will be back to this in 30 min) |
You also want plan A: plan B: plan C: |
Back. What does |
I feel like this needs a lot of changes to be made and therefore its implementation plan must be considered very carefully. Taichi IR is the core of this project and I believe introducing function support to it is worth the time. I suggest we postpone implementing this until things are mature. Maybe it's also worth finishing #583 and #601 first. |
Potentially source_code for GL backend.. |
I see. I guess probably it's easier to re-emit the source code during code-gen instead of saving it? I have to sleep now. Let's try to finalize the two PRs mentioned above first - Good night! |
I've been following some work on IRs/lambda calculi that support automatic differentiation as a first-class concept (e.g. Relay IR in TVM). This makes me wonder if we should consider the IR work at a (potentially much) larger scope.
But I'm not sure if my concerns here are valid. @yuanming-hu to comment on this. A few references that I've found so far (I haven't understood them completely, nor are they close to being exhaustive): |
I almost forgot the need for supporting autodiff here. Thanks for reminding me of this. Autodiffing functions wouldn't be too hard though. Given a function that takes
The kernel simplicity rule happens because we currently do not support auto diff for-loops with mutable local variables that are carried through iterations. For example:

    p = 1
    for i in range(10):
        p += p
    ret[None] = p

#581 added support for autodiff mutable variables within
This can be a completely new research direction :-) Thanks for the references. They seem very interesting. I did read part of these when I was designing DiffTaichi. I'll take a deeper look if necessary. |
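To make the issue with loop-carried variables concrete, here is a minimal pure-Python sketch, not Taichi's actual autodiff pass. It assumes a hypothetical loop body `p = sin(p)` instead of `p += p`, because the derivative of `sin` depends on the value that gets overwritten (for `p += p` the local derivative is just the constant 2). All function names are illustrative. The forward pass saves each overwritten value on a stack, and the backward pass pops them in reverse order to apply the chain rule.

```python
import math


def forward_with_stack(p0, n_iters):
    """Run the loop, saving each value of p before it is overwritten."""
    stack = []
    p = p0
    for _ in range(n_iters):
        stack.append(p)  # value the backward pass will need later
        p = math.sin(p)  # mutable local variable carried across iterations
    return p, stack


def backward_with_stack(stack, grad_out=1.0):
    """Walk the loop in reverse, popping saved values to apply the chain rule."""
    grad = grad_out
    while stack:
        p_prev = stack.pop()
        grad *= math.cos(p_prev)  # d sin(p)/dp evaluated at the saved value
    return grad


def grad_of_loop(p0, n_iters=10):
    """Derivative of the final p with respect to the initial p0."""
    _, stack = forward_with_stack(p0, n_iters)
    return backward_with_stack(stack)
```

The stack grows with the number of iterations, which is the memory cost of reverse mode for loops whose carried state is needed by the backward pass.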
Relay dev who designed and implemented the AD algorithm here. Derivatives of arbitrary order are trivial as long as the output of AD can be fed back as the input of AD. |
Thanks! Could you also briefly explain the purpose of I guess Taichi doesn't really need to support closures (in the short term). On the other hand, closure at Taichi level is not the same thing as that at the IR level. Would this support be a requirement for the transformation operator to support function defs or higher-order gradients? That is, maybe the operator would produce closures during the IR transformation? |
Welcome! :-) I agree it is trivial as long as our This means stack instructions will be inserted into the IR after autodiff. However, currently
This is interesting - are you talking about evaluating the contraction of the For n=2 (Hessian vector prod as you mentioned, which I believe is the most common case), I think we can do either forward + reverse or reverse + forward.
Right, I think for some problems Taichi is trying to solve, we won't even have the space to store the Hessian matrix with Thanks for the discussions here. I think one possibility is to implement a forward-mode AD pass so that we can do Hessian-vector products, which will enable the use of second-order optimizers. |
The biggest difference between Taichi and Relay is that they are not even on the same level. Relay does not care about kernels: it treats every kernel as a black box and relies on TVM to optimize them. So Relay supports differentiating mutable variables and branches, just like PyTorch. However, differentiating TVM kernels is hard. There is some PR work on it, but right now we just do it by hand. If you want to support more kernels, I highly recommend reading the relevant chapter in Evaluating Derivatives. It covers way more cases (with more optimizations) than the Zygote paper. For the Hessian-vector product, reverse then forward makes more sense, since a function takes multiple inputs and has a single output, which makes it amenable to reverse mode. Higher order is essentially a Hessian-vector product too, but in the forward-mode pass, instead of the classical dual number, the right-hand side is a Taylor series truncated to the nth order (the classical dual number is just the right-hand side truncated to first order).
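As a rough illustration of the "reverse then forward" Hessian-vector product discussed above, here is a small pure-Python sketch. It is an assumption-laden toy, not Taichi or Relay code: the `Dual` class is a hypothetical first-order dual number, and `grad_f` is a hand-written gradient of the example function f(x, y) = x²y + y³ that stands in for what a reverse-mode pass would produce. Pushing dual numbers seeded with the vector v through the gradient gives H·v without ever forming the Hessian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """First-order dual number: a value plus a tangent (directional derivative)."""
    val: float
    tan: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.tan + other.tan)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule for the tangent part.
        return Dual(self.val * other.val,
                    self.val * other.tan + self.tan * other.val)

    __rmul__ = __mul__


def _lift(x):
    """Promote plain numbers to constants with zero tangent."""
    return x if isinstance(x, Dual) else Dual(float(x))


def grad_f(x, y):
    """Gradient of f(x, y) = x*x*y + y*y*y (stand-in for reverse-mode output)."""
    return (2 * x * y, x * x + 3 * y * y)


def hvp(x, y, vx, vy):
    """Hessian-vector product H(x, y) @ (vx, vy) via forward mode over grad_f."""
    gx, gy = grad_f(Dual(x, vx), Dual(y, vy))
    return gx.tan, gy.tan
```

For this f the Hessian is [[2y, 2x], [2x, 6y]], so `hvp(1.0, 2.0, 1.0, 0.0)` picks out its first column at the point (1, 2). Replacing the first-order tangent with a truncated Taylor series would give the higher-order variant mentioned in the comment.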
Issue: #602
### Brief Summary
We only need to disassemble the arguments and pass them to the real function one by one, just as we pass matrix arguments to the kernel. Local matrices and elements of matrix fields are not yet supported as arguments when real_matrix=True.
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
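The "disassemble the arguments and pass them one by one" idea can be pictured with a short Python sketch. This is only an illustration under assumptions, not the code from the commit: the helper names are hypothetical, matrices are modeled as nested lists, and row-major ordering is assumed.

```python
def flatten_matrix_arg(mat):
    """Disassemble a nested-list matrix into a flat, row-major list of scalars."""
    return [entry for row in mat for entry in row]


def call_with_flattened_args(scalar_func, *matrices):
    """Call a scalar-only function with every matrix argument disassembled."""
    flat = []
    for m in matrices:
        flat.extend(flatten_matrix_arg(m))
    return scalar_func(*flat)


def trace_2x2(a00, a01, a10, a11):
    """Stand-in for a compiled function that only accepts scalar parameters."""
    return a00 + a11
```

With these helpers, `call_with_flattened_args(trace_2x2, [[1, 2], [3, 4]])` passes the four entries as four separate scalar parameters.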
Issue: #602 ### Brief Summary Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Issue: #602 #6590
### Brief Summary
Only supports scalar struct (every element in the struct is a scalar) for now. This PR does the following things:
1. Let `FuncCallStmt` return the `real_func_ret_struct *` result buffer instead of returning the return value directly.
2. Add `GetElementStmt` and `GetElementExpression` to get the i-th return value in a result buffer.
3. Add `StructType.from_real_func_ret` to construct the returned struct as a `StructType` in Python.

Will add support for nested struct and matrix in struct in the following PRs.
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
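The three steps above can be sketched as a toy Python model. Everything here is hypothetical and only mirrors the shape of the design: `ResultBuffer`, `func_call`, `get_element`, and `struct_from_ret` are illustrative stand-ins for the result buffer, `FuncCallStmt`, `GetElementStmt`, and `StructType.from_real_func_ret`, and do not reproduce their real signatures.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class ResultBuffer:
    """Flat buffer holding the scalar return values of one call."""
    slots: List[float]


def func_call(func: Callable, args: Sequence[float]) -> ResultBuffer:
    """Call func and hand back a result buffer rather than the values themselves."""
    return ResultBuffer(list(func(*args)))


def get_element(buf: ResultBuffer, index: int) -> float:
    """Read the i-th return value out of a result buffer."""
    return buf.slots[index]


def struct_from_ret(field_names: Sequence[str], buf: ResultBuffer) -> Dict[str, float]:
    """Rebuild a scalar-only struct (modeled as a dict) from the buffer."""
    return {name: get_element(buf, i) for i, name in enumerate(field_names)}
```

Because the call only returns a buffer, each return value becomes an explicit indexed read, which is what makes the scalar-only restriction natural: a nested struct or matrix would need a multi-level index instead of a single integer.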
#6734) Issue: #602 #6590 Also fixed the bug that scalarize pass is not run on real functions thanks to @jim19930609. ### Brief Summary Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…name (taichi-dev#6495) Issue: taichi-dev#602 ### Brief Summary Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…6522) Issue: taichi-dev#602 ### Brief Summary Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Issue: #602

Assuming `FuncCallStmt` can load or modify any address. I will write a pass to find out all the addresses loaded and modified in a `FuncCallStmt` later.

### 🤖 Generated by Copilot at cf4766c

### Summary
📞🔧🧪

This pull request enhances the control flow graph analysis and optimization to support function calls in taichi kernels. It also improves the simplification of basic blocks and tests some experimental features of taichi. The main files affected are `taichi/ir/control_flow_graph.cpp`, `taichi/transforms/simplify.cpp`, and `tests/python/test_function.py`.

> _We unleash the power of `function calls`_
> _We optimize the `control flow graph`_
> _We test the limits of `taichi`'s core_
> _We simplify the `basic blocks` of wrath_

### Walkthrough
* Add support for function calls in control flow graph analysis and optimization ([link](https://github.com/taichi-dev/taichi/pull/8139/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fR220-R222), [link](https://github.com/taichi-dev/taichi/pull/8139/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fL979-R983))
* Prevent simplifying function call statements that may have side effects or modify global variables ([link](https://github.com/taichi-dev/taichi/pull/8139/files?diff=unified&w=0#diff-58b0ebe6a129091d8ae4753ba2bba80c7cc000e7f8eab635a337094582f543edL98-L100), [link](https://github.com/taichi-dev/taichi/pull/8139/files?diff=unified&w=0#diff-58b0ebe6a129091d8ae4753ba2bba80c7cc000e7f8eab635a337094582f543edR107-R108))
* Test the experimental template feature that allows passing arguments to functions as template parameters ([link](https://github.com/taichi-dev/taichi/pull/8139/files?diff=unified&w=0#diff-d6f5c74e17462c8ff96d5bba06ebf81d16c015ca667fa945513a80be17bef017L158-R166), [link](https://github.com/taichi-dev/taichi/pull/8139/files?diff=unified&w=0#diff-d6f5c74e17462c8ff96d5bba06ebf81d16c015ca667fa945513a80be17bef017L175-R177))
* Test the support for assertions inside kernels ([link](https://github.com/taichi-dev/taichi/pull/8139/files?diff=unified&w=0#diff-d6f5c74e17462c8ff96d5bba06ebf81d16c015ca667fa945513a80be17bef017L327-L329), [link](https://github.com/taichi-dev/taichi/pull/8139/files?diff=unified&w=0#diff-d6f5c74e17462c8ff96d5bba06ebf81d16c015ca667fa945513a80be17bef017L335-R344), [link](https://github.com/taichi-dev/taichi/pull/8139/files?diff=unified&w=0#diff-d6f5c74e17462c8ff96d5bba06ebf81d16c015ca667fa945513a80be17bef017L361-R362))
* Remove a redundant check for function call statements in `BasicBlockSimplify` ([link](https://github.com/taichi-dev/taichi/pull/8139/files?diff=unified&w=0#diff-58b0ebe6a129091d8ae4753ba2bba80c7cc000e7f8eab635a337094582f543edL98-L100))
* Remove a comment that is no longer relevant in `test_function.py` ([link](https://github.com/taichi-dev/taichi/pull/8139/files?diff=unified&w=0#diff-d6f5c74e17462c8ff96d5bba06ebf81d16c015ca667fa945513a80be17bef017L327-L329))
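The conservative assumption in this commit ("a call can load or modify any address") can be illustrated with a toy store-to-load forwarding pass in Python. This is a sketch under assumptions, not the code in `control_flow_graph.cpp` or `simplify.cpp`: statements are modeled as plain tuples, and the function name is hypothetical.

```python
def forward_stores(block):
    """Forward known stored values to later loads within one basic block.

    Statements are tuples: ("store", addr, value), ("load", addr), ("call", name).
    A call is assumed to be able to modify any address, so it invalidates
    everything known so far.
    """
    known = {}  # addr -> last value stored in this block
    out = []
    for stmt in block:
        kind = stmt[0]
        if kind == "store":
            _, addr, value = stmt
            known[addr] = value
            out.append(stmt)
        elif kind == "load":
            addr = stmt[1]
            # Replace the load with a constant only if the value is still known.
            out.append(("const", known[addr]) if addr in known else stmt)
        elif kind == "call":
            known.clear()  # conservative: the callee may have written anywhere
            out.append(stmt)
        else:
            out.append(stmt)
    return out
```

In this model the load before the call is folded to a constant while the load after it is left alone, which is safe but leaves optimization opportunities on the table until the callee's actual store set is known.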
…8155) Issue: #602

Pass `gather_func_store_dests` gathers all destinations whose content may change after a real function is called. The change may happen in the real function or in another real function that the real function calls.

This pass uses Tarjan's strongly connected components algorithm to find the store destinations for all real functions a kernel calls, and stores them in `store_dests` of the respective function.

The global pointers are lowered in `lower_access`, so we need to gather the store destinations twice: before and after pass `lower_access`.

### 🤖 Generated by Copilot at 2c5586e

### Summary
📝🛠️🚀

This pull request introduces a new analysis pass `gather_func_store_dests` that can handle function calls in the IR and optimize their memory access and aliasing. It updates the `Function`, `FuncCallStmt`, and `ControlFlowGraph` classes and the `compile_function` and `compile_taichi_functions` transforms to use a new enum type `IRStage` and a new parameter `target_stage` to track and control the IR stage of each function. It also modifies some existing analysis functions and adds some include directives and forward declarations to support the new pass.

> _To optimize function calls in the IR_
> _We need a new pass to infer_
> _The store destinations_
> _At different stages_
> _And use `IRStage` instead of `IRType` for sure_

### Walkthrough
* Add a new analysis pass `gather_func_store_dests` to collect the store destinations of each function in the IR ([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-0bfbe49ff08844a76d5d2e1c5b81c2cf813be4a9089422b997bc380ec9a68eadR1-R103), [link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-f6bc75768d2e24c782fefa45a7232d0e2b2bae091e697040e7f442a77d80ad45L216-R216))
* Modify the `FuncCallStmt` class to inherit from the `Store` trait and implement the `get_store_destination` method, using the arguments of the function call and the `store_dests` set of the called function ([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-05e2a2d0a9c9879a4fb5fde9baf5a43738c7601fc53e234a40ab9bc27d1512a5R277-R289), [link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-917d9436dcaafa0f1e41ae9bad90273a303f036f00da94e417788a7fa1dc5260L1062-R1062), [link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-917d9436dcaafa0f1e41ae9bad90273a303f036f00da94e417788a7fa1dc5260R1074-R1080))
* Remove or modify the checks for `FuncCallStmt` in the `ControlFlowGraph` class, and use the `store_dests` set of the called function to update the reaching definition analysis ([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fL164-L167), [link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fL219-R216), [link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fR695), [link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fL982-R977), [link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fR988-R990))
* Add a new member variable `func_store_dests` to the `ControlFlowGraph` class, which is a map from `Function` pointers to sets of `Stmt` pointers, representing the store destinations of each function in the IR ([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-67e7205404aa056a1553f930af38b359e460f98a4ec335faec7d54aaf9df727fR117-R118))
* Replace the old enum type `IRType` with the new enum type `IRStage`, which has more values to indicate different IR stages of function compilation ([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-aa860f71a793b08676a24cab247b43f5ed8d105a6493eeb1a035369b916bddc2L17-R17), [link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-aa860f71a793b08676a24cab247b43f5ed8d105a6493eeb1a035369b916bddc2L32-R32), [link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-af3316673541832f351d12d7c2f45b3c49ba5caeafdad3a6356cb13d2524be3dL9-R20), [link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-af3316673541832f351d12d7c2f45b3c49ba5caeafdad3a6356cb13d2524be3dL31-R50), [link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-f78d8ce92dcf8a10d2a446d35cc26f47fd2a42314b0799d263196b6eb858fe76L13-R33), [link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-f78d8ce92dcf8a10d2a446d35cc26f47fd2a42314b0799d263196b6eb858fe76L39-R48), [link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-8fde186587db97b3bbc8a856e59bc4467b30257335b0fad064b4eebd521a912bL330-R390))
* Modify the signature of the `compile_function` function to use the new parameter `target_stage` instead of the old parameter `start_from_ast`, to indicate the desired IR stage of the function compilation ([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-448ac6e85e192a27e5ec7c54cd8a91545dc7c83f62d030eafb9c190383cfe934L199-R200), [link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-8fde186587db97b3bbc8a856e59bc4467b30257335b0fad064b4eebd521a912bL330-R390))
* Modify the definition of the `compile_to_offloads` function to add two calls to the new analysis pass `gather_func_store_dests`, before and after the call to the `compile_taichi_functions` function, and to pass different `target_stage` parameters to the `compile_taichi_functions` function ([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-8fde186587db97b3bbc8a856e59bc4467b30257335b0fad064b4eebd521a912bL47-R51))
* Add or modify the include directives and forward declarations for the header files `function.h`, `statements.h`, and `unordered_set` in the source files and header files that use the `Function` class, the `FuncCallStmt` class, or the `std::unordered_set` container ([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fR9), [link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-67e7205404aa056a1553f930af38b359e460f98a4ec335faec7d54aaf9df727fR10), [link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-05e2a2d0a9c9879a4fb5fde9baf5a43738c7601fc53e234a40ab9bc27d1512a5R5), [link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-448ac6e85e192a27e5ec7c54cd8a91545dc7c83f62d030eafb9c190383cfe934R20), [link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-af3316673541832f351d12d7c2f45b3c49ba5caeafdad3a6356cb13d2524be3dR3))
* Modify some comments in the header file `transforms.h` to remove the mentions of not demoting dense struct fors or reducing the number of statements before inlining, since these are no longer relevant or necessary after the new analysis pass ([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-448ac6e85e192a27e5ec7c54cd8a91545dc7c83f62d030eafb9c190383cfe934L160-R161), [link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-448ac6e85e192a27e5ec7c54cd8a91545dc7c83f62d030eafb9c190383cfe934L192-R190))
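The role of Tarjan's algorithm in gathering store destinations can be sketched on a toy call graph in Python. This is a simplified model under assumptions, not the pass in the PR: functions and destinations are plain strings, `calls` and `direct_stores` are hypothetical inputs, and the lowering stages are ignored. Since Tarjan's algorithm completes callee components before caller components, each strongly connected component (a group of mutually recursive functions) can take the union of its own stores and the already-computed sets of its callees.

```python
def gather_store_dests(calls, direct_stores):
    """Return func -> frozenset of destinations that may change when func is called.

    calls: func -> list of callees; direct_stores: func -> set of destinations
    the function writes itself. Uses Tarjan's SCC algorithm on the call graph.
    """
    index, low = {}, {}
    stack, on_stack = [], set()
    result = {}
    counter = [0]

    def visit(f):
        index[f] = low[f] = counter[0]
        counter[0] += 1
        stack.append(f)
        on_stack.add(f)
        for g in calls.get(f, ()):
            if g not in index:
                visit(g)
                low[f] = min(low[f], low[g])
            elif g in on_stack:
                low[f] = min(low[f], index[g])
        if low[f] == index[f]:
            # f is the root of an SCC: pop all of its members off the stack.
            scc = []
            while True:
                g = stack.pop()
                on_stack.discard(g)
                scc.append(g)
                if g == f:
                    break
            dests = set()
            for g in scc:
                dests |= direct_stores.get(g, set())
                for h in calls.get(g, ()):
                    if h in result:  # callee belongs to a finished SCC
                        dests |= result[h]
            for g in scc:
                result[g] = frozenset(dests)

    for f in list(calls) + list(direct_stores):
        if f not in index:
            visit(f)
    return result
```

For example, with `a` and `b` calling each other and `b` also calling `c`, both `a` and `b` end up with the union of all three functions' stores, while `c` keeps only its own.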
mentions of not demoting dense struct fors or reducing the number of statements before inlining, since these are no longer relevant or necessary after the new analysis pass ([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-448ac6e85e192a27e5ec7c54cd8a91545dc7c83f62d030eafb9c190383cfe934L160-R161), [link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-448ac6e85e192a27e5ec7c54cd8a91545dc7c83f62d030eafb9c190383cfe934L192-R190))
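
To make the idea of collecting per-function store destinations more concrete, here is a toy model in Python. It is only a sketch under stated assumptions: the `Store`, `Call`, and `Function` classes below are hypothetical stand-ins invented for illustration, not Taichi's C++ IR classes, and the fixed-point loop is one plausible way to handle recursive calls rather than a description of what the actual pass does.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    dest: str  # hypothetical: name of the location being written


@dataclass
class Call:
    callee: str  # hypothetical: name of the called function


@dataclass
class Function:
    name: str
    body: list = field(default_factory=list)


def gather_func_store_dests(functions):
    """Map each function name to the set of destinations it may write to.

    A call contributes the callee's destinations to the caller. The loop
    repeats until no set grows, so recursive functions still terminate.
    """
    store_dests = {f.name: set() for f in functions}
    changed = True
    while changed:
        changed = False
        for f in functions:
            for stmt in f.body:
                if isinstance(stmt, Store):
                    new = {stmt.dest}
                else:
                    new = store_dests[stmt.callee]
                if not new <= store_dests[f.name]:
                    store_dests[f.name] |= new
                    changed = True
    return store_dests


# Toy program: `mid` calls `leaf`, and `rec` calls itself.
funcs = [
    Function("leaf", [Store("x")]),
    Function("mid", [Call("leaf"), Store("y")]),
    Function("rec", [Call("rec"), Store("z")]),
]
dests = gather_func_store_dests(funcs)
```

In this toy model a call site can be treated like a store to every destination in the callee's set, which is the kind of information a reaching-definition analysis needs when it can no longer assume that all calls have been inlined.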
Concisely describe the proposed feature
I would like to add real function support, so that inlining no longer spams the IR and wastes space, and so that recursion can finally be supported.
Describe the solution you'd like (if any)
A `ti.inline`/`ti.noinline` decorator. We may also detect automatically whether a function is better inlined.
Additional comments
I'm giving up on #543; it begins from stage 4.
This issue is also related to #536.
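
The `ti.inline`/`ti.noinline` proposal above could be prototyped roughly as follows. This is a minimal sketch in plain Python, not Taichi's API: the `inline` and `noinline` decorators, the `_inline_hint` attribute, the `should_inline` helper, and the bytecode-size threshold are all hypothetical names and choices made for illustration.

```python
def inline(fn):
    """Hypothetical decorator: ask the compiler to inline this function."""
    fn._inline_hint = True
    return fn


def noinline(fn):
    """Hypothetical decorator: ask the compiler to emit a real call."""
    fn._inline_hint = False
    return fn


def should_inline(fn, max_bytecode_size=64):
    """Decide whether to inline `fn`.

    An explicit hint always wins. Without one, fall back to a crude size
    heuristic (assumed here: the length of the Python bytecode).
    """
    hint = getattr(fn, "_inline_hint", None)
    if hint is not None:
        return hint
    return len(fn.__code__.co_code) <= max_bytecode_size


@inline
def add(a, b):
    return a + b


@noinline
def fib(n):
    # Recursion is only possible when the function is not inlined.
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def untagged(x):
    return x * 2
```

A real implementation would measure size on the IR rather than on Python bytecode, and would have to reject an inline hint on a recursive function, since a recursive body cannot be expanded in place.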