[Lang] Migrate irpass::scalarize() after optimize_bit_struct_stores & determine_ad_stack_size (#8097)

Issue: #

### Brief Summary

<!--
copilot:summary
-->
### <samp>🤖 Generated by Copilot at 5e824fd</samp>

This pull request fixes several bugs and improves the performance and
usability of the code generation and IR transformation modules. It
affects the files `codegen_llvm.cpp`, `ir_builder.h`,
`compile_to_offloads.cpp`, and `demote_operations.cpp`. It also adds a
new header file for the demote_operations transform.

### Walkthrough

<!--
copilot:walkthrough
-->
### <samp>🤖 Generated by Copilot at 5e824fd</samp>

* Fix bugs in code generation for binary division and right shift
operations in `codegen_llvm.cpp`
([link](https://github.com/taichi-dev/taichi/pull/8097/files?diff=unified&w=0#diff-3c663c78745adcd3f6a7ac81fe99e628decc3040f292ea1e20ecd4b85a7f4313L614-R614),
[link](https://github.com/taichi-dev/taichi/pull/8097/files?diff=unified&w=0#diff-3c663c78745adcd3f6a7ac81fe99e628decc3040f292ea1e20ecd4b85a7f4313L661-R661))
* Add missing header file for `ConstStmt` class in `ir_builder.h`
([link](https://github.com/taichi-dev/taichi/pull/8097/files?diff=unified&w=0#diff-1894085b261e833e3e66924fc5b1cf63b9dd8b8aa0b3e78ec64366396131470dR5))
* Modify return type of `get_constant` method in `IRBuilder` class to be
more general in `ir_builder.h`
([link](https://github.com/taichi-dev/taichi/pull/8097/files?diff=unified&w=0#diff-1894085b261e833e3e66924fc5b1cf63b9dd8b8aa0b3e78ec64366396131470dL140-R141))
* Remove redundant call to `scalarize` pass in `compile_to_offloads`
transform in `compile_to_offloads.cpp`
([link](https://github.com/taichi-dev/taichi/pull/8097/files?diff=unified&w=0#diff-8fde186587db97b3bbc8a856e59bc4467b30257335b0fad064b4eebd521a912bL234-L241))
* Add missing argument to `full_simplify` pass in `compile_to_offloads`
transform in `compile_to_offloads.cpp`
([link](https://github.com/taichi-dev/taichi/pull/8097/files?diff=unified&w=0#diff-8fde186587db97b3bbc8a856e59bc4467b30257335b0fad064b4eebd521a912bR288-R296))
* Add new header file for `IRBuilder` class in `demote_operations.cpp`
([link](https://github.com/taichi-dev/taichi/pull/8097/files?diff=unified&w=0#diff-d217f2b07d4578612dc805b0f01e5dc1883be9acb906b222a8762313cfd0596bR19-R129))
* Add new logic to handle tensor types in binary operations in
`demote_operations` transform in `demote_operations.cpp`
([link](https://github.com/taichi-dev/taichi/pull/8097/files?diff=unified&w=0#diff-d217f2b07d4578612dc805b0f01e5dc1883be9acb906b222a8762313cfd0596bR135-R146),
[link](https://github.com/taichi-dev/taichi/pull/8097/files?diff=unified&w=0#diff-d217f2b07d4578612dc805b0f01e5dc1883be9acb906b222a8762313cfd0596bR165),
[link](https://github.com/taichi-dev/taichi/pull/8097/files?diff=unified&w=0#diff-d217f2b07d4578612dc805b0f01e5dc1883be9acb906b222a8762313cfd0596bL67-R198),
[link](https://github.com/taichi-dev/taichi/pull/8097/files?diff=unified&w=0#diff-d217f2b07d4578612dc805b0f01e5dc1883be9acb906b222a8762313cfd0596bL99-R228),
[link](https://github.com/taichi-dev/taichi/pull/8097/files?diff=unified&w=0#diff-d217f2b07d4578612dc805b0f01e5dc1883be9acb906b222a8762313cfd0596bL159-R254),
[link](https://github.com/taichi-dev/taichi/pull/8097/files?diff=unified&w=0#diff-d217f2b07d4578612dc805b0f01e5dc1883be9acb906b222a8762313cfd0596bL191-R287))
* Remove redundant logic for tensor types in binary operations in
`demote_operations` transform in `demote_operations.cpp`
([link](https://github.com/taichi-dev/taichi/pull/8097/files?diff=unified&w=0#diff-d217f2b07d4578612dc805b0f01e5dc1883be9acb906b222a8762313cfd0596bL109-L150))
* Add comment to explain motivation and strategy for demoting power
operation in `demote_operations` transform in `demote_operations.cpp`
([link](https://github.com/taichi-dev/taichi/pull/8097/files?diff=unified&w=0#diff-d217f2b07d4578612dc805b0f01e5dc1883be9acb906b222a8762313cfd0596bL176-R265))
jim19930609 authored May 31, 2023
1 parent 3332eee commit 7555f30
Showing 2 changed files with 23 additions and 1 deletion.
15 changes: 14 additions & 1 deletion taichi/ir/control_flow_graph.cpp
@@ -384,7 +384,10 @@ void CFGNode::reaching_definition_analysis(bool after_lower_access) {
       auto data_source_ptrs = irpass::analysis::get_store_destination(stmt);
       for (auto data_source_ptr : data_source_ptrs) {
         // stmt provides a data source
-        if (after_lower_access && !(data_source_ptr->is<AllocaStmt>())) {
+        if (after_lower_access &&
+            !((data_source_ptr->is<MatrixPtrStmt>() &&
+               data_source_ptr->as<MatrixPtrStmt>()->origin->is<AllocaStmt>()) ||
+              data_source_ptr->is<AllocaStmt>())) {
           // After lower_access, we only analyze local variables.
           continue;
         }
@@ -552,6 +555,8 @@ void CFGNode::live_variable_analysis(bool after_lower_access) {
           irpass::analysis::get_load_pointers(stmt, true /*get_alias*/);
       for (auto &load_ptr : load_ptrs) {
         if (!after_lower_access ||
+            (load_ptr->is<MatrixPtrStmt>() &&
+             load_ptr->as<MatrixPtrStmt>()->origin->is<AllocaStmt>()) ||
             (load_ptr->is<AllocaStmt>() || load_ptr->is<AdStackAllocaStmt>())) {
           // After lower_access, we only analyze local variables and stacks.
           if (!contain_variable(live_kill, load_ptr)) {
@@ -576,6 +581,8 @@ void CFGNode::live_variable_analysis(bool after_lower_access) {
       }
       for (auto store_ptr : store_ptrs) {
         if (!after_lower_access ||
+            (store_ptr->is<MatrixPtrStmt>() &&
+             store_ptr->as<MatrixPtrStmt>()->origin->is<AllocaStmt>()) ||
             (store_ptr->is<AllocaStmt>() || store_ptr->is<AdStackAllocaStmt>())) {
           // After lower_access, we only analyze local variables and stacks.
           live_kill.insert(store_ptr);
@@ -707,6 +714,8 @@ bool CFGNode::dead_store_elimination(bool after_lower_access) {
       auto store_ptr = *store_ptrs.begin();

       if (!after_lower_access ||
+          (store_ptr->is<MatrixPtrStmt>() &&
+           store_ptr->as<MatrixPtrStmt>()->origin->is<AllocaStmt>()) ||
           (store_ptr->is<AllocaStmt>() || store_ptr->is<AdStackAllocaStmt>())) {
         // !may_contain_variable(live_in_this_node, store_ptr): address is not
         // loaded after this store
@@ -806,6 +815,8 @@ bool CFGNode::dead_store_elimination(bool after_lower_access) {
       auto load_ptr = load_ptrs.begin()[0];

       if (!after_lower_access ||
+          (load_ptr->is<MatrixPtrStmt>() &&
+           load_ptr->as<MatrixPtrStmt>()->origin->is<AllocaStmt>()) ||
           (load_ptr->is<AllocaStmt>() || load_ptr->is<AdStackAllocaStmt>())) {
         // live_load_in_this_node[addr]: tracks the
         // next load to the same address
@@ -832,6 +843,8 @@ bool CFGNode::dead_store_elimination(bool after_lower_access) {
       // Update live_in_this_node
       for (auto &load_ptr : load_ptrs) {
         if (!after_lower_access ||
+            (load_ptr->is<MatrixPtrStmt>() &&
+             load_ptr->as<MatrixPtrStmt>()->origin->is<AllocaStmt>()) ||
             (load_ptr->is<AllocaStmt>() || load_ptr->is<AdStackAllocaStmt>())) {
           // Addr is used in this node, so it's live in this node
           update_container_with_alias(tensor_to_matrix_ptrs_map,
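
The hunks in this file all add one condition: a `MatrixPtrStmt` whose `origin` is an `AllocaStmt` now counts as a local variable, next to a plain `AllocaStmt` (and `AdStackAllocaStmt` in every check except the one in `reaching_definition_analysis`). A minimal sketch of that predicate follows. The `Stmt` classes are simplified stand-ins, and `is_local_ptr` with its `allow_ad_stack` flag is a hypothetical helper, not something this commit adds.

```cpp
#include <cassert>

// Simplified stand-ins for Taichi's IR statement classes (assumption: the
// real classes carry much more state; only is<T>()/as<T>() are mimicked).
struct Stmt {
  virtual ~Stmt() = default;

  // True if this statement's dynamic type is T (or derives from T).
  template <typename T>
  bool is() const {
    return dynamic_cast<const T *>(this) != nullptr;
  }

  // Unchecked downcast; callers are expected to test is<T>() first.
  template <typename T>
  const T *as() const {
    return static_cast<const T *>(this);
  }
};

struct AllocaStmt : Stmt {};
struct AdStackAllocaStmt : Stmt {};
struct GlobalPtrStmt : Stmt {};

// Pointer to an element of a tensor; origin is the tensor it indexes into.
struct MatrixPtrStmt : Stmt {
  const Stmt *origin;
  explicit MatrixPtrStmt(const Stmt *origin) : origin(origin) {
  }
};

// Hypothetical helper mirroring the condition repeated in the diff.
// allow_ad_stack is false for the reaching-definition check, which does not
// mention AdStackAllocaStmt, and true for the liveness and dead-store checks.
bool is_local_ptr(const Stmt *ptr, bool allow_ad_stack) {
  if (ptr->is<AllocaStmt>()) {
    return true;
  }
  if (allow_ad_stack && ptr->is<AdStackAllocaStmt>()) {
    return true;
  }
  // New in this commit: element pointers into a local tensor are local too.
  return ptr->is<MatrixPtrStmt>() &&
         ptr->as<MatrixPtrStmt>()->origin->is<AllocaStmt>();
}
```

With such a helper, each check in the diff would read `!after_lower_access || is_local_ptr(ptr, true)`. The commit itself keeps the condition inline at every site.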
9 changes: 9 additions & 0 deletions taichi/transforms/compile_to_offloads.cpp
@@ -303,6 +303,15 @@ void offload_to_executable(IRNode *ir,
     print("Bit struct stores optimized");
   }

+  if (config.real_matrix_scalarize) {
+    if (irpass::scalarize(ir)) {
+      // Remove redundant MatrixInitStmt inserted during scalarization
+      irpass::full_simplify(ir, config,
+                            {lower_global_access, /*autodiff_enabled*/ false});
+      print("Scalarized");
+    }
+  }
+
   if (config.arch == Arch::cuda && config.half2_vectorization &&
       !get_custom_cuda_library_path().empty()) {
     irpass::vectorize_half2(ir);
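
The added block gates scalarization twice: the pass runs only when `config.real_matrix_scalarize` is set, and the follow-up `full_simplify` runs only when `irpass::scalarize(ir)` returns true. The sketch below models that control flow. It assumes the return value means "the IR was modified", which the diff suggests but does not state. `PassLog`, `run_scalarize_stage`, and the `ir_has_tensors` flag are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical miniature of the compile config; only the flag used in the
// diff is modelled.
struct Config {
  bool real_matrix_scalarize = true;
};

// Records which passes ran, in order, so the gating can be observed.
struct PassLog {
  std::vector<std::string> ran;
};

// Stand-in for irpass::scalarize(). Assumption: the real pass returns true
// when it changed the IR; here that is driven by the ir_has_tensors flag.
bool scalarize(PassLog &log, bool ir_has_tensors) {
  log.ran.push_back("scalarize");
  return ir_has_tensors;
}

// Stand-in for irpass::full_simplify(); it only records that it ran.
void full_simplify(PassLog &log) {
  log.ran.push_back("full_simplify");
}

// Mirrors the added block: scalarize only when enabled, and clean up only
// when scalarization actually modified something.
void run_scalarize_stage(const Config &config,
                         bool ir_has_tensors,
                         PassLog &log) {
  if (config.real_matrix_scalarize) {
    if (scalarize(log, ir_has_tensors)) {
      full_simplify(log);
    }
  }
}
```

Skipping the simplification when nothing changed avoids a redundant pass over IR that contains no tensor operations.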