[type] Local adder structure #2136

Merged
merged 6 commits into from
Jan 6, 2021

Conversation

TH3CHARLie
Collaborator

@TH3CHARLie TH3CHARLie marked this pull request as draft January 1, 2021 16:11
@yuanming-hu yuanming-hu changed the title [type] Local Structure [type] Local adder structure Jan 1, 2021
@TH3CHARLie
Collaborator Author

Some performance numbers comparing the two kinds of loops:

[ 41.96%   0.105 s      1x |  105.085   105.085   105.085 ms] evolve_naive_c8_0_kernel_10_range_for
[  0.70%   0.002 s      1x |    1.756     1.756     1.756 ms] evolve_vectorized_c6_0_kernel_7_struct_for
[  0.51%   0.001 s      1x |    1.272     1.272     1.272 ms] evolve_vectorized_c6_0_kernel_6_listgen_S2dense
[  0.37%   0.001 s      1x |    0.926     0.926     0.926 ms] evolve_vectorized_c6_0_kernel_4_listgen_S1pointer
[  0.00%   0.000 s      1x |    0.001     0.001     0.001 ms] evolve_vectorized_c6_0_kernel_5_serial
[  0.00%   0.000 s      1x |    0.001     0.001     0.001 ms] evolve_vectorized_c6_0_kernel_3_serial

@TH3CHARLie TH3CHARLie requested review from yuanming-hu and removed request for taichi-gardener January 5, 2021 19:31
@TH3CHARLie TH3CHARLie marked this pull request as ready for review January 5, 2021 19:32
Member

@yuanming-hu yuanming-hu left a comment

Awesome!! Just a few nits. Thanks!

: op_type(op_type),
lhs(lhs),
rhs(rhs),
is_bit_vectorized(is_bit_vectorized) {
Member

This intrudes a little too much into the existing system. It seems to me that is_bit_vectorized is only used in the bit_loop_vectorize pass - maybe you can use a std::unordered_map<Stmt *, bool> member variable in class BitLoopVectorize, instead of adding a new field in class BinaryOpStmt? (Just like llvm_val in LLVM codegens.)
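A minimal sketch of the side-table idea suggested above, assuming hypothetical stand-in types: `ToyStmt` and `ToyBitLoopVectorize` are invented for illustration and are not Taichi's real `Stmt` or `BitLoopVectorize` classes. The flag lives in a map owned by the pass, so the statement class needs no new field.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical stand-in for an IR statement; NOT Taichi's real Stmt class.
struct ToyStmt {
  int id;
};

// Hypothetical pass that keeps its per-statement flag in a side table
// instead of storing it as a field on the statement itself.
class ToyBitLoopVectorize {
 public:
  // Record that `stmt` has been bit-vectorized by this pass.
  void mark_bit_vectorized(ToyStmt *stmt) {
    is_bit_vectorized_[stmt] = true;
  }

  // Statements that were never marked are treated as not vectorized.
  bool is_bit_vectorized(ToyStmt *stmt) const {
    auto it = is_bit_vectorized_.find(stmt);
    return it != is_bit_vectorized_.end() && it->second;
  }

 private:
  // Pass-scope data: discarded together with the pass object.
  std::unordered_map<ToyStmt *, bool> is_bit_vectorized_;
};
```

One consequence of this design is that the flag is invisible outside the pass, which is presumably why it only suits tags that no later pass needs to read.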

Collaborator Author

is_bit_vectorized is used only in the bit_loop_vectorize pass when tagged on BinaryOpStmt, and that part should be replaced with a pass-scope data structure, just as you suggested. GlobalPtrStmt and GetChStmt, however, need to carry the tag so that they pass later passes, including lower_access and type_check.

Member

Right, that's what I meant: we can use a pass-scope data structure just for BinaryOpStmt::is_bit_vectorized. Given we are rushing for the deadline it's fine that we don't do it now.

and_b_c->is_bit_vectorized = true;
// modify IR
auto and_a_b_p = and_a_b.get();
stmt->insert_before_me(std::move(load_a));
Member

A more elegant way to do this:

VecStatement statements;

Use VecStatement::push_back<...> to create the statements and stmt->insert_before_me(vec_stmt) to insert. (You don't need the DelayedIRModifier part.)
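A rough, self-contained sketch of the batching pattern described above. Everything here is a toy: `ToyStmt`, `ToyVecStatement`, `ToyBlock`, and `insert_before` are hypothetical stand-ins and do not reproduce Taichi's actual `VecStatement` or `insert_before_me` signatures. The point is only that new statements are collected in order and spliced in with one call rather than one `insert_before_me(std::move(...))` per statement.

```cpp
#include <cassert>
#include <iterator>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; these are NOT Taichi's real classes.
struct ToyStmt {
  std::string name;
  explicit ToyStmt(std::string n) : name(std::move(n)) {}
};

// Owns a batch of newly created statements in creation order.
struct ToyVecStatement {
  std::vector<std::unique_ptr<ToyStmt>> stmts;

  // Construct a statement in place and return a raw pointer, so later
  // statements in the batch can refer to it as an operand.
  template <typename T, typename... Args>
  T *push_back(Args &&...args) {
    auto owned = std::make_unique<T>(std::forward<Args>(args)...);
    T *raw = owned.get();
    stmts.push_back(std::move(owned));
    return raw;
  }
};

// A toy block: a flat, ordered list of statements.
struct ToyBlock {
  std::vector<std::unique_ptr<ToyStmt>> stmts;

  // Splice the whole batch in front of `anchor`, preserving batch order.
  // If `anchor` is not found, the batch is appended at the end.
  void insert_before(ToyStmt *anchor, ToyVecStatement &&batch) {
    auto it = stmts.begin();
    while (it != stmts.end() && it->get() != anchor) {
      ++it;
    }
    stmts.insert(it, std::make_move_iterator(batch.stmts.begin()),
                 std::make_move_iterator(batch.stmts.end()));
    batch.stmts.clear();
  }
};
```

With this shape, the order of statements in the block follows the order of the `push_back` calls, so the insertion order no longer has to be maintained by hand at each call site.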

Collaborator Author

I find that doing it the elegant way may still require DelayedIRModifier, and other places need changes as well (e.g. see the visitor for GlobalLoadStmt). I therefore think we should do it in a separate PR later and refactor all of this together. This also applies to the changes for BinaryOpStmt::is_bit_vectorized.

Comment on lines +299 to +305
stmt->insert_before_me(std::move(load_c));
stmt->insert_before_me(std::move(carry_c));
stmt->insert_before_me(std::move(sum_c));
stmt->insert_before_me(std::move(load_b));
stmt->insert_before_me(std::move(carry_b));
stmt->insert_before_me(std::move(sum_b));
stmt->insert_before_me(std::move(sum_a));
Member

Ditto.

@TH3CHARLie TH3CHARLie merged commit 7df6f22 into taichi-dev:master Jan 6, 2021
@k-ye k-ye mentioned this pull request Feb 4, 2021