[type] [Opt] Bit_array vectorization(Step 1) #2101
before:
after bit_loop_vectorize:
TODO:
…PtrStmt tagged with bit_vectorized mark
IR changes after some significant changes (avoid listgen on the bit_array node):
before:
Fix before:
after:
Marking this as ready for review as it passes.
Some performance numbers, for a 4096 * 4096 2D bit array, two implementations:
There is some minor cost in listgen because we are using a sparse data structure, but the speedup in the for-loop is significant.
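To illustrate why the for-loop gets so much faster, here is a minimal standalone sketch of the general idea behind bit-level loop vectorization: touching one 32-bit word per iteration instead of one bit per iteration. `BitArray`, `copy_per_bit`, and `copy_per_word` are hypothetical names for illustration only; this is not Taichi's data layout or the code this PR generates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical flat bit array packed into 32-bit words (illustration only).
struct BitArray {
  std::size_t num_bits;
  std::vector<uint32_t> words;

  explicit BitArray(std::size_t n) : num_bits(n), words((n + 31) / 32, 0u) {}

  bool get(std::size_t i) const { return (words[i / 32] >> (i % 32)) & 1u; }

  void set(std::size_t i, bool v) {
    const uint32_t mask = 1u << (i % 32);
    if (v) {
      words[i / 32] |= mask;
    } else {
      words[i / 32] &= ~mask;
    }
  }
};

// Scalar version: one iteration (and one read-modify-write) per bit.
// Assumes src and dst have the same number of bits.
void copy_per_bit(const BitArray &src, BitArray &dst) {
  for (std::size_t i = 0; i < src.num_bits; ++i) {
    dst.set(i, src.get(i));
  }
}

// "Bit-vectorized" version: one iteration per 32-bit word, so the loop
// runs 32x fewer times and needs no masking or shifting.
// Assumes src and dst have the same number of bits.
void copy_per_word(const BitArray &src, BitArray &dst) {
  for (std::size_t w = 0; w < src.words.size(); ++w) {
    dst.words[w] = src.words[w];
  }
}
```

Both functions produce the same result; the word-level version simply does far less work per bit, which is consistent with the large for-loop speedup reported above.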
if (is_bit_vectorized && snode->type == SNodeType::bit_array &&
    i == length - 1 && snodes[i - 1]->type == SNodeType::dense) {
  continue;
}
core change: do not generate lookup/getch for bit_array snode when it's the last snode and its parent is a dense node.
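The skip condition can be exercised in isolation with a toy model. The sketch below is an assumption-laden illustration, not the actual lowering pass: `SNode`, `lower_chain`, and the emitted step strings are hypothetical, and only the `if` condition mirrors the snippet under review (with an extra `i > 0` guard so the toy code never reads `snodes[i - 1]` out of bounds).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Subset of SNode types, for illustration only.
enum class SNodeType { root, dense, pointer, bit_array };

// Hypothetical stand-in for an SNode; the real class carries much more state.
struct SNode {
  SNodeType type;
};

// Walks the SNode chain from root to leaf and records one access step per
// level. When bit-vectorized, the final bit_array level is skipped if its
// parent is dense, so the access stops at the dense parent.
std::vector<std::string> lower_chain(const std::vector<SNode> &snodes,
                                     bool is_bit_vectorized) {
  std::vector<std::string> steps;
  const std::size_t length = snodes.size();
  for (std::size_t i = 0; i < length; ++i) {
    if (is_bit_vectorized && snodes[i].type == SNodeType::bit_array &&
        i > 0 && i == length - 1 &&
        snodes[i - 1].type == SNodeType::dense) {
      continue;  // no lookup/getch for the last bit_array level
    }
    steps.push_back("lookup/getch@" + std::to_string(i));
  }
  return steps;
}
```

In this toy model, a `root -> dense -> bit_array` chain yields two steps when bit-vectorized and three otherwise; a chain whose bit_array parent is not dense is left unchanged.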
    TypeFactory::get_instance().get_pointer_type(physical_type);
stmt->ret_type = DataType(ptr_ret_type);
return;
}
core change: make sure the tagged statements pass the typecheck
pass
Great!! It's nice to see a 100x speed up. Some early comments for now. I'll take a closer look after lunch :-)
Thanks!
Co-authored-by: Yuanming Hu <[email protected]>
Awesome!! All LGTM now. Feel free to merge after the final comment is addressed. Thanks for the great implementation! Can't wait to see the more powerful version for GoL :-)
// TODO: Do we need to explicitly make the load stmt's return type same
// as physical type for now, this seems to hold under the demo code
Good question. Let's assert that the bit width of the physical type equals bit_vectorize, for safety.
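A minimal sketch of such a check is shown below. `PhysicalType` and `check_bit_vectorize_width` are hypothetical stand-ins written for this illustration; the real code would query the width from Taichi's own type objects and would presumably use the project's assertion macro rather than throwing an exception.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the physical (storage) type of a bit_array.
struct PhysicalType {
  int bits;  // storage width in bits, e.g. 32 for a 32-bit integer
};

// Rejects configurations where the vectorization width does not match the
// storage width, since a word-level load would then cover the wrong
// number of bits.
void check_bit_vectorize_width(const PhysicalType &physical_type,
                               int bit_vectorize) {
  if (physical_type.bits != bit_vectorize) {
    throw std::invalid_argument(
        "bit_vectorize (" + std::to_string(bit_vectorize) +
        ") must equal the physical type's bit width (" +
        std::to_string(physical_type.bits) + ")");
  }
}
```

Throwing here is only a convenience for the standalone sketch; the point is that the mismatch is caught before any statement is retyped to the physical type.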
Thanks for the review and guidance!
Related issue = #1905