[Perf] Support thread local storage for reduction in struct-fors #1941
So we used to only support TLS for reduction in range-fors?
Codecov Report
@@           Coverage Diff           @@
##           master    #1941   +/-   ##
=======================================
  Coverage   43.72%   43.72%
=======================================
  Files          45       45
  Lines        6207     6207
  Branches     1103     1103
=======================================
  Hits         2714     2714
  Misses       3322     3322
  Partials      171      171
LGTM! Thanks for documenting the control flow, much cleaner!
Co-authored-by: Ye Kuang <[email protected]>
@archibate Yes!
Related issue = #1407 closes #576
Benchmarks
3D 256^3 MGPCG reduction
CUDA
CPU
1D 1024 * 1024 * 128 linear reduction
CUDA
CPU
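For readers unfamiliar with the pattern behind the linear reduction benchmark, here is a minimal pure-Python sketch of a thread-local reduction. It only illustrates the general idea (each worker accumulates into a private variable and touches the shared result once); it is not Taichi's generated code, and `tls_sum`, `worker`, and `num_threads` are hypothetical names chosen for this example. The lock stands in for the atomic add a real backend would use.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def tls_sum(data, num_threads=4):
    """Sum `data`: each worker accumulates privately, then merges once."""
    total = 0
    lock = threading.Lock()  # stand-in for an atomic add on the shared result

    def worker(chunk):
        nonlocal total
        local = 0  # private accumulator: no synchronization per element
        for x in chunk:
            local += x
        with lock:  # one synchronized update per worker, not per element
            total += local

    # Strided split so every element lands in exactly one chunk.
    chunks = [data[i::num_threads] for i in range(num_threads)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        list(pool.map(worker, chunks))  # consume to surface worker errors
    return total
```

Without the private accumulator, every element would contend for the shared result; with it, the number of synchronized updates drops from one per element to one per worker, which is the kind of saving the benchmarks above are meant to measure.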