Enhance PSI Constructor: Lower Peak Device Memory Usage #4154
Hi @denghuilu,
LGTM
@denghuilu However, I think it would be better to put this optimization for T_in = T_out in the implementation at abacus-develop/source/module_psi/kernels/cuda/memory_op.cu, lines 111 to 145 in bec5e64.
LGTM, I'll open a PR later.
Reminder
Linked Issue
Close #4153
Unit Tests and/or Case Tests for my changes
What's changed?
Any changes of core modules? (ignore if not applicable)