Feature/power #1479
…e elsewhere. Move assertAllMemFree and device::destroy before we bring down the comms (fixes a potential issue where these functions call a QUDA I/O function after the comms are already down)
…toring period (in microseconds) is set by QUDA_ENABLE_MONITOR_PERIOD (default = 1000 microseconds = 1 millisecond)
After a quick scan this looks fine to me. I left only a few OpenMP comments, as single comments, regarding the use of collapse().
I saw you use it in some cases and not in others, and I was wondering whether it could be used in one or two more. The only thing I worry about is dropping a functor (Arg::apply or some such) into a reduction clause. I wonder if it ought to be a custom reduction (as is done for Multi-Reductions).
This could just be me not being up to spec with my OpenMP, though. Modulo these, I approve.
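For reference, here is a minimal sketch of what a custom reduction combined with collapse() could look like. It is not QUDA code: the `Sum2` type, the `combine` helper, the `sum2` reduction name and the `reduce_grid` function are all hypothetical, and the example only illustrates the general OpenMP pattern of declaring how partial results are merged instead of relying on a built-in operator.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reduction value, standing in for whatever a host kernel accumulates.
struct Sum2 {
  double a = 0.0; // running sum of the values
  double b = 0.0; // running sum of the squared values
};

// Hypothetical combiner: merges two partial results.
inline Sum2 combine(Sum2 x, const Sum2 &y)
{
  x.a += y.a;
  x.b += y.b;
  return x;
}

// Custom reduction: tells OpenMP how to merge per-thread partials (omp_in into
// omp_out) and how to initialise each thread's private copy (omp_priv).
#pragma omp declare reduction(sum2 : Sum2 : omp_out = combine(omp_out, omp_in)) initializer(omp_priv = Sum2 {})

// Reduce over an nx-by-ny grid stored row-major in x. collapse(2) fuses the two
// perfectly nested loops into a single iteration space shared among threads.
Sum2 reduce_grid(const std::vector<double> &x, int nx, int ny)
{
  Sum2 result;
#pragma omp parallel for collapse(2) reduction(sum2 : result)
  for (int j = 0; j < ny; j++) {
    for (int i = 0; i < nx; i++) {
      const double v = x[static_cast<std::size_t>(j) * nx + i];
      result.a += v;
      result.b += v * v;
    }
  }
  return result;
}
```

Built without OpenMP support the pragmas are ignored and the loop runs serially, so the function gives the same answer either way (up to floating-point summation order).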
This PR is the initial step towards more power awareness in QUDA, as well as adding OMP threading for host kernels:
- Monitoring is enabled with `QUDA_ENABLE_MONITOR=1` (default is off)
- The monitoring period is set with `QUDA_ENABLE_MONITOR_PERIOD=1`
- The results are dumped to a `monitor_*****.tsv` file, where the `*****` encodes the rank id and the date_time of the dump. All ranks have identical times by construction.
- The `QUDA_OPENMP` CMake parameter is no longer marked as advanced
- Fixes `endQuda` if memory leaks were detected when running multi-GPU: `printfQuda` would fail since `comm_rank()` would be called after the comms have been torn down
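
As an illustration of how the two environment variables above might be interpreted, here is a minimal sketch. It is not the implementation in this PR: the helper names `monitor_enabled`, `monitor_period` and `read_monitor_config` are hypothetical, and the sketch assumes that only the value `1` enables monitoring and that an unset or invalid period falls back to the 1000 microsecond default mentioned in the commit message.

```cpp
#include <chrono>
#include <cstdlib>
#include <string>

// Hypothetical helper: monitoring is on only when the variable is set to "1"
// (assumption; the default is off).
bool monitor_enabled(const char *value) { return value != nullptr && std::string(value) == "1"; }

// Hypothetical helper: parse the period in microseconds, falling back to the
// 1000 us default when the variable is unset, non-numeric, or not positive.
std::chrono::microseconds monitor_period(const char *value)
{
  constexpr long default_us = 1000;
  if (value == nullptr) return std::chrono::microseconds(default_us);
  char *end = nullptr;
  const long parsed = std::strtol(value, &end, 10);
  if (end == value || *end != '\0' || parsed <= 0) return std::chrono::microseconds(default_us);
  return std::chrono::microseconds(parsed);
}

// Hypothetical aggregate of the two settings.
struct MonitorConfig {
  bool enabled;
  std::chrono::microseconds period;
};

// Read both settings from the process environment.
MonitorConfig read_monitor_config()
{
  return {monitor_enabled(std::getenv("QUDA_ENABLE_MONITOR")),
          monitor_period(std::getenv("QUDA_ENABLE_MONITOR_PERIOD"))};
}
```

Keeping the parsing separate from the `std::getenv` calls makes the fallback behaviour easy to check without touching the process environment.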