nqt o2 #433
Apologies if there's code I'm not seeing that addresses this, but do we check if the precision of the typedef `Real` is `double` before including this? Are there equivalent `float` procedures?
There are no equivalent `float` procedures at this time. Nor are there little endians. However, all functions in this code are explicitly `double`s.
You're worried about building for single-precision I assume. Yes, we should be worried about that. Let's extend that in a later PR, though. Getting this working was enough of a lift.
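A compile-time guard along these lines is one way such a precision check could look. This is only a sketch: the `Real` alias is declared locally as a stand-in for the project's typedef, and `nqt_doubles_supported` is a hypothetical name, not something in this PR.

```cpp
#include <type_traits>

// Stand-in for the project's precision typedef (an assumption for this sketch).
using Real = double;

// Fail the build early if the double-only routines would be compiled
// against a different precision.
static_assert(std::is_same<Real, double>::value,
              "double-only routines require Real to be double");

// Hypothetical helper: the same condition as a value, usable in ordinary code.
constexpr bool nqt_doubles_supported() {
  return std::is_same<Real, double>::value;
}
```

With a guard like this, a single-precision build would stop with a clear message instead of silently mixing precisions.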
Comments for discussion rather than blocking requests.
Timing results (table): x86-volta and grace-hopper, each with NQT o1, NQT o2, and NQT true.
Timings for the stellar collapse reader in nanoseconds/point.
So on GPUs, no difference. On Grace, the o2/true speedup is ~25%, and on x86 it's ~2x.
@jhp-lanl @dholladay00 @pdmullen @jdolence this is ready for review
example/eos_grid.cpp
New example showing how to profile. Can be set to profile any EOS, even analytic.
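As a rough illustration of measuring nanoseconds/point, here is a minimal timing sketch. It is not the contents of `example/eos_grid.cpp`: `ideal_gas_pressure` is a hypothetical stand-in for an EOS call, `ns_per_point` is a hypothetical helper, and the grid values are arbitrary.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an analytic EOS call: P = (gamma - 1) * rho * sie.
inline double ideal_gas_pressure(double rho, double sie) {
  constexpr double gm1 = 2.0 / 3.0; // gamma - 1 for gamma = 5/3
  return gm1 * rho * sie;
}

// Evaluate `eos` on an n_rho x n_sie grid (both assumed > 0) and return the
// average wall-clock cost per evaluation in nanoseconds.
template <typename F>
double ns_per_point(F &&eos, std::size_t n_rho, std::size_t n_sie) {
  std::vector<double> out(n_rho * n_sie);
  const auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < n_rho; ++i) {
    const double rho = 1.0 + static_cast<double>(i);
    for (std::size_t j = 0; j < n_sie; ++j) {
      const double sie = 1.0 + static_cast<double>(j);
      out[i * n_sie + j] = eos(rho, sie);
    }
  }
  const auto stop = std::chrono::steady_clock::now();
  // Consume the results so the compiler cannot discard the loop.
  volatile double sink = 0.0;
  for (double v : out) sink = sink + v;
  const std::chrono::duration<double, std::nano> elapsed = stop - start;
  return elapsed.count() / static_cast<double>(out.size());
}
```

Calling `ns_per_point(ideal_gas_pressure, 512, 512)` returns one number per run; any callable taking `(rho, sie)` can be swapped in, which is the sense in which a harness like this can profile an analytic or a tabulated EOS.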
The change to this documentation is where the core of the method is. Essentially the quadratic interpolation of
Tests passing on re-git. Any objections to merging this? @AstroBarker let's discuss phoebus when you get the chance.
PR Summary
This threads in the second-order NQT method developed by Peter Hammond on the AthenaK team. More details to come as I develop them. There will also be an archive note eventually.
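To give a feel for what first- and second-order approximate logs can look like, here is a sketch built on `std::frexp`. It is an assumption-laden illustration, not the code in this PR: `lg_o1` and `lg_o2` are hypothetical names, inputs are assumed positive and finite, and the quadratic is simply the unique one that is exact at powers of two and keeps the first derivative continuous across them.

```cpp
#include <cmath>

// First order: piecewise linear in the mantissa, exact at powers of two.
inline double lg_o1(double x) {
  int e;
  const double m = std::frexp(x, &e); // x = m * 2^e with m in [0.5, 1)
  return 2.0 * (m - 1.0) + e;
}

// Second order: quadratic in f = 2m - 1, so that x = (1 + f) * 2^(e - 1).
// q(f) = f * (4 - f) / 3 satisfies q(0) = 0, q(1) = 1, and q'(1) = q'(0) / 2,
// which makes the slope with respect to x continuous at each power of two.
inline double lg_o2(double x) {
  int e;
  const double m = std::frexp(x, &e);
  const double f = 2.0 * m - 1.0; // f in [0, 1)
  return (e - 1) + f * (4.0 - f) / 3.0;
}
```

Both functions return exactly 3 for `x = 8.0`; between powers of two, the quadratic version tracks `std::log2` more closely than the linear one.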
PR Checklist
- `make format` command after configuring with `cmake`.

If preparing for a new release, in addition please check the following:
- `when='@main'` dependencies are updated to the release version in the package.py