-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Limit undefined behaviour and fix some wharnings too #243
Conversation
This branch is the same one as for the closed PR #240 . I re-used the branch. |
Thanks for this @jonahm-LANL -- this does solve my downstream problems. The |
Another thought @Yurlungur, this PR might be a good spot to slip in the sieMin_ etc in GetOnDevice() for stellar_collapse? |
Done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good, but I'm concerned about removing device testing of the ideal gas entropy test. To me it doesn't seem necessary since we test the vector version on device anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for making the requested changes. I approve but I'd like to hear if @dholladay00 has anything else to add before merging it
Sounds good. I'll wait to hear from @dholladay00 |
@Yurlungur could you grep through the code and remove instances of root counts where they were used only b/c the API required it? I know where is one in |
(none of this implies a blocking review) Keeping the diagnostic would be useful even on device. Perhaps a better approach is to template on an interface struct DiagStub
{
// has interface as RootCounts but no-op implementation, e.g. DiagStub::reset() {}
};
// solves for f(x,params) - ytarget = 0
template <typename T, typename DiagCount = DiagStub>
PORTABLE_INLINE_FUNCTION Status regula_falsi(const T &f, const Real ytarget,
const Real guess, Real a, Real b,
const Real xtol, const Real ytol,
Real &xroot,
DiagCounts &counts) { For debug or evaluation builds, this can be changed at build time. You also get the benefit of reducing branch points in the function. (This is a "for instance" snippet, go easy on unintended effects). Tangentially, there is the question of rolling a custom root-find instead of relying on a more general library, though I'm not sure that this is a "now" concern. |
The one in In reply to @mauneyc-LANL :
I considered atomics, but I didn't want to deal with the performance implication. Perhaps a good long-term solution is to add atomics to POC and have the atomics be behind a compile flag.
Interesting. I kind of like this idea, but I also kind of like root-finding being optional as an argument, since there were a lot of places in the code where counts were used but not examined. How about we keep what I've done here for now but add an issue suggesting both atomics and a mixin here for the count type. Would that work? |
Could even do this (add default argument) Edit: not sure if can be done with reference. struct DiagStub
{
// has interface as RootCounts but no-op implementation, e.g. DiagStub::reset() {}
};
// solves for f(x,params) - ytarget = 0
template <typename T, typename DiagCount = DiagStub>
PORTABLE_INLINE_FUNCTION Status regula_falsi(const T &f, const Real ytarget,
const Real guess, Real a, Real b,
const Real xtol, const Real ytol,
Real &xroot,
DiagCounts &counts = DiagStub{}) { |
@dholladay00 oh boy, let me tell you all about value categories |
@Yurlungur @dholladay00 @mauneyc-LANL given that the CI passed on Darwin and that we seemed to resolve any open questions at the meeting today, I'm pulling the trigger and merging this. |
PR Summary
In a downstream code, @AstroBarker experienced a cuda "unspecified launch failure" for some applications and on some architectures. After some effort, we narrowed it down to the
RootFinding1D::RootCounts
andRootFinding1D::RootStatus
variables used in the tabulated EOS's. Each EOS had an internal instants ofRootCounts
andStatus
used for diagnostics when the code was run on host in serial.Writing to these variables in multiple threads is undefined behavior. However, I intentionally put the logic in the code anyway. My reasoning was that since they wouldn't be read except for diagnostic purposes in serial, this undefined behavior shouldn't cause any problems. This reasoning was apparently incorrect.
I still think these variables are useful, so I don't want to eliminate this capability completely. However, I want to avoid them being written to in a "device" context (whatever that means going forward), to prevent us from hitting this fail case in the future. So I have made the following changes:
Note that this strategy doesn't prevent someone from sticking a host EOS in an
openmp for
loop, so this doesn't entirely escape the issue, but it seems to be a safe enough workaround. A more complete solution would be to hide the histogram/status logic behind a compile-time flag or remove it entirely. I'd be open to that if people feel strongly, but I think this is probably good enough for now.@AstroBarker please try this branch and see if it resolves the issues you were seeing.
@dholladay00 @mauneyc-LANL @rbberger @jhp-lanl I'd like your feedback on this solution. Do you agree this feels safe enough for now? Or should we more aggressively eliminate any possibility of UB here?
While I'm at it, I cleaned up some warnings I was seeing when compiling the code which were getting on my nerves:
strcat
with device decorations (needed for the error message for entropy not implemented), as the standard library version is not implemented on deviceThere are more warnings in the code than this... and I'd like to clean them up when we have the time. But one thing at a time.
PR Checklist
make format
command after configuring withcmake
.