Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Limit undefined behaviour and fix some wharnings too #243

Merged
merged 13 commits into from
Mar 22, 2023
Merged

Conversation

Yurlungur
Copy link
Collaborator

@Yurlungur Yurlungur commented Mar 21, 2023

PR Summary

In a downstream code, @AstroBarker experienced a cuda "unspecified launch failure" for some applications and on some architectures. After some effort, we narrowed it down to the RootFinding1D::RootCounts and RootFinding1D::RootStatus variables used in the tabulated EOS's. Each EOS had an internal instants of RootCounts and Status used for diagnostics when the code was run on host in serial.

Writing to these variables in multiple threads is undefined behavior. However, I intentionally put the logic in the code anyway. My reasoning was that since they wouldn't be read except for diagnostic purposes in serial, this undefined behavior shouldn't cause any problems. This reasoning was apparently incorrect.

I still think these variables are useful, so I don't want to eliminate this capability completely. However, I want to avoid them being written to in a "device" context (whatever that means going forward), to prevent us from hitting this fail case in the future. So I have made the following changes:

  • The root counts object is now an optional parameter of the root finding routines. You pass in a pointer, if that pointer isn't null, the histogram is recorded.
  • The root counts and root status fields owned by each tabulated EOS object are only written to if the EOS believes itself to be "on host."

Note that this strategy doesn't prevent someone from sticking a host EOS in an openmp for loop, so this doesn't entirely escape the issue, but it seems to be a safe enough workaround. A more complete solution would be to hide the histogram/status logic behind a compile-time flag or remove it entirely. I'd be open to that if people feel strongly, but I think this is probably good enough for now.

@AstroBarker please try this branch and see if it resolves the issues you were seeing.

@dholladay00 @mauneyc-LANL @rbberger @jhp-lanl I'd like your feedback on this solution. Do you agree this feels safe enough for now? Or should we more aggressively eliminate any possibility of UB here?

While I'm at it, I cleaned up some warnings I was seeing when compiling the code which were getting on my nerves:

  • A few unused variable warnings
  • I re-implemented strcat with device decorations (needed for the error message for entropy not implemented), as the standard library version is not implemented on device
  • I made some device/inline annotations consistent in a few spots.

There are more warnings in the code than this... and I'd like to clean them up when we have the time. But one thing at a time.

PR Checklist

  • Adds a test for any bugs fixed. Adds tests for new features.
  • Format your changes by using the make format command after configuring with cmake.
  • Document any new features, update documentation for changes made.
  • Make sure the copyright notice on any files you modified is up to date.
  • After creating a pull request, note it in the CHANGELOG.md file
  • If preparing for a new release, update the version in cmake.

@Yurlungur Yurlungur added bug Something isn't working clean-up Changes that don't affect code execution but make the code cleaner labels Mar 21, 2023
@Yurlungur Yurlungur self-assigned this Mar 21, 2023
@Yurlungur
Copy link
Collaborator Author

This branch is the same one as for the closed PR #240 . I re-used the branch.

@AstroBarker
Copy link
Collaborator

Thanks for this @jonahm-LANL -- this does solve my downstream problems. The pcounts implementation looks solid to me.

@AstroBarker
Copy link
Collaborator

Another thought @Yurlungur, this PR might be a good spot to slip in the sieMin_ etc in GetOnDevice() for stellar_collapse?

@Yurlungur
Copy link
Collaborator Author

Another thought @Yurlungur, this PR might be a good spot to slip in the sieMin_ etc in GetOnDevice() for stellar_collapse?

Done

Copy link
Collaborator

@jhp-lanl jhp-lanl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, but I'm concerned about removing device testing of the ideal gas entropy test. To me it doesn't seem necessary since we test the vector version on device anyway.

CHANGELOG.md Outdated Show resolved Hide resolved
singularity-eos/eos/eos_base.hpp Outdated Show resolved Hide resolved
test/profile_eos.cpp Show resolved Hide resolved
test/test_eos_unit.cpp Show resolved Hide resolved
Copy link
Collaborator

@jhp-lanl jhp-lanl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for making the requested changes. I approve but I'd like to hear if @dholladay00 has anything else to add before merging it

@Yurlungur
Copy link
Collaborator Author

Sounds good. I'll wait to hear from @dholladay00

@dholladay00
Copy link
Collaborator

@Yurlungur could you grep through the code and remove instances of root counts where they were used only b/c the API required it? I know where is one in singularity_eos.cpp that can be removed.

@mauneyc-LANL
Copy link
Collaborator

(none of this implies a blocking review)
I'm a bit late to this so apologies if this has been covered. Would the better solution be to use atomic writes? This may require some "portable" atomic routines in PoC, tho.

Keeping the diagnostic would be useful even on device. Perhaps a better approach is to template on an interface

struct DiagStub
{
// has interface as RootCounts but no-op implementation, e.g. DiagStub::reset() {}
};

// solves for f(x,params) - ytarget = 0
template <typename T, typename DiagCount = DiagStub>
PORTABLE_INLINE_FUNCTION Status regula_falsi(const T &f, const Real ytarget,
                                             const Real guess, Real a, Real b,
                                             const Real xtol, const Real ytol,
                                             Real &xroot,
                                             DiagCounts &counts) {

For debug or evaluation builds, this can be changed at build time. You also get the benefit of reducing branch points in the function. (This is a "for instance" snippet, go easy on unintended effects).

Tangentially, there is the question of rolling a custom root-find instead of relying on a more general library, though I'm not sure that this is a "now" concern.

@Yurlungur
Copy link
Collaborator Author

@Yurlungur could you grep through the code and remove instances of root counts where they were used only b/c the API required it? I know where is one in singularity_eos.cpp that can be removed.

The one in singularity_eos.cpp was the only one I missed. I'll fix it. :)

In reply to @mauneyc-LANL :

Would the better solution be to use atomic writes? This may require some "portable" atomic routines in PoC, tho.

I considered atomics, but I didn't want to deal with the performance implication. Perhaps a good long-term solution is to add atomics to POC and have the atomics be behind a compile flag.

Keeping the diagnostic would be useful even on device. Perhaps a better approach is to template on an interface

Interesting. I kind of like this idea, but I also kind of like root-finding being optional as an argument, since there were a lot of places in the code where counts were used but not examined.

How about we keep what I've done here for now but add an issue suggesting both atomics and a mixin here for the count type. Would that work?

@dholladay00
Copy link
Collaborator

dholladay00 commented Mar 22, 2023

Could even do this (add default argument) Edit: not sure if can be done with reference.

struct DiagStub
{
// has interface as RootCounts but no-op implementation, e.g. DiagStub::reset() {}
};

// solves for f(x,params) - ytarget = 0
template <typename T, typename DiagCount = DiagStub>
PORTABLE_INLINE_FUNCTION Status regula_falsi(const T &f, const Real ytarget,
                                             const Real guess, Real a, Real b,
                                             const Real xtol, const Real ytol,
                                             Real &xroot,
                                             DiagCounts &counts = DiagStub{}) {

@mauneyc-LANL
Copy link
Collaborator

Edit: not sure if can be done with reference.

@dholladay00 oh boy, let me tell you all about value categories

@jhp-lanl
Copy link
Collaborator

@Yurlungur @dholladay00 @mauneyc-LANL given that the CI passed on Darwin and that we seemed to resolve any open questions at the meeting today, I'm pulling the trigger and merging this.

@jhp-lanl jhp-lanl merged commit a6ef1e5 into main Mar 22, 2023
@jhp-lanl jhp-lanl deleted the jmm/fix-warnings branch March 22, 2023 23:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working clean-up Changes that don't affect code execution but make the code cleaner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants