-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
EOSPAC vector implementations #177
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks really good! I don't think the underlying math and thermodynamics you copied over from the existing code has been thoroughly vetted, but that's beyond the scope of this MR. There may be additional performance improvements we can add, but those can be done later.
The only changes I would ask for would be a simple function to provide a required scratch size and a simple test. There are multiple scratch sizes needed though so this requires some discussion I think. For the test, I would just pick one of these methods and make a test for the chunk version and the full array version and make sure the result agrees with a for loop over scalar lookups. We can expand the test later as part of vetting the underlying logic and math.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a draft, so not approving or requesting changes. A little bit of feedback below.
Also thanks for putting in the work to prototype this! I like both the "scatch" and "chunk" approaches. But I think we should be a little thoughtful of GPU performance in both cases. I don't think any of this code will run on GPU as-is. |
b9afe6f
to
d4eef73
Compare
ac4e0b8
to
b67321c
Compare
@Yurlungur @jhp-lanl I've updated all scratch and chunk versions to use EOSPAC options to minimize copies and do unit conversions. I've also expanded the profiling to include more methods and made it part of CTest. Below are timings I've generated per method on |
6b54d26
to
644a3e0
Compare
Great work @rbberger ! I'm excited especially that this work is part of |
@jhp-lanl yeah, having it part of CTest already pays off. I'm having weird failures on power9 that need further investigation. 😄 |
Peanut gallery says we go to scratch and drop lambdas and chunks |
@Yurlungur @jhp-lanl Issue #187 blocks me from doing any meaningful GPU tests on the CI. |
a8a922d
to
159f465
Compare
I seem to recall that the accessible EOSPAC spackage isn't up-to-date and I'm not quite sure what the best way to get 6.5 is. I'll ping Daniel Sheppard. |
159f465
to
0f5fde3
Compare
0f5fde3
to
8f82ef6
Compare
@Yurlungur ready for review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm nominally approving this because most of my comments are questions rather than critiques and I can be convinced out of any code changes.
@rbberger feel free to resolve any of my questions as you answer them or make code changes to address them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I left this so long @rbberger it slipped my mind. Overall this looks very good. A few nitpicks below.
Once we use C++17 this could be replaced with [[maybe_unused]]
134f642
to
80a6b03
Compare
791e78e
to
59a2e74
Compare
6d56ed6
to
50ef5a3
Compare
8b1907b
to
b184e19
Compare
@Yurlungur @jhp-lanl except the point about the modifiers and GPUs, this PR should be done. It also passes on re-git. Let me know if the documentation is sufficient. |
b184e19
to
72a6287
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really good @rbberger ! Thanks for all the effort on this!
I'll wait for the Darwin tests to pass before we merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. 👍 Let's merge as soon as Darwin tests pass
6e1ed15
to
72a6287
Compare
@rbberger is there anything holding this up? To me it looks like you can merge in master (and resolve conflicts) and we can merge this. |
No. It kind of got stuck because we were looking at why GPU builds being so slow (in general, not just this), but I'll fix this one up again so we can merge it. |
Let us know if you need any support getting this pushed 👍 |
ef4fe7b
to
7829ece
Compare
@Yurlungur re-git pipeline successful. feel free to merge. |
Thanks for all your hard work on this @rbberger ! At some point we should check to see what is necessary to pull this through for GPU implementations and modifiers. But for now I think we can celebrate speeding up the EOSPAC calls. |
Yes indeed! Thanks for all the hard work @rbberger ! |
PR Summary
PR Checklist
make format
command after configuring withcmake
.