Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/openmp #1411

Merged
merged 27 commits into from
Nov 8, 2023
Merged

Feature/openmp #1411

merged 27 commits into from
Nov 8, 2023

Conversation

maddyscientist
Copy link
Member

This is mainly a quality of life PR, as well as adding some building work for future work:

  • Properly enable OpenMP host compilation
  • OMP parallelization of some of the tests codes (gauge force/path, hisq force, fat/long link construction, staggered dslash reference)
  • Acceleration and robustness improvement of (staggered_)dslash_ctest; we now use the same gauge field across different partitioning
    • this has an especially dramatic improvement on the execution time for the improved staggered fermion tests when constructing the fat/long links
  • some sundry cleanup resulting from the above additions

@mathiaswagner
Copy link
Member

Is QUDA_OPENMP exposed in cmake ?

@maddyscientist
Copy link
Member Author

Is QUDA_OPENMP exposed in cmake ?

Yes, it has been for about 5 years 🙂

…Field. Enable destruction of ColorSpinorField through moving a null constructed field
… reused across all partitions / precisions of dslash_test and staggered_dslash_test
@maddyscientist
Copy link
Member Author

Latest few commits fix a couple of additional issues I found in testing:

  • QUDA_CTEST_LAUNCH needed an additional FORCE to enable the user to override the setting, e.g., to run ctest on different process counts without nuking the build
  • For the dslash ctests, make the host quark field creation a one-time event, created when the host gauge field is created. This reduces the test overhead, and ensures we use the random source field for all partitions and precision.

Also noting that the improvements from this PR reduce the ctest overhead on the CSCS CI by over 100 seconds, so a welcome boost to CI throughput.

tests/CMakeLists.txt Outdated Show resolved Hide resolved
@mathiaswagner
Copy link
Member

@Jenkins test this please.

@mathiaswagner
Copy link
Member

@Jenkins test this please.

1 similar comment
@mathiaswagner
Copy link
Member

@Jenkins test this please.

@weinbe2
Copy link
Contributor

weinbe2 commented Oct 31, 2023

As a thought, in the spirit of the raw_pointer function for the raw gauge field pointer, it'd also be nice to have a raw_pointer_ghost (or whatnot) that returns a void** for CPU-ordered fields

@mathiaswagner
Copy link
Member

Is QUDA_OPENMP exposed in cmake ?

Yes, it has been for about 5 years 🙂

Guess it is one of the options I never use.
Should we turn it ON by default?
We could also rely on cmake to detect support and make QUDA_OPENMP=ON mean request OpenMP if supported by the compiler.

@maddyscientist
Copy link
Member Author

Is QUDA_OPENMP exposed in cmake ?

Yes, it has been for about 5 years 🙂

Guess it is one of the options I never use. Should we turn it ON by default? We could also rely on cmake to detect support and make QUDA_OPENMP=ON mean request OpenMP if supported by the compiler.

I didn't turn it on by default since I still consider it very experimental. In particular Eigen has huge slowdowns with OMP enabled, and until we work that out, I don't think we should enable by default.

@weinbe2
Copy link
Contributor

weinbe2 commented Nov 2, 2023

Is QUDA_OPENMP exposed in cmake ?

Yes, it has been for about 5 years 🙂

Guess it is one of the options I never use. Should we turn it ON by default? We could also rely on cmake to detect support and make QUDA_OPENMP=ON mean request OpenMP if supported by the compiler.

I didn't turn it on by default since I still consider it very experimental. In particular Eigen has huge slowdowns with OMP enabled, and until we work that out, I don't think we should enable by default.

I can't test it right now, but it looks like there's a compile-time define EIGEN_DONT_PARALLELIZE (circa 2019...) that can be used to disable OpenMP. Source: https://stackoverflow.com/a/56613641

Copy link
Contributor

@weinbe2 weinbe2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, thanks @maddyscientist !

@weinbe2 weinbe2 merged commit 01e419a into develop Nov 8, 2023
3 checks passed
@weinbe2 weinbe2 deleted the feature/openmp branch November 8, 2023 20:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants