Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use ofast-flag rather than fast-math for mkFit #124

Merged
merged 3 commits into from
May 4, 2023
Merged

Conversation

mmasciov
Copy link

@mmasciov mmasciov commented Apr 14, 2023

PR description:

As per title, this PR is meant to switch to ofast-flag rather than fast-math for mkFit building, as this was proven to allow for improved stability across different architectures.
An analogous change is applied to the standalone mkFit Makefile.config.

Additional notes on the mkFit Makefile.config:
ofast-flag in CMSSW consists of -Ofast -fno-reciprocal-math -mrecip=none (see cms-sw/cmsdist#8280).
This is reflected in the mkFit Makefile.config, with intentional redundancy:
-Ofast turns on -ffast-math, which in turn activates -fno-math-errno (see reference); the redundant flags are not removed intentionally, so that in the case one is willing to remove the effects of ofast-flag in the standalone mkFit application, it will be sufficient to comment out the line added in Makefile.config with this PR, without changing vectorization due to other pre-existing flags.

PR validation:

This PR is not meant to affect physics performance, except for compilation-driven effects. Technical performance is not affected either.
This is validated in: http://uaf-10.t2.ucsd.edu/~mmasciov/MIC/testFastMath/MTV_ttbarPU_cgpu-1_fastmath_vs_ofast/ (red/black vs. blue)

@@ -130,6 +130,7 @@ endif
ifeq ($(CXX), g++)
CXXFLAGS += -std=c++1z -ftree-vectorize -Werror=main -Werror=pointer-arith -Werror=overlength-strings -Wno-vla -Werror=overflow -Wstrict-overflow -Werror=array-bounds -Werror=format-contains-nul -Werror=type-limits -fvisibility-inlines-hidden -fno-math-errno --param vect-max-version-for-alias-checks=50 -Xassembler --compress-debug-sections -felide-constructors -fmessage-length=0 -Wall -Wno-non-template-friend -Wno-long-long -Wreturn-type -Wunused -Wparentheses -Wno-deprecated -Werror=return-type -Werror=missing-braces -Werror=unused-value -Werror=address -Werror=format -Werror=sign-compare -Werror=write-strings -Werror=delete-non-virtual-dtor -Wstrict-aliasing -Werror=narrowing -Werror=unused-but-set-variable -Werror=reorder -Werror=unused-variable -Werror=conversion-null -Werror=return-local-addr -Wnon-virtual-dtor -Werror=switch -fdiagnostics-show-option -Wno-unused-local-typedefs -Wno-attributes -Wno-psabi
CXXFLAGS += -fdiagnostics-color=auto -fdiagnostics-show-option -pthread -pipe -fopenmp-simd
CXXFLAGS += -Ofast -fno-reciprocal-math -mrecip=none
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is -fno-math-errno still needed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure. IIUC, @srlantz tested the additional flags on top of the existing ones, that's why I did not removed it. However, can be removed if unnecessary.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also , -ftree-vectorize is a part of -O2, apparently (which is in turn a part of -O3, and that is in -Ofast)

or is the point that redundant flags are OK?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I let @srlantz comment about it: to me, we can clean the unnecessary flags; however, I thought I would be as close as possible to what I understood was tested as a start.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, not setting errno is a requirement for vectorizing transcendentals, so it’s probably worth specifying even if it’s subsumed by other optimization flags

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Makefile.config gcc flags were copied from CMSSW at some point in the distant past, it would be completely reasonable to update to whatever the defaults are today. (I think there was one flag I had to omit back in the day, but that must have been resolved getting mkFit into CMSSW.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed cb2e7d6, which updates to the most recent gcc flags, with the exception of -O2, since we already invoke -O3. Does this look OK?
I checked, and standalone mkFit compiles.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I agree with @dan131riley that we should take the opportunity to update these flags. Thanks @mmasciov !

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Then, if OK with everyone, I will open a PR to cms-sw/cmssw

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: cms-sw#41374

@slava77 slava77 merged commit 34c4538 into master May 4, 2023
@srlantz srlantz deleted the ofast-flag_mkFit branch October 30, 2023 20:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants