FP16_Base compilation fixes #115
also, this PR is relative to a new branch
comparisons also looked OK when running on GPU L4 cgpu-1 dev=0
/run all
running the tests, although results will not be visible
The PR was built and ran successfully in standalone mode. Here are some of the comparison plots. The full set of validation and comparison plots can be found here. Here is a timing comparison:
The PR was built and ran successfully with CMSSW. Here are some plots. OOTB All Tracks. The full set of validation and comparison plots can be found here.
I will
do we need updates to the CI? (before trying to rerun the tests)
I didn't think of that. Let's see what happens to #119, where I naively issued the command.
@VourMa
/run all
@ariostas
@VourMa
if this looks OK, perhaps it can be merged
Once the tests finish, I will merge.
The PR was built and ran successfully in standalone mode. Here are some of the comparison plots. The full set of validation and comparison plots can be found here. Here is a timing comparison:
The PR was built and ran successfully with CMSSW. Here are some plots. OOTB All Tracks. The full set of validation and comparison plots can be found here.
Merged commit 4b27e43 into SegmentLinking:CMSSW_14_1_0_pre3_LST_X_LSTCore_realfiles_batch9
I tested by adding -DFP16_Base to RecoTracker/LSTCore/standalone/Makefile (in ALPAKA_CUDA) and RecoTracker/LSTCore/standalone/LST/Makefile (in ALPAKACUDA), as well as ROCM.
ref before this PR:
with the changes in Makefile:
There is a hint that the code is a bit faster.
The memory reduction is probably more important currently.
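For context, here is a minimal sketch of how a compile-time flag such as -DFP16_Base could select a narrower storage type and so halve the memory per stored value. This is not the LSTCore code: Half16, StorageType, bufferBytes and selectedBufferBytes are hypothetical names used only for illustration, and Half16 is just a 16-bit stand-in for a real half-precision type.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical 16-bit stand-in for a half-precision storage type.
// Real GPU code would use the backend's own half type instead.
struct Half16 {
  std::uint16_t bits;
};

// Building with -DFP16_Base selects the 16-bit storage type;
// without the flag, values are stored as 32-bit floats.
#ifdef FP16_Base
using StorageType = Half16;
#else
using StorageType = float;
#endif

// Bytes needed to store n values of type T.
template <typename T>
constexpr std::size_t bufferBytes(std::size_t n) {
  return n * sizeof(T);
}

// Bytes needed to store n values in the currently selected storage type.
constexpr std::size_t selectedBufferBytes(std::size_t n) {
  return bufferBytes<StorageType>(n);
}
```

With this sketch, bufferBytes<float>(1000) is twice bufferBytes<Half16>(1000), which is the kind of saving the flag is meant to give for the affected buffers.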