FP16_Base compilation fixes #115

@slava77 slava77 commented Nov 5, 2024

I tested by adding -DFP16_Base to RecoTracker/LSTCore/standalone/Makefile (in ALPAKA_CUDA) and RecoTracker/LSTCore/standalone/LST/Makefile (in ALPAKACUDA) as well as ROCM.

ref before this PR
image
with the changes in Makefile:
image

There is a hint that the code is a bit faster.

The memory reduction is probably more important currently.
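As a rough illustration of the memory argument, here is a minimal, generic sketch using Python's standard `struct` module. It assumes, as the flag name suggests, that `FP16_Base` switches some stored values from 32-bit to 16-bit floats. It is not the LST code: the value count `N_VALUES` and the helper names are hypothetical, and this thread does not say which quantities are actually stored at half precision.

```python
import struct

# Bytes per value for IEEE 754 half ("e") and single ("f") precision.
HALF_BYTES = struct.calcsize("<e")
SINGLE_BYTES = struct.calcsize("<f")


def storage_bytes(n_values: int, use_half: bool) -> int:
    """Bytes needed to store n_values floats at the chosen precision."""
    return n_values * (HALF_BYTES if use_half else SINGLE_BYTES)


def roundtrip_half(x: float) -> float:
    """Store x as a 16-bit float and read it back, exposing the rounding."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


N_VALUES = 1_000_000  # hypothetical number of stored values, for illustration only

print("single:", storage_bytes(N_VALUES, use_half=False), "bytes")
print("half:  ", storage_bytes(N_VALUES, use_half=True), "bytes")
print("0.1 stored as half reads back as", roundtrip_half(0.1))
```

The storage halves for every value kept at 16 bits, at the cost of rounding, which is why the physics comparisons (efficiency, fake rate, duplicate rate) matter alongside the timing.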

@slava77
Author

slava77 commented Nov 5, 2024

also, this PR is relative to a new branch CMSSW_14_1_0_pre3_LST_X_LSTCore_realfiles_batch9

@slava77
Author

slava77 commented Nov 5, 2024

> with the changes in Makefile:

comparisons also looked OK when running on GPU L4 cgpu-1 dev=0
(somewhat older ref from sometime in batch7) https://uaf-10.t2.ucsd.edu/~slava77/sdl/efficiencies/testSOA_00013d-PU200/summary/fakerate.html
vs this PR
https://uaf-10.t2.ucsd.edu/~slava77/sdl/efficiencies/testFP16_7c3358-PU200/summary/fakerate.html

@VourMa VourMa mentioned this pull request Nov 6, 2024
@slava77
Author

slava77 commented Nov 6, 2024

/run all

@slava77
Author

slava77 commented Nov 6, 2024

running the tests, although results will not be visible

github-actions bot commented Nov 6, 2024

The PR was built and ran successfully in standalone mode. Here are some of the comparison plots.

Efficiency vs pT comparison Efficiency vs eta comparison
Fake rate vs pT comparison Fake rate vs eta comparison
Duplicate rate vs pT comparison Duplicate rate vs eta comparison

The full set of validation and comparison plots can be found here.

Here is a timing comparison:

   Evt    Hits       MD       LS      T3       T5       pLS       pT5      pT3      TC       Reset    Event     Short             Rate
   avg     39.6    390.8    194.9    127.4    135.7    544.1    117.7    243.4    105.8      3.4    1902.6    1318.9+/- 364.7     502.6   explicit[s=4] (target branch)
   avg     39.8    388.8    190.5    128.4    135.7    546.0    118.2    242.3    105.8      3.1    1898.6    1312.8+/- 367.0     501.9   explicit[s=4] (this PR)
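To make the two rows easier to compare, the following sketch computes the per-column relative change of this PR with respect to the target branch. The numbers are copied from the table above; the names `COLUMNS`, `TARGET`, `PR` and `percent_change` are introduced here for illustration only, and the `+/-` spread on the `Short` column is left out.

```python
# Column names and averages copied from the timing table above.
COLUMNS = ["Hits", "MD", "LS", "T3", "T5", "pLS", "pT5", "pT3",
           "TC", "Reset", "Event", "Short", "Rate"]
TARGET = [39.6, 390.8, 194.9, 127.4, 135.7, 544.1, 117.7, 243.4,
          105.8, 3.4, 1902.6, 1318.9, 502.6]
PR = [39.8, 388.8, 190.5, 128.4, 135.7, 546.0, 118.2, 242.3,
      105.8, 3.1, 1898.6, 1312.8, 501.9]


def percent_change(ref: float, new: float) -> float:
    """Relative change of new with respect to ref, in percent."""
    return 100.0 * (new - ref) / ref


# Negative values mean the PR row is lower than the target-branch row.
CHANGES = {name: percent_change(t, p)
           for name, t, p in zip(COLUMNS, TARGET, PR)}

for name, change in CHANGES.items():
    print(f"{name:>6}: {change:+.2f}%")
```

A single run like this cannot separate a real change from run-to-run fluctuation, so the per-column numbers are best read as a consistency check.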

github-actions bot commented Nov 6, 2024

The PR was built and ran successfully with CMSSW. Here are some plots.

OOTB All Tracks
Efficiency and fake rate vs pT, eta, and phi

The full set of validation and comparison plots can be found here.

@VourMa
Collaborator

VourMa commented Nov 7, 2024

I will force-push the updated realfiles branch also to batch9, so that they are consistent.

@slava77
Author

slava77 commented Nov 7, 2024

> I will force-push the updated realfiles branch also to batch9, so that they are consistent.

do we need updates to the CI? (before trying to rerun the tests)

@VourMa
Collaborator

VourMa commented Nov 7, 2024

> do we need updates to the CI? (before trying to rerun the tests)

I didn't think of that. Let's see what happens to #119, where I naively issued the command.

@slava77
Author

slava77 commented Nov 8, 2024

@VourMa
if this looks OK, perhaps it can be merged

@slava77
Author

slava77 commented Nov 8, 2024

/run all

@slava77
Author

slava77 commented Nov 8, 2024

@ariostas
in case you were working on Matti's comments, I added the bool part here (the other part about CopyToDevice seemed less trivial)

Collaborator

@VourMa VourMa left a comment

> @VourMa
> if this looks OK, perhaps it can be merged

Once the tests finish, I will merge.

github-actions bot commented Nov 8, 2024

The PR was built and ran successfully in standalone mode. Here are some of the comparison plots.

Efficiency vs pT comparison Efficiency vs eta comparison
Fake rate vs pT comparison Fake rate vs eta comparison
Duplicate rate vs pT comparison Duplicate rate vs eta comparison

The full set of validation and comparison plots can be found here.

Here is a timing comparison:

   Evt    Hits       MD       LS      T3       T5       pLS       pT5      pT3      TC       Reset    Event     Short             Rate
   avg     42.6    393.7    196.6    135.0    140.8    544.4    119.1    247.5    106.4      3.3    1929.5    1342.4+/- 368.2     511.1   explicit[s=4] (target branch)
   avg     41.4    395.8    197.4    136.0    139.8    540.0    118.1    247.4    105.5      3.4    1924.8    1343.5+/- 377.0     519.8   explicit[s=4] (this PR)
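With two bot runs available (Nov 6 and Nov 8), one way to judge the timing is to express the shift in the `Short` column in units of the quoted `+/-` value. The sketch below does that with numbers copied from both timing tables. It assumes the `+/-` value is a spread of the per-event time, which the report does not state; `RUNS` and `shift_in_spreads` are names made up for this example.

```python
# (run label, target-branch Short, target-branch +/- value, this-PR Short),
# copied from the Nov 6 and Nov 8 timing tables.
RUNS = [
    ("Nov 6", 1318.9, 364.7, 1312.8),
    ("Nov 8", 1342.4, 368.2, 1343.5),
]


def shift_in_spreads(target: float, spread: float, pr: float) -> float:
    """Difference (PR minus target) divided by the quoted +/- value."""
    return (pr - target) / spread


SHIFTS = {label: shift_in_spreads(target, spread, pr)
          for label, target, spread, pr in RUNS}

for label, shift in SHIFTS.items():
    print(f"{label}: {shift:+.3f} spreads")
```

Under that assumption the shift changes sign between the two runs and stays at a small fraction of the quoted spread, so these numbers alone do not establish a speedup.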

github-actions bot commented Nov 8, 2024

The PR was built and ran successfully with CMSSW. Here are some plots.

OOTB All Tracks
Efficiency and fake rate vs pT, eta, and phi

The full set of validation and comparison plots can be found here.

@VourMa VourMa merged commit 4b27e43 into SegmentLinking:CMSSW_14_1_0_pre3_LST_X_LSTCore_realfiles_batch9 Nov 8, 2024
3 checks passed
3 participants