FP16_Base compilation fixes #115

slava77 · 2024-11-05T23:17:11Z

I tested by adding -DFP16_Base to RecoTracker/LSTCore/standalone/Makefile (in ALPAKA_CUDA) and RecoTracker/LSTCore/standalone/LST/Makefile (in ALPAKACUDA), and likewise for ROCM.

ref before this PR

with the changes in Makefile:

There is a hint that the code is a bit faster.

The memory reduction is probably more important currently.
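To make the memory argument concrete, here is a minimal sketch of the storage trade-off between single and half precision. It is not the LST code: in the PR the switch is made at compile time with -DFP16_Base, whereas this sketch uses the Python standard library `struct` module, whose `e` and `f` format codes are IEEE 754 binary16 and binary32. The helper names and the sample values are hypothetical.

```python
import struct


def pack_values(values, half):
    """Pack floats as little-endian binary16 ('e') or binary32 ('f')."""
    code = "e" if half else "f"
    return struct.pack("<%d%s" % (len(values), code), *values)


def unpack_values(buf, half):
    """Inverse of pack_values; returns a list of Python floats."""
    code, size = ("e", 2) if half else ("f", 4)
    return list(struct.unpack("<%d%s" % (len(buf) // size, code), buf))


# Illustrative numbers only, not LST data.
values = [0.5, 1.5, 3.14159, 123.456]

buf32 = pack_values(values, half=False)
buf16 = pack_values(values, half=True)

# Precision lost by storing in half precision and reading back.
roundtrip16 = unpack_values(buf16, half=True)
max_rel_err = max(abs(a - b) / abs(a) for a, b in zip(values, roundtrip16))

print("bytes fp32:", len(buf32), "bytes fp16:", len(buf16))
print("max relative round-trip error:", max_rel_err)
```

The half-precision buffer takes half the bytes. For values in the normal binary16 range, the relative rounding error stays within 2^-11, which is the price paid for the smaller footprint.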

slava77 · 2024-11-05T23:17:47Z

also, this PR is relative to a new branch CMSSW_14_1_0_pre3_LST_X_LSTCore_realfiles_batch9

slava77 · 2024-11-05T23:19:36Z

with the changes in Makefile:

comparisons also looked OK when running on an L4 GPU (cgpu-1, dev=0):
reference (somewhat older, from sometime in batch7): https://uaf-10.t2.ucsd.edu/~slava77/sdl/efficiencies/testSOA_00013d-PU200/summary/fakerate.html
this PR: https://uaf-10.t2.ucsd.edu/~slava77/sdl/efficiencies/testFP16_7c3358-PU200/summary/fakerate.html

slava77 · 2024-11-06T12:53:25Z

/run all

slava77 · 2024-11-06T12:53:42Z

running the tests, although results will not be visible

github-actions · 2024-11-06T13:09:45Z

The PR was built and ran successfully in standalone mode. Here are some of the comparison plots.

The full set of validation and comparison plots can be found here.

Here is a timing comparison:

   Evt    Hits       MD       LS      T3       T5       pLS       pT5      pT3      TC       Reset    Event     Short             Rate
   avg     39.6    390.8    194.9    127.4    135.7    544.1    117.7    243.4    105.8      3.4    1902.6    1318.9+/- 364.7     502.6   explicit[s=4] (target branch)
   avg     39.8    388.8    190.5    128.4    135.7    546.0    118.2    242.3    105.8      3.1    1898.6    1312.8+/- 367.0     501.9   explicit[s=4] (this PR)
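As a reading aid, the two `avg` rows above can be turned into per-column relative changes. This is a small sketch with the numbers copied from the table; the variable and function names are hypothetical and it is not part of the CI tooling.

```python
# Column names and values copied from the timing table above.
columns = ["Hits", "MD", "LS", "T3", "T5", "pLS", "pT5", "pT3", "TC",
           "Reset", "Event", "Short"]
target = [39.6, 390.8, 194.9, 127.4, 135.7, 544.1, 117.7, 243.4, 105.8,
          3.4, 1902.6, 1318.9]
this_pr = [39.8, 388.8, 190.5, 128.4, 135.7, 546.0, 118.2, 242.3, 105.8,
           3.1, 1898.6, 1312.8]


def relative_change_percent(ref, new):
    """Signed change of `new` relative to `ref`, in percent."""
    return 100.0 * (new - ref) / ref


changes = {
    name: relative_change_percent(ref, new)
    for name, ref, new in zip(columns, target, this_pr)
}

for name, pct in changes.items():
    print(f"{name:>6} {pct:+.2f}%")
```

For this run the Event and Short columns change by roughly 0.2% and 0.5%, which is far smaller than the +/- value quoted next to Short, so the table on its own does not show a significant timing difference.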

github-actions · 2024-11-06T14:30:33Z

The PR was built and ran successfully with CMSSW. Here are some plots.

OOTB All Tracks

The full set of validation and comparison plots can be found here.

VourMa · 2024-11-07T13:46:32Z

I will force-push the updated realfiles branch also to batch9, so that they are consistent.

slava77 · 2024-11-07T14:14:56Z

> I will force-push the updated realfiles branch also to batch9, so that they are consistent.

do we need updates to the CI? (before trying to rerun the tests)

VourMa · 2024-11-07T14:16:40Z

> do we need updates to the CI? (before trying to rerun the tests)

I didn't think of that. Let's see what happens to #119, where I naively issued the command.

slava77 · 2024-11-08T13:21:30Z

@VourMa
if this looks OK, perhaps it can be merged

slava77 · 2024-11-08T14:36:36Z

/run all

slava77 · 2024-11-08T14:38:40Z

@ariostas
in case you were working on Matti's comments, I added the bool part here (the other part about CopyToDevice seemed less trivial)

VourMa

> @VourMa
> if this looks OK, perhaps it can be merged

Once the tests finish, I will merge.

github-actions · 2024-11-08T14:52:47Z

The PR was built and ran successfully in standalone mode. Here are some of the comparison plots.

The full set of validation and comparison plots can be found here.

Here is a timing comparison:

   Evt    Hits       MD       LS      T3       T5       pLS       pT5      pT3      TC       Reset    Event     Short             Rate
   avg     42.6    393.7    196.6    135.0    140.8    544.4    119.1    247.5    106.4      3.3    1929.5    1342.4+/- 368.2     511.1   explicit[s=4] (target branch)
   avg     41.4    395.8    197.4    136.0    139.8    540.0    118.1    247.4    105.5      3.4    1924.8    1343.5+/- 377.0     519.8   explicit[s=4] (this PR)

github-actions · 2024-11-08T16:29:11Z

The PR was built and ran successfully with CMSSW. Here are some plots.

OOTB All Tracks

The full set of validation and comparison plots can be found here.

FP16_Base compilation fixes

b8ccf2b

VourMa mentioned this pull request Nov 6, 2024

Add LST and LSTCore #30

Merged

correct return type

fc792fd

VourMa approved these changes Nov 8, 2024

VourMa merged commit 4b27e43 into SegmentLinking:CMSSW_14_1_0_pre3_LST_X_LSTCore_realfiles_batch9 Nov 8, 2024
3 checks passed
