Follow up to the Alpaka Implementation of PFClusterProducer #43501

fwyzard · 2023-12-05T14:58:19Z

The goal of this issue is to collect feedback and action items on the Alpaka Implementation of PFClusterProducer, to be followed up after the integration of #43130.

`elements_in_block_with_stride`

We should implement a helper function `elements_in_block_with_stride(acc, extent)` that makes all the threads in a group loop over and cover `extent` elements, with a stride equal to the group size.

Then, update the loop in TopoClusterContraction to use elements_in_block_with_stride instead of

      for (int rhIdx = alpaka::getIdx<alpaka::Block, alpaka::Threads>(acc)[0u]; rhIdx < nRH;
           rhIdx += alpaka::getWorkDiv<alpaka::Block, alpaka::Threads>(acc)[0u]) {
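As a rough illustration of the intended behaviour, here is a host-only sketch of such a helper. It is not the alpaka or CMSSW implementation: `FakeAcc`, `StridedRange` and `visit_counts` are hypothetical names, and `FakeAcc` only stands in for the two values that the loop above reads from `acc` (the thread index within the block and the block size).

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the accelerator: in a real kernel these two values
// would come from alpaka::getIdx<alpaka::Block, alpaka::Threads>(acc)[0u] and
// alpaka::getWorkDiv<alpaka::Block, alpaka::Threads>(acc)[0u].
struct FakeAcc {
  int threadIdx;  // index of this thread within its block
  int blockSize;  // number of threads in the block
};

// Range that yields first, first + stride, first + 2 * stride, ... below extent.
class StridedRange {
public:
  class iterator {
  public:
    iterator(int index, int stride, int extent) : index_(index), stride_(stride), extent_(extent) {}
    int operator*() const { return index_; }
    iterator& operator++() {
      index_ += stride_;
      if (index_ > extent_)
        index_ = extent_;  // clamp, so that comparing against end() terminates
      return *this;
    }
    bool operator!=(iterator const& other) const { return index_ != other.index_; }

  private:
    int index_;
    int stride_;
    int extent_;
  };

  StridedRange(int first, int stride, int extent)
      : first_(first < extent ? first : extent), stride_(stride), extent_(extent) {}
  iterator begin() const { return iterator(first_, stride_, extent_); }
  iterator end() const { return iterator(extent_, stride_, extent_); }

private:
  int first_;
  int stride_;
  int extent_;
};

// Sketch of the helper: each thread starts at its own index within the block
// and advances by the block size until it reaches extent.
inline StridedRange elements_in_block_with_stride(FakeAcc const& acc, int extent) {
  assert(acc.blockSize > 0 && extent >= 0);
  return StridedRange(acc.threadIdx, acc.blockSize, extent);
}

// Host-side check: run every thread of one block serially and count how many
// times each element is visited.
inline std::vector<int> visit_counts(int blockSize, int extent) {
  std::vector<int> counts(extent, 0);
  for (int t = 0; t < blockSize; ++t) {
    for (int i : elements_in_block_with_stride(FakeAcc{t, blockSize}, extent)) {
      ++counts[i];
    }
  }
  return counts;
}
```

With a helper of this shape, the loop in TopoClusterContraction could read `for (int rhIdx : elements_in_block_with_stride(acc, nRH)) { ... }`, assuming the real helper extracts the index and block size from `acc` itself.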

single precision literals

The use of double precision literals (e.g. 0.5 in expf(-0.5 * value)) forces the compiler to convert the operands from single to double precision, compute the result in double precision, and convert it back to single precision.
Given the cost of double precision operations on "small" GPUs like the NVIDIA T4 or Intel Flex, these conversions and temporary double precision operations should be avoided by explicitly marking the floating point literals as single precision:

expf(-0.5f * value);
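The promotion can be checked at compile time. The following is a small host-side sketch, with hypothetical function names (`weight_mixed`, `weight_single`, `check_weights_agree`), showing that the product with a `0.5` literal has type `double` while the product with `0.5f` stays `float`:

```cpp
#include <cassert>
#include <math.h>
#include <type_traits>

// Mixed precision: the double literal promotes `value` to double, so the
// product is computed in double and then narrowed back to float for expf().
inline float weight_mixed(float value) { return expf(static_cast<float>(-0.5 * value)); }

// Single precision throughout: no conversion and no double precision multiply.
inline float weight_single(float value) { return expf(-0.5f * value); }

// The usual arithmetic conversions decide the type of each product.
static_assert(std::is_same<decltype(-0.5 * 1.0f), double>::value,
              "double literal * float is evaluated as double");
static_assert(std::is_same<decltype(-0.5f * 1.0f), float>::value,
              "float literal * float stays float");

// Both forms should give the same result to within float rounding.
inline void check_weights_agree(float value) {
  assert(fabsf(weight_mixed(value) - weight_single(value)) < 1e-6f);
}
```

For a factor like 0.5 the two versions are expected to return the same value, so the change should affect only the cost of the computation, not the result.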

avoid square roots where possible

Given

float cut = ...;
float value2 = ...;
float value = sqrtf(value2);
if (value > cut) { ...}

we should avoid the square root and use the square of cut:

float cut = ...;
float value2 = ...;
float cut2 = cut * cut;
if (value2 > cut2) { ...}
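The two comparisons are equivalent only when both `cut` and `value2` are non-negative, which is assumed here; results may also differ by rounding for values that sit exactly at the threshold. A minimal host-side sketch with hypothetical function names:

```cpp
#include <cassert>
#include <math.h>

// Original form: take the square root, then compare with the cut.
inline bool above_cut_sqrt(float value2, float cut) { return sqrtf(value2) > cut; }

// Square-root-free form: compare the squared value with the squared cut.
// Valid only for cut >= 0, since squaring a negative cut changes the ordering.
inline bool above_cut_squared(float value2, float cut) {
  assert(cut >= 0.f);
  float cut2 = cut * cut;
  return value2 > cut2;
}
```

When `cut` is constant across many elements, `cut2` can be computed once outside the loop, so the per-element cost reduces to a single comparison.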


cmsbuild · 2023-12-05T14:58:40Z

A new Issue was created by @fwyzard Andrea Bocci.

@Dr15Jones, @makortel, @sextonkennedy, @rappoccio, @antoniovilela, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

fwyzard · 2023-12-05T14:58:50Z

type pf

fwyzard · 2023-12-05T14:58:57Z

assign heterogeneous

cmsbuild · 2023-12-05T14:59:00Z

New categories assigned: heterogeneous

@fwyzard,@makortel you have been requested to review this Pull request/Issue and eventually sign? Thanks

jsamudio · 2023-12-14T21:39:14Z

I opened #43574 to address reading the thresholds from the GT. Are there any changes in particular you would want bundled in that PR?

fwyzard · 2023-12-15T12:54:45Z

I think you have already addressed the second and third points (using single precision literals and avoiding square roots); if not, you could do that.

For the first point we still need to provide the helper function.

jsamudio · 2023-12-15T19:54:59Z

I believe avoiding the square roots was implemented in the original PR, and I just updated expf functions in the latest commit for #43574. So probably at this point everything but the first point is addressed.

cmsbuild added the pending-assignment label Dec 5, 2023

fwyzard mentioned this issue Dec 5, 2023

Add Alpaka Implementation of PFClusterProducer #43130

Merged

cmsbuild added pending-signatures heterogeneous-pending pf and removed pending-assignment labels Dec 5, 2023
