Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Follow up to the Alpaka integration in CMSSW #43853

Merged
merged 25 commits into from
Feb 12, 2024
Merged
Show file tree
Hide file tree
Changes from 11 commits
Commits
Show all changes
25 commits
Select commit Hold shift + click to select a range
28bcb55
Skip invalid or corrupted ROCs
fwyzard Feb 2, 2024
e142ccc
Add check on the ROC range
fwyzard Feb 2, 2024
a2c7a27
Clean up includes and dependencies in pixel CondFormats
fwyzard Feb 3, 2024
2b5e6f6
Add overflow checks to CountModules::operator()
fwyzard Feb 3, 2024
937f0da
Rewrite the pixel clustering code
fwyzard Feb 4, 2024
571c91f
Rewrite the pixel charge cut code
fwyzard Feb 5, 2024
357cfba
Update comments to pixel topologies
fwyzard Feb 3, 2024
8f6ecae
Fixed decoding of the pixel timeout error
AdrianoDee Feb 5, 2024
c073269
Fix check for invalid pixel digis
AdrianoDee Feb 5, 2024
1f91765
Fix the constants used in the pixel clustering
fwyzard Feb 5, 2024
c116e10
Minor clean up of legacy CUDA code
fwyzard Feb 5, 2024
f3592ed
Renumber the ECAL-only alpaka workflow to .412
fwyzard Feb 6, 2024
e16a1a8
Add pixel-only alpaka workflows to the README
fwyzard Feb 6, 2024
0851cc3
Add alpaka workflows to the GPU relvals
fwyzard Feb 7, 2024
9369dbc
Add a protection for quasi-empty events
fwyzard Feb 7, 2024
63902cb
Fix include guards, clean up namespaces and includes
fwyzard Feb 9, 2024
ff7e930
Adjust the size of the collection created by CopyToHost::copyAsync
fwyzard Feb 9, 2024
bd9fe2d
Synchronise the treatment of pixel errors 26, 27, 30 with legacy code
fwyzard Feb 10, 2024
b3de2a3
Rewrite the uniform element kernel loops
fwyzard Feb 9, 2024
5a87cef
Rewrite the independent element kernel loops
fwyzard Feb 9, 2024
70371a8
Rewrite zeroAndInit kernel using alpakatools utilities
fwyzard Feb 11, 2024
2d01108
Rewrite pixel clustering and rechits using alpakatools utilities
fwyzard Feb 9, 2024
1bd5d4a
Rewrite pixel seeding using alpakatools utilities
fwyzard Feb 11, 2024
008ca51
Rename elements_with_stride to uniform_elements in user code
fwyzard Feb 10, 2024
c7d3641
Remove obsolete alpakatools utilities
fwyzard Feb 11, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 12 additions & 8 deletions CondFormats/SiPixelObjects/BuildFile.xml
Original file line number Diff line number Diff line change
@@ -1,20 +1,24 @@
<use name="alpaka"/>
<use name="DataFormats/DetId"/>
<use name="DataFormats/SiPixelDetId"/>
<use name="CalibFormats/SiPixelObjects"/>
<use name="FWCore/Utilities"/>
<use name="DataFormats/FEDRawData"/>
<use name="CondFormats/Serialization"/>
<use name="CondCore/DBOutputService"/>
<use name="CondFormats/DataRecord"/>
<use name="CondFormats/SiStripObjects"/>
<use name="Geometry/TrackerGeometryBuilder"/>
<use name="CondFormats/External"/>
<use name="CondFormats/Serialization"/>
<use name="CondFormats/SiStripObjects"/>
<use name="DataFormats/DetId"/>
<use name="DataFormats/FEDRawData"/>
<use name="DataFormats/GeometryVector"/>
<use name="DataFormats/Portable"/>
<use name="DataFormats/SiPixelDetId"/>
<use name="DataFormats/SoATemplate"/>
<use name="DataFormats/TrackerCommon"/>
<use name="FWCore/MessageLogger"/>
<use name="FWCore/Utilities"/>
<use name="Geometry/CommonDetUnit"/>
<use name="Geometry/CommonTopologies"/>
<use name="HeterogeneousCore/AlpakaInterface"/>
<use name="Geometry/TrackerGeometryBuilder"/>
<use name="HeterogeneousCore/AlpakaCore"/>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is certainly beyond this PR, but I want to note nevertheless the AlpakaCore dependence smells (because of its dependence on the FWCore/Framework). It is, however, needed (even with this PR) because of

#include "HeterogeneousCore/AlpakaCore/interface/alpaka/typelookup.h"

and
#include "HeterogeneousCore/AlpakaCore/interface/alpaka/ESDeviceProduct.h"

A simple fix would be to move HeterogeneousCore/AlpakaCore/interface/alpaka/typelookup.h and HeterogeneousCore/AlpakaCore/interface/alpaka/ESDeviceProduct.h to HeterogeneousCore/AlpakaInterface.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mhm... I think that ALPAKA_ACCELERATOR_NAMESPACE::ESDeviceProduct<T> is something too framework specific for HeterogeneousCore/AlpakaInterface ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AlpakaInterface has already

// TODO: this utility class is specific to CMSSW, but needs to be in a
// package that is suitable as DataFormat dependence

and
// TODO: this utility class is specific to CMSSW, but needs to be in a
// package that is suitable as DataFormat dependence

Maybe we need a new package then for CMSSW-specific classes that must not depend on FWCore/Framework?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see.

<use name="HeterogeneousCore/AlpakaInterface"/>
<flags ALPAKA_BACKENDS="1"/>
<export>
<lib name="1"/>
Expand Down
9 changes: 4 additions & 5 deletions CondFormats/SiPixelObjects/interface/SiPixelMappingHost.h
Original file line number Diff line number Diff line change
@@ -1,10 +1,9 @@
#ifndef CondFormats_SiPixelObjects_SiPixelMappingHost_h
#define CondFormats_SiPixelObjects_SiPixelMappingHost_h
#ifndef CondFormats_SiPixelObjects_interface_SiPixelMappingHost_h
#define CondFormats_SiPixelObjects_interface_SiPixelMappingHost_h

#include <alpaka/alpaka.hpp>
#include "DataFormats/Portable/interface/PortableHostCollection.h"
#include "CondFormats/SiPixelObjects/interface/SiPixelMappingLayout.h"
#include "DataFormats/Portable/interface/PortableHostCollection.h"

using SiPixelMappingHost = PortableHostCollection<SiPixelMappingSoA>;

#endif // CondFormats_SiPixelObjects_SiPixelMappingHost_h
#endif // CondFormats_SiPixelObjects_interface_SiPixelMappingHost_h
Original file line number Diff line number Diff line change
Expand Up @@ -2,11 +2,10 @@
#define CondFormats_SiPixelObjects_interface_alpaka_SiPixelMappingDevice_h

#include <cstdint>
#include <alpaka/alpaka.hpp>

#include "DataFormats/Portable/interface/alpaka/PortableCollection.h"
#include "CondFormats/SiPixelObjects/interface/SiPixelMappingLayout.h"
#include "HeterogeneousCore/AlpakaCore/interface/alpaka/ESProducer.h"
#include "DataFormats/Portable/interface/PortableHostCollection.h"
#include "HeterogeneousCore/AlpakaInterface/interface/config.h"

namespace ALPAKA_ACCELERATOR_NAMESPACE {

Expand Down
4 changes: 4 additions & 0 deletions CondFormats/SiPixelObjects/test/BuildFile.xml
Original file line number Diff line number Diff line change
@@ -1,6 +1,10 @@
<use name="CondCore/DBOutputService"/>
<use name="CondFormats/DataRecord"/>
<use name="CondFormats/SiPixelObjects"/>
<use name="FWCore/Framework"/>
<use name="FWCore/ParameterSet"/>
<use name="FWCore/ServiceRegistry"/>
<use name="Geometry/Records"/>
<use name="Geometry/TrackerGeometryBuilder"/>
<library name="SiPixelObjectsTest" file="*.cc">
<flags EDM_PLUGIN="1"/>
Expand Down
32 changes: 17 additions & 15 deletions Geometry/CommonTopologies/interface/SimplePixelTopology.h
Original file line number Diff line number Diff line change
Expand Up @@ -214,26 +214,26 @@ namespace phase2PixelTopology {

HOST_DEVICE_CONSTANT uint8_t layerPairs[2 * nPairs] = {

0, 1, 0, 4, 0, 16, //BPIX1 (3)
1, 2, 1, 4, 1, 16, //BPIX2 (6)
2, 3, 2, 4, 2, 16, //BPIX3 & Forward (9)
0, 1, 0, 4, 0, 16, // BPIX1 (3)
1, 2, 1, 4, 1, 16, // BPIX2 (6)
2, 3, 2, 4, 2, 16, // BPIX3 & Forward (9)

4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10, 11, //POS (16)
16, 17, 17, 18, 18, 19, 19, 20, 20, 21, 21, 22, 22, 23, //NEG (23)
4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10, 11, // POS (16)
16, 17, 17, 18, 18, 19, 19, 20, 20, 21, 21, 22, 22, 23, // NEG (23)

0, 2, 0, 5, 0, 17, 0, 6, 0, 18, // BPIX1 Jump (28)
1, 3, 1, 5, 1, 17, 1, 6, 1, 18, // BPIX2 Jump (33)

11, 12, 12, 13, 13, 14, 14, 15, //Late POS (37)
23, 24, 24, 25, 25, 26, 26, 27, //Late NEG (41)
11, 12, 12, 13, 13, 14, 14, 15, // Late POS (37)
23, 24, 24, 25, 25, 26, 26, 27, // Late NEG (41)

4, 6, 5, 7, 6, 8, 7, 9, 8, 10, 9, 11, 10, 12, //POS Jump (48)
16, 18, 17, 19, 18, 20, 19, 21, 20, 22, 21, 23, 22, 24, //NEG Jump (55)
4, 6, 5, 7, 6, 8, 7, 9, 8, 10, 9, 11, 10, 12, // POS Jump (48)
16, 18, 17, 19, 18, 20, 19, 21, 20, 22, 21, 23, 22, 24, // NEG Jump (55)
};
HOST_DEVICE_CONSTANT uint32_t layerStart[numberOfLayers + 1] = {0,
108,
324,
504, //Barrel
504, // Barrel
756,
864,
972,
Expand All @@ -245,7 +245,7 @@ namespace phase2PixelTopology {
1620,
1796,
1972,
2148, //Fp
2148, // Fp
2324,
2432,
2540,
Expand All @@ -257,7 +257,7 @@ namespace phase2PixelTopology {
3188,
3364,
3540,
3716, //Np
3716, // Np
numberOfModules};

HOST_DEVICE_CONSTANT int16_t phicuts[nPairs]{
Expand Down Expand Up @@ -332,7 +332,7 @@ namespace pixelTopology {
static constexpr uint32_t maxCellsPerHit = 256;
static constexpr uint32_t avgTracksPerHit = 10;
static constexpr uint32_t maxNumberOfTuples = 256 * 1024;
//this is well above thanks to maxNumberOfTuples
// this is well above thanks to maxNumberOfTuples
static constexpr uint32_t maxHitsForContainers = avgHitsPerTrack * maxNumberOfTuples;
static constexpr uint32_t maxNumberOfDoublets = 5 * 512 * 1024;
static constexpr uint32_t maxNumOfActiveDoublets = maxNumberOfDoublets / 8;
Expand Down Expand Up @@ -381,6 +381,7 @@ namespace pixelTopology {

static constexpr uint16_t numberOfModules = 3892;

// 1024 bins, 10 bits
static constexpr uint16_t clusterBinning = 1024;
static constexpr uint16_t clusterBits = 10;

Expand All @@ -391,7 +392,7 @@ namespace pixelTopology {
static constexpr uint16_t firstEndcapPos = 4;
static constexpr uint16_t firstEndcapNeg = 16;

static constexpr int16_t xOffset = -1e4; //not used actually, to suppress static analyzer warnings
static constexpr int16_t xOffset = -1e4; // not used actually, to suppress static analyzer warnings

static constexpr char const *nameModifier = "Phase2";

Expand Down Expand Up @@ -483,6 +484,7 @@ namespace pixelTopology {
static constexpr uint16_t lastRowInModule = numRowsInModule - 1;
static constexpr uint16_t lastColInModule = numColsInModule - 1;

// 418 bins < 512, 9 bits are enough
static constexpr uint16_t clusterBinning = numColsInModule + 2;
static constexpr uint16_t clusterBits = 9;

Expand Down Expand Up @@ -557,7 +559,7 @@ namespace pixelTopology {
static constexpr uint32_t maxPixInModule = 10000;

static constexpr uint32_t maxNumOfActiveDoublets =
maxNumberOfDoublets / 4; //TODO need to think a better way to avoid this duplication
maxNumberOfDoublets / 4; // TODO need to think a better way to avoid this duplication
static constexpr uint32_t maxCellsPerHit = 256;

static constexpr uint32_t maxNumClustersPerModules = phase1HIonPixelTopology::maxNumClustersPerModules;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -160,22 +160,26 @@ void SiPixelDigisClustersFromSoAAlpaka<TrackerTraits>::produce(edm::StreamID,
std::cout << "Dumping all digis. nDigis = " << nDigis << std::endl;
#endif
for (uint32_t i = 0; i < nDigis; i++) {
#ifdef GPU_DEBUG
PixelDigi dig2{digisView[i].pdigi()};
std::cout << i << ";" << digisView[i].rawIdArr() << ";" << digisView[i].clus() << ";" << digisView[i].pdigi() << ";"
<< digisView[i].adc() << ";" << dig2.row() << ";" << dig2.column() << std::endl;
#endif

// check for uninitialized digis
if (digisView[i].rawIdArr() == 0)
continue;
// check for noisy/dead pixels (electrons set to 0)
if (digisView[i].adc() == 0)
continue;
if (digisView[i].clus() >= -pixelClustering::invalidClusterId)
continue; // not in cluster; TODO add an assert for the size
// not in cluster; TODO add an assert for the size
if (digisView[i].clus() == pixelClustering::invalidClusterId) {
continue;
}
// unexpected invalid value
if (digisView[i].clus() < pixelClustering::invalidClusterId) {
edm::LogError("SiPixelDigisClustersFromSoAAlpaka")
<< "Skipping pixel digi with unexpected invalid cluster id " << digisView[i].clus();
continue;
}
// from clusters killed by charge cut
if (digisView[i].clus() == pixelClustering::invalidModuleId)
continue; // from clusters killed by charge cut
continue;

#ifdef EDM_ML_DEBUG
assert(digisView[i].rawIdArr() > 109999);
#endif
Expand All @@ -200,6 +204,10 @@ void SiPixelDigisClustersFromSoAAlpaka<TrackerTraits>::produce(edm::StreamID,
}
}
PixelDigi dig{digisView[i].pdigi()};
#ifdef GPU_DEBUG
std::cout << i << ";" << digisView[i].rawIdArr() << ";" << digisView[i].clus() << ";" << digisView[i].pdigi() << ";"
<< digisView[i].adc() << ";" << dig.row() << ";" << dig.column() << std::endl;
#endif

if (storeDigis_)
(*detDigis).data.emplace_back(dig);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -607,8 +607,8 @@ namespace pixelgpudetails {
digis_d->moduleId(), clusters_d->moduleStart(), digis_d->clus(), wordCounter);
cudaCheck(cudaGetLastError());

threadsPerBlock = ((TrackerTraits::maxPixInModule / 16 + 128 - 1) / 128) *
128; /// should be larger than maxPixInModule/16 aka (maxPixInModule/maxiter in the kernel)
// should be larger than maxPixInModule/16 aka (maxPixInModule/maxiter in the kernel)
threadsPerBlock = ((TrackerTraits::maxPixInModule / 16 + 128 - 1) / 128) * 128;
blocks = TrackerTraits::numberOfModules;
#ifdef GPU_DEBUG
std::cout << "CUDA findClus kernel launch with " << blocks << " blocks of " << threadsPerBlock << " threads\n";
Expand Down Expand Up @@ -752,13 +752,13 @@ namespace pixelgpudetails {
cudaCheck(cudaGetLastError());

auto nModules_Clusters_d = cms::cuda::make_device_unique<uint32_t[]>(3, stream);
// MUST be ONE block

#ifdef GPU_DEBUG
cudaCheck(cudaStreamSynchronize(stream));
std::cout << "CUDA fillHitsModuleStart kernel launch \n";
#endif

// MUST be ONE block
fillHitsModuleStart<TrackerTraits><<<1, 1024, 0, stream>>>(clusters_d->clusInModule(),
clusters_d->clusModuleStart(),
clusters_d->moduleStart(),
Expand Down
77 changes: 40 additions & 37 deletions RecoLocalTracker/SiPixelClusterizer/plugins/alpaka/CalibPixel.h
Original file line number Diff line number Diff line change
Expand Up @@ -23,9 +23,7 @@
namespace calibPixel {
using namespace cms::alpakatools;

constexpr uint16_t InvId = std::numeric_limits<uint16_t>::max() - 1;
// must be > MaxNumModules

template <bool debug = false>
struct CalibDigis {
template <typename TAcc>
ALPAKA_FN_ACC void operator()(const TAcc& acc,
Expand All @@ -41,52 +39,57 @@ namespace calibPixel {

// zero for next kernels...
if (cms::alpakatools::once_per_grid(acc)) {
clus_view[0].clusModuleStart() = clus_view[0].moduleStart() = 0;
clus_view[0].clusModuleStart() = 0;
clus_view[0].moduleStart() = 0;
}
for (auto i : cms::alpakatools::elements_with_stride(acc, phase1PixelTopology::numberOfModules)) {
clus_view[i].clusInModule() = 0;
}

cms::alpakatools::for_each_element_in_grid_strided(
acc, phase1PixelTopology::numberOfModules, [&](uint32_t i) { clus_view[i].clusInModule() = 0; });
cms::alpakatools::for_each_element_in_grid_strided(acc, numElements, [&](uint32_t i) {
for (auto i : cms::alpakatools::elements_with_stride(acc, numElements)) {
auto dvgi = view[i];
if (dvgi.moduleId() != InvId) {
bool isDeadColumn = false, isNoisyColumn = false;
int row = dvgi.xx();
int col = dvgi.yy();
auto ret = SiPixelGainUtilities::getPedAndGain(gains, dvgi.moduleId(), col, row, isDeadColumn, isNoisyColumn);
float pedestal = ret.first;
float gain = ret.second;
if (isDeadColumn | isNoisyColumn) {
dvgi.moduleId() = InvId;
dvgi.adc() = 0;
if (dvgi.moduleId() == ::pixelClustering::invalidModuleId)
continue;

bool isDeadColumn = false, isNoisyColumn = false;
int row = dvgi.xx();
int col = dvgi.yy();
auto ret = SiPixelGainUtilities::getPedAndGain(gains, dvgi.moduleId(), col, row, isDeadColumn, isNoisyColumn);
float pedestal = ret.first;
float gain = ret.second;
if (isDeadColumn | isNoisyColumn) {
if constexpr (debug)
printf("bad pixel at %d in %d\n", i, dvgi.moduleId());
} else {
float vcal = dvgi.adc() * gain - pedestal * gain;
dvgi.moduleId() = ::pixelClustering::invalidModuleId;
dvgi.adc() = 0;
} else {
float vcal = dvgi.adc() * gain - pedestal * gain;

float conversionFactor = dvgi.moduleId() < 96 ? VCaltoElectronGain_L1 : VCaltoElectronGain;
float offset = dvgi.moduleId() < 96 ? VCaltoElectronOffset_L1 : VCaltoElectronOffset;
float conversionFactor = dvgi.moduleId() < 96 ? VCaltoElectronGain_L1 : VCaltoElectronGain;
float offset = dvgi.moduleId() < 96 ? VCaltoElectronOffset_L1 : VCaltoElectronOffset;
#ifdef GPU_DEBUG
auto old_adc = dvgi.adc();
auto old_adc = dvgi.adc();
#endif
dvgi.adc() = std::max(100, int(vcal * conversionFactor + offset));
dvgi.adc() = std::max(100, int(vcal * conversionFactor + offset));
#ifdef GPU_DEBUG
if (cms::alpakatools::once_per_grid(acc)) {
printf(
"module %d pixel %d -> old_adc = %d; vcal = %.2f; conversionFactor = %.2f; offset = %.2f; new_adc = "
"%d \n",
dvgi.moduleId(),
i,
old_adc,
vcal,
conversionFactor,
offset,
dvgi.adc());
}
#endif
if (cms::alpakatools::once_per_grid(acc)) {
printf(
"module %d pixel %d -> old_adc = %d; vcal = %.2f; conversionFactor = %.2f; offset = %.2f; new_adc = "
"%d \n",
dvgi.moduleId(),
i,
old_adc,
vcal,
conversionFactor,
offset,
dvgi.adc());
}
#endif
}
});
}
}
};

struct CalibDigisPhase2 {
template <typename TAcc>
ALPAKA_FN_ACC void operator()(const TAcc& acc,
Expand Down
Loading