ECAL Phase 2 weights method amplitude reconstruction on GPU #37695
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37695/29516
A new Pull Request was created by @ChrisSandever (Christopher Rhys Sandever) for master. It involves the following packages:
@makortel, @slava77, @clacaputo, @cmsbuild, @fwyzard, @jpata can you please review it and eventually sign? Thanks. cms-bot commands are listed here
type ecal
enable gpu
@cmsbuild please test
We have some GPU workflows for Run3 in the matrix. Are you planning to include one for Phase2 as well, so that in the future this can be tested with the bot and in the IBs?
We also have ECAL Phase 2 development workflows (.61 suffix). We are currently considering enabling the gpu modifier for those. The other option would be to have dedicated WFs, as for Run3.
-1 Failed Tests: RelVals-GPU
Comparison Summary: There are some workflows for which there are errors in the baseline.
The changes seem to break the HLT running on GPU:
Can you look into it?
The error is caused by the inclusion of the amplitudeError variable in the converters. We currently have two possible solutions. One is to include the amplitudeError in the phase 1 GPU multifit module, where it should be initialized. The other is to use the boolean introduced in the converters for turning off EE uncalib rechit production to switch between phase 1 and phase 2 behavior, so that the amplitudeError is ignored in a Phase 1 WF. Both options have been tested and pass these WFs; we are currently timing them.
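A minimal sketch of the second option, written in plain Python rather than the actual C++ converter. The field names (`did`, `amplitude`, `amplitudeError`), the dictionary layout, and the choice of 0.0 as the Phase 1 default are all assumptions made for illustration, not taken from the PR code. It only shows the idea of a boolean that decides whether the converter reads the amplitudeError column at all.

```python
def convert_rechit(soa, index, is_phase2):
    """Build one CPU-style rechit record from a struct-of-arrays input.

    `soa` is a hypothetical dict mapping field name -> list of values.
    In Phase 1 the amplitudeError column may be missing or uninitialised,
    so it is never read unless `is_phase2` is True.
    """
    hit = {
        "id": soa["did"][index],
        "amplitude": soa["amplitude"][index],
    }
    if is_phase2:
        # Phase 2: the GPU producer is assumed to have filled this column.
        hit["amplitudeError"] = soa["amplitudeError"][index]
    else:
        # Phase 1: ignore the column and use an assumed default.
        hit["amplitudeError"] = 0.0
    return hit


# Usage: a Phase 1 input without the amplitudeError column still converts.
phase1_soa = {"did": [838861313], "amplitude": [12.5]}
phase2_soa = {"did": [838861313], "amplitude": [12.5], "amplitudeError": [0.75]}
phase1_hit = convert_rechit(phase1_soa, 0, is_phase2=False)
phase2_hit = convert_rechit(phase2_soa, 0, is_phase2=True)
```

The alternative option from the comment above (initializing the variable in the phase 1 multifit module) would make the branch unnecessary, at the cost of touching the Phase 1 producer.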
@fwyzard, do these ECAL changes also break the PixelOnly and HcalOnly WFs because all GPU reconstructions run at the HLT, and the Hcal/PixelOnly in the WF name only refers to the reco step?
e4e0aea to 887c410
@cmsbuild please test
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-19274f/26038/summary.html
+heterogeneous
@ChrisSandever did you open an issue to keep track of the pending requests?
Yes, it can be found here: #38619 (comment)
@cmsbuild please test
Refreshing the tests before signing (I was off).
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-19274f/26560/summary.html
isPhase2 = cms.bool(True),
recHitsLabelGPUEB = cms.InputTag('ecalUncalibRecHitSoA', 'EcalUncalibRecHitsEB'),
I'll just note here that, instead of
isPhase2 = cms.bool(True),
recHitsLabelGPUEB = cms.InputTag('ecalUncalibRecHitSoA', 'EcalUncalibRecHitsEB'),
the form
isPhase2 = True,
recHitsLabelGPUEB = ('ecalUncalibRecHitSoA', 'EcalUncalibRecHitsEB'),
would be preferred, but this can be done as a quick follow-up too.
+reconstruction
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)
+1
PR description:
This PR adds two modules to RecoLocalCalo/EcalRecProducers for the ECAL reconstruction on GPU for the Phase 2 upgrade. It also modifies two existing modules, with no effect on Phase 1 operation.
Both new modules are producers. The first is a data format converter that converts Phase 2 digis to SoA digis and copies them to the GPU device. The second produces UncalibratedRecHits by reconstructing the amplitudes with the weights method. It is Phase 2 compatible and runs the reconstruction on a GPU.
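The two steps described above can be sketched in plain Python, under several assumptions: the `Digi` and `DigisSoA` types, the function names, and the example weights are hypothetical, and the samples are treated as already pedestal-subtracted floats (gain handling and the device copy are left out). The sketch first flattens per-channel digis into a struct-of-arrays layout, then computes each amplitude as a weighted sum of that channel's samples, which is the core of the weights method.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Digi:
    """Hypothetical array-of-structs digi: one channel id plus its samples."""
    det_id: int
    samples: List[float]


@dataclass
class DigisSoA:
    """Struct-of-arrays layout: one flat array per field."""
    ids: List[int]
    data: List[float]  # samples of all channels, stored back to back
    n_samples: int     # number of samples per channel


def to_soa(digis: List[Digi], n_samples: int) -> DigisSoA:
    """Flatten a list of digis into the SoA layout."""
    for d in digis:
        if len(d.samples) != n_samples:
            raise ValueError(f"channel {d.det_id}: expected {n_samples} samples")
    ids = [d.det_id for d in digis]
    data = [s for d in digis for s in d.samples]
    return DigisSoA(ids, data, n_samples)


def reconstruct_amplitudes(soa: DigisSoA, weights: List[float]) -> List[float]:
    """Weights method: amplitude = sum_i weights[i] * sample[i], per channel."""
    if len(weights) != soa.n_samples:
        raise ValueError("need exactly one weight per sample")
    amplitudes = []
    for ch in range(len(soa.ids)):
        offset = ch * soa.n_samples  # start of this channel in the flat array
        amplitudes.append(
            sum(w * soa.data[offset + i] for i, w in enumerate(weights))
        )
    return amplitudes


# Usage with made-up weights that simply pick out the third sample.
example_weights = [0.0, 0.0, 1.0, 0.0]
example_digis = [Digi(1, [0.0, 1.0, 5.0, 2.0]), Digi(2, [0.0, 0.0, 3.0, 0.0])]
example_soa = to_soa(example_digis, n_samples=4)
example_amplitudes = reconstruct_amplitudes(example_soa, example_weights)
```

On a GPU the loop over channels would typically map to one thread per channel, which is why the flat per-field layout is convenient. The sketch keeps the loop sequential for readability.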
An existing module already produces UncalibratedRecHits with the same method for Phase 2, running the reconstruction on CPU. The UncalibratedRecHits from both modules are virtually identical. Furthermore, the CUDADataFormats/EcalRecHitSoA/interface/EcalUncalibratedRecHit.h data format has been modified to include the AmplitudeError variable, for compatibility with the CPU data format.
A switch producer has been added to the Phase 2 configuration so that either the CPU or the GPU algorithm can be run.
PR validation:
The PR passed scram b -tests and runTheMatrix.py -l limited -i all --ibeos.
The configuration was tested with ECAL Phase 2 WF 28234.61.
The GPU branch of the switch producer was tested by adding the gpu modifier to the reconstruction step of WF 28234.61.
The UncalibratedRecHits are equivalent for all variables between the CPU and GPU reconstructions.
The timings of the CPU and GPU algorithms have been compared and are similar.
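The CPU versus GPU equivalence check mentioned in the validation can be sketched as a field-by-field comparison. This is an illustration only, not the procedure actually used for the PR: the dictionary layout, the field list, and the tolerances are assumptions.

```python
import math

# Hypothetical subset of UncalibratedRecHit fields to compare.
FIELDS = ("amplitude", "amplitudeError")


def compare_rechits(cpu, gpu, rel_tol=1e-6, abs_tol=1e-9):
    """Compare two collections given as {det_id: {field: value}}.

    Returns a list of (det_id, field, cpu_value, gpu_value) tuples, one per
    disagreement. A channel present in only one collection is reported with
    the field name "missing".
    """
    mismatches = []
    for det_id in sorted(set(cpu) | set(gpu)):
        if det_id not in cpu or det_id not in gpu:
            mismatches.append((det_id, "missing", cpu.get(det_id), gpu.get(det_id)))
            continue
        for field in FIELDS:
            a = cpu[det_id][field]
            b = gpu[det_id][field]
            if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol):
                mismatches.append((det_id, field, a, b))
    return mismatches


# Usage with made-up values: the second channel differs in amplitude.
cpu_hits = {
    1: {"amplitude": 10.0, "amplitudeError": 0.5},
    2: {"amplitude": 20.0, "amplitudeError": 0.5},
}
gpu_hits = {
    1: {"amplitude": 10.0, "amplitudeError": 0.5},
    2: {"amplitude": 21.0, "amplitudeError": 0.5},
}
differences = compare_rechits(cpu_hits, gpu_hits)
```

A tolerance is used instead of exact equality because floating-point results on CPU and GPU can differ in the last bits even when the algorithm is the same.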