-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Delete unscheduled modules that are not consumed by any scheduled module #29553
Conversation
The code-checks are being triggered in jenkins. |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-29553/14842
|
A new Pull Request was created by @makortel (Matti Kortelainen) for master. It involves the following packages: FWCore/Framework @makortel, @smuzaffar, @cmsbuild, @Dr15Jones, @fwyzard can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
@cmsbuild, please test |
@Dr15Jones @wddgit Please review |
The tests are being triggered in jenkins. |
for (auto const& description : unusedModules) { | ||
l << "\n " << description->moduleLabel(); | ||
} | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking to keep this printout (and the corresponding one in the SubProcess.cc
) enabled for a couple of IBs (to ease debugging in case problems occur), and then demote them to LogInfo
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds reasonable. In the future probably would want an easy way to make this list easily visible. Maybe via the Tracer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tracer already prints out the deleted modules (between "ModuleConstruction" and "BeginJob"). Also with LogInfo
the "DeleteModules" messages can be enabled purely in the configuration.
These checks need to be extended for |
-1 Tested at: 69596df CMSSW: CMSSW_11_1_X_2020-04-23-1100 I found follow errors while testing this PR Failed tests: UnitTests RelVals AddOn
I found errors in the following unit tests: ---> test runtestPhysicsToolsPatAlgos had ERRORS
The relvals timed out after 4 hours. runTheMatrix-results/4.22_RunCosmics2011A+RunCosmics2011A+RECOCOSD+ALCACOSD+SKIMCOSD+HARVESTDC/step2_RunCosmics2011A+RunCosmics2011A+RECOCOSD+ALCACOSD+SKIMCOSD+HARVESTDC.log8.0 step3 runTheMatrix-results/8.0_BeamHalo+BeamHalo+DIGICOS+RECOCOS+ALCABH+HARVESTCOS/step3_BeamHalo+BeamHalo+DIGICOS+RECOCOS+ALCABH+HARVESTCOS.log140.56 step2 runTheMatrix-results/140.56_RunHI2018+RunHI2018+RECOHID18+HARVESTDHI18/step2_RunHI2018+RunHI2018+RECOHID18+HARVESTDHI18.log136.731 step3 runTheMatrix-results/136.731_RunSinglePh2016B+RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_skimSinglePh_HIPM+HARVESTDR2/step3_RunSinglePh2016B+RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_skimSinglePh_HIPM+HARVESTDR2.log136.793 step3 runTheMatrix-results/136.793_RunDoubleEG2017C+RunDoubleEG2017C+HLTDR2_2017+RECODR2_2017reHLT_skimDoubleEG_Prompt+HARVEST2017/step3_RunDoubleEG2017C+RunDoubleEG2017C+HLTDR2_2017+RECODR2_2017reHLT_skimDoubleEG_Prompt+HARVEST2017.log136.874 step3 runTheMatrix-results/136.874_RunEGamma2018C+RunEGamma2018C+HLTDR2_2018+RECODR2_2018reHLT_skimEGamma_Offline_L1TEgDQM+HARVEST2018_L1TEgDQM/step3_RunEGamma2018C+RunEGamma2018C+HLTDR2_2018+RECODR2_2018reHLT_skimEGamma_Offline_L1TEgDQM+HARVEST2018_L1TEgDQM.log1325.7 step2 runTheMatrix-results/1325.7_TTbar_13_94Xv2NanoAODINPUT+TTbar_13_94Xv2NanoAODINPUT+NANOEDMMC2017_94XMiniAODv2+HARVESTNANOAODMC2017_94XMiniAODv2/step2_TTbar_13_94Xv2NanoAODINPUT+TTbar_13_94Xv2NanoAODINPUT+NANOEDMMC2017_94XMiniAODv2+HARVESTNANOAODMC2017_94XMiniAODv2.log1000.0 step2 runTheMatrix-results/1000.0_RunMinBias2011A+RunMinBias2011A+TIER0+SKIMD+HARVESTDfst2+ALCASPLIT/step2_RunMinBias2011A+RunMinBias2011A+TIER0+SKIMD+HARVESTDfst2+ALCASPLIT.log1001.0 step2 runTheMatrix-results/1001.0_RunMinBias2011A+RunMinBias2011A+TIER0EXP+ALCAEXP+ALCAHARVDSIPIXELCALRUN1+ALCAHARVD1+ALCAHARVD2+ALCAHARVD3+ALCAHARVD4+ALCAHARVD5/step2_RunMinBias2011A+RunMinBias2011A+TIER0EXP+ALCAEXP+ALCAHARVDSIPIXELCALRUN1+ALCAHARVD1+ALCAHARVD2+ALCAHARVD3+ALCAHARVD4+ALCAHARVD5.log1306.0 step3 runTheMatrix-results/1306.0_SingleMuPt1_UP15+SingleMuPt1_UP15+DIGIUP15+RECOUP15+HARVESTUP15/step3_SingleMuPt1_UP15+SingleMuPt1_UP15+DIGIUP15+RECOUP15+HARVESTUP15.log9.0 step4 runTheMatrix-results/9.0_Higgs200ChargedTaus+Higgs200ChargedTaus+DIGI+RECO+HARVEST/step4_Higgs200ChargedTaus+Higgs200ChargedTaus+DIGI+RECO+HARVEST.log1330.0 step3 runTheMatrix-results/1330.0_ZMM_13+ZMM_13+DIGIUP15+RECOUP15_L1TMuDQM+HARVESTUP15_L1TMuDQM+NANOUP15/step3_ZMM_13+ZMM_13+DIGIUP15+RECOUP15_L1TMuDQM+HARVESTUP15_L1TMuDQM+NANOUP15.log25.0 step4 runTheMatrix-results/25.0_TTbar+TTbar+DIGI+RECOAlCaCalo+HARVEST+ALCATT/step4_TTbar+TTbar+DIGI+RECOAlCaCalo+HARVEST+ALCATT.log158.0 step3 runTheMatrix-results/158.0_HydjetQ_B12_5020GeV_2018_ppReco+HydjetQ_B12_5020GeV_2018_ppReco+DIGIHI2018PPRECO+RECOHI2018PPRECO+ALCARECOHI2018PPRECO+HARVESTHI2018PPRECO/step3_HydjetQ_B12_5020GeV_2018_ppReco+HydjetQ_B12_5020GeV_2018_ppReco+DIGIHI2018PPRECO+RECOHI2018PPRECO+ALCARECOHI2018PPRECO+HARVESTHI2018PPRECO.log4.53 step4 runTheMatrix-results/4.53_RunPhoton2012B+RunPhoton2012B+HLTD+RECODR1reHLT+HARVESTDR1reHLT/step4_RunPhoton2012B+RunPhoton2012B+HLTD+RECODR1reHLT+HARVESTDR1reHLT.log10024.0 step3 runTheMatrix-results/10024.0_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2017_GenSimFull+DigiFull_2017+RecoFull_2017+HARVESTFull_2017+ALCAFull_2017+NanoFull_2017/step3_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2017_GenSimFull+DigiFull_2017+RecoFull_2017+HARVESTFull_2017+ALCAFull_2017+NanoFull_2017.log25202.0 step3 runTheMatrix-results/25202.0_TTbar_13+TTbar_13+DIGIUP15_PU25+RECOUP15_PU25+HARVESTUP15_PU25+NANOUP15_PU25/step3_TTbar_13+TTbar_13+DIGIUP15_PU25+RECOUP15_PU25+HARVESTUP15_PU25+NANOUP15_PU25.log10824.0 step3 runTheMatrix-results/10824.0_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2018_GenSimFull+DigiFull_2018+RecoFull_2018+HARVESTFull_2018+ALCAFull_2018+NanoFull_2018/step3_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2018_GenSimFull+DigiFull_2018+RecoFull_2018+HARVESTFull_2018+ALCAFull_2018+NanoFull_2018.log11634.0 step3 runTheMatrix-results/11634.0_TTb/bin/sh:/step3_TTbar_14TeV+TTbar_14TeV_TuneCP5_2023_GenSimFull+DigiFull_2023+RecoFull_2023+HARVESTFull_2023+ALCAFull_2023.log12434.0 step3 runTheMatrix-results/12434.0_TTbar_14TeV+TTbar_14TeV_TuneCP5_2023_GenSimFull+DigiFull_2023+RecoFull_2023+HARVESTFull_2023+ALCAFull_2023/step3_TTbar_14TeV+TTbar_14TeV_TuneCP5_2023_GenSimFull+DigiFull_2023+RecoFull_2023+HARVESTFull_2023+ALCAFull_2023.log10224.0 step3 runTheMatrix-results/10224.0_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2017PU_GenSimFull+DigiFullPU_2017PU+RecoFullPU_2017PU+HARVESTFullPU_2017PU+NanoFull_2017PU/step3_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2017PU_GenSimFull+DigiFullPU_2017PU+RecoFullPU_2017PU+HARVESTFullPU_2017PU+NanoFull_2017PU.log20034.0 step4 runTheMatrix-results/20034.0_TTbar_14TeV+TTbar_14TeV_TuneCP5_2026D35_GenSimHLBeamSpotFull14+DigiFullTrigger_2026D35+RecoFullGlobal_2026D35+HARVESTFullGlobal_2026D35/step4_TTbar_14TeV+TTbar_14TeV_TuneCP5_2026D35_GenSimHLBeamSpotFull14+DigiFullTrigger_2026D35+RecoFullGlobal_2026D35+HARVESTFullGlobal_2026D35.log21234.0 step3 runTheMatrix-results/21234.0_TTbar_14TeV+TTbar_14TeV_TuneCP5_2026D44_GenSimHLBeamSpotFull14+DigiFullTrigger_2026D44+RecoFullGlobal_2026D44+HARVESTFullGlobal_2026D44/step3_TTbar_14TeV+TTbar_14TeV_TuneCP5_2026D44_GenSimHLBeamSpotFull14+DigiFullTrigger_2026D44+RecoFullGlobal_2026D44+HARVESTFullGlobal_2026D44.log20434.0 step4 runTheMatrix-results/20434.0_TTbar_14TeV+TTbar_14TeV_TuneCP5_2026D41_GenSimHLBeamSpotFull14+DigiFullTrigger_2026D41+RecoFullGlobal_2026D41+HARVESTFullGlobal_2026D41/step4_TTbar_14TeV+TTbar_14TeV_TuneCP5_2026D41_GenSimHLBeamSpotFull14+DigiFullTrigger_2026D41+RecoFullGlobal_2026D41+HARVESTFullGlobal_2026D41.log250202.181 step4 runTheMatrix-results/250202.181_TTbar_13UP18+TTbar_13UP18+PREMIXUP18_PU25+DIGIPRMXLOCALUP18_PU25+RECOPRMXUP18_PU25+HARVESTUP18_PU25/step4_TTbar_13UP18+TTbar_13UP18+PREMIXUP18_PU25+DIGIPRMXLOCALUP18_PU25+RECOPRMXUP18_PU25+HARVESTUP18_PU25.log23234.0 N/A
I found errors in the following addon tests: AddOnTest might have timed out: FAILED - secs |
Comparison not run due to runTheMatrix errors (RelVals and Igprof tests were also skipped) |
+heterogeneous |
The tests are being triggered in jenkins.
|
+1 |
Comparison job queued. |
Comparison is ready Comparison Summary:
|
+1 |
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2) |
+1 |
PR description:
This PR makes the framework to delete unscheduled EDModules whose output products are not consumed by any scheduled EDModule. For motivation see #26438 (which this PR mostly resolves).
Here "unscheduled" means that such EDModules are only in
Task
s, i.e. not directly in anyPath
orEndPath
, and "scheduled" means an EDModule in aPath
orEndPath
. If the module data dependence graph has a subgraph of unscheduled modules whose output products are not consumed by any scheduled module, those unscheduled modules are deleted.A possible data dependence from a scheduled module from a SubProcess is also enough to keep an otherwise non-consumed EDModule alive. For example, if the Process and its SubProcess both have a module
foo
, and only the SubProcess has a consumer with anInputTag("foo")
, the modulefoo
in the Process is kept as well for the case that the SubProcess'sfoo
would not produce a product for some events (assuming the product types and instance names also match).The data dependencies are checked also for Lumi and Run products in addition of Event products, for which the
EDConsumerBase::modulesWhoseProductsAreConsumed()
andPathsAndConsumesOfModules
were extended. They were extended also to carry the dependencies from the SubProcesses (it turned out that passing pairs of module label and process name are sufficient, the product type and instance name get handled automatically).The non-consumed modules are deleted after
preallocate
signal and beforebeginJob()
signal. Two new signals,preModuleDestruction
andpostModuleDestruction
are added for those deletions (the "normal" module deletion at the end of thecmsRun
process do not emit these signals). I took a look of all modules watching eitherpreModuleConstruction
orpostModuleConstruction
signals, and added watch for the destruction signals if it seemed to make sense.Note that even if the modules are deleted, the corresponding Workers are kept alive. The Workers are removed from the containers of unscheduled and accumulator modules, so they won't get any further framework transitions after the deletion.
Resolves #26438.
PR validation:
Framework unit tests run.