Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move Sequence to Task Reconstruction_cff #28678

Merged
merged 3 commits into from
Jan 13, 2020
Merged

Conversation

jeongeun
Copy link
Contributor

PR description: Move Sequence to Task for Reconstruction(HeavyIons/Cosmics)_cff cff files.

(An extension of the previous task PR#25163, PR#27474 , PR#27640, PR#27887, PR#28050,PR#28300,PR#28399)

Tested in CMSSW_11_1_X_2019_12_20-1100, the basic test all passed in the CMSSW PR instructions.

@cmsbuild
Copy link
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-28678/13248

  • This PR adds an extra 24KB to repository

  • There are other open Pull requests which might conflict with changes you have proposed:

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @jeongeun (JeongEun Lee) for master.

It involves the following packages:

Configuration/StandardSequences
RecoCTPPS/Configuration
RecoEgamma/EgammaPhotonProducers
RecoLocalFastTime/Configuration
RecoMTD/Configuration
RecoParticleFlow/PFTracking
RecoVertex/Configuration

@perrotta, @kpedro88, @cmsbuild, @franzoni, @slava77, @fabiocos, @davidlange6 can you please review it and eventually sign? Thanks.
@varuns23, @mmarionncern, @Sam-Harper, @makortel, @felicepantaleo, @jainshilpi, @jan-kaspar, @GiacomoSguazzoni, @rovere, @lgray, @Martin-Grunewald, @sobhatta, @seemasharmafnal, @hatakeyamak, @afiqaize, @dgulhan, @ebrondol, @VinInn, @forthommel, @bachtis, @cbernet this is something you requested to watch as well.
@davidlange6, @slava77, @fabiocos, @silviodonato you are the release manager for this.

cms-bot commands are listed here

@perrotta
Copy link
Contributor

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Dec 23, 2019

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/4093/console Started: 2019/12/23 07:57

@cmsbuild
Copy link
Contributor

-1

Tested at: 6054885

CMSSW: CMSSW_11_1_X_2019-12-22-2300
SCRAM_ARCH: slc7_amd64_gcc820
You can see the results of the tests here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-57203a/4093/summary.html

I found follow errors while testing this PR

Failed tests: RelVals AddOn

  • RelVals:

When I ran the RelVals I found an error in the following workflows:
5.1 step1

runTheMatrix-results/5.1_TTbar+TTbarFS+HARVESTFS/step1_TTbar+TTbarFS+HARVESTFS.log

135.4 step1
runTheMatrix-results/135.4_ZEE_13+ZEEFS_13+HARVESTUP15FS+MINIAODMCUP15FS/step1_ZEE_13+ZEEFS_13+HARVESTUP15FS+MINIAODMCUP15FS.log

  • AddOn:

I found errors in the following addon tests:

cmsDriver.py TTbar_8TeV_TuneCUETP8M1_cfi --conditions auto:run1_mc --fast -n 100 --eventcontent AODSIM,DQM --relval 100000,1000 -s GEN,SIM,RECOBEFMIX,DIGI:pdigi_valid,L1,DIGI2RAW,L1Reco,RECO,EI,VALIDATION --customise=HLTrigger/Configuration/CustomConfigs.L1THLT --datatier GEN-SIM-DIGI-RECO,DQMIO --beamspot Realistic8TeVCollision : FAILED - time: date Mon Dec 23 09:22:52 2019-date Mon Dec 23 09:19:12 2019 s - exit: 17920
cmsDriver.py TTbar_13TeV_TuneCUETP8M1_cfi --conditions auto:run2_mc_l1stage1 --fast -n 100 --eventcontent AODSIM,DQM --relval 100000,1000 -s GEN,SIM,RECOBEFMIX,DIGI:pdigi_valid,L1,DIGI2RAW,L1Reco,RECO,EI,VALIDATION --customise=HLTrigger/Configuration/CustomConfigs.L1THLT --datatier GEN-SIM-DIGI-RECO,DQMIO --beamspot NominalCollision2015 --era Run2_25ns : FAILED - time: date Mon Dec 23 09:23:35 2019-date Mon Dec 23 09:19:27 2019 s - exit: 17920
cmsDriver.py TTbar_13TeV_TuneCUETP8M1_cfi --conditions auto:run2_mc --fast -n 100 --eventcontent AODSIM,DQM --relval 100000,1000 -s GEN,SIM,RECOBEFMIX,DIGI:pdigi_valid,L1,DIGI2RAW,L1Reco,RECO,EI,VALIDATION --customise=HLTrigger/Configuration/CustomConfigs.L1THLT --datatier GEN-SIM-DIGI-RECO,DQMIO --beamspot NominalCollision2015 --era Run2_2016 : FAILED - time: date Mon Dec 23 09:23:31 2019-date Mon Dec 23 09:19:33 2019 s - exit: 17920

@cmsbuild
Copy link
Contributor

Comparison not run due to runTheMatrix errors (RelVals and Igprof tests were also skipped)

globalreco = cms.Sequence(globalrecoTask)

_run3_globalrecoTask = globalrecoTask.copyAndExclude([CastorFullRecoTask])
run3_common.toReplaceWith(globalrecoTask, _run3_globalrecoTask)

_fastSim_globalreco = globalreco.copyAndExclude([CastorFullReco,muoncosmicreco])
# insert the few tracking modules to be run after mixing back in the globalreco sequence
_fastSim_globalreco.insert(0,newCombinedSeeds+trackExtrapolator+caloTowerForTrk+firstStepPrimaryVerticesUnsorted+ak4CaloJetsForTrk+initialStepTrackRefsForJets+firstStepPrimaryVertices)
Copy link
Contributor Author

@jeongeun jeongeun Dec 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Errors were produced from "firstStepPrimaryVerticesUnsorted+ak4CaloJetsForTrk" in this line.
I will check this part again.

trackingGlobalRecoTask,
hcalGlobalRecoTask,
vertexrecoTask)
globalreco_tracking = cms.Sequence(globalreco_trackingTask)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be removed, as not used enywhere

globalreco = cms.Sequence(globalrecoTask)

_run3_globalrecoTask = globalrecoTask.copyAndExclude([CastorFullRecoTask])
run3_common.toReplaceWith(globalrecoTask, _run3_globalrecoTask)

_fastSim_globalreco = globalreco.copyAndExclude([CastorFullReco,muoncosmicreco])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modifications should be done at the level of Task, as in the other cases

@perrotta
Copy link
Contributor

perrotta commented Jan 9, 2020

+1

  • This concludes the migration of the reconstruction Sequences into Tasks
  • Jenkins tests pass and show no differences, as they should

@silviodonato
Copy link
Contributor

+operations

@perrotta
Copy link
Contributor

@kpedro88 could you please have a look, and either sign for upgrade or propose updates?

@kpedro88
Copy link
Contributor

+upgrade

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @silviodonato, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)

@silviodonato
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit 85b903f into cms-sw:master Jan 13, 2020
@silviodonato
Copy link
Contributor

silviodonato commented Jan 14, 2020

@jeongeun this PR is breaking a RelVal test (workflow 11634.7) of CMSSW_11_1_X_2020-01-13-2300.
See https://cmssdt.cern.ch/SDT/html/cmssdt-ib/#/relVal/CMSSW_11_1/2020-01-13-2300?selectedArchs=slc7_amd64_gcc820&selectedFlavors=X&selectedStatus=failed

I tried the step3 cmsDriver command with and without this PR in CMSSW_11_1_X_2020-01-13-1100.
For some reason, this PR removes "initialStepTrackCandidatesMkFitInput" from process.reconstruction_step. This causes the error:

Exception Message:
Principal::getByToken: Found zero products matching all criteria
Looking for type: MkFitInputWrapper
Looking for module label: initialStepTrackCandidatesMkFitInput
Looking for productInstanceName: 

Could you check and provide a fix? Thanks.

You can easily see this issue using the following commands

cmsDriver.py step3  --conditions auto:phase1_2021_realistic -n 10 --era Run3 --eventcontent RECOSIM,MINIAODSIM,DQM --runUnscheduled  -s RAW2DIGI,L1Reco,RECO,RECOSIM,EI,PAT,VALIDATION:@standardValidation+@miniAODValidation,DQM:@standardDQM+@ExtraHLT+@miniAODDQM --datatier GEN-SIM-RECO,MINIAODSIM,DQMIO --customise RecoTracker/MkFit/customizeInitialStepToMkFit.customizeInitialStepToMkFit --geometry DB:Extended  --customise Validation/Performance/TimeMemorySummary.customiseWithTimeMemorySummary --prefix '/data/cmsbld/jenkins/workspace/ib-run-relvals/cms-bot/monitor_workflow.py timeout --signal SIGTERM 9000 '  --filein  file:step2.root  --fileout file:step3.root  --suffix "-j JobReport3.xml "  --nThreads 4 --no_exec
edmConfigDump step3_RAW2DIGI_L1Reco_RECO_RECOSIM_EI_PAT_VALIDATION_DQM.py > step3_dump.py
cat step3_dump.py | grep initialStepTrackCandidatesMkFitInput

@perrotta
Copy link
Contributor

Thank you @silviodonato, I'm also trying to investigate.

It is far from obvious to me to understand how this PR may have removed the initialStepTrackCandidatesMkFitInput, which is defined in RecoTracker/​MkFit/​python/​customizeInitialStepToMkFit.py (untouched by this PR), which modifies the InitialStepTask defined in RecoTracker/IterativeTracking/python/InitialStep_cff.py (also untouched by this PR)

@makortel , do you have any suggestion?

@makortel
Copy link
Contributor

By quick look I have no idea, but I can take care of fixing it.

@perrotta
Copy link
Contributor

By quick look I have no idea, but I can take care of fixing it.

Thank you @makortel !

@perrotta
Copy link
Contributor

edmConfigDump step3_RAW2DIGI_L1Reco_RECO_RECOSIM_EI_PAT_VALIDATION_DQM.py | fgrep InitialStepTask
process.InitialStepTask = cms.Task(process.caloJetsForTrkTask, process.firstStepPrimaryVertices, process.firstStepPrimaryVerticesUnsorted, process.initialStep, process.initialStepClassifier1, process.initialStepHitDoublets, process.initialStepHitQuadruplets, process.initialStepSeedLayers, process.initialStepSeeds, process.initialStepTrackCandidates, process.initialStepTrackCandidatesMkFit, process.initialStepTrackCandidatesMkFitInput, process.initialStepTrackRefsForJets, process.initialStepTrackingRegions, process.initialStepTracks)

and initialStepTrackCandidatesMkFitInput is correctly there.

Puzzled...

@makortel
Copy link
Contributor

Yeah, but process.reconstructionTask does not contain globalreco_trackingTask (that contains InitialStepTask via other Tasks) but a copy that contains an expanded copy of InitialStepTask. So some operation leads to (partial) expansion of reconstructionTask before the customizeInitialStepToMkFit() gets called. I'm trying to find what exactly triggers the expansion (so far I've only gotten more confused).

@makortel
Copy link
Contributor

The problem is essentially the same as in #28703. This line

process.initialStepTrackCandidates = mkFitOutputConverter_cfi.mkFitOutputConverter.clone(

starts a replacement operation in all tasks and sequences that may expand Tasks/Sequences, and the order which the replacement occur is not fully deterministic. Before this PR it happened that the reconstruction sequence had a link to InitialStepTask, and now it doesn't. (ok, the cause for this exact case is probably because the tasks are operated before sequences)

I have two options in mind:

  1. in the customize function, instead of adding the two other modules into InitialStepTask, loop over all tasks, and add them into those that have process.initialStepTrackCandidates
  2. replace the customize function with a Modifier (e.g. trackingMkFit in Configuration.ProcessModifiers)

@perrotta @slava77 @mtosi @JanFSchulte Do you have preference between the two?

@perrotta
Copy link
Contributor

Thank you Matti for your investigation and proposals.
I'd let those more expert than me about MkFit to add their more qualified thoughts. Personally, I have the impression that the option with the Modifier is cleaner (and in perspective even safer, because of that), and it can more easily scale in case further modifications will be required for using MkFit: my vote is for (2).

@makortel
Copy link
Contributor

In absence of further comments I implemented the Modifier solution in #28755.

@silviodonato
Copy link
Contributor

Thanks @makortel

@mtosi
Copy link
Contributor

mtosi commented Jan 17, 2020 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants