add basic data run3 reco and apply to splash events as a wf 138.3; fix tools for simple lumi list #35776

Merged

slava77
Contributor

@slava77 slava77 commented Oct 21, 2021

This is mostly an idea, but it can become a full PR for the reconstruction of the Run 3 splash test data.
The run and lumi sections, 345881: [782, 790, 796, 801, 1031, 1037], are taken from https://hypernews.cern.ch/HyperNews/CMS/get/tier0-Ops/2293/1/1/1/1.html

I may need inputs from

  • @cms-sw/pdmv-l2 on how to better name the job fragments
  • @cms-sw/alca-l2 on a choice of GT

We will also need the files in an accessible place:

/store/data/Commissioning2021/MinimumBias/RAW/v1/000/345/881/00000/0eeadc99-21db-416a-8e4d-892c8d2844d6.root
/store/data/Commissioning2021/MinimumBias/RAW/v1/000/345/881/00000/2c1fb516-292c-45df-a64a-607790834cd6.root
/store/data/Commissioning2021/MinimumBias/RAW/v1/000/345/881/00000/9356f54b-11d2-4dad-8eaa-74b77436c3b8.root
/store/data/Commissioning2021/MinimumBias/RAW/v1/000/345/881/00000/be2a33cc-ef19-422f-a10b-d719098ef08c.root
/store/data/Commissioning2021/MinimumBias/RAW/v1/000/345/881/00000/dc29ee8f-aa45-423c-a16d-0c42d5a725aa.root
/store/data/Commissioning2021/MinimumBias/RAW/v1/000/345/881/00000/ff6b4c8c-066a-4be0-9611-f5928b43896b.root

I was not able to access the files with a regular xrootd read (cms-xrd-global.cern.ch). Help getting these (or a better replacement) transferred would be appreciated.
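For reference, here is a minimal sketch of how one of the logical file names listed above maps onto an XRootD URL through the redirector named in this comment. The helper name `xrootd_url` is hypothetical and not part of CMSSW. Building the URL says nothing about whether the file is actually readable from any site.

```python
# Redirector host named in the comment above.
REDIRECTOR = "cms-xrd-global.cern.ch"


def xrootd_url(lfn, redirector=REDIRECTOR):
    """Return the XRootD URL for a CMS logical file name (hypothetical helper)."""
    if not lfn.startswith("/store/"):
        raise ValueError(f"not a /store/ logical file name: {lfn!r}")
    # The LFN keeps its leading slash, so the URL has a double slash after the host.
    return f"root://{redirector}/{lfn}"


lfn = (
    "/store/data/Commissioning2021/MinimumBias/RAW/v1/000/345/881/00000/"
    "0eeadc99-21db-416a-8e4d-892c8d2844d6.root"
)
print(xrootd_url(lfn))
```

A read through this URL can only succeed if some site serving the redirector holds a disk copy of the file.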

After the files are available, the number of events will likely need to be updated so that just 1 or 2 events are processed, to avoid timeouts.

@drkovalskyi

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-35776/26131

@cmsbuild
Contributor

A new Pull Request was created by @slava77 (Slava Krutelyov) for master.

It involves the following packages:

  • Configuration/PyReleaseValidation (pdmv, upgrade)
  • FWCore/PythonUtilities (core)

@smuzaffar, @Dr15Jones, @jordan-martins, @makortel, @bbilin, @wajidalikhan, @cmsbuild, @AdrianoDee, @srimanob, @kskovpen can you please review it and eventually sign? Thanks.
@makortel, @wddgit, @Martin-Grunewald, @missirol, @kpedro88, @fabiocos, @slomeo this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@tvami
Contributor

tvami commented Oct 21, 2021

Hi @slava77
I think the reason you cannot read the files is that they are on Tier-0 disk and on tape. Luckily, all these files are in the block
/MinimumBias/Commissioning2021-v1/RAW#69627126-f5d2-4613-9bc9-623e9facaf76 so I've made a rule for it to stay at T2_CH_CERN: https://cms-rucio-webui.cern.ch/rule?rule_id=2bd0d8d691c944f2a34cdeb37dbefcd6

I think you can expect this to be copied in an hour from now.

Regarding the GT, I think what you did is correct: the autoCond key for Run 3 data should be used.

@slava77
Contributor Author

slava77 commented Oct 21, 2021

@tvami thank you for making the request. I'll wait a bit before starting the tests, then.

@tvami
Contributor

tvami commented Oct 21, 2021

Hi @slava77, the dataset has been copied; feel free to trigger the tests.

@tvami
Contributor

tvami commented Oct 21, 2021

@cmsbuild , please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4235bc/19815/summary.html
COMMIT: 44a3dc1
CMSSW: CMSSW_12_1_X_2021-10-21-1100/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/35776/19815/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 40
  • DQMHistoTests: Total histograms compared: 2751113
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2751090
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 39 files compared)
  • Checked 170 log files, 37 edm output root files, 40 DQM output files
  • TriggerResults: no differences found

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-35776/26136

@smuzaffar
Contributor

smuzaffar commented Oct 22, 2021

OK, the input file is cached, but it looks like the job is hanging for this workflow (https://cmssdt.cern.ch/jenkins/job/ib-run-pr-relvals/11580/console). It is not doing anything after printing this message:

%MSG-w EnergyInDeadEE_FE:  EcalRecHitProducer:ecalRecHit@cpu  22-Oct-2021 10:35:55 CEST Run: 345881 Event: 18724
TP energy in the dead TT = 310.166 at (EE iz +  ix 17 , iy 6)
%MSG
#--------------------------------------------------------------------------
#                         FastJet release 3.4.0
#                 M. Cacciari, G.P. Salam and G. Soyez                  
#     A software package for jet finding and analysis at colliders      
#                           http://fastjet.fr                           
#                                                                             
# Please cite EPJC72(2012)1896 [arXiv:1111.6097] if you use this package
# for scientific work and optionally PLB641(2006)57 [hep-ph/0512210].   
#                                                                       
# FastJet is provided without warranty under the GNU GPL v2 or higher.  
# It uses T. Chan's closest pair algorithm, S. Fortune's Voronoi code
# and 3rd party plugin jet algorithms. See COPYING file for details.
#--------------------------------------------------------------------------

I am afraid it might time out again. Have you managed to run this workflow locally?

@davidlange6
Contributor

davidlange6 commented Oct 22, 2021 via email

@davidlange6
Contributor

No, it isn't. @smuzaffar, this needs at least the 10-21-2300 build to work.

@smuzaffar
Contributor

please test
Ah, OK @davidlange6, 10-21-2300 is now available for tests.

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4235bc/19831/summary.html
COMMIT: 1a3ffb2
CMSSW: CMSSW_12_1_X_2021-10-21-2300/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/35776/19831/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-4235bc/138.3_RunMinimumBias2021Splash+RunMinimumBias2021Splash+RECODR3Splash+HARVESTDR3

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 5 differences found in the comparisons
  • DQMHistoTests: Total files compared: 40
  • DQMHistoTests: Total histograms compared: 2751113
  • DQMHistoTests: Total failures: 11
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2751080
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 39 files compared)
  • Checked 170 log files, 37 edm output root files, 40 DQM output files
  • TriggerResults: no differences found

@slava77
Contributor Author

slava77 commented Oct 22, 2021

Looking at the log for 138.3:

Begin processing the 1st record. Run 345881, Event 18724, LumiSection 796 on stream 0 at 22-Oct-2021 11:25:59.651 CEST
Begin processing the 2nd record. Run 345881, Event 18568, LumiSection 790 on stream 0 at 22-Oct-2021 12:29:19.070 CEST
22-Oct-2021 12:29:25 CEST  Closed file 

The second event is not a splash.

Considering that the first one takes about an hour, I don't feel like adding another proper splash event.

The "plan" (hope?) for the near/mid term is that developers will check the slow modules and this will eventually become faster. Once that happens, perhaps another 1-2 events can be added as well.

I think that I'm done with the features needed in this setup from my side.

Comments are welcome.
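As a cross-check of the "about an hour" estimate, the two "Begin processing" timestamps quoted above can be subtracted directly. This is only a sketch: the helper name `record_time` is hypothetical, and the `%b` month parsing assumes an English or C locale.

```python
from datetime import datetime

# The two log lines quoted in the comment above.
LOG_LINES = [
    "Begin processing the 1st record. Run 345881, Event 18724, LumiSection 796 "
    "on stream 0 at 22-Oct-2021 11:25:59.651 CEST",
    "Begin processing the 2nd record. Run 345881, Event 18568, LumiSection 790 "
    "on stream 0 at 22-Oct-2021 12:29:19.070 CEST",
]


def record_time(line):
    """Extract the timestamp that follows ' at ', dropping the trailing time zone."""
    stamp = line.rsplit(" at ", 1)[1].rsplit(" ", 1)[0]
    # %b parses the month abbreviation; this assumes an English/C locale.
    return datetime.strptime(stamp, "%d-%b-%Y %H:%M:%S.%f")


first, second = (record_time(line) for line in LOG_LINES)
elapsed = (second - first).total_seconds()
print(f"{elapsed:.3f} s ({elapsed / 60:.1f} min)")  # 3799.419 s (63.3 min)
```

Both timestamps carry the same time zone label, so the difference does not depend on how CEST is interpreted.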

@tvami
Contributor

tvami commented Oct 22, 2021

I agree with this proposal

@tvami
Contributor

tvami commented Oct 22, 2021

+alca

if newLumis and newLumis[-1][0] <= lumi[0] <= newLumis[-1][1] + 1:
    newLumis[-1][1] = max(newLumis[-1][1], lumi[1])
else:
    newLumis.append(lumi)
Contributor


I'm a little bit confused about why these changes are needed (probably mostly because I have not paid enough attention to what exactly this code does). I understand the point would be to support single lumi numbers in addition to pairs, but the docstring already has:

Runs and lumis:
{
    '1': [1,2,3,4,6,7,8,9,10],
    '2': [1,4,5,20]
}
where the first key is the run number and the list is a list of
individual lumi sections. This form also takes a list of these objects
which can be much faster than LumiList += LumiList

which, to me, looks like the lumi list added in relval_steps.py. Does this not work, or what am I missing?

Contributor Author


It did not work.
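To make the merge condition in the snippet under review concrete, here is a standalone sketch of compacting lumi sections into [first, last] ranges. It is not the LumiList implementation from FWCore/PythonUtilities. The function name `compact_lumis` is hypothetical, as is the choice to accept plain integers alongside pairs; only the overlap-or-adjacent test is taken from the reviewed lines.

```python
def compact_lumis(lumis):
    """Merge lumi sections into sorted [first, last] ranges.

    Accepts a mix of single lumi numbers and [first, last] pairs
    (an assumption made for this sketch).
    """
    # Normalize every entry to a mutable [first, last] pair.
    pairs = []
    for lumi in lumis:
        if isinstance(lumi, int):
            pairs.append([lumi, lumi])
        else:
            first, last = lumi
            pairs.append([first, last])

    merged = []
    for lumi in sorted(pairs):
        # Extend the previous range when this one overlaps or is adjacent to it.
        if merged and merged[-1][0] <= lumi[0] <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], lumi[1])
        else:
            merged.append(lumi)
    return merged


# Lumi list from the docstring example for run '1'.
print(compact_lumis([1, 2, 3, 4, 6, 7, 8, 9, 10]))  # [[1, 4], [6, 10]]
# Splash lumi sections of run 345881 from the PR description: none are adjacent.
print(compact_lumis([782, 790, 796, 801, 1031, 1037]))
```

Because the splash lumi sections are isolated, each one ends up as its own single-lumi range, which is the case where support for individual lumi numbers matters.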

@makortel
Contributor

+core

@srimanob
Contributor

+Upgrade

@slava77
Contributor Author

slava77 commented Oct 26, 2021

@cms-sw/pdmv-l2
please check this PR and comment or perhaps sign.
Thank you.

@kskovpen
Contributor

+1

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Contributor

+1

@cmsbuild cmsbuild merged commit 24176c0 into cms-sw:master Oct 27, 2021