Update EDM inputs to HLT-Validation tests [12_1_X only] #40020

Merged
cmsbuild merged 1 commit into cms-sw:CMSSW_12_1_X from fixHLTValTests_121X on Nov 24, 2022

@missirol missirol commented Nov 9, 2022

PR description:

This PR is specific to the CMSSW_12_1_X release cycle.

Some of the HLT-Validation tests running in IBs started to fail due to the unavailability of EDM files used as input to these tests (see IB dashboard, and #40013).

This PR updates the paths to these files. They were initially pointed at copies kept in the TSG area on EOS, but now rely on the cms-bot cache (see the discussion in #40013 and #40020, where the corresponding updates to cms-bot are also mentioned).

PR validation:

None.

If this PR is a backport, please specify the original PR and why you need to backport that PR. If this PR will be backported, please specify to which release cycle the backport is meant for:

N/A

cmsbuild commented Nov 9, 2022

A new Pull Request was created by @missirol (Marino Missiroli) for CMSSW_12_1_X.

It involves the following packages:

  • Configuration/HLT (hlt)
  • HLTrigger/Configuration (hlt)

@cmsbuild, @missirol, @Martin-Grunewald can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @silviodonato, @fabiocos this is something you requested to watch as well.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.

cms-bot commands are listed here

missirol commented Nov 9, 2022

please test

@@ -22,25 +22,25 @@ def addOnTestsHLT():
'hlt_mc_PRef' : ['cmsDriver.py TTbar_13TeV_TuneCUETP8M1_cfi -s GEN,SIM,DIGI,L1,DIGI2RAW --mc --scenario=pp -n 10 --conditions auto:run3_mc_PRef --relval 9000,50 --datatier "GEN-SIM-RAW" --eventcontent RAWSIM --customise=HLTrigger/Configuration/CustomConfigs.L1T --era Run3 --fileout file:RelVal_Raw_PRef_MC.root',
'HLTrigger/Configuration/test/OnLine_HLT_PRef.py',
'cmsDriver.py RelVal -s HLT:PRef,RAW2DIGI,L1Reco,RECO --mc --scenario=pp -n 10 --conditions auto:run3_mc_PRef --relval 9000,50 --datatier "RAW-HLT-RECO" --eventcontent FEVTDEBUGHLT --customise=HLTrigger/Configuration/CustomConfigs.L1THLT --era Run3 --processName=HLTRECO --filein file:RelVal_Raw_PRef_MC.root --fileout file:RelVal_Raw_PRef_MC_HLT_RECO.root'],
-'hlt_data_Fake' : ['cmsDriver.py RelVal -s L1REPACK:GT1 --data --scenario=pp -n 10 --conditions auto:run1_hlt_Fake --relval 9000,50 --datatier "RAW" --eventcontent RAW --customise=HLTrigger/Configuration/CustomConfigs.L1T --fileout file:RelVal_Raw_Fake_DATA.root --filein /store/data/Run2012A/MuEG/RAW/v1/000/191/718/14932935-E289-E111-830C-5404A6388697.root',
+'hlt_data_Fake' : ['cmsDriver.py RelVal -s L1REPACK:GT1 --data --scenario=pp -n 10 --conditions auto:run1_hlt_Fake --relval 9000,50 --datatier "RAW" --eventcontent RAW --customise=HLTrigger/Configuration/CustomConfigs.L1T --fileout file:RelVal_Raw_Fake_DATA.root --filein root://eoscms.cern.ch//eos/cms/store/group/dpg_trigger/comm_trigger/TriggerStudiesGroup/STORM/RAW/Run2012A_MuEG_run191718/14932935-E289-E111-830C-5404A6388697.root',

Contributor

As far as I can tell, all the files in the removed lines (in both files of this PR) are already cached by the IB system. @smuzaffar Do we have preference between the IB cache and explicit EOS group area?

@missirol missirol Nov 9, 2022

As far as I can tell, all the files in the removed lines (in both files of this PR) are already cached by the IB system.

(Sorry to interject. My understanding is..)

That's true, and I was unsure about changing the paths in the addOnTests. I did it to be consistent with HLTrigger/Configuration/test/cmsDriver.csh, which is what we do need to update as it runs in IBs without any redirection to the cmsbuild/ area (the bot cache is indeed the main source we used to copy those files into the TSG EOS area).

@missirol missirol Nov 9, 2022

As mentioned in #40013 (comment) [**], an even better solution would be to figure out how to extend the caching mechanism to the files used by the HLT-Validation tests, i.e. those in cmsDriver.csh (it might be simple, but I don't know how), without relying on having them in the TSG area.

[**] It might be useful to know from experts if there are ways to cache these files similarly to what is done for the EDM files used in RelVal wfs and addOnTests.

Contributor

As mentioned in #40013 (comment) [**], an even better solution would be to figure out how to extend the caching mechanism to the files used by the HLT-Validation tests, i.e. those in cmsDriver.csh (it might be simple, but I don't know how), without relying on having them in the TSG area.

[**] It might be useful to know from experts if there are ways to cache these files similarly to what is done for the EDM files used in RelVal wfs and addOnTests.

Sorry for being slow... @smuzaffar What do you think?

Contributor

The bot caches all data files read by runTheMatrix workflows, addOns and unit tests. It parses the log files and searches for messages like

09-Nov-2022 16:05:53 CET  Initiating request to open file root://eoscms.cern.ch//eos/cms/store/user/cmsbuild/store/relval/CMSSW_12_5_0_pre4/RelValTTbar_14TeV/GEN-SIM/124X_mcRun3_2022_realistic_v10_BS2022-v1/10000/597c0099-c758-4af7-84c3-02b6b15de660.root

We can extend the bot to parse the HLT-Validation logs too and cache those files.
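
As a rough illustration of the log scan described here (this is not the actual cms-bot code; the function name and the regular expression are assumptions based only on the message format quoted in this comment), the opened files could be collected like this:

```python
import re

# Assumption: every file open is logged with this exact phrase, followed by
# the file name or URL as a single whitespace-free token.
OPEN_FILE_RE = re.compile(r"Initiating request to open file\s+(\S+)")


def files_opened(log_text):
    """Return the unique file names/URLs found in 'open file' messages,
    in the order in which they first appear in the log."""
    seen = []
    for match in OPEN_FILE_RE.finditer(log_text):
        name = match.group(1)
        if name not in seen:
            seen.append(name)
    return seen
```

The list returned by such a scan is what a caching step would then iterate over; how cms-bot actually decides what to copy is not shown in this thread.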

Contributor

@missirol

  • For the HLT-Validation tests that run in IBs, if we were to only specify /store/{mc,data}/[..] in cmsDriver.csh, would these tests be redirected to the cmsbot cache when cmsRun runs? How does that work?

For now the HLT tests are not using the bot cache, but it is trivial to update them to start using the bot EOS cache. We just need to set [a] for this job (which I just did in cms-sw/cms-bot@5838fb2). The SITECONF in the cms-ib area has special rules [b] to first look in the bot cache and then fall back to the xrootd global redirector.

One problem is that we also often run these HLT-Val tests locally, and in that case it is convenient to just point to the TSG EOS area to avoid any issues with file access. Is there a way to run a script like this cmsDriver.csh interactively in a way that it eventually redirects to the cmsbot cache?

If you have access to cvmfs, you can set [a] locally, and your tests should then look in the bot cache first.

[a]

export CMS_PATH="/cvmfs/cms-ib.cern.ch"
export SITECONFIG_PATH="/cvmfs/cms-ib.cern.ch/SITECONF/local"

[b]
https://github.com/cms-sw/siteconf/blob/master/local/PhEDEx/storage.xml#L9-L11
https://github.com/cms-sw/siteconf/blob/master/local/storage.json#L22-L41
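
The "cache first, then global redirector" lookup order can be pictured with a small sketch. The two prefixes below are assumptions for illustration only (the cache prefix is taken from the log line quoted in this comment, the fallback is the CMS global xrootd redirector), and the function names are hypothetical; the real rules are the ones in the SITECONF files linked in [b].

```python
# Assumed prefixes, for illustration only; see the SITECONF files in [b]
# for the rules that are actually applied.
CACHE_PREFIX = "root://eoscms.cern.ch//eos/cms/store/user/cmsbuild"
FALLBACK_PREFIX = "root://cms-xrd-global.cern.ch/"


def candidate_pfns(lfn):
    """List the physical file names to try for an LFN, in priority order."""
    if not lfn.startswith("/store/"):
        raise ValueError("expected an LFN starting with /store/: %r" % lfn)
    return [CACHE_PREFIX + lfn, FALLBACK_PREFIX + lfn]


def first_openable(lfn, can_open):
    """Return the first candidate for which can_open(pfn) is true, else None.

    can_open is a caller-supplied callable (hypothetical here) that reports
    whether a given physical file name can be opened."""
    for pfn in candidate_pfns(lfn):
        if can_open(pfn):
            return pfn
    return None
```

With this ordering, a file that exists in the bot cache is never fetched through the redirector, which is the behaviour described for the cms-ib SITECONF.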

Contributor Author

@smuzaffar

Thanks a lot. It's clearer to me now.

I assume that cms-sw/cms-bot@5838fb2 applies to all release cycles from now on, correct?

I will check if the files in all the PRs like this one (e.g. #40031) are in the cmsbot cache; if that's the case, I can update these PRs removing the explicit path to the TSG area (otherwise, we need a way to cache the missing ones first).


@Martin-Grunewald , using [a] (in .bashrc or similar) we'd be able to run the HLT-Val tests locally even if we put just /store/{mc,data}/[..] as path names, without relying on the TSG area. On nodes with EOS access, I think this won't require a grid-certificate proxy.

[a]

export CMS_PATH="/cvmfs/cms-ib.cern.ch"
export SITECONFIG_PATH="/cvmfs/cms-ib.cern.ch/SITECONF/local"

Contributor Author

@smuzaffar , I'm looking into updating the HLT-Val tests to use the bot cache. I think the Data files are all in the cache (as Matti noted); for the MC files, that's not the case.

Those tests use GEN-SIM files (example) that I can't find anywhere (old datasets), except for the copies we do have in the TSG EOS area. Would it be possible to copy them manually into the bot cache (something like [1])?

Regarding

We can extend the bot to parse the HLT-Validation logs too and cache those files

Is this something you could do in the next few days? Should I wait for that before updating these PRs?

[1]

#!/bin/bash

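# Copy a file from the TSG EOS area into the cms-bot cache.
# Arguments are paths relative to /eos/cms (source) and to the cache root (destination).
copyFile(){
  local INPFILE=/eos/cms"${1}"
  local OUTFILE=/eos/cms/store/user/cmsbuild"${2}"
  # nothing to do if the input is missing or the output already exists
  ([ -f "${INPFILE}" ] && [ ! -f "${OUTFILE}" ]) || return 0
  mkdir -p "$(dirname "${OUTFILE}")"
  cp "${INPFILE}" "${OUTFILE}"
}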

copyFile /store/group/dpg_trigger/comm_trigger/TriggerStudiesGroup/STORM/GEN-SIM/CMSSW_5/143C21CD-E8A2-E311-87BE-0025904C66E8.root \
/store/relval/CMSSW_5_3_16/RelValPyquen_ZeemumuJets_pt10_2760GeV/GEN-SIM/PU_STARTHI53_LV1_mar03-v2/00000/143C21CD-E8A2-E311-87BE-0025904C66E8.root

copyFile /store/group/dpg_trigger/comm_trigger/TriggerStudiesGroup/STORM/GEN-SIM/CMSSW_5/DE03BB7E-F429-E211-A0B4-001A928116CC.root \
/store/relval/CMSSW_5_3_6-START53_V14/RelValProdTTbar/GEN-SIM/v2/00000/DE03BB7E-F429-E211-A0B4-001A928116CC.root

copyFile /store/group/dpg_trigger/comm_trigger/TriggerStudiesGroup/STORM/GEN-SIM/CMSSW_8/06A6C86B-C634-E611-93A5-0CC47A74525A.root \
/store/relval/CMSSW_8_0_11/RelValProdTTbar/GEN-SIM/80X_mcRun1_realistic_v4-v1/10000/06A6C86B-C634-E611-93A5-0CC47A74525A.root

copyFile /store/group/dpg_trigger/comm_trigger/TriggerStudiesGroup/STORM/GEN-SIM/CMSSW_8/06F2C3AC-8957-E611-9DDF-0025905B85D8.root \
/store/relval/CMSSW_8_0_16/RelValProdTTbar_13/GEN-SIM/80X_mcRun2_asymptotic_v16_gs7120p2-v1/10000/06F2C3AC-8957-E611-9DDF-0025905B85D8.root

copyFile /store/group/dpg_trigger/comm_trigger/TriggerStudiesGroup/STORM/GEN-SIM/CMSSW_8/F8FC5F64-1657-E611-A57E-002590A887F0.root \
/store/relval/CMSSW_8_0_16/RelValZEEMM_13_HI/GEN-SIM/80X_mcRun2_HeavyIon_v9-v1/10000/F8FC5F64-1657-E611-A57E-002590A887F0.root

copyFile /store/group/dpg_trigger/comm_trigger/TriggerStudiesGroup/STORM/GEN-SIM/CMSSW_9_phase1/14F749AC-8AFE-E611-9821-0CC47A78A4A0.root \
/store/relval/CMSSW_9_0_0_pre5/RelValTTbar_13/GEN-SIM/90X_upgrade2017_realistic_v15-v1/00000/14F749AC-8AFE-E611-9821-0CC47A78A4A0.root

copyFile /store/group/dpg_trigger/comm_trigger/TriggerStudiesGroup/STORM/GEN-SIM/CMSSW_10/E288668E-A2D1-D446-A401-D71EA43DD796.root \
/store/relval/CMSSW_10_3_0_pre5/RelValZEEMM_13_HI/GEN-SIM/103X_upgrade2018_realistic_v7-v1/10000/E288668E-A2D1-D446-A401-D71EA43DD796.root

copyFile /store/group/dpg_trigger/comm_trigger/TriggerStudiesGroup/STORM/GEN-SIM/CMSSW_11/3ee9ba1e-0ef8-4242-8343-cff886c9f7b3.root \
/store/relval/CMSSW_11_2_0_pre8/RelValTTbar_13/GEN-SIM/112X_mcRun3_2021_design_v10-v1/00000/3ee9ba1e-0ef8-4242-8343-cff886c9f7b3.root

copyFile /store/group/dpg_trigger/comm_trigger/TriggerStudiesGroup/STORM/GEN-SIM/CMSSW_11/65e018bc-2a25-4f53-b9cf-aba35a7b212d.root \
/store/relval/CMSSW_11_2_0_pre8/RelValZEE_14_HI_2021/GEN-SIM/112X_mcRun3_2021_realistic_HI_v11-v1/00000/65e018bc-2a25-4f53-b9cf-aba35a7b212d.root
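
For reference, the same "copy only if the source exists and the destination is missing" logic could be sketched in Python as below. This is only an illustrative equivalent of the shell function in [1]; the function name is hypothetical, and it takes full paths rather than prepending the /eos/cms prefixes.

```python
import shutil
from pathlib import Path


def copy_if_missing(src, dst):
    """Copy src to dst, creating parent directories as needed.

    Does nothing (and returns False) if src is not a regular file or
    dst already exists as a file; returns True if a copy was made."""
    src, dst = Path(src), Path(dst)
    if not src.is_file() or dst.is_file():
        return False
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy(src, dst)
    return True
```

Because existing destinations are skipped, such a function can be re-run over the whole list of files without overwriting anything already in the cache.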

Contributor

@missirol, all these files are now available in the ibeos cache.

Contributor Author

thanks @smuzaffar !

cmsbuild commented Nov 9, 2022

-1

Failed Tests: RelVals-INPUT
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-19ec01/28906/summary.html
COMMIT: 313ec2b
CMSSW: CMSSW_12_1_X_2022-11-06-0000/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/40020/28906/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals-INPUT

  • 138.4 138.4_PromptCollisions+RunMinimumBias2021+ALCARECOPROMPTR3+HARVESTDPROMPTR3/step2_PromptCollisions+RunMinimumBias2021+ALCARECOPROMPTR3+HARVESTDPROMPTR3.log

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 2 differences found in the comparisons
  • DQMHistoTests: Total files compared: 42
  • DQMHistoTests: Total histograms compared: 2901440
  • DQMHistoTests: Total failures: 6
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2901412
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 41 files compared)
  • Checked 177 log files, 37 edm output root files, 42 DQM output files
  • TriggerResults: no differences found

@cmsbuild

Pull request #40020 was updated. @cmsbuild, @missirol, @Martin-Grunewald can you please check and sign again.

@missirol

please test

The PR is now updated to use the bot cache, instead of the explicit EOS (TSG) area. Will update the PR description(s).

@Martin-Grunewald

I assume

export CMS_PATH="/cvmfs/cms-ib.cern.ch"
export SITECONFIG_PATH="/cvmfs/cms-ib.cern.ch/SITECONF/local"

works for IBs but not (pre)releases as it looks ib specific?

@missirol

I assume

export CMS_PATH="/cvmfs/cms-ib.cern.ch"
export SITECONFIG_PATH="/cvmfs/cms-ib.cern.ch/SITECONF/local"

works for IBs but not (pre)releases as it looks ib specific?

Doing [1] I don't see problems checking out releases like 12_5_0 or 12_6_0_pre4.

@smuzaffar, is there any caveat to keep in mind when setting CMS_PATH and SITECONFIG_PATH like in [1] ?

[1]

# enable redirection to cms-bot cache (EOS area of user:cmsbuild)
> export CMS_PATH="/cvmfs/cms-ib.cern.ch"
> export SITECONFIG_PATH="/cvmfs/cms-ib.cern.ch/SITECONF/local"
# enable CMS-software environment
> source /cvmfs/cms.cern.ch/cmsset_default.sh

> echo $CMS_PATH
/cvmfs/cms-ib.cern.ch
> echo $SITECONFIG_PATH/
/cvmfs/cms-ib.cern.ch/SITECONF/local/

> which scram
/cvmfs/cms.cern.ch/common/scram

@cmsbuild

-1

Failed Tests: RelVals-INPUT
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-19ec01/29223/summary.html
COMMIT: 5a70705
CMSSW: CMSSW_12_1_X_2022-11-20-0000/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/40020/29223/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals-INPUT

  • 138.4 138.4_PromptCollisions+RunMinimumBias2021+ALCARECOPROMPTR3+HARVESTDPROMPTR3/step2_PromptCollisions+RunMinimumBias2021+ALCARECOPROMPTR3+HARVESTDPROMPTR3.log

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 6 differences found in the comparisons
  • DQMHistoTests: Total files compared: 42
  • DQMHistoTests: Total histograms compared: 2901440
  • DQMHistoTests: Total failures: 6
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 2901411
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.004 KiB( 41 files compared)
  • DQMHistoSizes: changed ( 312.0 ): 0.004 KiB MessageLogger/Warnings
  • Checked 177 log files, 37 edm output root files, 42 DQM output files
  • TriggerResults: no differences found

@missirol

The failure in wf 138.4 [1] already occurs in 12_1_X IBs [2], so it's a known issue.

(reminder: this PR is insensitive to PR tests.)

[1]

----- Begin Fatal Exception 23-Nov-2022 21:50:33 CET-----------------------
An exception of category 'ConditionDatabase' occurred while
   [0] Processing Event run: 346512 lumi: 250 event: 243042266
   [1] Running path 'dqmoffline_step'
Exception Message:
Payload of type LHCInfo with id 33134a07bf5346992723195b6b8ab9e21778b71d could not be loaded. An exception of category 'ConditionDatabase' occurred.
Exception Message:
De-serialization failed: the current boost version (1_75) is unable to read the payload. Data might have been serialized with an incompatible version. Payload serialization info:  {
"CMSSW_version": "CMSSW_12_3_0",
"architecture": "slc7_amd64_gcc10",
"technology": "boost/serialization",
"tech_version": "1_78"
 }
 from default_deserialize 
 from Session::fetchPayload 
----- End Fatal Exception -------------------------------------------------

[2]
https://cmssdt.cern.ch/SDT/cgi-bin/logreader/cc8_amd64_gcc9/CMSSW_12_1_X_2022-11-20-0000/pyRelValMatrixLogs/run/138.4_PromptCollisions+RunMinimumBias2021+ALCARECOPROMPTR3+HARVESTDPROMPTR3/step2_PromptCollisions+RunMinimumBias2021+ALCARECOPROMPTR3+HARVESTDPROMPTR3.log

@missirol

+hlt

@cms-sw/orp-l2 , this PR is specific to 12_1_X: it fixes the HLT-Validation tests in 12_1_X IBs, which are currently failing.

The RelVal error is expected, see #40020 (comment).

@cmsbuild

This pull request is fully signed and it will be integrated in one of the next CMSSW_12_1_X IBs (but tests are reportedly failing) and once validation in the development release cycle CMSSW_12_6_X is complete. This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@perrotta

+1

@perrotta

merge

@cmsbuild cmsbuild merged commit 6e412aa into cms-sw:CMSSW_12_1_X Nov 24, 2022
@missirol missirol deleted the fixHLTValTests_121X branch November 24, 2022 10:52