
Developers overriding production location for retrieving conditions in configurations #27393

Closed
davidlange6 opened this issue Jun 28, 2019 · 99 comments · Fixed by #36408

@davidlange6

These break the centrally determined location for retrieving conditions (which can change, even if it rarely does). All of these should rely on the global tag to determine where the conditions are to be taken from (or they should not be loaded into production workflows):

CondDB.connect = cms.string("frontier://FrontierProd/CMS_CONDITIONS")

self.process.loadRecoTauTagMVAsFromPrepDB.connect = cms.string(conditionDB)

CondDB.connect = cms.string("frontier://FrontierProd/CMS_CONDITIONS")
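
For contrast, a minimal sketch of the GT-driven pattern that these lines bypass, mirroring the standard cmsDriver-generated setup (the GT alias is illustrative):

import FWCore.ParameterSet.Config as cms
from Configuration.AlCa.GlobalTag import GlobalTag

process = cms.Process("TEST")
process.load('Configuration.StandardSequences.FrontierConditions_GlobalTag_cff')
# the Global Tag alone decides which tags are read and from where;
# no per-module connect string is hard-coded
process.GlobalTag = GlobalTag(process.GlobalTag, 'auto:run2_data', '')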

@cmsbuild commented Jun 28, 2019

A new Issue was created by @davidlange6 David Lange.

@davidlange6, @Dr15Jones, @smuzaffar, @fabiocos, @kpedro88 can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@davidlange6

assign l1,db,reco

@cmsbuild

New categories assigned: db,l1

@ggovi,@benkrikler,@rekovic you have been requested to review this Pull request/Issue and eventually sign? Thanks

@davidlange6

assign reconstruction

@cmsbuild

New categories assigned: reconstruction

@slava77,@perrotta you have been requested to review this Pull request/Issue and eventually sign? Thanks

@ggovi commented Jun 28, 2019

My personal opinion is that a cfg file containing a condition customization (defined with one or more toGet statements) should NEVER be invoked (directly or included) in a production workflow, since:

  • it overrides the record/tag mapping provided by the official Global Tag, referencing tags that may stay outside central control (in terms of testing and validation)
  • it potentially breaks reproducibility

For this reason, I think that the first measure required is to find out the reason for each customisation concerned, and remove them from the production workflows.
This will potentially imply:

  • replacing some tags in existing GTs (?)
  • creating additional GTs (?)

I think AlCa people should be involved in this review procedure
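
For concreteness, a sketch of the kind of condition customization being discussed, with hypothetical record and tag names; it overrides whatever the Global Tag maps for that record:

import FWCore.ParameterSet.Config as cms

process = cms.Process("TEST")
process.load('Configuration.StandardSequences.FrontierConditions_GlobalTag_cff')
# hypothetical override: this record is now resolved via MyPrivateTag_v1,
# possibly from a different DB, instead of the tag chosen by the GT
process.GlobalTag.toGet = cms.VPSet(
    cms.PSet(
        record = cms.string('TauTagMVAComputerRcd'),
        tag = cms.string('MyPrivateTag_v1'),
        connect = cms.string('frontier://FrontierPrep/CMS_CONDITIONS')
    )
)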

@slava77 commented Jun 28, 2019

for cmssw/RecoTauTag/RecoTau/python/tools/runTauIdMVA.py
@mbluj @steggema @swozniewski please comment and perhaps follow up with a resolution

@fabiocos

for the L1Trigger instances: @rekovic could you please address this issue?

@ggovi commented Jun 28, 2019

Can we add alca to this loop?

@kpedro88

assign alca

@cmsbuild

New categories assigned: alca

@christopheralanwest,@franzoni,@tlampen,@pohsun,@tocheng you have been requested to review this Pull request/Issue and eventually sign? Thanks

@mbluj commented Jun 28, 2019

Hello,
comments on

self.process.loadRecoTauTagMVAsFromPrepDB.connect = cms.string(conditionDB)

  1. TauID payloads, i.e. BDT training files and pt-dependent WP definitions, are mapped via the python configuration file RecoTauTag/Configuration/python/loadRecoTauTagMVAsFromPrepDB_cfi.py rather than via a GlobalTag. In fact the payloads do not define what one could call "conditions", but rather a version of a given tau identification algorithm.
  2. This particular tool (runTauIdMVA.py) is used to add new tauIDs to the process object and then potentially to a given workflow - PAT/MiniAOD or post-PAT/MiniAOD. Until 10_6_X the tool was used only by final users to produce their analysis ntuples on top of MiniAOD. Since 10_6_X the tool has been used to add one particular tauID to MiniAOD (namely DeepTauID).
  3. As it is handy to maintain the definition of tauIDs in one place, it is being considered to use the tool also to define upgraded tauIDs for nanoAOD.
  4. Concerning retrieving conditions in #L41 of the tool: it is inactive. It can be activated only if a nonempty connection address is defined in #L31. The reason for this implementation is to make testing payloads from the preparation DB simpler (see the sketch after this list).
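
A minimal sketch of the activation logic described in point 4, with illustrative variable names rather than the exact code of runTauIdMVA.py (in the tool the process is reached via self.process):

import FWCore.ParameterSet.Config as cms

process = cms.Process("TEST")
process.load('RecoTauTag.Configuration.loadRecoTauTagMVAsFromPrepDB_cfi')

# an empty string keeps the override inactive and the default mapping in place;
# a nonempty prep-DB address activates it, intended for payload tests only
conditionDB = ""  # e.g. "frontier://FrontierPrep/CMS_CONDITIONS"

if conditionDB != "":
    process.loadRecoTauTagMVAsFromPrepDB.connect = cms.string(conditionDB)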

@slava77 commented Jul 1, 2019

+1

The only case affecting reco is clarified in #27393 (comment).
This looks like a reasonable use case for private tests, which does not affect the production setup.

@davidlange6

These are the payloads that were needed to run digi+reco without frontier; this contradicts the conclusion of @slava77 and @mbluj:

https://gitlab.cern.ch/hep-benchmarks/hep-workloads/blob/cae5bd13ac1aaf3df01f224db642f2c00df26d93/cms/reco/cms-reco/generate_GlobalTag.sh

@slava77 commented Jul 2, 2019

> generate_GlobalTag.sh

Please elaborate on what's in this file.
Thank you.

@davidlange6

All the payloads not in the GT without which a digi-reco-miniaod workflow will not run.

@davidlange6

Which is not to say whether or not they are actually read and used, but rather that our applications depend on them being present in order to function - perhaps for purely technical reasons.

@davidlange6

(sorry, didn't mean to close this)

@slava77 commented Jul 2, 2019

I think that the origin is in

CondDBTauConnection = CondDB.clone( connect = cms.string( 'frontier://FrontierProd/CMS_CONDITIONS' ) )

not in the

self.process.loadRecoTauTagMVAsFromPrepDB.connect = cms.string(conditionDB)

@katilp commented Jan 26, 2024

If these changes are expected to affect the output, we would like to have them as soon as possible for testing and eventual reprocessing of the example PFNano datasets that will go into the open data release.
Changing the release has a big impact on the work to be done as well, but we will need to do much of that work for a new GT in any case.

@mmusich commented Jan 26, 2024

For my edification, why is CMS releasing open data related to a custom workflow (requiring unsupported conditions)?

@katilp commented Jan 26, 2024

> For my edification, why is CMS releasing open data related to a custom workflow (requiring unsupported conditions)?

Producing NanoAOD enriched with the PF candidates is of major interest to the open data user community; that is why we are providing this example. We do not use unsupported conditions on purpose. We are more than happy to use supported conditions; please let us know how to modify the workflow or point us to the relevant documentation. Thank you, your help will be appreciated 🙂!

@mbluj commented Jan 26, 2024

@katilp Executing your cmsDriver command I got the following error:

ImportError: No module named PFNano.pfnano_cff

And indeed there is no PhysicsTools/PFNano package in 10_6_30, nor the customization cff therein. Could you please provide a full installation recipe, for copy and paste, to reproduce the issue?

In parallel I checked that this configuration without the customization works without problems. This very probably means that the issue is caused by the customization.

I will get back to it at the beginning of next week, but I am not sure if I will have time on Monday due to other duties.

@katilp commented Jan 26, 2024

> For my edification, why is CMS releasing open data related to a custom workflow (requiring unsupported conditions)?
>
> Producing NanoAOD enriched with the PF candidates is of major interest to the open data user community; that is why we are providing this example. We do not use unsupported conditions on purpose. We are more than happy to use supported conditions; please let us know how to modify the workflow or point us to the relevant documentation. Thank you, your help will be appreciated 🙂!

Also please note that this error appears without any customization and has nothing to do with open data, other than open data preparations helping CMS to improve the codebase ❤️

@mmusich commented Jan 26, 2024

> Also please note that this error appears without any customization and has nothing to do with open data, other than open data preparations helping CMS to improve the codebase

This seems in contradiction with #27393 (comment).

@mbluj commented Jan 26, 2024

> For my edification, why is CMS releasing open data related to a custom workflow (requiring unsupported conditions)?
>
> Producing NanoAOD enriched with the PF candidates is of major interest to the open data user community; that is why we are providing this example. We do not use unsupported conditions on purpose. We are more than happy to use supported conditions; please let us know how to modify the workflow or point us to the relevant documentation. Thank you, your help will be appreciated 🙂!
>
> Also please note that this error appears without any customization and has nothing to do with open data, other than open data preparations helping CMS to improve the codebase ❤️

As written in the comment above, I was not able to reproduce the issue w/o the customization, i.e. executing the following:

cmsrel CMSSW_10_6_30
cd CMSSW_10_6_30/src
cmsenv
cmsDriver.py data_2016UL_OpenData --data --eventcontent NANOAODSIM --datatier NANOAODSIM --step NANO --conditions 106X_dataRun2_v37   --era Run2_2016,run2_nanoAOD_106Xv2 --customise_commands="process.add_(cms.Service('InitRootHandlers', EnableIMT = cms.untracked.bool(False)))" --nThreads 4 -n 100 --filein /store/data/Run2016H/JetHT/MINIAOD/UL2016_MiniAODv2-v2/130000/676E37D2-044C-D346-92D9-A127A55FD279.root --fileout file:nano_data2016_nopf.root  --no_exec
voms-proxy-init --voms cms
cmsRun data_2016UL_OpenData_NANO.py

The cmsDriver command with customization does not work for me.

@katilp commented Jan 26, 2024

> @katilp Executing your cmsDriver command I got the following error:
>
> ImportError: No module named PFNano.pfnano_cff
>
> And indeed there is no PhysicsTools/PFNano package in 10_6_30, nor the customization cff therein. Could you please provide a full installation recipe, for copy and paste, to reproduce the issue?
>
> In parallel I checked that this configuration without the customization works without problems. This very probably means that the issue is caused by the customization.
>
> I will get back to it at the beginning of next week, but I am not sure if I will have time on Monday due to other duties.

Thank you!

cmsrel CMSSW_10_6_30
cd CMSSW_10_6_30/src

git clone https://github.com/cms-opendata-analyses/PFNanoProducerTool.git PhysicsTools/PFNano
scram b
cd PhysicsTools/PFNano/
cmsenv

cmsRun <config_below>

For the sake of brevity, removing the customization (updated: bringing back two default lines for customization):

import FWCore.ParameterSet.Config as cms

from Configuration.Eras.Era_Run2_2016_cff import Run2_2016
from Configuration.Eras.Modifier_run2_nanoAOD_106Xv2_cff import run2_nanoAOD_106Xv2

process = cms.Process('NANO',Run2_2016,run2_nanoAOD_106Xv2)

# import of standard configurations
process.load('Configuration.StandardSequences.Services_cff')
process.load('SimGeneral.HepPDTESSource.pythiapdt_cfi')
process.load('FWCore.MessageService.MessageLogger_cfi')
process.load('Configuration.EventContent.EventContent_cff')
process.load('Configuration.StandardSequences.GeometryRecoDB_cff')
process.load('Configuration.StandardSequences.MagneticField_AutoFromDBCurrent_cff')
process.load('PhysicsTools.NanoAOD.nano_cff')
process.load('Configuration.StandardSequences.EndOfProcess_cff')
process.load('Configuration.StandardSequences.FrontierConditions_GlobalTag_cff')

process.maxEvents = cms.untracked.PSet(
    input = cms.untracked.int32(100)
)

# Input source
process.source = cms.Source("PoolSource",
    fileNames = cms.untracked.vstring('root://eospublic.cern.ch//eos/opendata/cms/Run2016G/JetHT/MINIAOD/UL2016_MiniAODv2-v2/130000/35017A26-8C9D-204D-92B6-3ABFBBD4ADF3.root'),
    secondaryFileNames = cms.untracked.vstring()
)

process.options = cms.untracked.PSet(

)

# Production Info
process.configurationMetadata = cms.untracked.PSet(
    annotation = cms.untracked.string('nano_data_2016_UL nevts:100'),
    name = cms.untracked.string('Applications'),
    version = cms.untracked.string('$Revision: 1.19 $')
)

# Output definition

process.NANOAODSIMoutput = cms.OutputModule("NanoAODOutputModule",
    compressionAlgorithm = cms.untracked.string('LZMA'),
    compressionLevel = cms.untracked.int32(9),
    dataset = cms.untracked.PSet(
        dataTier = cms.untracked.string('NANOAODSIM'),
        filterName = cms.untracked.string('')
    ),
    fileName = cms.untracked.string('file:nano_data2016.root'),
    outputCommands = process.NANOAODSIMEventContent.outputCommands
)

# Additional output definition

# Other statements
from Configuration.AlCa.GlobalTag import GlobalTag
#process.GlobalTag = GlobalTag(process.GlobalTag, '106X_dataRun2_v37', '')
process.GlobalTag.connect = cms.string('sqlite_file:/cvmfs/cms-opendata-conddb.cern.ch/106X_dataRun2_v37.db')
process.GlobalTag.globaltag = '106X_dataRun2_v37'

# Path and EndPath definitions
process.nanoAOD_step = cms.Path(process.nanoSequence)
process.endjob_step = cms.EndPath(process.endOfProcess)
process.NANOAODSIMoutput_step = cms.EndPath(process.NANOAODSIMoutput)

# Schedule definition
process.schedule = cms.Schedule(process.nanoAOD_step,process.endjob_step,process.NANOAODSIMoutput_step)
from PhysicsTools.PatAlgos.tools.helpers import associatePatAlgosToolsTask
associatePatAlgosToolsTask(process)

#Setup FWK for multithreaded
process.options.numberOfThreads=cms.untracked.uint32(4)
process.options.numberOfStreams=cms.untracked.uint32(0)
process.options.numberOfConcurrentLuminosityBlocks=cms.untracked.uint32(1)

# customisation of the process.
# Automatic addition of the customisation function from PhysicsTools.NanoAOD.nano_cff
from PhysicsTools.NanoAOD.nano_cff import nanoAOD_customizeData

#call to customisation function nanoAOD_customizeData imported from PhysicsTools.NanoAOD.nano_cff
process = nanoAOD_customizeData(process)

# End of customisation functions

# Customisation from command line

process.add_(cms.Service('InitRootHandlers', EnableIMT = cms.untracked.bool(False)))
# Add early deletion of temporary data products to reduce peak memory need
from Configuration.StandardSequences.earlyDeleteSettings_cff import customiseEarlyDelete
process = customiseEarlyDelete(process)
# End adding early deletion

Note that it will not fail if a frontier connection is available, only when there is none (to come back to the original question).

@mbluj commented Jan 26, 2024

Ah, OK. How can I remove the connection to frontier w/o losing the connection to the GT?

@mbluj commented Jan 26, 2024

> Ah, OK. How can I remove the connection to frontier w/o losing the connection to the GT?

I see it in your cfg file. I will test it later, but now I must run.

@mbluj commented Jan 26, 2024

With the configuration copied from above, which connects to conditions in a sqlite file and gives a python dump like this:

>>> process.GlobalTag
cms.ESSource("PoolDBESSource",
    DBParameters = cms.PSet(
        authenticationPath = cms.untracked.string(''),
        authenticationSystem = cms.untracked.int32(0),
        messageLevel = cms.untracked.int32(0),
        security = cms.untracked.string('')
    ),
    DumpStat = cms.untracked.bool(False),
    ReconnectEachRun = cms.untracked.bool(False),
    RefreshAlways = cms.untracked.bool(False),
    RefreshEachRun = cms.untracked.bool(False),
    RefreshOpenIOVs = cms.untracked.bool(False),
    connect = cms.string('sqlite_file:/cvmfs/cms-opendata-conddb.cern.ch/106X_dataRun2_v37.db'),
    globaltag = cms.string('106X_dataRun2_v37'),
    pfnPostfix = cms.untracked.string(''),
    pfnPrefix = cms.untracked.string(''),
    snapshotTime = cms.string(''),
    toGet = cms.VPSet()
)

it still works for me...

@mmusich commented Jan 26, 2024

For the sake of honesty,

> For the sake of brevity, removing the customization

this is not exactly removing the customization: you are still checking out a package that is not available in the release via

git clone https://github.com/cms-opendata-analyses/PFNanoProducerTool.git PhysicsTools/PFNano

In a self-contained CMSSW_10_6_30 (which is what is normally accepted as centrally supported) the configuration above doesn't run.

@katilp commented Jan 26, 2024

> Ah, OK. How can I remove the connection to frontier w/o losing the connection to the GT?
>
> I see it in your cfg file. I will test it later, but now I must run.

No, connecting to the /cvmfs area for condition data is not enough; it does not cut the frontier connection. That's why this goes unobserved, and it only fails when there is no frontier connection at all, i.e. in the CMS open data VM, or in the CMS open data docker container if one explicitly removes the frontier connection, see https://cms-opendata-releaseguide.docs.cern.ch/computing_environment/containers/#testing-without-frontier-connection
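
To serve those extra payloads locally as well, a hedged sketch of the redirection, to be appended at the end of the configuration above; it assumes (and this is exactly the catch) that the needed tau payloads were also dumped into the sqlite snapshot, which is not the case by default:

# hypothetical: point the non-GT tau payload source at the local sqlite snapshot,
# so that no frontier fallback is attempted; works only if the payloads are in the file
if hasattr(process, 'loadRecoTauTagMVAsFromPrepDB'):
    process.loadRecoTauTagMVAsFromPrepDB.connect = cms.string(
        'sqlite_file:/cvmfs/cms-opendata-conddb.cern.ch/106X_dataRun2_v37.db')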

@katilp commented Jan 26, 2024

> For the sake of honesty,
>
> > For the sake of brevity, removing the customization
>
> this is not exactly removing the customization: you are still checking out a package that is not available in the release via
>
> git clone https://github.com/cms-opendata-analyses/PFNanoProducerTool.git PhysicsTools/PFNano
>
> In a self-contained CMSSW_10_6_30 (which is what is normally accepted as centrally supported) the configuration above doesn't run.

Yes, of course, this is what we provide for open data users. We do not point them to anything centrally supported but provide them examples of how they can use CMS open data. We believe that this is an issue independent from the following:

> With the configuration copied from above, which connects to conditions in a sqlite file and gives a python dump like this: [GlobalTag dump quoted above]
>
> it still works for me...

Yes, it will work if you do not cut the frontier connection. This is the issue. It goes unobserved.

@katilp commented Jan 26, 2024

> [quotes the whole exchange above]

If you cannot cut the frontier connection, take the RecoTauTag package, add prints to this file locally, and you will see it ending up there. And that's all that I'm trying to say.

@mmusich commented Jan 26, 2024

> If you cannot cut the frontier connection, take the RecoTauTag package, add prints to this file locally, and you will see it ending up there. And that's all that I'm trying to say.

I can indeed make the process crash by short-circuiting this via this recipe:

cmsrel CMSSW_10_6_30
cd CMSSW_10_6_30/src
cmsenv
cmsDriver.py data_2016UL_OpenData --data --eventcontent NANOAODSIM --datatier NANOAODSIM --step NANO --conditions 106X_dataRun2_v37 --era Run2_2016,run2_nanoAOD_106Xv2 --customise_commands="process.add_(cms.Service('InitRootHandlers', EnableIMT = cms.untracked.bool(False)));delattr(process, 'loadRecoTauTagMVAsFromPrepDB')" --nThreads 4 -n 100 --filein /store/data/Run2016H/JetHT/MINIAOD/UL2016_MiniAODv2-v2/130000/676E37D2-044C-D346-92D9-A127A55FD279.root --fileout file:nano_data2016_nopf.root --no_exec
voms-proxy-init --voms cms
cmsRun data_2016UL_OpenData_NANO.py

which is independent of the PFNano customization.
The issue persists in CMSSW_14_0_0_pre2, so it is indeed not fully solved even in recent (pre-)releases.

@hqucms commented Jan 29, 2024

I did some more investigation for CMSSW_14_0_0_pre2 based on the recipe from @mmusich. It turns out that after changing two things I can get it to work:

  1. Removing these tasks from patTauMVAIDsTask. It seems that they are not used anyhow, as the output NANO content remains unchanged after removing them. Maybe @mbluj can confirm whether they are indeed unused?
  2. Switching the GT from 106X_dataRun2_v37 to a more recent one (I just used auto:run2_data). It seems that some of the MVA tags for boosted taus are only included since 113X GTs.

For the open NANO release, I suppose the easiest way is to gather a list of the tags needed but not in the GT, and just dump/add them into a sqlite file? I dumped a list of the tags being loaded manually in the 106X workflow. Probably not all of them are strictly needed, but having all of them should make things work w/o connecting to Frontier (see the sketch below).

tags.txt
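
If the sqlite route is taken, a sketch of how such tags could be wired in on top of the NANO configuration; the record/tag pair shown is one example from this thread, the sqlite file name is hypothetical, and the full list would come from tags.txt:

import FWCore.ParameterSet.Config as cms

# in the real workflow this is the existing NANO process
process = cms.Process("NANO")
process.load('Configuration.StandardSequences.FrontierConditions_GlobalTag_cff')

extra_tau_tags = [
    # (record, tag) pairs to be filled in from a dump like tags.txt
    ('GBRWrapperRcd', 'RecoTauTag_antiElectronMVA6v3_noeveto_gbr_NoEleMatch_woGwoGSF_BL'),
]
# route each tag missing from the GT through a local sqlite snapshot
for record, tag in extra_tau_tags:
    process.GlobalTag.toGet.append(cms.PSet(
        record = cms.string(record),
        tag = cms.string(tag),
        connect = cms.string('sqlite_file:extra_tau_payloads.db')
    ))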

@vlimant commented Jan 29, 2024

> 2. Switching the GT from 106X_dataRun2_v37 to a more recent one (I just used auto:run2_data). It seems that some of the MVA tags for boosted taus are only included since 113X GTs.

What GT have you used instead of auto:run2_data?

@mbluj commented Jan 29, 2024

> [quotes @hqucms's investigation above]

Thanks @hqucms!
I think you are right about the tasks to be removed. It is actually what I mentioned earlier here (or in another parallel thread): the whole content of taus_updatedMVAIds_cff.py should be reviewed (I suppose it is not needed anymore). I plan to do it in the next few days and prepare a PR to master (and backports if needed).
I also agree that adding the missing payloads to the GT (thanks for the list!) is the quickest fix for open-data workflows, as it does not require a new CMSSW release.

@katilp commented Jan 29, 2024

> [quotes @hqucms's investigation and @mbluj's reply above]

Thanks! For open data, we would prefer the cleanest solution, i.e. a new release with the fixes and a new GT, if that can happen very shortly. It will be more work for us now to change already prepared material, but in the future it will avoid patches and additional explanations in the CMS open data tutorials and guides. What would be the estimated timescale?

@hqucms commented Jan 29, 2024

> 2. Switching the GT from 106X_dataRun2_v37 to a more recent one (I just used auto:run2_data). It seems that some of the MVA tags for boosted taus are only included since 113X GTs.
>
> What GT have you used instead of auto:run2_data?

@vlimant auto:run2_data points to 133X_dataRun2_v2 in CMSSW_14_0_0_pre2. It seems that the tags missing in 106X_dataRun2_v37 were introduced in ~113X (e.g., https://cms-conddb.cern.ch/cmsDbBrowser/search/Prod/RecoTauTag_antiElectronMVA6v3_noeveto_gbr_NoEleMatch_woGwoGSF_BL).
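
For reference, the concrete GT behind an auto: alias can be checked directly in a release area via the autoCond map that ships with CMSSW:

from Configuration.AlCa.autoCond import autoCond

# prints the GT that 'auto:run2_data' resolves to in the current release,
# e.g. 133X_dataRun2_v2 in CMSSW_14_0_0_pre2
print(autoCond['run2_data'])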

@hqucms commented Jan 29, 2024

> [quotes the exchange above]

I think if we create a dedicated sqlite file or make a new GT to include the missing tags, then no change is needed in the release. And since open data will be using a sqlite file anyhow, it might be much easier to just add the missing tags to the sqlite file, rather than making a new GT. Maybe the AlCa/DB group can comment on this?

@katilp commented Jan 29, 2024

> [quotes the exchange above]

I think they commented in https://cms-talk.web.cern.ch/t/condition-database-access-outside-of-gt-for-nano-production/33715/5
In any case, a change is needed in the code, either to remove the tasks or to remove the frontier connection, and from the open data point of view we would prefer it clean. Open data will be using this release for years to come.

@mbluj commented Feb 1, 2024

Hello, sorry for the delay in answering, but I was taken up by other commitments. So, to summarize what needs to be done:

  1. Removal of tauIDs with payloads not in the GT from official workflows:
    it is quite straightforward in master, but requires more work in the UL release series (10_6), where nothing in this direction has been done as far as I remember.
  2. I expect that cleaning in the UL/10_6 releases will anyway require a GT update, as I suppose the corresponding GTs do not contain even the minimal required set of payloads - to be checked.
  3. What should be done with the intermediate release series (>10_6 & <14_0)? Do we want a backport of the cleaning to all of them? I would like to avoid it if not strictly necessary.

About the timescale: I have other things on my plate (e.g. some L1 stuff for 2024 & phase-2), but I can reorder this if necessary. I suppose that cleaning master will take 1-2 days, and the backport to 10_6 a few additional days. As concerns the GT update, I have no experience with it, so I either have to learn how to do it or someone else should do the job - any help will be appreciated.

@perrotta commented Feb 1, 2024

Thank you @mbluj

If I understand it correctly we should:

  • Remove tauIDs with payloads not in the GT from the master
  • Add the missing payloads to the GTs for UL/10_6 (AlCa can do it)

If you remove the tauIDs also from UL/10_6 then no update of the GTs is needed, but as it will probably require modifying a closed release in a non-trivial way, this should probably be avoided.

For the intermediate releases I would check one by one: probably we can avoid acting on all of them, but it depends on what is intended to be done with them.

@mbluj commented Feb 1, 2024

> Thank you @mbluj
>
> If I understand it correctly we should:
>
> • Remove tauIDs with payloads not in the GT from the master

Correct. As far as I understand, the problem touches only NanoAOD workflows.

> • Add the missing payloads to the GTs for UL/10_6 (AlCa can do it)

Yes, but the number of payloads to add is potentially bigger compared to what is in the GT for master and other release series.

> If you remove the tauIDs also from UL/10_6 then no update of the GTs is needed, but as it will probably require modifying a closed release in a non-trivial way, this should probably be avoided.

It should be checked. In principle a "full cleaning" can require changes in the AOD/RECO, miniAOD and NanoAOD sequences, but I suppose that the changes will not affect the data content at any of the datatiers. The only effect could be the removal of era-dependent modifications for compatibility with old (pre-UL) samples. Anyway, if I understand it correctly, the idea is to update only the NanoAOD sequences, as it is not expected to produce new UL-like samples other than NanoAOD for OpenData purposes, right?

> For the intermediate releases I would check one by one: probably we can avoid acting on all of them, but it depends on what is intended to be done with them.

OK. Changes in intermediate releases newer than 11_3 (if I am correct) will be similar to those in master, while for older ones similar to those for 10_6. But even trivial backporting and testing (sometimes creating a GT) for a number of release series is already an additional burden.

@jmhogan commented Feb 26, 2024

Pinging this thread along with #43797 (@mbluj)

In the Open Data context, DPOA's preferred solution is cleaning these IDs from UL/10_6 so that a new release can be used without a new (bigger) GT.

Nano sequences are very likely to be the most used in Open Data, but we do provide instructions on how users can produce their own MC, which follows the full sequence. It would be ideal to clean this out fully. But the Nano sequences have the highest priority. We can test where in the full production chain it fails if needed.

@vlimant commented May 13, 2024

please close

done with #44685 until further notice
