-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix when neither "data" nor "MC" is specified in the cmsDriver. #34452
Fix when neither "data" nor "MC" is specified in the cmsDriver. #34452
Conversation
test parameters:
|
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-34452/23889
|
A new Pull Request was created by @srimanob (Phat Srimanobhas) for master. It involves the following packages:
@cmsbuild, @silviodonato, @qliphy, @fabiocos, @davidlange6 can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
please test |
# This is a temporary solution to make validation-only sequence work in IB. One should define workflow properly. | ||
if options.era and 'Phase2' in options.era: | ||
options.isMC=True | ||
if 'miniAODValidation' in options.step: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was counter-intuitive to me that all miniAODValidation
must be considered as "MC": of course, we produce miniAOD also from real data.
Then I realized that the miniAODValidationSequence
only includes a genParticlesValidation
: therefore, the fix here makes sense... but the name of the sequence probably not!
@hatakeyamak @jfernan2 : is there any (hystorical) reason for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've never seen that we use miniAODValidationSequence with data before. In UL, PdmV uses DQM:@rerecoCommon
or DQM:@standardDQM+@ExtraHLT
I agree that this makes confusion on what it for MC/Data or both
[1, 2016 UL]
https://cmsweb.cern.ch/couchdb/reqmgr_config_cache/5ad31c9be7cfbf89c62a938d7fa94dfa/configFile
[2, B-Parking]
https://cmsweb.cern.ch/couchdb/reqmgr_config_cache/be67f91f945866144130b3b61e2e2994/configFile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was counter-intuitive to me that all
miniAODValidation
must be considered as "MC"
May I ask why? (I'm mostly just curious) I mean, we tend to use the word "validation" for MC and "DQM" for data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was counter-intuitive to me that all
miniAODValidation
must be considered as "MC"May I ask why? (I'm mostly just curious) I mean, we tend to use the word "validation" for MC and "DQM" for data.
Ah, ok: that was the key I missed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@makortel @perrotta
If that is the key, the issue may solve easily. However, I see that we also use them together, i.e.
VALIDATION:@phase2Validation+@miniAODValidation,DQM:@phase2+@miniAODDQM in (*)
Can I simply say that if we have VALIDATION step, it will always be MC workflow?
And if we have only DQM, it will be data workflow?
(*)
https://cmsweb.cern.ch/couchdb/reqmgr_config_cache/ab0d87d02b073cc5e7ae3f16e1350bbb/configFile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then I assume we should be OK if DQM-only is used with MC because CMSSW should treat it like data in this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, we tend to use the word "validation" for MC and "DQM" for data.
I believe this is not true in a practical and general way, as you can see from:
where there is a mixture of VALIDATION,DQM and VALIDATION/DQM over data and MC too
What it is true is that the DQM Unit Tests performed on:
https://github.com/cms-sw/cmssw/blob/master/DQMOffline/Configuration/test/runtests.sh#L23
are using a single (real) data input file, to simplify them, so we could switch these tests to isData=True to make them run with Phat's PR
In these tests, since a module with a given label can have different configurations depending on options such as Era,
Scenario, Data vs. MC etc, if multiple configurations for the same name were found, they are listed separately
here and denoted using subscripts. That is why the changes in this PR are failing somehow.
I don't know if the changes in this PR have implications in the standard workflows though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, we tend to use the word "validation" for MC and "DQM" for data.
I believe this is not true in a practical and general way, as you can see from:
Right, sorry for being imprecise. What I really meant (and what would have been sufficient) is that what we tend to call "validation" in the context of "standard sequences" requires MC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What it is true is that the DQM Unit Tests performed on:
https://github.com/cms-sw/cmssw/blob/master/DQMOffline/Configuration/test/runtests.sh#L23
are using a single (real) data input file, to simplify them, so we could switch these tests to isData=True to make them run with Phat's PR
This may not true. Somehow the test also picks DQM-only for Phase-2 workflow which does not exist (or I miss it somehow). CMSSW already guess that it is data-workflow, but still, fail. See in https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-7e50b2/16755/unitTests/failed.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The input file for the tests is always the same, and it is an EGamma 2018 RAW data file downloaded at the beginning of the script to avoid delays from AAA:
--infile file:///data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_0_X_2021-07-12-2300/tmp/004D960A-EA4C-E811-A908-FA163ED1F481.root
-1 Failed Tests: UnitTests Unit TestsI found errors in the following unit tests: ---> test TestDQMOfflineConfiguration_160 had ERRORS ---> test TestDQMOfflineConfiguration_20 had ERRORS ---> test TestDQMOfflineConfiguration_200 had ERRORS ---> test TestDQMOfflineConfiguration_190 had ERRORS and more ... Comparison SummarySummary:
|
I still see the error of the unit test. Maybe the best/proper way is to make test script pick up the proper config, either MC or data. VALIDATION only sequence is a bit tricky as it does not fall into any rule of guess as MC workflow. |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-34452/23908
|
+1 |
@cms-sw/xpog-l2 I've checked the Mini and Nano sequences, Basically, they are the same. They check first if isData is True or not. If not, MC (split between Fast/Full in Mini) customization will be used. So the code is already in the good shape already. |
Can this PR be merged? I don't see any point to hold it now. There is nothing change for XPOG, except cmsDriver can dump proper config when neither --data nor --mc is specified for production workflow. |
@gouskos did you checked ? |
I would propose we merge this as we don't have (yet) a new proposal to make MC/Data guess from xpog. This PR does not change the way we guess, just make the setting right for data part. I see no point to hold it. |
Hi @srimanob I am not done checking possible issues. |
+xpog |
+operations |
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy, @perrotta (and backports should be raised in the release meeting by the corresponding L2) |
+1 |
PR description:
This PR is the first draft to fix the issue reported in
#34443
It is #34302 (which is currently reverted) with additional fix to make DQM/validation test on IB work. Detail is in #34443
PR validation:
Run successfully with
scram b runtests_TestDQMOfflineConfiguration_110
if this PR is a backport please specify the original PR and why you need to backport that PR:
Not a backport, and no need a backport at the moment.