New MVA-based Tau-Ids (update) #146
Dear all, I see that the two int numbers for the cut-based tau ID have not been added to PhysicsTools/NanoAOD/python/taus_cff.py. Can these please still be added? I think the option to switch to the cut-based tau ID discriminators may be useful in case of problems with data/MC differences in the MVA tau ID input variables (the Tau POG is seeing some evidence for such data/MC differences right now and is investigating).
Hi Michal, |
- use run2_nanoAOD_94XMiniAODv1/v2 eras for 94X
- it's not clear what's happening on 80X
    cut = cms.string("pt > 18 && tauID('decayModeFindingNewDMs') && (tauID('byLooseCombinedIsolationDeltaBetaCorr3Hits') || tauID('byVLooseIsolationMVArun2v1DBoldDMwLT') || tauID('byVLooseIsolationMVArun2v1DBnewDMwLT') || tauID('byVLooseIsolationMVArun2v1DBdR03oldDMwLT') || tauID('byVVLooseIsolationMVArun2v1DBoldDMwLT2017v2') || tauID('byVVLooseIsolationMVArun2v1DBnewDMwLT2017v2') || tauID('byVVLooseIsolationMVArun2v1DBdR03oldDMwLT2017v2'))")
)
run2_miniAOD_94XFall17.toModify(finalTaus,
    cut = cms.string("pt > 18 && tauID('decayModeFindingNewDMs') && (tauID('byLooseCombinedIsolationDeltaBetaCorr3Hits') || tauID('byVLooseIsolationMVArun2v1DBoldDMwLT2015') || tauID('byVLooseIsolationMVArun2v1DBnewDMwLT') || tauID('byVLooseIsolationMVArun2v1DBdR03oldDMwLT') || tauID('byVVLooseIsolationMVArun2v1DBoldDMwLT2017v2') || tauID('byVVLooseIsolationMVArun2v1DBnewDMwLT2017v2') || tauID('byVVLooseIsolationMVArun2v1DBdR03oldDMwLT2017v2'))")
)
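The selection string in this diff is a pt threshold and a decay-mode requirement combined with an OR over isolation discriminators. As a rough illustration only, the same string could be assembled from a list in plain Python; the helper name `build_tau_cut` is hypothetical and not part of this PR, and only a subset of the discriminators is listed.

```python
def build_tau_cut(pt_min, decay_mode_id, iso_ids):
    """Return a cut string: pt threshold AND decay mode AND (OR of iso IDs)."""
    # One tauID('...') term per isolation discriminator, joined with ||.
    iso_or = " || ".join("tauID('%s')" % name for name in iso_ids)
    return "pt > %g && tauID('%s') && (%s)" % (pt_min, decay_mode_id, iso_or)


# Subset of the discriminator names that appear in the diff.
iso_ids = [
    "byLooseCombinedIsolationDeltaBetaCorr3Hits",
    "byVLooseIsolationMVArun2v1DBoldDMwLT",
    "byVVLooseIsolationMVArun2v1DBoldDMwLT2017v2",
]
cut = build_tau_cut(18, "decayModeFindingNewDMs", iso_ids)
print(cut)
```

Keeping the names in a list makes it easier to see which discriminators differ between eras than comparing two long strings by eye.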
Don't use the run2_miniAOD_94XFall17 era.
Make the default config the one for 94XMiniAODv2, and use run2_nanoAOD_94XMiniAODv1 to adapt to MiniAOD v1.
(I'm putting just one comment, but it is true for all uses of this era in the file.)
OK, thanks for letting me know, I will correct it.
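The pattern requested in this review (default config for MiniAOD v2, with an era switch adapting it to MiniAOD v1) can be sketched with a toy stand-in. `ToyModifier` below only mimics the call shape of `toModify` seen in the diff; it is not the CMSSW implementation, and the `mva_training` key and its values are illustrative assumptions.

```python
class ToyModifier(object):
    """Toy stand-in for an era modifier: applies changes only when active."""

    def __init__(self, active=False):
        self.active = active

    def toModify(self, config, **changes):
        # Mimics the call shape only; NOT the real CMSSW Modifier.
        if self.active:
            config.update(changes)


def make_tau_config(is_miniaod_v1):
    # Default settings target 94X MiniAOD v2 (illustrative value).
    config = {"mva_training": "2017v2"}
    # Hypothetical era object, named after run2_nanoAOD_94XMiniAODv1.
    era_94x_v1 = ToyModifier(active=is_miniaod_v1)
    # The era switch overrides the default only for MiniAOD v1 inputs.
    era_94x_v1.toModify(config, mva_training="2015")
    return config


print(make_tau_config(False))
print(make_tau_config(True))
```

The point of the pattern is that the file reads as the newest configuration, and every deviation for an older input is visible as an explicit era switch.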
@@ -41,6 +41,8 @@
}
The diff tool is not being too smart here.
I would assume this only loads new items compared to what's currently in the release, can you confirm?
Yes, I confirm that it only adds new 2017v2 payloads. In particular, it means that standard RECO and MiniAOD v1 and v2 workflows are not broken.
    tauIDSources = _tauIDSourcesExt
)

patTauMVAIDsSeq += slimmedTausUpdated
can you put all this in a separate cff file (e.g. taus_updatedMVAIds_cff) ?
OK.
    + patTauDiscriminationByTightIsolationMVArun2v1DBoldDMwLT2015
    + patTauDiscriminationByVTightIsolationMVArun2v1DBoldDMwLT2015
    + patTauDiscriminationByVVTightIsolationMVArun2v1DBoldDMwLT2015
)
is 2015 what is in current miniAOD v1?
Yes, exactly
Hi
On Tue, Mar 27, 2018 at 4:31 PM, Christian Veelken ***@***.*** > wrote:
Dear all,
I see that the two int numbers for the cut-based tau ID have not been
added to PhysicsTools/NanoAOD/python/taus_cff.py . Can these please still
be added ? I think the option to switch to the cut-based tau ID
discriminators may be a useful option in case of problems with data/MC
differences in the MVA tau ID input variables (the Tau POG is seeing some
evidence for such data/MC differences right now and is investigating).
It is planned, i.e. I will add it as soon as I can (hopefully by tomorrow
around noon).
|
On Tue, Mar 27, 2018 at 4:46 PM, Jan Steggemann ***@***.***> wrote:
Hi Michal,
Thanks for the fast turnaround! We agreed yesterday to support both the
2017 MC v1-based and 2017 MC v2-based trainings. Do I read correctly that
this PR adds the v2-based training, plus reverts to the 2015-based training
for NanoAOD production based on MiniAOD v2? I think it would be better to
offer the 2017 MC v1-based training, either in addition or replacing the
2015 training.
OK, it can be done like this, i.e. 2017v1 can be added. The issue with
2017v1 is that it exists only for oldDMs with dR=0.5 (although that is the
most used version).
|
Dear All,
It does not, however, address the request to add cut-based tau-Ids: for some reason there is a problem with parsing mathematical expressions like "tauID('byCombinedIsolationDeltaBetaCorrRaw3Hits')<1.5" (the "<" sign appears to be the problem).
Hi @mbluj, |
Hi,
On Wed, Mar 28, 2018 at 1:24 PM, Giovanni Petrucciani < ***@***.***> wrote:
Hi @mbluj <https://github.com/mbluj>,
Where do you need to parse the x < 1.5 ?
If it's in the table producer, then comparisons should be supported for
variables of type bool, as the physicstools parser grammar allows them in
cuts but not in expressions.
If you need it in an expression, e.g. to say 1*(x<1.5) + 2*(y>4.7) then
you need to use the ternary operator but with one extra question mark at
the beginning (because of limitations in the parser of the grammar), so the
above would become (? x<1.5 ? 1 : 0) + (? y>4.7 ? 2 : 0)
Yes, it is expected to go into the table producer. I have checked that it
indeed works for a single bool, but what I want is to store a few bools
(a set of WPs) in one integer (uint8), like this (a simplified example):
idCombIsodR03 = Var(
"1 * (tauID('chargedIsoPtSumdR03') < 2.5) + "
"2 * (tauID('chargedIsoPtSumdR03') < 1.5) ",
"uint8", doc="Combined isolation WP"
)
I will try to adapt to your proposal, but it looks like it will produce
quite a complicated expression :(
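To keep the expression manageable, the workaround quoted above (one ternary term per working point, each with the extra leading question mark) could be generated rather than typed by hand. The sketch below is plain Python that only builds the string and decodes the packed value; the helper names `pack_wps_expression` and `passes_wp` are hypothetical, and whether the parser evaluates the resulting string as intended is not confirmed in this thread.

```python
def pack_wps_expression(variable, thresholds):
    """Build a string summing one '(? cut ? bit : 0)' term per working point."""
    terms = []
    for index, threshold in enumerate(thresholds):
        bit = 1 << index  # 1, 2, 4, ... one bit per working point
        terms.append("(? %s < %s ? %d : 0)" % (variable, threshold, bit))
    return " + ".join(terms)


def passes_wp(packed, index):
    """Check whether working point `index` is set in a packed integer."""
    return bool(packed & (1 << index))


# Same variable and thresholds as in the simplified example above.
expr = pack_wps_expression("tauID('chargedIsoPtSumdR03')", [2.5, 1.5])
print(expr)
```

On the analysis side, a stored value of 3 would mean both working points pass, and `passes_wp(value, 1)` tests the tighter one.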
|
Automatic test started, see https://gitlab.cern.ch/cms-nanoAOD/nanoAOD-integration/pipelines/343947/builds |
Automatic test report for 343947
- gitlab pipeline at https://gitlab.cern.ch/cms-nanoAOD/nanoAOD-integration/pipelines/343947/builds
- status: FAILURE (see link above for failed jobs)
Code integration
Code checks not run for this PR (no source files modified)
Tests
- Test mc_94Xv2: passed
- Test mc_94X: passed
please use the proper Era Switches to support MC campaigns |
It looks like the 80X jobs fail. How should they be set up? The same as the MiniAODv1 ones? Should I use a specific era?
|
there are specific eras for the various scenarios |
On Wed, Mar 28, 2018 at 3:55 PM, arizzi ***@***.***> wrote:
there are specific eras for the various scenarios
https://github.com/cms-sw/cmssw/blob/master/Configuration/StandardSequences/python/Eras.py#L47-L48
Thank you, I know this. But while the meaning and use cases of
'run2_miniAOD_94XFall17', 'run2_nanoAOD_92X' and
'run2_nanoAOD_92X' are more or less clear, I am not sure what should be used
for 80X, as there is no specific era for nanoAOD with 80X. And it makes
a difference whether it will be 'run2_miniAOD_80XLegacy' or something else.
|
Now eras should correctly handle 94X MiniAOD v1 and v2, 92X MiniAOD, and 80X MiniAOD and LegacyMiniAOD. Concerning cut-based isolation tau-Ids:
P.S. I'm away from ~5pm today and will be able to provide only limited fixes to the PR until the end of Easter.
are the missing variables in DQM produced in some of the other eras? @gpetruc which era is the dqm tested on ? |
the DQM config is tested on 94XMiniAODv2 (missing variables are ignored)
I haven't tried to make the dqm auto-update script to work with eras.
Giovanni
…On Wed, Apr 4, 2018 at 8:50 AM, arizzi ***@***.***> wrote:
are the missing variables in DQM produced in some of the other eras?
@gpetruc <https://github.com/gpetruc> which era is the dqm tested on ?
|
On Wed, Apr 4, 2018 at 8:50 AM, arizzi ***@***.***> wrote:
are the missing variables in DQM produced in some of the other eras?
@gpetruc <https://github.com/gpetruc> which era is the dqm tested on ?
The content of the tau table (other tables as well?) depends on the era, and it
could indeed be useful to make the DQM era dependent as well. Anyway, I have
defined plots which make sense across all eras, i.e. there are histograms
which correspond to variables present in some eras but not in others,
which triggers the warning from the DQM test.
|
@mbluj giovanni suggested that you store in the cfi only the 94X latest (i.e. default) config and use the DQM_cff.py to apply era switches that could add back the legacy variables. |
On Wed, Apr 4, 2018 at 2:14 PM, arizzi ***@***.***> wrote:
@mbluj <https://github.com/mbluj> giovanni suggested that you store in
the cfi only the 94X latest (i.e. default) config and use the DQM_cff.py to
apply era switches that could add back the legacy variables.
Can you try implementing it? (let me know if you can do it today,
otherwise I may try to have a look at it at some point)
OK, I can spend some time on it. I'll let you know if I fail.
|
Era dependence added to NanoDQM_cff for tau plots. However, I have no idea how to test it properly...
Automatic test started, see https://gitlab.cern.ch/cms-nanoAOD/nanoAOD-integration/pipelines/348530/builds |
* Add payloads for tau-Id MVAIso 2017v2
* Add MVAIso tau-Id 2017v2
* Move definitions of new MVA tau-Ids to a specific cff file
* Use correct eras, add MVAIso2017v1
* Add cut-based isolation WPs
* correct eras, comment out idCombIsodR03 due to incorrect behavior
* further era corrections
* fix bug in eras handling
* add rawIso dR=0.3, remove cut-base WPs and footprintCorr
* remove MVAIso oldDM dR=0.3 and newDM with 2015 training
* remove instead commenting out
* keep all MVAIso 2015 Tau-Ids and remove 2017v1/2 ones for 80X
* Add era modifiers to NanoDQM
Automatic test report for 348530
- gitlab pipeline at https://gitlab.cern.ch/cms-nanoAOD/nanoAOD-integration/pipelines/348530/builds
- outputs at https://test-cms-nanoaod-integration.web.cern.ch/integration/test_pr_146/
Code integration
Code checks not run for this PR (no source files modified)
Tests
- Long test data80X (10000 events): passed, with differences; dqm plots: all, diff
- Long test data80Xhip (3000 events): passed, with differences; dqm plots: all, diff
- Long test data94X (10000 events): passed, no significant changes; dqm plots: all, diff
- Long test data94Xv2 (10000 events): passed, no significant changes; dqm plots: all, diff
- Long test mc80X (10000 events): passed, with differences; dqm plots: all, diff
- Long test mc94X (10000 events): passed, no significant changes; dqm plots: all, diff
- Long test mc94Xv2 (9000 events): passed, with differences; dqm plots: all, diff
- Test mc_94Xv2: passed
- Test mc_94X: passed
- Test mc_80X: passed
- Test data_94Xv2: passed
- Test data_94X: passed
- Test data_80X: passed
Disk size report
Sample | kb/event | ref kb/event | diff |
---|---|---|---|
TTbar MC 94Xv1 | 1.544 | 1.532 | 0.012 ( +0.8% ) |
TTbar MC 94Xv2 | 1.563 | 1.549 | 0.014 ( +0.9% ) |
TTbar MC 80X | 1.569 | 1.569 | -0.000 ( -0.0% ) |
Data 94Xv1 | 0.620 | 0.613 | 0.008 ( +1.3% ) |
Data 80X | 0.558 | 0.557 | 0.001 ( +0.1% ) |
Data 80X, Mu Run2016E | 0.558 | 0.557 | 0.001 ( +0.1% ) |
cherry picked on master-cmsswmaster and masterMergedAndNoData-integration |
@ALL what was the intention in pulling taggers from outside of the GT here?
Hello, |
After further checks: there is still the possibility to run old tauIDs (early Run-2), accessing their payloads w/o the GT, for certain Run-2 nanoAOD eras. This is steered by the following cff:
thanks for checking, please prepare a PR indeed. |
I agree, it makes sense to maintain |
The MVA-based Tau-IDs are retrained with 94X Fall17 samples and should be evaluated using MiniAOD taus (slimmedTaus) and added to NanoAOD. This PR is meant to supersede its initial version (#108), which was based on MVA training with 92X Summer17 samples.
The commits change three files:
Notes:
FYI, @veelken @steggema @ohlushch