DeepFlavour Negative Tagger + DeepFlavour 10X training #23467
The code-checks are being triggered in jenkins.
-code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-23467/5008 Code check has found code style and quality issues which could be resolved by applying a patch in https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-23467/5008/git-diff.patch You can run
The code-checks are being triggered in jenkins.
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-23467/5010
A new Pull Request was created by @emilbols for master. It involves the following packages: PhysicsTools/PatAlgos @perrotta, @monttj, @cmsbuild, @slava77, @gpetruc, @arizzi can you please review it and eventually sign? Thanks. cms-bot commands are listed here
@cmsbuild please test with cms-sw/cmsdist#4084
Ignoring test request.
@cmsbuild please test with cms-sw/cmsdist#4086
The tests are being triggered in jenkins.
Comparison job queued.
Comparison is ready Comparison Summary:
+1
This PR should be integrated together or after cms-sw/cmsdist#4086
@slava77 the next task for Emil is to move DeepFlavour to Reco, as he discussed during the Reco meeting (unfortunately I could not connect, but I was told that nobody objected to this plan). This will allow DeepFlavour to be integrated more easily in the standard validation workflow.
On 6/8/18 4:14 PM, imarches wrote:
> @slava77 <https://github.com/slava77> next task for Emil is to move
> DeepFlavour to Reco, as he discussed during the Reco meeting
> (unfortunately I could not connect, but I was told there was nobody
> against this plan). This will allow to have DeepFlavour more easily
> integrated in the standard validation workflow.
Are there any present issues to include relevant parts of BTag
validation in the @miniAODValidation sequence?
IIUC, all necessary products are available at run time and the standard
plots can be made
+1
merge
On 6/11/18 6:40 AM, emilbols wrote:
> @slava77 <https://github.com/slava77> @jmduarte
> <https://github.com/jmduarte> I guess the strategy for backport of this
> to 94X, will be to wait for the 94X backport of the deepDoubleB Tagger
> first?
yes.
In case it is substantially simpler to combine in one PR, it can be
considered as well.
There has been no "standard" miniAOD BTag validation set up since the discriminators in miniAOD are directly copied from RECO and not recomputed. What we've been doing is from time to time manually run a sequence to re-compute the taggers using miniAOD products and check the performance agreement with RECO (not jet-by-jet). We can and will naturally include DeepFlavour into this workflow. Now, if DeepFlavour is only available in miniAOD it would indeed be good to validate it there too (and we'll add it to the standard validation when it's included in RECO).
On 6/12/18 2:51 AM, Sébastien Wertz wrote:
>> Are there any present issues to include relevant parts of BTag
>> validation in the @miniAODValidation sequence? IIUC, all necessary
>> products are available at run time and the standard plots can be made
> There has been no "standard" miniAOD BTag validation set up since the
> discriminators in miniAOD are directly copied from RECO and not
> recomputed. What we've been doing is from time to time manually run a
> sequence to re-compute the taggers using miniAOD products and check the
> performance agreement with RECO (not jet-by-jet). We can and will
> naturally include DeepFlavour into this workflow.
> Now, if DeepFlavour is only available in miniAOD it would indeed be good
> to validate it there too (and we'll add it to the standard validation
> when it's included in RECO).
Does this mean that the dependence on PUPPI will be eliminated?
IIRC, this is currently the main reason why DeepFlavour is not running
in RECO.
Also, there is a perhaps secondary issue related to using packed
candidates directly which leads to differences between what can be
running in RECO and what's in miniAOD.
> About the "miniAODValidation sequence", could you point me to where it's
> configured? To be clear, we're talking about a sample-vs-sample
> comparison within miniAOD, not an object-vs-object comparison between
> miniAOD and RECO?
yes
> Also, when is this sequence run? An issue is that the validation uses
> products that are not available in miniAOD, but maybe these are
> available when that sequence is run...
You can check the available products based on the configuration of a
miniAOD-only job.
E.g. runTheMatrix.py -l 1325.51
By running it, you will also get an example of how the miniAOD DQM and
Validation sequences are set up.
Hi @slava77, yes, you remember correctly. The next PR, which will move DeepFlavour to RECO, will indeed also remove the PUPPI dependence.
I'm also not very familiar with the miniAOD DQM and Validation sequence, but since it is run in the same job as the miniAOD production, all the AOD products should be available, which should make it possible to run the standard btag validation with a few tweaks here and there. As @swertz pointed out, up to now all ak4 btag discriminators were produced in RECO and then stored in miniAOD, so there was no strong reason to implement anything in the miniAOD validation sequence. DeepFlavour is now an exception, but if it will soon be in RECO, we will be back to the usual situation. One place where miniAOD validation would be useful is fat jets, for which we don't run b tagging in the standard reco.
@@ -209,20 +218,16 @@ void DeepFlavourTagInfoProducer::produce(edm::Event& iEvent, const edm::EventSet
{ return btagbtvdeep::sv_vertex_comparator(sva, svb, pv); });
// fill features from secondary vertices
for (const auto & sv : svs_sorted) {
if (reco::deltaR(sv, jet) > jet_radius_) continue;
if (reco::deltaR2(sv.position() - pv.position(), flip_ ? -jet_dir : jet_dir) > (jet_radius_*jet_radius_)) continue;
@slava77 it was just a mistake. In the training it was always done in the previous way (deltaR calculated with vertex momentum).
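To make the distinction in the comment above concrete, here is a minimal standalone sketch of a cone selection that compares the secondary vertex momentum direction with the jet axis. It does not use CMSSW: `Direction`, `deltaPhi`, `deltaR2` and `selectVerticesInCone` are hypothetical stand-ins written for illustration, not the code of this PR.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for a direction in (eta, phi); not a CMSSW type.
struct Direction {
  double eta;
  double phi;
};

// Wrap an azimuthal difference into (-pi, pi].
double deltaPhi(double phi1, double phi2) {
  const double pi = std::acos(-1.0);
  double d = std::fmod(phi1 - phi2, 2.0 * pi);
  if (d > pi) d -= 2.0 * pi;
  if (d <= -pi) d += 2.0 * pi;
  return d;
}

// Squared angular distance; comparing squares avoids a square root.
double deltaR2(const Direction& a, const Direction& b) {
  const double deta = a.eta - b.eta;
  const double dphi = deltaPhi(a.phi, b.phi);
  return deta * deta + dphi * dphi;
}

// Keep the secondary vertices whose momentum direction lies inside the jet cone.
// Assumption: each entry of svMomenta is the (eta, phi) of a vertex momentum.
std::vector<Direction> selectVerticesInCone(const std::vector<Direction>& svMomenta,
                                            const Direction& jetAxis,
                                            double jetRadius) {
  std::vector<Direction> kept;
  for (const auto& sv : svMomenta) {
    if (deltaR2(sv, jetAxis) > jetRadius * jetRadius) continue;
    kept.push_back(sv);
  }
  return kept;
}
```

For example, with a jet axis at (eta, phi) = (0.0, 3.0) and an assumed cone radius of 0.4, a vertex momentum at (0.1, -3.1) is kept because the azimuthal difference wraps around to about -0.18, while one at (1.0, 3.0) is rejected.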
This PR is concerned with the implementation of the DeepFlavour negative tagger, which is used for estimating the rate of light jets being tagged as b jets. The goal of the negative tagger is to have the same light jet distribution as the full tagger, but filter out as many b-jets as possible at high discriminator values. This is done by inverting the sign of all distributions that are symmetric under sign inversion for light jets, but not for b-jets.
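As a rough illustration of the sign inversion described above, the sketch below negates a signed per-track input when a flip flag is set and leaves an unsigned input untouched. The struct, its field names and `makeTaggerInputs` are hypothetical and chosen only for this example; they are not the actual DeepFlavour inputs or the code of this PR.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-track inputs; the field names are illustrative only.
struct TrackInputs {
  double signedIpSignificance;  // roughly sign-symmetric for light jets, skewed positive for b jets
  double ptRelativeToJet;       // unsigned quantity, left untouched by the flip
};

// Return the inputs for the full tagger (flip == false) or for the
// negative tagger (flip == true), where the signed quantity is negated.
std::vector<TrackInputs> makeTaggerInputs(std::vector<TrackInputs> tracks, bool flip) {
  if (!flip) return tracks;
  for (auto& t : tracks) {
    t.signedIpSignificance = -t.signedIpSignificance;
  }
  return tracks;
}
```

Under these assumptions, a light jet looks statistically the same before and after the flip, whereas the large positive values typical of b jets become large negative values, which the tagger does not associate with b jets.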
A description of negative taggers in general can be found in these presentations at the BTV meeting [1] and at the RECO meeting [2].
The DeepFlavour negative tagger distribution is shown in [3]. The negative tagger discriminator values are shown from -1 to 0, and the full tagger is shown from 0 to 1. Red is b-jets, blue and light blue are light jets, and green is c-jets.
Originally this PR was part of PR #23206, but it was split out due to merge conflicts. The review comments left by @slava77 in that PR have been addressed here.
PR #23206 also introduced the 94X training which was made a while back. While that PR was ongoing we produced a 10X training, which performs better, so in this PR we also add a new DeepFlavour model trained on 10X MC samples. It is therefore required that the PR cms-data/RecoBTag-Combined#14 is merged first. The ROC curves can be seen in [4]. Here the dashed line is c vs b and the full line is l vs b.
[1] https://indico.cern.ch/event/713923/contributions/2938152/attachments/1620844/2578707/NegativeTagger.pdf
[2] https://indico.cern.ch/event/732060/contributions/3020340/attachments/1660464/2659922/NegativeTaggerPR.pdf
[3]
[4]