MXNet-based implementation of the ParticleNet tagger #28902

Merged
merged 9 commits into from
Mar 10, 2020
16 changes: 16 additions & 0 deletions PhysicsTools/PatAlgos/python/recoLayer0/bTagging_cff.py
@@ -45,6 +45,8 @@
   , 'pfDeepDoubleXTagInfos'
     # DeepBoostedJet tag infos
   , 'pfDeepBoostedJetTagInfos'
+    # ParticleNet tag infos
+  , 'pfParticleNetTagInfos'

Contributor
should an AK8 suffix be added?
We didn't do it for the boosted and still manage. So, perhaps we can keep pretending that there is no confusion.
@ferencek did the topic of the parent jet cone confusion show up in some recent (1-2 years) discussions?

Contributor
@slava77, in the past we had examples of AK8 and CA15 added to the TagInfo names, even when there was Boosted somewhere in the name, if more than one cone size was supported for fat jets. If, long term, ParticleNet tagging could be supported for multiple cone sizes, having an extra suffix might be a good idea just to be more explicit and avoid potential confusion down the road.

Contributor
@hqucms
so, it sounds like AK8 here (and for the tags) would help

Contributor Author
@slava77
Thinking about it a bit more -- I think it's actually better not to add the AK8 here, as the infrastructure is generic and applies to any jet size. What we can do (and are doing now in MiniAOD/NanoAOD) is to use a postfix to distinguish different jet types (e.g., in applyDeepBtagging_cff.py, we use postfix=AK8WithDeepInfo so the tagInfo product is called pfParticleNetTagInfosAK8WithDeepInfo). Hardcoding AK8 into the name would make it awkward to use for other jet sizes (we actually do have another use case for AK15 using the same infrastructure).

Contributor
uhm, but pfParticleNetTagInfos by itself is configured to use AK8.
Also, I understood that a major effort done by JME to put PUPPI into the RECO workflow was to then enable running the ML taggers (the ones using PUPPI) in RECO as well.
In that case the name has a specific meaning; it is not just a somewhat magic key that applyDeepBtagging_cff.py uses when calling PhysicsTools/PatAlgos/python/tools/jetTools.py

Contributor Author
@slava77 I think PhysicsTools/PatAlgos/python/tools/jetTools.py is also called when adding b-taggers in RECO, no?

Contributor
ehm, never mind, this file itself is a part of PAT.
OK then
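
For readers following the naming discussion above, here is a minimal sketch of how a generic TagInfo name plus a postfix gives a per-collection module label. It mirrors the `btagPrefix+btagInfo+labelName+postfix` concatenation seen in the jetTools.py hunk of this PR. The helper `tag_info_label` and the `AK15` suffix are hypothetical, and whether the suffix arrives through `labelName` or `postfix` in a given workflow is an assumption here.

```python
def tag_info_label(btag_info, label_name="", postfix="", btag_prefix=""):
    """Join the pieces in the same order as the jetTools.py call:
    btagPrefix + btagInfo + labelName + postfix."""
    return btag_prefix + btag_info + label_name + postfix


# Suffix quoted in the thread for the MiniAOD/NanoAOD AK8 jets.
ak8_label = tag_info_label("pfParticleNetTagInfos", postfix="AK8WithDeepInfo")

# Hypothetical suffix for an AK15 collection; the thread only says such a use case exists.
ak15_label = tag_info_label("pfParticleNetTagInfos", postfix="AK15")

print(ak8_label)
print(ak15_label)
```

The generic name stays free of any cone size, and the suffix alone distinguishes the jet collections.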

     # Pixel Cluster tag infos
   , 'pixelClusterTagInfos'
 ]
@@ -231,3 +233,17 @@
 for disc in _pfMassDecorrelatedDeepBoostedJetTagsMetaDiscrs:
     supportedMetaDiscr[disc] = _pfMassDecorrelatedDeepBoostedJetTagsProbs
 # -----------------------------------
+
+# -----------------------------------
+# setup ParticleNet
+from RecoBTag.MXNet.pfParticleNet_cff import _pfParticleNetJetTagsProbs, _pfParticleNetJetTagsMetaDiscrs, \
+    _pfMassDecorrelatedParticleNetJetTagsProbs, _pfMassDecorrelatedParticleNetJetTagsMetaDiscrs
+# update supportedBtagDiscr
+for disc in _pfParticleNetJetTagsProbs + _pfMassDecorrelatedParticleNetJetTagsProbs:
+    supportedBtagDiscr[disc] = [["pfParticleNetTagInfos"]]
+# update supportedMetaDiscr
+for disc in _pfParticleNetJetTagsMetaDiscrs:
+    supportedMetaDiscr[disc] = _pfParticleNetJetTagsProbs
+for disc in _pfMassDecorrelatedParticleNetJetTagsMetaDiscrs:
+    supportedMetaDiscr[disc] = _pfMassDecorrelatedParticleNetJetTagsProbs
+# -----------------------------------
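
The registration above fills two lookup tables: each probability output maps to the TagInfos it needs, and each meta-discriminator maps to the probability outputs it is built from. A rough sketch of that pattern follows, using plain dictionaries. The discriminator names and the `required_tag_infos` helper are hypothetical, and how jetTools.py actually resolves these tables is not shown in this diff.

```python
# Hypothetical discriminator names, chosen only to make the example concrete.
probs = ["pfParticleNetJetTags:probA", "pfParticleNetJetTags:probB"]
meta_discrs = ["pfParticleNetDiscriminatorsJetTags:AvsB"]

supported_btag_discr = {}
supported_meta_discr = {}

# Every probability output depends on the same TagInfo producer.
for disc in probs:
    supported_btag_discr[disc] = [["pfParticleNetTagInfos"]]
# Every meta-discriminator depends on the full list of probability outputs.
for disc in meta_discrs:
    supported_meta_discr[disc] = probs


def required_tag_infos(disc):
    """Return the sorted TagInfo names needed to evaluate `disc` (illustrative lookup)."""
    inputs = supported_meta_discr.get(disc, [disc])
    names = set()
    for prob in inputs:
        for group in supported_btag_discr[prob]:
            names.update(group)
    return sorted(names)


print(required_tag_infos("pfParticleNetDiscriminatorsJetTags:AvsB"))
```

Requesting a meta-discriminator therefore pulls in the same TagInfo producer as requesting one of its probability outputs directly.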
@@ -43,6 +43,7 @@ def applyDeepBtagging( process, postfix="" ) :
         delattr(process, 'selectedUpdatedPatJetsSlimmedDeepFlavour'+postfix)

     from RecoBTag.ONNXRuntime.pfDeepBoostedJet_cff import _pfDeepBoostedJetTagsAll as pfDeepBoostedJetTagsAll
+    from RecoBTag.MXNet.pfParticleNet_cff import _pfParticleNetJetTagsAll as pfParticleNetJetTagsAll

     # update slimmed jets to include particle-based deep taggers (keep same name)
     # make clone for DeepTags-less slimmed AK8 jets, so output name is preserved
@@ -60,7 +61,7 @@ def applyDeepBtagging( process, postfix="" ) :
             'pfMassIndependentDeepDoubleCvLJetTags:probHcc',
             'pfMassIndependentDeepDoubleCvBJetTags:probHbb',
             'pfMassIndependentDeepDoubleCvBJetTags:probHcc',
-        ) + pfDeepBoostedJetTagsAll
+        ) + pfDeepBoostedJetTagsAll + pfParticleNetJetTagsAll
     )
     updateJetCollection(
         process,
23 changes: 23 additions & 0 deletions PhysicsTools/PatAlgos/python/tools/jetTools.py
@@ -662,6 +662,29 @@ def setupBTagging(process, jetSource, pfCandidates, explicitJTA, pvSource, svSou
                     ),
                 process, task)

+            if btagInfo == 'pfParticleNetTagInfos':
+                if pfCandidates.value() == 'packedPFCandidates':
+                    # case 1: running over jets whose daughters are PackedCandidates (only via updateJetCollection for now)
+                    puppi_value_map = ""
+                    vertex_associator = ""
+                elif pfCandidates.value() == 'particleFlow':
+                    raise ValueError("Running pfParticleNetTagInfos with reco::PFCandidates is currently not supported.")
+                    # case 2: running on new jet collection whose daughters are PFCandidates (e.g., cluster jets in RECO/AOD)
+                    puppi_value_map = "puppi"
+                    vertex_associator = "primaryVertexAssociation:original"
+                else:
+                    raise ValueError("Invalid pfCandidates collection: %s." % pfCandidates.value())
+                addToProcessAndTask(btagPrefix+btagInfo+labelName+postfix,
+                                    btag.pfParticleNetTagInfos.clone(
+                                        jets = jetSource,
+                                        vertices = pvSource,
+                                        secondary_vertices = svSource,
+                                        pf_candidates = pfCandidates,
+                                        puppi_value_map = puppi_value_map,
+                                        vertex_associator = vertex_associator,
+                                        ),
+                                    process, task)
+
             acceptedTagInfos.append(btagInfo)
         elif hasattr(toptag, btagInfo) :
             acceptedTagInfos.append(btagInfo)
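
The branch above picks the PUPPI and vertex-association inputs from the candidate collection name. A stand-alone sketch of that decision is given below, written as a pure function so it can be exercised outside CMSSW. The function name `puppi_inputs_for` and the dictionary return type are hypothetical; only the collection names, the two empty strings, and the two error cases come from the diff.

```python
def puppi_inputs_for(pf_candidates):
    """Map a candidate collection name to the extra producer inputs (illustrative)."""
    if pf_candidates == "packedPFCandidates":
        # Case 1 in the diff: PackedCandidates, so both inputs are left empty.
        return {"puppi_value_map": "", "vertex_associator": ""}
    if pf_candidates == "particleFlow":
        # Case 2 in the diff is rejected for now, so nothing after the raise runs.
        raise ValueError("Running pfParticleNetTagInfos with reco::PFCandidates is currently not supported.")
    raise ValueError("Invalid pfCandidates collection: %s." % pf_candidates)


print(puppi_inputs_for("packedPFCandidates"))
```

Because the `particleFlow` branch raises before its two assignments, the `"puppi"` and `"primaryVertexAssociation:original"` values in the diff are effectively placeholders for future support.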
1 change: 1 addition & 0 deletions RecoBTag/Configuration/python/RecoBTag_cff.py
@@ -10,6 +10,7 @@
 from RecoBTag.ONNXRuntime.pfDeepFlavour_cff import *
 from RecoBTag.ONNXRuntime.pfDeepDoubleX_cff import *
 from RecoBTag.ONNXRuntime.pfDeepBoostedJet_cff import *
+from RecoBTag.MXNet.pfParticleNet_cff import *
 from RecoVertex.AdaptiveVertexFinder.inclusiveVertexing_cff import *
 from RecoBTag.PixelCluster.pixelClusterTagInfos_cfi import *

64 changes: 40 additions & 24 deletions RecoBTag/FeatureTools/plugins/DeepBoostedJetTagInfoProducer.cc
@@ -46,6 +46,7 @@ class DeepBoostedJetTagInfoProducer : public edm::stream::EDProducer<> {
   const double jet_radius_;
   const double min_jet_pt_;
   const double min_pt_for_track_properties_;
+  const bool use_puppiP4_;

   edm::EDGetTokenT<edm::View<reco::Jet>> jet_token_;
   edm::EDGetTokenT<VertexCollection> vtx_token_;
@@ -83,10 +84,13 @@ const std::vector<std::string> DeepBoostedJetTagInfoProducer::particle_features_
     "pfcand_dphidphi", "pfcand_dxydxy", "pfcand_dzdz", "pfcand_dxydz",
     "pfcand_dphidxy", "pfcand_dlambdadz", "pfcand_btagEtaRel", "pfcand_btagPtRatio",
     "pfcand_btagPParRatio", "pfcand_btagSip2dVal", "pfcand_btagSip2dSig", "pfcand_btagSip3dVal",
-    "pfcand_btagSip3dSig", "pfcand_btagJetDistVal",
+    "pfcand_btagSip3dSig", "pfcand_btagJetDistVal", "pfcand_mask", "pfcand_pt_log_nopuppi",
+    "pfcand_e_log_nopuppi"
+
 };

 const std::vector<std::string> DeepBoostedJetTagInfoProducer::sv_features_{
+    "sv_mask",
     "sv_phirel",
     "sv_etarel",
     "sv_deltaR",
@@ -108,6 +112,7 @@ DeepBoostedJetTagInfoProducer::DeepBoostedJetTagInfoProducer(const edm::Paramete
     : jet_radius_(iConfig.getParameter<double>("jet_radius")),
       min_jet_pt_(iConfig.getParameter<double>("min_jet_pt")),
       min_pt_for_track_properties_(iConfig.getParameter<double>("min_pt_for_track_properties")),
+      use_puppiP4_(iConfig.getParameter<bool>("use_puppiP4")),
       jet_token_(consumes<edm::View<reco::Jet>>(iConfig.getParameter<edm::InputTag>("jets"))),
       vtx_token_(consumes<VertexCollection>(iConfig.getParameter<edm::InputTag>("vertices"))),
       sv_token_(consumes<SVCollection>(iConfig.getParameter<edm::InputTag>("secondary_vertices"))),
@@ -138,6 +143,7 @@ void DeepBoostedJetTagInfoProducer::fillDescriptions(edm::ConfigurationDescripti
   desc.add<double>("jet_radius", 0.8);
   desc.add<double>("min_jet_pt", 150);
   desc.add<double>("min_pt_for_track_properties", -1);
+  desc.add<bool>("use_puppiP4", true);
   desc.add<edm::InputTag>("vertices", edm::InputTag("offlinePrimaryVertices"));
   desc.add<edm::InputTag>("secondary_vertices", edm::InputTag("inclusiveCandidateSecondaryVertices"));
   desc.add<edm::InputTag>("pf_candidates", edm::InputTag("particleFlow"));
@@ -150,8 +156,7 @@ void DeepBoostedJetTagInfoProducer::fillDescriptions(edm::ConfigurationDescripti
 void DeepBoostedJetTagInfoProducer::produce(edm::Event &iEvent, const edm::EventSetup &iSetup) {
   auto output_tag_infos = std::make_unique<DeepBoostedJetTagInfoCollection>();

-  edm::Handle<edm::View<reco::Jet>> jets;
-  iEvent.getByToken(jet_token_, jets);
+  auto jets = iEvent.getHandle(jet_token_);

   iEvent.getByToken(vtx_token_, vtxs_);
   if (vtxs_->empty()) {
@@ -259,11 +264,17 @@ void DeepBoostedJetTagInfoProducer::fillParticleFeatures(DeepBoostedJetFeatures
     // get the original reco/packed candidate not scaled by the puppi weight
     daughters.push_back(pfcands_->ptrAt(cand.key()));
   }
-  // sort by Puppi-weighted pt
-  std::sort(
-      daughters.begin(), daughters.end(), [&puppi_wgt_cache](const reco::CandidatePtr &a, const reco::CandidatePtr &b) {
-        return puppi_wgt_cache.at(a.key()) * a->pt() > puppi_wgt_cache.at(b.key()) * b->pt();
-      });
+  if (use_puppiP4_) {
+    // sort by Puppi-weighted pt
+    std::sort(daughters.begin(),
+              daughters.end(),
+              [&puppi_wgt_cache](const reco::CandidatePtr &a, const reco::CandidatePtr &b) {
+                return puppi_wgt_cache.at(a.key()) * a->pt() > puppi_wgt_cache.at(b.key()) * b->pt();
+              });
+  } else {
+    // sort by original pt (not Puppi-weighted)
+    std::sort(daughters.begin(), daughters.end(), [](const auto &a, const auto &b) { return a->pt() > b->pt(); });
+  }

   // reserve space
   for (const auto &name : particle_features_) {
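
The `use_puppiP4_` switch in this hunk changes only the sort key for the jet daughters: PUPPI-weighted pt when enabled, raw pt otherwise. A small Python sketch of the same ordering logic is shown here. The dictionary layout for a daughter and the name `sort_daughters` are hypothetical; in the C++ code the weights come from `puppi_wgt_cache` keyed by `cand.key()`.

```python
def sort_daughters(daughters, puppi_weights, use_puppi_p4=True):
    """Return daughters ordered by descending pt, optionally PUPPI-weighted."""
    if use_puppi_p4:
        sort_key = lambda d: puppi_weights[d["key"]] * d["pt"]
    else:
        sort_key = lambda d: d["pt"]
    return sorted(daughters, key=sort_key, reverse=True)


daughters = [{"key": 0, "pt": 10.0}, {"key": 1, "pt": 8.0}]
puppi_weights = {0: 0.5, 1: 1.0}

weighted_order = [d["key"] for d in sort_daughters(daughters, puppi_weights, True)]
raw_order = [d["key"] for d in sort_daughters(daughters, puppi_weights, False)]
print(weighted_order, raw_order)
```

A candidate with a high raw pt but a low PUPPI weight can drop down the list when the weighting is on, which matters if a downstream model reads only the leading candidates.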
@@ -279,7 +290,7 @@ void DeepBoostedJetTagInfoProducer::fillParticleFeatures(DeepBoostedJetFeatures
     const auto *packed_cand = dynamic_cast<const pat::PackedCandidate *>(&(*cand));
     const auto *reco_cand = dynamic_cast<const reco::PFCandidate *>(&(*cand));

-    auto puppiP4 = puppi_wgt_cache.at(cand.key()) * cand->p4();
+    auto candP4 = use_puppiP4_ ? puppi_wgt_cache.at(cand.key()) * cand->p4() : cand->p4();
     if (packed_cand) {
       float hcal_fraction = 0.;
       if (packed_cand->pdgId() == 1 || packed_cand->pdgId() == 130) {
@@ -343,14 +354,18 @@ void DeepBoostedJetTagInfoProducer::fillParticleFeatures(DeepBoostedJetFeatures

     // basic kinematics
     fts.fill("pfcand_puppiw", puppi_wgt_cache.at(cand.key()));
-    fts.fill("pfcand_phirel", reco::deltaPhi(puppiP4, jet));
-    fts.fill("pfcand_etarel", etasign * (puppiP4.eta() - jet.eta()));
-    fts.fill("pfcand_deltaR", reco::deltaR(puppiP4, jet));
-    fts.fill("pfcand_abseta", std::abs(puppiP4.eta()));
+    fts.fill("pfcand_phirel", reco::deltaPhi(candP4, jet));
+    fts.fill("pfcand_etarel", etasign * (candP4.eta() - jet.eta()));
+    fts.fill("pfcand_deltaR", reco::deltaR(candP4, jet));
+    fts.fill("pfcand_abseta", std::abs(candP4.eta()));

+    fts.fill("pfcand_ptrel_log", catch_infs(std::log(candP4.pt() / jet.pt()), -99));
+    fts.fill("pfcand_erel_log", catch_infs(std::log(candP4.energy() / jet.energy()), -99));
+    fts.fill("pfcand_pt_log", catch_infs(std::log(candP4.pt()), -99));
+
-    fts.fill("pfcand_ptrel_log", catch_infs(std::log(puppiP4.pt() / jet.pt()), -99));
-    fts.fill("pfcand_erel_log", catch_infs(std::log(puppiP4.energy() / jet.energy()), -99));
-    fts.fill("pfcand_pt_log", catch_infs(std::log(puppiP4.pt()), -99));
+    fts.fill("pfcand_mask", 1);
+    fts.fill("pfcand_pt_log_nopuppi", catch_infs(std::log(cand->pt()), -99));
+    fts.fill("pfcand_e_log_nopuppi", catch_infs(std::log(cand->energy()), -99));

     double minDR = 999;
     for (const auto &sv : *svs_) {
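
The log features in this hunk guard against a zero PUPPI-weighted pt, for which `std::log` returns minus infinity, by replacing non-finite values with -99. The `_nopuppi` variants use the unweighted candidate and are filled regardless of `use_puppiP4_`. Below is a rough Python analogue of that pattern. The names `catch_infs`, `log_or_inf`, and `kinematic_features` are illustrative stand-ins, and the default replacement value of 0 is an assumption, since the C++ helper's default is not visible in this diff.

```python
import math


def catch_infs(value, replace_value=0.0):
    """Replace NaN or infinite values (default replacement is an assumption)."""
    return value if math.isfinite(value) else replace_value


def log_or_inf(x):
    """Emulate std::log: -inf at zero and NaN below zero, where math.log would raise."""
    if x == 0:
        return -math.inf
    if x < 0:
        return math.nan
    return math.log(x)


def kinematic_features(cand_pt, puppi_weight, jet_pt, use_puppi_p4=True):
    """Build a few of the per-candidate features from scalar pt values (illustrative)."""
    pt = puppi_weight * cand_pt if use_puppi_p4 else cand_pt
    return {
        "pfcand_ptrel_log": catch_infs(log_or_inf(pt / jet_pt), -99),
        "pfcand_pt_log": catch_infs(log_or_inf(pt), -99),
        "pfcand_pt_log_nopuppi": catch_infs(log_or_inf(cand_pt), -99),
        "pfcand_mask": 1,
    }


feats = kinematic_features(cand_pt=10.0, puppi_weight=0.0, jet_pt=200.0)
print(feats)
```

With a PUPPI weight of zero the weighted features fall back to -99, while `pfcand_pt_log_nopuppi` still carries the raw candidate pt.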
@@ -390,14 +405,14 @@ void DeepBoostedJetTagInfoProducer::fillParticleFeatures(DeepBoostedJetFeatures

       TrackInfoBuilder trkinfo(track_builder_);
       trkinfo.buildTrackInfo(&(*cand), jet_dir, jet_ref_track_dir, *pv_);
-      fts.fill("pfcand_btagEtaRel", trkinfo.getTrackEtaRel());
-      fts.fill("pfcand_btagPtRatio", trkinfo.getTrackPtRatio());
-      fts.fill("pfcand_btagPParRatio", trkinfo.getTrackPParRatio());
-      fts.fill("pfcand_btagSip2dVal", trkinfo.getTrackSip2dVal());
-      fts.fill("pfcand_btagSip2dSig", trkinfo.getTrackSip2dSig());
-      fts.fill("pfcand_btagSip3dVal", trkinfo.getTrackSip3dVal());
-      fts.fill("pfcand_btagSip3dSig", trkinfo.getTrackSip3dSig());
-      fts.fill("pfcand_btagJetDistVal", trkinfo.getTrackJetDistVal());
+      fts.fill("pfcand_btagEtaRel", catch_infs(trkinfo.getTrackEtaRel()));
+      fts.fill("pfcand_btagPtRatio", catch_infs(trkinfo.getTrackPtRatio()));
+      fts.fill("pfcand_btagPParRatio", catch_infs(trkinfo.getTrackPParRatio()));
+      fts.fill("pfcand_btagSip2dVal", catch_infs(trkinfo.getTrackSip2dVal()));
+      fts.fill("pfcand_btagSip2dSig", catch_infs(trkinfo.getTrackSip2dSig()));
+      fts.fill("pfcand_btagSip3dVal", catch_infs(trkinfo.getTrackSip3dVal()));
+      fts.fill("pfcand_btagSip3dSig", catch_infs(trkinfo.getTrackSip3dSig()));
+      fts.fill("pfcand_btagJetDistVal", catch_infs(trkinfo.getTrackJetDistVal()));
     } else {
       fts.fill("pfcand_normchi2", 999);

@@ -445,6 +460,7 @@ void DeepBoostedJetTagInfoProducer::fillSVFeatures(DeepBoostedJetFeatures &fts,

   for (const auto *sv : jetSVs) {
     // basic kinematics
+    fts.fill("sv_mask", 1);
     fts.fill("sv_phirel", reco::deltaPhi(*sv, jet));
     fts.fill("sv_etarel", etasign * (sv->eta() - jet.eta()));
     fts.fill("sv_deltaR", reco::deltaR(*sv, jet));
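
The new `pfcand_mask` and `sv_mask` features are filled with 1 for every real candidate or secondary vertex. One plausible use, sketched below, is to tell real entries from padding once the variable-length lists are brought to a fixed length for the network input. That padding step is not part of this diff, so the helper `pad_with_mask`, the fixed length, and the zero padding value are all assumptions.

```python
def pad_with_mask(values, mask, length, pad_value=0.0):
    """Truncate or zero-pad a feature list and its mask to a fixed length (illustrative)."""
    values = list(values[:length])
    mask = list(mask[:length])
    n_pad = length - len(values)
    # Padded slots get a mask of 0, so they can be told apart from real entries.
    return values + [pad_value] * n_pad, mask + [0] * n_pad


sv_deltaR = [0.1, 0.2]  # two real secondary vertices
sv_mask = [1, 1]        # filled with 1 per real entry, as in the diff
padded_values, padded_mask = pad_with_mask(sv_deltaR, sv_mask, length=4)
print(padded_values, padded_mask)
```

Without a mask, a padded zero in a feature such as `sv_deltaR` would be indistinguishable from a real vertex sitting on the jet axis.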