MXNet-based implementation of the ParticleNet tagger #28902

Merged
merged 9 commits into from
Mar 10, 2020

hqucms
Contributor

@hqucms hqucms commented Feb 10, 2020

PR description:

This PR integrates the ParticleNet boosted jet tagger into CMSSW. The ParticleNet tagger uses a novel graph neural network architecture and shows a significant performance improvement over the DeepAK8 algorithm. Two versions are provided. The nominal version (pfParticleNetJetTags) is a multi-class tagger for Top, W, Z, and Higgs and their various decay modes; it has very strong performance but exhibits strong mass sculpting. The mass-decorrelated version (pfMassDecorrelatedParticleNetJetTags) is a generic two-prong tagger for X->bb, X->cc, and X->qq. It is trained with a special signal sample generated with a flat mass spectrum for the signal particle (X) and shows substantial improvement in both discrimination power and mass decorrelation compared to the mass-decorrelated DeepAK8 tagger. The performance of the ParticleNet tagger is summarized in the DP note (DP-2020/002) and in presentations to the JME and BTV groups.

Needs cms-data/RecoBTag-Combined#26.

The current implementation is based on MXNet. Conversion to ONNX was not successful due to the complexity of the graph neural network architecture. The inference with MXNet takes ~60ms per AK8 jet per tagger (nominal and mass-decorrelated), or ~50ms/evt in total on a ttbar sample, once cms-sw/cmsdist#5528 is solved.

PR validation:

The implementation in this PR has been verified against the training framework and gives consistent results.

@cmsbuild
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-28902/13689

  • This PR adds an extra 52KB to repository

@cmsbuild
Contributor

A new Pull Request was created by @hqucms (Huilin Qu) for master.

It involves the following packages:

PhysicsTools/PatAlgos
RecoBTag/Configuration
RecoBTag/FeatureTools
RecoBTag/MXNet

@perrotta, @cmsbuild, @santocch, @slava77 can you please review it and eventually sign? Thanks.
@rappoccio, @gouskos, @hatakeyamak, @emilbols, @HeinerTholen, @peruzzim, @seemasharmafnal, @mmarionncern, @ahinzmann, @smoortga, @acaudron, @jdolen, @ferencek, @jdamgov, @nhanvtran, @gkasieczka, @schoef, @mariadalfonso, @clelange, @riga, @JyothsnaKomaragiri, @mverzett, @gpetruc, @andrzejnovak, @pvmulder this is something you requested to watch as well.
@davidlange6, @silviodonato, @fabiocos you are the release manager for this.

cms-bot commands are listed here

@perrotta
Contributor

please test with cms-data/RecoBTag-Combined#26.

@hqucms
Contributor Author

hqucms commented Feb 12, 2020

@perrotta Looks like the test is not triggered successfully?

@perrotta
Contributor

please test with cms-data/RecoBTag-Combined#26

@cmsbuild
Contributor

cmsbuild commented Feb 12, 2020

The tests are being triggered in jenkins.
Tested with other pull request(s) cms-data/RecoBTag-Combined#26
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/4614/console Started: 2020/02/12 08:13

@cmsbuild
Contributor

+1
Tested at: 7a7c0a7
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9fe682/4614/summary.html
CMSSW: CMSSW_11_1_X_2020-02-11-2300
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Contributor

Comparison job queued.

@hqucms
Contributor Author

hqucms commented Mar 5, 2020

@slava77
The 200 GeV threshold is a general recommendation for analyses using AK8 jets, and it should be applied offline by the analyzers. In the code we put 150 because there it is the uncorrected pt and we want to leave some margin for the JECs.
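
The relation between the two thresholds can be sketched as follows. This is only an illustration, not the CMSSW code: the function names, the strict `>` comparison, and the treatment of the JEC as a single multiplicative factor are assumptions; only the 150 GeV and 200 GeV values come from the discussion.

```python
# Thresholds quoted in the discussion; everything else is hypothetical.
TAGGER_MIN_RAW_PT = 150.0     # GeV, cut on the uncorrected jet pt in the code
ANALYSIS_MIN_CORR_PT = 200.0  # GeV, recommended offline cut on the corrected pt


def is_tagged(raw_pt):
    """True if the jet passes the loose preselection on uncorrected pt.

    The strict '>' is an assumption made for this sketch.
    """
    return raw_pt > TAGGER_MIN_RAW_PT


def passes_analysis_cut(raw_pt, jec_factor):
    """True if the corrected pt passes the offline recommendation.

    jec_factor is an assumed single multiplicative correction.
    """
    return raw_pt * jec_factor > ANALYSIS_MIN_CORR_PT


def max_covered_jec():
    """Largest JEC factor for which every jet with corrected pt above
    200 GeV is guaranteed to have an uncorrected pt above 150 GeV."""
    return ANALYSIS_MIN_CORR_PT / TAGGER_MIN_RAW_PT
```

Under these assumptions the margin covers correction factors up to 200/150, about 1.33: a jet whose corrected pt exceeds 200 GeV with a smaller factor always has an uncorrected pt above 150 GeV.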

@slava77
Contributor

slava77 commented Mar 5, 2020

In the code we put 150 because there it is the uncorrected pt and we want to leave some margin for the JECs.

ok, that's fine then

@hqucms
Contributor Author

hqucms commented Mar 5, 2020

Ah, this is quite useful. Thank you!

@slava77 BTW, is there a quick way to find out which index corresponds to which tagger for the pairDiscriVector_?

I just open the file and print when needed.

Here is a script I used to check that the old tags are unaffected:
compare_patJets_btags in https://github.com/slava77/cms-reco-tools/blob/master/printFWLite.C
In the first event it prints the old and new map between names and indices.
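
For a quick look without the ROOT macro, the name-to-index map can be sketched in plain Python. This is a hedged illustration, not the script linked above: it works on any list of (name, value) pairs, and the example list (its ordering and values) is made up. In FWLite such a list could presumably be filled from `pat::Jet::getPairDiscri()`, but that call is an assumption here and is not exercised by the sketch.

```python
def index_by_name(pair_discri):
    """Map each discriminator name to its position in a list of
    (name, value) pairs."""
    return {name: i for i, (name, _value) in enumerate(pair_discri)}


def print_index_map(pair_discri):
    """Print one line per discriminator: index, name, value."""
    for i, (name, value) in enumerate(pair_discri):
        print("%3d  %-50s %.4f" % (i, name, value))


# Hypothetical stand-in for the content of pairDiscriVector_;
# the ordering and the values are invented for illustration.
example_pairs = [
    ("pfDeepCSVJetTags:probb", 0.12),
    ("pfParticleNetJetTags:probQCDothers", 0.80),
    ("pfMassDecorrelatedParticleNetJetTags:probXbb", 0.05),
]

name_to_index = index_by_name(example_pairs)
print_index_map(example_pairs)
```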

@cmsbuild
Contributor

cmsbuild commented Mar 5, 2020

+1
Tested at: 3c9ce7f
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9fe682/5001/summary.html
CMSSW: CMSSW_11_1_X_2020-03-04-2300
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Contributor

cmsbuild commented Mar 5, 2020

Comparison job queued.

@cmsbuild
Contributor

cmsbuild commented Mar 5, 2020

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9fe682/5001/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 1102 differences found in the comparisons
  • DQMHistoTests: Total files compared: 34
  • DQMHistoTests: Total histograms compared: 2680577
  • DQMHistoTests: Total failures: 40
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2680218
  • DQMHistoTests: Total skipped: 319
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 33 files compared)
  • Checked 147 log files, 16 edm output root files, 34 DQM output files

@slava77
Contributor

slava77 commented Mar 6, 2020

+1

for #28902 3c9ce7f

  • code changes are in line with the PR description and the follow up review
  • jenkins tests pass and comparisons with the baseline show differences only in the miniAOD variables of slimmedJetsAK8 btagging discriminators: 37 new discriminators were added
  • local tests rerunning miniAOD on 1K ttbar PU50 events show somewhat expected behavior:
    • CPU use is up by 4% of miniAOD time (0.5% of reco+miniAOD). Since this development also brought, as a side effect, a reduction in the total time of all the different "deep" jet tags from 121 ms in CMSSW_11_0_0_pre10 to 83 ms with this PR, I think that this change is acceptable.
      • similar logic should [preferably] apply to the 10_2_X and 10_6_X targets when considering enabling the ParticleNet in the backports
    • RSS peak on 8-thread job is up by 22 MB/thread, in agreement with detailed profiling
    • the miniAOD disk use is up by 0.5% due to a 17% increase in the size of the pat::Jets slimmedJetsAK8 collection. The effect is likely smaller on more events. Still, I'd like to remind everyone of the pending issue "apply 1e-4 rounding/truncation in ONNXRuntime jet tags" (#28469), which is expected to reduce the compressed size on disk.
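
The numbers in the list above can be cross-checked with a little arithmetic. The sketch below only re-derives quantities implied by the quoted figures; the helper names are hypothetical, and the shares are estimates that assume the quoted percentages are exact.

```python
def relative_change(before, after):
    """Fractional change from 'before' to 'after'."""
    return (after - before) / before


def implied_share(total_increase, component_increase):
    """Share of the total taken by one component, assuming the fractional
    increase of that component alone explains the increase of the total."""
    return total_increase / component_increase


# "deep" jet tags: 121 ms -> 83 ms, a reduction of roughly 31%
deep_tag_change = relative_change(121.0, 83.0)

# 4% of miniAOD time equals 0.5% of reco+miniAOD,
# so miniAOD is roughly 12.5% of the reco+miniAOD time
miniaod_cpu_share = implied_share(0.005, 0.04)

# a 17% growth of slimmedJetsAK8 gives 0.5% on disk,
# so the collection was roughly 3% of the miniAOD size
ak8_disk_share = implied_share(0.005, 0.17)
```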

Some plots from 1K ttbar events, for pfParticleNetJetTags [ignore the red histos]

probQCDothers still seems to have too much at 0
[plot: patJets_slimmedJetsAK8, pairDiscriVector_ element 13 (.second)]

well, perhaps the DiscriminatorsJetTags are more expected to be "naturally" separated
ZvsQCD
[plot: patJets_slimmedJetsAK8, pairDiscriVector_ element 101 (.second)]

WvsQCD
[plot: patJets_slimmedJetsAK8, pairDiscriVector_ element 83 (.second)]

these are showing some considerable signal-like fractions.
HbbvsQCD and ZbbvsQCD have a much smaller bump near 1, H4q has almost none. With some imagination this might make sense.
Without gen-match, this is perhaps good enough.
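
For context on the "XvsQCD" outputs: a binarized discriminator of this kind is commonly built as the summed signal probability divided by the sum of signal and background probabilities. The sketch below assumes that form; it is not taken from the PR. Apart from probQCDothers, the label names and all values are illustrative, and the choice of which classes enter each sum, as well as the return value for a zero denominator, are assumptions.

```python
def binarized_score(probs, signal_labels, background_labels):
    """Return sum(signal) / (sum(signal) + sum(background)).

    probs maps class labels to probabilities. If the denominator is
    zero, 0.0 is returned (an arbitrary choice for this sketch).
    """
    sig = sum(probs[k] for k in signal_labels)
    bkg = sum(probs[k] for k in background_labels)
    denom = sig + bkg
    return sig / denom if denom > 0 else 0.0


# Illustrative class probabilities for one jet (invented numbers).
example_probs = {
    "probZbb": 0.3,
    "probZcc": 0.1,
    "probZqq": 0.2,
    "probQCDothers": 0.4,
}

z_vs_qcd = binarized_score(
    example_probs,
    signal_labels=["probZbb", "probZcc", "probZqq"],
    background_labels=["probQCDothers"],
)
```

With these made-up inputs the score is about 0.6. Because the ratio ignores all classes outside the two sums, such discriminators tend to pile up near 0 or 1 more than the raw probabilities do.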

Note: this requires cms-data/RecoBTag-Combined#26

@silviodonato
Contributor

to be merged with cms-sw/cmsdist#5626

@cmsbuild cmsbuild mentioned this pull request Mar 7, 2020
@slava77
Contributor

slava77 commented Mar 9, 2020

@silviodonato
is this PR waiting for the analysis signature or are there some other concerns about it?

@silviodonato
Contributor

merge
@santocch please have a look

@cmsbuild cmsbuild merged commit c1ef845 into cms-sw:master Mar 10, 2020
@santocch

+1

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will be automatically merged.

7 participants