Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce DeepDoubleX V2 #30016

Merged
merged 1 commit into from
Jul 7, 2020
Merged

Introduce DeepDoubleX V2 #30016

merged 1 commit into from
Jul 7, 2020

Conversation

andrzejnovak
Copy link
Contributor

@andrzejnovak andrzejnovak commented May 28, 2020

This PR introduces V2 of DeepDoubleX. Required changes w.r.t. V1 of DeepDoubleX producer include adding a new group of variables, add more variables to existing groups and reordering some of them.

Input x-check in training FW and CMSSW - ✅.
image

Evaluation x-check - ✅

image

@cmsbuild
Copy link
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-30016/15708

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@andrzejnovak
Copy link
Contributor Author

andrzejnovak commented May 28, 2020

@slava77 bump

@andrzejnovak andrzejnovak changed the title [WIP] Introduce DeepDoubleX V2 - feedback requested Introduce DeepDoubleX V2 - feedback requested Jun 2, 2020
Copy link
Contributor

@perrotta perrotta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a real review, just pointing out a few thing that jumped on my eyes while scrolling the differences

++idx;
}
}

// run prediction
outputs = globalCache()->run(input_names_, data_, output_names_, batch_size)[0];

// DEBUG: Dump inputs to file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has this debug to remain in the producer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can gate it by a debug_ bool, but I would prefer to keep it here as it helps a lot with catching problems when syncing with the outside framework.

@perrotta
Copy link
Contributor

perrotta commented Jun 8, 2020

This is a draft PR for introducing V2 of DeepDoubleX. Required changes w.r.t. V1 of DeepDoubleX producer include adding a new group of variables, add more variables to existing groups and reordering some of them.

I would like to ask for feedback on how backcomp to V1 should be treated? Should there be a new producer? It seems like a possibly simpler choice than making all the differences configurable.

(model file will be factored out before this PR is ready)

What is the purpose of back compatibility with V1? Do you plan to use both V1 and V2 at the same time, or just allow finishing some analysis which already started with V1? That decision has to be taken in the the POG: please summarize here the result of the discussion in the POG.

If the two version must be kept available at the same time, at the first glance it does not seem to me too much complicated to factorize out the differences and make them configurable inside the same producer: code duplication doesn't seem neededd here, and normally better to be avoided.

@andrzejnovak
Copy link
Contributor Author

@perrotta Thanks for getting back to me. This is supposed to supersede V1, I am asking from maintenance and backcomp point of view. From POG POV V2 could replace V1.

@cmsbuild
Copy link
Contributor

cmsbuild commented Jun 9, 2020

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

cmsbuild commented Jun 9, 2020

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-30016/15948

@cmsbuild
Copy link
Contributor

cmsbuild commented Jun 9, 2020

A new Pull Request was created by @andrzejnovak (Andrzej Novak) for master.

It involves the following packages:

DataFormats/BTauReco
RecoBTag/Combined
RecoBTag/FeatureTools
RecoBTag/ONNXRuntime

@perrotta, @cmsbuild, @slava77 can you please review it and eventually sign? Thanks.
@emilbols, @smoortga, @riga, @rovere, @JyothsnaKomaragiri, @mverzett, @hqucms, @ferencek, @andrzejnovak this is something you requested to watch as well.
@silviodonato, @dpiparo you are the release manager for this.

cms-bot commands are listed here

@cmsbuild
Copy link
Contributor

cmsbuild commented Jun 9, 2020

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-082aa3/7519/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 36
  • DQMHistoTests: Total histograms compared: 2778915
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2778864
  • DQMHistoTests: Total skipped: 50
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 35 files compared)
  • Checked 152 log files, 16 edm output root files, 36 DQM output files

@andrzejnovak
Copy link
Contributor Author

@slava77 @perrotta Is anything missing?

@slava77
Copy link
Contributor

slava77 commented Jul 1, 2020

@slava77 @perrotta Is anything missing?

#30016 (comment)

@slava77
Copy link
Contributor

slava77 commented Jul 2, 2020

@slava77 not this?

sorry, I meant that Andrea is not available this week.

@perrotta
Copy link
Contributor

perrotta commented Jul 2, 2020

@andrzejnovak thank you for providing your performance results in #30016 (comment) and #30016 (comment). With which workflow were they obtained?

Looking at them, I see that both V1 and V2 behaves rather similarly for what timing and memory are concerned: timing seems well manageable, and also for the memory usage some 9 Mb MEM_LIVE footprint for each version seems affordable.

@slava77 please comment if you disagree; but given the measured performances I believe that both V1 and V2 could be run simultaneously, so that full studies can be performed on V2 before finally switching to it as default suggested version for analyses. This would require reverting the effect of the commit in which you removed V2 from the list of versions saved in miniAOD output.
It must be understood that such a contemporary presence of two versions/tunes of the same algo must be intended as temporary, and as soon as all studies and scale factors have been concluded and computed only the recommended version will remin available in the release.

@slava77
Copy link
Contributor

slava77 commented Jul 2, 2020

related to #30016 (comment)
and #30016 (comment) (taking just DeepDouble module names)

0.000017  pfDeepDoubleXTagInfos
0.000022  pfMassIndependentDeepDoubleBvLJetTags
0.000030  pfMassIndependentDeepDoubleBvLV2JetTags
0.000022  pfMassIndependentDeepDoubleCvBJetTags
0.000044  pfMassIndependentDeepDoubleCvBV2JetTags
0.000022  pfMassIndependentDeepDoubleCvLJetTags
0.000031  pfMassIndependentDeepDoubleCvLV2JetTags

I see in my recent (IB, not this PR) that in 136.888, using JetHT inputs, the times per event are typically around 1 ms/ev per JetTags modules, which is about a factor of 40 larger than the values above.
Even with a very fast machine possibly doing work a factor of 3-4 faster, I conclude that the timing posted above is more likely to be for events with no jets passing preselections.
Anyways, an addition of 3 ms from 3 extra modules seems acceptable, if I extrapolate from my reference.

@andrzejnovak
Copy link
Contributor Author

Hmm, fair point. The sample processed certainly had some passing jets, but it's not majority. If the times get averaged it could be biased. Though both tests were on the same sample.

@perrotta The igprof results are from running the test_deep_doublex file.

I think actually think it's fine as is - not running v2 by default in mini. The values can be calculated when producing the validation tuples/nano. I think that will make for a cleaner switch even if it wouldn't be too expensive to keep both temporarily.

@slava77
Copy link
Contributor

slava77 commented Jul 2, 2020

@perrotta The igprof results are from running the test_deep_doublex file.

'file:72164088-CB67-E811-9D0D-008CFA197AC4.root', ?
If it's a local copy of what's a row above from GluGluHToCC_M125_13TeV_powheg_pythia8,
then at least from the name it sounds like there are to first order no AK8 jets with pt>120 GeV (do I recall the cut value correctly?) in these events.

@andrzejnovak
Copy link
Contributor Author

@perrotta
Copy link
Contributor

perrotta commented Jul 4, 2020

+1

  • DeepDoubleX version 2 added as a configurable possibility in the code, but not switched on in the settings of the default workflows
  • The following external must also be merged to allow running V2: DDXV2 cms-data/RecoBTag-Combined#31
  • The timing and memory performance of V2 are comparable with the ones of current V1, therefore no issues are expected when the switch will be decided and implemented by BTV (final checks and corresponding scale factors still to be computed)
  • Since the developers agree on having only by now V1 in the release (see Introduce DeepDoubleX V2  #30016 (comment)), no changes are expected in reco outputs: jenkins tests pass and actually show no differences

@silviodonato
Copy link
Contributor

merge

@santocch
Copy link

+1

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will be automatically merged.

@kpedro88 kpedro88 mentioned this pull request Aug 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants