fix index typo in PFRecoTauClusterVariables #30919

Merged: 1 commit, Jul 27, 2020

Conversation

slava77
Contributor

@slava77 slava77 commented Jul 25, 2020

@swozniewski @mbluj

I expect this to get rid of the non-reproducible behavior of the tau ID outputs in phase-2 workflows.
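For readers without the diff at hand, here is a minimal sketch of how an index typo can produce this kind of non-reproducibility. It is not the actual change in this PR: the array size and the function names (`fillInputsWithTypo`, `fillInputsFixed`, `firstUnsetSlot`) are hypothetical, and the mechanism (one input slot never being written) is an assumption. The sketch pre-fills the inputs with NaN so that an unwritten slot can be detected rather than holding whatever happened to be in memory.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <limits>

// Hypothetical input layout; the real variable list lives in the tau ID code.
constexpr std::size_t kNumInputs = 5;
using Inputs = std::array<float, kNumInputs>;

// Start from NaN so that a slot nobody wrote to is detectable,
// instead of silently carrying an arbitrary leftover value.
Inputs makeUnsetInputs() {
  Inputs in;
  in.fill(std::numeric_limits<float>::quiet_NaN());
  return in;
}

// Fill with an index typo: slot 3 is written twice, slot 4 is never written.
Inputs fillInputsWithTypo(float a, float b, float c, float d, float e) {
  Inputs in = makeUnsetInputs();
  in[0] = a;
  in[1] = b;
  in[2] = c;
  in[3] = d;
  in[3] = e;  // typo: this was meant to be in[4]
  return in;
}

// Same fill with the index corrected.
Inputs fillInputsFixed(float a, float b, float c, float d, float e) {
  Inputs in = makeUnsetInputs();
  in[0] = a;
  in[1] = b;
  in[2] = c;
  in[3] = d;
  in[4] = e;
  return in;
}

// Returns the index of the first slot that was never filled, or -1 if all are set.
int firstUnsetSlot(const Inputs& in) {
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (std::isnan(in[i]))
      return static_cast<int>(i);
  }
  return -1;
}
```

Without the NaN pre-fill, the unwritten slot would be read uninitialized, so the MVA output could differ from run to run even on identical events.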

@cmsbuild
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-30919/17314

  • This PR adds an extra 16KB to repository

@cmsbuild
Contributor

A new Pull Request was created by @slava77 (Slava Krutelyov) for master.

It involves the following packages:

RecoTauTag/RecoTau

@perrotta, @jpata, @cmsbuild, @slava77 can you please review it and eventually sign? Thanks.
@riga this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@slava77
Contributor Author

slava77 commented Jul 25, 2020

@cmsbuild please test

@cmsbuild
Contributor

cmsbuild commented Jul 25, 2020

The tests are being triggered in jenkins.

@cmsbuild
Contributor

+1
Tested at: fdb86b8
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-8e562d/8292/summary.html
CMSSW: CMSSW_11_2_X_2020-07-25-1100
SCRAM_ARCH: slc7_amd64_gcc820

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-8e562d/8292/git-log-recent-commits
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-8e562d/8292/git-merge-result

@cmsbuild
Contributor

Comparison job queued.

@cmsbuild
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-8e562d/8292/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 13 differences found in the comparisons
  • DQMHistoTests: Total files compared: 34
  • DQMHistoTests: Total histograms compared: 2526188
  • DQMHistoTests: Total failures: 7
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2526134
  • DQMHistoTests: Total skipped: 47
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 33 files compared)
  • Checked 144 log files, 17 edm output root files, 34 DQM output files

@slava77
Contributor Author

slava77 commented Jul 26, 2020

@swozniewski @mbluj @fojensen
please clarify where the original code is that maps the names to the indices of the variables in the training payload.

I'd naively expect this to still be available in the BDT file metadata.
I looked at RecoTauTag_tauIdMVAIsoPhase2 (SHA f8e6ade86304851fcd1a3194c299fea8a2453294) and it doesn't contain metadata for the variable names.

@swozniewski
Contributor

@slava77 thank you for the fast follow-up! I agree that this is probably the right solution but I cannot tell for sure. @fojensen could you please verify with your training setup?

@slava77
Contributor Author

slava77 commented Jul 27, 2020

+1

for #30919 fdb86b8

  • this should at least solve the reproducibility problems; confirmation of the complete correct variable mapping is pending

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@silviodonato
Contributor

+1

@cmsbuild cmsbuild merged commit 73ff004 into cms-sw:master Jul 27, 2020
@fojensen
Contributor

fojensen commented Jul 27, 2020 via email

@slava77
Contributor Author

slava77 commented Jul 27, 2020

Regarding the metadata, is this something which would have been introduced when adding the variables at the training stage, i.e. the TMVA AddVariable function?

Apparently not, at least not in the way I had in mind when I asked the question.
The payloads are GBRForest objects, and they do not contain any data about variable names.
It would be useful to keep a reference link to the code that originally extracted the variables in the description of #30341 (or perhaps even inline it in the code that builds the MVA inputs).
This would at least allow a more open way to cross-reference the meaning of the variables.

In a way, this goes back to #29818.
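
As an illustration of what keeping the mapping inline could look like, here is a minimal sketch in which the index and the name of each MVA input are declared together. The variable names, the `TauVar` enum, and the helpers `nameOf` and `setVar` are all hypothetical and are not taken from the actual phase-2 tau ID code or from the training setup; the real list would have to come from the code that extracted the training variables.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Hypothetical variable list: one enumerator per MVA input, in payload order.
enum class TauVar : std::size_t { kPt = 0, kEta, kChargedIsoPtSum, kCount };

constexpr std::size_t kNumVars = static_cast<std::size_t>(TauVar::kCount);

// Names kept next to the indices, so the meaning of each slot is documented in code.
constexpr std::array<std::string_view, kNumVars> kVarNames = {"pt", "eta", "chargedIsoPtSum"};

constexpr std::size_t idx(TauVar v) { return static_cast<std::size_t>(v); }

// Look up the documented name of an input slot.
constexpr std::string_view nameOf(TauVar v) { return kVarNames[idx(v)]; }

using MvaInputs = std::array<float, kNumVars>;

// Fill a slot by name rather than by a bare integer literal.
void setVar(MvaInputs& in, TauVar v, float value) { in[idx(v)] = value; }
```

Filling slots through named enumerators makes a repeated or skipped index easier to spot in review than a sequence of integer literals, and the name table gives a place to cross-reference against the training code.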

5 participants