Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable HLT tau filtering based on jet-tags (ParticleNet) #43100

Merged
merged 4 commits into from
Oct 27, 2023

Conversation

mbluj
Copy link
Contributor

@mbluj mbluj commented Oct 24, 2023

PR description:

This PR adds an HLT filter of tau candidates based on jet-tagger scores. This filter, in particular, enables tau filtering based on particle-net.
The PR introduces the following changes:

  • Adds TauTagFilter: an event filter based on the number of tau candidates that pass a selection which combines kinematic requirements (pt, eta) and requirements on jet-tagging. Optionally it also requires a matching with seeds and applies a pt correction;
  • Moves TauWPThreshold utility class from DeepTauId code to a separate file to be used by both DeepTauId and TauTagFilter (and possibly others in future);
  • Fixes L2TauTagFilter.cc to set correct id to output object.

No changes are expected in existing workflows (except correct id of objects returned by L2TauTagFilter).

Detailed performance of new HLT paths using newly introduced TauTagFilter and based on particle-net taggers presented at the TSG meeting on 2023-10-25: link to indico.

PR validation:

Code tested using CMSSW_13_3_0_pre3 with a custom HLT python configuration (with MC): hltMC_PNet_tau.py

The matrix tests runTheMatrix.py -l limited -i all --ibeos successful.

@mbluj
Copy link
Contributor Author

mbluj commented Oct 24, 2023

FYI, @kandrosov

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-43100/37342

  • This PR adds an extra 44KB to repository

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @mbluj for master.

It involves the following packages:

  • RecoTauTag/HLTProducers (hlt)
  • RecoTauTag/RecoTau (reconstruction)

@cmsbuild, @jfernan2, @mmusich, @Martin-Grunewald, @mandrenguyen, @missirol can you please review it and eventually sign? Thanks.
@silviodonato, @azotz, @missirol, @mbluj this is something you requested to watch as well.
@sextonkennedy, @rappoccio, @antoniovilela you are the release manager for this.

cms-bot commands are listed here

@mbluj mbluj changed the title Cmssw 13 3 x tau pog pnet tau at hlt Enable HLT tau filtering based on jet-tags (ParticleNet) Oct 24, 2023
@mmusich
Copy link
Contributor

mmusich commented Oct 24, 2023

@cmsbuild, please test

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-fc6b04/35399/summary.html
COMMIT: 053a33a
CMSSW: CMSSW_13_3_X_2023-10-24-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/43100/35399/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 6 lines to the logs
  • Reco comparison results: 26 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3357400
  • DQMHistoTests: Total failures: 1070
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3356308
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 214 log files, 167 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

Copy link
Contributor

@mmusich mmusich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have only few very minor comments.
In general please squash the last 3 commits (see discussion at cms-sw/cms-bot#2080 for a rationale)
In any case I'll wait for the talk at the TSG meeting before signoff.

RecoTauTag/HLTProducers/src/TauTagFilter.cc Outdated Show resolved Hide resolved
RecoTauTag/HLTProducers/src/TauTagFilter.cc Outdated Show resolved Hide resolved
RecoTauTag/RecoTau/interface/TauWPThreshold.h Outdated Show resolved Hide resolved
RecoTauTag/HLTProducers/src/TauTagFilter.cc Outdated Show resolved Hide resolved
@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-fc6b04/35412/summary.html
COMMIT: 0cffcda
CMSSW: CMSSW_13_3_X_2023-10-24-2300/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/43100/35412/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 67 lines to the logs
  • Reco comparison results: 34 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3351458
  • DQMHistoTests: Total failures: 1073
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3350363
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 213 log files, 166 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

@mmusich
Copy link
Contributor

mmusich commented Oct 26, 2023

+hlt

Concerning the test results:

@mmusich
Copy link
Contributor

mmusich commented Oct 26, 2023

@mbluj, as discussed yesterday during the TSG meeting of Oct 25th please prepare a corresponding backport to 13.2.X, so that integration of the relevant paths using this new filter can proceed. Thank you.

@mbluj
Copy link
Contributor Author

mbluj commented Oct 26, 2023

@mbluj, as discussed yesterday during the TSG meeting of Oct 25th please prepare a corresponding backport to 13.2.X, so that integration of the relevant paths using this new filter can proceed. Thank you.

Sure, it will be done.

@mandrenguyen
Copy link
Contributor

type tau

@cmsbuild cmsbuild added the tau label Oct 26, 2023
@mandrenguyen
Copy link
Contributor

+1

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @antoniovilela, @sextonkennedy, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@antoniovilela
Copy link
Contributor

+1

Comment on lines +13 to +19
try {
size_t pos = 0;
value_ = std::stod(cut_str, &pos);
simple_value = (pos == cut_str.size());
} catch (std::invalid_argument&) {
} catch (std::out_of_range&) {
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This violates the CMSSW coding rules:

7.11 In general, do not catch exceptions – leave them to the Framework (see Exception Guidelines).

These changes should not have been accepted in the current state. Please fix this misuse ASAP.

Copy link
Contributor

@mmusich mmusich Feb 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mbluj can you have a look? This is in the way of further development of the HLT menu for 2024.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand correctly that try-catch can be simply removed as exceptions will be cached by framework?
@fwyzard for my curiosity: those exception catches was found because the exceptions have been thrown or it was done with other means?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand correctly that try-catch can be simply removed as exceptions will be cached by framework?

Exceptions caught by the framework will usually cause the jobs to terminate.

@fwyzard for my curiosity: those exception catches was found because the exceptions have been thrown or it was done with other means?

These exceptions have been noticed running cmsRun hlt.py under gdb and telling it to catch exceptions, in order to investigate were some failures are happening.
These try/catch blocks add a lot of noise to the debugging sessions, making it harder to figure out were the real problems are happening.

In general, in CMSSW it is discouraged to use try/catch as a way to check if a condition is valid or not; it's better to do an explicit check.

This approach is idiomatic in python, where checking for an exception is almost as fast as checking for a return value. In C++, the regular return values are faster, and the exceptions are slower.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mbluj having a parsing error at this stage is ok; it just means that WP is not a "simple value". So I think your code should be modified like this:

const char* cut_cstr = cut_str.c_str();
char* end_cstr;
value_ = std::strtod(cut_cstr, &end_cstr);
simple_value = end_cstr != cut_cstr && std::isfinite(value_);

(without any throw)

Yes, indeed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, the code I proposed above is also wrong. For example, for cut_str="1 + 1", it will return simple_value=true and _value=1, which is not intended. So should be something like this:

    const char* cut_cstr = cut_str.c_str();
    char* end_cstr;
    value_ = std::strtod(cut_cstr, &end_cstr);
    const bool simple_value = end_cstr != cut_cstr && static_cast<std::size_t>(end_cstr - cut_cstr) == cut_str.size() && value_ != HUGE_VALF;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is enough to check if end_cstr is empty !*end_cstr, so the piece of code can be like this:

      const char* cut_cstr = cut_str.c_str();
      char* end_cstr{};
      value_ = std::strtod(cut_cstr, &end_cstr);
      simple_value = (!*end_cstr && cut_cstr != end_cstr && std::isfinite(value_));
      if (!simple_value) {

I think std::isfinite(value_) is better than value_ != HUGE_VAL as it does not accept nan. BTW, current code accepts both nan and inf as a valid "simple_value".

Copy link
Contributor

@kandrosov kandrosov Feb 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. Perhaps, as in your original proposal, an exception should be thrown for inf and especially nan - I don't see any use cases where one would want to use them in valid configs...
Is std::isfinite(HUGE_VALF) == false guaranteed, or could it be compiler-dependent? To make sure that we account for all possible return scenarios.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quote from cppreference:

On implementations that support floating-point infinities, these macros (HUGE_VALF, HUGE_VAL, HUGE_VALL) always expand to the positive infinities of float, double, and long double, respectively."

So, I think it is OK to check if value_ is finite.

In the current implementation not-finite numbers are treated as not being a "simple_value" and as they are not correct threshold expressions an exception is thrown.

BTW, for double one should use HUGE_VAL rather than HUGE_VALF that is for float.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants