Fix #884, remove code repetition in ProcessNormalization and AsymPow #989

Open: pitkajuh wants to merge 1 commit into base: main

Contributor

@pitkajuh pitkajuh commented Jul 6, 2024

Dear all, this PR fixes #884. I could not figure out a proper place for the new logKappa function, so I created a new file for it. If there is a better place for it, please do not hesitate to tell me.

The code compiles without any problems, but test collection fails with errors when I run pytest from the test/ directory in a conda environment:

========================================================================== test session starts ==========================================================================
platform linux -- Python 3.8.0, pytest-8.2.2, pluggy-1.5.0
rootdir: /home/pitkajuh/bin/HiggsAnalysis/CombinedLimit/test
plugins: anyio-4.4.0
collected 0 items / 8 errors

================================================================================ ERRORS =================================================================================
________________________________________________________________ ERROR collecting validation/test_AS.py _________________________________________________________________
validation/test_AS.py:13: in <module>
    datacardGlob("simple-counting/counting-B5p5-Obs[16]*.txt"),
validation/TestClasses.py:439: in datacardGlob
    base = os.environ["CMSSW_BASE"] + "/src/HiggsAnalysis/CombinedLimit/data/benchmarks/"
../../../../anaconda3/envs/combine/lib/python3.8/os.py:673: in __getitem__
    raise KeyError(key) from None
E   KeyError: 'CMSSW_BASE'
________________________________________________________________ ERROR collecting validation/test_BS.py _________________________________________________________________
validation/test_BS.py:13: in <module>
    datacardGlob("simple-counting/counting-B5p5-Obs[16]*.txt"),
validation/TestClasses.py:439: in datacardGlob
    base = os.environ["CMSSW_BASE"] + "/src/HiggsAnalysis/CombinedLimit/data/benchmarks/"
../../../../anaconda3/envs/combine/lib/python3.8/os.py:673: in __getitem__
    raise KeyError(key) from None
E   KeyError: 'CMSSW_BASE'
________________________________________________________________ ERROR collecting validation/test_BT.py _________________________________________________________________
validation/test_BT.py:15: in <module>
    datacardGlob("simple-counting/counting-B5p5-Obs[16]*.txt"),
validation/TestClasses.py:439: in datacardGlob
    base = os.environ["CMSSW_BASE"] + "/src/HiggsAnalysis/CombinedLimit/data/benchmarks/"
../../../../anaconda3/envs/combine/lib/python3.8/os.py:673: in __getitem__
    raise KeyError(key) from None
E   KeyError: 'CMSSW_BASE'
________________________________________________________________ ERROR collecting validation/test_FC.py _________________________________________________________________
validation/test_FC.py:13: in <module>
    datacardGlob("simple-counting/counting-B5p5-Obs[16]*-Syst30U.txt"),
validation/TestClasses.py:439: in datacardGlob
    base = os.environ["CMSSW_BASE"] + "/src/HiggsAnalysis/CombinedLimit/data/benchmarks/"
../../../../anaconda3/envs/combine/lib/python3.8/os.py:673: in __getitem__
    raise KeyError(key) from None
E   KeyError: 'CMSSW_BASE'
________________________________________________________________ ERROR collecting validation/test_HN.py _________________________________________________________________
validation/test_HN.py:261: in <module>
    datacardGlob("simple-counting/counting-B5p5-Obs[16]*-S*[Uy].txt"),
validation/TestClasses.py:439: in datacardGlob
    base = os.environ["CMSSW_BASE"] + "/src/HiggsAnalysis/CombinedLimit/data/benchmarks/"
../../../../anaconda3/envs/combine/lib/python3.8/os.py:673: in __getitem__
    raise KeyError(key) from None
E   KeyError: 'CMSSW_BASE'
_______________________________________________________________ ERROR collecting validation/test_MCMC.py ________________________________________________________________
validation/test_MCMC.py:13: in <module>
    datacardGlob("simple-counting/counting-B5p5-Obs6*.txt"),
validation/TestClasses.py:439: in datacardGlob
    base = os.environ["CMSSW_BASE"] + "/src/HiggsAnalysis/CombinedLimit/data/benchmarks/"
../../../../anaconda3/envs/combine/lib/python3.8/os.py:673: in __getitem__
    raise KeyError(key) from None
E   KeyError: 'CMSSW_BASE'
________________________________________________________________ ERROR collecting validation/test_PLC.py ________________________________________________________________
validation/test_PLC.py:13: in <module>
    datacardGlob("simple-counting/counting-B5p5-Obs[16]*.txt"),
validation/TestClasses.py:439: in datacardGlob
    base = os.environ["CMSSW_BASE"] + "/src/HiggsAnalysis/CombinedLimit/data/benchmarks/"
../../../../anaconda3/envs/combine/lib/python3.8/os.py:673: in __getitem__
    raise KeyError(key) from None
E   KeyError: 'CMSSW_BASE'
________________________________________________________________ ERROR collecting validation/test_htt.py ________________________________________________________________
validation/test_htt.py:17: in <module>
    datacardGlob("htt/125/htt_*_8TeV.txt"),
validation/TestClasses.py:439: in datacardGlob
    base = os.environ["CMSSW_BASE"] + "/src/HiggsAnalysis/CombinedLimit/data/benchmarks/"
../../../../anaconda3/envs/combine/lib/python3.8/os.py:673: in __getitem__
    raise KeyError(key) from None
E   KeyError: 'CMSSW_BASE'
======================================================================== short test summary info ========================================================================
ERROR validation/test_AS.py - KeyError: 'CMSSW_BASE'
ERROR validation/test_BS.py - KeyError: 'CMSSW_BASE'
ERROR validation/test_BT.py - KeyError: 'CMSSW_BASE'
ERROR validation/test_FC.py - KeyError: 'CMSSW_BASE'
ERROR validation/test_HN.py - KeyError: 'CMSSW_BASE'
ERROR validation/test_MCMC.py - KeyError: 'CMSSW_BASE'
ERROR validation/test_PLC.py - KeyError: 'CMSSW_BASE'
ERROR validation/test_htt.py - KeyError: 'CMSSW_BASE'
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 8 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
========================================================================== 8 errors in 24.11s ===========================================================================

I am not sure whether these errors have anything to do with the edits I made. I ran the tests by executing the pytest command in the test/ directory, but I don't know whether this is the proper way to run them, since I did not find any information on how to run the tests. Can anyone provide any insight on this?

Contributor Author

pitkajuh commented Jul 6, 2024

It seems that the files I edited contained unnecessary whitespace, which my text editor removes automatically. Is this OK?

Collaborator

@adewit adewit left a comment

Hi, thanks for the PR. I have only had a quick look, and it would be great if other developers could also comment: I'm not sure whether there was a plan to fix open issues such as the one this PR addresses right away, as we may be doing a bigger overhaul in the future.

However, I wanted to comment already because, in general, it would be much preferred if you could test the code on at least a few different datacards to check that it does what you want, rather than relying on the unit tests that we run as part of the PR (while quite extensive, we know they don't cover everything). I think that wasn't done in this case: unless I misread the code, there is a minus-sign error somewhere, which I would expect to change the behaviour.

@@ -93,7 +94,7 @@ Double_t ProcessNormalization::evaluate() const {
     const RooAbsReal *theta = asymmThetaListVec_.at(i);
     const std::pair<double,double> logKappas = logAsymmKappa_.at(i);
     double x = theta->getVal();
-    logVal += x * logKappaForX(x, logKappas);
+    logVal += x * logKappaForX(x, logKappas.second, logKappas.first);
Collaborator

Can you not just call logKappa here, given that all logKappaForX does is call logKappa?

-    double ret = avg + alpha*halfdiff;
-    //assert(alpha >= -1 && alpha <= 1 && "Something is wrong in the interpolation");
-    return ret;
+    return logKappa(x, logKhi, logKlo);
Collaborator

If I'm not mistaken, this changes the behaviour of the original code: you have already defined logKlo as -log(kappaLow_), but the logKappa function still puts a minus sign in front of the variable kappaLow, which in your code already represents -log(kappaLow_).
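
A minimal sketch of the double negation described above. It reproduces only the `|x| >= 0.5` branch quoted in this PR; the helper names and the values kappaHigh = 1.2 and kappaLow = 0.8 are hypothetical, chosen purely for illustration.

```cpp
#include <cmath>

// Tail branch only (|x| >= 0.5), as quoted in the diff. The interpolation
// for |x| < 0.5 is deliberately left out of this sketch.
// logKappaLow is expected to be log(kappaLow), with no sign applied by the caller.
inline double tailLogKappa(double x, double logKappaHigh, double logKappaLow) {
    return x >= 0 ? logKappaHigh : -logKappaLow;
}

// Intended behaviour at x = -1: the function applies the single minus sign.
inline double expectedAtMinusOne() {
    return tailLogKappa(-1.0, std::log(1.2), std::log(0.8));  // -log(0.8), positive
}

// Behaviour described in the review: the caller negates first,
// then the function negates again.
inline double doubleNegatedAtMinusOne() {
    const double logKlo = -std::log(0.8);                      // already negated
    return tailLogKappa(-1.0, std::log(1.2), logKlo);          // log(0.8), negative
}
```

With these inputs the first helper gives -log(0.8) ≈ 0.223 and the second gives log(0.8) ≈ -0.223, so the downward variation changes sign.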

#define logKappa_h

template<typename T>
T logKappa(const T x, const double &kappaHigh, const double &kappaLow) {
Collaborator

Given that what you are passing here are log(kappaHigh) and log(kappaLow), could these parameter names be updated to something that better matches what they represent?

#define logKappa_h

template<typename T>
T logKappa(const T x, const double &kappaHigh, const double &kappaLow) {
Contributor

Suggested change
- T logKappa(const T x, const double &kappaHigh, const double &kappaLow) {
+ inline double logKappa(double x, double kappaHigh, double kappaLow) {

Why the template? RooFit always deals with double anyway. And why the references? Doubles are usually passed by value.
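
One possible hazard of the templated signature, not raised in this thread and shown only as an illustration: if `x` is passed as an integer, `T` is deduced as `int` and the returned value is truncated. The function names below are hypothetical, and only the tail branch from the diff is reproduced.

```cpp
// Templated form: the return type follows the type of x.
template <typename T>
T tailTemplated(const T x, const double &logKappaHigh, const double &logKappaLow) {
    // The double result is converted to T; for T = int this truncates toward zero.
    return static_cast<T>(x >= 0 ? logKappaHigh : -logKappaLow);
}

// Plain form along the lines of the suggestion: always double, arguments by value.
inline double tailPlain(double x, double logKappaHigh, double logKappaLow) {
    return x >= 0 ? logKappaHigh : -logKappaLow;
}
```

Calling `tailTemplated(1, 0.18, -0.22)` deduces `T = int` and returns 0, while `tailPlain(1, 0.18, -0.22)` converts the argument to `double` and returns 0.18.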


template<typename T>
T logKappa(const T x, const double &kappaHigh, const double &kappaLow) {
if (fabs(x) >= 0.5) return (x >= 0 ? kappaHigh : -kappaLow);
Contributor

Suggested change
- if (fabs(x) >= 0.5) return (x >= 0 ? kappaHigh : -kappaLow);
+ if (std::abs(x) >= 0.5) return (x >= 0 ? kappaHigh : -kappaLow);

It's better to use functions from the C++ standard library. Also, don't forget to #include <cmath> in this file so that the math functions are guaranteed to be declared.
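
A small sketch of the point about `<cmath>`: once that header is included, `std::abs` is overloaded for the floating-point types, so the same spelling keeps the argument's type. The helper `inTail` is a hypothetical name used only for this illustration.

```cpp
#include <cmath>
#include <type_traits>

// With <cmath> included, std::abs has overloads for float, double and long double,
// and each overload returns the same type it was given.
static_assert(std::is_same<decltype(std::abs(1.0f)), float>::value, "float stays float");
static_assert(std::is_same<decltype(std::abs(1.0)), double>::value, "double stays double");
static_assert(std::is_same<decltype(std::abs(1.0L)), long double>::value,
              "long double stays long double");

// The threshold test from the suggested change, isolated for clarity.
inline bool inTail(double x) {
    return std::abs(x) >= 0.5;
}
```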

Contributor

guitargeek commented Jul 8, 2024

Very nice initiative!

About the place to put your function: I would create a new file to collect these free math functions, just like we do in RooFit:
https://github.com/root-project/root/blob/master/roofit/roofitcore/inc/RooFit/Detail/MathFuncs.h#L86

Maybe CombineMathFuncs.h. Then we can also put other repeated code there.

@adewit, what do you think?
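
A rough sketch of what such a header could look like, folding in the review suggestions from this thread (plain `double` arguments passed by value, `std::abs`, `<cmath>`, and parameter names that say they hold logarithms). The namespace name is hypothetical, and the interpolation polynomial for |x| < 0.5 is an assumption: this page only shows the tail branch and the final `avg + alpha*halfdiff` line, so the polynomial below is one smooth choice that matches the tail values at x = ±0.5, not necessarily the code that was merged.

```cpp
#ifndef CombineMathFuncs_h
#define CombineMathFuncs_h

#include <cmath>

namespace CombineMath {  // hypothetical namespace, not taken from the PR

// logKappaHigh = log(kappaHigh), logKappaLow = log(kappaLow).
// The single minus sign for the low side is applied here, never by the caller.
inline double logKappa(double x, double logKappaHigh, double logKappaLow) {
    if (std::abs(x) >= 0.5) return x >= 0 ? logKappaHigh : -logKappaLow;

    // Assumed smooth interpolation between -log(kappaLow) and log(kappaHigh):
    // alpha = h(2x) with h(t) = (3 t^5 - 10 t^3 + 15 t) / 8, so that
    // h(+-1) = +-1 and the first two derivatives vanish at t = +-1.
    const double logKhi = logKappaHigh;
    const double logKlo = -logKappaLow;
    const double avg = 0.5 * (logKhi + logKlo);
    const double halfdiff = 0.5 * (logKhi - logKlo);
    const double twox = x + x;
    const double twox2 = twox * twox;
    const double alpha = 0.125 * twox * (twox2 * (3 * twox2 - 10) + 15);
    return avg + alpha * halfdiff;
}

}  // namespace CombineMath

#endif
```

At x = 0 this returns the average of log(kappaHigh) and -log(kappaLow), and for |x| >= 0.5 it returns the corresponding tail value unchanged.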

@guitargeek
Contributor

@pitkajuh, you can close this PR: an alternative PR has been merged. Thanks for the initial work!

Successfully merging this pull request may close these issues.

Code duplication found
3 participants