UL: Corrections #72

Open
mcremone opened this issue Sep 22, 2023 · 53 comments · Fixed by #87

@mcremone
Owner

mcremone commented Sep 22, 2023

First of all, pull the most up-to-date version from the master branch:

https://github.com/mcremone/decaf/blob/master/analysis/utils/corrections.py

then replace the corrections one by one with the ones recommended for UL. Please be mindful of a few things:

  1. Use 'correctionlib': https://github.com/cms-nanoAOD/correctionlib
  2. Use the coffea lookup tools to interface with correctionlib. If you need help, ask Nick Smith.
  3. Comment every single correction, including links to where the correction files were taken from.
  4. Remove the non-UL files from the 'analysis/data' folder and keep only the ones that are used. With correctionlib, they should reduce to just a handful of JSON files.
@mcremone
Owner Author

An additional comment: we need to double-check our corrections one by one against the ones implemented by KIT. Still, we want to implement ours using correctionlib.

@ParticleChef
Collaborator

The electron ID, photon ID, MET phi corrections, and pileup weight are now included in corrections.py, using the JSON files from https://gitlab.cern.ch/cms-nanoAOD/jsonpog-integration
The electron trigger weight and reco SF still need to be included.

@ParticleChef
Collaborator

For the NLO EWK scale factor, the ROOT files were produced by the monojet analysis. At the last meeting, we agreed that this scale factor can be used. https://github.com/ParticleChef/decaf/blob/master/analysis/utils/corrections.py#L220

@mcremone
Owner Author

mcremone commented Oct 6, 2023

> The electron ID, photon ID, MET phi corrections, and pileup weight are now included in corrections.py, using the JSON files from https://gitlab.cern.ch/cms-nanoAOD/jsonpog-integration
> The electron trigger weight and reco SF still need to be included.

Can you point me to the part of your code where these are used? Also, what about the muon isolation weights? As for the trigger weights, most likely you're also missing the single-muon and MET triggers, am I right?

@ParticleChef
Collaborator

I made a quick test with the JSON file and the existing btageff.merged file.

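import numpy as np
import correctionlib
from coffea import lookup_tools
from coffea.util import load

# generate 15 dummy jet features
jet_pt    = np.random.exponential(50., 15)
jet_eta   = np.random.uniform(0.0, 2.4, 15)
jet_flav  = np.random.choice([0, 4, 5], 15)
jet_discr = np.random.uniform(0.0, 1.0, 15)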

# separate light and b/c jets
light_jets = np.where(jet_flav == 0)
bc_jets    = np.where(jet_flav != 0)

btag = load('hists/btageff2017.merged')
bpass = btag[tagger].integrate('dataset').integrate('wp',workingpoint).integrate('btag', 'pass').values()[()]
ball = btag[tagger].integrate('dataset').integrate('wp',workingpoint).integrate('btag').values()[()]
nom = bpass / np.maximum(ball, 1.) 
eff = lookup_tools.dense_lookup.dense_lookup(nom, [ax.edges() for ax in btag[tagger].axes()[3:]])

btvjson = correctionlib.CorrectionSet.from_file('data/BtagSF/'+year+'_UL/btagging.json.gz')
sf_nom = btvjson["deepJet_comb"].evaluate('central','M', jet_flav[bc_jets], jet_eta[bc_jets], jet_pt[bc_jets])
print('sf_nom: ', sf_nom, len(sf_nom))

# istag is assumed to be a boolean mask of tagged jets defined elsewhere in the test script
def P(eff):
    weight = np.ones_like(eff)
    weight[istag] = eff[istag]
    weight[~istag] = (1 - eff[~istag])
    return weight.prod()

eff = eff(jet_pt, jet_eta, jet_flav)
print('extract eff:', eff, len(eff))

eff_data_nom  = np.minimum(1., sf_nom*eff)
nnom = P(eff_data_nom)/P(eff)
print('P(eff_data_nom)/P(eff)', nnom)

I printed the values and got the following error:

sf_nom:  [0.94694163 0.95233112 0.9551299  0.95522698 0.95875001 0.94341749
 0.95456105 0.94572292 0.94499175 0.95435803 0.94332464] 11
extract eff: [0.9375     0.9375     0.9375     0.9375     0.9375     0.61748634
 0.9375     0.9375     0.91052632 0.91052632 0.91052632 0.91052632
 0.9375     0.9375     0.61748634] 15
Traceback (most recent call last):
  File "utils/cortest.py", line 89, in <module>
    eff_data_nom  = np.minimum(1., sf_nom*eff)
ValueError: operands could not be broadcast together with shapes (11,) (15,) 

Which part should I fix to solve this error?

@mcremone
Owner Author

To avoid this shape mismatch, you can use real data/MC in the test. My suggestion is that we first finish implementing all corrections with correctionlib (when possible). I'll then do a quick review of the code, and then we can structure a test.
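
The mismatch in the traceback comes from evaluating the scale factor only on the b/c subset (11 jets) while the efficiency is looked up for all 15 jets. A minimal NumPy sketch of one way to keep the shapes aligned is to fill a full-length array through the flavor mask. Here `toy_sf` and `toy_eff` are hypothetical stand-ins for the correctionlib evaluation and the efficiency lookup, the 0.5 discriminator cut is an invented working point, and the light-jet scale factor is left at 1.0 purely as a placeholder.

```python
import numpy as np

rng = np.random.default_rng(42)
n_jets = 15
jet_pt = rng.exponential(50.0, n_jets)
jet_eta = rng.uniform(0.0, 2.4, n_jets)
jet_flav = rng.choice([0, 4, 5], n_jets)
jet_discr = rng.uniform(0.0, 1.0, n_jets)


def toy_sf(flav, eta, pt):
    """Hypothetical stand-in for btvjson["deepJet_comb"].evaluate(...)."""
    return np.full(len(pt), 0.95)


def toy_eff(pt, eta, flav):
    """Hypothetical stand-in for the dense lookup of MC efficiencies."""
    return np.where(flav == 5, 0.8, np.where(flav == 4, 0.2, 0.02))


def prob(eff, istag):
    """Probability of the observed tag configuration, given per-jet efficiencies."""
    return np.prod(np.where(istag, eff, 1.0 - eff))


bc_jets = jet_flav != 0   # boolean mask with the same length as the jet arrays
istag = jet_discr > 0.5   # toy working point (assumption)

# Fill a full-length SF array so that it always matches eff in shape.
sf_nom = np.ones(n_jets)  # placeholder SF = 1 for light jets
sf_nom[bc_jets] = toy_sf(jet_flav[bc_jets], jet_eta[bc_jets], jet_pt[bc_jets])

eff = toy_eff(jet_pt, jet_eta, jet_flav)
eff_data_nom = np.minimum(1.0, sf_nom * eff)
weight = prob(eff_data_nom, istag) / prob(eff, istag)
```

Because `sf_nom` and `eff` are both built for every jet, the product `sf_nom * eff` no longer depends on how many b/c jets the random draw happens to contain.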

@ParticleChef
Collaborator

I finished modifying the b-tagging part. How do you implement the JEC? Other than the JEC, I have modified all the corrections I need.

@mcremone
Owner Author

mcremone commented Nov 2, 2023

@ParticleChef
Collaborator

I updated corrections.py and the jet energy correction files.

@ParticleChef
Collaborator

When running corrections.py, an error occurs at import uproot_methods:

Traceback (most recent call last):
  File "utils/corrections.py", line 6, in <module>
    import uproot, uproot_methods
  File "/uscms/home/jhong/.local/lib/python3.6/site-packages/uproot_methods/__init__.py", line 5, in <module>
    from uproot_methods.classes.TVector2 import TVector2, TVector2Array
  File "/uscms/home/jhong/.local/lib/python3.6/site-packages/uproot_methods/classes/TVector2.py", line 8, in <module>
    import awkward.array.jagged
ModuleNotFoundError: No module named 'awkward.array'

To work around this, I changed the failing lines to use uproot3.

I also split '2016' into '2016preVFP' and '2016postVFP' in the b-tagging part.
(https://github.com/ParticleChef/decaf/blob/UL/analysis/utils/corrections.py#L465)
(https://github.com/ParticleChef/decaf/blob/UL/analysis/utils/common.py#L19)

@mcremone
Owner Author

I think you want to do it the other way around: use the latest awkward version and change the code to match what that version expects.

@ParticleChef
Collaborator

Then, is there no need to change the version of Awkward?

@mcremone
Owner Author

In general you want to use the latest version of everything, both awkward and uproot. If we need to change the code a bit to adjust to the format the new versions may want, that's what I would do.

@ParticleChef
Collaborator

ParticleChef commented Nov 16, 2023

The current corrections.py works with awkward 1.9.0. I compared the versions in my current setup with the latest releases of some modules:

package   current   latest
awkward   1.9.0     2.4.10
uproot    4.3.7     5.1.2
uproot3   3.14.4    3.14.4
numpy     1.17.0    1.26.0

Is it okay to change these versions while staying on the current coffea (0.7.12)?

@mcremone
Owner Author

Were the current versions installed automatically when you installed coffea 0.7.12?

@ParticleChef
Collaborator

I don't remember which versions were installed when I installed coffea 0.7.12. All modules are installed in /uscms/home/jhong/.local/lib/python3.6/site-packages/

@mcremone
Owner Author

I think that if you didn't upgrade packages by hand, those are the versions that coffea installed by itself. I wouldn't touch them then, but in corrections.py, wherever you are using uproot3, use uproot instead. If this makes the code crash, then we need to understand why.

@mcremone
Owner Author

mcremone commented Dec 8, 2023

@ParticleChef were you able to use uproot instead of uproot3? Besides that I don't think that this needs more work.

@mcremone
Owner Author

mcremone commented Dec 8, 2023

@mcremone
Owner Author

> @ParticleChef were you able to use uproot instead of uproot3? Besides that I don't think that this needs more work.

@ParticleChef any news on this?

@alejands self-assigned this on Dec 13, 2023

@ParticleChef
Collaborator

ParticleChef commented Dec 14, 2023

Hi. I checked uproot, uproot3, and uproot_methods.
The first error occurs in uproot_methods:

[jhong@cmslpc175 analysis]$ python utils/corrections.py 
Traceback (most recent call last):
  File "utils/corrections.py", line 4, in <module>
    import uproot, uproot_methods
  File "/uscms/home/jhong/.local/lib/python3.6/site-packages/uproot_methods/__init__.py", line 5, in <module>
    from uproot_methods.classes.TVector2 import TVector2, TVector2Array
  File "/uscms/home/jhong/.local/lib/python3.6/site-packages/uproot_methods/classes/TVector2.py", line 8, in <module>
    import awkward.array.jagged
ModuleNotFoundError: No module named 'awkward.array'

When using uproot without uproot_methods, the lookup_tools call fails:

Traceback (most recent call last):
  File "utils/corrections.py", line 28, in <module>
    get_met_trig_weight[year] = lookup_tools.dense_lookup.dense_lookup(met_trig_hist.values, met_trig_hist.edges)
AttributeError: 'Model_TH1F_v1' object has no attribute 'edges'

Everything works when I change every uproot to uproot3 (without importing uproot_methods).
In all lines that call uproot.open, I changed uproot to uproot3. One such line is https://github.com/ParticleChef/decaf/blob/UL/analysis/utils/corrections.py#L20
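
The AttributeError above is consistent with the interface change between uproot3 and uproot 4 or later: in the newer versions, histogram values and bin edges are obtained through method calls rather than through `values` and `edges` attributes. The following is a minimal sketch, assuming the histogram object exposes `to_numpy()` as uproot 4+ histograms do. `FakeTH1` and `make_lookup` are hypothetical names introduced so that the example runs without a ROOT file, and the `np.digitize` lookup only mimics what a dense lookup does.

```python
import numpy as np


class FakeTH1:
    """Hypothetical stand-in for a TH1 read with uproot 4 or later."""

    def __init__(self, values, edges):
        self._values = np.asarray(values, dtype=float)
        self._edges = np.asarray(edges, dtype=float)

    def to_numpy(self):
        # uproot 4+ histograms return (values, edges) from to_numpy()
        return self._values, self._edges


def make_lookup(th1):
    """Build a simple 1D lookup from a histogram-like object."""
    values, edges = th1.to_numpy()

    def lookup(x):
        # Find the bin index of each x and clamp out-of-range inputs
        # to the first or last bin.
        idx = np.digitize(x, edges) - 1
        idx = np.clip(idx, 0, len(values) - 1)
        return values[idx]

    return lookup


met_trig_hist = FakeTH1([0.2, 0.7, 0.98], [100.0, 200.0, 300.0, 1000.0])
get_met_trig_weight = make_lookup(met_trig_hist)
weights = get_met_trig_weight(np.array([150.0, 250.0, 5000.0]))
```

With a real file, `met_trig_hist` would come from `uproot.open(...)`, and the `(values, edges)` pair could be handed to `lookup_tools.dense_lookup.dense_lookup` in place of the `met_trig_hist.values` and `met_trig_hist.edges` attributes used in the failing line.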

@mcremone
Owner Author

Actually, we also need to implement the UL ttbar corrections:

https://github.com/mcremone/decaf/blob/master/analysis/utils/corrections.py#L352-L353

To be found here:

https://twiki.cern.ch/twiki/bin/view/CMS/TopPtReweighting

@alejands can you look into this?

@mcremone
Owner Author

I forgot to mention that we also need to update these corrections:

https://github.com/mcremone/decaf/blob/UL/analysis/utils/corrections.py#L311-L320

A good place to start would be asking the boosted Higgs team, or digging into their code:

https://github.com/nsmith-/boostedhiggs/tree/master/boostedhiggs

@alejands @ParticleChef

@ParticleChef
Collaborator

I modified btageff.py to produce the btageff merged file used in corrections.py.
https://github.com/ParticleChef/decaf/blob/forBtagw/analysis/processors/btageff.py

Is any step other than reduce.py and merge.py needed to make the btageff.merged files?

@mcremone
Owner Author

@ParticleChef I have a strong preference for adding booleans as attributes of objects. For example, here:

https://github.com/ParticleChef/decaf/blob/forBtagw/analysis/processors/btageff.py#L52

I really prefer this:

https://github.com/mcremone/decaf/blob/master/analysis/processors/btageff.py#L49

@mcremone
Owner Author

@ParticleChef can you open a separate issue for this? Also in this case we should move from coffea.hist to hist following these instructions:

scikit-hep/coffea#705

@alejands linked a pull request on Jan 23, 2024 that will close this issue

@alejands
Collaborator

alejands commented Jan 23, 2024

> Actually, we also need to implement the UL ttbar corrections:
>
> https://github.com/mcremone/decaf/blob/master/analysis/utils/corrections.py#L352-L353
>
> To be found here:
>
> https://twiki.cern.ch/twiki/bin/view/CMS/TopPtReweighting

After going through the twiki above and checking a few TOP PAG twikis, it appears that no updates have been made to the top pt reweighting function for data-NLO (data/POWHEG+Pythia8). The recommendation still matches our code:

def get_ttbar_weight(pt):
    return np.exp(0.0615 - 0.0005 * np.clip(pt, 0, 800))

I did notice this line in the twiki...

New plots with full Run 2 data and different predictions are expected to replace these soon (08/2020).
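
As a usage sketch, the per-top scale factor from `get_ttbar_weight` is commonly combined into a single event weight by taking the geometric mean of the top and antitop factors. That combination is an assumption here, since the thread does not show how decaf applies the function. `ttbar_event_weight`, `top_pt`, and `antitop_pt` are hypothetical names, and the pt values are invented.

```python
import numpy as np


def get_ttbar_weight(pt):
    # Data/POWHEG+Pythia8 scale factor quoted in the thread,
    # with pt clipped to the range [0, 800] GeV.
    return np.exp(0.0615 - 0.0005 * np.clip(pt, 0, 800))


def ttbar_event_weight(top_pt, antitop_pt):
    # Geometric mean of the two per-particle scale factors (assumed combination).
    return np.sqrt(get_ttbar_weight(top_pt) * get_ttbar_weight(antitop_pt))


# Toy generator-level pt values in GeV, one entry per event.
top_pt = np.array([0.0, 150.0, 1200.0])
antitop_pt = np.array([0.0, 90.0, 1000.0])
weights = ttbar_event_weight(top_pt, antitop_pt)
```

The clip means that tops above 800 GeV all receive the same factor, so the last toy event is weighted as if both tops had exactly 800 GeV.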

@alejands
Collaborator

alejands commented Jan 23, 2024

I was able to update corrections.py script to use the uproot package rather than uproot3. I'll be adding my updates in this PR:

@alejands
Collaborator

I noticed these output filenames were changed by @ParticleChef, presumably while testing:

save(ids, "data/test_ids.coffea")

save(corrections, 'data/testcorrections.coffea')

Should these be changed back or left as is?

@alejands
Collaborator

Output filenames above updated in commit 87ddf88.

@mcremone
Owner Author

> I forgot to mention that we also need to update these corrections:
>
> https://github.com/mcremone/decaf/blob/UL/analysis/utils/corrections.py#L311-L320
>
> A good place to start would be asking the boosted Higgs team, or digging into their code:
>
> https://github.com/nsmith-/boostedhiggs/tree/master/boostedhiggs
>
> @alejands @ParticleChef

Here is the way to implement the new corrections:

https://github.com/jennetd/hbb-coffea/blob/master/boostedhiggs/corrections.py#L25-L47

@alejands you can take msdcorr.json from here:

https://github.com/jennetd/hbb-coffea/blob/master/boostedhiggs/data/msdcorr.json

@alejands
Collaborator

The PR has been updated with the new msd corrections (commit 7471585).

The new get_msd_corr() function takes in fatjet coffea objects rather than pt and eta awkward arrays. The scripts in analysis/processors that call this function are updated accordingly, but have not been tested since the compatibility for these scripts has not been updated.

@mcremone
Owner Author

I had a look and this still needs work.

  1. NLO corrections.

I noticed that only EWK corrections are implemented:

https://github.com/mcremone/decaf/blob/UL/analysis/utils/corrections.py#L290-L305

@ParticleChef can you confirm that this is because samples are already NLO in QCD?

Also, systematic variations need to be implemented. They can be taken from here:

https://github.com/mcremone/decaf/blob/master/analysis/utils/corrections.py#L248-L350

  2. JERC

This won't work unfortunately:

https://github.com/mcremone/decaf/blob/UL/analysis/utils/corrections.py#L602-L621

We need to implement this:

https://github.com/nsmith-/boostedhiggs/blob/master/boostedhiggs/build_jec.py

I'll open a new issue for this.

@mcremone
Owner Author

I took care of that. JERCs need to be updated to UL though:

https://github.com/mcremone/decaf/blob/UL/analysis/utils/corrections.py#L658-L799

@alejands can you check which are the recommendations?

@ParticleChef
Collaborator

> I had a look and this still needs work.
>
>   1. NLO corrections.
>
> I noticed that only EWK corrections are implemented:
>
> https://github.com/mcremone/decaf/blob/UL/analysis/utils/corrections.py#L290-L305
>
> @ParticleChef can you confirm that this is because samples are already NLO in QCD?
>
> Also, systematic variations need to be implemented. They can be taken from here:
>
> https://github.com/mcremone/decaf/blob/master/analysis/utils/corrections.py#L248-L350

Yes, I know that the KIT people generated those samples at NLO in QCD, so no additional NLO QCD corrections are applied.

@mcremone
Owner Author

> I took care of that. JERCs need to be updated to UL though:
>
> https://github.com/mcremone/decaf/blob/UL/analysis/utils/corrections.py#L658-L799
>
> @alejands can you check which are the recommendations?

@alejands ping on this.

@mcremone
Owner Author

> Yes, I know that the KIT people generated those samples at NLO in QCD, so no additional NLO QCD corrections are applied.

Good, but I believe we still want systematic variations. I re-implemented those.

@mcremone
Owner Author

@ParticleChef
Collaborator

I'm modifying the b-tagging weight in the corrections file to use the b-tagging JSON file. https://github.com/ParticleChef/decaf/blob/forBtagw/analysis/utils/correctionsBTseperate.py#L35
I checked that it works with the 2018 efficiency file we produced.
It still needs to be checked with the new-version setup, and the up and down variations should be checked as well.
https://github.com/ParticleChef/decaf/blob/forBtagw/analysis/utils/correctionsBTseperate.py#L53

@mcremone
Owner Author

@ParticleChef I strongly suggest you use the btagging weight calculation I implemented in the latest version of corrections.py, there were a lot of things I fixed. Also, the version you have, as well as the current version of corrections.py, won't work with the new hist format, as I was commenting before. In order to fix that, this part should be changed:

https://github.com/mcremone/decaf/blob/UL/analysis/utils/corrections.py#L475-L487

@mcremone
Owner Author

> I'm modifying the b-tagging weight in the corrections file to use the b-tagging JSON file. https://github.com/ParticleChef/decaf/blob/forBtagw/analysis/utils/correctionsBTseperate.py#L35
> I checked that it works with the 2018 efficiency file we produced.
> It still needs to be checked with the new-version setup, and the up and down variations should be checked as well.
> https://github.com/ParticleChef/decaf/blob/forBtagw/analysis/utils/correctionsBTseperate.py#L53

Also, on a separate note, I don't know which btageff2018.merged file you are using here:

https://github.com/ParticleChef/decaf/blob/forBtagw/analysis/utils/correctionsBTseperate.py#L42

If you are using the one that was already in decaf, that was obtained with pre-UL samples. If you generated your own using the KIT UL QCD samples, that wouldn't work either because, as of yesterday, the btageff processor was kind of incorrect. Also, a lot of the KIT UL QCD ROOT files are corrupted, and they make your coffea jobs crash. That means that even if you managed to run the incorrect processor over them, you're missing a lot of data that wasn't processed.

@ParticleChef
Collaborator

I checked the new version of the b-tag weight in the corrections file in your area today, and I will use the new version. The btageff2018.merged file I used was generated with the previous version of btageff.py, so I ran it again and produced the btageff file with the latest version.

I have one question about drawing the 2D efficiency plot.
The histograms in btageff2018.merged are stored in a dictionary whose keys are the names of the reduced files:

deepflav = hists['deepflav']
print(deepflav)

>>
{'TTTo2L2Nu_TuneCP5_13TeV-powheg-pythia8.reduced': Hist(
  StrCategory(['loose', 'medium', 'tight'], growth=True),
  StrCategory(['pass', 'fail'], growth=True),
  IntCategory([0, 4, 5, 6]),
  Variable([20, 30, 50, 70, 100, 140, 200, 300, 600, 1000]),
  Variable([0, 1.4, 2, 2.5]),
  storage=Double()) # Sum: 1111061577.0 (1111114875.0 with flow), 'TTToHadronic_TuneCP5_13TeV-powheg-pythia8.reduced': Hist(
  StrCategory(['loose', 'medium', 'tight'], growth=True),
  StrCategory(['pass', 'fail'], growth=True),
  IntCategory([0, 4, 5, 6]),
  Variable([20, 30, 50, 70, 100, 140, 200, 300, 600, 1000]),
  Variable([0, 1.4, 2, 2.5]),
  storage=Double()) # Sum: 4569717606.0 (4569910719.0 with flow), 'TTToSemiLeptonic_TuneCP5_13TeV-powheg-pythia8.reduced': Hist(
  StrCategory(['loose', 'medium', 'tight'], growth=True),
  StrCategory(['pass', 'fail'], growth=True),
  IntCategory([0, 4, 5, 6]),
  Variable([20, 30, 50, 70, 100, 140, 200, 300, 600, 1000]),
  Variable([0, 1.4, 2, 2.5]),
  storage=Double()) # Sum: 4890351531.0 (4890566991.0 with flow)}

So it should be used like this:

deepflav = hists['deepflav']['TTTo2L2Nu_TuneCP5_13TeV-powheg-pythia8.reduced']
tight_pass = deepflav[{'wp': 'tight', 'btag': 'pass'}]

Do you have any idea how to merge all the datasets?

@mcremone
Owner Author

mcremone commented Feb 28, 2024

The new version of corrections.py already ingests the new format and merges everything:

https://github.com/mcremone/decaf/blob/UL/analysis/utils/corrections.py#L487-L498
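
The idea of merging all datasets before computing the efficiency can be sketched with plain NumPy arrays standing in for the per-dataset histograms. This is only an illustration and not the code at the link above: the shortened dataset keys, the counts, and the array layout (pass and total counts per pt bin) are all invented. Histogram objects that share identical axes can typically be summed in the same way with the `+` operator.

```python
import numpy as np

# Toy per-dataset counts in four pt bins (invented numbers).
passed = {
    "TTTo2L2Nu": np.array([40.0, 90.0, 30.0, 0.0]),
    "TTToHadronic": np.array([160.0, 310.0, 0.0, 0.0]),
    "TTToSemiLeptonic": np.array([200.0, 400.0, 0.0, 0.0]),
}
total = {
    "TTTo2L2Nu": np.array([100.0, 150.0, 40.0, 0.0]),
    "TTToHadronic": np.array([400.0, 500.0, 0.0, 0.0]),
    "TTToSemiLeptonic": np.array([500.0, 650.0, 0.0, 0.0]),
}

# Sum over datasets first, then take the ratio, so that each dataset
# contributes in proportion to its statistics.
merged_pass = sum(passed.values())
merged_total = sum(total.values())

# Guard against empty bins, as in the earlier snippet in this thread.
eff = merged_pass / np.maximum(merged_total, 1.0)
```

Summing before dividing is a deliberate choice in this sketch: averaging per-dataset efficiencies instead would give low-statistics datasets the same influence as high-statistics ones.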

@ParticleChef
Collaborator

I quickly checked that "deepJet_comb" only contains hadron flavors 4 and 5 in the JSON file, so I should use "deepJet_incl" for the light SF (hadron flavor 0).
I think this also causes the error. Has it been solved already?
https://github.com/mcremone/decaf/blob/UL/analysis/utils/corrections.py#L498
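
One way to handle the two correction names is to evaluate each one on its own flavor subset and scatter the results back into a full-length array. This is a minimal sketch rather than the implementation in corrections.py: `eval_comb` and `eval_incl` are hypothetical callables standing in for `btvjson["deepJet_comb"].evaluate` and `btvjson["deepJet_incl"].evaluate`, their argument order copies the call shown earlier in this thread, and the returned constants are invented.

```python
import numpy as np


def eval_comb(syst, wp, flav, abseta, pt):
    """Hypothetical stand-in for the b/c ("deepJet_comb") evaluation."""
    return np.full(len(pt), 0.95)


def eval_incl(syst, wp, flav, abseta, pt):
    """Hypothetical stand-in for the light ("deepJet_incl") evaluation."""
    return np.full(len(pt), 1.10)


def btag_sf(flav, abseta, pt, syst="central", wp="M"):
    """Per-jet scale factor with the same shape as the inputs."""
    sf = np.ones(len(pt))
    heavy = (flav == 4) | (flav == 5)
    light = flav == 0
    # Each correction only ever sees the flavors it is defined for.
    sf[heavy] = eval_comb(syst, wp, flav[heavy], abseta[heavy], pt[heavy])
    sf[light] = eval_incl(syst, wp, flav[light], abseta[light], pt[light])
    return sf


flav = np.array([0, 4, 5, 0, 5])
abseta = np.array([0.1, 1.2, 2.0, 0.5, 1.9])
pt = np.array([35.0, 60.0, 120.0, 45.0, 300.0])
sf = btag_sf(flav, abseta, pt)
```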

@mcremone
Owner Author

mcremone commented Mar 5, 2024

It depends on what you are loading here:

https://github.com/mcremone/decaf/blob/UL/analysis/utils/corrections.py#L475-L476

Also, which error are you referring to?

@mcremone
Owner Author

mcremone commented Mar 5, 2024

@ParticleChef
Collaborator

I updated the code and got another error when running corrections.py.
The line from correctionlib import convert fails:

[jhong@cmslpc115 analysis]$ python3 utils/corrections.py 
Traceback (most recent call last):
  File "utils/corrections.py", line 3, in <module>
    from correctionlib import convert
  File "/uscms/home/jhong/.local/lib/python3.8/site-packages/correctionlib/convert.py", line 19, in <module>
    from .schemav2 import (
  File "/uscms/home/jhong/.local/lib/python3.8/site-packages/correctionlib/schemav2.py", line 37, in <module>
    class Variable(Model):
  File "/cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/py3-pydantic/1.8/lib/python3.8/site-packages/pydantic/main.py", line 287, in __new__
    fields[ann_name] = ModelField.infer(
  File "/cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/py3-pydantic/1.8/lib/python3.8/site-packages/pydantic/fields.py", line 392, in infer
    return cls(
  File "/cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/py3-pydantic/1.8/lib/python3.8/site-packages/pydantic/fields.py", line 327, in __init__
    self.prepare()
  File "/cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/py3-pydantic/1.8/lib/python3.8/site-packages/pydantic/fields.py", line 432, in prepare
    self._type_analysis()
  File "/cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/py3-pydantic/1.8/lib/python3.8/site-packages/pydantic/fields.py", line 532, in _type_analysis
    if issubclass(origin, Tuple):  # type: ignore
  File "/cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/python3/3.8.2-bcolbf/lib/python3.8/typing.py", line 771, in __subclasscheck__
    return issubclass(cls, self.__origin__)
TypeError: issubclass() arg 1 must be a class

@mcremone
Owner Author

mcremone commented Mar 7, 2024

To address this I have already changed the setup file:

https://github.com/mcremone/decaf/blob/UL/setup.sh#L8

@mcremone
Owner Author

It needed a lot of work, but now the b-tagging class works.

@ParticleChef
Collaborator

For the NLO scale factor in corrections.py, the method using extractor runs without errors.
The previous method fails with ValueError: object of too small depth for desired array at the line nlo_ewk = get_nlo_ewk_weight['w'](ak.max(genWs.pt, axis=1)). The extractor-based version is:

from coffea.lookup_tools import extractor

# ROOT files with the NLO EWK k-factors, one entry per process
nlo_ewk_hists = {
    'dy': ["* * data/vjets_SFs/merged_kfactors_zjets.root"],
    'w': ["* * data/vjets_SFs/merged_kfactors_wjets.root"],
    'z': ["* * data/vjets_SFs/merged_kfactors_zjets.root"],
    'a': ["* * data/vjets_SFs/merged_kfactors_gjets.root"],
}
get_nlo_ewk_weight = {}
for p in ['dy', 'w', 'z', 'a']:
    print(nlo_ewk_hists[p])
    # build one evaluator per process and keep only the EWK k-factor
    ext = extractor()
    ext.add_weight_sets(nlo_ewk_hists[p])
    ext.finalize()
    get_nlo_ewk_weight[p] = ext.make_evaluator()["kfactor_monojet_ewk"]

@mcremone
Owner Author

@ParticleChef There should not be any nlo_ewk = get_nlo_ewk_weight['w'](ak.max(genWs.pt, axis=1)) line in your processor anymore. I made modifications a couple of days ago and now it looks like this:

https://github.com/mcremone/decaf/blob/UL/analysis/processors/hadmonotopv2.py#L671-L676

Also, is extractor a functionality from coffea.lookup_tools? If yes, since you're modifying this part, it would be good to use correctionlib tools. You can follow this as an example:

https://github.com/mcremone/decaf/blob/UL/analysis/utils/corrections.py#L23-L39

@mcremone
Owner Author

mcremone commented Apr 1, 2024

I took care of it:

https://github.com/mcremone/decaf/blob/UL/analysis/utils/corrections.py#L369-L507
