Cluster charge cut, take two #162

areinsvo · 2018-08-27T16:53:45Z

The commits got messed up in #161. The code on this branch is unchanged with respect to the previous pull request, but the history should now be fixed. Thanks @kmcdermo for your help.

This PR includes the implementation of the cluster charge cut (CCC) that I have been working on and presented at the last two group meetings. The default value of the cut is 1620, and the cut is applied only when the --apply-ccc flag is passed to writeMemoryFile. The flag can also take an argument to change the value of the cut.
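For readers unfamiliar with flags that take an optional value, the behavior described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the struct CCCOptions and the functions isUnsignedInteger and parseCCCOptions are hypothetical names, not taken from WriteMemoryFile.cc. The real code uses a helper called next_arg_option, whose signature is not shown in this thread, so the sketch does not rely on it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container for the two settings described in the PR text.
struct CCCOptions {
  bool apply = false;   // the cut is off unless --apply-ccc is given
  int cutValue = 1620;  // default value quoted in the PR description
};

// True if `s` is a non-empty string of decimal digits.
inline bool isUnsignedInteger(const std::string& s) {
  if (s.empty()) return false;
  for (char c : s)
    if (c < '0' || c > '9') return false;
  return true;
}

// Scan the arguments for --apply-ccc and an optional numeric value after it.
inline CCCOptions parseCCCOptions(const std::vector<std::string>& args) {
  CCCOptions opts;
  for (std::size_t i = 0; i < args.size(); ++i) {
    if (args[i] != "--apply-ccc") continue;
    opts.apply = true;
    // Consume the next token only if it looks like a cut value, not another flag.
    if (i + 1 < args.size() && isUnsignedInteger(args[i + 1])) {
      opts.cutValue = std::stoi(args[i + 1]);
      assert(opts.cutValue >= 0);  // a digits-only token cannot be negative
      ++i;
    }
  }
  return opts;
}
```

With this sketch, passing --apply-ccc alone keeps the default of 1620, while --apply-ccc followed by a number overrides it. A following token that starts with "--" is left for the next iteration to handle.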

In addition, there is a minor change to tkNtuple/Makefile to properly include ../lib in the path. I've also included Mario's script to compare efficiencies, duplicate rates, and fake rates before and after changes. This can now be found in the plotting directory.

Merging this pull request should wait until the changes to the benchmarking procedure are merged and after phiphi is available for testing.

cerati · 2018-08-27T16:59:58Z

Do we want to have a place where the negative hit indices are documented?
This PR is adding '-9' to the list.

I am not suggesting to delay this PR, just bringing up the issue.
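Besides a text file, one lightweight way to keep such markers discoverable is to name them in code. The sketch below is an illustration only: the constant kHitFailedCCC and the helper functions are hypothetical names, and -9 is the only value taken from this thread. That it marks a hit rejected by the cluster charge cut is an assumption based on the PR topic. Other negative values are treated simply as "not a stored hit", with no claim about what they mean.

```cpp
#include <cassert>
#include <vector>

// Hypothetical name; -9 is the value this PR adds to the negative hit indices.
// That it marks a hit rejected by the cluster charge cut is an assumption.
constexpr int kHitFailedCCC = -9;

// Any non-negative index is taken to point at a stored hit.
inline bool isStoredHit(int hitIdx) { return hitIdx >= 0; }

// Sanity check: the marker must never be mistaken for a stored hit.
inline void checkMarkers() { assert(!isStoredHit(kHitFailedCCC)); }

// Number of entries on a track that refer to stored hits.
inline int countStoredHits(const std::vector<int>& hitIndices) {
  int n = 0;
  for (int idx : hitIndices)
    if (isStoredHit(idx)) ++n;
  return n;
}

// Number of entries on a track that carry the CCC marker.
inline int countFailedCCC(const std::vector<int>& hitIndices) {
  int n = 0;
  for (int idx : hitIndices)
    if (idx == kHitFailedCCC) ++n;
  return n;
}
```

Keeping the two counts separate makes it easy to decide, in one place, whether marked hits should contribute to quantities such as the number of hits on a track.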

kmcdermo · 2018-08-27T18:04:26Z

@areinsvo thanks! This looks much better :). Feel free to close #161. From a review standpoint: this looks ready to go, although as already mentioned, let's wait until #154 and #160 are merged, rebase again (such fun), and then run the validation.

@cerati I think this is a fair point. Thankfully, with #160, we now have a pretty thorough documentation of the code, and I think we can add it to the list of things that need to be added after this PR and #160 are merged.

I will open an issue on what still needs to go into the README. Documenting the hit index meanings can also go in another .txt file, in the vein of cmssw-trackerinfo-desc.txt and validation-desc.txt, appending the list of resources in Section 8 of the README to point to this new file.

cerati · 2018-08-31T15:30:14Z

We need to remember to update the memory files when we merge this PR (otherwise it will have no effect...)

kmcdermo · 2018-08-31T16:05:11Z

> We need to remember to update the memory files when we merge this PR (otherwise it will have no effect...)

For sure, this is a few-line change in ./xeon_scripts/benchmark-cmssw-ttbar-fulldet-build.sh and ./val_scripts/validation-cmssw-benchmarks.sh.

@slava77 Do we have the memory files with this cut for all the different samples we use? We also need to copy them to the disks of phi1 and phi2 for benchmarking. Or, as @osschar suggested, we could always alias just one set and have the different platforms read across machines, although this would fold I/O effects into the benchmarks as well (especially since the compute tests run concurrently on different platforms).

samples (off the top of my head)

ttbar PU 70 HS
ttbar PU 35
ttbar noPU
10mu 0.5 < pt < 10
10 mu low pt

slava77 · 2018-08-31T16:07:26Z

> @slava77 Do we have the memory files with this cut for all the different samples we use?

I didn't try to remake the files yet.
Should I?

areinsvo · 2018-08-31T16:21:14Z

@slava77 do you have a script to generate all of the necessary memory files? If not, could you specify which flags are used for the different naming conventions? I'll probably add a string to the name to indicate which value of the CCC was used.

It doesn't matter to me which of us makes them. I already have some of the files, but I would probably regenerate them anyway to be 100% sure the appropriate flags are set.

slava77 · 2018-08-31T16:29:44Z

On 8/31/18 9:21 AM, areinsvo wrote:

> @slava77 do you have a script to generate all of the necessary memory files? If not, could you specify which flags are used for the different naming conventions? I'll probably add a string to the name to indicate which value of the CCC was used.
>
> It doesn't matter to me which of us makes them. I already have some of the files, but I would probably regenerate them anyway to be 100% sure the appropriate flags are set.

I produced two kinds of files in the past:

1. saving only sim tracks with at least N (3?) hits:

    ./writeMemoryFile --input $fIn --output $fOut --write-rec-tracks --clean-sim-tracks --write-all-events

   this is what .clean. stands for in the file names
2. files with all sim tracks from the tracking ntuple; this does not have --clean-sim-tracks

The one-liner to process everything for the last round was:

    find /pathToTrackingNtuples/ -name trackingNtuple.root | while read -r fIn; do fOut=`echo $fIn | sed -e 's/trackingNtuple.root/memoryFile.fv3.clean.writeAll.recT.081318-4551fbf.bin/g'`; [ -f "$fOut" ] && echo "$fOut exists: skipping" && continue; echo $fIn $fOut > ${fOut}.log; date >> ${fOut}.log; ./writeMemoryFile --input $fIn --output $fOut --write-rec-tracks --clean-sim-tracks --write-all-events --verbosity 2 | grep -v "^SKIP\|^lay=\|^track #" >> ${fOut}.log 2>&1; done

areinsvo · 2018-09-06T18:44:31Z

@slava77 apologies for not getting to this earlier in the week. Thanks for sending the command to generate the memory files. I don't have permission to write in the directories within /data2/slava77/samples/2017/pass-c93773a/initialStep/ but the files can be found at /home/users/areinsvo/memoryFiles. Would you be able to move them to your area so everything is together? I generated four memory files for each sample: with and without clean tracks x with and without the CCC. As @kmcdermo pointed out, these might need to be copied to phi1 and phi2 as well.

slava77 · 2018-09-10T23:44:09Z

tkNtuple/WriteMemoryFile.cc

+        {
+          applyCCC = true;
+	  if( next_arg_option(mArgs, i))
+	    {


the indentation looks off

slava77 · 2018-09-10T23:46:53Z

tkNtuple/WriteMemoryFile.cc

+	layerHits_[ilay].push_back(hit);
+	MCHitInfo hitInfo(simTkIdx, ilay, layerHits_[ilay].size()-1, totHits);
+	simHitsInfo_.push_back(hitInfo);
+	totHits++;


please indent the block inside the if

areinsvo · 2018-09-11

@slava77 I fixed the first indentation, but in lines 950 to 953 the block is already indented, as far as I can tell.

slava77 · 2018-09-11

That's not what I see in the browser.
It seems to be a tab vs. space issue.

areinsvo · 2018-09-11T20:31:00Z

It looked indented to me in the browser, but I made another commit where the indentation should be more obvious.

slava77 · 2018-09-11T20:41:02Z

> It looked indented to me in the browser, but I made another commit where the indentation should be more obvious.

thank you.
it looks indented to me now.

slava77 · 2018-09-11T23:46:45Z

I finally finished redistributing the files

phi3:/data2/slava77/samples/2017/pass-c93773a/initialStep
phi2:/data1/work/slava77/samples/2017/pass-c93773a/initialStep
phi1:/data2/nfsmic/slava77/samples/2017/pass-c93773a/initialStep

perhaps someone with sudo could ln -s to avoid the differences.
Full file relocation may be less trivial because of possible underlying hardware differences.
I think that the files should be on the faster disks.

In each directory there are two files:

memoryFile.fv3.clean.writeAll.recT.082418-25daeda.bin (without CCC)
memoryFile.fv3.clean.writeAll.CCC1620.recT.082418-25daeda.bin (with CCC)

@areinsvo thank you for making these.
I didn't copy the versions without "clean.writeAll", because I don't think these are actually useful anymore.
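The naming pattern of the two files listed above, with the CCC value embedded in the name, could be generated along these lines. This is a sketch under assumptions: the function memoryFileName and its parameters are hypothetical, and only the "clean.writeAll" variants named in this comment are covered. The pieces "fv3", "clean.writeAll" and "recT" are copied from the file names rather than derived from any flags.

```cpp
#include <cassert>
#include <string>

// Build a memory-file name following the pattern of the two files listed above.
// `tag` is the date-commit suffix (e.g. "082418-25daeda"); `applyCCC` and
// `cccCut` are hypothetical parameters introduced for this sketch.
inline std::string memoryFileName(const std::string& tag, bool applyCCC, int cccCut = 1620) {
  assert(!tag.empty());  // a name without the date-commit tag would be ambiguous
  std::string name = "memoryFile.fv3.clean.writeAll.";
  // Insert the cut value only when the cut was applied, e.g. "CCC1620.".
  if (applyCCC) name += "CCC" + std::to_string(cccCut) + ".";
  name += "recT." + tag + ".bin";
  return name;
}
```

Encoding the cut value in the name means files made with different thresholds can sit side by side in the same directory without being confused.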

areinsvo · 2018-09-12T21:28:49Z

I tried running the benchmark scripts on the new memory files, but I ran into an issue with the validation. With @kmcdermo's help I narrowed down the problem, but it hasn't been solved yet. The summary of the problem is below:

validation fails with MEIF + CCC 1620 file + CMSSW tracks for SIMVAL
validation works with all other combinations of options, including
- MEIF + file without CCC 1620 + CMSSW tracks for SIMVAL
- MEIF + CCC 1620 file + std/ce tracks for SIMVAL
- no MEIF + CCC 1620 file + CMSSW tracks for SIMVAL

For now, the plots I have (everything except the SIMVAL plots) can be found at http://areinsvo.web.cern.ch/areinsvo/pull162/
I will generate the SIMVAL plots without MEIF and include those there as well.

areinsvo · 2018-09-13T18:36:35Z

The SIMVAL plots for events in flight = 1 can now be found at http://areinsvo.web.cern.ch/areinsvo/pull162/SIMVAL_NoMEIF/SIMVAL/

Also, I committed changes to the benchmark and validation scripts to point to the new memory files (thanks @slava77 for copying those), and I updated the documentation slightly to clarify a few minor issues I had setting up my website for plots.

areinsvo · 2018-09-17T15:18:48Z

@kmcdermo fixed the issues with MEIF and found one issue with the CCC. His changes have been merged with my branch. The full set of validation plots after his changes can be found here:
https://kmcdermo.web.cern.ch/kmcdermo/mictrk/PR162_fixMEIFval_v2/

If you want to compare these to the validation plots for the new memory file without the cluster charge cut, those can be found here:
http://areinsvo.web.cern.ch/areinsvo/pull162NoCCC_AfterPR/

As expected, the CMSSW performance lines are exactly the same between the two memory files. The weird behavior we saw at the end of last week must have been due to the bugs that Kevin fixed.

kmcdermo · 2018-09-17T16:56:46Z

Ah, this is great! Okay, I will merge this now.

One point to mention: we noticed that the simtrack validation changed slightly between the old and new versions of the memory file.

Compare:

1. This PR + old memory file: https://kmcdermo.web.cern.ch/kmcdermo/mictrk/pr160/forPR/SIMVAL/SKL-SP_CMSSW_TTbar_PU70_eff_eta_build_pt0p0_SIMVAL.png
2. This PR + new no CCC applied memory file: http://areinsvo.web.cern.ch/areinsvo/pull162NoCCC_AfterPR/SIMVAL/SKL-SP_CMSSW_TTbar_PU70_eff_eta_build_pt0p0_SIMVAL.png
3. This PR + new CCC applied memory file: https://kmcdermo.web.cern.ch/kmcdermo/mictrk/PR162_fixMEIFval_v2/SIMVAL/SKL-SP_CMSSW_TTbar_PU70_eff_eta_build_pt0p0_SIMVAL.png

As Allie already stated, the cmssw track efficiencies are the same in 2. and 3. (as they should be!), and the mkFit tracks move up in efficiency. This same effect was already seen with the old memory files, applying the CCC by hand.

However, both mkFit and cmssw tracks change w.r.t. 1., although this looks mostly statistical. Perhaps different events/tracks were written to the memory file? Just looking at the top lines of the text file dumps, it is clear different tracks and/or events were saved.

So, I think this can be safely merged, although it would be great to understand why the memory file would write different tracks/events.

slava77 · 2018-09-17T17:03:47Z

On 9/17/18 9:56 AM, Kevin McDermott wrote:

> However, both mkFit and cmssw tracks change w.r.t. 1., although this looks mostly statistical. Perhaps different events/tracks were written to the memory file? Just looking at the top lines of the text file dumps, it is clear different tracks and/or events were saved.

yes, the ttbar PU70HS sample was remade: the minbias events are not the same.

kmcdermo · 2018-09-17T17:24:46Z

ah, okay, thanks for the clarification!

areinsvo added 4 commits September 6, 2018 12:01

First attempt at implementing CCC

68e2bb1

Count failed CCC hits when marking sim tracks as findable

4b68679

Make CCC value configurable

fbd083c

Clean up code for pull request; add marios plotting script

25daeda

areinsvo force-pushed the CCCv2 branch from e848ee2 to 25daeda September 6, 2018 19:17

slava77 reviewed Sep 10, 2018

areinsvo added 2 commits September 11, 2018 12:37

Fix indentation

f37f688

Fix indentation again

c2bef6e

Point benchmarks and val scripts to new memory files, update documentation

d81102e

kmcdermo and others added 4 commits September 15, 2018 10:37

fix MEIF with ROOT validation:

b903242

* deregister context of TFile to allow writing from multiple threads
* make TFile's and TTree's std::unique_ptr
* set directory of TTree to 0 when initialized, then set to file at the end

do not get ==-9 for last hit on track during validation

8617a3b

update documentation for hit indices + track indices to new file: index-desc.txt

5939c1d

Merge pull request #1 from kmcdermo/CCCv2_fixMEIFval_v2

9453a9e

Fix MEIF with ROOT validation + CCC cut

kmcdermo merged commit 6dfdc46 into trackreco:devel Sep 17, 2018

areinsvo deleted the CCCv2 branch September 17, 2018 17:41

slava77 mentioned this pull request Feb 27, 2019

binary file feature requests 1Q18v1 #121

Open

5 tasks
