Disabled the TED and TRE tables by default. #30818

aehart · 2020-07-18T15:20:16Z

PR description:

Addresses issues #30742 and #30744. After discussion with @skinnari and @tomalin, we decided that the tables in the TrackletEngineDisplaced and TripletEngine can be disabled by default. These tables are meant to reduce the occupancy for downstream modules while minimally impacting the ultimate tracking efficiency. This will be necessary for the eventual FPGA design, but given the problems with these tables in terms of memory usage and the effect on tracking efficiency, we think disabling them in the emulation is the best solution for now.

I also increased the maximum number of tracklets in the case of the extended algorithm. This is a temporary change that is necessary to handle the additional tracklets that will eventually be removed by the now-disabled tables.

Finally, there are a few other minor changes meant to reduce the memory usage of these tables. Several other changes are foreseen in the near future. But since the tables are disabled by default now, this is somewhat moot for the time being.

PR validation:

I ran the L1TrackNtupleMaker analyzer over 100 ttbar events with 200 PU. With the baseline, non-extended algorithm, the results are exactly identical, as expected. With the extended algorithm, I see approximately half the memory usage in top, and the memory usage does not appear to grow over time as seen before.

I also ran over a sample of displaced muons with no PU with the extended algorithm. With the tables disabled, as is now the default, we see good efficiency for tracks with d₀ up to ~5 cm.

cmsbuild · 2020-07-18T15:20:38Z

A new Pull Request was created by @aehart (Andrew Hart) for CMSSW_11_1_X.

It involves the following packages:

L1Trigger/TrackFindingTracklet

@cmsbuild, @rekovic, @benkrikler can you please review it and eventually sign? Thanks.
@Martin-Grunewald this is something you requested to watch as well.
@silviodonato, @dpiparo you are the release manager for this.

cms-bot commands are listed here

dpiparo · 2020-07-19T06:07:18Z

@aehart this code will be ran on the grid. For this reason, before running any production, we need to know

the cpu efficiency of the production job after this change
the RSS footprint of the L1 step as we imagine it in production (it has to be decoupled from Reco since it requires too many resoureces)
thanks.

davidlange6 · 2020-07-20T12:59:23Z

I would propose that this PR removes the python config parameter that can enable the massive data structure via a "simple python change pull request" (TM).

aehart · 2020-07-20T14:51:31Z

@dpiparo I tested a single-threaded (because L1FPGATrackProducer is an edm::one::EDProducer) job running only the L1TrackTrigger step over 10 ttbar events with 200 PU:

I'm not sure what the best way to measure CPU and real time is, but I used the FastTimerService to write out a summary JSON. With that I see the following efficiencies:
- total: 213794.773163 / 217487.481959 = 98.3%
- TTTracksFromTrackletEmulation: 17710.210555 / 17716.911111 = 100%
- TTTracksFromExtendedTrackletEmulation: 21206.114799 / 21257.489168 = 99.8%
So the extended algorithm is marginally less efficient than the baseline algorithm, which was not touched, but the overall efficiency of this step looks pretty good to me. The efficiency reported by time for the same job is 90%, I guess because it includes additional CMSSW overhead that the FastTimerService is able to account for. But if there's a better way to measure these times, please let me know.
For the RSS footprint, I used igprof and the IgProfService to output the memory profile after 5/10 events, and the total usage at that point is 1.4 GB. This agrees with what I observe with top during running.

If you need any additional information or if you want the tests run again with different parameters, please let me know.

@davidlange6 I'm not sure I understand what you mean. These TED and TRE tables, which were responsible for the huge memory usage before, can now only be enabled by flipping a switch in Settings.h and recompiling.

davidlange6 · 2020-07-20T14:56:59Z

On Jul 20, 2020, at 4:51 PM, Andrew Hart <[email protected]<mailto:[email protected]>> wrote: @dpiparo<https://github.com/dpiparo> I tested a single-threaded (because L1FPGATrackProducer is an edm::one::EDProducer) job running only the L1TrackTrigger step over 10 ttbar events with 200 PU: 1. I'm not sure what the best way to measure CPU and real time is, but I used the FastTimerService to write out a summary JSON. With that I see the following efficiencies: * total: 213794.773163 / 217487.481959 = 98.3% * TTTracksFromTrackletEmulation: 17710.210555 / 17716.911111 = 100% * TTTracksFromExtendedTrackletEmulation: 21206.114799 / 21257.489168 = 99.8% So the extended algorithm is marginally less efficient than the baseline algorithm, which was not touched, but the overall efficiency of this step looks pretty good to me. The efficiency reported by time for the same job is 90%, I guess because it includes additional CMSSW overhead that the FastTimerService is able to account for. But if there's a better way to measure these times, please let me know. 2. But do your FastTimerService results include the framework stalling while your module to runs? Since its the slowest thing in L1 and a one module, those “CMSSW overheads” are likely partially from L1FPGATrackProducer) 1. For the RSS footprint, I used igprof and the IgProfService to output the memory profile after 5/10 events, and the total usage at that point is 1.4 GB. This agrees with what I observe with top during running. If you need any additional information or if you want the tests run again with different parameters, please let me know. @davidlange6<https://github.com/davidlange6> I'm not sure I understand what you mean. These TED and TRE tables, which were responsible for the huge memory usage before, can now only be enabled by flipping a switch in Settings.h and recompiling. Great- I misunderstood before — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub<#30818 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/ABGPFQ2WMYSHBXR4I6LRSHDR4RKYLANCNFSM4PAA36GQ>.

silviodonato · 2020-07-20T15:50:14Z

please test

cmsbuild · 2020-07-20T15:50:38Z

The tests are being triggered in jenkins.

CMSSW_11_1_X_2020-07-20-1100/slc7_amd64_gcc820: https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/8136/console Started: 2020/07/20 17:58

aehart · 2020-07-20T16:01:23Z

But do your FastTimerService results include the framework stalling while your module to runs? Since its the slowest thing in L1 and a one module, those “CMSSW overheads” are likely partially from L1FPGATrackProducer)

I'm honestly not sure. I admit to using it without fully understanding how it works (shame 🔔).

But looking at the results that it spits out (which you can find at /afs/cern.ch/user/a/ahart/public/1thread/resources.json on LXPLUS), I see CPU and real times for each module that runs, including PoolSource and PoolOutputModule, as well as an "other" category, which includes some kind of overhead. And then the times for the "total" are just the sums for all the modules plus the "other" category.

Is there a better way to measure the timing?

cmsbuild · 2020-07-20T17:13:55Z

+1
Tested at: 0fe1975
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-c7382d/8136/summary.html
CMSSW: CMSSW_11_1_X_2020-07-20-1100
SCRAM_ARCH: slc7_amd64_gcc820

cmsbuild · 2020-07-20T17:13:58Z

Comparison job queued.

cmsbuild · 2020-07-20T19:00:11Z

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-c7382d/8136/summary.html

Comparison Summary:

No significant changes to the logs found
Reco comparison results: 2 differences found in the comparisons
DQMHistoTests: Total files compared: 36
DQMHistoTests: Total histograms compared: 2780792
DQMHistoTests: Total failures: 3
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 2780739
DQMHistoTests: Total skipped: 50
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 35 files compared)
Checked 152 log files, 16 edm output root files, 36 DQM output files

silviodonato · 2020-07-21T12:45:24Z

@aehart please make a PR also for master (ie. CMSSW_11_2_X)

aehart · 2020-07-21T14:01:18Z

@silviodonato Done: #30847

cmsbuild · 2020-07-22T23:14:55Z

Pull request #30818 was updated. @cmsbuild, @rekovic, @benkrikler can you please check and sign again.

silviodonato · 2020-07-27T10:17:19Z

please test

cmsbuild · 2020-07-27T10:17:46Z

The tests are being triggered in jenkins.

CMSSW_11_1_X_2020-07-27-1100/slc7_amd64_gcc820: https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/8303/console Started: 2020/07/27 13:02

cmsbuild · 2020-07-27T12:20:30Z

+1
Tested at: 989cf34
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-c7382d/8303/summary.html
CMSSW: CMSSW_11_1_X_2020-07-27-1100
SCRAM_ARCH: slc7_amd64_gcc820

cmsbuild · 2020-07-27T12:20:33Z

Comparison job queued.

cmsbuild · 2020-07-27T14:07:02Z

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-c7382d/8303/summary.html

Comparison Summary:

No significant changes to the logs found
Reco comparison results: 0 differences found in the comparisons
DQMHistoTests: Total files compared: 36
DQMHistoTests: Total histograms compared: 2780792
DQMHistoTests: Total failures: 1
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 2780741
DQMHistoTests: Total skipped: 50
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 35 files compared)
Checked 152 log files, 16 edm output root files, 36 DQM output files

cmsbuild · 2020-07-28T11:41:07Z

Pull request #30818 was updated. @cmsbuild, @rekovic, @benkrikler can you please check and sign again.

silviodonato · 2020-08-06T09:30:54Z

Since we started the re-reco and re-L1 of the HLT TDR samples, I think we can close this PR.
It was discussed at the ORP meeting to avoid to merge this backport - unless we get further memory issues - because it might affect the tracking efficiency.
Please let me know if we need to open it back.
@rekovic

tomalin · 2020-08-06T12:35:25Z

I leave it to @aehart to comment further. I'm surprised by your decision to reject this PR. I understood that even with single-threaded running, these tables waste about 2 GB of memory, which must make running on the GRID challenging. And Andrew Hart showed that disabling them doesn't adversely affect the displaced L1 tracking efficiency. (For prompt L1 tracking, these tables are irrelevant).

aehart · 2020-08-06T14:19:46Z

I'm also surprised by this decision. My understanding was that this was a necessary and urgent change to enable the extended L1 tracking to run for production jobs. If the extended tracking will not be run for production jobs, then it doesn't matter either way. But I thought the goal was to have it run.

skinnari · 2020-08-19T12:53:08Z

@rekovic @silviodonato we are all surprised that this was not integrated, as it seemed it was an urgent and necessary requirement for running production jobs? can someone please clarify, so that we know how to proceed? should this be opened as a PR to master instead at least?

aehart · 2020-08-19T15:10:49Z

@skinnari regarding the last question, there is an open PR to the master branch: #30847

It would still be good though to get clarification on the decision made for this PR and what we need to do to get #30847 merged @rekovic @silviodonato

Added a flag for enabling the application of the TED and TRE tables.

0fe1975

cmsbuild added this to the CMSSW_11_1_X milestone Jul 18, 2020

cmsbuild added comparison-pending l1-pending orp-pending pending-signatures tests-pending labels Jul 18, 2020

aehart mentioned this pull request Jul 18, 2020

Igprof analysis of 11_1_0_p3_ROOT618 : trklet::TrackletEngineDisplaced::readTables() #30742

Closed

cmsbuild added tests-started and removed tests-pending labels Jul 20, 2020

cmsbuild added tests-approved and removed tests-started labels Jul 20, 2020

cmsbuild added comparison-available and removed comparison-pending labels Jul 20, 2020

aehart mentioned this pull request Jul 21, 2020

Added a flag for enabling the application of the TED and TRE tables. #30847

Closed

Fixed code format issues.

50faab3

cmsbuild removed the comparison-available label Jul 21, 2020

cmsbuild added comparison-pending tests-pending and removed comparison-available tests-approved labels Jul 22, 2020

cmsbuild added tests-started and removed tests-pending labels Jul 27, 2020

cmsbuild added tests-approved and removed tests-started labels Jul 27, 2020

cmsbuild added comparison-available and removed comparison-pending labels Jul 27, 2020

Removed confusing comment.

23f4adc

cmsbuild added comparison-pending tests-pending and removed comparison-available tests-approved labels Jul 28, 2020

silviodonato closed this Aug 6, 2020

aehart mentioned this pull request Oct 18, 2020

Disabled the TED and TRE tables by default. cms-L1TK/cmssw#47

Merged

aehart deleted the extended_tables_11_1_X branch October 23, 2020 14:57

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Disabled the TED and TRE tables by default. #30818

Disabled the TED and TRE tables by default. #30818

aehart commented Jul 18, 2020

cmsbuild commented Jul 18, 2020

dpiparo commented Jul 19, 2020

davidlange6 commented Jul 20, 2020

aehart commented Jul 20, 2020

davidlange6 commented Jul 20, 2020 via email

silviodonato commented Jul 20, 2020

cmsbuild commented Jul 20, 2020 •

edited

Loading

aehart commented Jul 20, 2020

cmsbuild commented Jul 20, 2020

cmsbuild commented Jul 20, 2020

cmsbuild commented Jul 20, 2020

silviodonato commented Jul 21, 2020

aehart commented Jul 21, 2020

cmsbuild commented Jul 22, 2020

silviodonato commented Jul 27, 2020

cmsbuild commented Jul 27, 2020 •

edited

Loading

cmsbuild commented Jul 27, 2020

cmsbuild commented Jul 27, 2020

cmsbuild commented Jul 27, 2020

cmsbuild commented Jul 28, 2020

silviodonato commented Aug 6, 2020 •

edited

Loading

tomalin commented Aug 6, 2020

aehart commented Aug 6, 2020

skinnari commented Aug 19, 2020

aehart commented Aug 19, 2020

Disabled the TED and TRE tables by default. #30818

Disabled the TED and TRE tables by default. #30818

Conversation

aehart commented Jul 18, 2020

PR description:

PR validation:

cmsbuild commented Jul 18, 2020

dpiparo commented Jul 19, 2020

davidlange6 commented Jul 20, 2020

aehart commented Jul 20, 2020

davidlange6 commented Jul 20, 2020 via email

silviodonato commented Jul 20, 2020

cmsbuild commented Jul 20, 2020 • edited Loading

aehart commented Jul 20, 2020

cmsbuild commented Jul 20, 2020

cmsbuild commented Jul 20, 2020

cmsbuild commented Jul 20, 2020

silviodonato commented Jul 21, 2020

aehart commented Jul 21, 2020

cmsbuild commented Jul 22, 2020

silviodonato commented Jul 27, 2020

cmsbuild commented Jul 27, 2020 • edited Loading

cmsbuild commented Jul 27, 2020

cmsbuild commented Jul 27, 2020

cmsbuild commented Jul 27, 2020

cmsbuild commented Jul 28, 2020

silviodonato commented Aug 6, 2020 • edited Loading

tomalin commented Aug 6, 2020

aehart commented Aug 6, 2020

skinnari commented Aug 19, 2020

aehart commented Aug 19, 2020

cmsbuild commented Jul 20, 2020 •

edited

Loading

cmsbuild commented Jul 27, 2020 •

edited

Loading

silviodonato commented Aug 6, 2020 •

edited

Loading