add DT and CSC rechits to AOD content #40251
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-40251/33302
|
A new Pull Request was created by @kakwok for master. It involves the following packages:
@cmsbuild, @mandrenguyen, @clacaputo can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
enable profiling |
please test |
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-58b334/29499/summary.html Comparison Summary:
|
Question: If this PR is merged, do we still need |
@kakwok could you please measure the size increase using wf |
@srimanob : we do still need the EXOCSCCluster skim because it facilitates analyses that can use the HMT triggers and also includes additional RAW information that can be used to further develop and improve the MDS reconstruction. The skim is small: for 2022 the total size was less than 30TB. The skims do not have to be centrally requested to DISK and can be treated the same as AOD. Thanks. |
Hi @clacaputo, I tried running the workflow on lxplus with
|
Hi @kakwok , if I run |
@clacaputo I ran
|
Please add |
Hi @clacaputo,
The size increase is ~1% for the AOD output with the added DT and CSC rechits. |
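For reference, a figure like the one above is presumably the relative difference between the AOD output size with and without the added collections. A minimal sketch of that arithmetic follows; the function name and the byte counts are hypothetical and are not measurements from this PR.

```python
def relative_increase(baseline_bytes, new_bytes):
    """Fractional size increase of new_bytes relative to baseline_bytes."""
    if baseline_bytes <= 0:
        raise ValueError("baseline size must be positive")
    return (new_bytes - baseline_bytes) / baseline_bytes


# Hypothetical sizes for illustration only (not taken from the thread).
baseline = 1_000_000_000      # AOD output without the extra rechits
with_rechits = 1_010_000_000  # AOD output with the extra rechits
print(f"size increase: {relative_increase(baseline, with_rechits):.1%}")
```

Both sizes should come from the same workflow and the same number of events, otherwise the ratio is not meaningful.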
Hi @clacaputo, all |
Hi @clacaputo @namapane : indeed, I agree with Nicola. Skimming proposals would drop potentially critical information, and we would again be forced to go back to RAW to develop the analysis and improvements. The Muon Detector Shower signature and the clustering algorithms are still very new, and we think there are ways to improve them. Saving the rechits will enable the collaboration to develop new algorithms using low-level rechit information. Could we proceed with that? |
Hi all, I certainly don't object to adding CSC rechits to AOD - as shown by the quantitative numbers quoted earlier in this thread they seem to give a negligible increase in size - and having them in AOD might even make AOD a useful data tier for CSC itself. (As an aside, I presume CSC segment and rechits - at least from reco'ed muons - are already in AOD?) When discussing 'skimming rechits' is the suggestion to i) skim all the CSC rechit information from specific events, or ii) skim partial rechit information from events (i.e. split the rechit data structure)? If there are specific questions you think I can address, please let me know. |
The reason that we are discussing this issue, is that a 1% increase in the size of AOD is not negligible. The collaboration is working extremely hard to reduce the size of our data, and every percent counts. |
So what you want to do is extract a couple of simple 'properties' of
rechits (I presume id and position) and pretend that's sensible?
I don't think that's a good thing to do - people already tend to think
rechits have more physical reality than is strictly warranted, and if
you remove all vestige of their properties you will have no idea how
credible or reliable they are.
Matthew Nguyen wrote on January 10, 2023 at 16:32:
The reason that we are discussing this issue, is that a 1% increase in
the size of AOD is not negligible. The collaboration is working
extremely hard to reduce the size of our data, and every percent counts.
If there was a way to produce a reduced rec hit collection, as done
for example for the ECAL, that would certainly be helpful.
|
Given what @ptcox and @namapane said above, I think it's not feasible to take only partial information from the rechits. As @arturapresyan said above, and as I presented in my talk when we met a few weeks ago, this extra 1% in size would allow the collaboration to significantly expand the long-lived particle search program in a way that lets many additional groups and analyzers contribute. Without this addition, only I am currently able to process the data with the needed information starting from RAW, which is a significant bottleneck to fully exploiting this new object/signature. By taking advantage of the automatic tape-staging request feature in CRAB, this extra 1% can be kept primarily on tape; it would occupy disk space only temporarily following a CRAB tape-stage request and would be cleaned automatically after a short period (I think it's 30 days or so). It would be great if we could proceed with this change so that it makes it into the 2023 data and Monte Carlo production. Thanks! |
Storing all rechits in AOD is not ideal. |
Dear @mandrenguyen : yes, we will work with the muon POG and DPG and the EXO PAG to define shower objects that will eventually go into miniAOD. We need rechits in AOD so that we can develop those algorithms. Thanks for the help! |
Thanks a lot @ptcox , @namapane for your comments. @arturapresyan , we could merge this PR, but it is important to have a defined plan for the development of the "shower objects"
It would be good to have EPR tasks allocated for this |
Dear @clacaputo : ok, it's great to hear that you will merge this PR! Having the rechits in AOD will be critical for developments using new data. Thanks a lot! |
@clacaputo : And it would be great if some EPR could be assigned to this task so that we ensure it gets done. |
+reconstruction
|
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2) |
I'm a bit confused here. I'm not really sure that a temporary data formats change that is expected to go away in the near future is optimal. Do we expect that this will subsequently disappear in favor of a reduced-size collection? |
Hi @rappoccio, this PR should not change any data format; it only saves the DT and CSC rec-hits in the AOD. These rec-hits will be used by EXO people for the new signature and for developing new algorithms for "muon shower" reconstruction |
+1
|
PR description:
This PR adds CSC/DT rechits to the AOD data tier. They are essential inputs for multiple searches for LLPs decaying in the muon system and for evaluating the related trigger performance.
The DT/CSC rechits have already been added to the AOD in the b-parking dataset by this PR: #34066
The overall data size increase is ~1.7%
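As an illustration of what "adding a collection to AOD" means in practice, here is a minimal, framework-free sketch of how an ordered list of keep/drop statements selects output branches, with a later matching statement overriding an earlier one. It is not the PR diff and does not use the CMSSW configuration API: the command list, the `is_kept` helper, the branch names, and the rechit module labels `dt1DRecHits` and `csc2DRecHits` are assumptions for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical stand-in for an AOD event-content list. The two rechit
# labels are assumed for illustration, not copied from the PR.
AOD_OUTPUT_COMMANDS = [
    "drop *",
    "keep *_dt4DSegments_*_*",
    "keep *_cscSegments_*_*",
    "keep *_dt1DRecHits_*_*",   # assumed DT rechit label
    "keep *_csc2DRecHits_*_*",  # assumed CSC rechit label
]


def is_kept(branch, commands):
    """Return True if the last command matching the branch is a 'keep'."""
    kept = False
    for command in commands:
        action, pattern = command.split(None, 1)
        if fnmatchcase(branch, pattern):
            kept = action == "keep"
    return kept


# Branch names follow the type_label_instance_process pattern; the type
# part is a placeholder.
for name in ("SomeType_csc2DRecHits__RECO", "SomeType_rawDataCollector__RECO"):
    print(name, "->", "kept" if is_kept(name, AOD_OUTPUT_COMMANDS) else "dropped")
```

With this list, removing the two rechit lines makes `is_kept` return `False` for the rechit branches, which is the situation before a change of this kind.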
PR validation:
To be presented at https://indico.cern.ch/event/1228914/#6-adding-csc-and-dt-rechits-to