Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backport of "Fix for FastSim decays of exotic-descendent SM particles decaying outside pipe" to 10_6_X #36352

Merged
merged 6 commits into from
Dec 21, 2021

Conversation

cericeci
Copy link

@cericeci cericeci commented Dec 3, 2021

PR description:

Backport of #36122 and #36324 for UL FastSim production. Full functionality of both, without the code formatting.

This fixes a bug which would invalidate simulation for long-lived and a fraction of prompt signals. Our understanding is that this case doesn't go against the no-change policy.

Original PR description: Jaebak found that some b- and d-mesons produced in the decays of exotic particles, which cross the beam pipe radius, have their decay performed twice, with sim hits created for each copy. This causes problems for some high-pT b-jets, as well as the MET in events that contain such jets. A few lines are added to ParticleManager.cc which results in FastSim deciding to not decay such particles further, and rather take only the original generator-determined decay trees. FastSim will still propagate these particles and create secondaries/material interactions, but it won't decay them.

If this PR is a backport please specify the original PR and why you need to backport that PR:

#36122 and #36324, needed for UL MC production

@cmsbuild
Copy link
Contributor

cmsbuild commented Dec 3, 2021

A new Pull Request was created by @cericeci (Carlos Erice) for CMSSW_10_6_X.

It involves the following packages:

  • FastSimulation/SimplifiedGeometryPropagator (fastsim)

@ssekmen, @lveldere, @civanch, @mdhildreth, @cmsbuild, @sbein can you please review it and eventually sign? Thanks.
@matt-komm this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@perrotta
Copy link
Contributor

perrotta commented Dec 3, 2021

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Dec 3, 2021

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-b01498/20972/summary.html
COMMIT: af8e61e
CMSSW: CMSSW_10_6_X_2021-11-30-1100/slc7_amd64_gcc700
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/36352/20972/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 35
  • DQMHistoTests: Total histograms compared: 3215686
  • DQMHistoTests: Total failures: 2
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3215350
  • DQMHistoTests: Total skipped: 334
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 34 files compared)
  • Checked 143 log files, 29 edm output root files, 35 DQM output files
  • TriggerResults: no differences found

@sbein
Copy link
Contributor

sbein commented Dec 6, 2021

+1

@cmsbuild
Copy link
Contributor

cmsbuild commented Dec 6, 2021

This pull request is fully signed and it will be integrated in one of the next CMSSW_10_6_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_12_2_X is complete. This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Copy link
Contributor

perrotta commented Dec 6, 2021

This fixes a bug which would invalidate simulation for long-lived and a fraction of prompt signals. Our understanding is that this case doesn't go against the no-change policy.

Uhm... I would say that the bug must get fixed.

On the other hand, it is a bit unclear to me how can this fulfill the "no-change" rule:

  • Will the affected and already produced fastsim samples get invalidated?
  • How can one distinguish the bugged from the corrected ones?

Even if this is FastSim, I have the impression that some traceable modifier should be allowed, to track which production was used for it, and make them distinguishable from the bugged ones.

What PPD or Phyiscs coordinations suggest for it?
@rappoccio @arizzi (etc.)

@sbein
Copy link
Contributor

sbein commented Dec 6, 2021

@perrotta yes it will change some physics with high-pT jets descending from exotics. The linked slides identify all of the data sets that were produced with the affected releases, and the SUS conveners and myself are in contact with the analyzers who requested those samples. We are working on a possible recipe may need to apply when using the affected samples for interpretation. The traceable modifier is a good idea for any samples that need to be reproduced. @vdutta or @peruzzim could potentially weigh in here, but I think we're still figuring out the best strategy for one or two of the analysis teams.

This topic was discussed in the last and today's SUS PAG meeting
https://indico.cern.ch/event/1100680/contributions/4642309/attachments/2359631/4027619/FastSIm_LLSMParticleIssue_red2.pdf

and in the last SIM meeting on Friday.

@cericeci
Copy link
Author

cericeci commented Dec 6, 2021

@perrotta Right now we are using the production release to identify the samples that suffer from the bug. In the spirit of the point you make... this is less than ideal for analyzers.

I'm not quite sure on how to implement -technically- such modifier as the bug was introduced in a relatively recent backport and only affects a small fraction of already existing ones (for context, around 10 out of 200 of samples in 2018 have the bug). Thus I understand we would need to be able to tag separately two sets of already finished samples (one with the bug and one without). My understanding is such a thing won't be doable inside cmssw and so we need to discuss with PPD, right?

@qliphy
Copy link
Contributor

qliphy commented Dec 8, 2021

So the issue tracks back to #31285 and its backports in Sep, 2020.

As for the concerns brought by @perrotta :

  • Will the affected and already produced fastsim samples get invalidated?

  • How can one distinguish the bugged from the corrected ones?

Once this PR is merged and a new release is made, probably @cms-sw/pdmv-l2 can update the production campaigns with specific tag in the dataset name, and eventually the buggy samples can be invalidated?

@rappoccio @arizzi any comment?

@cericeci @sbein How urgent is a new release with this fix for fastsim production?

@cericeci
Copy link
Author

cericeci commented Dec 8, 2021

Hi @qliphy, we waited to submit several requests of UL signal MC to avoid the issue so it is quite urgent. In an ideal world we would like to have this before the break so we can send the requests before then. I do understand that would be significantly tight for all parties involved -and likely unfeasible- but it is still quite pressing for us to get this merged.

@perrotta
Copy link
Contributor

perrotta commented Dec 9, 2021

@cericeci @sbein we all agree that this fix must be backported and made available for the forthcoming fastsim productions.
The point is: should it be steered by a modifier, so that we can keep unambiguously track of the different productions (in which case a modification of this PR is needed), or we shouldn't care about it?
If PPD and Physics agree that we don't care about mixing fastsim productions in the same release and with different physics outputs (e.g. because the bugged ones can get invalidated), a "+1" will follow immediately.
Otherwise, an update of this and all other similar backport PRs is in order.

@sbein
Copy link
Contributor

sbein commented Dec 9, 2021

@perrotta, @qliphy, indeed it is urgent to resubmit some samples. However, the plan is for some of the samples produced with the bug to still be used by analyses, with a special treatment applied at the miniAOD level. Basically an event mask and weight. This is still being discussed in SIM and SUS general meetings though, and would likely only apply to a small number of samples. In light of this, we mostly need to make sure that the weights are applied when analysers use the affected samples, and I'm not sure if invalidating data sets will be necessary.

@perrotta
Copy link
Contributor

@perrotta I'm running tests right now, I'll be able to confirm (or deny) in an hour. Is that ok?

Great, thank you Carlos! Yes, that's perfect.
(Please confirm also in #36509 that the current status of the PR is fine with you and no further updates are expected, if so)

@cericeci
Copy link
Author

@perrotta I ported 0e205c3 and bd81676 to CMSSW_12_3_X_2021-12-16-1100 and checked the effect of the modifier in one of the buggy workflows (TChiHH, as in https://cms-pdmv.cern.ch/mcm/public/restapi/requests/get_setup/SUS-RunIIAutumn18FSPremix-00170, just the cmsDriver command). By adding --procModifiers fastSimDisableLongLivedBug to the cmsDriver command I see that the fix is activated and we avoid the issues with duplicated simulated hits. From my side I'm satisfied with the results and I'd proceed

@perrotta
Copy link
Contributor

@cericeci sorry, but I'm interested in knowing that the whole machinery works in 10_6, not in 12_3!
Please run your test in one of the latest 10_6 IB instead (e.g. CMSSW_10_6_X_2021-12-19-2300)

@sbein
Copy link
Contributor

sbein commented Dec 20, 2021

Thanks @cericeci, @perrotta. As a small point, I was wondering if the name fastSimDisableLongLivedBug[_cff.py] is a bit ambiguous. I think of fixing a bug or disabling a bug fix. But what is disabling a bug?

@cericeci
Copy link
Author

@cericeci , as the last check: did you try to run one of your workflows in 12_3_X IB + this PR and by making use of the modifier in the matrix (or cmsDriver) call? Did its output satisfy you?

Sorry @perrotta I got confused by this one. I tested again in CMSSW_10_6_X_2021-12-20-2300 and it works as well

@perrotta
Copy link
Contributor

@cericeci , as the last check: did you try to run one of your workflows in 12_3_X IB + this PR and by making use of the modifier in the matrix (or cmsDriver) call? Did its output satisfy you?

Sorry @perrotta I got confused by this one. I tested again in CMSSW_10_6_X_2021-12-20-2300 and it works as well

Ah, yes, you are right: there was a typo in my message, indeed...

@perrotta
Copy link
Contributor

Thanks @cericeci, @perrotta. As a small point, I was wondering if the name fastSimDisableLongLivedBug[_cff.py] is a bit ambiguous. I think of fixing a bug or disabling a bug fix. But what is disabling a bug?

@cericeci I must say that I agree with @sbein: what about calling it fastSimFixLongLivedBug instead, or anything more easily understandable? Can you quickly modify it? Then the PR can be merged, and a new 10_6 release built.

@cmsbuild
Copy link
Contributor

Pull request #36352 was updated. @perrotta, @ssekmen, @lveldere, @civanch, @mdhildreth, @cmsbuild, @sbein, @qliphy, @fabiocos, @davidlange6 can you please check and sign again.

@cericeci
Copy link
Author

@Perrota I changed as suggested, I'm travelling and depending on my phone's connection so it is unlikely I'll be able to do anything until ~11pm

@Perrota
Copy link

Perrota commented Dec 21, 2021

@Perrota I changed as suggested, I'm travelling and depending on my phone's connection so it is unlikely I'll be able to do anything until ~11pm

I think you meant to mention @perrotta

@perrotta
Copy link
Contributor

please test
(Thank you @cericeci )

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-b01498/21419/summary.html
COMMIT: a58de5c
CMSSW: CMSSW_10_6_X_2021-12-20-2300/slc7_amd64_gcc700
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/36352/21419/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 35
  • DQMHistoTests: Total histograms compared: 3215686
  • DQMHistoTests: Total failures: 2
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3215350
  • DQMHistoTests: Total skipped: 334
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 34 files compared)
  • Checked 143 log files, 29 edm output root files, 35 DQM output files
  • TriggerResults: no differences found

@perrotta
Copy link
Contributor

+1

  • @sbein the last commit only renamed the modifier, also following your suggestion: I take your signature as granted, since you already signed this PT before that commit

@perrotta
Copy link
Contributor

merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants