reco::Candidate pt = -nan in some rare cases #39110
Comments
A new Issue was created by @swagata87 Swagata Mukherjee. @Dr15Jones, @perrotta, @dpiparo, @rappoccio, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here
Thanks to @shdutta16 and @Prasant1993 for bringing this to my notice; this came up while training new photon IDs for Run3. FYI @cms-sw/pf-l2
@swagata87
assign reconstruction
New categories assigned: reconstruction. @jpata, @clacaputo, @mandrenguyen you have been requested to review this Pull request/Issue and eventually sign. Thanks
Looping over particleFlow candidates (…)
Considering that it was a pion, is there a NaN in the … ?
Yes, the … I do not see any … In one event, both photons have: … In the other event: …
assign pf
As an intermediate solution, what we can do is to not consider such problematic PF candidates in the isolation sum of photons. This way we won't have NaN isolation for photons anymore. This does not address the root cause of why such PF candidates are reconstructed in the first place, but it's better than nothing and serves the purpose for egamma.
Right, so now I checked that the issue also happens (though rarely) in data (checked in 2022C).
Okay, after a few more checks, the real issue seems to be here: cmssw/RecoParticleFlow/PFProducer/src/PFCandConnector.cc Lines 159 to 161 in 6d2f660

Before doing that division, we should check that the divisor is nonzero. With this patch, the issue seems to be solved:

```diff
diff --git a/RecoParticleFlow/PFProducer/src/PFCandConnector.cc b/RecoParticleFlow/PFProducer/src/PFCandConnector.cc
index 6d3d1fa9ec8..bd96361516e 100644
--- a/RecoParticleFlow/PFProducer/src/PFCandConnector.cc
+++ b/RecoParticleFlow/PFProducer/src/PFCandConnector.cc
@@ -156,9 +156,10 @@ void PFCandConnector::analyseNuclearWPrim(PFCandidateCollection& pfCand,
   const math::XYZTLorentzVectorD& momentumPrim = primaryCand.p4();
 
-  math::XYZTLorentzVectorD momentumSec;
+  math::XYZTLorentzVectorD momentumSec(0,0,0,0);
 
-  momentumSec = momentumPrim / momentumPrim.E() * (primaryCand.ecalEnergy() + primaryCand.hcalEnergy());
+  if ( (momentumPrim.E() * (primaryCand.ecalEnergy() + primaryCand.hcalEnergy())) > 0.0 )
+    momentumSec = momentumPrim / momentumPrim.E() * (primaryCand.ecalEnergy() + primaryCand.hcalEnergy());
 
   map<double, math::XYZTLorentzVectorD> candidatesWithTrackExcess;
   map<double, math::XYZTLorentzVectorD> candidatesWithoutCalo;
```

I did not check the effect of the above patch on other objects (jet/MET/tau etc.).
@swagata87 wouldn't it be enough to check that … ?
Thanks a lot @swagata87. I don't really understand how the energy can be zero here; is the p4 also fully zero? Can you share the setup you're using for testing (just rerunning reco on the RAW event you shared above)?
Hello @perrotta, you are right; …
Yeah... no idea why.
Yes, I am running the RECO step on RAW. Below is the setup I have: …
Added …
Added …
Even though #39120 will mitigate it for photons, I could imagine there's a high likelihood that a NaN PFCandidate will also mess up other things. Therefore, @cms-sw/pf-l2 @laurenhay it might be useful to consider a fix in PF on a short timescale (or to understand whether the issue comes from somewhere upstream in reco).
Hi @jpata, yes I am taking a look. We seem to have candidates with zeros for kinematics before the corrections, which turns them into NaNs. Trying to understand and fix this early in the chain; will have news later this week.
Hi all, sorry, it took me a bit longer to reproduce the error and figure out a suitable debug setup to chase down what's happening, but I think I have some useful info now.

The NaN comes from the line identified by Swagata, in a step where the nuclear interactions are recombined into a single candidate. The original issue, however, is coming from a candidate with zero four-momentum. Looking at event 355872:546279379 from the file, I believe what's happening in these lines is a rescaling of the kinematics of all charged hadron tracks simultaneously, such that the sum of charged hadron tracks is consistent with the sum of calo energy. Because the track error of this track is so high, it's getting a negative value from the fit, and is reset to zero here: https://github.com/cms-sw/cmssw/blob/master/RecoParticleFlow/PFProducer/src/PFAlgo.cc#L2510-L2511

In general I don't think zero is a reasonable fallback, as it gives a "ghost" candidate with no kinematics or mass. I guess it would be better to remove it at that point, or at the very least make sure the mass is taken into account to avoid problems downstream. At the moment I'm not sure how technically difficult it would be to remove a candidate at this point in the code; certainly one has to be careful. On the other hand, it might make sense to exclude these tracks from getting promoted to PF cands in the first place, but I need to think/investigate more for that.

I guess this issue is happening more often now if mkFit is giving a higher fraction of displaced tracks with high uncertainty. Is it known or expected? Maybe the mkFit experts want to provide some feedback on this particular track?
what is the direction of this track, and also the number of hits? |
Hi Slava, here is some more info on the track:
Also, in your original post …
Thanks @swagata87, it is useful.
Using the setup at #39110 (comment), … is gone. So it seems it's a feature of the calibrations used in the realistic GT.
can someone of you prepare a list of the DetIds of the hits associated to the problematic track in the MC event? |
@mmusich was this for me or for the DPG conveners? I spent some time messing with it. The trackRef key = 3 and ID = 1463 in the AOD that I produce from the RAW, but tracks.recHits() hasn't been stored. Can you give me the keep statement to store all the tracks so this works?
to anyone who can devote some time to this :)
thanks, this is already helping.
I think that if you retain the …
Thanks, good point. I'm still not 100% sure about the interface: if I do track.recHit(i), I get a SiPixelRecHit which always returns null from its det function. I can give you the collection, keys, and raw ID from these objects, though: … gives …
can you try to print details from the reco job itself?
type pf
I noticed that one of the algorithms (that I was running) was quitting with the error that the charged hadron isolation had a NaN value, while running it over Run 3 GJet samples. This is actually when I noticed this issue.
I had a further look at this:
These are all Pixel DetIds (info on the DetIds: …); namely, it's a track with 1 hit in BPix 1 and 6 hits on side 1 of FPix (suggesting it's very forward).
Comparing the rigid body alignment parameters (3 cartesian coordinates of the sensor active area center and the corresponding Euler angles) of these …, albeit there's one … On the other hand, I have seen that using this overridden set of conditions on top of the recipe at #39110 (comment):

```python
from Configuration.AlCa.GlobalTag import GlobalTag
process.GlobalTag = GlobalTag(process.GlobalTag, '121X_mcRun3_2021_realistic_v10', '')
process.GlobalTag.toGet = cms.VPSet(
    cms.PSet(record = cms.string("TrackerSurfaceDeformationRcd"),
             tag = cms.string("TrackerSurfaceDeformations_zero")
    )
)
```

is sufficient to remove the warning.
+1
@cms-sw/reconstruction-l2 : Is there something pending to close this issue? |
+1 |
This issue is fully signed and ready to be closed. |
It would be appreciated if someone could close this issue. Thanks.
please close |
This is to keep track of (and hopefully solve at some point) an issue that reco::Candidate pt is `-nan` in some rare cases. It seems that this problem can lead to an egamma object's PF isolation being `-nan` as well, if the problematic candidate ends up in the egamma object's isolation cone.

One such problematic event is in this AOD file:

```
root://xrootd-cms.infn.it//store/mc/Run3Winter22DR/GJet_Pt-10to40_DoubleEMEnriched_TuneCP5_13p6TeV_pythia8/AODSIM/FlatPU0to70_122X_mcRun3_2021_realistic_v9-v2/2430000/6cd37543-62ec-4f62-9fa9-23b7c66f9c20.root
```

The exact run:event number is:

```python
eventsToProcess = cms.untracked.VEventRange('1:78326956-1:78326956'),
```
If we run the AOD->MiniAOD step (I ran in CMSSW_12_2_1) for this event, we get the following warning in the event, triggered from here:

cmssw/CommonTools/RecoAlgos/src/PrimaryVertexSorting.cc Lines 35 to 36 in 6d2f660

I checked that, in that loop, c->pt() = `-nan`, and the pdgId is `-211`. In this event, we have 2 reconstructed gedPhotons, both have `chargedHadronIso() = -nan`.
Another example of such a problematic event is in this GEN-SIM-DIGI-RAW file:

```
/eos/cms/store/relval/CMSSW_12_1_0_pre4/RelValTTbar_14TeV/GEN-SIM-DIGI-RAW/121X_mcRun3_2021_realistic_v10_HighStat-v2/10000/22c2fc3b-069f-4437-ab6d-edf0a9c0dfc7.root
```

The exact event is:

```python
eventsToProcess = cms.untracked.VEventRange('1:36727-1:36727')
```

Running the RECO step (in CMSSW_12_1_0_pre4) on this event, we get the same warning. In this case, reco::Candidate pt = `-nan` and its pdgId is `211`. And the event has 2 reconstructed ged photons with a `-nan` value of charged hadron isolation.
Next I plan to check whether the issue of a photon's charged hadron isolation being `-nan` happens in data as well.