Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add check on PATLeptonTimeLifeInfoProducer closestState valid state #44864

Merged
merged 1 commit into from
Apr 30, 2024

Conversation

francescobrivio
Copy link
Contributor

PR description:

This PR addresses #44862 as suggested in #44862 (comment), in order to cure the crash observed in the CMSSW_14_0_6 replay (see CMSTalk).

It simply adds an isValid() check to the closestState object before accessing its members and emits an edm::LogError in case it's not valid.

@cms-sw/tracking-pog-l2 @mbluj please advise if a better solution is needed to fix the issue at its root.

PR validation:

Code compiles.
Additionally, I re-run successfully on the crashing event following the recipe from #44862 (comment) and, as expected, I got:

%MSG-e PATLeptonTimeLifeInfoProducer:   PATElectronTimeLifeInfoProducer:electronTimeLifeInfos  29-Apr-2024 11:56:53 CEST Run: 369998 Event: 31680062
closestState not valid!
%MSG

but no crash.

Backport:

A 14_0_X backport is needed in order to be deployed in production in Tier0 (thus allowing deployment of 14_0_6_MULTIARCHS in HLT)

FYI @mandrenguyen @missirol

@francescobrivio
Copy link
Contributor Author

urgent

@francescobrivio
Copy link
Contributor Author

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 29, 2024

cms-bot internal usage

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-44864/40118

  • This PR adds an extra 16KB to repository

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @francescobrivio for master.

It involves the following packages:

  • PhysicsTools/PatAlgos (xpog, reconstruction)

@vlimant, @hqucms, @jfernan2, @mandrenguyen can you please review it and eventually sign? Thanks.
@jdamgov, @AlexDeMoor, @gkasieczka, @castaned, @azotz, @gpetruc, @ahinzmann, @mmarionncern, @jdolen, @demuller, @Senphy, @gouskos, @mbluj, @rappoccio, @seemasharmafnal, @Ming-Yan, @nhanvtran, @mariadalfonso, @andrzejnovak, @schoef, @hatakeyamak this is something you requested to watch as well.
@rappoccio, @antoniovilela, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@@ -169,6 +170,10 @@ void PATLeptonTimeLifeInfoProducer<T>::produceAndFillIPInfo(const T& lepton,
AnalyticalImpactPointExtrapolator extrapolator(transTrack.field());
TrajectoryStateOnSurface closestState =
extrapolator.extrapolate(transTrack.impactPointState(), RecoVertex::convertPos(pv.position()));
if (!closestState.isValid()) {
edm::LogError("PATLeptonTimeLifeInfoProducer") << "closestState not valid!";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please also print transTrack.impactPointState() and RecoVertex::convertPos(pv.position())

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Slava for the suggestion, I force-pushed a new commit including the printout of the two values.
The code it's a bit uglier (enforced by code-format), but the output looks nicer anyway!
Testing on the crashing event I get:

%MSG-e PATLeptonTimeLifeInfoProducer:   PATElectronTimeLifeInfoProducer:electronTimeLifeInfos  29-Apr-2024 18:32:43 CEST Run: 369998 Event: 31680062
closestState not valid! From:
transTrack.impactPointState():
global parameters
x =       1.91865      34.2722      166.593
p =     -0.016365  0.000857716    0.0626035
global error
     0.812786   0.00612206   -0.0855089   -0.0504526    0.0847113
   0.00612206   0.00027459 -0.000491722  0.000820261   0.00174473
   -0.0855089 -0.000491722    0.0124496   0.00858109   -0.0126903
   -0.0504526  0.000820261   0.00858109    0.0129749  -0.00277318
    0.0847113   0.00174473   -0.0126903  -0.00277318    0.0202798
local parameters (q/p,v',w',v,w)
     -15.4529            0     -3.82022            0           -0
local error
     0.812786    0.0291878   -0.0981561    0.0504526    -0.334519
    0.0291878   0.00453957    0.0104463   0.00673732   0.00313124
   -0.0981561    0.0104463    0.0685205    0.0127032     0.109981
    0.0504526   0.00673732    0.0127032    0.0129749   -0.0109511
    -0.334519   0.00313124     0.109981   -0.0109511     0.316245
Defined at beforeSurface
Magnetic field in inverse GeV:  (1.92692e-06,3.442e-05,0.0112619) 
RecoVertex::convertPos(pv.position()):
 (0.117078,-0.182911,0.95493) 
%MSG

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-e10e38/39152/summary.html
COMMIT: c150b8b
CMSSW: CMSSW_14_1_X_2024-04-29-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/44864/39152/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

There are some workflows for which there are errors in the baseline:
24834.78 step 2
The results for the comparisons for these workflows could be incomplete
This means most likely that the IB is having errors in the relvals.The error does NOT come from this pull request

Summary:

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-44864/40126

  • This PR adds an extra 16KB to repository

@cmsbuild
Copy link
Contributor

Pull request #44864 was updated. @jfernan2, @hqucms, @mandrenguyen, @vlimant, @cmsbuild can you please check and sign again.

@francescobrivio
Copy link
Contributor Author

@cmsbuild please test

@hqucms
Copy link
Contributor

hqucms commented Apr 29, 2024

enable nano

@hqucms
Copy link
Contributor

hqucms commented Apr 29, 2024

please test

@hqucms
Copy link
Contributor

hqucms commented Apr 29, 2024

type tau

@cmsbuild cmsbuild added the tau label Apr 29, 2024
@hqucms
Copy link
Contributor

hqucms commented Apr 29, 2024

@cms-sw/tau-pog-l2 @mbluj Please have a look if the fix is OK to you.

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-e10e38/39164/summary.html
COMMIT: 36effa5
CMSSW: CMSSW_14_1_X_2024-04-29-1100/el8_amd64_gcc12
Additional Tests: NANO
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/44864/39164/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

There are some workflows for which there are errors in the baseline:
24834.78 step 2
The results for the comparisons for these workflows could be incomplete
This means most likely that the IB is having errors in the relvals.The error does NOT come from this pull request

Summary:

NANO Comparison Summary

Summary:

  • You potentially removed 2 lines from the logs
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 15
  • DQMHistoTests: Total histograms compared: 17023
  • DQMHistoTests: Total failures: 0
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 17023
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 14 files compared)
  • Checked 55 log files, 32 edm output root files, 15 DQM output files

Nano size comparison Summary:

Sample kb/ev ref kb/ev diff kb/ev ev/s/thd ref ev/s/thd diff rate mem/thd ref mem/thd
2500.0 2.783 2.783 0.000 ( +0.0% ) 3.53 3.44 +2.6% 2.160 2.204
2500.001 2.897 2.897 0.000 ( +0.0% ) 3.17 3.13 +1.2% 2.539 2.624
2500.002 2.843 2.843 0.000 ( +0.0% ) 3.28 3.23 +1.5% 2.538 2.627
2500.01 1.446 1.446 0.000 ( +0.0% ) 6.08 5.87 +3.6% 2.268 2.313
2500.011 1.906 1.906 0.000 ( +0.0% ) 3.29 3.25 +1.4% 2.455 2.433
2500.012 1.761 1.761 0.000 ( +0.0% ) 4.81 4.69 +2.6% 2.397 2.428
2500.1 2.354 2.354 0.000 ( +0.0% ) 4.54 4.42 +2.5% 1.968 2.075
2500.2 2.459 2.459 0.000 ( +0.0% ) 5.19 4.81 +7.9% 1.877 1.989
2500.21 1.286 1.286 0.000 ( +0.0% ) 3.46 3.40 +1.7% 2.147 2.273
2500.211 1.668 1.668 0.000 ( +0.0% ) 3.10 2.99 +3.8% 2.196 2.258
2500.3 2.229 2.229 0.000 ( +0.0% ) 9.97 9.38 +6.2% 1.868 1.970
2500.301 2.833 2.833 0.000 ( +0.0% ) 8.95 8.65 +3.4% 1.800 1.965
2500.31 7.164 7.164 0.000 ( +0.0% ) 1.52 1.46 +4.1% 1.708 1.704
2500.311 1.568 1.568 0.000 ( +0.0% ) 8.75 8.42 +4.0% 1.059 1.053
2500.312 540.457 540.457 0.000 ( +0.0% ) 0.53 0.49 +8.0% 1.603 1.595
2500.313 817.694 817.694 0.000 ( +0.0% ) 0.75 0.70 +6.3% 1.587 1.581
2500.32 1.348 1.348 0.000 ( +0.0% ) 13.12 12.13 +8.2% 2.332 2.248
2500.321 1.757 1.757 0.000 ( +0.0% ) 10.33 9.45 +9.3% 2.415 2.409
2500.322 1.236 1.236 0.000 ( +0.0% ) 10.73 9.82 +9.3% 2.171 2.208
2500.323 7.772 7.772 0.000 ( +0.0% ) 4.45 4.05 +10.0% 1.917 1.817
2500.324 1.874 1.874 0.000 ( +0.0% ) 10.81 9.83 +9.9% 2.180 2.195
2500.325 4.136 4.136 0.000 ( +0.0% ) 4.68 4.33 +8.1% 2.170 2.167
2500.326 3.342 3.342 0.000 ( +0.0% ) 1.67 1.57 +6.5% 2.108 2.167
2500.327 1.811 1.811 0.000 ( +0.0% ) 11.07 10.27 +7.8% 2.309 2.327
2500.4 2.374 2.374 0.000 ( +0.0% ) 8.76 9.02 -2.8% 1.826 1.830
2500.401 1.891 1.891 0.000 ( +0.0% ) 7.42 7.84 -5.3% 1.704 1.690
2500.402 2.950 2.950 0.000 ( +0.0% ) 8.69 8.12 +7.0% 1.756 1.749
2500.403 8.700 8.700 0.000 ( +0.0% ) 2.90 2.92 -0.6% 1.768 1.835
2500.404 5.474 5.474 0.000 ( +0.0% ) 1.22 1.15 +6.1% 1.819 1.835
2500.405 2.860 2.860 0.000 ( +0.0% ) 8.55 7.71 +10.9% 1.738 1.771
2500.5 5.194 5.194 0.000 ( +0.0% ) 16.07 15.50 +3.7% 1.525 1.519
2500.51 9.120 9.120 0.000 ( +0.0% ) 9.80 9.57 +2.4% 1.477 1.477

@hqucms
Copy link
Contributor

hqucms commented Apr 29, 2024

+1

@antoniovilela
Copy link
Contributor

+orp

  • Will be merged once @cms-sw/reconstruction-l2 signs.

@mandrenguyen
Copy link
Contributor

+1

@mandrenguyen
Copy link
Contributor

Please remove the RFC

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will be automatically merged.

@cmsbuild cmsbuild merged commit ba95554 into cms-sw:master Apr 30, 2024
13 checks passed
@francescobrivio francescobrivio changed the title [RFC] Add check on PATLeptonTimeLifeInfoProducer closestState valid state Add check on PATLeptonTimeLifeInfoProducer closestState valid state Apr 30, 2024
@francescobrivio francescobrivio deleted the alca-closestState_check branch May 5, 2024 09:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants