More Nano step-related fixes for Run3/Phase2 workflows #36350
resolves #36347 |
type bug-fix |
test parameters: |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-36350/27112
|
A new Pull Request was created by @kpedro88 (Kevin Pedro) for master. It involves the following packages:
@perrotta, @jordan-martins, @bbilin, @wajidalikhan, @cmsbuild, @AdrianoDee, @srimanob, @kskovpen, @qliphy, @fabiocos, @davidlange6 can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
please test |
-1 Failed Tests: RelVals
|
urgent |
@kpedro88 Tests have not finished yet, but there is at least one issue with 11723.17 to be fixed |
For future reference, the crash and relevant trace:
The interesting thing is that this workflow runs fine in 12_1_0_pre5 (with or without this PR). It appears that it's not crashing in 12_2_X IBs only because the Reco step is removed entirely without this PR. @perrotta what do you want to do here? I don't have the time or expertise to debug this workflow... |
@kpedro88 The Reco step for 11723.17 is only removed between CMSSW_12_2_X_2021-11-29-1100 [1] and CMSSW_12_2_X_2021-11-29-2300 [2], as @perrotta commented previously here, with a suspect being #36167 from you. Can you check what happened in #36167 for 11723.17 and whether we can revert it back (partly)? Thanks! |
@qliphy I'm saying that if I copy the exact step3 command from CMSSW_12_2_X_2021-11-29-1100:
it crashes in a clean CMSSW_12_2_X_2021-12-03-1100 IB. So this specific crash is not caused by #36167; rather, #36167 accidentally hid the real problem by removing the RECO step, so the crash was not visible until now. |
I have verified what @kpedro88 wrote. If I run
in CMSSW_12_2_X_2021-11-29-1100 (i.e. before #36167 was merged), that workflow crashes as well. But in the IB tests of CMSSW_12_2_X_2021-11-29-1100 the same workflow succeeds. The difference is in the input events: in the PR tests they are produced from scratch at step1, while in the IB tests they are taken from the store, i.e. they are different events. (I cannot verify this in https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-71105c/20969/summary.html because that page does not open now, but this is what I see locally on lxplus.) This confirms that the crash of 11723.17 in the tests does not depend on #36167, or even on this PR. (We can even re-run the tests without 11723.17, if people prefer not to have the "test rejected" status. And in any case, the issue must be debugged, but by Tracking and/or EGamma, since it involves
test parameters: |
please test |
@perrotta thanks for confirming, I agree with your proposal for this PR. |
+1
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-71105c/20992/summary.html
Comparison Summary
@slava77 comparisons for the following workflows were not done due to missing matrix map:
@cms-sw/pdmv-l2 @cms-sw/upgrade-l2 please check and sign, if you agree with this PR: this is the last one missing before building pre3 |
+Upgrade |
+1 |
+1 |
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will be automatically merged. |
I have opened an issue for the crash observed in 11723.17: #36369 |
PR description:
`run3_nanoAOD_devel` modifier into the `Run3` Era (easier way to make sure workflows are consistently applying it)
attn: @cms-sw/xpog-l2
PR validation:
Compared output of this command for 12_1_0_pre5 and this branch:
`runTheMatrix.py -w upgrade -nel 10024.1,11624.1,11634.17,11834.17,12834.17,13034.17,11834.98,11834.99,11634.21,11834.21,11834.9821,11834.9921,35034.21,35234.21,35234.98,35234.99,35234.9821,35234.9921 --dryRun`
(This tests trackingOnly, deepCore, prodLike, premix workflows.)
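For anyone repeating this validation, the comparison of the two dry-run outputs could be scripted roughly as below. This is only a sketch: it assumes the `--dryRun` output from each release area was saved to a text file beforehand, and the helper name `diff_dry_run` and the default labels are hypothetical, not part of CMSSW or this PR.

```python
from __future__ import annotations

import difflib


def diff_dry_run(
    old_log: str,
    new_log: str,
    old_name: str = "12_1_0_pre5",
    new_name: str = "this-branch",
) -> list[str]:
    """Return unified-diff lines between two saved dry-run outputs.

    The inputs are the full text of each saved log (an assumption: the
    logs were captured by redirecting the runTheMatrix.py output to files).
    An empty list means the generated commands are identical.
    """
    return list(
        difflib.unified_diff(
            old_log.splitlines(),
            new_log.splitlines(),
            fromfile=old_name,
            tofile=new_name,
            lineterm="",  # lines were split without newlines, so add none back
        )
    )
```

Reading both files with `pathlib.Path(...).read_text()` and printing the returned lines gives a diff in which only the changed step commands show up, which makes it easier to check that the Nano-related changes apply consistently to each workflow.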