Fake comparison failures due to L1T in wf 25202.0 and 1330.0 #29237

Closed
silviodonato opened this issue Mar 19, 2020 · 29 comments

@silviodonato
Contributor

silviodonato commented Mar 19, 2020

We are observing fake comparison failures in wf 25202.0 and 1330.0 since CMSSW_11_1_X_2020-03-17-2300 [1].
The failures are in folders L1TEMU/L1TdeStage2uGMT, L1T/L1TStage2uGMT, L1T/L1TriggerVsGen.

Example: L1T / L1TStage2uGMT / EMTFInput

Likely, this fake error is related to #29080 .

[1] https://cmsweb.cern.ch/dqm/dev/session/6Stmgn

@silviodonato
Contributor Author

assign l1

@cmsbuild
Contributor

New categories assigned: l1

@benkrikler,@rekovic you have been requested to review this Pull request/Issue and eventually sign? Thanks

@cmsbuild
Contributor

A new Issue was created by @silviodonato Silvio Donato.

@Dr15Jones, @smuzaffar, @silviodonato, @makortel, @davidlange6, @fabiocos can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@silviodonato
Contributor Author

cc: @srimanob

@srimanob
Contributor

srimanob commented Mar 19, 2020

Thanks @silviodonato
Does the DQM link work for you?
https://cmsweb.cern.ch/dqm/dev/session/6Stmgn
I got a message of
"The page you accessed, /dqm/dev/session/6Stmgn, is not a valid page or you may have tried to access content that belongs to someone else that was not made "public.""

@silviodonato
Contributor Author

try with https://tinyurl.com/ubwjuu6

@jiafulow
Contributor

I'm sorry for taking so long to get back on this issue. I'm unfortunately not very proficient with global tags. The issue is rather tricky. It affects MC simulation with Run2_2016 era, and with the default global tags in both 10_6_X and 11_1_X, after the latest EMTF fixes.

To provide some background, during the UL16 production, the global tag for 2016 was updated to have the correct tags (see [1], Feb 2020). As a consequence, it broke the EMTF emulator, because the software (both sides of python & C++) was not set up correctly. It was set up to emulate the latest firmware (i.e. Run2_2018 at the moment). Andrew & Efe have submitted PRs to fix the issue (see [2] and [3] for 11_1_X). The results were verified using the latest UL16 RelVals [4].

So I mentioned that the change of the GT was what broke the EMTF. But in both 10_6_X and 11_1_X when you execute 'runTheMatrix', the 'auto' global tags are not updated. In 10_6_11, 'auto:run2_mc' points to '106X_mcRun2_asymptotic_v9' [5] which has the correct EMTF tags for Run2_2018; in 11_1_0_pre4, it points to '110X_mcRun2_asymptotic_v7' [6] which has the correct EMTF tags for Run2_2018. That means that, out of the 'runTheMatrix' results, those with Run2_2018 era are correct, but those with Run2_2016 are incorrect. Currently, only two processes ('1330.0' and '25202.0') are using '--era Run2_2016', so they are the ones affected. This became an issue after the latest EMTF fixes mentioned above. Prior to that, the software (both python & C++) always expected the EMTF tags for Run2_2018; but now it expects different tags for 2016 and for 2017/18.

As a side note, the results with Run2_2017 era are also somewhat incorrect. But because EMTF uses the same BDT trees (meaning the same L1TMuonEndCapForestRcd tag) in 2017 & 2018, this doesn't break the EMTF emulator, although certain parameters are not exactly correct.

I don't know what the correct solution is. I believe the 'runTheMatrix' workflows should use the appropriate global tags for each era. I don't know how the other subsystems are dealing with this, as I'm sure EMTF is not the first to come across this issue. Please let us know if you have any suggestions or recommendations. Thank you very much!

[1] https://indico.cern.ch/event/880783/contributions/3745327/attachments/1985198/3307604/2020_02_11_GTs.pdf
[2] #29080
[3] #29260
[4] https://dmytro.web.cern.ch/dmytro/cmsprodmon/workflows.php?campaign=CMSSW_10_6_11_CANDIDATE3__hltul16_pre-1585301830
[5] https://github.com/cms-sw/cmssw/blob/CMSSW_10_6_11/Configuration/AlCa/python/autoCond.py#L21
[6] https://github.com/cms-sw/cmssw/blob/CMSSW_11_1_0_pre4/Configuration/AlCa/python/autoCond.py#L21


Notifications: @abrinke1 @efeyazgan @rekovic @davignon @BenjaminRS

@slava77
Contributor

slava77 commented Apr 10, 2020

assign alca

@cmsbuild
Contributor

New categories assigned: alca

@christopheralanwest,@tlampen,@pohsun,@tocheng you have been requested to review this Pull request/Issue and eventually sign? Thanks

@slava77
Contributor

slava77 commented Apr 10, 2020

> In 10_6_11, 'auto:run2_mc' points to '106X_mcRun2_asymptotic_v9' [5] which has the correct EMTF tags for Run2_2018; in 11_1_0_pre4, it points to '110X_mcRun2_asymptotic_v7' [6] which has the correct EMTF tags for Run2_2018. That means that, out of the 'runTheMatrix' results, those with Run2_2018 era are correct, but those with Run2_2016 are incorrect.

run2_mc auto-GT is supposed to be used only for 2016.
For 2018 the correct (realistic GT) is phase1_2018_realistic, similarly for 2017.

So, if run2_mc contains a payload that works only for 2018, it is a problem and the GT should be updated.
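
To make the convention concrete, here is a minimal Python sketch of the era-to-auto-GT pairing described above. The mapping and the helper `check_workflow` are illustrative only and are not CMSSW code; the `phase1_2017_realistic` key is assumed by analogy with `phase1_2018_realistic`.

```python
# Illustrative sketch only: this mapping and helper are not CMSSW code.
# Pairing of era -> auto-GT key as described in this comment.
# 'phase1_2017_realistic' is an assumption, by analogy with the 2018 key.
EXPECTED_AUTO_GT = {
    "Run2_2016": "run2_mc",
    "Run2_2017": "phase1_2017_realistic",
    "Run2_2018": "phase1_2018_realistic",
}


def check_workflow(era, conditions):
    """Return True if an 'auto:<key>' conditions string matches the era."""
    prefix = "auto:"
    if not conditions.startswith(prefix):
        raise ValueError("expected an 'auto:<key>' conditions string")
    return EXPECTED_AUTO_GT.get(era) == conditions[len(prefix):]


print(check_workflow("Run2_2016", "auto:run2_mc"))  # consistent pairing
print(check_workflow("Run2_2018", "auto:run2_mc"))  # mismatched pairing
```

A check like this only verifies the pairing of era and auto key; it says nothing about whether the payloads inside the GT are right for that era, which is the actual problem here.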

@jiafulow
Contributor

> run2_mc auto-GT is supposed to be used only for 2016.
> For 2018 the correct (realistic GT) is phase1_2018_realistic, similarly for 2017.

Ah, I didn't realize that. So the comment in the autoCond.py is correct (I wasn't sure).

Then I think, for 10_6_X, 'auto:run2_mc_pre_vfp' and 'auto:run2_mc' should be updated to point to '106X_mcRun2_asymptotic_preVFP_v6' and '106X_mcRun2_asymptotic_v12' as used in the latest UL16 RelVals [1,2]. For 11_1_X, new global tags might need to be created?

[1] https://dmytro.web.cern.ch/dmytro/cmsprodmon/workflows.php?campaign=CMSSW_10_6_11_CANDIDATE3__hltul16_pre-1585301830
[2] https://dmytro.web.cern.ch/dmytro/cmsprodmon/workflows.php?campaign=CMSSW_10_6_11_CANDIDATE3__hltul16_post-1585306066

@makortel
Contributor

makortel commented May 5, 2020

It appears that the same problem is causing address sanitizer (ASAN) failures (#29332) on a scale that makes it hard to spot other issues with ASAN. Could this issue be addressed soon? Thanks.

@srimanob
Contributor

srimanob commented May 6, 2020

Hi @jiafulow
I assume Alca needs to update the GTs of run2_mc_pre_vfp and run2_mc with 111 GT. We should not roll back GT to 10_6 again.

@christopheralanwest @tlampen @tocheng Could you please have a look? Thanks.

@christopheralanwest
Contributor

> > run2_mc auto-GT is supposed to be used only for 2016.
> > For 2018 the correct (realistic GT) is phase1_2018_realistic, similarly for 2017.
>
> Ah, I didn't realize that. So the comment in the autoCond.py is correct (I wasn't sure).
>
> Then I think, for 10_6_X, 'auto:run2_mc_pre_vfp' and 'auto:run2_mc' should be updated to point to '106X_mcRun2_asymptotic_preVFP_v6' and '106X_mcRun2_asymptotic_v12' as used in the latest UL16 RelVals [1,2]. For 11_1_X, new global tags might need to be created?
>
> [1] https://dmytro.web.cern.ch/dmytro/cmsprodmon/workflows.php?campaign=CMSSW_10_6_11_CANDIDATE3__hltul16_pre-1585301830
> [2] https://dmytro.web.cern.ch/dmytro/cmsprodmon/workflows.php?campaign=CMSSW_10_6_11_CANDIDATE3__hltul16_post-1585306066

The latest 10_6_X GTs for 2016 MC are:

  • 106X_mcRun2_asymptotic_v13
  • 106X_mcRun2_asymptotic_preVFP_v8

These differ from the GTs you suggested by the addition of an updated L1TCaloParamsRcd tag L1TCaloParams_static_v3_3_1_UL2016v3_mc:

https://cms-conddb.cern.ch/cmsDbBrowser/diff/Prod/gts/106X_mcRun2_asymptotic_preVFP_v6/106X_mcRun2_asymptotic_preVFP_v8
https://cms-conddb.cern.ch/cmsDbBrowser/diff/Prod/gts/106X_mcRun2_asymptotic_v12/106X_mcRun2_asymptotic_v13

Are the tags in the latest GTs the correct tags to be used in 10_6_X and later? I'm asking because the move from 106X_mcRun2_asymptotic_preVFP_v5 to 106X_mcRun2_asymptotic_preVFP_v6 also modified the L1TCaloParamsRcd tag:

https://cms-conddb.cern.ch/cmsDbBrowser/diff/Prod/gts/106X_mcRun2_asymptotic_preVFP_v6/106X_mcRun2_asymptotic_preVFP_v5

Should the cosmics GT also be updated with the changes?

After receiving a response, I can make a PR for master and the appropriate backports.
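
For anyone who wants to repeat these comparisons with other GT pairs, the diff links above follow a common pattern. Below is a small sketch, assuming the cmsDbBrowser URL layout stays as it appears in those links; `gt_diff_url` is a hypothetical helper name, not an existing tool.

```python
# Sketch: build a cmsDbBrowser diff link for two global tags.
# The URL layout is copied from the links in this thread and is assumed
# to be stable; gt_diff_url is a hypothetical helper, not an existing tool.
BASE = "https://cms-conddb.cern.ch/cmsDbBrowser/diff/Prod/gts"


def gt_diff_url(gt_a, gt_b):
    """Return the diff URL comparing global tags gt_a and gt_b."""
    return f"{BASE}/{gt_a}/{gt_b}"


print(gt_diff_url("106X_mcRun2_asymptotic_v12", "106X_mcRun2_asymptotic_v13"))
```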

@jiafulow
Contributor

jiafulow commented May 6, 2020

@christopheralanwest I believe it's ok to use the latest GTs that you listed. I checked that L1TMuonEndCapForestRcd and L1TMuonEndCapParamsRcd are correct (there's no difference in those records).

Would you also be able to update the 11_1_X GTs? I'm not sure if there are 11_1_X GTs with the correct L1TMuonEndCapForestRcd and L1TMuonEndCapParamsRcd records at the moment, so maybe they need to be created?

@christopheralanwest
Contributor

@jiafulow Could you also confirm that the 2016 cosmics GT should be updated in the same way as the collision GTs? The difference between the latest cosmics and collision GTs for 2016:

https://cms-conddb.cern.ch/cmsDbBrowser/diff/Prod/gts/106X_mcRun2cosmics_startup_deco_v6/106X_mcRun2_asymptotic_v13

is larger than it had been previously:

https://cms-conddb.cern.ch/cmsDbBrowser/diff/Prod/gts/106X_mcRun2_asymptotic_v9/106X_mcRun2cosmics_startup_deco_v6

Could you confirm that the only differences between the L1T tags in the pp collision and cosmics GTs are in the following records:

  • L1MuCSCTFConfigurationRcd
  • L1MuDTTFMasksRcd
  • L1MuDTTFParametersRcd
  • L1RPCBxOrConfigRcd
  • L1RPCConeDefinitionRcd
  • L1RPCConfigRcd
  • L1RPCHsbConfigRcd

@jiafulow
Contributor

jiafulow commented May 9, 2020

@christopheralanwest I'm not familiar with any of the listed records, so I don't think I know how to answer the question. Maybe @davignon or other L1T experts could help?

@christopheralanwest
Contributor

> @christopheralanwest I'm not familiar with any of the listed records, so I don't think I know how to answer the question. Maybe @davignon or other L1T experts could help?

@davignon @bundocka Could you respond to the question at #29237 (comment)? Thanks.

@davignon
Contributor

@christopheralanwest @jiafulow
These are the differences between the two tags (cosmics = 1, pp = 2), as far as L1 is concerned:

L1TCaloParamsRcd:

  1. L1TCaloParams_static_CMSSW_9_2_10_2017_v1_8_2_updateHFSF_v6MET
  2. L1TCaloParams_static_v3_3_1_UL2016v3_mc
    --> 1) should be changed to L1TCaloParams_static_v3_3_1_UL2016v3_mc

L1TMuonEndCapForestRcd:

  1. L1TMuonEndCapForest_static_Sq_20170613_v7_mc
  2. L1TMuonEndCapForest_static_2016_mc

L1TMuonEndCapParamsRcd:

  1. L1TMuonEndCapParams_static_v96.47_2017_MC
  2. L1TMuonEndCapParams_static_2016_mc
    --> I'll let @jiafulow and @abrinke1 comment on the two records above and on what should be used for 2016 cosmics.

L1TMuonGlobalParamsRcd:

  1. L1TMuonGlobalParams_static_v94.6.1
  2. L1TMuonGlobalParams_static_UL2016v2_mc
    --> I think 1) should be changed to L1TMuonGlobalParams_static_UL2016v2_mc, but @dinyar should confirm that this is suitable for a cosmics configuration.

L1TUtmTriggerMenuRcd:

  1. L1Menu_Collisions2016_v9_m2_xml
  2. Menu_Collisions2016_v6r5_ugt_1board_xml
    --> 1) should contain Menu_Collisions2016_v6r5_ugt_1board_xml
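
The comparison above can also be written down as a small Python sketch, in case it helps with cross-checking the candidate GTs later. The tag names are copied from the list above; the helper `differing_records` is only illustrative and does not query the conditions database.

```python
# Record -> tag for the L1T records listed above (1 = cosmics, 2 = pp).
# Tag names are copied from this comment; nothing is read from condDB.
COSMICS = {
    "L1TCaloParamsRcd": "L1TCaloParams_static_CMSSW_9_2_10_2017_v1_8_2_updateHFSF_v6MET",
    "L1TMuonEndCapForestRcd": "L1TMuonEndCapForest_static_Sq_20170613_v7_mc",
    "L1TMuonEndCapParamsRcd": "L1TMuonEndCapParams_static_v96.47_2017_MC",
    "L1TMuonGlobalParamsRcd": "L1TMuonGlobalParams_static_v94.6.1",
    "L1TUtmTriggerMenuRcd": "L1Menu_Collisions2016_v9_m2_xml",
}
PP = {
    "L1TCaloParamsRcd": "L1TCaloParams_static_v3_3_1_UL2016v3_mc",
    "L1TMuonEndCapForestRcd": "L1TMuonEndCapForest_static_2016_mc",
    "L1TMuonEndCapParamsRcd": "L1TMuonEndCapParams_static_2016_mc",
    "L1TMuonGlobalParamsRcd": "L1TMuonGlobalParams_static_UL2016v2_mc",
    "L1TUtmTriggerMenuRcd": "Menu_Collisions2016_v6r5_ugt_1board_xml",
}


def differing_records(gt_a, gt_b):
    """Return the sorted records whose tag differs or is missing in one GT."""
    records = set(gt_a) | set(gt_b)
    return sorted(r for r in records if gt_a.get(r) != gt_b.get(r))


for rec in differing_records(COSMICS, PP):
    print(f"{rec}: {COSMICS.get(rec)} -> {PP.get(rec)}")
```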

Regarding your question:
"Could you confirm that the only difference between L1T tags the pp collision and cosmics GTs are in the following records:

L1MuCSCTFConfigurationRcd
L1MuDTTFMasksRcd
L1MuDTTFParametersRcd
L1RPCBxOrConfigRcd
L1RPCConeDefinitionRcd
L1RPCConfigRcd
L1RPCHsbConfigRcd"

I'm afraid I don't know the answer... Maybe @dildick or @gekobs or @dinyar know?

Best,
Olivier

@jiafulow
Contributor

For 2016 cosmics (MC), for L1TMuonEndCapForestRcd and L1TMuonEndCapParamsRcd, please use L1TMuonEndCapForest_static_2016_mc and L1TMuonEndCapParams_static_2016_mc. Though, I should note that the cosmics reconstruction in EMTF doesn't actually use the BDT (the "forest" record).

Sorry that I didn't answer this in my earlier reply.

@dinyar
Contributor

dinyar commented May 18, 2020

Sorry for the delay. I confirm that L1TMuonGlobalParams_static_UL2016v2_mc should be used.

For the question "Could you confirm that the only difference between L1T tags the pp collision and cosmics GTs are in the following records: [...]" I'm afraid I also don't know the answer.

@makortel
Contributor

Could someone comment briefly on the status of the fix? Thanks.

@silviodonato
Contributor Author

@jiafulow @christopheralanwest @davignon @dinyar What is the status of the fix? Are you going to update these records in the 111X and 112X GT?

@hjkwon260
Contributor

hjkwon260 commented Jun 5, 2020

Hi - may I ask a few questions about the GT update (with the correct L1T tags)?

  1. Do we only need to submit a request for the correct L1T tags in the GT queue, or should we create a candidate GT as well?
  2. In condDB, I couldn't find 111X or 112X GTs (for MC 16). Should the L1T tag update be done on the 110X GTs (MC 16) only?
  3. If so (point 2), can these queues be used?
    • pp:
      • 110X_mcRun2_asymptotic_preVFP_Queue
      • 110X_mcRun2_asymptotic_Queue
    • cosmic:
      • 110X_mcRun2cosmics_startup_deco_Queue

Thanks,
Hyejin (L1T AlcaDB contact)

@christopheralanwest
Contributor

Do I understand correctly that these changes are needed for 11_0_X, 11_1_X and 11_2_X? If so, you can queue the tags to:

  • 110X_mcRun2_asymptotic_preVFP_Queue
  • 110X_mcRun2_asymptotic_Queue
  • 110X_mcRun2cosmics_startup_deco_Queue

Since no other changes have been made to these GTs since 110X, we can then use the 110X GTs for all three release series.

Please queue the tags and create the candidate GTs. Thanks.

christopheralanwest added a commit to christopheralanwest/cmssw that referenced this issue Jun 8, 2020
christopheralanwest added a commit to christopheralanwest/cmssw that referenced this issue Jun 9, 2020
cmsbuild added a commit that referenced this issue Jun 10, 2020
…29237

Updated L1T tags to address CMSSW issue #29237
@silviodonato
Contributor Author

After #30151, everything seems to work properly
See
#30123 (comment)
#30170 (comment)
#30172 (comment)
#30173 (comment)
#30174 (comment)

christopheralanwest added a commit to christopheralanwest/cmssw that referenced this issue Jun 12, 2020
cmsbuild added a commit that referenced this issue Jun 12, 2020
…29237-11_1_X

Updated L1T tags to address CMSSW issue #29237 [11_1_X]
cmsbuild added a commit that referenced this issue Jun 12, 2020
…29237_11_0_X

Updated L1T tags to address CMSSW issue #29237 [11_0_X]
cmsbuild added a commit that referenced this issue Jun 18, 2020
…29237-10_6_X

Updated L1T tags to address CMSSW issue #29237 [10_6_X]
@christopheralanwest
Contributor

+1

This is resolved by #30151 and backports

@smuzaffar
Contributor

thanks
