DiMuonMassBiasClient does not support concurrent lumis #39180
Comments
assign alca |
A new Issue was created by @makortel Matti Kortelainen. @Dr15Jones, @perrotta, @dpiparo, @rappoccio, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
New categories assigned: alca @yuanchao,@francescobrivio,@malbouis,@saumyaphor4252,@tvami,@ChrisMisan you have been requested to review this Pull request/Issue and eventually sign? Thanks |
I think it's being resolved in |
Caused by #39148 and A quick workaround would be to amend cmssw/Configuration/AlCa/python/autoAlca.py Lines 53 to 54 in f3a1901
but given the widespread failures I'm wondering if this would cause too many job types to not use concurrent lumis? |
How? This condition is not visible in single-thread tests. |
unassign alca |
assign dqm |
New categories assigned: dqm @jfernan2,@ahmad3213,@micsucmed,@rvenditti,@emanueleusai,@pmandrik you have been requested to review this Pull request/Issue and eventually sign? Thanks |
What are the options here?
and I am not sure which job of the Tier-0 production the harvesting of this would fit into. |
Maybe it would be time to seriously consider one? (although I guess it could be a major effort) |
@cms-sw/dqm-l2 can you please comment on the feasibility of that? |
I'm not sure how to put together a lumi-safe version of a DQMEDHarvester, but as DQM we're open to looking into it and discussing it with the CMSSW experts. So I support opening a dedicated issue to keep track of the task. |
Let me backtrack a bit. Back in the day, the stance of the DQM group was along the lines of "harvesting modules would never be run outside of specific harvesting jobs that are single-threaded". The DQM documentation in https://github.com/cms-sw/cmssw/blob/master/DQMServices/Core/README.md#processing-environments mentions
Question for @cms-sw/alca-l2 perhaps? A challenge with the cmssw/DQMServices/Core/interface/DQMEDHarvester.h Lines 43 to 52 in 6d0a48b
regardless of what the class inheriting from DQMEDHarvester actually implements. In this case DiMuonMassBiasClient implements only the dqmEndJob() function (i.e. the endProcessBlock transition), which means DiMuonMassBiasClient itself would be safe with respect to concurrent lumis and runs! It is just that the intermediate base class prevents it for no apparent good reason. The best way would be to have the concrete harvester modules declare, via base class template arguments, the transitions they want to interact with. At minimum, a separate (e.g.) DQMEDJobHarvester base class that listens only to the endProcessBlock transition (not lumi/run) would allow DiMuonMassBiasClient to run as part of a non-harvesting job without preventing concurrent lumis (or runs).
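To make the point above concrete, here is a minimal toy sketch in plain C++. It is not the CMSSW framework API: `ToyModule`, `ToyHarvester`, `ToyJobHarvester`, the two client structs and `allowsConcurrentLumis` are all hypothetical names, and the scheduling rule is an assumption that simply restates the argument above (a module subscribed to lumi or run transitions blocks concurrent lumis). It only illustrates why a client that implements nothing but dqmEndJob() is still penalised by what its base class subscribes to.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model only; every name here is hypothetical, not a CMSSW type.
enum class Transition { EndLumi, EndRun, EndProcessBlock };

// The toy framework sees only the transitions a module declares.
class ToyModule {
public:
  virtual ~ToyModule() = default;
  virtual std::vector<Transition> listensTo() const = 0;
};

// Mimics the situation described above: the base class subscribes to
// lumi and run ends regardless of what the concrete class overrides.
class ToyHarvester : public ToyModule {
public:
  std::vector<Transition> listensTo() const override {
    return {Transition::EndLumi, Transition::EndRun, Transition::EndProcessBlock};
  }
  virtual void dqmEndLuminosityBlock() {}
  virtual void dqmEndRun() {}
  virtual void dqmEndJob() = 0;
};

// Mimics the suggested alternative: a base class that subscribes
// only to the end-of-process-block transition.
class ToyJobHarvester : public ToyModule {
public:
  std::vector<Transition> listensTo() const override {
    return {Transition::EndProcessBlock};
  }
  virtual void dqmEndJob() = 0;
};

// Two clients with identical behaviour: both implement only dqmEndJob().
struct ClientOnCurrentBase : ToyHarvester {
  bool fitted = false;
  void dqmEndJob() override { fitted = true; }
};

struct ClientOnJobBase : ToyJobHarvester {
  bool fitted = false;
  void dqmEndJob() override { fitted = true; }
};

// Assumed toy rule: concurrent lumis are possible only if no module
// in the job is subscribed to a lumi or run transition.
bool allowsConcurrentLumis(const std::vector<const ToyModule*>& modules) {
  return std::none_of(modules.begin(), modules.end(), [](const ToyModule* m) {
    const auto transitions = m->listensTo();
    return std::any_of(transitions.begin(), transitions.end(), [](Transition t) {
      return t == Transition::EndLumi || t == Transition::EndRun;
    });
  });
}
```

In this toy, a job containing `ClientOnCurrentBase` is rejected for concurrent lumis while one containing `ClientOnJobBase` is accepted, even though the two clients do exactly the same work. Declaring the transitions through base class template arguments, as suggested above, would generalise this so that each harvester subscribes only to what it actually uses.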
(@mmusich by the way, since cmssw/DQMOffline/Alignment/src/DiMuonMassBiasClient.cc Lines 27 to 28 in 6d0a48b
has no effect in practice; consuming edm::InProcess could, but DQMEDAnalyzer does not produce tokens in ProcessBlock)
|
So after thinking a bit, I came up with #39217 cmssw/DQMOffline/Configuration/python/autoDQM.py Lines 96 to 99 in d9d4aec
that would be run at Tier-0 for the The problem with that PR is that since For these reasons, I think it would still make sense to provide something along these lines:
in order to allow About:
I removed the |
Many workflows in CMSSW_12_5_X_2022-08-24-1100 fail with