-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DQM: Use ProcessBlock in Harvesting #30698
Conversation
The code-checks are being triggered in jenkins. |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-30698/16998
|
A new Pull Request was created by @schneiml (Marcel Schneider) for master. It involves the following packages: DQMServices/Components @andrius-k, @kmaeshima, @schneiml, @cmsbuild, @jfernan2, @fioriNTU can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
please test I don't expect anything interesting but just to be sure. |
The tests are being triggered in jenkins.
|
Thanks @schneiml for trying out!
By quick look we are (at least) missing the cmssw/FWCore/Framework/interface/WillGetIfMatch.h Lines 32 to 38 in b062071
@wddgit Would you be able to take a deeper look?
A single ProcessBlock should cover arbitrary number of runs processed by a single job. (strictly speaking as long as the output module does not change files, but our feeling is that no-one uses that feature of PoolOutputModule anyway) |
I will take a look and investigate. |
-1 Tested at: 150f194 CMSSW: CMSSW_11_2_X_2020-07-14-2300 I found follow errors while testing this PR Failed tests: UnitTests
I found errors in the following unit tests: ---> test TestDQMServicesDemo had ERRORS |
Comparison job queued. |
Yes, I missed GetterOfProducts. It is not implemented for ProcessBlock. I am working on adding support for that now. When we were discussing the ProcessBlock design, we decided to only have support for getByToken. I think the main point was that we intentionally did not want to provide support for getByLabel and getManyByType. We didn't actually discuss GetterOfProducts, but it makes sense to support that. I will work on that now. It may be as simple as adding a couple lines at the point Matti referenced above and adding a unit test. Internally it is using tokens and depends on the same things as getByToken. |
Comparison is ready @slava77 comparisons for the following workflows were not done due to missing matrix map:
Comparison Summary:
|
PR #30737 adds support to GetterOfProducts for ProcessBlock (not merged yet, tests running at the moment). Thanks for reporting the problem and trying this. Let us know if you notice any other problems or have questions. |
The unit test failure is a real problem, due to legacy modules which are (for now?) still fully supported in harvesting. Since the I see two options:
Edit: The comparison looks pretty clean, so at least we don't use many such legacy modules commonly. |
Unfortunately this won't work because ProcessBlock support was not extended to legacy modules (which we really should get rid of). |
+1 |
Comparison job queued. |
Comparison is ready Comparison Summary:
|
+1 The comparison is as clean as it could be. The changes seem ok to me. We should be prepared for more random misbehavior in harvesting jobs, since these changes will throw the undefined-but-pretty-reliable module ordering out of order and there may be more harvesting code that needs explicit dependencies set up. But that should be easy to fix (like in the SiPixelPhase1 case here) and there is not much we can do about it. |
@cms-sw/alca-l2 do you have any comments? |
+1 |
+1 |
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2) |
+1 |
PR description:
This PR aims to use the new ProcessBlock feature in DQM HARVESTING to fix issue #28354.
See also my recent talk about how this works: https://indico.cern.ch/event/934933/contributions/3928462/attachments/2072544/3479621/dqm-cmssw.pdf (Page 50).
Caveats:
It does not work for now. Either the dependencies are not correctly set up at all or at leastFixed with Add to GetterOfProducts support for ProcessBlock #30737.process.Tracer = cms.Service("Tracer", dumpPathsAndConsumes = cms.untracked.bool(True))
does not correctly display them.endJob
and if their assumptions remain within the guarantees of edm.DQMFileSaver
should be covered and more depdendencies can be added it if turns out to be required. (Update: One more dependency around QTests inSiPixelPhase1Summary
found and fixed here. There might be more like these, and they might show up randomly now and then.)endRun
and then the second harvesting inendJob
(nowendProcessBlock
). That was the only way to do it withoutProcessBlock
s. Now we could do it all inendProcessBlock
(a.k.a.dqmEndJob
) using token dependencies and two separate harvesting modules, but I did not perform this transform. In practice, the only use case for such a setup isSiStripMonitorClient
, and I am happy that it works as is and don't want to break and debug it another time.endJob
withendProcessBlock
for now, which should cover all existing cases. WithProcessBlocks
, it should be possible again to harvest multiple runs independently in one job (or even multi-run harvest multiple groups of runs), but there is no need or support for that right now.endJob
harvesting is now no longer supported in legacy modules. It was not really used anyways. I removed a bunch of emptyendJob
methods to avoid confusion. Some modules still haveendJob
harvesting code, but it is not used in production; it can still work in stand-alone setups that do not rely onDQMFileSaver
.PR validation:
Dependencies are displayed by the tracer, and the PR tests seem ok. This may break some stand-alone workflows (esp. Validation) if they use legacy modules, use end-job harvesting, and use
DQMFileSaver
. (Not sure if that combination exists anywhere).@wddgit @makortel FYI.