Developers overriding production location for retrieving conditions in configurations #27393
A new Issue was created by @davidlange6 David Lange. @davidlange6, @Dr15Jones, @smuzaffar, @fabiocos, @kpedro88 can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
assign l1,db,reco |
New categories assigned: db,l1 @ggovi,@benkrikler,@rekovic you have been requested to review this Pull request/Issue and eventually sign? Thanks |
assign reconstruction |
My personal opinion is that a cfg file containing a Condition Customization (defined with one or more toGet statements) should never be invoked, directly or included, in a production workflow.
For this reason, I think that the first measure required is to find out the reason for each customisation concerned, and remove them from the production workflows.
I think AlCa people should be involved in this review procedure |
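The kind of customization under discussion is, schematically, an extra ESSource carrying its own hard-coded connect string, which bypasses the centrally determined conditions location. A minimal hedged sketch of the pattern (record, tag, and connect string are illustrative placeholders, not taken from the files cited in this issue):

```python
# Hedged sketch of the problematic pattern: a cfg fragment that pins a
# condition record to a hard-coded connect string, overriding the GT.
# All names below are placeholders for illustration only.
import FWCore.ParameterSet.Config as cms

process = cms.Process("DEMO")
process.load("Configuration.StandardSequences.FrontierConditions_GlobalTag_cff")

# The customization: a separate ESSource with its own 'connect' location
process.myConditions = cms.ESSource(
    "PoolDBESSource",
    connect=cms.string("frontier://FrontierProd/CMS_CONDITIONS"),  # hard-coded location
    toGet=cms.VPSet(
        cms.PSet(
            record=cms.string("SomeConditionsRcd"),  # placeholder record name
            tag=cms.string("SomeTag_v1"),            # placeholder tag name
        )
    ),
)
# Force this source to win over the GlobalTag for that record
process.es_prefer_myConditions = cms.ESPrefer("PoolDBESSource", "myConditions")
```

If the central conditions location ever changes, every fragment of this shape silently keeps pointing at the old one.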
for cmssw/RecoTauTag/RecoTau/python/tools/runTauIdMVA.py |
for the L1Trigger instances: @rekovic could you please address this issue? |
Can we add alca in this loop? |
assign alca |
New categories assigned: alca @christopheralanwest,@franzoni,@tlampen,@pohsun,@tocheng you have been requested to review this Pull request/Issue and eventually sign? Thanks |
Hello,
|
+1 the only case affecting reco is clarified in #27393 (comment) |
please elaborate what's in this file. |
all the payloads not in the GT without which a digi-reco-miniaod workflow will not run |
which I think is not to say whether or not they are actually read and used, but rather that our applications depend on them being present in order to function, perhaps for purely technical reasons |
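One way to supply such extra payloads while keeping the conditions location centrally determined is, as I understand the suggestion in this thread, to attach the missing tags to the GlobalTag itself, so they are fetched from wherever the GT connect points. A sketch with placeholder names:

```python
# Hedged sketch: extra tags attached to the GlobalTag rather than to a
# separate ESSource. Record and tag names are placeholders.
import FWCore.ParameterSet.Config as cms
from Configuration.AlCa.GlobalTag import GlobalTag

process = cms.Process("DEMO")
process.load("Configuration.StandardSequences.FrontierConditions_GlobalTag_cff")
process.GlobalTag = GlobalTag(process.GlobalTag, "auto:run2_mc", "")

# No 'connect' here: the payload comes from the GT's own connection,
# so a change of the central location is picked up automatically.
process.GlobalTag.toGet = cms.VPSet(
    cms.PSet(
        record=cms.string("SomeMissingRcd"),      # placeholder record
        tag=cms.string("SomeMissingTag_v1"),      # placeholder tag
    )
)
```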
(sorry, didn't mean to close this) |
I think that the origin is in
not in the
|
If these changes are expected to affect the output we would like to have them as soon as possible for testing and eventually reprocessing of the example PFNano datasets that will go to the open data release. |
For my edification, why is CMS releasing open data related to a custom workflow (requiring unsupported conditions)? |
Producing NanoAOD enriched with the PF candidates is of major interest to the open data user community. That is why we are providing that example. We do not use unsupported conditions on purpose. We are more than happy to use supported conditions; please let us know how to modify the workflow or point us to the relevant documentation. Thank you, your help will be appreciated🙂! |
@katilp Executing your cmsDriver command I got the following error:
And indeed there is not. In parallel I checked that this configuration without the customization works without problems. This very probably means that the issue is caused by the customization. I will get back to it at the beginning of next week, but I am not sure if I will have time on Monday due to other duties. |
Also please note that this error appears without any customization and has nothing to do with open data, other than open data preparations helping CMS to improve the codebase ❤️ |
this seems in contradiction with #27393 (comment) |
As written in the comment above: I was not able to reproduce the issue w/o the customization, i.e. executing the following
The cmsDriver command with customization does not work for me. |
Thank you!
For the sake of brevity, removing the customization (updated: bringing back two default lines for customization):
Note that it will not fail if a frontier connection is available, only when it is not (to come back to the original question) |
Ah, OK. How can I remove the connection to frontier w/o losing the connection to the GT? |
I see it in your cfg file. I will test it later, but now I must run. |
With the configuration copied from above, which has a connection to conditions in a sqlite file, giving a python dump like this:
it still works for me... |
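For reference, a conditions setup of this kind, with the GlobalTag pointed at a local sqlite file instead of frontier, could look roughly like the following (the file path and GT name are placeholders, not values from this thread):

```python
# Hedged sketch: replace the GT connect string with a local sqlite file,
# so no frontier connection is needed at run time. Placeholder names only.
import FWCore.ParameterSet.Config as cms

process = cms.Process("DEMO")
process.load("Configuration.StandardSequences.FrontierConditions_GlobalTag_cff")

# Point the GlobalTag ESSource at the local file instead of frontier:
process.GlobalTag.connect = cms.string("sqlite_file:conditions.db")
process.GlobalTag.globaltag = cms.string("106X_mc2017_realistic_vX")  # placeholder GT
```

With this, any module that still carries its own hard-coded frontier connect string will fail once frontier is unreachable, which is exactly how the customizations discussed here were noticed.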
for the sake of honesty,
this is not exactly removing the customization; you are still checking out a package that is not available in the release via: |
in a self-contained |
No, connecting to the /cvmfs area for condition data is not enough, it does not cut the frontier connection. That's why this goes unobserved and it only fails when there is no frontier connection at all, i.e. the CMS open data VM, or the CMS open data docker container if one explicitly removes the frontier connection, see https://cms-opendata-releaseguide.docs.cern.ch/computing_environment/containers/#testing-without-frontier-connection |
Yes, of course, this is what we provide for open data users. We do not point them to anything centrally supported but provide them with examples of how they can use CMS open data. We believe that this is an issue independent from the
Yes, it will work if you do not cut the frontier connection. This is the issue. It goes unobserved. |
If you cannot cut the frontier connection, take the RecoTauTag package, add prints to this file locally, and you will see it ending up there. And that's all I'm trying to say. |
I can indeed make the process crash by short-circuiting this via this recipe:
which is independent from the |
I did some more investigations for
For the open NANO release, I suppose the easiest way is to gather a list of the tags needed but not in the GT, and just dump/add them into a sqlite file? I dumped a list of tags being loaded manually in the 106X workflow. Probably not all of them are strictly needed, but having all of them should make things work w/o connecting to Frontier. |
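Assuming the `conddb copy <tag> --destdb <file>` form of the conditions CLI (an assumption on my part, not confirmed in this thread), building the dump commands for a list of missing tags could be sketched as:

```python
# Hedged sketch: given the list of tags missing from the GT, build one
# hypothetical `conddb copy` command per tag, all writing into a single
# sqlite file. Tag names and destination file are placeholders.
def conddb_copy_commands(tags, destdb="missing_tags.db"):
    """Return the `conddb copy` command line for each tag in `tags`."""
    return [f"conddb copy {tag} --destdb {destdb}" for tag in tags]

# Placeholder tag names for illustration:
for cmd in conddb_copy_commands(["SomeTauIdTag_v1", "SomeL1Tag_v2"]):
    print(cmd)
```

The resulting sqlite file could then serve as the single local conditions source for the open data workflow.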
what GT have you used instead of |
Thanks @hqucms! |
Thanks! For open data, we would prefer the cleanest solution i.e. a new release with the fixes and a new GT if that can happen very shortly. It will be more work for us now for changes in already prepared material, but in the future, it will avoid patches and additional explanations in the CMS open data tutorials and guides. What would be the estimated timescale? |
@vlimant |
I think if we create a dedicated sqlite or make a new GT to include the missing tags, then no change is needed for the release. And since the open data will be using a sqlite anyhow, it might be much easier to just add the missing tags to the sqlite, rather than making a new GT. Maybe the Alca/DB group can comment on this? |
I think they commented in https://cms-talk.web.cern.ch/t/condition-database-access-outside-of-gt-for-nano-production/33715/5 |
Hello, sorry for the delay in answering, but I was taken by other commitments. So, to summarize what needs to be done:
About the timescale: I have other things on the plate (e.g. some L1 stuff for 2024 & phase-2), but I can reorder this if necessary. I suppose that cleaning master will take 1-2 days, and the backport to 10_6 a few additional days. As for the GT update, I have no experience with it, so I either have to learn how to do it or someone else should do the job; any help will be appreciated. |
Thank you @mbluj If I understand it correctly we should:
If you remove the tauIDs also from UL/10_6 then no update of the GTs is needed, but as it will probably require non-trivial modifications to a closed release, this should probably be avoided. For the intermediate releases I would check one by one: probably we could avoid acting on all of them, but it depends on what is intended to be done with them. |
Correct. As far as I understand the problem it touches only NanoAOD workflows.
It should be checked. In principle a "full cleaning" could require changes in the AOD/RECO, miniAOD and NanoAOD sequences, but I suppose that the changes will not affect the data content at any of the datatiers. The only effect could be the removal of era-dependent modifications for compatibility with old (pre-UL) samples. Anyway, if I understand it correctly, the idea is to update only the NanoAOD sequences, as it is not expected to produce new UL-like samples other than NanoAOD for OpenData purposes, right?
OK. Changes in intermediate releases newer than 11_3 (if I am correct) will be similar to those in master while for older similar to those for 10_6. But, even trivial backporting and testing (sometimes creating GT) to a number of release series is already some additional burden. |
Pinging this thread along with #43797 (@mbluj) In the Open Data context, DPOA's preferred solution is cleaning these IDs from UL/10_6 so that a new release can be used without a new (bigger) GT. Nano sequences are very likely to be the most used in Open Data, but we do provide instructions on how users can produce their own MC, which follows the full sequence. It would be ideal to clean this out fully. But the Nano sequences have the highest priority. We can test where in the full production chain it fails if needed. |
please close done with #44685 until further notice |
Breaks centrally determined location for retrieving conditions (which of course can change even if it rarely does). All of these should rely on the global tag to determine from where the conditions are to be taken (or they should not be loaded into production workflows)
cmssw/L1Trigger/L1TMuon/python/simGmtStage2Digis_cfi.py
Line 23 in 02d4198
cmssw/RecoTauTag/RecoTau/python/tools/runTauIdMVA.py
Line 41 in 82c4a52
cmssw/L1Trigger/L1TCalorimeter/python/simDigis_cff.py
Line 49 in 8587c76