Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ROOT IORule for std::auto_ptr -> std::unique_ptr does not work with high split level [ROOT] #43923

Open
Dr15Jones opened this issue Feb 8, 2024 · 21 comments

Comments

@Dr15Jones
Copy link
Contributor

Dr15Jones commented Feb 8, 2024

The iorule in SimDataFormats/GeneratorProducts which deals with converting files written using std::auto_ptr<gen::PdfInfo> but read with releases using std::unique_ptr<gen::PdfInfo> does not work if the file was written using split level of 99. It does work if the file was written using split level 0.

In both cases, the iorule is executed, but in the high split level case the address returned from std::auto_ptr<>::get is always null.

@Dr15Jones
Copy link
Contributor Author

assign core, root

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 8, 2024

New categories assigned: core

@Dr15Jones,@makortel,@smuzaffar you have been requested to review this Pull request/Issue and eventually sign? Thanks

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 8, 2024

cms-bot internal usage

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 8, 2024

A new Issue was created by @Dr15Jones Chris Jones.

@Dr15Jones, @rappoccio, @smuzaffar, @makortel, @sextonkennedy, @antoniovilela can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@Dr15Jones
Copy link
Contributor Author

The iorule in question are

<ioread sourceClass = "GenEventInfoProduct" version="[-11]" targetClass="GenEventInfoProduct" source = "std::auto_ptr<gen::PdfInfo> pdf_;" target="pdf_">
<![CDATA[pdf_.reset(onfile.pdf_.get()); onfile.pdf_.release();]]>
</ioread>

<ioread sourceClass = "LHEEventProduct" version="[13]" targetClass="LHEEventProduct" source = "std::auto_ptr<gen::PdfInfo> pdf_;" target="pdf_">
<![CDATA[pdf_.reset(onfile.pdf_.get()); onfile.pdf_.release();]]>
</ioread>

@Dr15Jones
Copy link
Contributor Author

@pcanal FYI

@makortel
Copy link
Contributor

makortel commented Feb 8, 2024

type root

@vlimant
Copy link
Contributor

vlimant commented May 21, 2024

assign xpog
because of the interest in reading run2 legacy AODSIM

@cmsbuild
Copy link
Contributor

New categories assigned: xpog

@vlimant,@hqucms you have been requested to review this Pull request/Issue and eventually sign? Thanks

@vlimant
Copy link
Contributor

vlimant commented May 31, 2024

in the context of EOY24 reprocessing, this needs some attention. what are cms options on this matter ?

@makortel
Copy link
Contributor

makortel commented Jun 1, 2024

Ideally we'd have the iorule behavior fixed in ROOT.

Thinking about workarounds, I guess in principle we could run a simple conversion job using 10_6_X first that reads the high-split-level AODSIM file and produces a splitlevel-0 file, that is then read by the first 15_0_X processing step.

@vlimant
Copy link
Contributor

vlimant commented Jun 3, 2024

Hi @makortel , can you please test and share one such configuration ?

@makortel
Copy link
Contributor

makortel commented Jun 3, 2024

I'd expect something along the following to do the job

import FWCore.ParameterSet.Config as cms

process = cms.Process("ConvertToNonsplit")
process.source = cms.Source("PoolSource",
    fileNames = cms.untracked.vstring("<inputfile>")
)
process.out = cms.OutputModule("PoolOutputModule",
    fileName = cms.untracked.string("<outputfile>"),
    splitLevel = cms.untracked.int32(0),
    overrideInputFileSplitLevels = cms.untracked.bool(True)
)
process.ep = cms.EndPath(process.out)

but I'd leave the testing to others.

@makortel makortel moved this to Needs debugging in ROOT in ROOT prioritization Jun 3, 2024
@makortel
Copy link
Contributor

makortel commented Jun 5, 2024

but I'd leave the testing to others.

I could also rephrase: if someone crafts a test (that gets run in IBs) that checks the std::{auto,unique}_ptr<gen::PdfInfo> is read correctly (including the contents of gen::PdfInfo are read correctly) and thus demonstrates the problem, core can add the aforementioned workaround to that test.

@vlimant
Copy link
Contributor

vlimant commented Jun 14, 2024

so you'd like a test, that reads the old file with the old release, dumps the value somehow. reads the old file with the new release (and the workaround), dumps the value ; and make a comparison of the two. I don't think we really need this to run in the IB, but rather a one-of test that #43923 (comment) would do

@vlimant
Copy link
Contributor

vlimant commented Jun 28, 2024

looks like there is no issue in the end : #43422 (comment)
can you please show me again what is supposed to fail, because reading old 10.6 AODSIM to produce mini and nano in recent (14.1 IB) does seem to function.

@makortel
Copy link
Contributor

makortel commented Jul 8, 2024

looks like there is no issue in the end : #43422 (comment)

I checked the file /eos/cms/store/mc/RunIISummer20UL18RECO/TTToHadronic_TuneCP5_RTT_13TeV-powheg-pythia8/AODSIM/106X_upgrade2018_realistic_v11_L1v1-v2/60009/F4A0C48A-4DC4-7D4C-BD6C-94F9492FAE97.root, and the branch

GenEventInfoProduct                   "generator"                 ""                "GEN"

is not split (i.e. split level is 0), which then explains why the read rule works.

We were probably interpreting the statement "AOD and MiniAOD are stored with split level 99" too literally. What the statement really means is that the data products created by the job writing AOD/MiniAOD are written with split level 99, but the GenEventInfoProduct data product is copied from the input, and by default we preserve the split level of data products from the input file.

If all relevant 10_6_X AODSIM and MiniAODSIM datasets have the same property (i.e. GenEventInfoProduct is stored by an earlier process in the chain that explicitly set the output split level to 0, and the PoolOutputModule's overrideInputFileSplitLevels parameter is never set to True by any later step in the chain), then I guess there is no issue here in practice for the planned Re-Mini+Re-Nano of these datasets.

@makortel
Copy link
Contributor

makortel commented Jul 8, 2024

I don't think we really need this to run in the IB, but rather a one-of test that #43923 (comment) would do

Unfortunately one-off test is not sufficient to guarantee the present functionality stays functional with new code changes (I'm mostly worried about ROOT). Even if the reading seems to work in practice with the 10_6_X files, I'd recommend to craft a test that ensures that capability.

@vlimant
Copy link
Contributor

vlimant commented Jul 9, 2024

thanks @makortel . We would need to find a file with that branch with high split level, than use it as input. Not sure how to find one such 10.6 AODSIM dataset ; chances are they are all with split level 0

@makortel
Copy link
Contributor

makortel commented Sep 5, 2024

We would need to find a file with that branch with high split level, than use it as input. Not sure how to find one such 10.6 AODSIM dataset ; chances are they are all with split level 0

So how worried are we about a 10.6 AODSIM dataset where the GEN information was stored with non-zero split level? In the leading order I'd also guess they would have all been produced with the same output settings, but who knows.

@vlimant
Copy link
Contributor

vlimant commented Sep 6, 2024

@cms-sw/xpog-l2 are less worried about this indeed

@makortel makortel moved this from Needs debugging in ROOT to In Progress in ROOT prioritization Sep 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: In Progress
Development

No branches or pull requests

4 participants