Don't handle column types redundantly anymore #401
hi, triggering the bot meanwhile |
Automatic test started, see https://gitlab.cern.ch/cms-nanoAOD/nanoAOD-integration/pipelines/1112208/builds |
Hi @arizzi, yes I think that's what I did. I tested by stripping the nano configs down to just the GenPart table for a quick check, and at the end of my cfg I did:
process.out = cms.OutputModule("NanoAODOutputModule",
    fileName = cms.untracked.string('out.root'),
    outputCommands = process.NanoAODEDMEventContent.outputCommands,
    compressionLevel = cms.untracked.int32(9),
    compressionAlgorithm = cms.untracked.string("LZMA"),
    dataset = cms.untracked.PSet(
        dataTier = cms.untracked.string('NANOAODSIM'),
        filterName = cms.untracked.string('')
    ),
)
Is that what you mean? |
Please update |
nope, I mean using PoolOutputModule rather than NanoAODOutputModule and then converting to actual (flat) nano in the merge step. |
Sorry for not being experienced with this yet. Alright I swapped |
The "long" bot tests for data do run the EDM output + merging and
conversion to NANO
Giovanni
On Tue, 24 Sep 2019, 15:16 arizzi <[email protected]> wrote:
…
https://github.com/cms-sw/cmssw/blob/master/Configuration/DataProcessing/python/Merge.py
|
Okay I used the
from Configuration.DataProcessing.Merge import mergeProcess
process = mergeProcess(
    ["file:out1.root", "file:out2.root", "file:out3.root"],
    process_name = "Merge",
    output_file = "Merged.root",
    output_lfn = None,
    newDQMIO = False,
    mergeNANO = True,
    bypassVersionCheck = False)
print(process.dumpPython())
It worked fine, and in the end I once more got a "flat ntuple". |
Automatic test report for 1112208
- gitlab pipeline at https://gitlab.cern.ch/cms-nanoAOD/nanoAOD-integration/pipelines/1112208/builds
- outputs at https://cms-nanoaod-integration.web.cern.ch/integration/test_pr_401/
Code integration
Code checks passed for this PR
Please update PhysicsTools/NanoAOD/python/nanoDQM_cfi.py: take this patch or run prepareDQM.py -d -u nano_file_mc.root, and then if needed adjust the plot range using some human common sense.
Tests
- Long test data102X (10000 events): passed, no significant changes; dqm plots: all, diff
- Long test data106X (9000 events): passed, no significant changes; dqm plots: all, diff
- Long test data80X (10000 events): passed, no significant changes; dqm plots: all, diff
- Long test data80Xhip (3000 events): passed, no significant changes; dqm plots: all, diff
- Long test data94X (10000 events): passed, no significant changes; dqm plots: all, diff
- Long test data94X2016 (10000 events): passed, no significant changes; dqm plots: all, diff
- Long test data94Xv2 (10000 events): passed, no significant changes; dqm plots: all, diff
- Long test mc102X (9000 events): passed, no significant changes; dqm plots: all, diff
- Long test mc106X (9000 events): passed, no significant changes; dqm plots: all, diff
- Long test mc80X (10000 events): passed, no significant changes; dqm plots: all, diff
- Long test mc94X (10000 events): passed, no significant changes; dqm plots: all, diff
- Long test mc94X2016 (9000 events): passed, no significant changes; dqm plots: all, diff
- Long test mc94Xv2 (9000 events): passed, no significant changes; dqm plots: all, diff
- Test mc_94Xv2: passed
- Test mc_102X: passed
- Test data_94X: passed
- Test data_102X: passed
Disk size report
Sample | kb/event | ref kb/event | diff |
---|---|---|---|
TTbar MC 102X | 1.831 | 1.831 | 0.000 ( +0.0% ) |
TTbar MC 94Xv1 | 1.924 | 1.924 | 0.000 ( +0.0% ) |
TTbar MC 94Xv2 | 1.956 | 1.956 | 0.000 ( +0.0% ) |
TTbar MC 94X2016 | 1.745 | 1.746 | -0.000 ( -0.0% ) |
TTbar MC 80X | 1.902 | 1.900 | 0.001 ( +0.1% ) |
Data 102X | 0.963 | 0.963 | 0.000 ( +0.0% ) |
Data 94Xv1 | 0.913 | 0.913 | -0.000 ( -0.0% ) |
Data 80X | 0.793 | 0.793 | -0.000 ( -0.0% ) |
Data 80X, Mu Run2016E | 0.775 | 0.775 | 0.000 ( +0.1% ) |
Automatic test started, see https://gitlab.cern.ch/cms-nanoAOD/nanoAOD-integration/pipelines/1202133/builds |
Hi @mariadalfonso, the commits here are not exactly the same as in the cms-sw PR (cms-sw#30436). There is just one additional commit from cms-sw#30273 on which my developments depended. I think we should also backport this boost commit in cms-sw together with the nano-types PR to really keep the nano-types changes nicely in sync. |
Automatic test started, see https://gitlab.cern.ch/cms-nanoAOD/nanoAOD-integration/pipelines/1812488/builds |
Automatic test report for 1812488
- gitlab pipeline at https://gitlab.cern.ch/cms-nanoAOD/nanoAOD-integration/pipelines/1812488/builds
- outputs at https://cms-nanoaod-integration.web.cern.ch/integration/test_pr_401/
Code integration
Code checks passed for this PR
Code format passed for this PR
Tests
- Long test data102X (10000 events): passed, no significant changes; dqm plots: all, diff
- Long test data106Xul17 (9000 events): passed, no significant changes; dqm plots: all, diff
- Long test data106Xul18 (9000 events): passed, no significant changes; dqm plots: all, diff
- Long test data80X (10000 events): passed, no significant changes; dqm plots: all, diff
- Long test data94X (10000 events): passed, no significant changes; dqm plots: all, diff
- Long test data94X2016 (10000 events): passed, no significant changes; dqm plots: all, diff
- Long test data94Xv2 (10000 events): passed, no significant changes; dqm plots: all, diff
- Long test mc102X (9000 events): passed, no significant changes; dqm plots: all, diff
- Long test mc106Xul16 (9000 events): passed, no significant changes; dqm plots: all, diff
- Long test mc106Xul17 (9000 events): passed, with differences; dqm plots: all, diff
- Long test mc106Xul18 (9000 events): passed, no significant changes; dqm plots: all, diff
- Long test mc80X (10000 events): passed, no significant changes; dqm plots: all, diff
- Long test mc94X (10000 events): passed, no significant changes; dqm plots: all, diff
- Long test mc94X2016 (9000 events): passed, no significant changes; dqm plots: all, diff
- Long test mc94Xv2 (9000 events): passed, no significant changes; dqm plots: all, diff
- Test mc_94Xv2: passed
- Test mc_102X: passed
- Test data_94X: passed
- Test data_102X: passed
Disk size report
Sample | kb/event | ref kb/event | diff |
---|---|---|---|
TTbar MC 102X | 1.998 | 1.997 | 0.001 ( +0.0% ) |
TTbar MC 94Xv1 | 2.054 | 2.054 | 0.000 ( +0.0% ) |
TTbar MC 94Xv2 | 2.095 | 2.095 | -0.000 ( -0.0% ) |
TTbar MC 94X2016 | 1.889 | 1.886 | 0.002 ( +0.1% ) |
TTbar MC 80X | 2.006 | 2.007 | -0.001 ( -0.1% ) |
Data 102X | 1.068 | 1.067 | 0.000 ( +0.0% ) |
Data 94Xv1 | 1.019 | 1.018 | 0.000 ( +0.0% ) |
Data 80X | 0.870 | 0.870 | 0.000 ( +0.0% ) |
tests are successful, |
Thanks, so I can close this PR I guess. |
Hi NanoAOD devs!
I hope this is the right cmssw fork and branch for this PR.
Yesterday I wanted to introduce some new column types in my private nanoAOD productions (int16_t for example, to save a bit of space) and use them in the flat table producers. However, I realized that many parts of the NanoAOD code have to be tweaked if you want to do this, as the way column types are handled is not completely trivial.
One source of complication is that when you add a column to a flat table with addColumn(), you have to pass the type as a template argument as well as an enum value in the function parameters. After working a bit with the code, I understood that this is redundant, because the check_type function makes sure you always use the right enum value with the right template parameter. Therefore, we could just drop this enum parameter and deduce it from the template argument. In this situation, we would also not need check_type anymore.
The only tricky part is bool columns, which should actually be represented by a uint8_t vector. So far, the logic to take care of this had to be implemented in the plugins that made use of the FlatTable class, but I think I found a way to have this logic directly in the FlatTable class, so one can just use addColumn<bool> to create bool columns and they will be internally stored in the uint8_t vector.
What do you think? This already simplifies the type handling quite a bit, and I think it's a good path towards a FlatTable class that will support all basic types that you can also store in TTrees.
I tested this with the local matrix tests so far. Can the nano-bot tests still be done here? That would be very cool!
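To illustrate the idea described above, here is a minimal, self-contained C++17 sketch of deducing the column-type enum from the template argument and of storing bool columns in the uint8_t vector. It is not the code from this PR: the names FlatTableSketch, ColumnType, defaultColumnType, storage, uint8Storage, and makeExampleTable are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>
#include <vector>

// Helper so that the static_assert below only fires for unsupported types.
template <typename>
struct dependent_false : std::false_type {};

// Hypothetical stand-in for a flat table class (assumption, not the CMSSW class).
class FlatTableSketch {
public:
  enum class ColumnType { Float, Int, UInt8, Bool };

  struct Column {
    std::string name;
    ColumnType type;
    std::size_t firstIndex;  // offset into the storage vector
    std::size_t size;        // number of rows
  };

  // Deduce the enum value from the template argument at compile time,
  // so the caller no longer has to pass it as a function parameter.
  template <typename T>
  static constexpr ColumnType defaultColumnType() {
    if constexpr (std::is_same_v<T, float>) return ColumnType::Float;
    else if constexpr (std::is_same_v<T, int>) return ColumnType::Int;
    else if constexpr (std::is_same_v<T, std::uint8_t>) return ColumnType::UInt8;
    else if constexpr (std::is_same_v<T, bool>) return ColumnType::Bool;
    else static_assert(dependent_false<T>::value, "unsupported column type");
  }

  // Add a column of logical type T; bool values end up in the uint8_t storage.
  template <typename T, typename Container>
  void addColumn(const std::string& name, const Container& values) {
    // All columns of one table must have the same number of rows.
    assert(columns_.empty() || values.size() == columns_.front().size);
    using Stored = std::conditional_t<std::is_same_v<T, bool>, std::uint8_t, T>;
    auto& store = storage<Stored>();
    columns_.push_back({name, defaultColumnType<T>(), store.size(), values.size()});
    store.insert(store.end(), values.begin(), values.end());
  }

  std::size_t nColumns() const { return columns_.size(); }
  ColumnType columnType(std::size_t i) const { return columns_.at(i).type; }
  const std::vector<std::uint8_t>& uint8Storage() const { return uint8s_; }

private:
  // Select the storage vector that matches the stored type S.
  template <typename S>
  std::vector<S>& storage() {
    if constexpr (std::is_same_v<S, float>) return floats_;
    else if constexpr (std::is_same_v<S, int>) return ints_;
    else {
      static_assert(std::is_same_v<S, std::uint8_t>, "unsupported storage type");
      return uint8s_;
    }
  }

  std::vector<Column> columns_;
  std::vector<float> floats_;
  std::vector<int> ints_;
  std::vector<std::uint8_t> uint8s_;
};

// Small usage example: one float column and one bool column with two rows each.
inline FlatTableSketch makeExampleTable() {
  FlatTableSketch table;
  table.addColumn<float>("pt", std::vector<float>{10.5f, 20.0f});
  table.addColumn<bool>("isPrompt", std::vector<bool>{true, false});
  return table;
}
```

In this sketch the column still remembers that it is logically a bool column (through the deduced enum value), while the data itself lives in the uint8_t vector, so the calling plugin does not need any special handling. Supporting an additional type such as int16_t would then mean adding one enum value, one branch, and one storage vector in a single place.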
Thanks for considering this and cheers,
Jonas