Initial implementation of the FixMissingStreamerInfos service #43175

Merged: 1 commit, Nov 9, 2023

@wddgit wddgit commented Nov 2, 2023

PR description:

The version of ROOT associated with CMSSW_13_0_0 had a bug that caused it to fail to write out StreamerInfo objects for some types in the output. This affected some other releases in the 13_0_X and 13_1_X release cycles. It was fixed in both of those release cycles and didn't affect others. See Issue #41246 for more details.

The missing StreamerInfos don't actually cause a problem until the data format of the associated class changes and someone tries to read the file. Then the schema evolution feature of ROOT is needed and that requires the StreamerInfo objects. This could occur much later.

This PR introduces a workaround that allows one to read the problem objects in those files. It adds a new service with a parameter that points to a file containing only the missing StreamerInfo objects. The service reads that file, which causes the StreamerInfo objects to be stored in memory. The problem objects then become readable and schema evolution succeeds.
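
To illustrate the mechanism, here is a toy sketch in plain Python. It is not ROOT or CMSSW code: `Layout`, `LayoutRegistry`, and the `Track` class with its members are hypothetical stand-ins for a StreamerInfo, the set of StreamerInfos held in memory, and a data format that changed after the file was written. It only shows why an old object cannot be mapped onto a newer class without the old layout, and why preloading that layout from a separate source is enough to make the read succeed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """Toy stand-in for a StreamerInfo: field order of one class version."""
    class_name: str
    version: int
    fields: tuple


class LayoutRegistry:
    """Toy stand-in for the StreamerInfos held in memory."""

    def __init__(self):
        self._layouts = {}

    def load(self, layouts):
        """Preload layouts, as the workaround does from its extra file."""
        for layout in layouts:
            self._layouts[(layout.class_name, layout.version)] = layout

    def read(self, class_name, version, raw_values, current_fields, defaults):
        """Map values written with an old layout onto the current fields."""
        key = (class_name, version)
        if key not in self._layouts:
            raise LookupError(f"no layout for {class_name} version {version}")
        old = dict(zip(self._layouts[key].fields, raw_values))
        return {name: old.get(name, defaults.get(name)) for name in current_fields}


# Hypothetical class "Track": written as version 3 with (pt, eta),
# while the current version also has a "quality" member.
registry = LayoutRegistry()
current_fields = ("pt", "eta", "quality")
defaults = {"quality": 0}

# Without the old layout the read fails, like the problem files do.
try:
    registry.read("Track", 3, (1.5, 0.2), current_fields, defaults)
    failed_without_layout = False
except LookupError:
    failed_without_layout = True

# After preloading the missing layout, the same read succeeds.
registry.load([Layout("Track", 3, ("pt", "eta"))])
track = registry.read("Track", 3, (1.5, 0.2), current_fields, defaults)
```

In this sketch the first read raises because the registry has no entry for version 3, and the second read fills the new member from its default once the layout has been loaded.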

The PR includes a script that can be used to generate the file that contains the StreamerInfo objects.

One shortcoming of this PR is that it is hard to identify all problem types. We found all types missing the StreamerInfo in one input file, and one additional type was identified in the initial problem reports. One can use the script to generate a new file if more problem types are encountered in the future.
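
As a rough sketch of why the search is hard, the toy Python below walks nested member types and reports the ones with no stored layout. It is an illustration under assumptions, not the procedure used for this PR: the function names, the type names, and the nesting map are all hypothetical, and real files would need ROOT to list the stored StreamerInfos. The point is that a missing type can sit several levels below the types one looks at directly, and that one file only reveals the types it happens to contain.

```python
def reachable_types(top_level, members):
    """Return every type reachable from top_level through member types."""
    seen = set()
    stack = list(top_level)
    while stack:
        type_name = stack.pop()
        if type_name in seen:
            continue
        seen.add(type_name)
        stack.extend(members.get(type_name, ()))
    return seen


def missing_layouts(top_level, members, present):
    """List reachable types that have no stored layout, sorted by name."""
    return sorted(reachable_types(top_level, members) - set(present))


# Hypothetical type names and nesting, for illustration only.
members = {
    "EventData": ["TrackList", "Header"],
    "TrackList": ["Track"],
    "Track": ["HitPattern"],
}
present = {"EventData", "TrackList", "Header"}
missing = missing_layouts(["EventData"], members, present)
```

Here `missing` contains the two nested types that have no stored layout, even though every type directly under `EventData` is present.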

There is a second PR associated with this one that will add 2 files to the cms-data repository associated with IOPool/Input. This PR should not be merged before the cms-data PR.

PR validation:

A new unit test is included to verify that this workaround will succeed.

cmsbuild commented Nov 2, 2023

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-43175/37502

  • This PR adds an extra 24KB to repository

cmsbuild commented Nov 2, 2023

A new Pull Request was created by @wddgit (W. David Dagenhart) for master.

It involves the following packages:

  • IOPool/Input (core)

@cmsbuild, @makortel, @smuzaffar, @Dr15Jones can you please review it and eventually sign? Thanks.
@makortel this is something you requested to watch as well.
@rappoccio, @antoniovilela, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

wddgit commented Nov 2, 2023

please test with cms-data/IOPool-Input#2

cmsbuild commented Nov 2, 2023

-1

Failed Tests: RelVals RelVals-INPUT AddOn
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-973223/35568/summary.html
COMMIT: d100af2
CMSSW: CMSSW_13_3_X_2023-11-02-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/43175/35568/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals

RelVals-INPUT

AddOn Tests

----- Begin Fatal Exception 03-Nov-2023 00:28:47 CET-----------------------
An exception of category 'ConfigFileReadError' occurred while
   [0] Processing the python configuration file named /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_13_3_X_2023-11-02-1100/src/Utilities/ReleaseScripts/scripts/read312RV_cfg.py
Exception Message:
 unknown python problem occurred.
ModuleNotFoundError: No module named 'past'

At:
  /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_13_3_X_2023-11-02-1100/src/FWCore/ParameterSet/python/Types.py(6): <module>
  <frozen importlib._bootstrap>(228): _call_with_frames_removed
  <frozen importlib._bootstrap_external>(850): exec_module
  <frozen importlib._bootstrap>(695): _load_unlocked
  <frozen importlib._bootstrap>(986): _find_and_load_unlocked
  <frozen importlib._bootstrap>(1007): _find_and_load
  /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_13_3_X_2023-11-02-1100/src/FWCore/ParameterSet/python/Config.py(15): <module>
  <frozen importlib._bootstrap>(228): _call_with_frames_removed
  <frozen importlib._bootstrap_external>(850): exec_module
  <frozen importlib._bootstrap>(695): _load_unlocked
  <frozen importlib._bootstrap>(986): _find_and_load_unlocked
  <frozen importlib._bootstrap>(1007): _find_and_load
  /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_13_3_X_2023-11-02-1100/src/Utilities/ReleaseScripts/scripts/read312RV_cfg.py(2): <module>

----- End Fatal Exception -------------------------------------------------
[fastsim:1] cmsDriver.py TTbar_8TeV_TuneCUETP8M1_cfi  --conditions auto:run1_mc --fast  -n 100 --eventcontent AODSIM,DQM --relval 100000,1000 -s GEN,SIM,RECOBEFMIX,DIGI:pdigi_valid,L1,DIGI2RAW,L1Reco,RECO,VALIDATION  --customise=HLTrigger/Configuration/CustomConfigs.L1THLT --datatier GEN-SIM-DIGI-RECO,DQMIO --beamspot Realistic8TeVCollision : FAILED - elapsed time: 0 sec (ended on Fri Nov  3 00:28:49 2023) - exit: 256
[fastsim1:1] cmsDriver.py TTbar_13TeV_TuneCUETP8M1_cfi --conditions auto:run2_mc_l1stage1 --fast  -n 100 --eventcontent AODSIM,DQM --relval 100000,1000 -s GEN,SIM,RECOBEFMIX,DIGI:pdigi_valid,L1,DIGI2RAW,L1Reco,RECO,VALIDATION  --customise=HLTrigger/Configuration/CustomConfigs.L1THLT --datatier GEN-SIM-DIGI-RECO,DQMIO --beamspot NominalCollision2015 --era Run2_25ns : FAILED - elapsed time: 0 sec (ended on Fri Nov  3 00:28:53 2023) - exit: 256

wddgit commented Nov 3, 2023

please test with cms-data/IOPool-Input#2

Try again. Maybe the failures are a glitch. It is hard to see how this could cause anything to fail outside the single unit test it adds. There is a modified unit test, a new script and a new service. The script and service are not currently used by anything other than the unit test.

cmsbuild commented Nov 3, 2023

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-973223/35584/summary.html
COMMIT: d100af2
CMSSW: CMSSW_13_3_X_2023-11-03-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/43175/35584/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 127 lines to the logs
  • Reco comparison results: 17 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3362691
  • DQMHistoTests: Total failures: 1395
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3361274
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 214 log files, 167 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

makortel commented Nov 4, 2023

Comparison differences are related to #39803

@@ -102,4 +102,6 @@ inputfile=$(edmFileInPath IOPool/Input/data/$file) || die "Failure edmFileInPath
#root.exe -b -l -q file:$inputfile "${LOCAL_TEST_DIR}/testForStreamerInfo.C(gFile)" | sort -u | grep Missing > testForStreamerInfo2.log
#grep "Missing" testForStreamerInfo2.log && die "Missing nested streamer info" 1

cmsRun ${LOCAL_TEST_DIR}/SchemaEvolution_test_read_cfg.py --inputFile $inputfile --enableStreamerInfosFix || die "Failed to read old file $file with fix" $?
Contributor

Maybe the comment above could now be rephrased, e.g. along the lines of "the test would fail without the --enableStreamerInfosFix"?

Contributor Author

I edited that comment using your suggestion and also generally shortened the comment. It's better now I think. Thanks.

#include <iostream>

void makeFileContainingStreamerInfos() {
std::cout << "Executing makeFixitFile()" << std::endl;
Contributor

The name in the printout is different from the function and file name.

Contributor Author

I fixed that so the printout and names match. Thanks.

Contributor

I have mixed feelings about this file being in the scripts directory. Those files end up in $PATH, but this macro needs to be explicitly run via root .... But I wouldn't place it e.g. in tests either, so maybe this is the "least bad" option.

Contributor Author

I also have mixed feelings about that. I'll move it to tests if you want. It's not really a test though. It does not seem like it will be used often enough to deserve being in the PATH. I can't think of any other options though. Is there any other place I could put it? Maybe a subdirectory of scripts? Would that help? Or I could make up a new directory and call it rootScripts... If you want me to move it, make a suggestion and I will move it.

Contributor

@Dr15Jones @smuzaffar Any thoughts?

Contributor

Maybe given @smuzaffar's reply in #43174 (comment) this placement is kind of ok as long as the file does not have execution permissions (which I think is already the case).

Contributor Author

I didn't give it executable permissions in my local working area, so we should be good. Let's just leave it there.

@wddgit wddgit force-pushed the fixMissingStreamerInfos branch from d100af2 to dcf316f Compare November 6, 2023 16:38
makortel commented Nov 7, 2023

@cmsbuild, please test

Let's see if we could get clean comparisons. The PR itself looks good.

cmsbuild commented Nov 7, 2023

-1

Failed Tests: UnitTests
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-973223/35652/summary.html
COMMIT: dcf316f
CMSSW: CMSSW_14_0_X_2023-11-06-2300/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/43175/35652/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found 1 errors in the following unit tests:

---> test TestIOPoolInputSchemaEvolution had ERRORS

Comparison Summary

Summary:

  • You potentially added 419 lines to the logs
  • Reco comparison results: 28 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3363010
  • DQMHistoTests: Total failures: 1406
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3361582
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 214 log files, 167 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

makortel commented Nov 7, 2023

Comparison failures are related to #39803

makortel commented Nov 7, 2023

@cmsbuild, please test with cms-data/IOPool-Input#2

Forgot the external...

cmsbuild commented Nov 7, 2023

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-973223/35666/summary.html
COMMIT: dcf316f
CMSSW: CMSSW_14_0_X_2023-11-07-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/43175/35666/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 209 lines to the logs
  • Reco comparison results: 22 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3363010
  • DQMHistoTests: Total failures: 1406
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3361582
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 214 log files, 167 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

makortel commented Nov 7, 2023

Comparison failures are related to #39803

makortel commented Nov 7, 2023

+core

cmsbuild commented Nov 7, 2023

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @antoniovilela, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

makortel commented Nov 9, 2023

@cms-sw/orp-l2 Could this PR be merged (it also needs cms-data/IOPool-Input#2)? Thanks!

@antoniovilela

+1

@cmsbuild cmsbuild merged commit e251c0a into cms-sw:master Nov 9, 2023
@wddgit wddgit deleted the fixMissingStreamerInfos branch March 11, 2024 14:52
5 participants