faster edmConfigDump with --prune, also prunes unreferenced PSets #43276

Open
wants to merge 1 commit into
base: master

Conversation

alkis-pap

PR description:

edmConfigDump with --prune calls the prune method of the process object which is extremely slow and can even fail for some configurations because it uses delattr to remove unused attributes. Instead it should simply avoid printing the unused attributes to the final output.

Furthermore, it doesn't remove any PSets even though the data contained in most of them is already copied to the corresponding modules, adding thousands of lines of redundant data to the pruned config. Very few PSets are actually referenced by other PSets via refToPSet_ and should be kept.
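As a rough illustration of the reference search described above, here is a minimal sketch using toy classes. `_Node`, `PSet`, `Module` and `referenced_psets` are hypothetical stand-ins written for this example, not the real `cms` types or the code in this PR; in particular, `refToPSet_` is modelled as a plain string.

```python
class _Node:
    """Toy parameter container; a stand-in, not the real cms classes."""

    def __init__(self, **params):
        # Parameters are stored as public entries of __dict__.
        self.__dict__.update(params)


class PSet(_Node):
    pass


class Module(_Node):
    pass


def referenced_psets(obj, found=None, seen=None):
    """Recursively collect the value of every public `refToPSet_` attribute."""
    found = set() if found is None else found
    seen = set() if seen is None else seen
    if id(obj) in seen:  # guard against visiting the same object twice
        return found
    seen.add(id(obj))
    for name, value in vars(obj).items():
        if name.startswith('_'):
            continue  # private attributes are ignored
        if name == 'refToPSet_':
            found.add(value)
        elif isinstance(value, _Node):
            referenced_psets(value, found, seen)
    return found


# Toy "process": one PSet is referenced by a module, the other is not.
process = _Node(
    shared=PSet(threshold=1.5),
    leftover=PSet(threshold=2.0),
    producer=Module(settings=PSet(refToPSet_='shared')),
)
referenced = referenced_psets(process)
unreferenced = [name for name, value in vars(process).items()
                if isinstance(value, PSet) and name not in referenced]
```

In this toy setup only `leftover` ends up in `unreferenced`, because `shared` is still reachable through a `refToPSet_` entry.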

The main changes in this PR:

  • The prune method of Process was renamed to _unusedAttributes. Instead of calling delattr, it collects the unused attribute names into a list (to preserve order) and returns it.
  • _unusedAttributes also finds the names of all referenced PSets. Starting from the process object, it recursively walks the public attributes of each object and its children (entries of __dict__ whose names do not start with '_') and collects every refToPSet_ attribute. All PSets that are not referenced are included in the returned list.
  • A new prune method was added because it's used in some tests (in the same file). It simply calls delattr for every attribute returned by _unusedAttributes.
  • dumpPython has a new boolean argument prune. If set to true, _unusedAttributes is called and the resulting attributes are not included in the output.
  • edmConfigDump no longer calls prune; instead it sets the prune argument accordingly when calling dumpPython.
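The split between collecting, deleting and skipping described in the list above can be sketched with a toy class. `ToyProcess`, its `add` helper and the `_used` set are assumptions made for this illustration; the real `Process` determines which attributes are used from its paths and schedule, which is not modelled here.

```python
class ToyProcess:
    """Toy model of the refactoring; not the real cms.Process."""

    def __init__(self):
        # Names considered "used"; given explicitly in this toy model.
        self._used = set()

    def add(self, name, value, used=True):
        setattr(self, name, value)
        if used:
            self._used.add(name)

    def _unusedAttributes(self):
        # Return a list so that the definition order is preserved.
        return [name for name in vars(self)
                if not name.startswith('_') and name not in self._used]

    def prune(self):
        # Destructive variant: actually delete the unused attributes.
        for name in self._unusedAttributes():
            delattr(self, name)

    def dumpPython(self, prune=False):
        # Non-destructive variant: skip unused attributes while printing.
        skip = set(self._unusedAttributes()) if prune else set()
        return '\n'.join(f'process.{name} = {value!r}'
                         for name, value in vars(self).items()
                         if not name.startswith('_') and name not in skip)


process = ToyProcess()
process.add('a', 1)
process.add('b', 2, used=False)
process.add('c', 3)
pruned_dump = process.dumpPython(prune=True)  # 'b' is skipped, not deleted
```

The point of the design is that `dumpPython(prune=True)` leaves the object untouched, so `process.b` still exists afterwards, whereas `prune()` mutates the process.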

PR validation:

Passes the tests in FWCore/ParameterSet

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-43276/37684

@cmsbuild
Contributor

cmsbuild commented Nov 14, 2023

A new Pull Request was created by @alkis-pap (Alkiviadis Papadopoulos) for master.

It involves the following packages:

  • FWCore/ParameterSet (core)

@cmsbuild, @smuzaffar, @makortel, @Dr15Jones can you please review it and eventually sign? Thanks.
@missirol, @wddgit, @makortel this is something you requested to watch as well.
@sextonkennedy, @rappoccio, @antoniovilela you are the release manager for this.

cms-bot commands are listed here

@makortel
Contributor

Thank you for the PR. Unfortunately the situation towards PSets is a bit complicated, and removing them from the top-level PSet does not generally work. For example, there is some code that goes to look for specific PSet(s) from the top-level PSet. In addition, the workflow management adds (untracked) PSet(s) to the process to deliver some metadata around, and prune() should not remove those.

The concept of dumpPython() skipping unnecessary components instead of calling prune() would be acceptable, as long as all the PSets are kept.

I'd prefer to use a different name than prune for code that does not necessarily prune, e.g. skipUnused.

> edmConfigDump with --prune calls the prune method of the process object which is extremely slow and can even fail for some configurations because it uses delattr to remove unused attributes.

Could you give more details on a case where the prune() fails?

@alkis-pap
Author

> Unfortunately the situation towards PSets is a bit complicated, and removing them from the top-level PSet does not generally work. For example, there is some code that goes to look for specific PSet(s) from the top-level PSet. In addition, the workflow management adds (untracked) PSet(s) to the process to deliver some metadata around, and prune() should not remove those.

Then I would propose to add a command line argument to edmConfigDump to optionally also remove the PSets because for some configs the unused PSets can take up more than 90k lines while the rest of the file is around 35k lines.

> Could you give more details on a case where the prune() fails?

For example, it randomly crashes when using the release CMSSW_13_3_0_pre5 and the following commands:

cmsDriver.py step3 -s RAW2DIGI:RawToDigiTask,RECO:reconstruction_pixelTrackingOnly,VALIDATION:@pixelTrackingOnlyValidation,DQM:@pixelTrackingOnlyDQM --conditions auto:phase1_2022_realistic --datatier GEN-SIM-RECO,DQMIO -n 10 --eventcontent RECOSIM,DQM --geometry DB:Extended --era Run3 --procModifiers pixelNtupletFit --filein file:step2.root --no_exec
edmConfigDump --prune step3_RAW2DIGI_RECO_VALIDATION_DQM.py > pruned.py

I did some debugging and found that it crashes if the first element of a cms.ignore happens to be removed before all the other elements have been removed. The leave method of _MutatingSequenceVisitor uses contents[0][0] without checking whether it is None unless allNull is true, i.e. unless the first element happens to be the last one to be removed. The order of removal is random because a set is used in _pruneModules.
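The order dependence described above can be reproduced with a toy model. This is a sketch under assumptions, not the CMSSW code: `leave` and `remove_in_order` are written for this example, and the layout of `contents` as `[node, changed]` pairs is assumed for illustration.

```python
from types import SimpleNamespace


def leave(contents):
    """Toy version of the described check on a list of [node, changed] pairs."""
    all_null = all(node is None for node, _changed in contents)
    if all_null:
        return None
    # No None check on the first entry, as in the reported failure.
    return contents[0][0].label


def remove_in_order(order):
    """Remove the elements 'a' and 'b' in the given order, calling leave() each time."""
    contents = [[SimpleNamespace(label=label), False] for label in ('a', 'b')]
    index = {'a': 0, 'b': 1}
    for label in order:
        contents[index[label]] = [None, True]  # mark the element as removed
        leave(contents)


remove_in_order(['b', 'a'])      # first element removed last: no error

try:
    remove_in_order(['a', 'b'])  # first element removed first
    crashed = False
except AttributeError:
    crashed = True               # None has no attribute 'label'
```

Because the iteration order of a Python set of strings can change between interpreter runs, either order may occur, which would match the intermittent crash. Iterating over a sorted list instead would at least make the order reproducible.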

@makortel
Contributor

> > Unfortunately the situation towards PSets is a bit complicated, and removing them from the top-level PSet does not generally work. For example, there is some code that goes to look for specific PSet(s) from the top-level PSet. In addition, the workflow management adds (untracked) PSet(s) to the process to deliver some metadata around, and prune() should not remove those.
>
> Then I would propose to add a command line argument to edmConfigDump to optionally also remove the PSets because for some configs the unused PSets can take up more than 90k lines while the rest of the file is around 35k lines.

If the command line option were named along the lines of edmConfigDump --skipTopLevelPSets, and the help message stated that the resulting printout is not necessarily runnable as such, that would be acceptable. Would you keep or skip PSets like process.maxEvents and process.options?
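One way such an option could be declared is sketched below with `argparse`. This is an illustrative assumption, not the actual `edmConfigDump` source: `build_parser` and the help texts are made up for this example, and only the flag names `--prune` and `--skipTopLevelPSets` come from this thread.

```python
import argparse


def build_parser():
    """Hypothetical parser; not the real edmConfigDump command line."""
    parser = argparse.ArgumentParser(prog='edmConfigDump')
    parser.add_argument('filename', help='configuration file to dump')
    parser.add_argument('--prune', action='store_true',
                        help='skip unused components in the dump')
    parser.add_argument('--skipTopLevelPSets', action='store_true',
                        help='also skip top-level PSets; the resulting '
                             'printout is not necessarily runnable as such')
    return parser


args = build_parser().parse_args(['--prune', '--skipTopLevelPSets', 'cfg.py'])
defaults = build_parser().parse_args(['cfg.py'])
```

With `action='store_true'` both flags default to `False`, so existing invocations would keep their current behaviour unless the new option is given explicitly.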

Then the prune() unit tests in Config.py should be extended to demonstrate that both tracked and untracked top-level PSets are kept with prune().

> > Could you give more details on a case where the prune() fails?
>
> For example, it randomly crashes when using the release CMSSW_13_3_0_pre5 and the following commands:
>
> cmsDriver.py step3 -s RAW2DIGI:RawToDigiTask,RECO:reconstruction_pixelTrackingOnly,VALIDATION:@pixelTrackingOnlyValidation,DQM:@pixelTrackingOnlyDQM --conditions auto:phase1_2022_realistic --datatier GEN-SIM-RECO,DQMIO -n 10 --eventcontent RECOSIM,DQM --geometry DB:Extended --era Run3 --procModifiers pixelNtupletFit --filein file:step2.root --no_exec
> edmConfigDump --prune step3_RAW2DIGI_RECO_VALIDATION_DQM.py > pruned.py
>
> I did some debugging and discovered that it crashes if the first element of a cms.ignore happens to be removed before all other elements have been removed because the leave method of _MutatingSequenceVisitor uses contents[0][0] without checking if it's None unless allNull is true, ie. the first element happens to be the last one to be removed. The order of removal is random because a set is used in _pruneModules.

Thanks! Would it be possible for you to add a unit test in Config.py that demonstrates this error? (Just leave it commented out and we would address it later.)

@cmsbuild
Contributor

cmsbuild commented Feb 6, 2024

Milestone for this pull request has been moved to CMSSW_14_1_X. Please open a backport if it should also go in to CMSSW_14_0_X.

@cmsbuild
Contributor

Milestone for this pull request has been moved to CMSSW_14_2_X. Please open a backport if it should also go in to CMSSW_14_1_X.

@cmsbuild cmsbuild modified the milestones: CMSSW_14_1_X, CMSSW_14_2_X Aug 27, 2024
@antoniovilela
Contributor

ping (to make bot change milestone)

@cmsbuild cmsbuild modified the milestones: CMSSW_14_1_X, CMSSW_14_2_X Sep 3, 2024
@cmsbuild
Contributor

Milestone for this pull request has been moved to CMSSW_15_0_X. Please open a backport if it should also go in to CMSSW_14_2_X.


4 participants