Protect ProcessCallGraph for module IDs larger than the number of modules #29584

Merged: 1 commit, May 4, 2020

@makortel makortel commented Apr 29, 2020

PR description:

#29553 can make the number of modules at beginJob() (and thereafter) smaller than the largest module ID, which would lead to out-of-bounds errors in ProcessCallGraph::preBeginJob(), where the module IDs are used as indices into a Boost graph. As a minimal fix, this PR proposes to look for the maximum module ID explicitly.

This PR fixes some of the crashes seen in the #29553 tests; I'm submitting the fix in a separate PR to make the review a bit easier.
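For illustration, the idea of sizing an ID-indexed container by the largest module ID rather than by the module count can be sketched as follows. This is a minimal standalone sketch, not the code of this PR: `Module`, `slotsNeeded` and `slotFor` are hypothetical names, and a plain `std::vector` stands in for the Boost graph.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a module description; only the ID matters here.
struct Module {
  unsigned int id;
};

// Number of slots needed so that every module ID is a valid index:
// largest ID + 1, or 0 when there are no modules.
inline std::size_t slotsNeeded(std::vector<Module> const& modules) {
  if (modules.empty()) {
    return 0;
  }
  auto largest = std::max_element(
      modules.begin(), modules.end(),
      [](Module const& a, Module const& b) { return a.id < b.id; });
  return static_cast<std::size_t>(largest->id) + 1;
}

// Look up the per-module slot, checking the bound this PR is concerned with.
inline int& slotFor(std::vector<int>& table, Module const& m) {
  assert(m.id < table.size());
  return table[m.id];
}
```

With module IDs 0, 3 and 7 the module count is 3, but indexing by ID needs 8 slots; a table sized by the count would be overrun by the lookups for IDs 3 and 7.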

PR validation:

Limited matrix runs.

…f modules

It could be (soon) that the module ID is larger than the number of modules.
@cmsbuild

The code-checks are being triggered in jenkins.

@cmsbuild

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-29584/14904

  • This PR adds an extra 12KB to repository

@cmsbuild

A new Pull Request was created by @makortel (Matti Kortelainen) for master.

It involves the following packages:

HLTrigger/Timer

@cmsbuild, @Martin-Grunewald, @fwyzard can you please review it and eventually sign? Thanks.
@Martin-Grunewald this is something you requested to watch as well.
@silviodonato, @dpiparo you are the release manager for this.

cms-bot commands are listed here

@makortel

@cmsbuild, please test

@cmsbuild

cmsbuild commented Apr 29, 2020

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/5901/console Started: 2020/04/29 15:55

@cmsbuild

+1
Tested at: a0988b8
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-d87fb0/5901/summary.html
CMSSW: CMSSW_11_1_X_2020-04-29-1100
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild

Comparison job queued.

@fwyzard

fwyzard commented Apr 29, 2020

hi @makortel ,
a possibly easier alternative could be to add const access to the highest ID returned by ModuleDescription::getUniqueID()?

Something like adding the following to DataFormats/Provenance/src/ModuleDescription.cc:

unsigned int ModuleDescription::getOneAfterLargestValidID() { return s_id; }

(a better name is welcome)
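To make the suggestion concrete, here is a standalone sketch, assuming that getUniqueID() hands out IDs from a monotonically increasing counter named s_id, as the one-liner above implies. `ModuleIdSource` and `checkedIndex` are hypothetical; this is not the actual ModuleDescription class.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical miniature of an ID source; not the actual ModuleDescription.
class ModuleIdSource {
public:
  // Hand out the next unused ID.
  static unsigned int getUniqueID() { return s_id++; }

  // Read-only view of the counter: one past the largest ID handed out so far.
  static unsigned int getOneAfterLargestValidID() { return s_id.load(); }

  // Check that an ID can index a container sized by the bound above.
  static unsigned int checkedIndex(unsigned int id) {
    assert(id < getOneAfterLargestValidID());
    return id;
  }

private:
  static std::atomic<unsigned int> s_id;
};

std::atomic<unsigned int> ModuleIdSource::s_id{0};
```

A container sized with getOneAfterLargestValidID() can be indexed by any ID issued so far, at the cost of exposing the counter to callers.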

@cmsbuild

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-d87fb0/5901/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 34
  • DQMHistoTests: Total histograms compared: 2696435
  • DQMHistoTests: Total failures: 2
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2696114
  • DQMHistoTests: Total skipped: 319
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 33 files compared)
  • Checked 147 log files, 16 edm output root files, 34 DQM output files

@makortel

@fwyzard Before going there, let me ask for clarification on how preBeginJob() behaves (or should behave) with respect to SubProcesses. I see that it uses a "root graph" for the main Process, and then creates a "sub graph" for each SubProcess. It then asks for the number of modules of that (Sub)Process with pathsAndConsumes.allModules().size() and adds that many vertices to the root/sub graph. Later, it uses the module ID as an index into graph_.m_graph[].

Does graph_.m_graph contain the vertices of the root graph and all the subgraphs? (I'd guess so, assuming the indexing by module ID also works for SubProcesses.)

If yes, this PR already overestimates the number of vertices for subgraphs, because the largest module ID among the modules of a SubProcess is certainly larger than the number of modules in that SubProcess. The overestimation is nevertheless smaller than it would be if the largest ID in the whole cmsRun were used for each (Sub)Process. How big a burden would the extra vertices be (beyond consuming more memory)?
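The size of the overestimation can be illustrated with a small sketch. It assumes, purely for illustration, that module IDs are assigned sequentially across the Process and its SubProcesses; `localBound` and `extraVertices` are hypothetical helpers, and the sketch says nothing about what the extra vertices cost beyond memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Vertex count if a (Sub)Process graph is sized by its own largest module ID.
inline std::size_t localBound(std::vector<unsigned int> const& ids) {
  if (ids.empty()) {
    return 0;
  }
  return static_cast<std::size_t>(*std::max_element(ids.begin(), ids.end())) + 1;
}

// Extra vertices allocated when the graph is sized by `bound`
// instead of by the actual number of modules.
inline std::size_t extraVertices(std::vector<unsigned int> const& ids,
                                 std::size_t bound) {
  assert(bound >= ids.size());
  return bound - ids.size();
}
```

For a SubProcess holding IDs {4, 5} in a job whose largest ID is 8, the per-SubProcess bound is 6 (4 extra vertices), while a job-wide bound of 9 would leave 7 extra.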

@makortel

I was actually thinking of adding something along the lines of unsigned int PathsAndConsumesOfModules::largestModuleID(), but didn't bother for the first attempt with only one use. I have now noticed that FWCore/Services/plugins/DependencyGraph.cc has the same issue, so some common solution would probably be good.

I would prefer to keep the exact logic of module ID assignment (that is, the value of s_id) as an implementation detail to allow more flexibility in the future (although both ProcessCallGraph and DependencyGraph would likely need changes anyhow in such cases).

@Martin-Grunewald

+1

@cmsbuild

cmsbuild commented May 4, 2020

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo (and backports should be raised in the release meeting by the corresponding L2)

@silviodonato

+1
