[SiPixelAli PCL] Update the pede options to avoid issues with too many binaries open #28306
…nable the option closeandreopen (from the manual: "Set flag keepOpen to zero to enable closing and reopening of binary files to limit the number of concurrently open files"). The modification dates of the files are monitored to ensure data integrity.
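The close-and-reopen idea quoted above can be illustrated with a minimal Python sketch. This is not pede's implementation: the class `ReopenableFile` and the function `read_all` are hypothetical names introduced here, and the sketch only assumes that each reader remembers a path, a read offset, and the modification time seen when it was created, so that no file handle stays open between reads.

```python
import os


class ReopenableFile:
    """Hypothetical helper: stores a path and a read offset, not an open handle."""

    def __init__(self, path):
        self.path = path
        self.offset = 0
        # Remember the modification time seen when the file was registered.
        self.mtime = os.stat(path).st_mtime_ns

    def read_chunk(self, size):
        # Refuse to continue if the file changed since it was registered.
        if os.stat(self.path).st_mtime_ns != self.mtime:
            raise RuntimeError(f"{self.path} was modified while being read")
        # Open, seek to where the previous read stopped, read, and close again.
        with open(self.path, "rb") as handle:
            handle.seek(self.offset)
            data = handle.read(size)
        self.offset += len(data)
        return data


def read_all(paths, chunk_size=4):
    """Read many files in round-robin order with at most one handle open at a time."""
    contents = {path: b"" for path in paths}
    pending = [ReopenableFile(path) for path in paths]
    while pending:
        still_pending = []
        for reader in pending:
            chunk = reader.read_chunk(chunk_size)
            if chunk:
                contents[reader.path] += chunk
                still_pending.append(reader)
        pending = still_pending
    return contents
```

The trade-off is extra open/seek/close calls per read in exchange for a bounded number of concurrently open files, which is the limit long runs with many binaries were hitting.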
The code-checks are being triggered in jenkins.
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-28306/12518
A new Pull Request was created by @mmusich (Marco Musich) for master.
It involves the following packages: Alignment/CommonAlignmentProducer
@christopheralanwest, @tocheng, @cmsbuild, @franzoni, @tlampen, @pohsun can you please review it and eventually sign? Thanks.
cms-bot commands are listed here
please test
The tests are being triggered in jenkins.
please test with cms-sw/cmsdist#5309
The tests are being triggered in jenkins.
Comparison job queued.
Comparison is ready
Comparison Summary:
+1
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)
+1
PR description:
During Run-2 pp operations it was noticed that the PCL alignment workflow occasionally stalled for very long runs (typically exceeding 1000 LS); examples are run 317182 and run 317320.
In those relatively rare cases no measurement was available despite the large number of tracks available.
Inspection of the log files revealed that the issue lay in the inability of the pede routine to open the input binary files and access the data within them:
The issue has since been traced to too many pede binary files being open at the same time.
This issue has been solved in recent pede releases (cf. http://www.desy.de/~kleinwrt/MP2/doc/html/index.html) by providing the option closeandreopen (from the pede manual: sets flag keepOpen to zero to enable closing and reopening of binary files to limit the number of concurrently open files).
This option is available starting from the 19.04.12 revision, either in the privately distributed pede executable at /afs/cern.ch/user/c/ckleinw/bin/rev183/pede or in the official MillePede release V04-06-00 (which has been requested here: cms-sw/cmsdist#5309).
This PR updates the configuration of SiPixelAliPedeAlignmentProducer in order to use the closeandreopen option and is a companion of cms-sw/cmsdist#5309 (N.B. the two should be tested together).
PR validation:
The previously failing PCL workflow for run 317320, re-generated via:
with the modifications proposed in this PR plus the following line added:
in the configuration (equivalent to having V04-06-00 as the default executable), runs to completion (it takes about 3h).
if this PR is a backport please specify the original PR:
This PR is not a backport.
cc:
@adewit @connorpa @dmeuser
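For readers unfamiliar with how pede options are passed, the configuration change described in the PR description amounts to adding one more entry to the list of steering options. The sketch below uses a plain Python list as a stand-in; in the actual CMSSW configuration the options are assumed to live in a cms.vstring of the producer's pede steering settings. The helper name `with_pede_option` and the pre-existing option values are hypothetical placeholders, not the values used by this PR.

```python
def with_pede_option(options, new_option):
    """Return a copy of `options` with `new_option` appended unless its keyword is already set."""
    keyword = new_option.split()[0]
    for option in options:
        parts = option.split()
        if parts and parts[0] == keyword:
            # The keyword is already present: leave the options untouched.
            return list(options)
    return list(options) + [new_option]


# Placeholder options for illustration only (not taken from the PR).
existing_options = ["entries 500", "threads 1 1"]
updated_options = with_pede_option(existing_options, "closeandreopen")
print(updated_options)  # ['entries 500', 'threads 1 1', 'closeandreopen']
```

Note that the option only has an effect with a pede executable that supports it, which is why this PR has to be tested together with cms-sw/cmsdist#5309.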