[11_0_X] Protect storage accounting UDP messages from NaN, and Use StatisticsSenderService for all framework files #36358
NaNs were being reported in the values computed using sqrt. This was most likely caused by the different variables not being updated atomically together.
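The NaN protection described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the helper name `safeSigma` and the assumption that the spread is derived from a running sum, a running sum of squares, and a sample count are hypothetical. If those three values are read from an inconsistent snapshot, the computed variance can become negative, and `std::sqrt` of a negative number yields NaN.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not taken from CMSSW): computes a standard deviation
// from running totals and never returns NaN or infinity.
double safeSigma(double sum, double sumSquares, std::int64_t n) {
  if (n <= 0) {
    return 0.0;  // nothing recorded yet, avoid dividing by zero
  }
  const double count = static_cast<double>(n);
  const double mean = sum / count;
  const double variance = sumSquares / count - mean * mean;
  // Inputs read at slightly different times can make the variance negative;
  // clamp it so that sqrt stays in its valid domain.
  const double sigma = std::sqrt(std::max(variance, 0.0));
  // Catch anything that is still non-finite (for example NaN inputs).
  return std::isfinite(sigma) ? sigma : 0.0;
}
```

Clamping hides the inconsistency rather than removing it, so a sketch like this only protects the outgoing message; it does not make the underlying updates atomic.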
Previously, each attempt to open the file using a different PFN would report an open attempt for the same LFN. This meant we could have multiple opens but only one close for a given LFN.
When sending information to the StatisticsSenderService, the file LFN or URL must be supplied.
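The open/close imbalance described above can be illustrated with a small sketch. The names `Accounting`, `openFirstWorking`, and `closeFile`, as well as the `tryOpen` callback, are hypothetical stand-ins and not the CMSSW interfaces; the point is only that the open is recorded once per LFN, after one of the PFNs has actually worked, so that it pairs with exactly one close.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical bookkeeping keyed by LFN (not the real service).
struct Accounting {
  std::map<std::string, int> opens;
  std::map<std::string, int> closes;
};

// Tries each PFN in order and records a single open for the LFN only when
// one of them succeeds. Returns the PFN that worked, or an empty string.
std::string openFirstWorking(const std::string& lfn,
                             const std::vector<std::string>& pfns,
                             const std::function<bool(const std::string&)>& tryOpen,
                             Accounting& acc) {
  for (const std::string& pfn : pfns) {
    if (tryOpen(pfn)) {
      ++acc.opens[lfn];  // one open per LFN, regardless of failed attempts
      return pfn;
    }
  }
  return std::string();  // no PFN worked, nothing is recorded
}

// Records the matching close for the LFN.
void closeFile(const std::string& lfn, Accounting& acc) {
  ++acc.closes[lfn];
}
```

With this shape, two failed PFN attempts followed by a successful one leave the LFN with one open and, after `closeFile`, one close.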
Send statistics for primary, secondary, and embedded files. The aggregate file statistics are only reset on primary file close boundaries to keep the behavior the same as before. Changed all calls to closeFile_() to use the new closeFile().
Now broadcasts how the file is used.
A new Pull Request was created by @makortel (Matti Kortelainen) for CMSSW_11_0_X. It involves the following packages:
@cmsbuild, @smuzaffar, @Dr15Jones, @makortel can you please review it and eventually sign? Thanks. cms-bot commands are listed here
@cmsbuild, please test
backport
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-0c72ac/20977/summary.html
This pull request is fully signed and it will be integrated in one of the next CMSSW_11_0_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_12_2_X is complete. This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)
IOPool/Input/src/RootInputFileSequence.cc also includes the updates originally in #28911 (“New developments using multiple data catalogs provided in site-local-config.xml”), which were merged in CMSSW_11_1_0_patch1. @makortel, please confirm that importing the updates from only one file in #28911 (instead of the whole PR) doesn't cause issues elsewhere.
@perrotta Thanks for the detailed check, but could you clarify? My intention was to not include any changes from #28911 (and if I wasn't careful enough and something sneaked in, I want to understand it well). The changes in
All these are from #35505. Note that the first of these changes caused indentation changes, and the diff becomes easier to read with "Hide whitespace".
Hi Matti.
Thanks @perrotta, I see your point now. Indeed the equivalent of
Of those, cmssw/FWCore/Services/plugins/Tracer.cc Lines 531 to 537 in a1d2ae9
and StatisticsSenderService propagates it in the UDP packet: cmssw/Utilities/StorageFactory/src/StatisticsSenderService.cc Lines 236 to 238 in a1d2ae9
(although this PR changes the information delivery mechanism from the ActivityRegistry callbacks to a direct call of StatisticsSenderService::closedFile() from RootInputFileSequence::closeFile(), the source of the information is the same).
I think this points to a bug in #28911: the information on whether a fallback file is used no longer propagates to the UDP packets.
}
if (!filePtr && (hasFallbackUrl)) {
  try {
    usedFallback_ = true;
Actually, since usedFallback_ = true is set here, 11_0_X (and earlier) do not need the backport of #36379. The FileOpenSentry will still always signal that none of the files are fallbacks, but that information is not being used anywhere (except in the Tracer Service, but those being "wrong" is not a big deal). The StatisticsSenderService anyway gets the value of this boolean via direct call (instead of the ActivityRegistry callbacks).
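The direct-call delivery discussed in this comment can be sketched as below. This is a simplified illustration under stated assumptions, not the CMSSW implementation: `StatsRecorder` is a hypothetical stand-in for the statistics service, `FileSequence` is a hypothetical stand-in for the input file sequence, and the `closedFile(lfn, usedFallback)` signature and the `tryOpen` callback are assumed for the example.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for the statistics service: it just remembers
// what it was told on the last close.
struct StatsRecorder {
  std::string lastLfn;
  bool lastUsedFallback = false;
  void closedFile(const std::string& lfn, bool usedFallback) {
    lastLfn = lfn;
    lastUsedFallback = usedFallback;
  }
};

// Hypothetical stand-in for the file sequence. The fallback flag is set
// before the fallback attempt and handed to the recorder directly on close,
// so it does not depend on any callback-based signalling.
class FileSequence {
public:
  explicit FileSequence(StatsRecorder& stats) : stats_(stats) {}

  bool open(const std::string& lfn,
            const std::string& url,
            const std::string& fallbackUrl,
            const std::function<bool(const std::string&)>& tryOpen) {
    lfn_ = lfn;
    usedFallback_ = false;
    opened_ = tryOpen(url);
    if (!opened_ && !fallbackUrl.empty()) {
      usedFallback_ = true;  // mirrors the flag set in the snippet above
      opened_ = tryOpen(fallbackUrl);
    }
    return opened_;
  }

  void closeFile() {
    if (!opened_) {
      return;  // nothing was opened, so there is nothing to report
    }
    stats_.closedFile(lfn_, usedFallback_);
    opened_ = false;
  }

private:
  StatsRecorder& stats_;
  std::string lfn_;
  bool usedFallback_ = false;
  bool opened_ = false;
};
```

In this arrangement the recorder sees the correct fallback flag even if a separate signalling path always reports "not a fallback", which is the situation the comment describes for 11_0_X.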
Pull request #36358 was updated. @cmsbuild, @smuzaffar, @Dr15Jones, @makortel can you please check and sign again.
Now including #36403 too.
unhold
@cmsbuild, please test
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-0c72ac/21145/summary.html
+1
This pull request is fully signed and it will be integrated in one of the next CMSSW_11_0_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_12_3_X is complete. This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)
+1
PR description:
This PR is a combined backport of #35362 and #35505, following requests in #29412 and #36349. Includes also #36403 as further cleanup.
PR validation:
Unit tests pass.