[XrdAdaptor] Delay additional source acquisition when redirect limit error is returned. #35498
A new Pull Request was created by @osschar (Matevž Tadel) for CMSSW_12_1_DEVEL_X. It involves the following packages:
@makortel, @smuzaffar, @cmsbuild, @Dr15Jones can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
code-checks |
-code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-35498/25687 ERROR: Build errors found during clang-tidy run.
|
Let's wait for a newer IB with #35435 |
(There is an issue with CMSSWAgendaMaker caused by the fact that this PR appears in the 12_1_0 milestone there: https://api.github.com/repos/cms-sw/cmssw/pulls?state=open&milestone=86&per_page=100&page=1 (No. 11 there). That is why I did some tests above. Please ignore. @smuzaffar, do you by chance know why this happens? Adding also @silviodonato @davidlange6 @perrotta) |
please test |
Maybe CMSSWAgendaMaker does not work properly when a PR has no milestone assigned? Note that this is for a DEVEL branch and there is no milestone for DEVEL. Should I update the bot to assign the default CMSSW_N_M_X milestone for CMSSW_N_M_*_X branches too? |
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-e872b6/19333/summary.html Comparison Summary:
|
Code-checks |
code-checks |
-code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-35498/25721
Code check has found code style and quality issues which could be resolved by applying following patch(s)
|
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-35498/25723
|
+core |
This pull request is fully signed and it will be integrated in one of the next CMSSW_12_1_DEVEL_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_12_1_X is complete. This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2) |
+1
|
When a file open returns the error status XrdCl::errRedirectLimit, the limit of retries on a redirector has been reached and any further request to the same redirector will result in the same error. This PR introduces a progressive scaling variable that increases the delay until the next attempt at additional source acquisition.
In a redirector hierarchy, the reject decision is made on each site redirector, based on its configuration. This is why the scaling is progressive: a job might later still succeed in opening another replica of the file from another site.
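As a rough illustration of the idea (not the code in this PR), the delay scaling could be sketched as below. The OpenStatus enum, the SourceAcquisitionBackoff class, the linear growth and the cap are all hypothetical; the actual change works with XrdCl status codes inside XrdAdaptor and may scale differently.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical stand-in for the open status; the real code would check
// the XrdCl status code against XrdCl::errRedirectLimit.
enum class OpenStatus { Ok, RedirectLimit, OtherError };

// Sketch of a progressive delay for additional source acquisition.
// Assumption: the delay grows linearly with a scale factor that is
// bumped on every redirect-limit error and capped at maxScale.
class SourceAcquisitionBackoff {
public:
  explicit SourceAcquisitionBackoff(std::chrono::seconds baseDelay, unsigned maxScale = 16)
      : baseDelay_(baseDelay), maxScale_(maxScale) {
    assert(maxScale >= 1);
  }

  // Record the outcome of an open attempt for an additional source.
  // Only a redirect-limit error increases the scale; other outcomes
  // leave it unchanged (an assumption of this sketch).
  void record(OpenStatus status) {
    if (status == OpenStatus::RedirectLimit && scale_ < maxScale_) {
      ++scale_;
    }
  }

  // Delay to wait before the next additional source acquisition attempt.
  std::chrono::seconds nextDelay() const {
    return baseDelay_ * static_cast<std::chrono::seconds::rep>(scale_);
  }

  unsigned scale() const { return scale_; }

private:
  std::chrono::seconds baseDelay_;
  unsigned maxScale_;
  unsigned scale_ = 1;
};
```

With a base delay of 120 seconds, for example, each redirect-limit error in this sketch pushes the next attempt out by a further 120 seconds until the cap is reached, so a redirector that keeps rejecting is contacted less and less often while a later attempt can still find a replica at another site.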
Together with the triedrc=resel change in the previous PR, this allows tuning of how many
a) reopen requests due to errors, and
b) reopen requests to get an additional source
are allowed at each point in a redirector hierarchy, and avoids constant pinging of redirectors for the same file.
This is particularly important for XCache (where additional open requests can introduce unwanted file replicas in the cache) and might also be relevant for EOS and other installations where data is served from a single set of disks by several servers, so that additional open requests only burden the storage system.