-
Notifications
You must be signed in to change notification settings - Fork 893
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(download): work around hyper
hang issue by adjusting reqwest
config
#3855
Conversation
@@ -345,6 +345,10 @@ pub mod reqwest_be { | |||
|
|||
fn client_generic() -> ClientBuilder { | |||
Client::builder() | |||
// HACK: set `pool_max_idle_per_host` to `0` to avoid an issue in the underlying |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why this is necessary/why there is no better solution for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@djc The underlying issue is hyperium/hyper#2312, see hyperium/hyper#2312 (comment) for the first appearance of this hack. It is suspected that there is a hyper pool deadlock underneath, and this workaround has been used by many to successfully mitigate the issue (see all the external references below).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a good reproduction scenario?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@djc Unfortunately not yet, and that's the tricky part of this thing I guess... If it really is a concurrency bug, then it'll be rather hard to track and to fix (that's also why that issue has been open for such a long time I guess, a previous project that I collaborated on has also hit it back in 2020, and the same hack was applied 🤔).
As far as our CI is concerned, we can see several failures in the past 5 days caused by this one, and they are all "solved" by relaunching the test task. The particular test(s) that failed varied from time to time, but I'm sure that all those tests are reqwest
-related.
Anyway, this PR intentionally won't close #3852, and I'll keep an eye on hyperium/hyper#2312 as well as our CI in the upcoming weeks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@djc The workaround seems to be working: our CI hasn't been failing for this reason in the past 5 days.
This PR tries to mitigate #3852 by applying Azure/azure-sdk-for-rust#1550 's hack to all the
ClientBuilder
's I can find across the repo.The verification of the mitigation is not easy, since
async
execution is not so deterministic. Nevertheless, as this seems to be a widely encountered issue withhyper
andreqwest
, I suggest we merge this first and see how it goes.