fix: improve proxy handling #5893
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -213,7 +213,7 @@ def resolve_redirects(self, resp, req, stream=False, timeout=None, | |
|
||
# Rebuild auth and proxy information. | ||
proxies = self.rebuild_proxies(prepared_request, proxies) | ||
self.rebuild_auth(prepared_request, resp) | ||
self.rebuild_auth(prepared_request, resp, proxies) | ||
|
||
# A failed tell() sets `_body_position` to `object()`. This non-None | ||
# value ensures `rewindable` will be True, allowing us to raise an | ||
|
@@ -251,13 +251,25 @@ def resolve_redirects(self, resp, req, stream=False, timeout=None, | |
url = self.get_redirect_target(resp) | ||
yield resp | ||
|
||
def rebuild_auth(self, prepared_request, response): | ||
def rebuild_auth(self, prepared_request, response, proxies): | ||
"""When being redirected we may want to strip authentication from the | ||
request to avoid leaking credentials. This method intelligently removes | ||
and reapplies authentication where possible to avoid credential loss. | ||
""" | ||
headers = prepared_request.headers | ||
url = prepared_request.url | ||
scheme = urlparse(url).scheme | ||
|
||
if 'Proxy-Authorization' in headers: | ||
del headers['Proxy-Authorization'] | ||
|
||
try: | ||
username, password = get_auth_from_url(proxies[scheme]) | ||
except KeyError: | ||
username, password = None, None | ||
|
||
if username and password: | ||
headers['Proxy-Authorization'] = _basic_auth_str(username, password) | ||
|
||
if 'Authorization' in headers and self.should_strip_auth(response.request.url, url): | ||
# If we get redirected to a new host, we should strip out any | ||
|
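Note on the hunk above: the added lines drop any stale `Proxy-Authorization` header and rebuild it from the credentials embedded in the proxy URL for the request's scheme. A minimal standard-library sketch of that idea follows. The function name `proxy_auth_header` and the example URLs are hypothetical; the real code relies on `get_auth_from_url` and `_basic_auth_str`, whose exact behavior may differ from this approximation.

```python
import base64
from urllib.parse import unquote, urlparse


def proxy_auth_header(proxies, url):
    """Return a Basic Proxy-Authorization value for url's scheme, or None.

    Hypothetical helper for illustration only.
    """
    scheme = urlparse(url).scheme
    proxy_url = proxies.get(scheme)
    if proxy_url is None:
        # No proxy configured for this scheme, so nothing to authenticate.
        return None
    parsed = urlparse(proxy_url)
    if not (parsed.username and parsed.password):
        return None
    credentials = f"{unquote(parsed.username)}:{unquote(parsed.password)}"
    token = base64.b64encode(credentials.encode("latin1")).decode("ascii")
    return "Basic " + token


# Assumed example data: a stale header and a proxy URL carrying credentials.
headers = {"Proxy-Authorization": "Basic stale"}
proxies = {"http": "http://user:secret@proxy.example:3128"}

# Always drop the old header first, then re-add it only if credentials exist.
headers.pop("Proxy-Authorization", None)
value = proxy_auth_header(proxies, "http://example.org/path")
if value:
    headers["Proxy-Authorization"] = value
```

Removing the header before the lookup means a redirect to a scheme with no configured proxy ends up with no proxy credentials at all, rather than leftover ones.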
@@ -283,32 +295,21 @@ def rebuild_proxies(self, prepared_request, proxies): | |
:rtype: dict | ||
""" | ||
proxies = proxies if proxies is not None else {} | ||
headers = prepared_request.headers | ||
new_proxies = proxies.copy() | ||
url = prepared_request.url | ||
scheme = urlparse(url).scheme | ||
new_proxies = proxies.copy() | ||
no_proxy = proxies.get('no_proxy') | ||
if scheme in proxies: | ||
return new_proxies | ||
|
||
bypass_proxy = should_bypass_proxies(url, no_proxy=no_proxy) | ||
if self.trust_env and not bypass_proxy: | ||
environ_proxies = get_environ_proxies(url, no_proxy=no_proxy) | ||
no_proxy = proxies.get('no_proxy') | ||
|
||
proxy = environ_proxies.get(scheme, environ_proxies.get('all')) | ||
environ_proxies = get_environ_proxies(url, no_proxy=no_proxy) | ||
The changes here were done mainly to avoid calling

IMHO this should be inside / under the
||
proxy = environ_proxies.get(scheme, environ_proxies.get('all')) | ||
if proxy: | ||
bypass_proxy = should_bypass_proxies(url, no_proxy=no_proxy) | ||
|
||
if proxy: | ||
if self.trust_env and not bypass_proxy: | ||
new_proxies.setdefault(scheme, proxy) | ||
|
||
if 'Proxy-Authorization' in headers: | ||
del headers['Proxy-Authorization'] | ||
|
||
try: | ||
username, password = get_auth_from_url(new_proxies[scheme]) | ||
except KeyError: | ||
username, password = None, None | ||
|
||
if username and password: | ||
headers['Proxy-Authorization'] = _basic_auth_str(username, password) | ||
|
||
return new_proxies | ||
|
||
def rebuild_method(self, prepared_request, response): | ||
|
@@ -633,7 +634,8 @@ def send(self, request, **kwargs): | |
kwargs.setdefault('stream', self.stream) | ||
kwargs.setdefault('verify', self.verify) | ||
kwargs.setdefault('cert', self.cert) | ||
kwargs.setdefault('proxies', self.rebuild_proxies(request, self.proxies)) | ||
if 'proxies' not in kwargs: | ||
I don't know this is correct. You've added no tests. Why do you feel this change is necessary?

@sigmavirus24 this change is needed because without it we are calling to re-build proxies from the environment even when proxies have been set/provided. In one use test case I have setup - the time to "retrieve" a cached connection (using the cache control adapter) 10,000 times goes from 9.741 seconds and 25212240 function calls to 9902240 function calls & 6.169 seconds after applying this change.

@sigmavirus24 the change in this line is twofold:
I intend to add tests, but as per the contribution guidelines I thought I'd get feedback as soon as I can to avoid unnecessary work.

Thanks for starting on this @omermizr. I think the logic on point 1 is sound, we shouldn't have passed
As for point 2, I'm hesitant to remove the stripping behavior directly from
At a very high level,
At that point we'd rewire things to effectively be:

    def rebuild_proxies(self, prepared_request, proxies):
        new_proxies = self.build_proxies(prepared_request, proxies)
        self.rebuild_proxy_auth(prepared_request, new_proxies)
        return new_proxies
    ...
    def send(self, request, **kwargs):
        ...
        if 'proxies' not in kwargs:
            kwargs['proxies'] = self.build_proxies(request, self.proxies)

@sigmavirus24 please disagree with me here if you think I'm on the wrong track. We may also want to consider making these utility functions, or private on the Session. I don't know how I feel about adding new interface surface. If this is a non-starter, we may want to consider simply reverting #5681 for now and going back to the drawing board.

In order to address the performance regression, it'd be enough to change the setdefault. I didn't want to do that because it would've introduced inconsistency with how the proxy auth header is handled (stripping the header would depend on whether 'proxies' is passed to the function or not). The way I see it we have a few options:
WDYT?

Yep, agreed, I don't think we want to have this inconsistency in the API as it's going to surprise most users, and is likely the wrong behavior when
For option 2, I've thought some more, and am leaning towards having the functionality potentially moved out to utils, so we're not adding new methods to Session. We may only need to move the "build" logic out and can leave the auth portion inline for
I don't know if option 3 works because we're changing the signature on a public API which is what we wanted to avoid. Anyone who's already extended this function, and is not using keyword arguments, will likely see unexpected breakage from that.

I've posted #5924 as a talking point for what a potential solution for #5888 may look like. Please let me know if you have any thoughts. As for #5891, the behavior of checking the environment for proxy settings before resolving for
For the time being, setting

I reviewed your PR, basically LGTM. Now, while the escape hatches are ok, I don't think that's really a good way to move forward considering the default behavior is now less performant. Given that

Thanks for taking a look! Completely agree, Requests has a pretty broad and diverse user base. It's always a difficult balancing act of usability and correctness that works for everyone. Hopefully we've arrived at one of the better outcomes for this case 😄
||
kwargs['proxies'] = self.rebuild_proxies(request, self.proxies) | ||
|
||
# It's possible that users might accidentally send a Request object. | ||
# Guard against that specific failure case. | ||
|
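Note on the `setdefault` change in the hunk above: Python evaluates the default argument before `dict.setdefault` runs, so the old line rebuilt proxies on every `send()` call, even when the caller had already passed `proxies`. The sketch below reproduces that effect with a counter. `rebuild_proxies_stub`, `send_eager`, and `send_lazy` are hypothetical stand-ins, not the library's functions.

```python
calls = {"count": 0}


def rebuild_proxies_stub(session_proxies):
    """Stand-in for an expensive environment lookup (hypothetical)."""
    calls["count"] += 1
    return dict(session_proxies)


def send_eager(**kwargs):
    # The argument to setdefault is evaluated before the call,
    # so the stub runs even when the caller supplied 'proxies'.
    kwargs.setdefault("proxies", rebuild_proxies_stub({}))
    return kwargs["proxies"]


def send_lazy(**kwargs):
    # The stub only runs when 'proxies' is actually missing.
    if "proxies" not in kwargs:
        kwargs["proxies"] = rebuild_proxies_stub({})
    return kwargs["proxies"]


explicit = {"http": "http://proxy.example:3128"}

calls["count"] = 0
send_eager(proxies=explicit)
eager_calls = calls["count"]  # the stub was still invoked

calls["count"] = 0
send_lazy(proxies=explicit)
lazy_calls = calls["count"]  # the stub was skipped
```

In this toy setup the explicit membership check is what avoids the redundant lookup; whether it accounts for the full timing difference reported in the thread is not something this sketch can show.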
This is a functionally public API. We can't change the signature like this
Oh my bad, I thought this was internal. I'll look into an alternative approach.
I can just extract this change to a different (private?) function and call it before rebuild_auth. WDYT?
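To illustrate the signature concern raised in the comments above, here is a rough sketch of why adding a positional parameter to an overridable method can break existing subclasses, and how a separate private helper avoids that. All class and method names other than `rebuild_auth` are hypothetical, and the bodies are placeholders rather than the library's logic.

```python
class BaseWithChangedSignature:
    def rebuild_auth(self, prepared_request, response, proxies):
        return "base"

    def follow_redirect(self):
        # The caller now passes a third positional argument.
        return self.rebuild_auth("req", "resp", {})


class BaseWithPrivateHelper:
    def rebuild_auth(self, prepared_request, response):
        return "base"

    def _rebuild_proxy_auth(self, prepared_request, proxies):
        # Hypothetical private helper that would hold the new logic.
        return "proxy-auth"

    def follow_redirect(self):
        # The helper runs first; the public method keeps its old signature.
        self._rebuild_proxy_auth("req", {})
        return self.rebuild_auth("req", "resp")


def make_subclass(base):
    # Simulates third-party code written against the two-argument signature.
    class Custom(base):
        def rebuild_auth(self, prepared_request, response):
            return "custom"

    return Custom


try:
    make_subclass(BaseWithChangedSignature)().follow_redirect()
    changed_ok = True
except TypeError:
    # The override does not accept the extra argument.
    changed_ok = False

helper_result = make_subclass(BaseWithPrivateHelper)().follow_redirect()
```

With the helper approach the existing override keeps working, at the cost of one more method on the class, which is the interface-surface trade-off discussed in the thread.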