Incorrect behavior with schemeless-dotless host:port URLs #6455
Comments
How does this realistically affect users who aren't using utils which aren't part of the public API? |
Ah sorry, I went so far down a rabbit hole tracking the issue down to this that I forgot to mention the user-facing scenario! One scenario is when using such a URL as a proxy:

    response = requests.get(
        "http://www.example.com",
        proxies={"http": "myproxy:8080"},
        ....
    )

This used to work (with requests 2.25.1 and Python 3.8), but with requests 2.27.1 it fails with:

    ...
    requests.exceptions.InvalidProxyURL: Please check proxy URL. It is malformed and could be missing the host.

Another scenario (less critical) is the difference in exceptions:

    response = requests.get("www.example.com")
    ...
    MissingSchema: Invalid URL 'www.example.com': No scheme supplied. Perhaps you meant http://www.example.com?

vs

    response = requests.get("hostname:8080")
    ...
    requests.exceptions.InvalidSchema: No connection adapters were found for 'hostname:8080'
|
So the last two aren't ever supposed to work. Would it be nice if they raised the same exception? Sure. But never should we be guessing scheme. We never document that schemeless URLs are supported in that way. The proxy case may take investigation. But I suspect we documented a breaking change there. I vaguely remember other people complaining |
Yes, it's definitely the proxy case that brought me here and is causing me issues. The other one is just a nit. |
Hi @itamaro, this was an intentional change in 2.27.0 due to this bug in CPython. The behavior of urlparse for URLs like this differs between Python versions. This is why you're seeing an error now that wasn't occurring previously.
Python 3.10
Python 3.7
The only reason this happened to work previously is that we did this (arguably bad) shuffle of replacing the netloc with the path for this case. That honestly should never have been done, but it was there to work around the oddities of URL parsing in the standard library.
So the short of the story is that all versions of Requests will be affected by this as soon as you upgrade beyond Python 3.8. We have no reliable way to control this from the Requests side anymore. I would recommend updating your usage to ensure proxies are passed with a scheme to avoid any surprises later.
Requests 2.25.1 behavior on Python 3.10
|
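The parsing difference described in the comment above can be checked directly against the standard library. The following is a minimal sketch, not code from Requests; the `describe` helper is a hypothetical name introduced only for this illustration, and the version-dependent result is left to the interpreter rather than asserted in comments.

```python
import sys
from urllib.parse import urlparse


def describe(url):
    """Return the (scheme, netloc, path) triple that urlparse produces."""
    parsed = urlparse(url)
    return parsed.scheme, parsed.netloc, parsed.path


# With an explicit scheme the host and port land in the netloc on every version.
with_scheme = describe("http://hostname:8080")

# Without a scheme the result depends on the interpreter: on Python 3.9+
# the text before the colon is parsed as the scheme and "8080" as the path.
without_scheme = describe("hostname:8080")

print(sys.version_info[:2], with_scheme, without_scheme)
```

Running this under two interpreters (for example 3.7 and 3.10) shows why the same schemeless proxy string can stop working after a Python upgrade.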
thanks for the background @nateprewitt !
totally agree this is the preferred solution, but it's not going to be easy doing that in our monorepo with millions of lines of code... (e.g. the entire Meta Python codebase 😬) |
Guys should I try solving this or is this redundant? |
@turingnixstyx I don't think there's anything to be solved. Requests cannot support schemeless proxies in Python 3.9+. It's an unfortunately tedious change for end-users, but we can't do much about the behavior in the standard library. What we're doing now is providing a consistent behavior across all Python versions. |
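For codebases that still carry schemeless proxy strings, one way to follow the recommendation above is to normalize them before they reach Requests. This is a minimal sketch under stated assumptions: `ensure_proxy_scheme` and `normalize_proxies` are hypothetical helpers, not part of Requests, the `"://"` check is a simple heuristic, and defaulting to `http` is an assumption that may not suit every proxy.

```python
def ensure_proxy_scheme(proxy_url, default_scheme="http"):
    """Return proxy_url unchanged if it already has a scheme, else prefix one.

    Assumption: any string containing "://" already carries a scheme.
    """
    if "://" in proxy_url:
        return proxy_url
    return f"{default_scheme}://{proxy_url}"


def normalize_proxies(proxies, default_scheme="http"):
    """Apply ensure_proxy_scheme to every value of a proxies mapping."""
    return {
        key: ensure_proxy_scheme(value, default_scheme)
        for key, value in proxies.items()
    }


# Hypothetical proxy names, used only for illustration.
proxies = normalize_proxies(
    {"http": "myproxy:8080", "https": "http://secure-proxy:3128"}
)
print(proxies)
```

The resulting mapping can then be passed as the `proxies` argument to `requests.get`, so the scheme is always explicit regardless of the Python version in use.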
This comment was marked as off-topic.
The urlparse stdlib call behaves differently on Python 3.9+ for schemeless URLs: it parses the hostname into the scheme instead of the netloc. The issue is still open and discussed in psf/requests#6455 and apparently will not be solved, so we need to work around it if we still want the Elasticsearch handler to accept legacy, schemeless URLs.
The urlparse stdlib call behaves differently on Python 3.9+ for schemeless URLs: it parses the hostname into the scheme instead of the netloc. The issue is still open and discussed in psf/requests#6455 and apparently will not be solved, so we need to work around it if we still want the Elasticsearch handler to accept legacy, schemeless URLs. GitOrigin-RevId: dd73a0bffa6c4de93a2dd8dc4460b64aedc51255
URLs of the form hostname:8080 (with no scheme, and with "hostname" not containing any dots) can be used to refer to the netloc "hostname:8080".
requests.utils.prepend_scheme_if_needed should correctly prepend the new_scheme when provided with such a URL.
Expected Result
the prepended-scheme URL should be "http://hostname:8080"
Actual Result
the prepended-scheme URL is "hostname:///8080" (e.g. treating the "hostname" part as the scheme, no host, no port, and "8080" as the path)
I extended the test_prepend_scheme_if_needed test to demonstrate this behavior (see main...itamaro:requests:schemeless-hostname-anad-port-bug)
Reproduction Steps
System Information