Index URLs containing unescaped "@" in username cause "Failed to parse" error in pip 19.2 #6775

bomri · 2019-07-23T11:16:44Z

Environment

pip version: 19.2
Python version: 3.6.8
OS: ubuntu 18.04

Description
After updating to pip-19.2 from pip-19.1.1 installing packages with --extra-index-url doesn't work and I get:

ERROR: Could not find a version that satisfies the requirement package_name (from versions: none)
ERROR: No matching distribution found for package_name

going back to 19.1.1 works fine

shay108 · 2019-07-23T11:23:32Z

+1
Yup. Happened to us as well.

TTRh · 2019-07-23T11:43:07Z

We also have issue with pip-19.2 when using specific index url :

Looking in indexes: https://<email.com>:****@<artifactory_host>/artifactory/api/pypi/pypi-virtual/simple
Collecting setuptools
ERROR: Could not install packages due to an EnvironmentError: Failed to parse: https://<email.com>:[secure]@<artifactory_host>/artifactory/api/pypi/pypi-virtual/simple/setuptools/

It's working fine with pip-19.1.1. Could it be related ?

cletomartin · 2019-07-23T11:47:13Z

@TTRh just saw the same and open #6776. Not sure if it should be handled in the same issue though.

chrahunt · 2019-07-23T11:59:00Z

@bomri, can you please re-run with --verbose and provide the output?

brainerazer · 2019-07-23T12:28:14Z

Might be related to:
urllib3/urllib3#1640
urllib3/urllib3#1586

philipp-lein · 2019-07-23T12:30:50Z

We had the same problem with a special character in our password.

pradyunsg · 2019-07-23T13:09:11Z

Hi folks! Thanks for filing this issue and chiming in here.

It'd be best that if you're facing this issue, that you use GitHub's reactions on the first post and subscribe to this issue. That way we can avoid "me too" comments which don't really add much value to the discussion.

It'll make it easier for the maintainers to discuss how to go about resolving this issue, in this issue itself and provide updates to y'all more easily. 🙃

bomri · 2019-07-23T13:26:13Z

@chrahunt running with --verbose I get 404 on the pypi.org as expected (since the package doesn't exists there) but from my own pypi server I get:

403 Client Error: Forbidden for url:...

maybe it is related to special characters in the token id like @brainerazer mentioned

slafs · 2019-07-23T13:26:23Z

This has bitten us too. Not sure if this issue is related to the ones @brainerazer linked for urllib3. Our issue appears when trying to download a released .zip package from GitHub with an access token. So stuff like pip install https://<my-access-token>@github.com/myorganization/myprivaterepo/archive/myrelease.zip#egg=mypackage doesn't work anymore with pip 19.2 (works for 19.1.1 as someone already mentioned).

I've created an empty private repo, with an empty release to test that. The output for pip install -vvv ... can be found here:

for pip 19.2,
for pip 19.1.1 (with expected setup.py not found error).

The issue/difference can be observed around Starting new HTTPS connection (1): github.com:443.

pradyunsg · 2019-07-23T13:30:13Z

Maybe @cjerdonek knows more about this stuff than me.

slafs · 2019-07-23T13:31:04Z

I've also run a git bisect session for this issue and it pointed to this commit: ~~eeb74aeb29~~ c63ee61

pradyunsg · 2019-07-23T13:52:01Z

My thoughts: the current behavior is more technically correct and people should start using the new keyring support for this stuff.

At the same time, it definitely isn't nice that this broke existing reasonable workflows. I think if someone can figure out how to re-introduce the legacy behavior here, that'd be great.

Based on pip's documented deprecation policy, we don't need to have a deprecation cycle for such undocumented behaviors. None the less, we should do a deprecation cycle, warning users to switch to the more "correct" syntax or tooling here before dropping support and switching to a nicer error message.

I'm not sure that the bisection found the right commit here but none the less, thanks for doing that!

shay108 · 2019-07-23T14:19:43Z

@pradyunsg, maybe I misunderstand.. can you please direct me to the needed "current behavior"?
If we need to refactor the way we use 3rd party repos (in our case gemfury.com) then that's fine. I just don't understand what the right syntax should be :)

slafs · 2019-07-23T14:21:19Z

I'm not sure that the bisection found the right commit here but none the less, thanks for doing that!

you're totally right. I corrected my test script and now it points to c63ee61 which makes more sense now I guess

pradyunsg · 2019-07-23T15:47:41Z

can you please direct me to the needed "current behavior"?

Whoops. Yes. (This is why I needed a nap)

The issue here is that the parts of a URL should be... urlencoded.

Thus, the @ in the email used, as a username, should be %40. The following is in fact not a valid URL based on the RFC that defines these things:

https://[email protected]:[email protected]/pypi

The correct way to represent this information would be:

https://awesome%40pradyunsg.me:[email protected]/pypi

If someone has an issue with that... They should probably take that up with the folks who wrote that RFC. It's mentioned in the urllib issue linked further up in this thread.

Aside: @markuszoeller had pointed out the workaround in a different issue that's been closed as a duplicate of this.

What I think I'm suggesting in that comment... is that pip could detect when the user does use such invalid URLs (since it accepted that in the past), warn the user, and work like 19.1.1 -- after some time, the warning and workaround would be removed.

If y'all do think that's necessary after this breakage, do let us know.

I'll be more than happy to just skip the additional work if it's not helpful to the folks affected by it -- there's no point in doing things that are not useful for the end users. 🙃

pradyunsg · 2019-07-23T15:53:39Z

I corrected my test script and now it points to c63ee61 which makes more sense now I guess

Indeed. I can see there why we weren't catching these uses earlier -- pip was doing "home grown" processing of the index URL's auth information which obviously didn't consider such input invalid. What changed is we're now using a urllib utility function to get parts of the URL, which is stricter and RFC compliant.

slafs · 2019-07-23T15:57:26Z

pip was doing "home grown" processing of the index URL's auth information which obviously didn't consider such input invalid. What changed is we're now using a urllib3 utility function to get parts of the URL, which is stricter and RFC compliant.

right, now I get it, thanks @pradyunsg. For those watching at home ;) as a fix one can pass https://<my-access-token>:@github.com/myorganization/myprivaterepo/archive/myrelease.zip#egg=mypackage (notice the : after the access token) which makes the URL more RFC conformant (does it?) and pip will happily install the package.

EDIT:
or (even better) https://:<my-access-token>@github.com/...

pradyunsg · 2019-07-23T16:03:54Z

makes the URL more RFC conformant (does it?)

No clue. I'm not a subject matter expert.

All that I've said above is basically a well informed guess that I've made by looking at all the relevant bits of code on my phone... because my laptop is too far away from the bed. 🙃

pradyunsg · 2019-07-23T16:12:34Z

https://:@github.com/myorganization/myprivaterepo/

Noting the last thing that I'm going to say here based on (only) intuition + memory, this seems like pip incorrectly requiring both username / password components to be present.

If that's the case, it's almost certainly a bug, albeit a separate one.

bomri · 2019-07-23T16:15:21Z

I tried @slafs last comment and it seems to be working fine for me.
Adding ":" after the token works while adding it before the token does not

slafs · 2019-07-27T10:23:04Z

Following #6775 (comment), I've opened #6796 for the "one element" auth in the package URL.

chrahunt · 2019-07-27T13:23:00Z

Thanks @slafs. In that case I think we can close this one.

cjerdonek · 2019-07-27T16:07:27Z

How about documenting this case in the docs (including with an example) since it seems common and can trip people up?

chrahunt · 2019-07-27T17:13:26Z

@cjerdonek, added #6797 to track that action.

pradyunsg · 2019-07-30T14:59:35Z

I've posted a PR (#6818), that fixes both of our URL handling issues (this and #6796).

pradyunsg · 2019-07-30T15:04:31Z

I'd appreciate if folks from this thread could take #6818 for a spin and let me know if it works.

To install pip from that branch, into your local environment, run:

pip install https://github.com/pradyunsg/pip/archive/fix/URL-authentication-handling.zip

cjerdonek · 2019-07-30T17:34:30Z

I don’t think that PR fixes this issue? I thought this issue was about usernames that include an @ when a password is present (it doesn’t have to do with None / non-None), and the other issue is about no password being present. But I don’t see the former among the test cases. (There would need to be two unescaped @ symbols in the test cases.) Also, I thought fixing the former would mean we’d have to “fix” / rewrite the URL, which I don’t see being done in the PR.

pradyunsg · 2019-07-30T18:21:33Z

I don’t see the former among the test cases

Ah, yes. My bad, I omitted the test for it. I've gone ahead and added that case to the test suite -- running it locally confirms that it extracted the email address correctly.

"http://[email protected]:[email protected]/path" -> ("http://example.com/path", "[email protected]", "password")

I thought fixing the former would mean we’d have to “fix” / rewrite the URL, which I don’t see being done in the PR.

Ah, yes. Me too initially.

However, I looked at the diff of requests and urllib3 vendors and didn't see anything in a code path that affected us directly here. I decided to try seeing if reverting to the changes from the original structure worked and it seems like it did.

My understanding now, is that it's got something to do with requests' default handling of URLs that contain auth information and (somehow) we're circumventing those issues in our Session subclass.

cjerdonek · 2019-07-30T18:56:45Z

My understanding now, is that it's got something to do with requests' default handling of URLs that contain auth information and (somehow) we're circumventing those issues in our Session subclass.

No, if you look at my comment above where I included the traceback for this issue, you'll see that we pass the original URL: #6775 (comment)
And your PR doesn't "fix" / rewrite the original URL, so that I wouldn't expect that behavior to change in your PR.

Just to be sure, I retried that command in your PR and confirmed that the same error still happens.

cjerdonek · 2019-07-30T18:59:13Z

Here is my comment above where I talk about one approach to fixing the current issue: #6775 (comment)

pradyunsg · 2019-07-31T03:09:58Z

Right, makes sense to me. I think I was mixing issues in my head.

pradyunsg · 2019-07-31T06:08:44Z

How about documenting this case in the docs (including with an example) since it seems common and can trip people up?

It would also make sense to abort in this situation, and print a warning if there's an unescaped "@" in the URL. It's probably more discoverable than the documentation and is friendlier than having it in the documentation.

Summarizing the state of affairs in pip 19.2, for my stupid self:

Authentication with only a token (i.e. username, no password) is broken -- pip's current logic assumes both parts would be present. That's what pip >=19.2 cannot install packages via custom URLs with "one element" auth #6796 is for.
Authentication with a username that's not URL-encoded does not work anymore -- pip 19.2 fails with a traceback, pip 19.1.1 accepts such URLs. The source of this issue is that urllib3 has become more strict in parsing such URLs.

amritbhat786 · 2019-08-02T07:47:09Z

Hi, we are facing the same issue with pip 18.1, urllib3(1.24.1) & Python2.7 but in the password section.
https://pypi.org/simple --extra-index-url http://user:password@@124.4.56:8086 --trusted-host 124.4.56
https://pypi.org/project/urllib3/ doesn't list any change related to the RFC 3986 specification above. Also @cjerdonek @pradyunsg could you please recommend stable versions of both the packages so that we can rather install them.

pradyunsg · 2019-08-11T17:43:50Z

Released pip 19.2.2 containing a fix for the tokens needing a :.

For folks facing issues with @ in their passwordds, the suggested path forward is to escape the @ in the URLs.

bjohnson-va · 2019-08-15T20:49:41Z

Edit: Doesn't seem to be related to the '%', as I changed my password to "password" and got the same result.

Another kink in this: my old generated password (used to) end in "%".

On version 19.2.2

Replacing the % with %25 or leaving it alone, I see the same error:

(Assuming my password was: password%)

ERROR: Exception:
Traceback (most recent call last):
  File "/Library/Python/2.7/site-packages/pip/_internal/cli/base_command.py", line 188, in main
    status = self.run(options, args)
  File "/Library/Python/2.7/site-packages/pip/_internal/commands/install.py", line 345, in run
    resolver.resolve(requirement_set)
  File "/Library/Python/2.7/site-packages/pip/_internal/legacy_resolve.py", line 196, in resolve
    self._resolve_one(requirement_set, req)
  File "/Library/Python/2.7/site-packages/pip/_internal/legacy_resolve.py", line 359, in _resolve_one
    abstract_dist = self._get_abstract_dist_for(req_to_install)
  File "/Library/Python/2.7/site-packages/pip/_internal/legacy_resolve.py", line 307, in _get_abstract_dist_for
    self.require_hashes
  File "/Library/Python/2.7/site-packages/pip/_internal/operations/prepare.py", line 134, in prepare_linked_requirement
    req.populate_link(finder, upgrade_allowed, require_hashes)
  File "/Library/Python/2.7/site-packages/pip/_internal/req/req_install.py", line 211, in populate_link
    self.link = finder.find_requirement(self, upgrade)
  File "/Library/Python/2.7/site-packages/pip/_internal/index.py", line 1201, in find_requirement
    req.name, specifier=req.specifier, hashes=hashes,
  File "/Library/Python/2.7/site-packages/pip/_internal/index.py", line 1183, in find_candidates
    candidates = self.find_all_candidates(project_name)
  File "/Library/Python/2.7/site-packages/pip/_internal/index.py", line 1111, in find_all_candidates
    if self._validate_secure_origin(logger, link)
  File "/Library/Python/2.7/site-packages/pip/_internal/index.py", line 998, in _validate_secure_origin
    origin = (parsed.scheme, parsed.hostname, parsed.port)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urlparse.py", line 113, in port
    port = int(port, 10)
ValueError: invalid literal for int() with base 10: 'password%%40mydomain.com'

or

ValueError: invalid literal for int() with base 10: 'password%25%40mydomain.com'

booleand · 2019-08-16T07:59:01Z

Can you try encoding only the credential portion of the URL - e.g. password%[email protected]?

bjohnson-va · 2019-08-16T17:42:45Z

Switching to password%[email protected] does allow me to proceed.

pradyunsg · 2019-08-22T03:35:07Z

There's nothing more actionable here. As a closing note, I'll summarize this issue and make a call-to-action.

What happened?

pip 19.2 now requires credentials to be URL quoted. Thus, characters like @ and % need to be URL quoted (like %40 and %25).

Thus, the following index-urls will no longer work:

https://[email protected]:[email protected]/
https://user:password%@mydomain.com/

The equivalents that do work, are:

https://pradyunsg%40email.com:[email protected]/
https://user:password%[email protected]/

Why?

pip (and requests) use urllib3 for a lot of the underlying URL handling. urllib3 has introduced stricter URL parsing in the version that pip upgraded to in 19.2. Since this strictness was only "caught" after the release, pip's developers decided to not to circumvent the stricter parsing. The workflows for affected users had already broken and none pointed out that it'd be particularly difficult for them to quote their credentials properly.

How can you help?

pip's developers do not want to be breaking their users' workflows. However, this is difficult since pip is a project maintained entirely by volunteers, with a legacy codebase that makes on-boarding additional contributors difficult.

There is a substantial lack of developer time on pip, especially given the impact that changes in it, can have on users. To help us avoid issues like this and make more substantial improvements to pip, consider funding Python Packaging improvements. Funding such projects related to pip, would allow current maintainers be able to work on pip for more time, and bring in additional experts (UX experts, project managers, technical writers etc) to make pip better.

triage-new-issues bot added the S: needs triage Issues/PRs that need to be triaged label Jul 23, 2019

pradyunsg mentioned this issue Jul 23, 2019

pip 19.2 install raises "Failed to parse" if username contains '@' #6776

Closed

pradyunsg changed the title ~~error installing packages from --extra-index-url in pip 19.2~~ Index URLs containing @ in username are not parsed appropriately in pip 19.2 Jul 23, 2019

slafs mentioned this issue Jul 23, 2019

AttributeError: 'NoneType' object has no attribute 'is_hash_allowed' #6772

Closed

pradyunsg added C: finder PackageFinder and index related code type: bug A confirmed bug or unintended behavior and removed S: needs triage Issues/PRs that need to be triaged labels Jul 23, 2019

This comment has been minimized.

Sign in to view

bomri closed this as completed Jul 23, 2019

bomri reopened this Jul 23, 2019

slafs mentioned this issue Jul 27, 2019

pip >=19.2 cannot install packages via custom URLs with "one element" auth #6796

Closed

chrahunt mentioned this issue Jul 27, 2019

Document providing credentials in index, index-url, and extra-index-url #6797

Closed

pradyunsg mentioned this issue Jul 30, 2019

Fix handling of tokens (single part credentials) in URLs #6818

Merged

alegonz mentioned this issue Jul 31, 2019

Installing from private repository using --extra-index-url results in a prompt for user credentials #6820

Closed

kevinjfoley mentioned this issue Jul 31, 2019

Symlinks point nowhere pypa/pipx#192

Closed

pradyunsg mentioned this issue Aug 1, 2019

Print warning to ask user to escape URL usernames #6822

Closed

pradyunsg mentioned this issue Aug 6, 2019

Unescaped space in proxy URL causes "Failed to parse" error in pip 19.2.1 #6839

Closed

pradyunsg closed this as completed Aug 22, 2019

jmons mentioned this issue Sep 2, 2019

Certain passwords fail with Extra-index-url with password in pip.conf #6961

Open

pradyunsg unpinned this issue Sep 7, 2019

lock bot added the auto-locked Outdated issues that have been locked by automation label Sep 21, 2019

lock bot locked as resolved and limited conversation to collaborators Sep 21, 2019

Index URLs containing unescaped "@" in username cause "Failed to parse" error in pip 19.2 #6775

Index URLs containing unescaped "@" in username cause "Failed to parse" error in pip 19.2 #6775

Comments

bomri commented Jul 23, 2019 • edited Loading

shay108 commented Jul 23, 2019

TTRh commented Jul 23, 2019 • edited Loading

cletomartin commented Jul 23, 2019

chrahunt commented Jul 23, 2019

brainerazer commented Jul 23, 2019

philipp-lein commented Jul 23, 2019 • edited Loading

pradyunsg commented Jul 23, 2019

bomri commented Jul 23, 2019

slafs commented Jul 23, 2019 • edited Loading

pradyunsg commented Jul 23, 2019

slafs commented Jul 23, 2019 • edited Loading

pradyunsg commented Jul 23, 2019

shay108 commented Jul 23, 2019

slafs commented Jul 23, 2019

This comment has been minimized.

pradyunsg commented Jul 23, 2019 • edited Loading

pradyunsg commented Jul 23, 2019 • edited Loading

slafs commented Jul 23, 2019 • edited Loading

pradyunsg commented Jul 23, 2019

pradyunsg commented Jul 23, 2019 • edited Loading

bomri commented Jul 23, 2019

slafs commented Jul 27, 2019

chrahunt commented Jul 27, 2019

cjerdonek commented Jul 27, 2019

chrahunt commented Jul 27, 2019

pradyunsg commented Jul 30, 2019

pradyunsg commented Jul 30, 2019

cjerdonek commented Jul 30, 2019

pradyunsg commented Jul 30, 2019

cjerdonek commented Jul 30, 2019

cjerdonek commented Jul 30, 2019

pradyunsg commented Jul 31, 2019

pradyunsg commented Jul 31, 2019

amritbhat786 commented Aug 2, 2019

pradyunsg commented Aug 11, 2019 • edited Loading

bjohnson-va commented Aug 15, 2019 • edited Loading

booleand commented Aug 16, 2019

bjohnson-va commented Aug 16, 2019

pradyunsg commented Aug 22, 2019 • edited Loading

bomri commented Jul 23, 2019 •

edited

Loading

TTRh commented Jul 23, 2019 •

edited

Loading

philipp-lein commented Jul 23, 2019 •

edited

Loading

slafs commented Jul 23, 2019 •

edited

Loading

slafs commented Jul 23, 2019 •

edited

Loading

pradyunsg commented Jul 23, 2019 •

edited

Loading

pradyunsg commented Jul 23, 2019 •

edited

Loading

slafs commented Jul 23, 2019 •

edited

Loading

pradyunsg commented Jul 23, 2019 •

edited

Loading

pradyunsg commented Aug 11, 2019 •

edited

Loading

bjohnson-va commented Aug 15, 2019 •

edited

Loading

pradyunsg commented Aug 22, 2019 •

edited

Loading