Don't overuse the keyring #8687

Merged
merged 1 commit into from
Feb 18, 2021

Conversation

hroncok
Contributor

@hroncok hroncok commented Aug 3, 2020

See the individual commits and their messages.

Fixes #8090

@hroncok
Contributor Author

hroncok commented Aug 5, 2020

OK, I've verified this works as expected via https://gist.github.com/encukou/7d227abb62b50af5ed3539dfb725331d (thanks @encukou).

I have a video recording of the behavior that I can share with the reviewer if needed.

pradyunsg
pradyunsg previously approved these changes Aug 7, 2020
Member

@pradyunsg pradyunsg left a comment

LGTM, based on a desk-review and reading lots of related code in requests/keyring/pip.

The automated test seems "close enough" to me and I'm happy to trust @hroncok that this works based on their manual tests. :)

@pradyunsg pradyunsg added this to the 20.2.2 milestone Aug 7, 2020
@pradyunsg
Member

I'm wondering if we should slot this into 20.2.2 -- any thoughts from others on this front? It's technically not a bugfix for something added in 20.2, but it's a substantial enough improvement that we should release this ASAP IMO.

@uranusjr
Member

uranusjr commented Aug 7, 2020

I agree, it’s better to get this out sooner rather than later.

# Prompt the user for a new username and password
username, password, save = self._prompt_for_password(parsed.netloc)
save = False
if not username or not password:
Member

@chrahunt chrahunt Aug 7, 2020

Will this have an impact on people that have stored API keys in keyring, or have only provided a username or password in response to a prompt? Not sure how prevalent that is but it can appear as only one of these being set.

Member

For context on single-part login credentials, see #6796.

Contributor Author

In other words, this should probably be: if not username only, right?

Member

@chrahunt chrahunt Aug 8, 2020

So I took a closer look. Previously we did:

  1. Check up to 4 places (including keyring) for credentials based on the request URL (_get_new_credentials)
  2. Make request
  3. Get 401 and unconditionally prompt user for credentials with the response URL (handle_401)
  4. Make followup request (handle_401)
  5. Get 401 and warn user (warn_on_401)

Now we do:

  1. Check up to 3 places (not keyring) for credentials based on the request URL (_get_new_credentials)
  2. Make request
  3. Get 401 and check up to 4 places for credentials (including keyring) based on the response URL (handle_401)
  4. If that doesn't return something reasonable, then prompt the user
  5. Make followup request (handle_401)
  6. Get 401 and warn user (warn_on_401)

I think there's a few things to note with the changes we're making here:

  1. The response URL may be different from the request URL in case of redirects, so the second time getting the credentials may behave differently
  2. In the case where the response URL and request URL are the same, it may just try the same credentials again incorrectly
  3. The response URL shouldn't contain credentials, so we don't have to worry about retrying with the same embedded credentials

It seems pretty complicated and to make headway we'd probably want to refactor some of this class so it doesn't get worse. Maybe we can avoid that and just disable keyring for pypi URLs? Especially if this is a change people want to get in soon.
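
A minimal sketch of the revised flow listed above, to make the ordering concrete. This is not pip's code: `lookup`, `fetch`, `send`, `cheap_sources`, `keyring_source`, and `prompt` are hypothetical names, and the "something reasonable" check in step 4 is modelled here as "at least one part of the credential is set", which is only one of the options discussed in this thread.

```python
from typing import Callable, List, Optional, Tuple

Credentials = Tuple[Optional[str], Optional[str]]
Source = Callable[[str], Credentials]


def lookup(url: str, sources: List[Source]) -> Credentials:
    """Return the first credentials with any part set, else (None, None)."""
    for source in sources:
        username, password = source(url)
        if username or password:
            return username, password
    return None, None


def fetch(request_url: str,
          send: Callable[[str, Credentials], Tuple[int, str]],
          cheap_sources: List[Source],
          keyring_source: Source,
          prompt: Callable[[str], Credentials]) -> int:
    """Hypothetical model: the keyring is consulted only after a 401."""
    # Steps 1-2: first attempt, with the keyring deliberately skipped.
    creds = lookup(request_url, cheap_sources)
    status, response_url = send(request_url, creds)
    if status != 401:
        return status

    # Step 3: look again, now including the keyring, keyed on the response
    # URL (which may differ from the request URL after a redirect).
    creds = lookup(response_url, cheap_sources + [keyring_source])

    # Step 4: prompt only if nothing at all came back (assumed condition).
    if not creds[0] and not creds[1]:
        creds = prompt(response_url)

    # Steps 5-6: one follow-up request; a second 401 is just reported.
    status, _ = send(response_url, creds)
    return status
```

In this model a request that succeeds on the first attempt never touches `keyring_source`, which is the behavior the PR is after for the default index.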

Contributor

From the last three points:

  1. I think this is the correct behaviour. Handling redirects feels like goodness (e.g. I can bit.ly an index URL), and if a redirect turns out to be malicious, we don't want to pass along legitimate credentials to a different site
  2. Unavoidable, probably? But we're in a failure case by then anyway, so an extra request is not the end of the world.
  3. Correct, though if _get_new_credentials can map it back to a user-specified index url that has embedded credentials, those will be used

It is complicated, but I think this all lines up.

For the single-part credential issue, we probably want "if not username and not password:", on the assumption that if we got a partial credential we should still try it. _get_new_credentials has the logic for trying to turn a partial credential into a full one.
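
To illustrate the difference between the two conditions for single-part credentials, here is a small standalone comparison. The function names are hypothetical and exist only for this example; the two boolean expressions are the ones quoted in the review.

```python
def prompt_with_or(username, password):
    # Condition in the diff under review: prompt unless both parts are set.
    return not username or not password


def prompt_with_and(username, password):
    # Suggested alternative: prompt only when nothing at all was found.
    return not username and not password


cases = [
    ("user", "secret"),  # full credential
    ("token", None),     # single-part credential, e.g. an API key
    (None, "secret"),    # password only
    (None, None),        # nothing found
]

for username, password in cases:
    print(
        f"{username!r:>8} {password!r:>9} "
        f"or-prompts={prompt_with_or(username, password)} "
        f"and-prompts={prompt_with_and(username, password)}"
    )
```

The two conditions only disagree on the middle two cases: with "or" a stored single-part credential is discarded in favor of a prompt, while with "and" it is tried as-is.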

src/pip/_internal/network/auth.py Outdated
@@ -110,7 +112,7 @@ def _get_index_url(self, url):
         return None

     def _get_new_credentials(self, original_url, allow_netrc=True,
-                             allow_keyring=True):
+                             allow_keyring=False):
Member

Have we looked at the impact this might have on keyring providers and associated package index assumptions? From #8103 it seems like Azure DevOps gives us a redirect, not a 401, when authentication is needed, and so may depend on our eager usage of artifacts-keyring. Not that it should necessarily block us, but this may not be a low impact change.

Contributor Author

Honestly, I have no idea. My drive is to fix this for users who use the default pypi.org index. While I've tried to make it work for the others as well, I don't have access to such setups.

Member

Great point. Can we just disable the initial keyring support for pypi-related URLs? We keep them here. I think that would narrow the scope of this enough that we can defer investigating my previous concern.
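
A rough sketch of what scoping the change to PyPI could look like, assuming a host-based check. `PYPI_HOSTS` and `allow_keyring_on_first_request` are hypothetical names for illustration, and the host set is an assumption rather than the list pip actually keeps.

```python
from urllib.parse import urlsplit

# Assumed host list for illustration only; pip maintains its own
# definitions of PyPI-related URLs.
PYPI_HOSTS = {"pypi.org", "files.pythonhosted.org"}


def allow_keyring_on_first_request(url: str) -> bool:
    """Skip the eager keyring lookup for PyPI, keep it for other indexes."""
    host = urlsplit(url).hostname
    return host not in PYPI_HOSTS
```

Under this approach other indexes, such as the Azure DevOps feeds mentioned above, would keep the existing eager keyring lookup, so only the default-index case changes.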

Contributor

FWIW, Azure DevOps should return 401 now, but it's filtered on user agents to make sure that people who get given the URL land on a useful page. So you can see it through an actual pip install, but likely not if you curl directly.

@pradyunsg
Member

@chrahunt @hroncok Are there more changes necessary here on this PR right now? If yes, does anyone have a rough estimate about when they might be addressed?

This is the only outstanding PR in the milestone right now. If this is gonna take more than a couple of days or so, I'd like to cut 20.2.2 release without this.

@hroncok
Contributor Author

hroncok commented Aug 10, 2020

I won't be able to work in all the stuff that @chrahunt talked about any time soon.

@chrahunt Can we get the first commit in for now?

@pradyunsg
Member

I don't speak for Chris, but I'd certainly be OK with that. :)

@pfmoore
Member

pfmoore commented Aug 10, 2020

My only concern is that @chrahunt seems to be suggesting that we might break people using this for Azure DevOps. I don't have any particular opinion on the correct thing to do here, but if that is a risk (and not just my misunderstanding) do we have a plan for the possibility that we release 20.2.2 with this in, and immediately get cries from Azure users that we need a 20.2.3 to fix their issue?

(I'm fine if that plan is "tell them they'll have to pin to 20.2.1 or wait for 20.3", I just think we should know up front what we expect to say).

@pradyunsg
Member

I'll most likely cut this tomorrow or the day after (~12/36 hours?) and I'll be around for at least 5 hours after that. If stuff goes down, I'll be able to cut a 20.2.3 that just reverts the offending commit.

That's assuming I get a response from Chris here that he thinks adding both commits would be better than just one. If not, I think it'd be OK to force-push on this branch (I'm assuming Miro hasn't checked off the relevant box) and merge just the first commit.

@hroncok
Contributor Author

hroncok commented Aug 10, 2020

See #8744 for just the first commit.

@chrahunt
Member

> Are there more changes necessary here on this PR right now?

I'd be mostly OK with this PR if we just scoped it to PyPI URLs. If we go that route we may want to confirm that there are no plans to offer private repos on PyPI anytime soon.

> I won't be able to work in all the stuff that @chrahunt talked about any time soon.

Sorry if I worded it poorly. In my mind I think all my concerns would be addressed if we just set allow_keyring to False on the first request for PyPI URLs. If we don't want to do that then it gets more complicated, as described in #8687 (comment).

> @chrahunt Can we get the first commit in for now?

I'm totally on board for that, I gave my 👍 on #8744.

> My only concern is that @chrahunt seems to be suggesting that we might break people using this for Azure DevOps.

Yes, I think that's a risk, based on what we've seen in a few recent issues. I have some instructions here and can test shortly.

@chrahunt
Member

I was able to successfully install an artifact using this branch and the instructions from #8103 (comment).

output
$ ./env/bin/python -m pip install -vvv --extra-index-url=https://pkgs.dev.azure.com/chrahunt/example3/_packaging/example3-feed/pypi/simple/ example3
Using pip 20.3.dev0 from /tmp/user/1000/tmp.AfSvFq6j8i/env/lib/python3.8/site-packages/pip (python 3.8)
Non-user install because user site-packages disabled
Created temporary directory: /tmp/user/1000/pip-ephem-wheel-cache-dlpuqld8
Created temporary directory: /tmp/user/1000/pip-req-tracker-gvbk9pnb
Initialized build tracking at /tmp/user/1000/pip-req-tracker-gvbk9pnb
Created build tracker: /tmp/user/1000/pip-req-tracker-gvbk9pnb
Entered build tracker: /tmp/user/1000/pip-req-tracker-gvbk9pnb
Created temporary directory: /tmp/user/1000/pip-install-ys6ad2ys
Looking in indexes: https://pypi.org/simple, https://pkgs.dev.azure.com/chrahunt/example3/_packaging/example3-feed/pypi/simple/
2 location(s) to search for versions of example3:
* https://pypi.org/simple/example3/
* https://pkgs.dev.azure.com/chrahunt/example3/_packaging/example3-feed/pypi/simple/example3/
Fetching project page and analyzing links: https://pypi.org/simple/example3/
Getting page https://pypi.org/simple/example3/
Found index url https://pypi.org/simple
Looking up "https://pypi.org/simple/example3/" in the cache
Request header has "max_age" as 0, cache bypassed
Starting new HTTPS connection (1): pypi.org:443
https://pypi.org:443 "GET /simple/example3/ HTTP/1.1" 404 13
Status code 404 not in (200, 203, 300, 301)
Could not fetch URL https://pypi.org/simple/example3/: 404 Client Error: Not Found for url: https://pypi.org/simple/example3/ - skipping
Fetching project page and analyzing links: https://pkgs.dev.azure.com/chrahunt/example3/_packaging/example3-feed/pypi/simple/example3/
Getting page https://pkgs.dev.azure.com/chrahunt/example3/_packaging/example3-feed/pypi/simple/example3/
Found index url https://pkgs.dev.azure.com/chrahunt/example3/_packaging/example3-feed/pypi/simple/
Looking up "https://pkgs.dev.azure.com/chrahunt/example3/_packaging/example3-feed/pypi/simple/example3/" in the cache
Request header has "max_age" as 0, cache bypassed
Starting new HTTPS connection (1): pkgs.dev.azure.com:443
https://pkgs.dev.azure.com:443 "GET /chrahunt/example3/_packaging/example3-feed/pypi/simple/example3/ HTTP/1.1" 401 307
Found index url https://pkgs.dev.azure.com/chrahunt/example3/_packaging/example3-feed/pypi/simple/
Getting credentials from keyring for https://pkgs.dev.azure.com/chrahunt/example3/_packaging/example3-feed/pypi/simple/
Starting new HTTPS connection (1): pkgs.dev.azure.com:443
https://pkgs.dev.azure.com:443 "GET /chrahunt/example3/_packaging/example3-feed/pypi/simple/ HTTP/1.1" 401 307
Starting new HTTPS connection (1): pkgs.dev.azure.com:443
https://pkgs.dev.azure.com:443 "GET /chrahunt/example3/_packaging/example3-feed/pypi/simple/ HTTP/1.1" 404 46
Found credentials in keyring for pkgs.dev.azure.com
Status code 401 not in (200, 203, 300, 301)
Looking up "https://pkgs.dev.azure.com/chrahunt/example3/_packaging/example3-feed/pypi/simple/example3/" in the cache
Request header has "max_age" as 0, cache bypassed
https://pkgs.dev.azure.com:443 "GET /chrahunt/example3/_packaging/example3-feed/pypi/simple/example3/ HTTP/1.1" 200 None
Updating cache with response from "https://pkgs.dev.azure.com/chrahunt/example3/_packaging/example3-feed/pypi/simple/example3/"
Caching b/c of expires header
  Found link https://pkgs.dev.azure.com/chrahunt/0be4dbf5-2e6a-4329-b457-6bfc3813daf9/_packaging/40f8bcae-0cca-47f2-afd9-ef0739ae93db/pypi/download/example3/0.1/example3-0.1.0-py3-none-any.whl#sha256=ff49370bdba639695bf4d3e96b80b6b0e8a3c65011c1cef658a3885c7d913c58 (from https://pkgs.dev.azure.com/chrahunt/example3/_packaging/example3-feed/pypi/simple/example3/), version: 0.1.0
Given no hashes to check 1 links for project 'example3': discarding no candidates
Using version 0.1.0 (newest of versions: 0.1.0)
Collecting example3
  Created temporary directory: /tmp/user/1000/pip-unpack-4rb_2u_z
  Looking up "https://pkgs.dev.azure.com/chrahunt/0be4dbf5-2e6a-4329-b457-6bfc3813daf9/_packaging/40f8bcae-0cca-47f2-afd9-ef0739ae93db/pypi/download/example3/0.1/example3-0.1.0-py3-none-any.whl" in the cache
  No cache entry available
  https://pkgs.dev.azure.com:443 "GET /chrahunt/0be4dbf5-2e6a-4329-b457-6bfc3813daf9/_packaging/40f8bcae-0cca-47f2-afd9-ef0739ae93db/pypi/download/example3/0.1/example3-0.1.0-py3-none-any.whl HTTP/1.1" 303 0
  Status code 303 not in (200, 203, 300, 301)
  Looking up "https://ajhvsblobprodcus363.blob.core.windows.net/b-0b71cd34f9c34172994c8b112c0efbc5/DBCE858FD979095ABFDDD6486916184A6D96809832BC940EDF646466AA55BB4D00.blob?sv=2019-02-02&sr=b&si=1&sig=CLumZkVQO3JWN3JIWgGeIvjOzsJ7dGughsR4UJwXkRA%3D&spr=https&se=2020-08-12T00%3A01%3A08Z&rscl=x-e2eid-07ee394c-3dec4f3a-8c6ce55b-973c3849-session-07ee394c-3dec4f3a-8c6ce55b-973c3849&rscd=attachment%3B%20filename%3D%22example3-0.1.0-py3-none-any.whl%22" in the cache
  No cache entry available
  Starting new HTTPS connection (1): ajhvsblobprodcus363.blob.core.windows.net:443
  https://ajhvsblobprodcus363.blob.core.windows.net:443 "GET /b-0b71cd34f9c34172994c8b112c0efbc5/DBCE858FD979095ABFDDD6486916184A6D96809832BC940EDF646466AA55BB4D00.blob?sv=2019-02-02&sr=b&si=1&sig=CLumZkVQO3JWN3JIWgGeIvjOzsJ7dGughsR4UJwXkRA%3D&spr=https&se=2020-08-12T00%3A01%3A08Z&rscl=x-e2eid-07ee394c-3dec4f3a-8c6ce55b-973c3849-session-07ee394c-3dec4f3a-8c6ce55b-973c3849&rscd=attachment%3B%20filename%3D%22example3-0.1.0-py3-none-any.whl%22 HTTP/1.1" 200 1002
  Downloading https://pkgs.dev.azure.com/chrahunt/0be4dbf5-2e6a-4329-b457-6bfc3813daf9/_packaging/40f8bcae-0cca-47f2-afd9-ef0739ae93db/pypi/download/example3/0.1/example3-0.1.0-py3-none-any.whl (1.0 kB)
  Updating cache with response from "https://ajhvsblobprodcus363.blob.core.windows.net/b-0b71cd34f9c34172994c8b112c0efbc5/DBCE858FD979095ABFDDD6486916184A6D96809832BC940EDF646466AA55BB4D00.blob?sv=2019-02-02&sr=b&si=1&sig=CLumZkVQO3JWN3JIWgGeIvjOzsJ7dGughsR4UJwXkRA%3D&spr=https&se=2020-08-12T00%3A01%3A08Z&rscl=x-e2eid-07ee394c-3dec4f3a-8c6ce55b-973c3849-session-07ee394c-3dec4f3a-8c6ce55b-973c3849&rscd=attachment%3B%20filename%3D%22example3-0.1.0-py3-none-any.whl%22"
  Caching due to etag
  Added example3 from https://pkgs.dev.azure.com/chrahunt/0be4dbf5-2e6a-4329-b457-6bfc3813daf9/_packaging/40f8bcae-0cca-47f2-afd9-ef0739ae93db/pypi/download/example3/0.1/example3-0.1.0-py3-none-any.whl#sha256=ff49370bdba639695bf4d3e96b80b6b0e8a3c65011c1cef658a3885c7d913c58 to build tracker '/tmp/user/1000/pip-req-tracker-gvbk9pnb'
  Removed example3 from https://pkgs.dev.azure.com/chrahunt/0be4dbf5-2e6a-4329-b457-6bfc3813daf9/_packaging/40f8bcae-0cca-47f2-afd9-ef0739ae93db/pypi/download/example3/0.1/example3-0.1.0-py3-none-any.whl#sha256=ff49370bdba639695bf4d3e96b80b6b0e8a3c65011c1cef658a3885c7d913c58 from build tracker '/tmp/user/1000/pip-req-tracker-gvbk9pnb'
Installing collected packages: example3

Successfully installed example3-0.1.0
Removed build tracker: '/tmp/user/1000/pip-req-tracker-gvbk9pnb'

This doesn't show the behavior I'm concerned with (a redirect instead of a 401, as in #8103 (comment)), so it's a little reassuring.

I think including just the first commit is still less of a risk.

@pradyunsg
Member

> If we go that route we may want to confirm that there are no plans to offer private repos on PyPI anytime soon.

Hah, I can confirm this. PyPI runs on donated infrastructure, so trying to do something like this (which only makes sense if you have paying customers) would likely result in it losing access to that infrastructure.

@pradyunsg pradyunsg removed this from the 20.2.2 milestone Aug 11, 2020
@pradyunsg
Member

Alrighty! #8744 is merged. 20.2.2 coming up soon! :)

@ssbarnea
Contributor

ssbarnea commented Sep 8, 2020

Any chance to see this shipped? Current behavior creates deadlocks.

@uranusjr
Member

uranusjr commented Sep 9, 2020

Hmm, 20.2.3 was released without this change. Not sure if it’s intended or not @pradyunsg

@pradyunsg
Member

We broke out #8744, which was released in 20.2.2. Is there something else that I'm missing?

@pradyunsg pradyunsg dismissed their stale review September 24, 2020 16:14

Things have changed since!

@uranusjr uranusjr added this to the 20.3 milestone Oct 10, 2020
@uranusjr
Member

I added this to 20.3. Rebasing needed!

@pradyunsg pradyunsg removed this from the 20.3 milestone Oct 26, 2020
@pradyunsg pradyunsg added the C: keyring Related to pip's keyring integration label Nov 5, 2020
Contributor

@zooba zooba left a comment

Apologies for the slow reply, I'm trying to make more of an effort to dredge through my GH notifications.

I think this is an overall improvement, and don't have any concerns wrt the keyring integration or artifacts-keyring compat.

tests/unit/test_network_auth.py Outdated
@calumapplepie

What's the status on the PR? The Debian freeze is approaching, and ideally the pip version included will be fully functional.

@pradyunsg
Member

@hroncok Would you be able to make time before pip 21.0 to rebase this PR? It'd be great if we could include this in that.

@hroncok
Contributor Author

hroncok commented Dec 29, 2020

Rebased. But I'm not sure I am capable of incorporating all the feedback from @zooba.

@hroncok
Contributor Author

hroncok commented Dec 29, 2020

CI thinks there's something wrong with the news fragment, but I fail to understand it :(

@uranusjr
Member

It is looking for a .rst suffix. Towncrier works fine without it, but pip's check mandates it.

@hroncok
Contributor Author

hroncok commented Dec 29, 2020

Thanks! File renamed.

@pradyunsg
Member

I think this looks OK. If @zooba agrees, I think we can go ahead and merge this? :)

@zooba
Contributor

zooba commented Jan 4, 2021

My only remaining comment is that we will still prompt for credentials if keyring provides a username or password but not both, even though that may be legitimate. Switching the "or" in auth:262 to an "and" should fix that, but I can't be 100% sure that it's better.

Given that we don't reuse any partial information in the prompt, I'm inclined to say we should (log a message first and then) send the partial credential. If it's just a PAT, there's no good way for a user to reproduce that at the prompt.

@hroncok
Contributor Author

hroncok commented Jan 4, 2021

> Switching the "or" in auth:262 to an "and" should fix that, but I can't be 100% sure that it's better.

Happy to do that if that's the consensus.

@pradyunsg
Member

Go ahead!

If someone complains, I'll likely @ mention both of you for help. 🙃

@uranusjr
Member

Let’s do this.

@uranusjr uranusjr merged commit 62af956 into pypa:master Feb 18, 2021
@hroncok hroncok deleted the keyring_madness branch February 18, 2021 17:04
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 2, 2021
Labels
C: keyring Related to pip's keyring integration
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pip + twine installed: pip attempts to continuously create and use a "keyring"
8 participants