Please add support for credentials via environment variables #4789
Comments
Adding the word authentication so that this shows up in my searches. :) |
Note that you can pass the |
Also wanted to point out that you can pass |
Thank you for adding the expansion of environment variables in requirement files. However, I was wondering if environment variables could be implemented for Pip similar to Twine? With Twine (especially for CI) you just need to set TWINE_USERNAME and TWINE_PASSWORD as environment variables in the CI. Thus, there's no need to add the username and password to the repository URLs. Just curious. |
There's keyring support that's integrated and up for the next release -- #5948. |
I'm going to submit a PR to accept taking credentials from env variables, to make it much simpler to integrate safely with CI servers. Credentials should not be specified as command line options in any way as they may easily be leaked in logs or seen in process listings. |
Hello, still open... The PR was closed, even if PIP_PASSWORD was a nice idea. |
Feel free to propose a new one if you think it is a good idea. |
Sorry, I don't have the knowledge to implement this PR. Is it possible to resubmit that PR? |
Wanted to chime in and voice support for this. It looks like the PR was closed, but maybe is still a viable option. What would be the process for reviving it? Could someone else just open a new copy of the existing PR, or should we wait and see if @lhupfeldt can revive it? |
You are welcome to reuse/reopen my PR. I just gave up originally because
there was some resistance from core developers.
|
Hello, I think that the PR was fantastic and it works similar to twine. I do not like at all putting my credentials on the file system. |
You'll hit the same resistance again, I suspect. This is precisely the sort of issue that keyring support was intended to avoid - pip needing to implement multiple mechanisms for handling authentication, each for a particular (entirely valid) use case. If keyring support (or keyring itself) isn't sufficient for this use case, we should be improving them, not implementing an alternative mechanism. |
I cannot find any article about how to use keyring for "Python3 pip + virtualenv". Could you point me to a link? |
This is a reasonable request, but it also doesn't seem like keyring is designed for the problem of CI/CD or automated builds or however you want to think of the issue that env var auth is trying to solve. The fact that it needs a "headless linux" section seems like an indicator of this. Maybe I'm wrong. Either way, this comment from the PR seems to capture my situation nicely:
That sounds like so much effort that I would rather munge my If specifying explicit environment variables that |
Have you raised that with the keyring project? Honestly, that's all we're suggesting here, and we're getting a lot of pushback. Pip added support for keyring, in a good-faith attempt to handle the requests we were getting for a mechanism to store credentials outside of pip¹. We were led to understand that the submitted PR was a good solution to this issue, and we took it on trust that keyring did the job we'd been told that it did. To date, no-one has demonstrated that keyring isn't up to the job. Certainly, we've had people say that keyring doesn't support their use case. But nor does pip - someone will need to write new code, and the idea of adding keyring support was to delegate handling this sort of use case to that project. Until we see a definite statement from them that they aren't interested in supporting the use cases being described here, there's not much pip should be doing (IMO). If keyring come back and say they don't want to support this use case, then pip needs to look at what to do - and in my view, I'd want to reconsider whether we should be looking for an alternative to keyring that does support our users - I still don't want pip to get into the business of credential management².
You can use environment variables in requirements files. Have you tried using that feature to see if it handles your use case? ¹ The approach of using keyring had the additional benefit of not requiring the pip developers to get into questions around what is a secure way of handling credentials - we could leave that to the experts maintaining the keyring project. |
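The requirements-file expansion mentioned above can be pictured with a short Python sketch. It assumes the documented rule that only the `${NAME}` form with an uppercase name is treated as a variable; the function name, the regular expression, and the example host `pypi.example.com` are illustrative and are not taken from this thread or from pip's source.

```python
import os
import re

# Assumption: only ${NAME} with uppercase letters, digits and underscores
# counts as a variable reference; $NAME and ${lower} are left alone.
ENV_VAR_RE = re.compile(r"\$\{([A-Z0-9_]+)\}")


def expand_env_variables(line, environ=None):
    """Replace ${NAME} placeholders in one requirements-file line."""
    environ = os.environ if environ is None else environ

    def substitute(match):
        value = environ.get(match.group(1))
        # Keep the placeholder when the variable is unset or empty.
        return value if value else match.group(0)

    return ENV_VAR_RE.sub(substitute, line)


line = "--index-url https://${INDEX_USER}:${INDEX_PASS}@pypi.example.com/simple"
print(expand_env_variables(line, {"INDEX_USER": "ci", "INDEX_PASS": "s3cret"}))
```

With this approach the requirements file checked into the repository contains only the placeholders, and the CI system supplies the values at install time.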
Thanks for your reply. I clearly only have a tiny fragment of the context here. I'll try to find an appropriate place to ask about keyring in CI. Also, I did not know that requirements files would expand environment variables. I had tried it within I appreciate the perspective of leaving credential management to experts and your efforts to keep things going in the right direction. |
@pfmoore chiming in here a bit and don't want to speak too much for others, but one of the cases mentioned elsewhere is that you end up in a bit of a catch-22 situation that can't be resolved without some hacking, unless pip itself supports this. If you're in a CI/CD environment where you only have access to a private, password protected PyPI repo, then you are in an unfortunate situation where you can't even install keyring to begin authenticating to that repo, even if it does support it. There are perhaps other ways to get keyring installed, but they end up being a bit messy. Maybe an alternative is to ship keyring with pip or something along those lines, but I'm not sure of the feasibility or impact of that. In short though, the concern is that if pip doesn't support that auth, and the only way to support auth is to install an external package, then we end up stuck in the cases where the external package requires auth. |
I think the "pushback" is because environment variables are a normal way of doing this and keyring is not. Several replies here have outlined the specific issues with adding this as a dependency. Given that, rather than requiring a "definite statement" from the keyring project (and who is going to obtain such a statement?), it would make more sense to explain how keyring is a good solution particularly for CI/CD, as it is basically an exception we would make from the norm in order to use this tool. (i.e. this is the only similar tool that would use keyring...)
Environment variables don't put you in the business of credential management. Something else is responsible for setting the environment variables; that's the point of environment variables. Realistically the alternative here is not going to be keyring. An alternative is using PIP_INDEX_URL (an environment variable) with basic auth embedded in it. Which means you already take credentials in an environment variable, so the concern there is odd. And you already fixed the logging of the credentials in the URL several versions ago I think. The problem with this approach is simply that the entire URL now becomes a secret value, rather than just the credentials. I think this request is simply to split the URL from the credentials so that the URL could be hardcoded in a checkin without the credentials. It sounds like environment expansion in requirements.txt is potentially superior. |
You're right - bootstrapping keyring is an issue. But it was known (and acknowledged) when the feature was added, so all I can really say is that the original implementation saw that as an acceptable limitation. I don't personally have a good answer here. Vendoring keyring is unfortunately not possible, because keyring depends on C extensions, and pip cannot vendor C extensions (because pip needs to be platform-neutral - there's a lot more background here, but that's the reality and it's not going to change, unfortunately).
Someone who needs this to work, surely? You seem to be assuming that it's up to the pip developers. Sorry, but it really isn't.
OK, I've no problem if you think my reluctance is odd. Feel free to take it as simply meaning that I won't do anything about this myself, if that helps.
It does indeed sound like that is helpful for people in this situation. Which makes me wonder why no-one found that information. Is the section here in the documentation unclear? Is it hard to find? It may be that people have wasted time debating keyring, when if they'd found the existing feature they could have solved their problem much more easily - so if there's any improvement to the documentation that would have helped, it would be great if you could offer a suggestion (ideally as a PR, but even just an issue describing what you'd like to have seen would be good). |
I guess that most people just put 'package>=version' (or == ...) in requirements.txt. Maybe some reference to requirements.txt from the existing documentation about how to authenticate, and an explanation of what to put in requirements.txt to make pip read credentials from there would help? But I hope you are not referring to embedding credentials in index URLs or even adding index URLs in requirements.txt? Adding URLs in requirements.txt for me would just make the file unreadable with even more substitutions to be made. We have production and not production pypi proxies, and I think other people will have the same. I understand that you are trying to limit the maintenance burden of pip, but we are talking 10 lines of code including logging (excluding the test) (and that would be maybe 7 if the check that both password and username is set was removed, as suggested) and 5 lines of documentation. And this feature seems to be in popular demand. |
@pfmoore you've said (emphasis mine):
And I think this is the misunderstanding in this discussion. We aren't asking for a mechanism to store credentials outside pip. We already have that one (e.g. the credential store in our CI server), and our mechanism, whatever it is, provides the credentials in the form of environment variables (which is very common). However, this mechanism is not keyring. The problem we face is then: how do we pass these credentials, that are already in environment variables, to pip? The option to make pip to use keyring directly is very nice and solves a valid, but different problem, which is how to take credentials from keyring and pass them to pip. |
This probably isn't to everyone's standard, but I do this to store credentials as environment variables. |
I'm overall ambivalent on this -- this discussion has a weird mix of misrepresenting what pip's keyring integration does and never getting an update on what the underlying design constraints are -- it seems like a reasonable request but all the proposed solutions so far seem infeasible to me. This issue never got a "proper" update on the current credential management story for pip after the keyring support got added, so... I guess I just posted that above. I think the next steps here are:
It also breaks the fundamental assumption -- keyring is a Python package with a programmatic API that allows users to import things from it to write a third-party backend. Vendoring it breaks that, eliminating the primary benefit of it -- externally maintained third-party backends for interacting with different credential stores. Also... https://pip.pypa.io/en/stable/topics/authentication/ is a thing now, and I'll add a follow up issue to add a cross-reference to https://pip.pypa.io/en/stable/reference/requirements-file-format/#using-environment-variables there. And, finally, please be mindful that pip is primarily maintained by volunteers. |
Ok, I will give it a try.
The credentials will be set on the current running (shell) environment. Hence normal environment variables. For example, you can fetch them with
Usually two, the default one and a private one. |
The part I still fail to understand is why a separate environment variable is needed in the first place, since it is already possible to specify auth in
PIP_INDEX_URL=$(pip config get global.index-url | sed "s/\/\//\0${PIP_USER}:${PIP_PASS}@/")
Note that this is more or less what we would do if the support is built into pip, there's nothing hacky about this; or rather, there's nothing magical about having this implemented in pip instead of an ad-hoc Bash one-liner. So I think the bottom line is that we need more concrete, objective reasons to explain exactly why this is a needed feature, rather than subjective "I think pip should pick it up". |
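For comparison, the same URL rewriting can be sketched in Python. This is only an illustration of the idea in the one-liner above, not something pip provides: the function name `with_credentials` and the example URL are hypothetical. The one practical difference is that the username and password are percent-encoded, so characters such as `@` or `/` in a password do not make the resulting URL ambiguous.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def with_credentials(index_url, username, password):
    """Return index_url with user:password@ inserted before the host."""
    parts = urlsplit(index_url)
    # Drop any userinfo that is already present in the URL.
    host = parts.netloc.rpartition("@")[2]
    # Percent-encode both values so reserved characters survive.
    userinfo = f"{quote(username, safe='')}:{quote(password, safe='')}"
    return urlunsplit(parts._replace(netloc=f"{userinfo}@{host}"))


print(with_credentials("https://pypi.example.com/simple", "ci", "p@ss/word"))
```

The result is still a URL that is secret as a whole, which is the drawback raised earlier in the thread.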
And you want pip to use the credentials to access both of them? |
It's got the same design constraints as pip's needs here, so... honestly, yea... that's quite possibly one of the better outcomes here -- it'll likely even work transparently with twine / flit / poetry etc if you do this right. :) |
That could be a solution to my use case, if I can work out the keyring bootstrapping problem for my CI environment. |
@pradyunsg considering that an environment variables implementation for either pip or keyring would have the same issues regarding naming of environment variables, would this be sufficient to support multiple index urls? Is there any objection to using the
PIP_INDEX_AUTH_URL_0=https://index0.example.com
PIP_INDEX_AUTH_USERNAME_0=myusername
PIP_INDEX_AUTH_PASSWORD_0=mypassword
PIP_INDEX_AUTH_URL_1=https://index1.example.com
PIP_INDEX_AUTH_USERNAME_1=myusername1
PIP_INDEX_AUTH_PASSWORD_1=mypassword1
or this might be a better way
PIP_INDEX_AUTH_0_URL=https://index0.example.com
PIP_INDEX_AUTH_0_USERNAME=myusername
PIP_INDEX_AUTH_0_PASSWORD=mypassword
PIP_INDEX_AUTH_1_URL=https://index1.example.com
PIP_INDEX_AUTH_1_USERNAME=myusername1
PIP_INDEX_AUTH_1_PASSWORD=mypassword1 |
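A rough Python sketch of how the second naming scheme could be read follows. The `PIP_INDEX_AUTH_<n>_*` variables are only a proposal from this thread and are not implemented by pip; the function names are hypothetical. Credentials are keyed by scheme and host, so they are returned only for the index they were configured for.

```python
import re
from urllib.parse import urlsplit

# Hypothetical naming scheme proposed in the thread; pip does not read it.
_URL_KEY = re.compile(r"^PIP_INDEX_AUTH_(\d+)_URL$")


def load_index_credentials(environ):
    """Map (scheme, host) to (username, password) from numbered variables."""
    credentials = {}
    for key, url in environ.items():
        match = _URL_KEY.match(key)
        if not match:
            continue
        n = match.group(1)
        username = environ.get(f"PIP_INDEX_AUTH_{n}_USERNAME")
        password = environ.get(f"PIP_INDEX_AUTH_{n}_PASSWORD")
        # Skip incomplete groups instead of guessing the missing half.
        if username is None or password is None:
            continue
        parts = urlsplit(url)
        credentials[(parts.scheme, parts.netloc)] = (username, password)
    return credentials


def credentials_for(request_url, credentials):
    """Return credentials only when scheme and host match a configured index."""
    parts = urlsplit(request_url)
    return credentials.get((parts.scheme, parts.netloc))
```

A request to an index with no matching entry, such as a public one, gets `None`, so nothing is sent to it.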
Hmm, looking at the pip docs on keyring support, I can't see where to specify a username when installing a package using keyring auth. Keyring allows me to set multiple username/password credentials for a single |
It is rare to see this many people take interest in an issue :) It was mentioned that pip is a volunteer effort. I think everybody understands this, and I did submit a PR for this, complete with tests and documentation. I think neglecting the issue of pip requiring installation of a package is really bad (I'm aware of the workaround). In a large company CI setup it quickly becomes a mess if the CI installation also has to take care of installing the build tools for individual and very diverse projects which use a lot of different technologies. At my company, individual projects do not have OS login to the CI servers, and the servers do not have internet access, so all package/tool installation is done by the CI server and goes through our local repositories. The package installer should not depend on a package. I think the issue of supporting multiple indexes can be seen as an extension, so maybe we could start by just documenting that multiple indexes are not (currently) supported through env variables. If you think supporting this is required before accepting a PR, then we can add that. I have no need for it, and I think most people won't. If you have a private protected repository you can probably proxy all indexes through that. Please take a look at @absassi's comment which explains very nicely why supporting env variables is a good idea, and not a competitor to keyring. |
For those suggesting embedding credentials in URLs, please read e.g. this: https://neilmadden.blog/2019/01/16/can-you-ever-safely-include-credentials-in-a-url/ |
Please take a look at my comment which quotes that, and mentions why the proposed PR wasn't sufficient either. :) See also #6723 (comment) |
I'm fine either way. This is going to have to be distributed separately from pip, so it should be reasonable to pick something generic; but either way, it shouldn't be that difficult to make changes / allow making changes to that prefix. :) |
Sorry @pradyunsg, which of your comments are you referring to? |
I see that @reixd directly accesses a public repo (pypi.org?) and a private one. In that case my implementation would leak the credentials to pypi.org (as documented, but who reads the documentation :) ). A solution which also checks the index url would definitely be better in that case. This exact scenario could be handled by always attempting access without credentials first, but of course this would not handle multiple protected repositories requiring different credentials. I'm not sure if this is a real issue though. An index checking solution should allow patterns like |
But the package will be pip-specific, so it makes sense to use |
@lhupfeldt would my example for env vars handle all the cases for multiple public and private index urls if it were implemented directly in pip? Public urls wouldn't have any
PIP_INDEX_AUTH_URL_0=https://index0.example.com
PIP_INDEX_AUTH_USERNAME_0=myusername
PIP_INDEX_AUTH_PASSWORD_0=mypassword
PIP_INDEX_AUTH_URL_1=https://index1.example.com
PIP_INDEX_AUTH_USERNAME_1=myusername1
PIP_INDEX_AUTH_PASSWORD_1=mypassword1 |
@wwuck I think your solution with matching index url with |
Hmmm, so after reading #10389 I guess I should hold off on trying to implement a keyring backend for environment variables. |
I don't think that credential helper API would happen anytime soon to solve your problem. If you start developing a keyring backend now, you'd probably be like version 3.0 when that API is released. |
Ok, so can anyone help on my keyring questions? The pip docs don't mention how to specify the username for the keyring credentials. Where is this username specified? And what would happen if I enter multiple username credentials into keyring for a single pip index url? Will it pick one at random? Will it try them all until it gets a successful auth attempt?
keyring set https://private-pypi-index.example.com myusername1
keyring set https://private-pypi-index.example.com myusername2 |
I don't see any info about that in the docs... am I missing something? https://pip.pypa.io/en/stable/search/?q=PIP_INDEX_URL&check_keywords=yes&area=default Edit: ah, they're named based on command line options, ok https://pip.pypa.io/en/stable/topics/configuration/#environment-variables |
Are there any updates on this? I have
(Let's assume, it is possible to use variables PIP_USERNAME and PIP_PASSWORD).
And the above example should work without hassle. But what I have to do without variables PIP_USERNAME and PIP_PASSWORD is to change
Or another workaround for multiple pipeline steps is to create
It does not look nice. PIP should have the way to pass a user and password via environment variables like Twine has. |
It's worth noting that for Docker image builds, |
pypi/warehouse#10030 has finally been fixed, so the keyring plugin is finally uploaded to pypi. |
I think I have worked out a way to use pip in a secure way when building container images and environment variables, which might be helpful for others using Docker in a CI/CD context. It's a lot of code, but I think it's at least secure. My specific requirements are:
The following assumes the environment variables Firstly, to build the below image, the command would be:
docker build --secret id=INDEX_USER --secret id=INDEX_PASS .
The Dockerfile below is not particularly readable, but hopefully it makes some sense. Very briefly, this basically uses Docker build secrets, the netrc functionality and a tmpfs mount to install packages from our internal authenticated feed within our build systems. All without the credentials being leaked in HTTP logs or hitting physical storage.
FROM [...]
ARG INDEX_SERVER=[...]
ARG INDEX_URL=https://$INDEX_SERVER/path
RUN pip config --global set global.index-url $INDEX_URL && \
# Make pip use standard certificate store (on at least Ubuntu, Debian & Alpine)
pip config --global set global.cert /etc/ssl/certs/ca-certificates.crt && \
# Avoid not found errors given no Internet access within our CI/CD system
pip config --global set global.disable-pip-version-check true
# Copy build files into a working folder
WORKDIR /build
COPY requirements.txt ./
RUN --mount=type=secret,id=INDEX_USER \
--mount=type=secret,id=INDEX_PASS \
# Create a tmpfs (memory-based) mount for the netrc authentication file valid for this layer only
--mount=type=tmpfs,target=/run/auth \
# The mounted secrets are held in tmpfs (memory-based) files that we need to pull out
INDEX_USER=$(cat /run/secrets/INDEX_USER) \
INDEX_PASS=$(cat /run/secrets/INDEX_PASS) \
# Set the authentication credentials securely (using index-url is insecure)
NETRC_CONTENT="machine $INDEX_SERVER login $INDEX_USER password $INDEX_PASS" \
sh -c "echo \$NETRC_CONTENT" > /run/auth/.netrc && \
# Variable to tell pip (actually Python's netrc module) to pick up the memory-based .netrc file location
NETRC=/run/auth/.netrc \
# Install required packages
pip install --no-warn-script-location --no-cache-dir -r requirements.txt && \
pip check
# Remainder of build
[...]
I'll add that this common use of environment variables by Docker seems a good case for supporting environment variables more generally to simplify the above, but at least there is a workable alternative for Linux containers. |
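The netrc step in the Dockerfile above can also be sketched in Python, which makes it easy to check that the generated file parses as intended. This is a minimal illustration: the helper name `write_netrc` and the host `pypi.example.com` are hypothetical, and the path is passed to the parser explicitly rather than through the `NETRC` variable that the Dockerfile relies on.

```python
import netrc
import os
import tempfile


def write_netrc(path, host, username, password):
    """Write a single-machine netrc file readable only by its owner."""
    # Create the file with mode 0600 so other users cannot read it.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(f"machine {host} login {username} password {password}\n")


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, ".netrc")
    write_netrc(path, "pypi.example.com", "ci", "s3cret")
    # authenticators() returns a (login, account, password) tuple.
    login, _account, password = netrc.netrc(path).authenticators("pypi.example.com")
    print(login, password)
```

Note that this simple writer does not quote its values, so a password containing whitespace would need extra handling before it could be stored this way.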
Description:
We're using pip in a CI/CD pipeline to install packages from a private repository protected by username/password. Currently there are two options to pass those credentials to pip, either encode them directly in the URL or create a pip.conf file. Both options are not very attractive. The first option would entail having those credentials hard coded in the source code, the second one would mean we'd have to generate this config file during the build process.
Most CI/CD build pipelines support some kind of "secret variables", which is a fancy word for environment variables that you can set in the CI/CD and that will be enabled in the build pipeline. This is usually the way to pass secrets.
It would be very helpful if pip would also support some mechanism to read secrets from environment variables.
See also: https://www.jfrog.com/confluence/display/RTF/PyPI+Repositories#PyPIRepositories-UsingCredentials for a realistic use case.