Design for authentication with image registries #11

Closed
2 of 5 tasks
squaremo opened this issue Jul 28, 2020 · 8 comments
Labels
blocked/requires-rfc Requires a design proposal enhancement New feature or request

Comments

@squaremo
Member

squaremo commented Jul 28, 2020

Must take into account these scenarios:

  • imagePullSecrets -- these still get used, either connected to the service account, or a workload
  • ECR permissions which can be obtained from the environment (see flux v1's ECR adapter, but beware using the metadata service which needs other permissions)
  • GCR permissions, which I'll need to look up
  • Azure permissions -- have to check back, since those have changed (used to be possible to mount a docker config from the host, I think)
  • Possible fallback: refer to credentials mounted into the filesystem (but: not very secure, since everyone will be able to read them)

In the case of ECR and GCR, the permissions are connected to the account running the controller, so it is not possible to have multi-tenancy (ImageRepository objects that have different access to ECR, say) without some extra machinery. Supporting only single tenancy is OK for now.
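For the imagePullSecrets scenario, the sketch below shows roughly what resolving credentials from such a secret involves. It assumes the secret is of type `kubernetes.io/dockerconfigjson` and that the registry host appears as an exact key under `auths`; the function name `credentials_for_host` is made up for illustration and is not taken from any Flux code.

```python
import base64
import json


def credentials_for_host(secret_data, host):
    """Return (username, password) for host from a Secret's data mapping, or None.

    secret_data is the Secret's `data` field, where values are base64-encoded.
    Assumption: host matches a key under "auths" exactly.
    """
    raw = base64.b64decode(secret_data[".dockerconfigjson"])
    auths = json.loads(raw).get("auths", {})
    entry = auths.get(host)
    if entry is None:
        return None
    if "username" in entry and "password" in entry:
        return entry["username"], entry["password"]
    # Otherwise use the combined "auth" field, which is base64("user:password").
    if "auth" in entry:
        decoded = base64.b64decode(entry["auth"]).decode("utf-8")
        user, _, password = decoded.partition(":")
        return user, password
    return None
```

Real docker config files sometimes key entries by URL rather than bare host, so a production lookup would likely need to normalise keys before matching.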

@squaremo squaremo added enhancement New feature or request blocked/requires-rfc Requires a design proposal labels Jul 28, 2020
@squaremo
Member Author

NB imagePullSecrets for workloads will not in general be in the same namespace or even the same cluster as the ImageRepository objects.

@raviranjithkumar

When using the metadata service, we need to be aware of which metadata service versions are supported (IMDSv1 and IMDSv2). IMDSv1 does not meet security compliance requirements, so we need to use IMDSv2. I have already raised this issue in Flux v1 and am waiting for a solution:

  1. Flux is not able to get the EC2Metadata to detect the AWS region when switching the EKS EC2-instances Metadata-Service-Version from IMDSv1 to IMDSv2. flux#3384
  2. helm-operator is throwing the errors when switching the EKS EC2-instances Metadata-Service-Version from IMDSv1 to IMDSv2.  helm-operator#574
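To illustrate the difference being discussed: IMDSv2 is session-oriented, so a client first obtains a token with a `PUT` request and then sends that token on every metadata request. The following is a minimal sketch of that flow for looking up the region, not the code Flux uses; the helper names are hypothetical, and `fetch_region` only works when run on an EC2 instance.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254/latest"


def token_request(ttl_seconds=21600):
    """Build the PUT request that asks IMDSv2 for a session token."""
    return urllib.request.Request(
        IMDS_BASE + "/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def metadata_request(path, token):
    """Build a GET request for a metadata path, carrying the session token."""
    return urllib.request.Request(
        IMDS_BASE + "/meta-data/" + path,
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_region(timeout=2):
    """Fetch the instance's region via IMDSv2 (requires running on EC2)."""
    with urllib.request.urlopen(token_request(), timeout=timeout) as resp:
        token = resp.read().decode("utf-8")
    req = metadata_request("placement/region", token)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

A client that only issues the plain `GET` (the IMDSv1 style) is expected to be rejected once an instance is set to require IMDSv2. Pods may additionally need the instance's PUT response hop limit raised above its default of 1 before the token request succeeds.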

@masih

masih commented Dec 28, 2020

> so it is not possible to have multi-tenancy (ImageRepository objects that have different access to ECR, say) without some extra machinery.

I wonder if it is possible to take an approach similar to SOPS when role is specified for KMS encrypted secrets. When role is specified, SOPS will attempt to assume the role prior to decryption.

For ECR I wonder if the ImageRepository CRD can be extended to take an assumable role ARN as a substitute for the secret ref, which the controller would then assume prior to fetching image metadata, etc. This way, multi-tenancy is respected by only allowing the image-reflector-controller to run with a service account that can assume other roles, instead of giving it broader access to the ECRs of all tenants.

I hope that makes sense.
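A rough sketch of that idea, with the AWS clients injected so the flow is visible without any SDK installed. The function names and the `role_arn` input are assumptions for illustration; the method names and response shapes follow the STS `AssumeRole` and ECR `GetAuthorizationToken` APIs as exposed by boto3.

```python
import base64


def decode_ecr_token(token_b64):
    """Split an ECR authorization token, base64("AWS:<password>"), into its parts."""
    user, _, password = base64.b64decode(token_b64).decode("utf-8").partition(":")
    return user, password


def registry_credentials_for_role(sts, make_ecr, role_arn, session_name="image-scan"):
    """Assume role_arn, then use the temporary credentials to get an ECR login.

    sts:      object with assume_role(RoleArn=..., RoleSessionName=...)
    make_ecr: callable that takes the temporary credentials dict and returns
              an object with get_authorization_token()
    """
    creds = sts.assume_role(RoleArn=role_arn, RoleSessionName=session_name)["Credentials"]
    ecr = make_ecr(creds)
    data = ecr.get_authorization_token()["authorizationData"][0]
    return decode_ecr_token(data["authorizationToken"])


# With boto3 available, the two collaborators could be wired up roughly as:
#   sts = boto3.client("sts")
#   make_ecr = lambda c: boto3.client(
#       "ecr",
#       aws_access_key_id=c["AccessKeyId"],
#       aws_secret_access_key=c["SecretAccessKey"],
#       aws_session_token=c["SessionToken"],
#   )
```

Under this design the controller's own identity needs only `sts:AssumeRole` on the tenant roles, and each tenant role carries the ECR read permissions for that tenant's repositories.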

@squaremo
Member Author

squaremo commented Mar 1, 2021

There are solutions for ECR, GCR and Azure given in https://toolkit.fluxcd.io/guides/image-update/#imagerepository-cloud-providers-authentication. These rely on using CronJobs with appropriate credentials (and platform role assignments) to populate an image pull secret.

This approach is better than trying to build authentication into the image reflector controller, because it doesn't need additional, complicated machinery for multi-tenancy, and it generalises to other cloud platforms without needing further work here. The downside is that it's a little more work on the part of the end user -- but the bulk of setting it up is figuring out the platform configuration (IAM roles and whatnot), which you would have to do in either case.
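As an illustration of what such a job ultimately has to produce, here is a minimal sketch that builds an image pull secret manifest from a registry login. It is not the script from the guide; the function name and arguments are hypothetical, while the secret type and the `.dockerconfigjson` key are the standard Kubernetes ones.

```python
import base64
import json


def image_pull_secret(name, namespace, host, username, password):
    """Build a kubernetes.io/dockerconfigjson Secret manifest as a dict."""
    auth = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    docker_config = {
        "auths": {host: {"username": username, "password": password, "auth": auth}}
    }
    # Values under a Secret's `data` field must be base64-encoded.
    encoded = base64.b64encode(json.dumps(docker_config).encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "kubernetes.io/dockerconfigjson",
        "data": {".dockerconfigjson": encoded},
    }
```

The job would apply this manifest on a schedule shorter than the lifetime of the platform's tokens, and the ImageRepository would refer to the secret by name.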

Can anyone here comment on whether they've been able to set things up using the guide linked above?

@181192

181192 commented Mar 16, 2021

@squaremo Could we please re-implement the authentication mechanism for ACR as done in Flux v1, using the Service Principal credentials available at the host path /etc/kubernetes/azure.json on AKS clusters? This is the most common way to use the cluster service principal to authenticate with ACR.
I know there have been some issues open in Kubernetes to remove the option and use the default docker credentials location, but that work has been stale since January 2018 (kubernetes/kubernetes#58034). There are several blockers, such as the --azure-container-registry-config flag, which is dynamic and will require all Azure container products to enable Managed Service Identity (MSI) before the flag can be removed from kubelet.
AAD Pod-Identity is in Preview and not enabled by default when creating an AKS cluster, so it requires more work from the end user to enable and install.
So it seems that /etc/kubernetes/azure.json will not go away anytime soon, and if the image-reflector-controller implemented this, it would benefit the end user through easier installation and usage (and migration from Flux v1).
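For reference, a minimal sketch of the kind of lookup being requested. It assumes the file contains `aadClientId` and `aadClientSecret` fields and that only hosts ending in `.azurecr.io` should receive them; the function name is hypothetical and this is not a copy of the Flux v1 code.

```python
import json

# Assumption: only the public-cloud ACR suffix is handled in this sketch.
ACR_SUFFIXES = (".azurecr.io",)


def acr_credentials(azure_json_text, host):
    """Return (client_id, client_secret) for an ACR host, or None.

    azure_json_text is the content of /etc/kubernetes/azure.json, which the
    caller reads from the mounted host path.
    """
    if not host.endswith(ACR_SUFFIXES):
        return None
    cfg = json.loads(azure_json_text)
    client_id = cfg.get("aadClientId")
    client_secret = cfg.get("aadClientSecret")
    if not client_id or not client_secret:
        return None
    return client_id, client_secret
```

This relies on mounting a host path into the controller pod, and on clusters that use a managed identity instead of a service principal these fields are not expected to hold usable credentials.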

@kingdonb
Member

kingdonb commented Oct 6, 2021

We came across this issue in today's Bug Scrub, and the discussion brought up this library:

https://github.com/google/go-containerregistry

Does this library solve part of the problem? I understand it's in use already, and Google has added support for authenticating to its own container registry through that library (but other vendors may not have added their own support).

@squaremo
Member Author

squaremo commented Oct 6, 2021

> Does github.com/google/go-containerregistry solve part of the problem?

There is some code in there for getting credentials by GCP-specific means, so yes it helps a bit. The main part of designing integration for any platform is figuring out which means is the most appropriate to running as a controller, limiting dependencies, and using ambient authorisation rather than requiring secrets.
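One way to picture how platform-specific and secret-based credentials could coexist is a chain of providers tried in order, loosely modelled on the keychain idea in go-containerregistry's `authn` package. The sketch below is only an illustration in Python; `static_provider` and `resolve_credentials` are hypothetical names, not part of any library.

```python
def static_provider(credentials_by_host):
    """Provider backed by explicit credentials, e.g. read from a secret."""
    def lookup(host):
        return credentials_by_host.get(host)
    return lookup


def resolve_credentials(host, providers):
    """Return the first credentials any provider offers for host.

    Each provider is a callable taking a host and returning a
    (username, password) tuple, or None if it has nothing for that host.
    Returning None overall means falling back to anonymous access.
    """
    for provider in providers:
        creds = provider(host)
        if creds is not None:
            return creds
    return None
```

An ambient provider for a given platform would sit in the same list, answering only for that platform's registry hosts and obtaining its token from the environment rather than from a secret.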

@stefanprodan
Member

Both static and IAM role authentication for ACR, GCR and ECR have been implemented and documented.

6 participants