Dynamic Authentication Config #1689
Conversation
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: enj. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
/hold To prevent accidental merge.
Authentication in Kubernetes is quite flexible and unopinionated. There is no
requirement for the API server to understand where an identity originates from.
The API server supports a variety of command line flags that enable authentication
For my benefit, has this idea been discussed prior? What were the objections? It seems like webhook AuthN would have been a natural fit to start with an API instead of an API flag, but wasn't.
@liggitt and @mikedanese may be able to provide some history.
Looking at OIDC as an example, it was added in 2015 via kubernetes/kubernetes#10957
My guess is that it was probably just easier to start with in-tree functionality that was wired through a CLI flag.
Could this functionality be implemented entirely out of tree using a front-proxy configured by CRDs?
Meaningfully? No. By front-proxy I assume you mean an impersonating proxy (and not an aggregated API proxy). The goal here is to allow end users to configure authentication through a built-in Kubernetes API. I have some thoughts on the proxy approach compared with a built-in API:
While I am all for implementations outside of core, this functionality is akin to RBAC and admission webhooks. RBAC is built-in and fully configurable via the Kube API. This has led to its adoption and common usage. It is also a relatively limited API. Just as RBAC and admission webhooks would not be meaningful if built out of core, the same is true for this proposal. If one could not rely on RBAC and admission webhooks being consistently present across all infrastructure providers, they would be useless.
cc @weinong for review
It's mentioned with some subtlety in the "Motivation" section, but this capability would unlock use of many hosted Kubernetes offerings by organizations that need their own authentication system. As it stands today, operators must choose between owning the full cluster provisioning and operation in order to retain control of the API server command-line flags, or giving up on their custom authentication system to buy into a hosted offering. Azure Kubernetes Engine is one counterexample, where it looks at least possible to jam in the right API server command-line flags, perhaps to the surprise of the maintainers. I had heard in the past (from @liggitt, if I recall correctly) that we couldn't offer a subsystem like this, because it's not possible to govern who can configure it, because authentication is a prerequisite to determine who is even asking to configure it. Does the design mandate that any user with "cluster admin" permission can manipulate these AuthenticationConfig objects?
I will try to distill this down into a sentence and add it as an explicit goal.
There is an inherent assumption that you have an existing identity that has cluster-admin level access.
Any user with that level of access can manipulate these AuthenticationConfig objects.
Just as there is no meaningful way for the RBAC API to guard against a cluster-admin, the same applies here.
Thank you for clarifying. I now see that @micahhausler had asked a similar question earlier (#1689 (comment)). I missed that yesterday, and should have read more carefully to avoid repeating the same point. I hope the Webhook support makes it in before too long. (It's commented out for now.)
(fwiw, I'm mostly ok with this - with clarifications around failure recovery)
High level: "what would CIS say?" If we expect the CIS/other security audit recommendation to be "ensure your apiserver has the --disable-dynamic-auth flag", then there isn't much point going down this path. (I don't know the answer to this question.)
To prevent confusion with identities that are controlled by Kubernetes, the
`system:` prefix will be disallowed in the username and groups contained in the
`user.Info` object. A disallowed username will cause authentication to fail.
All disallowed groups will be filtered out.
So I can't use this to assign users to the system:masters group (i.e. cluster-admin)?
(sad face)
The assumption is that you would create your own my-admins group and assign it cluster-admin via RBAC. There is no need to abuse system: identities.
You can already achieve (abuse) this via the CertificateSigningRequest API and there is no such restriction built into it. Why would the Dynamic Authentication Config API be different in this regard?
You can already achieve (abuse) this via the CertificateSigningRequest API and there is no such restriction built into it. Why would the Dynamic Authentication Config API be different in this regard?
The latest version of the CSR API actively tries to prevent this kind of abuse.
In general, I believe it was a mistake for us to not tightly control what can assert a system: identity. Some things need to be reserved for Kube.
You can already achieve (abuse) this via the CertificateSigningRequest API and there is no such restriction built into it.
The CSR signer, under the control of the cluster operator, can choose not to sign a requested client cert.
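For illustration, a minimal sketch of the filtering behavior described in the quoted text above; the helper and package names are hypothetical and not part of the proposal:

```go
// Hypothetical helper (not from the KEP) showing the quoted rule:
// a reserved username fails authentication, reserved groups are dropped.
package dynamicauth

import (
	"fmt"
	"strings"
)

const reservedPrefix = "system:"

func filterUserInfo(username string, groups []string) (string, []string, error) {
	if strings.HasPrefix(username, reservedPrefix) {
		return "", nil, fmt.Errorf("username %q uses the reserved %q prefix", username, reservedPrefix)
	}
	kept := make([]string, 0, len(groups))
	for _, g := range groups {
		if !strings.HasPrefix(g, reservedPrefix) {
			kept = append(kept, g) // disallowed groups are silently filtered out
		}
	}
	return username, kept, nil
}
```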
// caBundle is a PEM encoded CA bundle used for client auth (x509.ExtKeyUsageClientAuth).
// +listType=atomic
// Required
CABundle []byte `json:"caBundle" protobuf:"bytes,1,opt,name=caBundle"`
Please, please avoid the mistake of the aggregated apiserver API and make this caBundle a pointer to a separate Secret (or ConfigMap) and not inline. The rest of the AuthenticationConfig is likely to be a static (eg: helm) manifest, but the caBundle contents will be unique for every install. This is way easier to generate (and rotate) if it is a separate object (eg: you could just use cert-manager to maintain a self-signed CA).
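For illustration only, a hypothetical shape of the reference-based alternative being requested here; none of these field or type names appear in the KEP:

```go
// Hypothetical alternative to an inline caBundle: reference a Secret that
// holds the PEM bundle so it can be rotated (e.g. by cert-manager) without
// touching the otherwise-static AuthenticationConfig manifest.
package dynamicauth

type X509Config struct {
	// Inline bundle, as currently proposed (mutually exclusive with the reference).
	// +optional
	CABundle []byte `json:"caBundle,omitempty"`

	// Reference to a Secret key containing the PEM bundle.
	// +optional
	CABundleSecretRef *SecretKeyReference `json:"caBundleSecretRef,omitempty"`
}

type SecretKeyReference struct {
	Namespace string `json:"namespace"`
	Name      string `json:"name"`
	Key       string `json:"key"` // e.g. "ca.crt"
}
```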
I agree: This turned out to be the most annoying part of deploying an admission Webhook.
I vaguely recall hearing the reason why the CA certificate bundle was defined statically, but I can't find the discussion now in Slack. I think it had something to do with not wanting the API server to have to watch the Secrets to reconfigure its Webhook client, though of course the API server already has to watch the Webhook configuration itself.
I do not plan to diverge from the approach used by existing APIs here. If they expand to allow references, I will mirror that here. Overall I am not too worried about the x509 case. It is not really an extension point I expect to see used in helm charts.
For webhook, I agree that this adds pain, but again, I do not plan on diverging from the existing APIs here. Long term I would like Kube to provide first class support for a short-lived, auto-rotated CA for the service network (and mint serving certs for all services that ask for one). Then the common case of "I just want to run this webhook on the cluster please make the TLS nonsense go away" would be painless.
cc @munnerz since cert-manager is mentioned.
To clarify, this is the config for the x509 authentication type (i.e. authenticating the user request), not for authenticating the apiserver request to the authenticator, right?
I believe the concerns raised here are with regards to the latter situation.
To clarify, this is the config for the x509 authentication type (i.e. authenticating the user request)
Correct. This CA bundle allows validation of client certs.
I believe the concerns raised here are with regards to the latter situation.
IIUC, the concern was "please let me separate my static config from dynamic config." For example, when configuring an OPA admission webhook that enforces policy across all resources, all ValidatingWebhookConfiguration objects across all clusters will basically look the same minus the CA bundle.
One could imagine a similar situation where most of the AuthenticationConfig object is static minus the CA bundle. I think it is far less likely, and I much prefer being able to validate the CA bundle inline. I think there could be a case for allowing an empty CA bundle which would be ignored until filled in by some controller.
cc @JoshVanL @simonswine @mhrabovcin @mattbates @phanama @wallrj I see that you all contributed to the projects mentioned above.
cluster-admin and impersonate the desired user [kube-oidc-proxy] [teleport].
Other than being a gross abuse of the impersonation API, this opens the API
server to escalation bugs caused by the proxy (such as improper handling of
incoming request headers). This proxy also intercepts network traffic which
this opens the API server to escalation bugs caused by the proxy (such as improper handling of incoming request headers)
Isn't this true with any authentication handler? I suppose the proposed Prefix requirement mitigates it to some degree.
The concern I was raising here is that with an impersonation proxy your network diagram looks like:
[ user agent ] -- user creds over TLS --> [ impersonation reverse proxy ] -- proxy credentials over TLS with impersonation headers --> [ API server ]
The reverse proxy has to handle the incoming request, authenticate it, sanitize it (ex: fail if the incoming request has impersonation headers), and then pass it through to the API server with its own credentials. What happens if Kube ever adds new headers that have special meaning? Some headers may be benign from a security perspective and the proxy should pass them through. Others could be like impersonation and change the security parameters of the request. Having a proxy that is cluster-admin act as a confused deputy is something I would like to avoid.
This network is far easier to protect IMO:
[ user agent ] -- user creds over TLS --> [ API server ] -- possible webhook over TLS with user creds --> [ webhook ]
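A minimal sketch of the sanitization burden described above, assuming a hypothetical impersonating proxy; the `Impersonate-*` header names are the standard Kubernetes ones, everything else is illustrative:

```go
// Sketch of what an impersonating reverse proxy must get right on every
// request to avoid acting as a confused deputy with its cluster-admin creds.
package proxy

import (
	"net/http"
	"strings"
)

func sanitizeAndImpersonate(req *http.Request, verifiedUser string, groups []string) {
	// Never trust caller-supplied impersonation headers.
	for name := range req.Header {
		if strings.HasPrefix(http.CanonicalHeaderKey(name), "Impersonate-") {
			req.Header.Del(name)
		}
	}
	// Assert only the identity the proxy itself verified.
	req.Header.Set("Impersonate-User", verifiedUser)
	for _, g := range groups {
		req.Header.Add("Impersonate-Group", g)
	}
	// The request is then forwarded with the proxy's own credentials; any new
	// security-sensitive header Kube introduces would also have to be handled here.
}
```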
To prevent confusion with identities that are controlled by Kubernetes, the
`system:` prefix will be disallowed in the username and groups contained in the
`user.Info` object. A disallowed username will cause authentication to fail.
It would be nice if the credentials for a disallowed username weren't even forwarded to the dynamic authenticator. The only way of making this work that I can think of is if a static authenticator was able to recognize the format of the credential, but it was the wrong credential, then it could shortcircuit-deny the request. WDYT?
EDIT: alternatively, would it be possible to require clients authenticating with a dynamic authenticator to provide a username hint? (e.g. as a request header, only relevant for OIDC & webhook types). This way, the authentication could be skipped if the requested username doesn't match the prefix, and the request would be auto-denied if the authenticated username didn't match.
I believe this is the point that @mikedanese brought up in the sig-auth call around "I do not want to forward credentials that were meant for my static webhook to these dynamic webhooks when my static webhook has a transient failure and authentication fell through to the later authenticators." This is a valid concern and we should try to address it.
It would be nice if the credentials for a disallowed username weren't even forwarded to the dynamic authenticator.
This does assume that the API server knows in advance what the username will be. I think we can only know this for sure with certs and SA tokens.
The only way of making this work that I can think of is if a static authenticator was able to recognize the format of the credential, but it was the wrong credential, then it could shortcircuit-deny the request. WDYT?
The idea of adding short circuit deny logic into authentication makes me very nervous. I think the authn stack is much easier to reason about because we do not have to worry about "which authenticator owns a credential." I could also imagine cases where a user has a webhook today that validates tokens that look like service account JWTs. I do not want to accidentally break them.
would it be possible to require clients authenticating with a dynamic authenticator to provide a username hint? (e.g. as a request header, only relevant for OIDC & webhook types). This way, the authentication could be skipped if the requested username doesn't match the prefix, and the request would be auto-denied if the authenticated username didn't match.
I think there are some problems with such an approach:
- The client has to know the username it is trying to authenticate as. Some environments use UUIDs which make this type of flow very painful.
- How would the user tell kubectl to pass such a header? Would we extend the exec credential plugin mechanism?
- If you have more than one webhook configured and they both use opaque tokens, how do you know which webhook to send the token to?
Here are my current thoughts on how we could make this safer:
- We require every AuthenticationConfig object to have a name with the format of domain:path_segment, something like company.com:v1. We reserve [*.]k8s.io:* for any future Kube usage we come up with.
- We make no other changes to x509 or oidc because the validation for those authenticators happens in-memory in the API server.
- For webhook, we only pass a token to a webhook for validation if it has the prefix <name_of_authentication_config>/. For example, if we see a token company.com:v1/D2tIR6OugyAi70do2K90TRL5A and we have a webhook configured with the name company.com:v1, we would strip the prefix (so that the webhook does not need to know its name) and pass D2tIR6OugyAi70do2K90TRL5A to the webhook. Note that this guarantees that we never pass a standard OIDC token to a webhook.
The above flow requires that a client be given the name of the webhook out of band in some manner and that it concatenate the prefix to the token (this could be done by an exec credential provider, though we could add direct support to kubectl to help facilitate this).
I cannot think of any approach that does not involve the client providing us extra information to make the webhook case not leak tokens meant for one webhook to another webhook. Encoding it directly into the token as a prefix seems like the simplest approach that could be done today via exec plugins. Having to know the (probably static, maybe even well-known like github.com:kubernetes) name of the webhook authenticator seems okay to me.
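A minimal sketch of the token routing described above, reusing the example names from this comment; this is illustrative, not an implementation:

```go
// Only forward a token to the webhook whose AuthenticationConfig name
// matches the token's prefix, stripping the prefix before sending it.
package dynamicauth

import "strings"

func tokenForWebhook(rawToken, configName string) (string, bool) {
	prefix := configName + "/"
	if !strings.HasPrefix(rawToken, prefix) {
		return "", false // tokens meant for other authenticators are never forwarded
	}
	return strings.TrimPrefix(rawToken, prefix), true
}

// Example: tokenForWebhook("company.com:v1/D2tIR6OugyAi70do2K90TRL5A", "company.com:v1")
// returns ("D2tIR6OugyAi70do2K90TRL5A", true), while a standard OIDC token
// returns ("", false) and is never sent to the webhook.
```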
Thoughts?
We could also support a simple form of wildcard by allowing the client to send the token as company.com:*/D2tIR6OugyAi70do2K90TRL5A as a way of indicating "I am okay with sending this token to all company.com authenticators."
Not sure if that is a good idea, but it would be easy to implement.
This proposal makes sense to me. It requires tokens to be provisioned with the prefix though (unless the client adds them). Is that acceptable?
It requires tokens to be provisioned with the prefix though (unless the client adds them). Is that acceptable?
Overall, I think this is acceptable because it could be made transparent to the end-user.
Authentication webhooks and exec plugins work together quite nicely to give a seamless experience. We also do not need to concern ourselves with oidc tokens (they just continue to work as they always have and are never sent to the dynamic webhooks). I see a few ways the token provisioner could handle the prefix requirement:
- The provisioner is aware of the prefix it needs to add (i.e. the prefix is static or the provisioner knows how the webhook was configured on a given cluster) and does so automatically; no client side changes are required
- The provisioner is unaware of the prefix (it could be an "old" provisioner or the name of the webhook is not stable across clusters) but the client adds it automatically via:
  - Automatic config by the process that creates the kubeconfig file for the end-user
  - Probing the cluster somehow for the information (i.e. the exec plugin knows how to access some public datastore that tells it what prefix to use on a given cluster)
  - Manual config by the end-user (the user is informed out of band of the prefix information)
Only the last option leaks this implementation detail to end-users.
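A sketch of how the second option above could look as an exec credential plugin that prepends the prefix on the client side; the environment variable names are invented for this example, and the ExecCredential output format is the one kubectl already understands:

```go
// Emit an ExecCredential whose token carries the AuthenticationConfig
// name as a prefix, e.g. "company.com:v1/<opaque token>".
package main

import (
	"encoding/json"
	"log"
	"os"
)

func main() {
	prefix := os.Getenv("AUTH_CONFIG_NAME") // e.g. "company.com:v1", discovered or configured out of band
	token := os.Getenv("RAW_TOKEN")         // stand-in for however the real token is obtained

	cred := map[string]interface{}{
		"apiVersion": "client.authentication.k8s.io/v1beta1",
		"kind":       "ExecCredential",
		"status":     map[string]string{"token": prefix + "/" + token},
	}
	if err := json.NewEncoder(os.Stdout).Encode(cred); err != nil {
		log.Fatal(err)
	}
}
```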
GroupsClaim string `json:"groupsClaim" protobuf:"bytes,4,opt,name=groupsClaim"`
}

type PrefixConfig struct {
Rather than a prefix, maybe use a glob pattern (please not a regex)? E.g. if I want to authenticate *@example.com or *@*.example.com.
This sounds like a different use case than PrefixConfig, which is meant to disambiguate the same username/group from different authenticators.
I think the use case is valid because it makes it significantly easier to use something like gitlab.com as your oidc IDP (i.e. maybe you do not want every user with a GitLab account to be able to authenticate to your cluster even if they are not authorized to do anything).
The approach I have been thinking about for such a use case is:
// the asserted user must be a member of at least one of these groups to authenticate.
// set to ["*"] to disable this check.
// required (cannot be empty)
requiredGroups []string
Basically, a simple gate on group membership for the x509 and oidc authenticators. webhook can internally do anything to limit authentication based on other parameters (i.e. some form of authz check) so it matters a lot less in that case.
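A minimal sketch of the requiredGroups gate sketched in the snippet above; the helper is hypothetical and not part of the proposal:

```go
// The asserted user must be in at least one required group; ["*"] disables the check.
package dynamicauth

func groupsAllowed(userGroups, requiredGroups []string) bool {
	for _, required := range requiredGroups {
		if required == "*" {
			return true // check disabled
		}
		for _, g := range userGroups {
			if g == required {
				return true
			}
		}
	}
	return false // e.g. an arbitrary gitlab.com user who holds none of the required groups
}
```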
Incomplete list of concerns:
If this API were used to configure an authenticating proxy, these issues would largely be avoided (in that they would be the responsibility of the proxy operator, not the cluster operator). The front proxy architecture is what we've been recommending to users for the past N years. If there are challenges with developing and operating an authentication proxy, we should identify and fix them.
Can I get the complete list 😝 On a more serious note, please take the time to enumerate all concerns that you have. I cannot address them if I do not know what they are.
This is handled via the proposed change in #1689 (comment). PTAL.
I disagree. Authentication's extra field (set by a webhook) combined with webhook authorization and webhook admission give you a lot of control over the auth stack if you can set the CLI flags on the API server. All this KEP does is make it easier to use some of that functionality generically without CLI flag access.
This is simply not true. The only requirement is that the final credential used for authentication be x509 / oidc / token. How you get that credential is completely open ended and well supported via exec plugins.
I would argue that these are orthogonal concerns to this KEP. If we did decide to support more authentication methods, we would have to configure them somehow. I am willing to bet that the structure of this API is far better suited to handle new authentication methods than a CLI flags based approach.
OpenShift, PKS, Dex, etc support these types of IDPs by giving the user a token upon successful authentication.
This KEP does not hinder any efforts in that space.
I would argue that this KEP does exactly that. If you could
I do not see how this is any different from a front proxy approach. DoS via the proxy is just as serious as DoS directly against the API. Saying that this kind of DoS is "not your problem because it's not your proxy" is hardly helpful. I would argue this is the realm of the on-going priority and fairness efforts.
I do not see this as a real problem. This KEP covers every "good" authn API we have today that has been developed over 5+ years. No one is actively trying to add net new authentication protocols to Kube. It is simply easier to issue short lived certs / tokens that are obtained using a user's desired mode of authentication. In terms of API velocity, auth in Kube is incredibly slow moving.
This is not a solution. All that does is push the responsibility elsewhere and it introduces a whole set of new problems.
And this was a mistake. It was a convenient response that allowed us to do nothing and push the burden onto end-users. This is a real problem that end-users face. The response to this proposed API change from the community has been positive, even from managed Kubernetes offerings. Dynamic control over authn is the third most requested feature in EKS per @micahhausler. All of authentication, authorization, admission, etc could be implemented with a proxy. But no one would be satisfied with that approach. Why do we think that is okay for authentication? Would projects like OPA be nearly as successful in Kube if admission plugins could only be configured via the CLI? Note that I mentioned "an ecosystem" on the last SIG Auth call. I was not referring to an ecosystem of auth proxies. I was referring to an ecosystem of apps built on top of an opinionated Kubernetes native auth stack. If auth stacks were generically portable across Kube in the way admission plugins are, it would actually make sense to develop on top of them. Auth is just another extension point.
I have listed quite a few in #1689 (comment). They cannot be fixed because they are a byproduct of a proxy-based architecture. It is also simply not possible to protect a CRD-based API with the same level of control as a built-in API.
This change aims to add a new Kubernetes REST API called
`AuthenticationConfig`. It is similar to the `ValidatingWebhookConfiguration`
Bike-shedding: AuthenticationConfiguration for consistency?
¯\_(ツ)_/¯
@mikedanese @tallclair the KEP is up to date with all comments now.
cc @liggitt
occur. Performance will also be increased by limiting the number of webhooks
that can be invoked on a given request.

It is the responsibility of the token issuer, the webhook, and the client to
From a purely practical point of view, how would the token issuer even know the name of the AuthenticationConfig driving it? This model seems like it might work if you deployed a per-cluster authenticator and then wired the configuration across, but that seems antithetical to the previous notion of authenticator webhooks as things that can exist outside of the cluster.
Even if a webhook exists outside of the cluster, it still needs to know what cluster is calling it. Otherwise you risk getting into scenarios where the token meant for one cluster can be replayed against another (and if the webhook and cluster have a 1:1 mapping they could just be hard coded with the name).
A simple approach would be to encode the name and cluster info into the request path. An exec plugin could also perform this by discovering what name it needs to use based on some public pre-auth metadata hosted on the cluster itself.
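For illustration, one hypothetical way to encode that information into the webhook request path; the URL layout is invented for this sketch and not specified anywhere in the thread:

```go
package dynamicauth

import (
	"net/url"
	"path"
)

// webhookURL scopes the call to a specific cluster and AuthenticationConfig
// so the webhook can reject tokens replayed from other clusters.
func webhookURL(base, clusterName, configName string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = path.Join(u.Path, "clusters", clusterName, "configs", configName)
	// e.g. https://auth.example.com/validate/clusters/prod-us-east/configs/company.com:v1
	return u.String(), nil
}
```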
I'm generally opposed to this KEP.
They can, but so far they won't. I know this from having tried for a long while. What I expected you to say is that one could possibly build out this feature on top of the existing configuration approach, by implementing a Webhook server that in turn delegates to any number of other configured Webhook servers. Is that what @mikedanese was suggesting with the front proxy (#1689 (comment))?
I am also opposed to accepting the proposal in its current form.
My recommendation would be to narrow this to improving the OIDC provider config, and for clusters that wanted to allow control of that config via a REST API, to use a CRD-based API paired with a config file writer.
Configuring authentication via command line flags has some limitations. Some
forms of config are simply easier to specify and understand in a structured
manner such as via a Kubernetes API or file. The existing authentication flags
for OIDC and token webhook limit the user to only one of each of these types of
authentication modes as there is no way to specify the set of flags multiple
times [#71162]. These flag based configs also require an API server restart to
take effect.
I agree these are current limitations, but they are resolvable without exposing authentication configuration via a REST API (structured config and config file reload are used elsewhere in the API server where justified).
@liggitt could the config file be loaded via a configmap + restart/SIGHUP to yield similar dynamic behavior?
the configmap bit would depend on whether the API server process was running in a pod that could have configmaps mounted to it (static pods can't), but the reloadable bit seems plausible... that's the alternative I described in #1689 (comment)
#### Story 2

Bob creates an `AuthenticationConfig` object with `spec.type` set to `x509`. He
is then able to create a custom signer for use with the CSR API. It can issue
client certificates that are valid for authentication against the Kube API.
this seems to overlap significantly with the kubernetes.io/kube-apiserver-client signer
There is a service running in Charlie's cluster: `metrics.cluster.svc`. This
service exposes some metrics about the cluster. The service is assigned the
`system:auth-delegator` role and uses the `tokenreviews` API to limit access to
the data (any bearer token that can be validated via the `tokenreviews` API is
sufficient to be granted access). Charlie uses his GitHub token to authenticate
to the service. The API server calls the dynamic authentication webhook and is
able to validate the GitHub token. Charlie is able to access the service.
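For context, a sketch of the delegated-authentication pattern the quoted story relies on: the in-cluster service (granted system:auth-delegator) validates the presented bearer token via the TokenReview API, and the API server's configured authenticators, including any dynamic webhook, do the actual validation. This uses client-go; error handling is abbreviated.

```go
package metricsservice

import (
	"context"
	"fmt"

	authnv1 "k8s.io/api/authentication/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// validateToken asks the API server whether the caller's bearer token is valid.
func validateToken(ctx context.Context, client kubernetes.Interface, token string) (string, error) {
	review, err := client.AuthenticationV1().TokenReviews().Create(ctx,
		&authnv1.TokenReview{Spec: authnv1.TokenReviewSpec{Token: token}},
		metav1.CreateOptions{})
	if err != nil {
		return "", err
	}
	if !review.Status.Authenticated {
		return "", fmt.Errorf("token was rejected")
	}
	return review.Status.User.Username, nil
}
```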
This seems to encourage audienceless tokens like the legacy serviceaccount tokens we're trying to phase out. For auth to in-cluster services, credentials that cannot be replayed as API server credentials seem like a better thing to work toward.
Frank is exploring different options for authentication in Kubernetes. He browses
various repos on GitHub. He finds a few projects that are of interest to him.
He is able to try out the functionality using `kubectl apply` to configure his
cluster to use the custom authentication stacks. He finds a solution that he
likes. He uses the appropriate `kubectl apply` command to update his existing
clusters to the new authentication stack.
I wouldn't personally find the ability to casually replace configured authenticators via kubectl apply a comforting thought, but maybe that's because I'm a paranoid admin :)
1. Create a new Kubernetes REST API that allows configuration of authentication
   - x509
why is this needed in addition to the CSR mechanism for obtaining kube-apiserver client certificates?
   - Token Webhook
2. Changes made via the REST API should be active in the order of minutes without requiring a restart of the API server
3. Allow the use of a custom authentication stack in hosted Kubernetes offerings
Is there evidence this is a thing most hosted Kubernetes offerings want and plan to enable? If this would not be enabled by default, or made part of conformance, or opted into broadly, I question whether it should exist as a built-in API at all.
I agree with the general problem being solved here, but I am not in favor of the approach being proposed to achieve the goal. I would request that, if this proposal is accepted, the feature be provided as an "opt-in" feature.
Many many customers I talk to are looking to use k8s clusters from multiple providers and have consistent auth among them. This is something that comes up again and again with enterprise customers.
I think we should be looking at this from the point of view of the end users of the clusters (both cluster admins and app teams) and not hosting providers. After all, they are the real drivers of the success of Kubernetes.
@dcberg -- if this is something that users want and you don't enable it then you'll be at a competitive disadvantage. That is how this should work. Create the capability and then let the real users vote with their feet.
Yes, Joe, thank you for putting it so well. In several organizations I've seen so far, the cloud providers' authentication systems integrated into their Kubernetes offerings are of no use to us. They are merely an impediment, used only begrudgingly and thereafter ignored.
The cost is that these clusters wind up using very few principals with coarsely assigned permissions ("Fine, then every developer is a cluster administrator!"), for lack of finer-grained control that could be employed consistently across all the clusters hosted with different providers.
Audit logging is no longer accurate. RBAC is less useful. In all, the system becomes less secure and less trustworthy for important workloads.
@jbeda I don’t disagree that there is a need to have common auth across clusters. While this proposal would not break a hosted solution, it may limit the use of other integrations with the provider.
My ask is that the feature has an "opt-in" mechanism to allow providers the flexibility to expose the feature in such a way as to limit impact and/or to point users to an alternative approach.
2020-04-15: Initial KEP draft created
2020-05-15: KEP updated to address various comments

## Drawbacks
I expected a discussion of the increased security exposure of kubectl apply gaining the ability to add additional authentication sources.
The proposed functionality will be gated behind a new feature flag called
`DynamicAuthenticationConfig`.

The proposed API will reside in the `authentication.k8s.io` group at version
This would need to be a distinct group if it was going to be opt-in (which I would expect) or be CRD-based (as described in my comment on the alternatives section).
TBD.

## Alternatives
An alternative is to focus on the OIDC provider options.
- x509 already has a mechanism to obtain client certs usable against the kube-apiserver (CSR)
- getting multiple token webhooks to interoperate on opaque tokens safely/usably seems problematic
- the linked requests were focused on OIDC configurability (Support multiple OIDC ClientIDs on API server kubernetes#71162 / [EKS] [request]: Add ability to set oidc options aws/containers-roadmap#166)
That could look like the following:
- allow the kube-apiserver to configure multiple oidc providers via a structured, reloadable config file
- Separately, create a CRD for custom resources that map to the items in the oidc config in a straightforward way
- Providers that want to expose the config via a REST API can install the CRD and the custom-resource -> OIDC config file process
- Providers that do not want to expose the config still benefit from multiple oidc provider capability and have no additional exposure and no new API surface to lock down
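For illustration, a hypothetical shape for the structured, reloadable OIDC config file described in this alternative; nothing here is defined by the KEP or by the kube-apiserver as it exists in this discussion:

```go
package config

// OIDCAuthenticationConfiguration sketches a file-based config that would let
// the kube-apiserver trust multiple OIDC providers and reload the list
// without a restart.
type OIDCAuthenticationConfiguration struct {
	Providers []OIDCProvider `json:"providers"`
}

type OIDCProvider struct {
	IssuerURL      string `json:"issuerURL"`
	ClientID       string `json:"clientID"`
	UsernameClaim  string `json:"usernameClaim,omitempty"`
	UsernamePrefix string `json:"usernamePrefix,omitempty"`
	GroupsClaim    string `json:"groupsClaim,omitempty"`
	GroupsPrefix   string `json:"groupsPrefix,omitempty"`
	CABundle       []byte `json:"caBundle,omitempty"`
}
```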
Responding to Jordan's overall comment about "make OIDC better."
Overall, I do not understand how a REST API (I discuss two below) ends up being different than the impersonation API or the CSR API from a security perspective. If we give the cluster operator the same level of control, why does it matter?
As a thought experiment, if I were to propose a new UserTokenRequest API that worked like the CSR API's kubernetes.io/kube-apiserver-client signer, would that be okay? This would be a way for some privileged component to ask for a token for a particular user info that the Kube API would honor. The token itself would be opaque to the client. Whatever component issued the token could choose not to, based on whatever requirements the cluster-operator specifies. This handles "the identity is rooted in a trust owned by the API server / cluster operator." Such an API adds one layer of indirection, but that could be hidden from the end user.
My recommendation would be to narrow this to improving the OIDC provider config, and for clusters that wanted to allow control of that config via a REST API, to use a CRD-based API paired with a config file writer.
1. allow the kube-apiserver to configure multiple oidc providers via a structured, reloadable config file
2. Separately, create a CRD for custom resources that map to the items in the oidc config in a straightforward way
3. Providers that want to expose the config via a REST API can install the CRD and the custom-resource -> OIDC config file process
4. Providers that do not want to expose the config still benefit from multiple oidc provider capability and have no additional exposure and no new API surface to lock down
So we would need to, in k/k:
- Make a new file based API
- Wire that file based config through
- Reload that config
Outside of k/k:
- Create a new CRD
- Create new controllers to handle the CRD
- Create new admission plugins to restrict access to the CRD, but with far weaker security (they can be removed if you can delete the webhook)
For every single provider:
- Ask them to run all of those components co-located with their API servers
- Configure credentials and the correct authz for these components
- Set the right flags on the API server
This sounds like an incredible amount of work in-tree, out-of-tree, and for every single provider. I understand the desire to build things out of core, but I do not think this moves us forward in any meaningful way.
Why would we instead not:
In k/k:
- Create a new dynamic OIDC config API, in a new group, disabled by default
- Wire the new API through
- Reload the API config
Outside of k/k:
No work
For every single provider:
- Ask them to pass a single flag to enable the API if they wish to support it
This is the same amount of work and changes in k/k. There is no work outside of k/k and the burden on the providers is as small as we can make it. Since it's off by default, no security concerns are added. The provider would opt in in the same way they would for the CRD based flow. They just would not have to do a large amount of provider specific work.
Is there evidence this is a thing most hosted Kubernetes offerings want and plan to enable? If this would not be enabled by default, or made part of conformance, or opted into broadly, I question whether it should exist as a built-in API at all.
From what I can tell, EKS and AKS would. GKE and OpenShift would not (though OpenShift already lets you control IDPs via a Kube REST API). Not sure about others. I suspect Linode, Digital Ocean, Rancher might. It would be enabled in VMware's products (duh).
If every hosted offering had a Kube API that let you configure auth in some way that they were comfortable with, then we would be fine as-is. The reality is that even when providers want to allow this, it is a lot of work to build all the tooling to wire this stuff through. They also tend to use non Kube APIs, which makes it hard to integrate with.
We make extensibility features dynamic when it is expected that they will change over the life of a cluster. I personally can't see ever wanting to change an authentication system once the cluster was doing something. (Also, an unwanted authn system change sounds like one of the most thorough pwnings I can imagine.) So, I think this is a candidate for static config file configuration, not dynamic API based configuration. Access to the static config files is a higher privilege than "cluster admin". I think requiring the highest privilege to change this is good. If the problem is just getting this right once, at cluster initialization, as I suspect, I'd look at making a config file + support in Cluster API. If the problem is not that, and people want to change this dynamically, during the life of their cluster, more than once... I guess I'd want to see a long list of people who want that; it's really hard for me to believe.
@lavalamp I think you are conflating things. The need to have end users configure authentication in a standard way regardless of hosting provider is separate from the question of when and how often it changes. This is something that end users have hit their head against. Allowing the apis/mechanisms for doing this in a provider-neutral way is a first step. It gives end users something to tool against and allows them to work with their providers to enable the capability. Cluster API is great but it doesn't get to that need to work against managed clusters. In fact, cluster API is a great option for those customers (and there are many) that opt to run their own clusters so that they can get access to configure auth.
I've asked the people in my org who would know these things if they agree with this assertion or not. If you can provide clearer evidence for this assertion that would be helpful. Note that static config is also a "standard way". The debate is actually about whether the config must be end-user accessible or not.
It is pointless if the providers wouldn't enable it. If this is to be an end user extensibility feature, it needs to be universal. And therefore non-optional, required by conformance. Otherwise you won't actually get more portability, you get less. So, I wouldn't want to start down this road unless there's very clear demand from users. Right now it's not clear that there is (maybe it's clear to you but that doesn't really help the rest of us).
The audience here is the administrator or operator deploying Kubernetes clusters. It may be a small group with a big impact. In a given organization, there may be one or two such people setting up dozens of clusters to be used by hundreds of people to run applications serving millions of people. Does that count as one user (the organization), or two (the administrators), or hundreds (the developers)? When I've talked to others interested in Kubernetes about authentication, I've heard several times, "Oh, I didn't think you were allowed to configure things like that." In other words, increasingly, this configuration being inaccessible turns people off from thinking about it. They just assume that "Kubernetes does not have a way to add users." These are all anecdotes. What counts as proof? What counts as enough?
Mostly an aside, could be something to bear in mind. Can I create an AuthenticationConfig that lets me implement authorisation with the client identity masked / erased? For example, using SAML it's relatively easy for an IdP to make an assertion about the bearer: they're a staff member, they have a role in Project 23 and another role in Project 42, and they have the following 7 entitlements [entitlements go here]. Current ways of working with Kubernetes seem to me to imply that all authorization decisions are based around a known user identity, whereas that's not always the case. Sometimes you can, e.g., take the ARN of an AWS IAM role, and put that in place of the username, or have a mapping. It's good to retain the ability to authorize something even if you don't have a unique identifier for the caller, and never will.
I think I'm picturing the same symptoms as https://github.com/kubernetes/enhancements/pull/1689/files#r428871572 and wondering how Kubernetes authnz looks a good few minor releases from now.
(and I hope this feedback is useful)
environments have no consistent way to expose this functionality [#166]. To get
around this limitation, users install authentication proxies that run as
cluster-admin and impersonate the desired user [kube-oidc-proxy] [teleport].
Other than being a gross abuse of the impersonation API, this opens the API
I think it'd be useful to restate the intent for the impersonation API (which I think is to provide a model to allow specific identity A to request the ability to act as different identity B in some scope), so that the reader can draw their own conclusion about suitability, alternatives, etc.
My comment at #1689 (review) still reflects my current thinking, and I confirmed with David that #1689 (comment) still reflects his. Of the authentication mechanisms this KEP discusses, supporting multiple OIDC providers seems the most reasonable goal. However, I would characterize what this KEP proposes as a cluster configuration API, not something that should be built into the kube-apiserver. I considered the following questions:
Some benefits of that approach:
Since we are in agreement that this proposal is not acceptable, we will close this PR and look forward to reviewing proposals to fill in the capability gaps identified in the kube-apiserver. @liggitt, @deads2k, and @mikedanese