-
Notifications
You must be signed in to change notification settings - Fork 5.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Scoped Kubelet API Access #585
Conversation
cc @kubernetes/sig-auth-proposals |
959e69f
to
11b89bf
Compare
Currently, the `system:node` cluster role is automatically bound to the `system:nodes` group. | ||
|
||
Because the node authorizer accomplishes the same purpose, with the benefit of additional restrictions | ||
on secret and configmap access, this binding is no longer needed, and will no longer be set up automatically. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to impact anyone using system:node
x509 certs that haven't been able to dynamically make the CN system:node:<nodeName>
. e.g. if a deployment grants the same x509 cert to all nodes but still use the organization system:node
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It shouldn't. See the section below that addresses this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean that those deployment tools, to start supporting 1.7, will either have to start creating that ClusterRoleBinding or dynamically setting the CN.
For example look at kube-up's x509 generation https://github.com/kubernetes/kubernetes/blob/274df99e9bf63213325d9813eec84288d3c5ab73/cluster/common.sh#L996
I don't know how prevalent this is, and I don't think it's a big problem, but it's probably worth checking popular tools for this pattern over this release cycle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the node authorizer would authorize against the broad rules for system:nodes
members (as the bound cluster role does today), then further subdivide if the identity allowed us to determine the specific node
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, sorry. Misread that last section. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem, I'll call it out to make it clearer
Thanks for working on this, LGTM. |
|
||
The default `NodeIdentifier` implementation: | ||
* `isNode` - true if the user groups contain the `system:nodes` group | ||
* `nodeName` - populated if `isNode` is true, and the user name is in the format `system:node:<nodeName>` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will clarify that this dovetails with the client certs requested by the kubelet TLS bootstrap process (and kubeadm join process)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a quick blurb about whether future extension is anticipated (we don't expect there to be things that are like nodes that don't fit this reasonable pattern for cert and naming)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Currently, kubelets have read/write access to all Node and Pod objects, and read access to all Secret and ConfigMap objects. | ||
This means that compromising a node gives access to credentials with power to make escalating API calls that affect other nodes. | ||
|
||
This proposal limits kubelets' API access using a new node authorizer and admission plugin: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expected an alternatives section here or below describing why rbac isn't enough and other alternatives considered (I can't remember others). Also a caution about future use of this pattern for another client
* Further restricts secret and configmap requests to only authorize reading objects referenced by pods bound to the node | ||
* Node admission | ||
* Limits a node to mutating its own Node API object | ||
* Limits a node to mutating pods bound to itself |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would add words to the effect of "we represent a policy that is a peer to rbac with code" or something that gives this a conceptual hat hook
|
||
The default `NodeIdentifier` implementation: | ||
* `isNode` - true if the user groups contain the `system:nodes` group | ||
* `nodeName` - populated if `isNode` is true, and the user name is in the format `system:node:<nodeName>` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a quick blurb about whether future extension is anticipated (we don't expect there to be things that are like nodes that don't fit this reasonable pattern for cert and naming)
* creating and updating status of their Node API object | ||
* updating status of Pod API objects bound to their node | ||
* creating/deleting "mirror pod" API objects for statically-defined pods running on their node | ||
* reading Secret and ConfigMap objects referenced by pod specs bound to their node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this list is meant to be complete, but also a bunch of read operations (services, endpoints, PVs, etc.), exec/attach/portforward & {token,subjectaccess}reviews
EDIT: exec/attach/portforward are served by the node, but the node doesn't need API access for those
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to (attempting?) to make a semi-complete list, even if they're not in-scope for v1.7 or v1.8 -- be nice to see a what's sufficient to get where we want to be, not just known-necessary.
Also @jbeda mentioned that we should eventually make kubelet/nodes have whitelisted access to the specific set of things they need access to? And if so, having that list of necessary permissions would be a pre-requisite?
Also, I think kubelets rely on an (implicit?) list of all Pods in the system via a watch to pods? If so, then Kubelets have an additional requirement to
- get a list of Pods that are scheduled to that kubelet.
That is, the problem of allowing read access to a specific pod is different from a list query to find all visible-to-me pods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I think kubelets rely on an (implicit?) list of all Pods in the system via a watch to pods? If so, then Kubelets have an additional requirement to
- get a list of Pods that are scheduled to that kubelet.
That is, the problem of allowing read access to a specific pod is different from a list query to find all visible-to-me pods?
This proposal limits kubelets' API access using a new node authorizer and admission plugin: | ||
* Node authorizer | ||
* Authorizes requests from nodes using the existing policy rules developed for the `system:node` cluster role | ||
* Further restricts secret and configmap requests to only authorize reading objects referenced by pods bound to the node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure configmaps can realistically be restricted, since a node could just create a mirrorpod referencing the desired configmap. Note that secrets are immune to this attack since mirror pods are forbidden from accessing secrets.
Maybe providing an option that forbids the creation of mirror pods would be a good idea for some deployments that don't rely on them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't expect to let a node establish a relationship from a mirror pod to any API object (secret, configmap, pvc, serviceaccount, etc)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that would be a breaking change, since static pods can currently use PVCs and configmaps. I suppose we could make it a hard requirement for adopting the node admission plugin, but that might make adoption more difficult.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a node can simply create a static pod referencing any configmap, we cannot restrict node access to any configmaps (same for pod>pvc>secret references). It was my understanding that static pods were not supposed to reference API resources
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I aggree, I'm just not sure we ever documented or enforced that anywhere. This is where we enforce it for secrets & service accounts: https://github.com/kubernetes/kubernetes/blob/master/plugin/pkg/admission/serviceaccount/admission.go#L149
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The node restricting admission plugin will forbid those. You'll have to choose whether you want to be able to lock down nodes or use static pods that use those features (which I don't think are intended to be used today). For compatibility, if the enforcement is via admission, they can opt in or out
|
||
The default `NodeIdentifier` implementation: | ||
* `isNode` - true if the user groups contain the `system:nodes` group | ||
* `nodeName` - populated if `isNode` is true, and the user name is in the format `system:node:<nodeName>` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could the IP address be alternatively (or additionally) used? Is that just adding unnecessary complexity?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, the nodeName is used to filter pods, and the kubelet already knows it (since it is responsible for updating the Node status and querying bound pods).
|
||
Limits `update`,`patch`,`delete` of pod and pod/status resources by identifiable nodes: | ||
* only allow modifying pods with nodeName set to the node making the API request (requires fetching the pod on delete) | ||
* do not allow removing a mirror pod annotation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be global, not just for nodes.
Limits `update`,`patch`,`delete` of node and nodes/status resources by identifiable nodes: | ||
* only allow modifying the node object corresponding to the node making the API request | ||
|
||
Limits `update`,`patch`,`delete` of pod and pod/status resources by identifiable nodes: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
delete
should only be required for mirror pods, right? In that case it should be limited to those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure kubelets do the final deletion of a gracefully-deleted pod, after they've run the shutdown hooks and sent the termination signal. @smarterclayton?
Limits `update`,`patch`,`delete` of pod and pod/status resources by identifiable nodes: | ||
* only allow modifying pods with nodeName set to the node making the API request (requires fetching the pod on delete) | ||
* do not allow removing a mirror pod annotation | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also create/update/patch of events, limited to InvolvedObject
s bound to the node
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how valuable that is, relative to the cost of evaluation on a high-volume API
Subsequent authorizers in the chain can run and choose to allow the request. | ||
|
||
Requests from identifiable nodes (`IdentifyNode()` returns nodeName != "") for secrets and configmaps are further restricted: | ||
* Requests for secrets are limited to `get`, and the requested secret must be related to the requesting node by one of the following relationships: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a cross reference to #443. With that addition, this should include list
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we'll never let nodes access the existing list and watch apis for secrets, since that exposes the entire list. the bulk watch of individual resources will likely perform get
authorization checks on the individual resources.
* node -> pod -> pvc -> storageclass -> secret | ||
* Requests for configmaps are limited to `get`, and the requested configmap must be related to the requesting node by one of the following relationships: | ||
* node -> pod -> configmap | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also various other read (get/list/watch) requests should be authorized for objects bound to the node, including:
- pods
- pvc & pv
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can extend this further in the future if we want to, but I consider subdividing those reads out of scope for this proposal. pods are list/watched across all namespaces via a field selector, which is currently not available to authorization to enforce. PVC and PV objects should not contain sensitive information (technically configmaps should not either, but experience suggests they will)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added section for future work
Currently, the `system:node` cluster role is automatically bound to the `system:nodes` group. | ||
|
||
Because the node authorizer accomplishes the same purpose, with the benefit of additional restrictions | ||
on secret and configmap access, this binding is no longer needed, and will no longer be set up automatically. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few other things granted by the system:node
clusterrole not listed here:
// Needed to check API access. These creates are non-mutating
rbac.NewRule("create").Groups(authenticationGroup).Resources("tokenreviews").RuleOrDie(),
rbac.NewRule("create").Groups(authorizationGroup).Resources("subjectaccessreviews", "localsubjectaccessreviews").RuleOrDie(),
// Needed to build serviceLister, to populate env vars for services
rbac.NewRule(Read...).Groups(legacyGroup).Resources("services").RuleOrDie(),
// TODO: restrict to pods scheduled on the bound node once supported
rbac.NewRule(Read...).Groups(legacyGroup).Resources("pods").RuleOrDie(),
// TODO: restrict to namespaces of pods scheduled on bound node once supported
// TODO: change glusterfs to use DNS lookup so this isn't needed?
// Needed for glusterfs volumes
rbac.NewRule("get").Groups(legacyGroup).Resources("endpoints").RuleOrDie(),
// Used to create a certificatesigningrequest for a node-specific client certificate, and watch
// for it to be signed. This allows the kubelet to rotate it's own certificate.
rbac.NewRule("create", "get", "list", "watch").Groups(certificatesGroup).Resources("certificatesigningrequests").RuleOrDie(),
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all of those rules would be used as the coarse-grained authorization check, then additional restrictions places on specific resources
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I'm still confused by this. Yes, those permissions can still be granted as course-grained without additional restrictions, but they need to be granted somewhere. They could be left as RBAC rules, or granted at a course level in the node authorizer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As proposed, the built-in node authorizer would broadly authorize users in the system:nodes
group that do not have a username that permits determining the specific node (just like the current RBAC binding of the system:nodes
group to the system:node
does)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would also need to do this for some APIs for all users in the system:nodes
group, not just those without a node-specific identity, right? I'm talking about things like "list services" and "get endpoints".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes... clearly I'm not explaining this well :)
This is how the node authorizer would work, in order:
- If a request is not from a node, reject
- If a request is not allowed by the broad authorization rules currently used by the
system:node
RBAC role, reject - If a specific node cannot be identified, allow (this is what lets nodes that don't use node-specific identities continue to work with the broad authorization rules in step 2)
- For resources we care about subdividing access to (secrets, configmaps, etc), determine if the requested object is related to the requesting node, and reject the request if it isn't
- For other resources, allow
FWIW, this is my current authorizer implementation:
func (r *NodeAuthorizer) Authorize(attrs authorizer.Attributes) (bool, string, error) {
nodeName, isNode := r.identifier.NodeIdentity(attrs.GetUser())
if !isNode {
// reject requests from non-nodes
return false, "", nil
}
if len(nodeName) == 0 && r.requireNodeName {
// reject requests from unidentifiable nodes if in strict mode
return false, "unknown node", nil
}
if !rbac.RulesAllow(attrs, r.nodeRules...) {
// reject requests that don't fit within the broad authorization rules
return false, "", nil
}
if len(nodeName) == 0 {
// allow requests from unidentifiable nodes if not in strict mode
return true, "", nil
}
// subdivide access to specific resources
if attrs.GetAPIGroup() != api.GroupName {
// the resources we care about subdividing happen to be in the core API group.
// access to resources in other API groups is not subdivided.
return true, "", nil
}
switch attrs.GetResource() {
case "configmaps":
if attrs.GetVerb() != "get" || len(attrs.GetName()) == 0 || len(attrs.GetNamespace()) == 0 {
glog.V(2).Infof("NODE DENY: %#v", attrs)
return false, "can only get individual configmaps", nil
}
return r.hasPathFrom(nodeName, ConfigMapNode, attrs.GetNamespace(), attrs.GetName()), "", nil
case "secrets":
if attrs.GetVerb() != "get" || len(attrs.GetName()) == 0 || len(attrs.GetNamespace()) == 0 {
glog.V(2).Infof("NODE DENY: %#v", attrs)
return false, "can only get individual secrets", nil
}
return r.hasPathFrom(nodeName, SecretNode, attrs.GetNamespace(), attrs.GetName()), "", nil
case "persistentvolumeclaims":
if attrs.GetVerb() != "get" || len(attrs.GetName()) == 0 || len(attrs.GetNamespace()) == 0 {
glog.V(2).Infof("NODE DENY: %#v", attrs)
return false, "can only get individual persistentvolumeclaims", nil
}
return r.hasPathFrom(nodeName, PVCNode, attrs.GetNamespace(), attrs.GetName()), "", nil
case "persistentvolumes":
if attrs.GetVerb() != "get" || len(attrs.GetName()) == 0 {
glog.V(2).Infof("NODE DENY: %#v", attrs)
return false, "can only get individual persistentvolumes", nil
}
return r.hasPathFrom(nodeName, PVNode, "", attrs.GetName()), "", nil
default:
// Access to other resources is not subdivided
return true, "", nil
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, this clears it up.
* `isNode` - true if the user groups contain the `system:nodes` group | ||
* `nodeName` - populated if `isNode` is true, and the user name is in the format `system:node:<nodeName>` | ||
|
||
## Node authorizer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you clarify the logic for what goes into the authorizer vs the admission plugin?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
This has some overlap with kubernetes/kubernetes#29459 We might need some additional mechanism for granting nodes permission to objects that are used directly by the Kubelet, and not referenced by any pod. |
d848bf1
to
97e839b
Compare
comments addressed. @alex-mohr @smarterclayton @timstclair PTAL |
this generally lgtm, agree depending on how #29459 shakes out, it may need to be tweaked further. |
CC @kubernetes/sig-node-proposals |
@saad-ali: These labels do not exist in this repository: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
* Node admission | ||
* Limits nodes to only be able to mutate their own Node API object | ||
* Limits nodes to only be able to mutate pods bound to themselves | ||
* Limits nodes to only be able to create mirror pods |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Create mirror pods that are bound to themselves
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the previous point implies that, but will make that explicit
* Limits nodes to only be able to mutate pods bound to themselves | ||
* Limits nodes to only be able to create mirror pods | ||
* Prevents creating mirror pods that reference API objects (secrets, configmaps, persistent volume claims) | ||
* Prevents creating mirror pods that are not bound to nodes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bound to themselves
Other than a few comments LGTM |
To run a pod, a kubelet must have read access to the following objects referenced by the pod spec: | ||
* Secrets | ||
* ConfigMaps | ||
* PersistentVolumeClaims (and any bound PersistentVolume or referenced StorageClass object) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: As part of local storage management, kubelet will be writing to PersistentVolume and PersistentVolumeClaim objects
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know. Is that just proposing nodes perform status update operations, or also create/delete operations?
* Further restricts secret and configmap access to only allow reading objects referenced by pods bound to the node making the request | ||
* Node admission | ||
* Limits nodes to only be able to mutate their own Node API object | ||
* Limits nodes to only be able to mutate pods bound to themselves |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will it prevent ObjectMeta
mutation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only pods/status
updates are currently allowed. whatever ObjectMeta mutation is allowed via that API is allowed.
* only allow pods with nodeName set to the node making the API request | ||
* Limits `update`,`delete` of node and nodes/status resources: | ||
* only allow modifying the node object corresponding to the node making the API request | ||
* Limits `update`,`delete` of pod and pod/status resources: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The kubelet changes the spec of a pod object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, clarified this is [update, pod/status] and [delete, pod]
* Limits `update`,`delete` of pod and pod/status resources: | ||
* only allow modifying pods with nodeName set to the node making the API request (requires fetching the pod on delete) | ||
|
||
For requests made by any user: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This information looks statically determinant. Are we doing this in admission instead of validation because we want it to be optional? Why not some sort disableSomeValidation
flag instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
split this into a validation section and an admission section
restricting mirror pods from referencing API objects is more related to not letting nodes expand their related objects graph, so I'd probably keep it in the admission plugin
d38e9d4
to
2059e1a
Compare
comments addressed |
@timstclair @deads2k PTAL |
lgtm |
This document proposes limiting a kubelet's API access using a new node authorizer and admission plugin: | ||
* Node authorizer | ||
* Authorizes requests from nodes using a fixed policy identical to the default RBAC `system:node` cluster role | ||
* Further restricts secret and configmap access to only allow reading objects referenced by pods bound to the node making the request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add the ability to read Secrets (well, maaaaaaaybe Secrets) and ConfigMaps from a specific set of namespaces, whether or not Pods bound to the Node reference them. This has to do with the Kubelet needing access to the namespace containing ConfigMaps for Kubelet/Node configuration, wrt kubernetes/kubernetes#29459
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to wait until configmap/secret access for kubelets to read their own config formalizes before including it in this authorizer, since it is not meant to be configurable. I'll add an item to the future work list to track it.
In the meantime, access can be granted using whatever authorization method is in place for the cluster.
* Authorizes requests from nodes using a fixed policy identical to the default RBAC `system:node` cluster role | ||
* Further restricts secret and configmap access to only allow reading objects referenced by pods bound to the node making the request | ||
* Node admission | ||
* Limit nodes to only be able to mutate their own Node API object |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
**Can this just be enforced by authorization?** | ||
|
||
Authorization does not have access to request bodies (or the existing object, for update requests), | ||
so it could not restrict access based on fields in the incoming or existing object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seem to be more and more use-cases for per-field access control. It would sure make things like this easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
requiring decoding objects prior to authorization is unlikely, especially given API aggregation
|
||
Manifesting RBAC policy rules to give each node access to individual objects within namespaces | ||
would require large numbers of frequently-modified roles and rolebindings, resulting in | ||
significant write-multiplication. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Labels and selectors for ConfigMaps and Secrets might make writing the roles easier. I swear I saw an issue or something about this somewhere...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They wouldn't help with transitive relationships (you can access secrets referenced by pods which reference your node)
1. If a request is not from a node (`IdentifyNode()` returns isNode=false), reject | ||
2. If a request is not allowed by the rules in the default `system:node` cluster rule, reject | ||
3. If a specific node cannot be identified (`IdentifyNode()` returns nodeName=""), allow (this lets nodes that don't use node-specific identities continue to work with the broad authorization rules in step 2) | ||
4. If a request is for a secret, configmap, persistent volume or persistent volume claim, reject unless the verb is `get`, and the requested object is related to the requesting node: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 only allowing read requests for these types
I think it makes sense for "related to the requesting node" to also include a set of arbitrary namespaces that are considered to be system-level (or just node-level).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense for "related to the requesting node" to also include a set of arbitrary namespaces that are considered to be system-level (or just node-level).
This also seems like something that would vary by deployment. I'd like to see more use cases around this before making it part of the fixed node authorizer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, one small comment.
### Kubelets with undifferentiated usernames | ||
|
||
In some deployments, kubelets have credentials that place them in the `system:nodes` group, | ||
but do not identify the particular node they are associated with. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should add an override option to disable this exception. I can't think of any way it would work, but I'm worried this leaves an opening for a sort of down-grade attack. For a deployment with identifiable node identities, removing this exception would provide additional assurance (and is probably also what we want eventually).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree on a strict mode option, will add
2059e1a
to
ab1e749
Compare
Scoped Kubelet API Access
Proposal to subdivide kubelet API access to read secrets/configmaps, and to mutate node and pod resources