-
Notifications
You must be signed in to change notification settings - Fork 476
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: enhancements/machine-config: securing MCS #626
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
||
## Proposal | ||
|
||
If it can be ensured that no one is able to access sensitive data through the Machine Config Server, the port restrictions can be safely removed. In order to achieve this, Machine Configs can be split into two groups: pre-node-approved configs and post-node-approved configs. The pre-node-approved configs are ones which do not contain sensitive information and contain just enough to configure new nodes to submit a certificate signing request (CSR) to the API server. The post-node-approved configs contain everything else, including the overwhelming majority of customer-provided Machine Configs. When a new machine is created, it fetches its Ignition config from the Machine Config Server, which only serves the pre-node-approved configs. At this point, the machine has enough information to submit a CSR, but nothing sensitive (note that for the purposes of this proposal, the CSR bootstrap token and pull secret are not considered sensitive). Once the CSR has been approved and a kubeconfig returned, the Machine Config Daemon is able to directly enumerate Machine Configs, including the post-node-approved ones, so it then applies any remaining configuration to the machine and reboots if necessary. At this point, the machine is fully configured. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be helpful to pull out some of the assumptions and state them very clearly up front. It's included in here, but perhaps before this paragraph, a new paragraph could say:
Security assumptions:
- The CSR bootstrap token is not considered sensitive and being accessible to all applications in the cluster is accessible.
- The pull secret is not considered sensitive and being accessible to all applications in the cluster is accessible
- Contents of MachineConfig resources are considered (potentially) sensitive, so this proposal introduces a change to protect them from unauthorized access through the MCS
Do I have that right?
This proposal would have to depend on some other enhancement that would allow that first assumption to hold. I believe right now you could DoS a cluster with the ability to issue CSRs.
|
||
## Summary | ||
|
||
The bootstrapping process for new cluster nodes is a tricky problem, primarily due to the fact that a new nodes start from a position of almost zero knowledge. These nodes need to fetch their machine configuration and authenticate themselves with the cluster before they can access most resources and have workloads scheduled. At the same time, a malicious actor pretending to be a new node cannot be allowed to join the cluster. Otherwise, they would be able to get access to sensitive resources and could have workloads and their secrets scheduled to them. Today, we protect against this by preventing cluster workloads from accessing the Machine Config Server, the component of the Machine Config Operator which is responsible for serving Ignition configs, derived from a set of Machine Configs, to new machines. This approach presents a problem though: workloads on OpenShift are not allowed to make use of TCP ports 22623 and 22624, the ports used by the Machine Config Server. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
linebreaks please! (makes it easier to leave review comments)
|
||
## Summary | ||
|
||
The bootstrapping process for new cluster nodes is a tricky problem, primarily due to the fact that a new nodes start from a position of almost zero knowledge. These nodes need to fetch their machine configuration and authenticate themselves with the cluster before they can access most resources and have workloads scheduled. At the same time, a malicious actor pretending to be a new node cannot be allowed to join the cluster. Otherwise, they would be able to get access to sensitive resources and could have workloads and their secrets scheduled to them. Today, we protect against this by preventing cluster workloads from accessing the Machine Config Server, the component of the Machine Config Operator which is responsible for serving Ignition configs, derived from a set of Machine Configs, to new machines. This approach presents a problem though: workloads on OpenShift are not allowed to make use of TCP ports 22623 and 22624, the ports used by the Machine Config Server. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach presents a problem though: workloads on OpenShift are not allowed to make use of TCP ports 22623 and 22624, the ports used by the Machine Config Server.
(This is specifically because the blocking rules were done as a quick hack. In theory, we could block every pod from accessing only the MCS without blocking anything else, but that would require noticing when masters were added/moved and updating the rules for every pod when that happened. Since this was supposed to have been a temporary hack, we didn't bother, and instead just blocked access to ports 22623 and 22624 on all IPs.)
|
||
## Proposal | ||
|
||
If it can be ensured that no one is able to access sensitive data through the Machine Config Server, the port restrictions can be safely removed. In order to achieve this, Machine Configs can be split into two groups: pre-node-approved configs and post-node-approved configs. The pre-node-approved configs are ones which do not contain sensitive information and contain just enough to configure new nodes to submit a certificate signing request (CSR) to the API server. The post-node-approved configs contain everything else, including the overwhelming majority of customer-provided Machine Configs. When a new machine is created, it fetches its Ignition config from the Machine Config Server, which only serves the pre-node-approved configs. At this point, the machine has enough information to submit a CSR, but nothing sensitive (note that for the purposes of this proposal, the CSR bootstrap token and pull secret are not considered sensitive). Once the CSR has been approved and a kubeconfig returned, the Machine Config Daemon is able to directly enumerate Machine Configs, including the post-node-approved ones, so it then applies any remaining configuration to the machine and reboots if necessary. At this point, the machine is fully configured. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point, the machine has enough information to submit a CSR, but nothing sensitive (note that for the purposes of this proposal, the CSR bootstrap token and pull secret are not considered sensitive). Once the CSR has been approved
Wait... we're saying that
- the MCS cannot distinguish a real proto-node from an attacker, but
- the certificate approver can distinguish a CSR sent by a real proto-node that has received non-secret information from the MCS, from a CSR sent by an attacker that has received non-secret information from the MCS
That doesn't make sense; if the certificate approver can distinguish them, then the MCS ought to be able to distinguish them in the same way. Specifically, the MCS should only respond to connections from IPs that correspond to not-yet-approved Machine
resources (or something like that). Right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic there makes sense. If we're willing to automatically approve a CSR, that same logic should be sufficient for accepting a connection to MCS.
However, the MCS doesn't have all the same information that the CSR approver has. It uses the contents of the CSR itself as part of the validation, so we can't copy it easily, unless the MCS request includes more info than it does today.
For reference, see the README in https://github.com/openshift/cluster-machine-approver/ and most of the important logic is in csr_check.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, the MCS doesn't have all the same information that the CSR approver has. It uses the contents of the CSR itself as part of the validation, so we can't copy it easily, unless the MCS request includes more info than it does today.
But that other information doesn't matter. If the MCS can see that a request is coming from an IP that isn't the IP of a legitimate "proto-node", then it knows that the requester cannot possibly construct an approvable CSR (that points to itself), and so there is no reason for the MCS to talk to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the source IP good enough? Let's say untrusted-pod-A on node1 is trying to reach the MCS on node2. Won't that request be SNAT'd on node1, so node2 is going to see node1's IP address and consider it legitimate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That doesn't make sense; if the certificate approver can distinguish them, then the MCS ought to be able to distinguish them in the same way.
The client for the MCS is Ignition, which has significantly less functionality than the client for the CSR endpoint, the kubelet. Eventually, the approver could be enhanced (by us or the customer) to make use of AWS signed requests or GCP instance identity verification. Keeping the MCS out of the authentication path allows the customer to decide how much security they need for their deployment. If we did everything in the MCS, they wouldn't easily be able to swap out the implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the source IP good enough? Let's say untrusted-pod-A on node1 is trying to reach the MCS on node2. Won't that request be SNAT'd on node1, so node2 is going to see node1's IP address and consider it legitimate?
node1's address is not legitimate, because node1 is an actual node, not a "proto-node". An actual functioning node has no reason to talk to the MCS (as I understand it); an IP should only be contacting the MCS if there is a Machine
that claims that IP but no Node
that claims that IP; ie, if the cluster is expecting a node to come up with that IP, but it knows that the node hasn't actually come up yet.
(Even if node1 gets accidentally nuked and you want to recreate it, then (AIUI), you would have to delete the old node1 Node/CSR first, since otherwise the new node1's new CSR would fail to match and the approver would consider it to be an imposter. So even for reinstalls, the rule that "only IPs that correspond to a Machine but not a Node should talk to the MCS" should work.)
Eventually, the approver could be enhanced (by us or the customer) to make use of AWS signed requests or GCP instance identity verification.
Do you mean adding that to the existing approval mechanism, or removing the existing approval mechanism entirely and switching it to be based entirely on cloud APIs? If you just mean adding, then yes, this would let the CSR approver distinguish "this is actually a real node VM" from "this is someone pretending to be a new node VM". But as long as we still have Machine
objects, and the CSR approver still requires that Node
s come up with IPs that match their corresponding Machine
s, then the MCS can check those IPs itself too, and refuse to talk to any IP that isn't a valid proto-node IP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An actual functioning node has no reason to talk to the MCS (as I understand it).
Correct, unless an existing node is re-provisioning for some reason (e.g. disk failure), and the Node object still exists. (Is this a case we need to handle? Will the new CSR even be signed?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...an IP should only be contacting the MCS if there is a Machine that claims that IP but no Node that claims that IP...
Folks can also provision their own instances, without going through the machine API. In that case, they're the hook to handle CSR signing for those machines. With the "MCS includes CSR-approver-style-verification" approach, those folks would need a hook so they could tell the MCS, yes, really serve requests to my new proto-node I manually provisioned".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a no-brainer to include node IP validation in the MCS - that's what my original PR has in fact here: openshift/machine-config-operator#784
But that said, we also need to care about UPI on-premise installations where the network may be exposed to machines outside the cluster. To me, this is where even a non-rotating auth token is extremely helpful - it provides a really strong first layer of defense.
## Alternatives | ||
|
||
- Do nothing and just remove the network policy. This would expose the contents of (what would be) post-node-approved Machine Configs to everything that has network access to the control plane. | ||
- Add support for an authentication token to the Machine Config Server (https://github.com/openshift/enhancements/pull/443). The real downside to this approach is that it makes troubleshooting more difficult, as it would likely need to happen in the initramfs of RHCOS. This is an extremely limited environment and requires console or serial access, which isn't available in a number of environments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the experience different here versus when a user provides an invalid pull-secret at install?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the difference between screwing up
`https://some-ip/some/value/ignition/master
and
https://some-ip/some/value/ignition/master?token=xxxxxxxxxxxxxx
You either cut and paste the secret correctly or not. At some point we also have to verify the remote ip (if you ignore validation errors on some-ip you are going to get MITM'd anyway, so security is not your problem).
The person provisioning a box is ultimately responsible for establishing a verifiable, secure, and verified channel to the thing that will establish the basic trust foundation for the rest of that box.
Why is troubleshooting the problem? Is that the only argument against requiring a shared secret between provisioner and provisionee to establish mutual verification?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@derekwaynecarr if the user provides an invalid pull-secret at install, the bootstrap machine will spin, trying to pull the release image. Eventually, the install will time out and the installer will collect logs. From the perspective of a single compute node, the two scenarios are indistinguishable, but I would expect an admin to look at more than just that. For example, if they SSH into the bootstrap node and run the recommended journalctl
invocation, it will be immediately obvious that the pull secret is incorrect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@smarterclayton https://some-ip/some/value/ignition/master
will never change. https://some-ip/some/value/ignition/master?token=xxxxxxxxxxxxxx
will change regularly (specifically, the token). The concern around troubleshooting is the fact that it will invariably be required more often when the Ignition URL is being changed periodically. There is a deeper question in there as well though. How will this token be rotated? This is easy in deployments which make use of the Machine API Operator, but everywhere else, this will have to be a manual process. And as we all know, if there is a periodic, manual task required for security, we can just assume it's not going to happen. I truly believe that the auth token will hurt the security of our clusters, all things considered.
@crawford: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
- The pull secret is not considered sensitive and being accessible to all applications in the cluster is acceptable. | ||
- The contents of Machine Configs are potentially sensitive, especially if the Machine Configs are customer-provided. | ||
|
||
If it can be ensured that no one is able to access sensitive data through the Machine Config Server, the port restrictions can be safely removed. In order to achieve this, Machine Configs can be split into two groups: pre-node-approved configs and post-node-approved configs. The pre-node-approved configs are ones which do not contain sensitive information and contain just enough to configure new nodes to submit a certificate signing request (CSR) to the API server. The post-node-approved configs contain everything else, including the overwhelming majority of customer-provided Machine Configs. When a new machine is created, it fetches its Ignition config from the Machine Config Server, which only serves the pre-node-approved configs. At this point, the machine has enough information to submit a CSR, but nothing sensitive (note that for the purposes of this proposal, the CSR bootstrap token and pull secret are not considered sensitive). Once the CSR has been approved and a kubeconfig returned, the Machine Config Daemon is able to directly enumerate Machine Configs, including the post-node-approved ones, so it then applies any remaining configuration to the machine and reboots if necessary. At this point, the machine is fully configured. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Today we carry effectively 1.63 implementations of Ignition - one in the upstream project that runs in the initramfs, and 63% of Ignition in the MCD that runs in the real root. If do this, then we'd hit all limitations of the 63% MCD-Ignition implementation upfront. For example, while this isn't important to OpenShift, today the MCO doesn't support adding/changing users.
We may hit edge cases around e.g. parent directory creation, file write ordering etc. Those are all bugs that should be fixed of course, but we also tend to not hit them today because use of "day 2" MCs is so limited.
If we go this route it may be worth a spike around how we can have the MCO actually reuse Ignition upstream code directly for this.
|
||
## Proposal | ||
|
||
If it can be ensured that no one is able to access sensitive data through the Machine Config Server, the port restrictions can be safely removed. In order to achieve this, Machine Configs can be split into two groups: pre-node-approved configs and post-node-approved configs. The pre-node-approved configs are ones which do not contain sensitive information and contain just enough to configure new nodes to submit a certificate signing request (CSR) to the API server. The post-node-approved configs contain everything else, including the overwhelming majority of customer-provided Machine Configs. When a new machine is created, it fetches its Ignition config from the Machine Config Server, which only serves the pre-node-approved configs. At this point, the machine has enough information to submit a CSR, but nothing sensitive (note that for the purposes of this proposal, the CSR bootstrap token and pull secret are not considered sensitive). Once the CSR has been approved and a kubeconfig returned, the Machine Config Daemon is able to directly enumerate Machine Configs, including the post-node-approved ones, so it then applies any remaining configuration to the machine and reboots if necessary. At this point, the machine is fully configured. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a no-brainer to include node IP validation in the MCS - that's what my original PR has in fact here: openshift/machine-config-operator#784
But that said, we also need to care about UPI on-premise installations where the network may be exposed to machines outside the cluster. To me, this is where even a non-rotating auth token is extremely helpful - it provides a really strong first layer of defense.
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle stale |
Stale issues rot after 30d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle rotten |
Rotten issues close after 30d of inactivity. Reopen the issue by commenting /close |
@openshift-bot: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
We're getting heat for blocking arbitrary ports on the pod network (TCP 22623/22624). This is one potential path forward and is an alternative to #443.