Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kubelet serving certificate and metrics server addon #10022

Merged
merged 5 commits into from
Oct 9, 2020

Conversation

olemarkus
Copy link
Member

fixes #6879

This PR adds metrics-server as a configurable addon deprecating the manifest-based one.
More importantly, it uses kops-controller to provision the kubelet serving certificate and makes kubelet use it. This lets us avoid using the --kubelet-insecure-tls flag.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Oct 7, 2020
@olemarkus
Copy link
Member Author

/test pull-kops-e2e-cni-flannel

Copy link
Contributor

@agilgur5 agilgur5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this @olemarkus ! Since metrics-server is basically a core component necessary for HPA / VPA (and with little in the form of competition), this resolves a big security risk in getting it set-up, which, as you mentioned in #6879 (comment), may not be truly possible to resolve without this change.

I just had two questions and two comments -- don't remotely have the depth to actually review this. Apologies if these are an inconvenience to answer given my lack of depth!

  1. Am I correct in understanding that this makes it so each kubelet uses the kops CA to provision its serving cert during node bootstrap (instead of self-signing as before)?
    I'm curious if there are more potential use-cases for this type of workflow for other core services that require certs. E.g. in one of my comments I mention the use-case of metrics-server's own serving cert.

  2. Based on [METRICS-SERVER] New kubelet-certificate-authority flag available (v0.3.2) #6879 (comment), am I correct in understanding that the configurable add-on change is not necessary to resolve this problem? Like, based on your comment, I got the impression that the manifest or any other type of metrics-server deploy (I'm using the Helm Chart) should no longer need --kublet-insecure-tls with the cert piece alone, right?
    Which would mean the configurable version just simplifies deploy, but does not resolve the security risk in isolation

namespace: kube-system
group: metrics.k8s.io
version: v1beta1
insecureSkipTLSVerify: true
Copy link
Contributor

@agilgur5 agilgur5 Oct 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is for whether apiserver verifies metrics-server's certificate (another, different cert, but also insecure). According to kubernetes-sigs/metrics-server#544 / kubernetes-sigs/metrics-server#545 (particularly interesting is kubernetes-sigs/metrics-server#545 (comment)), this is done for similar non-trivial cert provisioning requirements.

Would be great if kops could handle that as well since it has insight into both sides. I think that would also make the configurable variant provide more value as that may not be possible without kops or would at least be significantly easier with it doing the provisioning work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Providing a kops-signed certificate to metrics-server is a whole different story.
With kubelet and other system components, what we really validate is the nodes themselves. We don't have any way of validating pods running in the node.
One way of solving this could be that kops-controller knows that we are deploying metrics-server and provisions secrets for this, but that would be a much larger change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One way of solving this could be that kops-controller knows that we are deploying metrics-server and provisions secrets for this, but that would be a much larger change.

Yea that's exactly what I was thinking. Thought it wouldn't be much larger than the current change, but I don't have the depth to know what else that would involve

@@ -1,5 +1,7 @@
# Kubernetes Metrics Server

**This addon is deprecated. Set `spec.metricsServer.enabled: true` instead**
Copy link
Contributor

@agilgur5 agilgur5 Oct 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be good to also update the manifest's line anyway?

- --kubelet-insecure-tls

Might need to add a new manifest though for a new version, not sure if this is getting backported

I would think that may make it easier for gradual transitions or just understanding that there's a difference between the two variants -- an important security one at that. Otherwise may make it seem like they're otherwise equivalent. I was using the metrics-server Helm Chart instead of the add-on as well, but looking at the add-on made me realize they both shared a security risk.
Alternatively, could add a note in the docs here that the deprecated manifest variant is less secure due to --kubelet-insecure-tls usage.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a new feature, so it won't be backported. Adding a new manifest would also only be for 1.19+, so that kind of defeats the purpose of the deprecation notice.

But what I can do is add a note to the release notes. Should have done so anyway.

Copy link
Contributor

@agilgur5 agilgur5 Apr 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, makes sense. Appreciate all the explanations! I was looking back into this due to some security settings (kube-apiserver not setting --kubelet-certificate-authority) and the release notes for #10836 , and it seems like I forgot to ever respond here.

The release note on metrics-server doesn't say that the new add-on is more secure -- which is a really key piece. The wording makes it sound like the add-on is just done via the new approach now, so other than deprecation it doesn't motivate one to move to the new method as the security piece isn't explicitly mentioned. I was previously using metrics-server directly via its Helm chart, so I also wouldn't know that I could now remove --kubelet-insecure-tls from my config if I didn't dive into and look through the issues / PRs another time.
I think it would be useful to mention that it is more secure than the now deprecated variant and that users of direct metrics-server should remove the --kubelet-insecure-tls flag when upgrading to kOps 1.19+

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestions! Thanks! Would you be able to add a PR for this?

Copy link
Contributor

@agilgur5 agilgur5 Apr 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It only took me 2 years to finally get the PR out, but finally made it: #15327 😅

Sorry for the delay -- at least I did get several other PRs I had in my backlog out as well!

cmd/kops-controller/pkg/server/server.go Show resolved Hide resolved
nodeup/pkg/model/kubelet.go Outdated Show resolved Hide resolved
nodeup/pkg/model/kubelet.go Show resolved Hide resolved
@johngmyers
Copy link
Member

I believe the 1.19 manifest will still need to set --kubelet-insecure-tls on cloud providers that don't support bootstrap using kops-controller.

There's still insecureSkipTLSVerify: true on the APIService, but I suppose that's a different issue.

@olemarkus
Copy link
Member Author

I believe the 1.19 manifest will still need to set --kubelet-insecure-tls on cloud providers that don't support bootstrap using kops-controller.

That is true. Using UseKopsControllerForNodeBootstrap instead of k8s version now. Cleaner than duplicating the whole manifest anyway.

There's still insecureSkipTLSVerify: true on the APIService, but I suppose that's a different issue.

Yep. Need a different mechanism for that. Maybe kops-controller could create TLS secrets for known addons. It may be more in the cluster-addons space too.

@hakman
Copy link
Member

hakman commented Oct 9, 2020

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 9, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hakman, olemarkus

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 9, 2020
@hakman
Copy link
Member

hakman commented Oct 9, 2020

/test all

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/addons area/api area/documentation area/kops-controller area/nodeup cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[METRICS-SERVER] New kubelet-certificate-authority flag available (v0.3.2)
5 participants