clusterapi: refresh kubeconfig bearer tokens for management and workload kubeconfigs dynamically #5951
Conversation
/retest
@cnmcavoy: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message. In response to this: /retest
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
force-pushed from 911efe7 to 17e36cf
force-pushed from 17e36cf to 2e67057
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: cnmcavoy. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
i'm still keen to see this get included, @cnmcavoy have you done some local testing of the refresh?
Indeed has been using a fork of the CA with this patch included for a month. Initially this PR only covered the kubeconfig in the clusterapi provider package, but that led us to discover there is a third kubeconfig in use for leader election. I updated the PR earlier to address that issue and we haven't discovered anything new since.
thanks for the update @cnmcavoy, i'm inclined to merge this based on your comments but i will bring up the PR at the CAPI meeting today just to see if we can get an extra review or two. i will plan to merge this on friday if there is no further activity.
/assign
@cnmcavoy code seems sane to me, is there an appropriate slot in the Cluster API provider documentation that we can use to describe this new feature? I'd be happy to build a cluster-autoscaler image from this branch and run through that test flow and report back (which would have the benefit of testing both the functionality and the documentation 😀).
…oad kubeconfigs dynamically (force-pushed from 2e67057 to 148c69f)
@jackfrancis sure! I added a note after the section describing the various configuration modes between management and workload clusters.
Why is this not using something like HTTPWrappersForConfig which invokes NewBearerAuthWithRefreshRoundTripper to handle this flow? You should not be writing any custom code to handle automatic token refresh when client-go already does it for you. |
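For reference, here is a minimal sketch (not this PR's code) of the client-go approach being suggested: with a transport.Config carrying a BearerTokenFile, HTTPWrappersForConfig layers on a bearer-auth round tripper that re-reads the token file, so no hand-written refresh logic is needed. The file path below is a hypothetical placeholder.

package main

import (
	"net/http"

	"k8s.io/client-go/transport"
)

// wrapWithRefreshingBearerAuth wraps a base RoundTripper so each request is
// authenticated with a bearer token that client-go reloads from a file.
func wrapWithRefreshingBearerAuth(base http.RoundTripper) (http.RoundTripper, error) {
	cfg := &transport.Config{
		// A file containing only the token; client-go re-reads it when the
		// cached token nears expiry (hypothetical path).
		BearerTokenFile: "/var/run/secrets/capi/token",
	}
	// With BearerTokenFile set, HTTPWrappersForConfig uses
	// NewBearerAuthWithRefreshRoundTripper under the hood.
	return transport.HTTPWrappersForConfig(cfg, base)
}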
kubeConfig, err := kube_client_cmd.BuildConfigFromFlags("", rt.kubeconfigPath)
if err != nil {
	return nil, fmt.Errorf("cannot build kube cluster config: %w", err)
}
rt.bearerToken = kubeConfig.BearerToken
Loading the entire kubeconfig but only honoring changes to the bearer token doesn't really make sense. The bearer token should be in a distinct file that is reloaded; the rest of the contents should remain static.
Cluster Autoscaler can be configured in a mode where it is running in cluster "1", but the cluster and pods it needs to scale for and manage are in cluster "3", and the cloud configuration and node groups to scale are in cluster "2". In this scenario, there is no service account bearer token except for cluster "1", where it runs, because these are separate clusters.
The file we are loading is created by the CAPA awscluster controller, carries only a bearer token as its credential, and is shaped like this:
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0t...
    server: https://XXXXX.us-east-2.eks.amazonaws.com
  name: cluster-name
contexts:
- context:
    cluster: cluster-3
    user: cluster-3-capi-admin
  name: cluster-3-capi-admin@cluster-3
current-context: cluster-3-capi-admin@cluster-3
kind: Config
preferences: {}
users:
- name: cluster-3-capi-admin
  user:
    token: k8s-aws-v1...
Are you suggesting that CAPA should encode a second secret value with just the token value contents above, and have the Cluster Autoscaler reload that contents, to simulate the behavior of a file-mounted service account bearer token? Or something else?
Right, if you change the YAML to:
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0t...
    server: https://XXXXX.us-east-2.eks.amazonaws.com
  name: cluster-name
contexts:
- context:
    cluster: cluster-3
    user: cluster-3-capi-admin
  name: cluster-3-capi-admin@cluster-3
current-context: cluster-3-capi-admin@cluster-3
kind: Config
preferences: {}
users:
- name: cluster-3-capi-admin
  user:
    tokenFile: location_of_file
and have the contents of location_of_file be:
k8s-aws-v1...
then client-go will handle the contents of location_of_file changing for you.
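As a rough illustration of that point (a sketch, not code from this PR): building the client from such a kubeconfig with the stock clientcmd helpers is all that is needed, because the tokenFile reference ends up in rest.Config.BearerTokenFile and client-go's transport reloads it as the file changes.

package main

import (
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// newWorkloadClient builds a clientset from a kubeconfig whose user entry uses
// tokenFile; client-go re-reads that file as it changes, so token rotation
// requires no custom refresh code here.
func newWorkloadClient(kubeconfigPath string) (*kubernetes.Clientset, error) {
	cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfigPath)
	if err != nil {
		return nil, err
	}
	return kubernetes.NewForConfig(cfg)
}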
	return rt.rt.RoundTrip(req)
}

var _ http.RoundTripper = &bearerAuthRoundTripper{}
You need to implement utilnet.RoundTripperWrapper
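For illustration, a minimal sketch of what satisfying utilnet.RoundTripperWrapper could look like on a custom round tripper such as the PR's bearerAuthRoundTripper (the struct and fields here are simplified placeholders, not the PR's exact code):

package main

import (
	"net/http"

	utilnet "k8s.io/apimachinery/pkg/util/net"
)

type bearerAuthRoundTripper struct {
	bearerToken string
	rt          http.RoundTripper
}

// RoundTrip adds the bearer token to a copy of the request and delegates to
// the wrapped transport.
func (r *bearerAuthRoundTripper) RoundTrip(req *http.Request) (*http.Response, error) {
	req = utilnet.CloneRequest(req)
	req.Header.Set("Authorization", "Bearer "+r.bearerToken)
	return r.rt.RoundTrip(req)
}

// WrappedRoundTripper exposes the underlying transport so callers that unwrap
// round-tripper chains (e.g. for metrics or debugging) can reach it; this is
// what makes the type satisfy utilnet.RoundTripperWrapper.
func (r *bearerAuthRoundTripper) WrappedRoundTripper() http.RoundTripper { return r.rt }

var _ utilnet.RoundTripperWrapper = &bearerAuthRoundTripper{}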
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Is this still needed given the Kubernetes client-go library has been updated to a version that should fix this?
@RichardoC that link to the aws docs is only about eks clusters though, is there a link to an issue in client-go that might talk about this? the capi provider doesn't necessarily use eks, so we would still want a refresh here (as i understand it).
This functionality is upstream k8s behaviour; I just couldn't find the link earlier: https://kubernetes.io/docs/reference/access-authn-authz/service-accounts-admin/#auto-generated-legacy-serviceaccount-token-clean-up
@RichardoC I think there is some confusion here: this change was about kubeconfigs with an embedded service account token expiring. This differs from the built-in client-go handling of service account tokens, where you are correct, client-go does automatically refresh. So this is still an active issue for CAPI users with EKS clusters. However, I went to a SIG API Machinery meeting last year to discuss better approaches to handling this, and they suggested changes downstream, in how the CAPI project manages its kubeconfigs, to avoid this type of issue entirely. So I'm pursuing that over in kubernetes-sigs/cluster-api-provider-aws#4648. If all goes well, there will likely need to be some small autoscaler changes from that, but only for its Helm chart / docs. I'll open a new PR for that work.
@cnmcavoy thanks for the update and continuing to follow this issue. happy to hear about the status, and thanks for closing this.
What type of PR is this?
/kind bug
What this PR does / why we need it:
Continuously update the bearer token provided by a kube config file for the clusterapi provider.
Which issue(s) this PR fixes:
Fixes #4784
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: