Adopt existing AKS clusters #1173
Comments
Current blockers for this:
I had my public key configured here, but the same would probably happen even without it, as the defaulting would set a newly generated key if one was not set already. This should be solved by checking whether a key is already set before defaulting. Possibly, if the AKS cluster is initially created with an SSH key, and that same SSH key is set in the CR, adoption would work.
cc @alexeldeib
About SSH keys, in theory it might work if the AKS cluster was created with a single SSH key, and that same key is set in the CR. I haven't made it work yet; I'll update here after some more testing (mid next week, as I'm AFK at the beginning of the week).
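To make that point concrete, here is a minimal sketch of what the CR would look like (all names, locations, and key values are placeholders, and the exact API version may differ): the `AzureManagedControlPlane` would need to carry the same public key the AKS cluster was created with, rather than letting the webhook default a new one.

```yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureManagedControlPlane
metadata:
  name: my-existing-aks              # placeholder; must match the existing AKS cluster
spec:
  location: westeurope               # placeholder; must match the existing cluster
  resourceGroupName: my-existing-rg  # placeholder; must match the existing resource group
  version: v1.27.3                   # placeholder; must match the existing Kubernetes version
  # AKS rejects changes to linuxProfile.ssh.publicKeys.keyData after creation,
  # so this must be the exact key the cluster was originally created with,
  # not a newly generated default.
  sshPublicKey: "ssh-rsa AAAAB3Nz... user@example"
```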
Some concrete actions here would be to add an e2e test for adopting an existing AKS cluster with features that are currently supported in CAPZ. We can then iterate from there by adding more AKS features to CAPZ (I would happily submit a proposal for what to add next, or pick up some existing issue) and updating the e2e test to cover those. (Before any of these, we probably need at least some basic e2e test for AKS cluster creation; as I understood from Cecile, there is none atm.)
somehow I missed this one. looks like #1175 fixed part of the blocker. re: SSH keys, it should work without change. i'd be interested to know if there are any blockers here for the adoption piece. we definitely want e2e either way.
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
/remove-lifecycle stale
we've added e2e since I made the previous comment. If someone could manually test this (easy), that would be enough to close this out IMO. we could also add a separate e2e for a pre-created AKS cluster, but I'm not sure how much value that delivers relative to the work, vs. polishing up other parts of the implementation.
I tested this some time ago, I believe when I opened this issue, and I got it working after fixing #1175.
Having an e2e test that ensures it keeps working would be useful (and also some docs describing how it works), since from what I saw even small, subtle changes can break this; e.g. managed cluster creation works the same with or without #1174 (it does not need the default field value to be set), but it was required for adopting an existing managed cluster.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the standard lifecycle rules. You can mark this issue as fresh with /remove-lifecycle stale or close it with /close.
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the standard lifecycle rules. You can mark this issue as fresh with /remove-lifecycle rotten or close it with /close.
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the standard lifecycle rules. You can reopen this issue with /reopen or mark it as fresh with /remove-lifecycle rotten.
Please send feedback to sig-contributor-experience at kubernetes/community. /close
@k8s-triage-robot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/reopen
@CecileRobertMichon: Reopened this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the standard lifecycle rules. You can reopen this issue with /reopen or mark it as fresh with /remove-lifecycle rotten.
Please send feedback to sig-contributor-experience at kubernetes/community. /close
/remove-lifecycle rotten
+1 Any news on this topic? Are there any ongoing activities?
@kisahm I don't believe anyone is currently working on this, but @mtougeron mentioned this topic a while back at CAPI office hours.
I saw a note above that said he had this working. Is there any way to find out what was done to get it working?
I'm writing this up today. I should have something to share with you soon.
It looks like in 1.9.2 I can adopt a CAPI-created cluster by just reapplying the generated config files, and it figures it all out. I am going to test an az CLI-created cluster as well.
There are ways to do it; I've written up my notes here, but there are some gotchas in the process depending on how you created the original cluster. I'm working with @dtzar to figure out whether we can adjust capz or call those out better. Some of them might be resolved when capz is refactored to use ASO as well.
One of the challenges here is that if you generate or create a CAPI/CAPZ YAML file for a cluster while some features do not yet exist in CAPZ, or if a mutating webhook changes that YAML when you submit it to the cluster, then the source of truth (the CAPZ YAML) no longer matches what is deployed. See kubernetes-sigs/cluster-api#8668. The big question is: can we have an officially supported feature which allows this? We'd need to work through the potential dangers/pitfalls, e.g. a CAPZ YAML feature is now supported for AKS, but it does not match the configuration of the AKS feature which is already deployed on the cluster. How does this get handled? On a related tangent, this would be possible today in pure ASO using its asoctl command.
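As a purely hypothetical illustration of that gap (all names and values below are invented): compare what a user might apply with what ends up stored after the mutating webhook runs, and note that neither captures AKS settings configured outside CAPZ.

```yaml
# What the user applies (hypothetical, minimal):
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureManagedControlPlane
metadata:
  name: my-existing-aks
spec:
  version: v1.27.3
---
# What may end up stored after the mutating webhook fills in defaults
# (illustrative only; exact defaults depend on the CAPZ version):
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureManagedControlPlane
metadata:
  name: my-existing-aks
spec:
  version: v1.27.3
  sshPublicKey: "ssh-rsa AAAAB3Nz..."  # generated default; may not match the live cluster
# In addition, any AKS feature enabled outside CAPZ (e.g. via `az aks update`)
# appears in neither document, so the YAML is not the complete source of truth.
```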
Posting this related project for reference: https://github.com/tobiasgiese/cluster-api-migration
/assign |
I followed the steps @mtougeron linked above, cheating a bit: I created a CAPZ AKS cluster, orphaned it (removed it from CAPZ management but left the resources running in Azure), and reapplied the same templates, and the cluster gets adopted fine AFAICT. I'll work on documenting this and adding an e2e test, but I'm not anticipating needing any other changes to enable this.
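For reference, here is a rough sketch of the kind of templates being reapplied, assuming the v1beta1 API. All names, locations, versions, and SKUs are placeholders and would need to match the existing AKS cluster and its node pools exactly; this is just an illustration, not an official adoption procedure.

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: my-existing-aks                  # placeholder; must match the AKS cluster name
spec:
  controlPlaneRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: AzureManagedControlPlane
    name: my-existing-aks
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: AzureManagedCluster
    name: my-existing-aks
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureManagedCluster
metadata:
  name: my-existing-aks
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureManagedControlPlane
metadata:
  name: my-existing-aks
spec:
  location: westeurope                   # must match the existing cluster
  resourceGroupName: my-existing-rg      # must match the existing resource group
  subscriptionID: 00000000-0000-0000-0000-000000000000
  version: v1.27.3                       # must match the existing Kubernetes version
  sshPublicKey: "ssh-rsa AAAAB3Nz..."    # the key the cluster was created with
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachinePool
metadata:
  name: my-existing-aks-pool0
spec:
  clusterName: my-existing-aks
  replicas: 3                            # must match the existing node count
  template:
    spec:
      clusterName: my-existing-aks
      bootstrap:
        dataSecretName: ""               # AKS manages bootstrap; no bootstrap config needed
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: AzureManagedMachinePool
        name: my-existing-aks-pool0
      version: v1.27.3
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureManagedMachinePool
metadata:
  name: my-existing-aks-pool0
spec:
  mode: System                           # the cluster needs at least one System pool
  name: pool0                            # must match the existing AKS agent pool name
  sku: Standard_D2s_v3                   # must match the existing VM size
```

Presumably the reason the orphan-and-reapply test works is that every identifying field here (cluster name, resource group, node pool names) already matches what exists in Azure, so reconciliation finds the existing resources instead of creating new ones.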
@nojnhuh There is a case where there might be a mutating webhook though, correct? So if that was in play this would fail, no?
@dtzar AKS features that are unknown to CAPZ won't ultimately get set in the ASO YAML, so those won't be disabled. And if we make more features available in CAPZ and ensure they default to null, then a CAPZ upgrade managing an adopted cluster shouldn't claim control over those fields automatically. The mutating webhook is always in play, but if you craft the CAPZ YAML such that the defaulting either does nothing or does what you want, then I don't see how that would cause any issues.
Goals
Non-Goals/Future Work
Adopting an existing non-AKS cluster (`AzureCluster` and other related objects), by creating required CAPI and CAPZ objects that describe it.
User Story
As a user of Cluster API Azure, I would like to start managing my existing AKS clusters in a CAPZ management cluster, by creating CAPI and CAPZ objects that describe my existing clusters, so that I don't have to create new clusters and migrate my workloads, as that could cause significant downtime or might even be impossible.
Detailed Description
For an existing AKS cluster, preferably one using only features that are supported by CAPZ, I want to create CAPI (`Cluster`, `MachinePool`) and CAPZ (`AzureManagedCluster`, `AzureManagedControlPlane`, `AzureManagedMachinePool`) objects that I will submit to the management cluster. The CAPZ controller will start reconciling the CAPZ objects, gracefully handle the existence of the already created AKS cluster, and treat it as if it had been created by CAPZ.

It would be useful to have an e2e test for this scenario.
Contract changes [optional]
It is possible that some changes in CAPZ managed cluster object defaulting and validation would be required. For example, setting a default SSH key in `AzureManagedControlPlane` is causing issues in my current AKS cluster adoption experiments, as the Azure API returns the error `Changing property 'linuxProfile.ssh.publicKeys.keyData' is not allowed.` I will create a separate issue for this.
Data model changes [optional]
It is possible that some changes in managed cluster CRDs would be required (e.g. to solve the above issue with SSH keys, or in other similar cases).
/kind proposal