Duplicate resources are created when template is used both as controlPlane ref and infrastructure ref in ClusterClass #6126
cc @sbueringer |
It makes sense that the issue occurs, as we create separate instances of the control plane and the infrastructure cluster and don't handle the case where both could/should be the same object. Assuming it's a supported case, we have to adjust our "core" topology reconciler to handle it, and I think it also has an impact on patches. |
@yastij handling managed providers is another area where we have inconsistency across providers:
Personally, I prefer the second approach, because I find it less confusing, more composable, and with a better separation of concerns. Along the same lines, we should also spread awareness about the ClusterClass requirements for supporting cluster upgrades, which are currently not implemented by any managed provider. |
cc @richardcase for EKS |
I'm not surprised this turned out to be an issue. We originally had 2 different kinds. I think it's valid for managed services to have the Cluster/ControlPlane as the same resource, but I also understand the need for consistency between providers and if we had to re-instate ..... More generally, I'm not sure that we have ever thought about what CAPI looks like for managed Kubernetes services (please correct me if I'm wrong here). As a result, providers have made their own decisions and have tried to fit it into the current resource kinds/reconciliation process/provider types. With the CAPG managed implementation starting soon, it's probably something we need to discuss so we can decide what a managed service looks like in CAPI. |
@sbueringer @fabriziopandini @pydctw - I'd be happy to help/facilitate any discussions about "what does managed Kubernetes in CAPI look like" if needed. |
It will be great if we can discuss it during the office hours tomorrow, as I am blocked from progressing on kubernetes-sigs/cluster-api-provider-aws#3166. Will add it to the agenda. |
From the office hours it was mentioned that there is an open PR in CAPZ to also go down to 1 type ( |
With managed Kubernetes services, the lines between the cluster infrastructure and the control plane are blurred. A few solutions that have been discussed:
Sounds like we need to get a proposal/doc together that covers the various potential solutions and then decide on a consistent way forward for any provider that has a managed Kubernetes service. @fabriziopandini @pydctw - I can make a start on a doc tomorrow and we can all collaborate. How does that sound? |
Sounds great! Short comment about:
We are currently using the control plane object to trigger/monitor rollouts of the control plane to new Kubernetes versions. Assuming that this should be possible with managed clusters too, it might be easier to go down the road of a ControlPlane without machines/replicas (but yeah, it doesn't really fit conceptually). But I'm aware that, on the other side, Clusters without a control plane are allowed by the Cluster resource while Clusters without infra clusters are not. |
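For illustration, here is a minimal sketch of such a machine-less managed control plane object. It follows the fields of the CAPI control plane contract as I understand them; the kind and exact field placement are placeholders rather than an actual provider API:

```yaml
# Hypothetical machine-less managed control plane resource (not a real CRD).
# The topology reconciler bumps spec.version to trigger an upgrade and watches
# status.version / status.ready to monitor the rollout; the replica-related
# contract fields are simply never reported because there are no machines.
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: ExampleManagedControlPlane
metadata:
  name: my-cluster-control-plane
  namespace: default
spec:
  version: v1.22.6      # desired Kubernetes version
status:
  version: v1.21.9      # version currently reported by the managed service
  ready: true           # API server is reachable
  initialized: true     # control plane has been initialized
```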
In the future we can also think that the infrastructure might go away entirely and instead become something else. Truthfully, today the InfraCluster object is a stepping stone: we do need the infrastructure to be set up and configured somehow, but most users might want to have something else manage that (like Terraform, Crossplane, etc.) and inform Cluster API where to get those values. |
/milestone v1.2 |
Thinking about it a bit more, we should probably meet and define clear responsibilities for each reference. Infrastructure, control plane, and other references should all have clear delineations. If we think about the responsibilities of an infrastructure provider and its InfraCluster object, we can assume that this object provides the infrastructure for the Kubernetes cluster, which can include a number of things and today also includes the load balancer. On the other side, the control plane reference is in charge of managing the Kubernetes control plane; the infrastructure should be left to the other reference. The challenge I'm seeing is that we've mixed some of these responsibilities when it comes to managed cloud Kubernetes services. Let's reconvene and chat more about it; we should meet with at least one person from each provider interested in this discussion and push for consistency and separation of responsibilities. cc @yastij @fabriziopandini @sedefsavas @richardcase @CecileRobertMichon |
Meeting to agree on the responsibilities would be great 👍 Also agree that we need to be clear on the delineations, and this is the issue we are facing with managed Kubernetes services..... the current delineations don't naturally fit.
This is a good example of why the current responsibilities of a control plane & infrastructure provider don't fit well for managed services like EKS, and why we have ended up where we are. When you create an EKS control plane in AWS (which we do via a CAPI control plane provider), this creates a load balancer automatically for the API server... this is at odds with the current assumption that the infra provider creates the load balancer and reports the control plane endpoint back to CAPI. So revisiting the provider types and responsibilities in the context of managed Kubernetes services would be great. When shall we get together? Perhaps a doodle is needed? |
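To make the mismatch concrete, here is a rough sketch (hostnames and exact field placement are illustrative, not the precise CAPA schemas) of where the API server endpoint ends up in the two models:

```yaml
# Conventional self-managed split: the infra provider owns the load balancer
# and reports the endpoint on the InfraCluster, per the infra cluster contract.
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AWSCluster
metadata:
  name: my-cluster
spec:
  controlPlaneEndpoint:   # filled in by the infra provider once the LB exists
    host: my-cluster-apiserver-1234567890.eu-west-1.elb.amazonaws.com
    port: 6443
---
# Managed case (EKS): creating the control plane also creates the endpoint,
# so the same information naturally surfaces on the control plane object.
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: AWSManagedControlPlane
metadata:
  name: my-cluster-control-plane
spec:
  controlPlaneEndpoint:
    host: ABCDEF1234567890ABCDEF.gr7.eu-west-1.eks.amazonaws.com
    port: 443
```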
Presented Managed Kubernetes in CAPI proposal during CAPI office hrs on 4/20/22 |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
/remove-lifecycle stale |
/area topology |
/close |
@fabriziopandini: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
This issue is related to #6988 for EKS ClusterClass support, not server side apply. Will keep it open until the proposal is merged. /reopen |
@pydctw: Reopened this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/remove-lifecycle stale |
/triage accepted |
@pydctw How do we want to treat this core CAPI issue now that the proposal is merged? I assume we have corresponding issues in CAPA so it's fine to close this issue here? |
What steps did you take and what happened:
[A clear and concise description on how to REPRODUCE the bug.]
While doing a PoC of creating an EKS cluster using ClusterClass (CAPI + CAPA), I noticed that two AWSManagedControlPlane (awsmcp) objects are created from the AWSManagedControlPlaneTemplate (awsmcpt) when there should be only one awsmcp. For context, EKS uses an AWS-managed control plane, so AWSManagedControlPlane in CAPA is the counterpart of KubeadmControlPlane in CAPI.
Further debugging suggests that this is due to the fact that the AWSManagedControlPlaneTemplate is used both as `spec.controlPlane.ref` and `spec.infrastructure.ref` in the ClusterClass, and the CAPI controller clones it twice. FYI, there is no AWSManagedCluster type in CAPA. ClusterClass used for EKS:
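The ClusterClass YAML attached to the original issue is not reproduced here; the sketch below is an illustrative reconstruction of the shape being described, with placeholder names and an assumed apiVersion for the CAPA template:

```yaml
# Illustrative only: the same AWSManagedControlPlaneTemplate is referenced
# from both spec.controlPlane.ref and spec.infrastructure.ref, which is what
# causes the topology controller to clone it twice.
apiVersion: cluster.x-k8s.io/v1beta1
kind: ClusterClass
metadata:
  name: eks-clusterclass
  namespace: default
spec:
  controlPlane:
    ref:
      apiVersion: controlplane.cluster.x-k8s.io/v1beta1  # assumed group/version
      kind: AWSManagedControlPlaneTemplate
      name: eks-control-plane-template
  infrastructure:
    ref:
      apiVersion: controlplane.cluster.x-k8s.io/v1beta1  # assumed group/version
      kind: AWSManagedControlPlaneTemplate
      name: eks-control-plane-template
```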
This is creating an extra EKS control plane and related infrastructure in AWS and causing panics in the CAPA controller.
What did you expect to happen:
Only one AWSManagedControlPlane is created.
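As a sketch of the expected shape (names are illustrative, and this mirrors how a non-ClusterClass EKS cluster is wired up in CAPA, where both references point at the same object):

```yaml
# Expected outcome, roughly: the topology controller creates a single
# AWSManagedControlPlane from the template and the Cluster references that
# one object from both controlPlaneRef and infrastructureRef.
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: my-eks-cluster
  namespace: default
spec:
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: AWSManagedControlPlane
    name: my-eks-cluster-control-plane
  infrastructureRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: AWSManagedControlPlane
    name: my-eks-cluster-control-plane
```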
Anything else you would like to add:
Environment:
- Kubernetes version (use `kubectl version`): v1.21.2
- OS (e.g. from `/etc/os-release`): MacOS

/kind bug
[One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels]