
Duplicate resources are created when template is used both as controlPlane ref and infrastructure ref in ClusterClass #6126

Closed
pydctw opened this issue Feb 14, 2022 · 25 comments · Fixed by #6988
Labels
  • area/clusterclass: Issues or PRs related to clusterclass
  • kind/bug: Categorizes issue or PR as related to a bug.
  • triage/accepted: Indicates an issue or PR is ready to be actively worked on.

Comments

@pydctw

pydctw commented Feb 14, 2022

What steps did you take and what happened:
[A clear and concise description on how to REPRODUCE the bug.]

While doing a PoC of creating an EKS cluster using ClusterClass (CAPI + CAPA), I noticed that two AWSManagedControlPlane (awsmcp) objects are created from AWSManagedControlPlaneTemplate (awsmcpt) when there should be only one awsmcp. For context, EKS uses an AWS-managed control plane, so AWSManagedControlPlane in CAPA is the counterpart of KubeadmControlPlane in CAPI.

$ kubectl get cluster -n eks
NAME                PHASE         AGE     VERSION
my-eks-cluster-2    Provisioned   2d20h   v1.21.2
$ kubectl get awsmcp -n eks
NAME                              CLUSTER             READY   VPC                     BASTION IP
my-eks-cluster-2-84g7h            my-eks-cluster-2    true    vpc-0fd430763e64830ee
my-eks-cluster-2-c4vqb            my-eks-cluster-2    true    vpc-06b20d0d9d5eae93d

Further debugging suggests that this is due to the fact that AWSManagedControlPlaneTemplate is used both as spec.controlPlane.ref and spec.infrastructure.ref in the ClusterClass, and the CAPI controller clones it twice. FYI, there is no AWSManagedCluster type in CAPA.

ClusterClass used for EKS:

apiVersion: cluster.x-k8s.io/v1beta1
kind: ClusterClass
metadata:
  name: eks-clusterclass-v1
  namespace: eks
spec:
  controlPlane:
    ref:
      apiVersion: controlplane.cluster.x-k8s.io/v1beta1
      kind: AWSManagedControlPlaneTemplate
      name: eks-clusterclass-v1-awsmcp
  infrastructure:
    ref:
      apiVersion: controlplane.cluster.x-k8s.io/v1beta1
      kind: AWSManagedControlPlaneTemplate
      name: eks-clusterclass-v1-awsmcp
  workers:
    machineDeployments:
      - class: default-worker
        template:
          bootstrap:
            ref:
              apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
              kind: EKSConfigTemplate
              name: eks-clusterclass-v1-eksconfigtemplate
          infrastructure:
            ref:
              apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
              kind: AWSMachineTemplate
              name: eks-clusterclass-v1-worker-machinetemplate

This is creating an extra EKS control plane and related infrastructure in AWS and causing panics in the CAPA controller.

What did you expect to happen:
Only one AWSManagedControlPlane is created.

Anything else you would like to add:

Environment:

  • Cluster-api version: v1.1.1
  • Minikube/KIND version: v0.11.1
  • Kubernetes version: (use kubectl version): v1.21.2
  • OS (e.g. from /etc/os-release): MacOS

/kind bug
[One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels]

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Feb 14, 2022
@pydctw
Author

pydctw commented Feb 14, 2022

cc @sbueringer

@pydctw pydctw changed the title Duplicate resources are created when they are used both as controlPlane ref and infrastructure ref in ClusterClass Duplicate resources are created when template is used both as controlPlane ref and infrastructure ref in ClusterClass Feb 14, 2022
@sbueringer
Member

It makes sense that the issue occurs, as we create instances of the control plane and the infrastructure cluster and don't handle the case where both could/should be the same object.

Assuming this is a supported case, we would have to adjust our "core" topology reconciler to handle it, and I think it also has an impact on patches.

@fabriziopandini
Member

@yastij handling managed providers is another area where we have inconsistency across providers:

  • in CAPA we are using the same AWSManagedControlPlaneTemplate object both as the control plane object and as the cluster infrastructure object
  • in CAPZ there are two different objects, one for the managed control plane and one for the cluster infrastructure

Personally, I prefer the second approach: I find it less confusing, more composable, and better at separating concerns.
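
As an illustration of that second approach, the ClusterClass for EKS could reference two distinct templates. A minimal sketch, assuming a hypothetical AWSManagedClusterTemplate pass-through kind existed alongside AWSManagedControlPlaneTemplate (the kinds and names below are illustrative, not current CAPA API):

apiVersion: cluster.x-k8s.io/v1beta1
kind: ClusterClass
metadata:
  name: eks-clusterclass-split
  namespace: eks
spec:
  controlPlane:
    ref:
      apiVersion: controlplane.cluster.x-k8s.io/v1beta1
      kind: AWSManagedControlPlaneTemplate       # owns the managed control plane
      name: eks-clusterclass-split-controlplane
  infrastructure:
    ref:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: AWSManagedClusterTemplate            # hypothetical pass-through infra template
      name: eks-clusterclass-split-infra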

Along the same lines, we should also spread awareness about the ClusterClass requirements for supporting cluster upgrades, which are currently not implemented by any managed provider.

@pydctw
Author

pydctw commented Feb 15, 2022

cc @richardcase for EKS

@richardcase
Member

I'm not surprised this turned out to be an issue.

We originally had two different kinds, AWSManagedCluster and AWSManagedControlPlane. However, AWSManagedCluster was only acting as a pass-through to communicate values from the control plane to CAPI to satisfy the infrastructure cluster contract, which the control plane already did, so we decided to remove it.

I think it's valid for managed services to have the Cluster/ControlPlane as the same resource. But I also understand the need for consistency between providers, and if we had to reinstate AWSManagedCluster then I'm good with that, but...

...more generally, I'm not sure that we have ever thought about what CAPI looks like for managed Kubernetes services (please correct me if I'm wrong here). As a result, providers have made their own decisions and have tried to fit managed services into the current resource kinds/reconciliation process/provider types. With the CAPG managed implementation starting soon, it's probably something we need to discuss and decide: what does a managed service look like in CAPI?

@richardcase
Member

@sbueringer @fabriziopandini @pydctw - I'd be happy to help/facilitate any discussions about "what does managed Kubernetes in CAPI look like" if needed.

@pydctw
Author

pydctw commented Feb 15, 2022

@sbueringer @fabriziopandini @pydctw - I'd be happy to help/facilitate any discussions about "what does managed Kubernetes in CAPI look like" if needed.

It would be great if we could discuss it during office hours tomorrow, as I am blocked from progressing on kubernetes-sigs/cluster-api-provider-aws#3166. Will add it to the agenda.

@richardcase
Member

At the office hours it was mentioned that there is an open PR in CAPZ to also go down to one type (AzureManagedControlPlane) that satisfies both the cluster and control plane contracts, like we are doing in CAPA: kubernetes-sigs/cluster-api-provider-azure#1468

@richardcase
Member

richardcase commented Feb 16, 2022

With managed Kubernetes services, the lines between the cluster infrastructure and the control plane are blurred. A few solutions have been discussed:

  • Allow the same kind to be used in the references for infrastructureRef and controlPlaneRef (as CAPA is doing and the CAPZ PR is proposing). But this currently causes problems with ClusterClass, as documented by this issue.
  • Ensure there are separate Cluster and ControlPlane kinds, even if the Cluster kind only acts as a pass-through to satisfy the contract with CAPI. For CAPA this requires some strange interactions for things like the control plane endpoint.
  • Only supply an infrastructureRef (no controlPlaneRef) and do all the control plane reconciliation in the infrastructure cluster reconciler (see the sketch after this list). At office hours it was mentioned that this could cause an issue with ClusterClass.
  • Potentially use externally managed infrastructure somehow.
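
As a rough sketch of the third option above (illustrative only; the AWSManagedCluster kind and names here are assumptions, not an agreed design), the Cluster would carry only an infrastructure reference and that object's controller would also reconcile the managed control plane:

apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: my-eks-cluster
  namespace: eks
spec:
  # No controlPlaneRef: control plane reconciliation and version rollouts
  # would have to be driven by the infrastructure cluster reconciler.
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: AWSManagedCluster
    name: my-eks-cluster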

It sounds like we need to get a proposal/doc together that covers the various potential solutions and then decide on a consistent way forward for any provider that has a managed Kubernetes service.

@fabriziopandini @pydctw - I can make a start on a doc tomorrow and we can all collaborate. How does that sound?

@sbueringer
Member

sbueringer commented Feb 16, 2022

Sounds great!

Short comment about:

Only supply an infrastructureRef (no controlPlaneRef) and do all the control plane reconciliation in the infrastructure cluster reconciler. At office hours it was mentioned that this could cause an issue with ClusterClass.

We are currently using the control plane object to trigger/monitor rollouts of the control plane to new Kubernetes versions. Assuming that this should be possible with managed clusters too, it might be easier to go down the road of a ControlPlane without machines/replicas (but yeah, it doesn't really fit conceptually).

But I'm aware that, on the other side, Clusters without a control plane are allowed in the Cluster resource while Clusters without an infra cluster are not.

@vincepri
Member

vincepri commented Feb 17, 2022

But I'm aware that, on the other side, Clusters without a control plane are allowed in the Cluster resource while Clusters without an infra cluster are not.

We should probably fix this bit in the future: a Cluster should always have a control plane, but it doesn't necessarily have to have infrastructure associated with it. See below.

In the future we can also consider that the infrastructure reference might go away entirely and instead become something else. Truthfully, today the InfraCluster object is a stepping stone: we do need the infrastructure to be set up and configured somehow, but most users might want something else to manage that (like Terraform, Crossplane, etc.) and inform Cluster API where to get those values.

@vincepri
Member

/milestone v1.2

@k8s-ci-robot k8s-ci-robot added this to the v1.2 milestone Feb 17, 2022
@vincepri
Member

Thinking about it a bit more, we should probably meet and define clear responsibilities for each reference. Infrastructure, control plane, and other references should all have clear delineations.

If we think about the responsibilities of an infrastructure provider and its InfraCluster object, we can assume that this object provides the infrastructure for the Kubernetes cluster, which can include a number of things and today also includes the load balancer.

On the other side, the control plane reference is in charge of managing the Kubernetes control plane; the infrastructure should be left to the other reference. The challenge I'm seeing with the above is that we seem to have mixed some of these responsibilities when it comes to managed cloud Kubernetes services.
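
To make that delineation concrete, here is a minimal sketch (the kinds and names are illustrative, not an agreed API) of a Cluster wiring the two references with their intended responsibilities:

apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: example
  namespace: eks
spec:
  infrastructureRef:          # infrastructure provider: networking, and today also the load balancer
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: AWSManagedCluster
    name: example
  controlPlaneRef:            # control plane provider: the Kubernetes control plane itself
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: AWSManagedControlPlane
    name: example-control-plane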

Let's reconvene and chat more about it, we should meet with at least one person from each provider interested in this discussion and push for consistency and separation of responsibilities.

cc @yastij @fabriziopandini @sedefsavas @richardcase @CecileRobertMichon

@richardcase
Member

we should probably meet and define clear responsibilities of reference. Infrastructure, Control Plane, and other references should all have clear delineations

Meeting to agree on the responsibilities would be great 👍 I also agree that we need to be clear about the delineations, and this is the issue we are facing with managed Kubernetes services: the current delineations don't naturally fit.

InfraCluster object, we can assume that this object provides the infrastructure for the Kubernetes Cluster, which can include a number of things and today it also includes the load balancer.

the control plane reference is in charge of managing the Kubernetes control plane, the infrastructure should be left to the other reference.

This is a good example of why the current responsibilities of a control plane & infrastructure provider don't fit well for managed services like EKS and why we have ended up where we are.

When you create an EKS control plane in AWS (which we do via a CAPI control plane provider), this automatically creates a load balancer for the API server. This is at odds with the current assumption that the infra provider creates the load balancer and reports the control plane endpoint back to CAPI.
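
For context, the infrastructure cluster contract expects the infra object to surface the API server endpoint roughly as below (a sketch under assumptions; the kind name and values are illustrative, and for EKS the endpoint is only known once the managed control plane and its load balancer exist):

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AWSManagedCluster       # illustrative infra object
metadata:
  name: my-eks-cluster
  namespace: eks
spec:
  controlPlaneEndpoint:
    host: ""                  # normally set by the infra provider; for EKS it is
    port: 443                 #   only available after the control plane is created
status:
  ready: false                # flips to true once the infrastructure is provisioned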

So revisiting the provider types and responsibilities in the context of managed Kubernetes services would be great.

When shall we get together? Perhaps a doodle is needed?

@pydctw
Author

pydctw commented Apr 20, 2022

Presented the Managed Kubernetes in CAPI proposal during CAPI office hours on 4/20/22.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 19, 2022
@richardcase
Member

/remove-lifecycle stale

@killianmuldoon
Contributor

/area topology

@fabriziopandini fabriziopandini added the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Jul 29, 2022
@fabriziopandini fabriziopandini removed this from the v1.2 milestone Jul 29, 2022
@fabriziopandini fabriziopandini removed the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Jul 29, 2022
@fabriziopandini
Member

/close
this has been fixed by introducing support for server side apply

@k8s-ci-robot
Contributor

@fabriziopandini: Closing this issue.

In response to this:

/close
this has been fixed by introducing support for server side apply

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@pydctw
Author

pydctw commented Jul 29, 2022

This issue is related to #6988 for EKS ClusterClass support, not server side apply. Will keep it open until the proposal is merged.

/reopen

@k8s-ci-robot
Contributor

@pydctw: Reopened this issue.

In response to this:

This issue is related to #6988 for EKS ClusterClass support, not server side apply. Will keep it open until the proposal is merged.

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this Jul 29, 2022
@pydctw
Author

pydctw commented Jul 29, 2022

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 29, 2022
@fabriziopandini
Member

/triage accepted

@k8s-ci-robot k8s-ci-robot added the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Jul 29, 2022
@sbueringer
Member

@pydctw How do we want to treat this core CAPI issue now that the proposal is merged? I assume we have corresponding issues in CAPA, so it's fine to close this issue here?

@killianmuldoon killianmuldoon added the area/clusterclass Issues or PRs related to clusterclass label May 4, 2023