Investigate why installing 700+ CRDs causes performance degradation in apiserver #47
Comments
I suspect
Example output w/ provider-aws, flux, crossplane and a few other types, 191 CRDs in total:
References for throttling:
Background about kubectl discovery cache:
Please also note that
@chlunde but the slowness in apiserver is experienced after
I did some experiments using provider-tf-aws & provider-tf-azure with the full set of resources generated for both.

Experiment Setup # 1:
The experiments have been performed on a darwin_arm64 machine with 8 CPU cores. A control plane consisting of

Registering 765 CRDs of provider-tf-aws
CPU profiling data collected from
The following figure shows CPU utilization for

State of the Art for the Established Kubernetes Scalability Thresholds:
Unfortunately, the Kubernetes Scalability thresholds file from

Summary
Further tests are needed to measure API latency but I do not expect the high number of registered CRDs would by itself increase latency causing violations of Kubernetes API call latency SLOs, excluding high saturation cases (i.e.,

I have some other experiments whose results I will publish in separate comments to this issue.
Experiment Setup # 2:
In a cluster where all

Summary
As expected, as we increase the number of CRDs in the cluster, it becomes more expensive to compute the OpenAPI spec described in this comment per CRD. With a back-of-the-envelope calculation,
What problem are you facing?
Today, if you run kubectl apply -f package/crds in provider-tf-aws, your cluster gets really slow. In GKE, the kubectl command just stops after about 50 CRDs.
How could Terrajet help solve your problem?
We have some ideas around sharding the controllers and API types so that customers can install only a subset of them, but we haven't identified the actual problem yet. We need to find the root cause before choosing a solution, so that we can keep that problem in mind for future designs.
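One possible way to start narrowing down the root cause is to apply the CRDs one file at a time and record how long each kubectl apply call takes, so it becomes visible at roughly which CRD count the slowdown begins. The sketch below is only an illustration, not a method used in this issue: the helper names (kubectl_apply, time_applies, slowest) are hypothetical, and it assumes the CRDs are individual *.yaml files under package/crds and that kubectl is already pointed at a test cluster.

```python
import subprocess
import time
from pathlib import Path
from typing import Callable, List, Sequence, Tuple


def kubectl_apply(path: Path) -> None:
    """Apply a single manifest with kubectl; raises if kubectl exits non-zero."""
    subprocess.run(["kubectl", "apply", "-f", str(path)], check=True)


def time_applies(
    manifests: Sequence[Path],
    apply: Callable[[Path], None] = kubectl_apply,
    clock: Callable[[], float] = time.monotonic,
) -> List[Tuple[str, float]]:
    """Apply manifests one at a time and record the duration of each call."""
    timings = []
    for manifest in manifests:
        start = clock()
        apply(manifest)
        timings.append((manifest.name, clock() - start))
    return timings


def slowest(timings: Sequence[Tuple[str, float]], n: int = 5) -> List[Tuple[str, float]]:
    """Return the n slowest applies, slowest first."""
    return sorted(timings, key=lambda t: t[1], reverse=True)[:n]


if __name__ == "__main__":
    # Assumption: one CRD per *.yaml file in package/crds.
    files = sorted(Path("package/crds").glob("*.yaml"))
    for i, (name, seconds) in enumerate(time_applies(files), start=1):
        # The running index i approximates how many CRDs have been applied so far.
        print(f"{i}\t{seconds:.2f}s\t{name}")
```

Plotting the printed durations against the running index would show whether the apply time grows steadily with the number of CRDs or jumps at a particular count. The apply and clock parameters are injectable so the timing logic can be exercised without a cluster.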