diff --git a/docs/proposals/20220411-cluster-api-state-metrics.md b/docs/proposals/20220411-cluster-api-state-metrics.md
index e1259dd7fef9..dc651b446904 100644
--- a/docs/proposals/20220411-cluster-api-state-metrics.md
+++ b/docs/proposals/20220411-cluster-api-state-metrics.md
@@ -33,7 +33,9 @@ status: experimental
 - [Implementation Details/Notes/Constraints](#implementation-detailsnotesconstraints)
 - [Scrapable Information](#scrapable-information)
 - [Relationship to kube-state-metrics](#relationship-to-kube-state-metrics)
- - [Reusage of kube-state-metrics packages](#reusage-of-kube-state-metrics-packages)
+ - [How does kube-state-metrics work](#how-does-kube-state-metrics-work)
+ - [Reuse kube-state-metrics packages](#reuse-kube-state-metrics-packages)
+ - [Package structure for cluster-api-state-metrics](#package-structure-for-cluster-api-state-metrics)
 - [Security Model](#security-model)
 - [Risks and Mitigations](#risks-and-mitigations)
 - [Alternatives](#alternatives)
@@ -114,7 +116,7 @@ As an application developer, I would like to deploy cluster-api-state-metrics to
 Following Cluster API CRDs currently exist. The *In-scope* column marks CRDs for which metrics should be exposed.
-In future iterations other CRs may be added or the exporter could be extended to support provider specific CRDs too.
+In future iterations other CRs may be added or the cluster-api-state-metrics could be extended to support provider specific CRDs too.
 
 | Name | API Group/Version | In-scope |
 |---------------------------|-----------------------------------------|----------|
@@ -162,16 +164,43 @@ The `Cluster` CR will have important information in their status fields similar
 Currently it is not important to implement metrics for `KubeadmConfig` and `KubeadmConfigTemplate` because both only contain configuration data (e.g., passed via cloud-init to the machine). However they may be compared to `ConfigMaps` or `Secrets`.
 
-#### Reusage of kube-state-metrics packages
+#### How does kube-state-metrics work
 
-The proposed exporter should re-use as many packages provided by kube-state-metrics as possible. This allows to re-use flags, configuration and extended functionality like sharding without additional implementation.
+Kube-state-metrics exposes metrics via an HTTP endpoint to be consumed by either Prometheus itself or a compatible scraper [[1]].
 
-Start the exporter using the function `app.RunKubeStateMetrics` [[1]] and provide a custom `RegistryFactory` [[2]]
+Kube-state-metrics v1.5 introduced large performance improvements, which are documented in the [Performance Optimization Proposal](https://github.com/kubernetes/kube-state-metrics/blob/master/docs/design/metrics-store-performance-optimization.md#Proposal). That document also explains the current internals of kube-state-metrics.
+
+Kube-state-metrics caches the current state of the metrics in an internal cache and updates this cache on add, update and delete events of watched resources.
+
+On requests to `/metrics`, the cached data is concatenated into a single string and returned as the response.
+
+#### Reuse kube-state-metrics packages
+
+Cluster-api-state-metrics should re-use as many packages provided by kube-state-metrics as possible. This allows re-use of flags, configuration and extended functionality like sharding or TLS configuration without additional implementation.
+
+An [extension mechanism](https://github.com/kubernetes/kube-state-metrics/pull/1644) was introduced to the kube-state-metrics packages which allows using its basic mechanism but defining custom metrics for Custom Resources.
+The `k8s.io/kube-state-metrics/v2/pkg/customresource.RegistryFactory`[[2]] interface was [introduced](https://github.com/kubernetes/kube-state-metrics/pull/1644) to allow defining custom metrics for Custom Resources while leveraging kube-state-metrics logic.
+
+Cluster-api-state-metrics will have to implement the `customresource.RegistryFactory` interface for each custom resource.
+The interface defines the function `MetricFamilyGenerators(allowAnnotationsList, allowLabelsList []string) []generator.FamilyGenerator`, which must be implemented and contains the specific metric implementations.
+A detailed implementation example is available in the [package documentation](https://pkg.go.dev/k8s.io/kube-state-metrics/v2@v2.4.2/pkg/customresource#RegistryFactory).
+
+These `customresource.RegistryFactory` implementations get used in a `main.go` which configures and starts the application by using `k8s.io/kube-state-metrics/v2/pkg/app.RunKubeStateMetrics(...)`[[3]].
+
+#### Package structure for cluster-api-state-metrics
+
+- `/exp/pkg/store` contains the metric implementation exposed by the `customresource.RegistryFactory`[[2]] interface.
+  - `/exp/pkg/store/{cluster,kubeadmcontrolplane,machinedeployment,machine,...}.go` for the custom resource specific implementation of `customresource.RegistryFactory`
+  - `/exp/pkg/store/factory.go` for implementing the `Factories()` function which groups and exposes the `customresource.RegistryFactory` implementations of this package
+- `/exp/state-metrics/main.go` which:
+  - imports the CAPI custom resource specific metric *factories* implemented and exposed via `/exp/state-metrics/pkg/store.Factories()`
+  - imports and uses `k8s.io/kube-state-metrics/v2/pkg/options.NewOptions()`[[3]] to define the same CLI flags and options as kube-state-metrics, except the enabled metrics.
+  - imports and uses `k8s.io/kube-state-metrics/v2/pkg/app.RunKubeStateMetrics(...)`[[3]] to start the metrics server using the given options and the custom registry factory from `store.Factories()`.
 
 ### Security Model
 
 - RBAC definitions should be generated via kubebuilder annotations.
-- RBAC definitions should only grant `get`, `list` and `watch` permissions to CRs relevant for the exporter.
+- RBAC definitions should only grant `get`, `list` and `watch` permissions to CRs relevant for the application.
 
 ### Risks and Mitigations
 
@@ -185,11 +214,11 @@ This initial implementation provides a baseline on which incremental changes can
 On a first thought, using kube-state-metrics would be a great fit to retrieve the desired metrics. However, kube-state-metrics does not plan to implement metrics for CRs of CRDs:
 
-> There is no meaningful extension kube-state-metrics can do other than potentially providing a library to reuse the mechanisms built in this repository. [[3]]
+> There is no meaningful extension kube-state-metrics can do other than potentially providing a library to reuse the mechanisms built in this repository. [[4]]
 
 The linked issue also states that:
 
-> operators should expose metrics about the objects they expose themselves. [[3]]
+> operators should expose metrics about the objects they expose themselves. [[4]]
 
 Because of that kube-state-metrics itself does not fit this use-case.
 
@@ -209,7 +238,7 @@ Nevertheless, including metrics directly in the controllers may be valid for fut
 ## Upgrade Strategy
 
-The exporter could follow the API versions of the CAPI controllers. By using a seperate go package using its own `go.mod` file we can prevent adding transitive dependencies to the core module. A `replace` directive inside the `go.mod` can ensure that always the same version will be used for the `sigs.k8s.io/cluster-api` dependency.
+Cluster-api-state-metrics could follow the API versions of the CAPI controllers. By using a separate go package with its own `go.mod` file we can prevent adding transitive dependencies to the core module. A `replace` directive inside the `go.mod` can ensure that the same version is always used for the `sigs.k8s.io/cluster-api` dependency.
 
 ## Additional Details
 
@@ -349,9 +378,10 @@ The initial plan is to add cluster-api-state-metrics as an experimental feature
 [Pod]: https://github.com/kubernetes/kube-state-metrics/blob/master/docs/pod-metrics.md
 [StatefulSet]: https://github.com/kubernetes/kube-state-metrics/blob/master/docs/statefulset-metrics.md
 [kube-state-metrics]: https://github.com/kubernetes/kube-state-metrics
-[1]: https://github.com/kubernetes/kube-state-metrics/blob/master/pkg/app/server.go
+[1]: https://github.com/kubernetes/kube-state-metrics
 [2]: https://github.com/kubernetes/kube-state-metrics/blob/master/pkg/customresource/registry_factory.go#L29
-[3]: https://github.com/kubernetes/kube-state-metrics/issues/457
+[3]: https://github.com/kubernetes/kube-state-metrics/blob/master/pkg/app/server.go
+[4]: https://github.com/kubernetes/kube-state-metrics/issues/457
 [machine phases]: https://github.com/kubernetes-sigs/cluster-api/blob/main/api/v1beta1/machine_phase_types.go
 [cluster phases]: https://github.com/kubernetes-sigs/cluster-api/blob/main/api/v1beta1/cluster_phase_types.go
 [machinedeployment phases]: https://github.com/kubernetes-sigs/cluster-api/blob/07c0a4809361927b15cde2747b34142b7c7ead15/api/v1beta1/machinedeployment_types.go#L222-L224