Cluster API State Metrics proposal add more implementation details
Signed-off-by: Christian Schlotter <[email protected]>
chrischdi committed Apr 14, 2022
1 parent 2183d63 commit 0cc66c4
Showing 1 changed file with 41 additions and 11 deletions.
52 changes: 41 additions & 11 deletions docs/proposals/20220411-cluster-api-state-metrics.md
@@ -33,7 +33,9 @@ status: experimental
- [Implementation Details/Notes/Constraints](#implementation-detailsnotesconstraints)
- [Scrapable Information](#scrapable-information)
- [Relationship to kube-state-metrics](#relationship-to-kube-state-metrics)
- [Reusage of kube-state-metrics packages](#reusage-of-kube-state-metrics-packages)
- [How does kube-state-metrics work](#how-does-kube-state-metrics-work)
- [Reuse kube-state-metrics packages](#reuse-kube-state-metrics-packages)
- [Package structure for cluster-api-state-metrics](#package-structure-for-cluster-api-state-metrics)
- [Security Model](#security-model)
- [Risks and Mitigations](#risks-and-mitigations)
- [Alternatives](#alternatives)
@@ -114,7 +116,7 @@ As an application developer, I would like to deploy cluster-api-state-metrics to

The following Cluster API CRDs currently exist.
The *In-scope* column marks CRDs for which metrics should be exposed.
In future iterations other CRs may be added or the exporter could be extended to support provider specific CRDs too.
In future iterations, other CRs may be added, or cluster-api-state-metrics could be extended to support provider-specific CRDs too.

| Name | API Group/Version | In-scope |
|---------------------------|-----------------------------------------|----------|
@@ -162,16 +164,43 @@ The `Cluster` CR will have important information in their status fields similar

Currently, it is not important to implement metrics for `KubeadmConfig` and `KubeadmConfigTemplate` because both only contain configuration data (e.g., passed via cloud-init to the machine). However, they may be compared to `ConfigMaps` or `Secrets`.

#### Reusage of kube-state-metrics packages
#### How does kube-state-metrics work

The proposed exporter should re-use as many packages provided by kube-state-metrics as possible. This allows to re-use flags, configuration and extended functionality like sharding without additional implementation.
Kube-state-metrics exposes metrics via an HTTP endpoint to be consumed by either Prometheus itself or a compatible scraper [[1]].

Starting with kube-state-metrics v1.5, large performance improvements were introduced, which are documented in the [Performance Optimization Proposal](https://github.com/kubernetes/kube-state-metrics/blob/master/docs/design/metrics-store-performance-optimization.md#Proposal). That document also explains the current internals of kube-state-metrics.

Kube-state-metrics keeps the current state of the metrics in an internal cache and updates this cache on add, update and delete events of watched resources.

On requests to `/metrics`, the cached data is concatenated into a single string and returned as the response.
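
The caching behaviour described above can be illustrated with a minimal sketch. It is not the kube-state-metrics implementation: the `metricsStore` type, its methods, the `renderPhase` helper and the `capi_cluster_status_phase` metric name are all hypothetical and only use the Go standard library. The sketch assumes that metrics are rendered to text once per event and that a scrape only concatenates the cached text.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sort"
	"strings"
	"sync"
)

// metricsStore is a hypothetical, simplified cache: it maps an object UID
// to the already rendered metric lines of that object.
type metricsStore struct {
	mu      sync.RWMutex
	metrics map[string]string
}

func newMetricsStore() *metricsStore {
	return &metricsStore{metrics: map[string]string{}}
}

// Upsert handles add and update events by replacing the cached lines.
func (s *metricsStore) Upsert(uid, rendered string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.metrics[uid] = rendered
}

// Delete handles delete events by dropping the cached lines.
func (s *metricsStore) Delete(uid string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.metrics, uid)
}

// ServeHTTP concatenates the cached lines into one string and returns it.
// UIDs are sorted only to make the output of this sketch deterministic.
func (s *metricsStore) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	uids := make([]string, 0, len(s.metrics))
	for uid := range s.metrics {
		uids = append(uids, uid)
	}
	sort.Strings(uids)
	var b strings.Builder
	for _, uid := range uids {
		b.WriteString(s.metrics[uid])
	}
	_, _ = io.WriteString(w, b.String())
}

// renderPhase renders one metric line; the metric name is an assumption.
func renderPhase(name, phase string) string {
	return fmt.Sprintf("capi_cluster_status_phase{name=%q,phase=%q} 1\n", name, phase)
}

// scrape simulates a request to /metrics without opening a network port.
func scrape(s *metricsStore) string {
	rec := httptest.NewRecorder()
	s.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/metrics", nil))
	return rec.Body.String()
}

func main() {
	s := newMetricsStore()
	s.Upsert("uid-a", renderPhase("a", "Provisioning")) // add event
	s.Upsert("uid-b", renderPhase("b", "Provisioned"))  // add event
	s.Upsert("uid-a", renderPhase("a", "Provisioned"))  // update event
	s.Delete("uid-b")                                   // delete event
	fmt.Print(scrape(s))
}
```

In this sketch the cost of rendering is paid when an event arrives, so a scrape never has to iterate over the watched objects themselves. After the four events in `main`, only the updated line for object `a` remains in the response.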

#### Reuse kube-state-metrics packages

Cluster-api-state-metrics should re-use as many packages provided by kube-state-metrics as possible. This allows re-use of flags, configuration and extended functionality like sharding or TLS configuration without additional implementation.

An [extension mechanism](https://github.com/kubernetes/kube-state-metrics/pull/1644) was introduced to the kube-state-metrics packages: the `k8s.io/kube-state-metrics/v2/pkg/customresource.RegistryFactory`[[2]] interface allows defining custom metrics for Custom Resources while leveraging the existing kube-state-metrics logic.

Cluster-api-state-metrics will have to implement the `customresource.RegistryFactory` interface for each custom resource.
The interface requires implementing the function `MetricFamilyGenerators(allowAnnotationsList, allowLabelsList []string) []generator.FamilyGenerator`, which returns the specific metric implementations.
A detailed implementation example is available at the [package documentation](https://pkg.go.dev/k8s.io/kube-state-metrics/[email protected]/pkg/customresource#RegistryFactory).

These `customresource.RegistryFactory` implementations are used in a `main.go`, which configures and starts the application by calling `k8s.io/kube-state-metrics/v2/pkg/app.RunKubeStateMetrics(...)`[[3]].

#### Package structure for cluster-api-state-metrics

- `/exp/pkg/store` contains the metric implementation exposed by the `customresource.RegistryFactory`[[2]] interface.
  - `/exp/pkg/store/{cluster,kubeadmcontrolplane,machinedeployment,machine,...}.go` for the custom resource specific implementation of `customresource.RegistryFactory`
  - `/exp/pkg/store/factory.go` for implementing the `Factories()` function which groups and exposes the `customresource.RegistryFactory` implementations of this package
- `/exp/state-metrics/main.go` which:
  - imports the CAPI custom resource specific metric *factories* implemented and exposed via `/exp/state-metrics/pkg/store.Factories()`
  - imports and uses `k8s.io/kube-state-metrics/v2/pkg/options.NewOptions()`[[3]] to define the same CLI flags and options as kube-state-metrics, except the enabled metrics.
  - imports and uses `k8s.io/kube-state-metrics/v2/pkg/app.RunKubeStateMetrics(...)`[[3]] to start the metrics server using the given options and the custom registry factory from `store.Factories()`.
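
The grouping performed by `Factories()` could look roughly like the following sketch. To stay self-contained it does not import kube-state-metrics: `FamilyGenerator` and `RegistryFactory` are reduced, hypothetical stand-ins for `generator.FamilyGenerator` and `customresource.RegistryFactory` (the real interface requires further methods that are omitted here), and the factory types and metric names are assumptions for illustration only.

```go
package main

import "fmt"

// FamilyGenerator is a hypothetical stand-in for generator.FamilyGenerator.
type FamilyGenerator struct {
	Name string
	Help string
}

// RegistryFactory is a reduced stand-in for customresource.RegistryFactory.
// Only the methods needed for this sketch are declared.
type RegistryFactory interface {
	Name() string
	MetricFamilyGenerators(allowAnnotationsList, allowLabelsList []string) []FamilyGenerator
}

// clusterFactory would live in cluster.go in the proposed layout.
type clusterFactory struct{}

func (clusterFactory) Name() string { return "clusters" }

func (clusterFactory) MetricFamilyGenerators(_, _ []string) []FamilyGenerator {
	return []FamilyGenerator{
		{Name: "capi_cluster_status_phase", Help: "The cluster's current phase."},
	}
}

// machineFactory would live in machine.go in the proposed layout.
type machineFactory struct{}

func (machineFactory) Name() string { return "machines" }

func (machineFactory) MetricFamilyGenerators(_, _ []string) []FamilyGenerator {
	return []FamilyGenerator{
		{Name: "capi_machine_status_phase", Help: "The machine's current phase."},
	}
}

// Factories groups all factories of the package, as factory.go would.
func Factories() []RegistryFactory {
	return []RegistryFactory{clusterFactory{}, machineFactory{}}
}

func main() {
	// A real main.go would hand the factories to the metrics server
	// instead of printing them.
	for _, f := range Factories() {
		for _, g := range f.MetricFamilyGenerators(nil, nil) {
			fmt.Printf("%s: %s\n", f.Name(), g.Name)
		}
	}
}
```

Keeping one factory per custom resource in its own file means that adding a new resource only touches that file and the slice returned by `Factories()`.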

### Security Model

- RBAC definitions should be generated via kubebuilder annotations.
- RBAC definitions should only grant `get`, `list` and `watch` permissions to CRs relevant for the exporter.
- RBAC definitions should only grant `get`, `list` and `watch` permissions to CRs relevant for the application.
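
As an illustration of the two points above, read-only RBAC markers might look like the following sketch. The marker syntax is the one processed by controller-gen; the selection of resources and the placement of the markers above `main` are assumptions made for this example, not part of the proposal.

```go
package main

// Read-only access to a hypothetical selection of in-scope Cluster API resources.
// +kubebuilder:rbac:groups=cluster.x-k8s.io,resources=clusters;machinedeployments;machines,verbs=get;list;watch
// +kubebuilder:rbac:groups=controlplane.cluster.x-k8s.io,resources=kubeadmcontrolplanes,verbs=get;list;watch

func main() {}
```

Because no `create`, `update`, `patch` or `delete` verbs are listed, a role generated from these markers would not allow the application to modify any resource.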

### Risks and Mitigations

@@ -185,11 +214,11 @@ This initial implementation provides a baseline on which incremental changes can
At first glance, kube-state-metrics seems to be a great fit to retrieve the desired metrics. However, kube-state-metrics does not plan to implement metrics for CRs of CRDs:

> There is no meaningful extension kube-state-metrics can do other than potentially providing a library to reuse the mechanisms built in this repository. [[3]]
> There is no meaningful extension kube-state-metrics can do other than potentially providing a library to reuse the mechanisms built in this repository. [[4]]
The linked issue also states that:

> operators should expose metrics about the objects they expose themselves. [[3]]
> operators should expose metrics about the objects they expose themselves. [[4]]
Because of that, kube-state-metrics itself does not fit this use case.

@@ -209,7 +238,7 @@ Nevertheless, including metrics directly in the controllers may be valid for fut

## Upgrade Strategy

The exporter could follow the API versions of the CAPI controllers. By using a separate Go package with its own `go.mod` file, we can prevent adding transitive dependencies to the core module. A `replace` directive inside the `go.mod` can ensure that the same version is always used for the `sigs.k8s.io/cluster-api` dependency.
Cluster-api-state-metrics could follow the API versions of the CAPI controllers. By using a separate Go package with its own `go.mod` file, we can prevent adding transitive dependencies to the core module. A `replace` directive inside the `go.mod` can ensure that the same version is always used for the `sigs.k8s.io/cluster-api` dependency.

## Additional Details

@@ -349,9 +378,10 @@ The initial plan is to add cluster-api-state-metrics as an experimental feature
[Pod]: https://github.com/kubernetes/kube-state-metrics/blob/master/docs/pod-metrics.md
[StatefulSet]: https://github.com/kubernetes/kube-state-metrics/blob/master/docs/statefulset-metrics.md
[kube-state-metrics]: https://github.com/kubernetes/kube-state-metrics
[1]: https://github.com/kubernetes/kube-state-metrics/blob/master/pkg/app/server.go
[1]: https://github.com/kubernetes/kube-state-metrics
[2]: https://github.com/kubernetes/kube-state-metrics/blob/master/pkg/customresource/registry_factory.go#L29
[3]: https://github.com/kubernetes/kube-state-metrics/issues/457
[3]: https://github.com/kubernetes/kube-state-metrics/blob/master/pkg/app/server.go
[4]: https://github.com/kubernetes/kube-state-metrics/issues/457
[machine phases]: https://github.com/kubernetes-sigs/cluster-api/blob/main/api/v1beta1/machine_phase_types.go
[cluster phases]: https://github.com/kubernetes-sigs/cluster-api/blob/main/api/v1beta1/cluster_phase_types.go
[machinedeployment phases]: https://github.com/kubernetes-sigs/cluster-api/blob/07c0a4809361927b15cde2747b34142b7c7ead15/api/v1beta1/machinedeployment_types.go#L222-L224