From 776887f397f3c401ff718fda1b5b2763ecae8af1 Mon Sep 17 00:00:00 2001
From: R-Lawton
Date: Wed, 24 Jul 2024 18:00:28 +0100
Subject: [PATCH] wip

Signed-off-by: R-Lawton
---
 ...ty-policy.md => 0000-observability-api.md} | 130 ++++++++++++++++--
 1 file changed, 115 insertions(+), 15 deletions(-)
 rename rfcs/{0000-observability-policy.md => 0000-observability-api.md} (51%)

diff --git a/rfcs/0000-observability-policy.md b/rfcs/0000-observability-api.md
similarity index 51%
rename from rfcs/0000-observability-policy.md
rename to rfcs/0000-observability-api.md
index c1c78846..43a32f19 100644
--- a/rfcs/0000-observability-policy.md
+++ b/rfcs/0000-observability-api.md
@@ -17,7 +17,7 @@ Users of Kuadrant (Platform engineers and or Site reliability engineers) will wa
 # Guide-level explanation
 [guide-level-explanation]: #guide-level-explanation

-The observability will allow a user to choose different parts of the offered Kuadrant observability. This provides more freedom to the user rather then the all or nothing current approach. with the new way the user will only have to fill in their configuration in one location rather the current many locations.
+The observability API will allow a user to choose different parts of the offered Kuadrant observability. This gives the user more freedom than the current all-or-nothing approach. With the new approach, the user only has to fill in their configuration in one location rather than the current many locations.

 ### Potential configurations

@@ -26,18 +26,100 @@ The different aspects a user might want to modify could be the following:
 | Observability piece | Kuadrant component | Options |
 |-----------------------|-----------------------------------------------------------|---------------------------------------------------------------------------------------|
 | Logging | Kuadrant operator, Authorino operator, Limitador Operator | **logLevel**: (string) **logMode**: (string) |
-| Tracing | Authorino operator, Limitador operator | **endpoint** (string), **tags** (map[string]string) , **insecure** (bool) |
-| Metrics | Kuadrant operator, Authorino, Limitador, DNS Operator | **port** (int32), **deep** (bool) |
-| Healthz | Kuadrant operator, Authorino, Limitador, DNS Operator | **port** (int32) |
-| Console plugin | Openshift Only currently | **enable** bool |
+| Tracing | Authorino operator, Limitador operator | **endpoint** (string), **tags** (map[string]string), **insecure** (bool), **strategyRules** ([]string) |
+| Metrics | Kuadrant operator, Authorino, Limitador, DNS Operator | **enable** (bool), **port** (int32), **deep** (bool) |
 | Alerts * | Kuadrant operator, Authorino, Limitador | **enable** bool |
 | Dashboards * | Kuadrant operator, Authorino, Limitador | **enable** bool |
 | Other 3rd Party * | e.g Kiali | **enable** bool |

 ###### **Note**: Observability pieces with a * denotes these are post v1 milestone

-### Example use cases
-A use case a user might have would be they desire setting up tracing in the Kuadrant component Limitador as well as have the Kuadrant components have a log level of Debug but just for Authorino.
+### Example CR with everything
+
+```yaml
+
+apiVersion: kuadrant.io/v1alpha1
+kind: observability
+spec:
+  logging:
+    logLevel:
+      authorino: debug
+      limitador: debug
+    logMode:
+      authorino: development
+      limitador: development
+
+  tracing:
+    limitador:
+      endpoint: rpc://tempo.tempo.svc.cluster.local:4317
+      insecure: true
+      tags: tag1, tag2
+      strategyRules:
+        rule1:
+          best-rule-in-the-world-1
+        rule2:
+          best-rule-in-the-world-2
+    authorino:
+      endpoint: rpc://tempo.tempo.svc.cluster.local:4317
+      insecure: true
+      tags: tag1, tag2
+      strategyRules:
+        rule1:
+          best-rule-in-the-world-1
+        rule2:
+          best-rule-in-the-world-2
+
+  metrics:
+    authorino:
+      enableService: true
+      port: 8080
+      deep: true
+    authorino-operator:
+      enableService: true
+      port: 8080
+    limitador:
+      enableService: true
+      port: 8080
+      deep: false
+    limitador-operator:
+      enableService: true
+      port: 8080
+    kuadrant-operator:
+      enableService: true
+      port: 8080
+      deep: true
+    dns-operator:
+      enableService: true
+      port: 8080
+
+  alerts:
+    namespace: my-amazing-namespace
+    authorino:
+      operator-level: true
+      component-level: true
+    limitador:
+      operator-level: true
+      component-level: true
+    kuadrant:
+      operator-level: true
+      component-level: true
+
+
+  dashboards:
+    namespace: my-amazing-namespace
+    authorino:
+      operator-level: true
+      component-level: true
+    limitador:
+      operator-level: true
+      component-level: true
+    kuadrant:
+      operator-level: true
+
+
+```
+### Sample use case
+A user might want to set up tracing in the Limitador operator, providing the required endpoint and the optional tags. The user also wants metrics set up with custom ports, requires Services and ServiceMonitors to be created for the Kuadrant operator and the Authorino operator, and wants the log level set to debug, but only for Authorino.

 ```yaml

@@ -49,26 +131,44 @@ spec:
   tracing:
     limitador:
       endpoint: rpc://tempo.tempo.svc.cluster.local:4317
-      insecure: true
+      tags: tag1, tag2

   metrics:
     authorino:
-      deep: true
+      enableService: true
+      port: 8084
+      deep: true
 ```
-
 ### Status
-The status of the observability CR will not be if the observability stack is in a "healthy" state i.e Prometheus and Grafana is up and running. It should be the status of only the things that we contribute for example is new Logging and Tracing now in place or is the console plugin responding. We will not be taking responsibility for aspects we don't have control over.
+The status of the Observability CR will not report whether the observability stack is in a "healthy" state, i.e. whether Prometheus and Grafana are up and running. It should report the status of only the things that we contribute, for example whether the new logging and tracing configuration is in place or whether the console plugin is responding. We will not take responsibility for aspects we don't have control over.

 # Reference-level explanation
 [reference-level-explanation]: #reference-level-explanation

-In terms of how the information supplied in the observability CR will get passed to the other Kuadrant components, multiple options have been brought up:
+Multiple options have been raised for how the information supplied in the Observability CR gets passed to the other Kuadrant components:

 directly - like setting flags or configuration directly on the deployments of the different Kuadrant components.
 indirectly - Passing the information to Authorino & limitador via the Authorino & Limitador CRs
 indirectly - Passing the information to Authorino & limitador in the form of there own Observability CRs

-The best approach would be the direct approach, meaning once the observability CR is updated the information is passed to the relevant component resources e.g deployments. This would mean that certain operators will have to be updated to change CRD's to move configuration to the new observability CRD for example Tracing, Logging and metrics in Authorino and Limitador. For future pieces like the dynamic plugin etc we build this with the method in mind.
+The best approach would be the indirect approach via the component CRs: once the Observability CR is updated, the information is passed to the relevant component CR. For example, the tracing section of the Authorino CR spec would be updated with the required endpoint and other configuration from the Observability CR.
+#### Adding, modifying, deleting and omitting values
+
+If a value is being configured, whether added, modified or removed, all changes have to be made in the Observability CR. If these values are changed at the component level without the Observability CR being changed, they will be overridden back to the source of truth that the Observability CR provides.
+
+If the value is removed from the Observability CR, it will also be removed from the relevant component CRs. If the value is removed from the component CR but not from the Observability CR, it will be overridden and added back.
+
+If no value is provided because the default is acceptable, and the component CR is updated to something other than the default, the value will be overridden back to the default.
+
+If no value is given, there is no default in the Observability CR, and the component CR is updated to add a value, it will be overridden back to no value.
+
+#### Work that's needed
+The indirect approach requires few, if any, changes to the Authorino operator, the Limitador operator, etc. The bulk of the work would be in the Kuadrant operator.
+
+Whether this piece of work requires its own observability controller needs more discussion. Some of the work could be done by the Kuadrant CR, but not everything: for example, it doesn't make sense to have the Kuadrant operator implement the alerts or the dashboards, so a new "observability" controller would be needed. This raises the question: if we need a new controller for some parts of the CR, it might make sense to have the new controller handle the full CR and not have the Kuadrant CR reconcile it.
+
+The changes will come into full effect through a phased approach, because some aspects are not available yet. The phases can follow the Kubernetes API versioning convention, such as v1alpha1 or v1beta1, and be reflected in the CRD.
+

 ### Spec overview

@@ -81,7 +181,6 @@ The spec of the custom resource will have the following:
 | Tracing | **endpoint** *required* (string), **tags** *optional* (map[string]string) , **insecure** (bool) |
 | Metrics | **port** (int32), **deep** (bool) |
 | Healthz | **port** (int32) |
-| Console plugin | **enable** (bool) |
 | Alerts * | **enable** (bool) |
 | Dashboards * | **enable** (bool) |
 | Other 3rd Party * | **enable** (bool) |
@@ -105,7 +204,8 @@ By default if no observability CR is provided the default observability landscap
 The above approach allows for the following:

 * User experience: The observability CR can be easily read to see what the current state of the observability configuration is. Also theres only one place to update rather then multiple.
-* Abstraction: The Kuadrant CR can remain as core functionality and not have it be "muddied" with observability configuration. Observability currently is Logging, metrics and tracing but down the line could include a lot more more configuration. Having it has a standalone API allows for us to future proof observability.
+* Abstraction: [RFC 0006](https://github.com/Kuadrant/architecture/blob/main/rfcs/0006-kuadrant_sub_components_configurations.md) suggests placing this configuration in the Kuadrant CR alongside other non-observability variables. However, with new observability ideas and aspects coming down the line, on top of the already extensive options users can choose from, the Kuadrant CR would become "muddied", hard to maintain and hard to read.
+* Future proof: Observability currently covers logging, metrics and tracing, but down the line could include a lot more configuration. Having it as a standalone API allows us to easily add new features.
 * Single source of truth: Rather then having multiple crs to check what the current configuration is theres a single source of truth.

 ## Other options: