From 5288ebfed9a24a1e9582832133d65d2fb129972d Mon Sep 17 00:00:00 2001
From: Claude Ebaneck
Date: Thu, 13 Feb 2020 14:14:28 +0100
Subject: [PATCH] doc: Add a design doc for user customization & persistence

Initially, user defined configurations such as changing the number of
replicas for a given deployment in a MetalK8s cluster are lost during
an upgrade/downgrade scenario.

This document explains some of the design choices considered while
designing a simplistic tool for MetalK8s that guarantees that user
defined configurations are persisted throughout.

Closes: #2233
---
 docs/developer/architecture/index.rst | 1 +
 .../user_customization_and_persistence.rst | 201 ++++++++++++++++++
 docs/glossary.rst | 12 ++
 3 files changed, 214 insertions(+)
 create mode 100644 docs/developer/architecture/user_customization_and_persistence.rst

diff --git a/docs/developer/architecture/index.rst b/docs/developer/architecture/index.rst
index aa580ab153..f558d9f489 100644
--- a/docs/developer/architecture/index.rst
+++ b/docs/developer/architecture/index.rst
@@ -7,3 +7,4 @@ Architecture Documents
    deployment
    monitoring
    requirements
+   user_customization_and_persistence
diff --git a/docs/developer/architecture/user_customization_and_persistence.rst b/docs/developer/architecture/user_customization_and_persistence.rst
new file mode 100644
index 0000000000..85f934606d
--- /dev/null
+++ b/docs/developer/architecture/user_customization_and_persistence.rst
@@ -0,0 +1,201 @@
+User Customization & Persistence
+================================
+
+Context
+-------
+
+.. todo::
+
+   This section will be handled by the requirements PR.
+
+
+Design Choices
+--------------
+
+A :term:`ConfigMap` store is chosen as the unified data access and storage
+medium for user-editable configurations in a MetalK8s cluster, based on the
+above requirements, for the following reasons:
+
+* Update operations on ConfigMaps are easy to support from both the CLI and
+  the UI using our existing Python Kubernetes module.
+* The design and implementation remain easy to adapt and change in cases
+  where customer needs evolve rapidly.
+* ConfigMap data is stored in the :term:`etcd` database, which is generally
+  backed up. This ensures that user settings cannot be lost easily.
+
+
+Rejected design choices
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Consul KV
+^^^^^^^^^
+
+This approach offers a full-fledged KV store with a ``/kv`` endpoint which
+allows CRUD operations on all KV data stored in it.
+Consul KV also allows access to past versions of objects and provides
+optimistic concurrency control when manipulating multiple objects.
+
+Note that the Consul KV store was rejected because operating a full-fledged KV
+system (performing full backups and system restores, for example) requires
+more time and effort than the ConfigMap KV store, which is simple and matches
+the stated requirements.
+
+
+Implementation Details
+----------------------
+
+Storage format
+~~~~~~~~~~~~~~
+
+A sample ConfigMap store can be defined with the following fields.
+
+An example of such a ConfigMap store:
+
+.. code-block:: yaml
+
+    apiVersion: v1
+    kind: ConfigMap
+    metadata:
+      namespace:
+      name:
+    data:
+      config.yaml: |-
+        apiVersion:
+        kind:
+        spec:
+          :
+
+**Use case 1:**
+
+Configure and store the number of replicas for service-specific Deployments
+found in the ``metalk8s-monitoring`` namespace using the ConfigMap store format.
+
+.. code-block:: yaml
+
+    apiVersion: v1
+    kind: ConfigMap
+    metadata:
+      namespace: metalk8s-monitoring
+      name: metalk8s-grafana-userconfig
+    data:
+      config.yaml: |-
+        apiVersion: metalk8s.scality.com/v1alpha1
+        kind: GrafanaUserConfig
+        spec:
+          replicas: 2
+
+How it works
+~~~~~~~~~~~~
+
+Service pods and deployments will be configured to consume configuration data
+directly from their respective minion external pillars.
+
+During Bootstrap, these external pillar values will be pre-filled with default
+values and the service consumers will be configured to use these values.
+
+**Using Salt states (fairly manual approach)**
+
+Once a ConfigMap KV store is updated by the user (say a user changes the
+number of replicas for Prometheus deployments to a new value), perform the
+following actions:
+
+  - Apply a Salt state that reads the ConfigMap object, validates the schema
+    based on MetalK8s defined standards and checks the new values passed.
+  - If the ConfigMap object is valid, the new values passed by the user are
+    then re-rendered to the pillars.
+  - Finally, we make sure that the updated values are picked up by their
+    respective consumers (this might require Pod restarts for changes to take
+    effect).
+
+Note that Salt states are used to sync data and notify consumers of new
+configuration changes mainly because of the minimal effort it takes to set
+up this flow (that is, a configuration update by the user and its propagation
+to the consumers). The K8s Operator pattern could be used instead to
+synchronize user-defined configurations with their consumers, but that
+approach is much more complex and requires much more effort to
+realize.
+
+**Using Operator architecture (Custom Controllers)**
+
+When using an Operator (a Custom Controller that works with CRDs), we create a
+Custom Resource Definition (CRD) that references a ConfigMap. Once the CRD is
+updated by the user, the Operator is designed to perform the following
+actions:
+
+  - The Operator is connected to the API server to watch for changes in the
+    CRD.
+  - If the Operator detects a modified ConfigMap, it then determines which
+    course of action to take, as follows:
+
+    - Extract the ConfigMap name and object fields
+    - Extract the pods associated with this ConfigMap based on its labels
+    - Read and validate the ConfigMap data and schema
+    - If the ConfigMap is valid, update the pillars and restart the
+      respective pods so that they pick up new configs from the pillars.
+    - If the ConfigMap is invalid, log the error and perform no further
+      action, because applying a bad ConfigMap could lead to cluster
+      outages.
+
+Iteration 1
+~~~~~~~~~~~
+
+- Define and deploy new ConfigMap stores that will hold user configurations
+  as listed in the requirements
+- Template and render Deployment and Pod manifests that will make use of
+  these user-customizable configurations using pillar values
+- Document how to change user configurations using ``kubectl``
+
+
+Iteration 2
+~~~~~~~~~~~
+
+- Provide a CLI tool for changing any of the user configurations:
+
+  - Changing the count of replicas for chosen Deployments (Prometheus)
+  - Updating a Dex authentication connector (OpenLDAP, AD and
+    staticUser store)
+  - Updating the Alertmanager notification configuration
+
+- Provide a UI that allows Update operations on all user-customizable
+  settings based on the above requirements
+- Provide a UI for adding, updating and deleting service-specific
+  configurations, for example the Dex-LDAP connector integration
+- Provide a UI for listing the Dex authentication connectors that MetalK8s
+  makes available/supports
+- Provide a UI for enabling or disabling Dex authentication
+  connectors (LDAP, Active Directory, StaticUser store)
+- Provide a UI for changing the number of replicas for a chosen set
+  of MetalK8s deployments (Prometheus, ...)
+- Add a UI for listing the Alertmanager notification systems MetalK8s
+  will support (Slack, email, HipChat)
+- Provide a UI for adding, modifying and deleting Alertmanager
+  configurations from the listing above
+
+Documentation
+-------------
+
+In the Operational Guide:
+
+* Document how to customize or change any given service settings using the CLI
+  tool
+* Document how to customize or change any given service settings using the
+  UI
+
+Test Plan
+---------
+
+- Dex Static User authentication is currently covered in our test suite. It
+  makes sense to cover at least one other authentication connector, the
+  easiest being LDAP, since we readily have access to OpenLDAP Docker images
+  and automating this process is possible on Scality Cloud
+
+- Add a test that ensures that update operations on user configurations are
+  propagated down to the various services
+
+- Other corner cases that require testing to reduce error-prone setups include:
+
+  - Checking for invalid values in a user-defined configuration (e.g. setting
+    the number of replicas to a string such as ``"two"``)
+  - Checking for invalid formats in a user configuration
+  - Checking that, when a user loses a configuration, we can actually revert
+    to default values within the pillars
diff --git a/docs/glossary.rst b/docs/glossary.rst
index c778420370..73fec3518a 100644
--- a/docs/glossary.rst
+++ b/docs/glossary.rst
@@ -31,6 +31,18 @@ Glossary
       and from where the cluster will be deployed to other machines. It also
       serves as the entrypoint for upgrades of the cluster.
+
+   ConfigMap
+      A ConfigMap is a Kubernetes object that allows one to store general
+      configuration information such as environment variables in a key-value
+      pair format.
+      ConfigMaps can only be applied to namespaces and, once created, they can
+      be updated automatically without the need to restart the containers that
+      depend on them.
+
+      |see K8s docs|
+      `ConfigMap `_.
+
    Controller Manager
    ``kube-controller-manager``
       The Kubernetes controller manager embeds the core control loops shipped