doc: Add a design doc for the persistence of cluster and service configurations

Currently, user-defined configurations, such as the number of replicas for a
given deployment in a MetalK8s cluster, are lost during an upgrade or
downgrade.

This document explains some of the design choices considered while
designing a simple tool for MetalK8s that guarantees that user-defined
configurations are persisted across upgrades and downgrades.

Closes: #2233
Ebaneck committed Mar 3, 2020
1 parent d159dd3 commit 8407ee2
Showing 2 changed files with 209 additions and 0 deletions.
198 changes: 198 additions & 0 deletions docs/developer/architecture/configurations.rst
@@ -73,3 +73,201 @@ As a MetalK8s expert, I can use ``kubectl`` command(s) in order to edit all
settings that are exposed. The intent is to have a method / API that an expert
could use, if the right CLI tool or GUI is not available or not functioning as
expected.

Design Choices
--------------

Based on the above requirements, :term:`ConfigMap` is chosen as the unified
data access and storage medium for cluster and service configurations in a
MetalK8s cluster, for the following reasons:

* Update operations on ConfigMaps are easy to support from both the CLI and
  the UI, using our existing Python Kubernetes module.
* The design and implementation remain adaptable and easy to change in cases
  where customer needs evolve rapidly.
* ConfigMaps are stored in the :term:`etcd` database, which is generally
  backed up. This makes it unlikely that user settings are lost.

How it works
^^^^^^^^^^^^

During the Bootstrap stages, once we are certain that the K8s cluster is
fully ready and available, we can perform the following actions:

- Create and deploy ConfigMaps that hold cluster and service configurations
  and pre-fill them with default values.
- Template service pods and deployments to consume configuration data
  directly from the ConfigMaps deployed above.

This approach works because in a MetalK8s cluster, ConfigMaps for cluster and
service configurations are available before we deploy the configured services.

**Using Salt states**

Once a user updates a ConfigMap (for example, by changing the number of
replicas for the Prometheus deployments), perform the following actions:

- Apply a Salt state that reads the ConfigMap object, validates its schema,
  checks the new values, and re-applies them to the deployment in question.
- Restart the Kubernetes deployment to pick up the newly applied service
  configuration.
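
The read, validate and re-apply step can be sketched in plain Python. This is
only an illustration under stated assumptions, not the actual Salt state: the
helper names ``validate_config`` and ``deployment_patch`` are hypothetical, the
configuration is assumed to be already parsed from ``config.yaml`` into a
dictionary, and requiring at least one replica is a choice made for this
example.

.. code-block:: python

   # apiVersion taken from the use case in this document.
   EXPECTED_API_VERSION = "metalk8s.scality.com/v1alpha1"


   def validate_config(config, expected_kind):
       """Return a list of error messages; an empty list means valid."""
       if not isinstance(config, dict):
           return ["configuration must be a mapping"]
       errors = []
       if config.get("apiVersion") != EXPECTED_API_VERSION:
           errors.append(
               "unsupported apiVersion: {!r}".format(config.get("apiVersion"))
           )
       if config.get("kind") != expected_kind:
           errors.append("unexpected kind: {!r}".format(config.get("kind")))
       # Walk the hierarchy defensively: any level may be missing or mistyped.
       spec = config.get("spec")
       deployment = spec.get("deployment") if isinstance(spec, dict) else None
       replicas = (
           deployment.get("replicas") if isinstance(deployment, dict) else None
       )
       # bool is a subclass of int in Python, so it is rejected explicitly.
       if (isinstance(replicas, bool) or not isinstance(replicas, int)
               or replicas < 1):
           errors.append("spec.deployment.replicas must be a positive integer")
       return errors


   def deployment_patch(config, expected_kind):
       """Build the patch body to re-apply to the Deployment (assumed shape)."""
       errors = validate_config(config, expected_kind)
       if errors:
           raise ValueError("; ".join(errors))
       return {"spec": {"replicas": config["spec"]["deployment"]["replicas"]}}

Rejecting an invalid configuration before building the patch keeps a bad user
edit from ever reaching the running Deployment.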

Storage format
~~~~~~~~~~~~~~

A YAML (K8s-like) format was chosen to represent the data field instead of a
flat key-value structure for the following reasons:

- YAML-formatted configurations are easy to write and understand, which makes
  them simpler for users to edit.
- The YAML format can carry a schema version, which can be checked and
  validated against the version we deploy.
- YAML describes hierarchical data structures natively, whereas a flat
  key-value format would require encoding (and then decoding) this
  hierarchical structure.
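
To illustrate the last point, the following sketch shows the kind of encoding
and decoding a flat key-value layout would need. The ``flatten`` and
``unflatten`` helpers and the dot-separated key convention are assumptions made
for this example; they are not part of MetalK8s.

.. code-block:: python

   def flatten(tree, prefix=""):
       """Encode a nested mapping as flat, dot-separated keys."""
       flat = {}
       for key, value in tree.items():
           path = "{}.{}".format(prefix, key) if prefix else key
           if isinstance(value, dict):
               flat.update(flatten(value, path))
           else:
               # ConfigMap data values are strings, so types are lost here.
               flat[path] = str(value)
       return flat


   def unflatten(flat):
       """Decode dot-separated keys back into a nested mapping."""
       tree = {}
       for path, value in flat.items():
           node = tree
           *parents, leaf = path.split(".")
           for part in parents:
               node = node.setdefault(part, {})
           node[leaf] = value
       return tree


   nested = {"spec": {"deployment": {"replicas": 2}}}
   flat = flatten(nested)
   print(flat)             # {'spec.deployment.replicas': '2'}
   print(unflatten(flat))  # {'spec': {'deployment': {'replicas': '2'}}}

Note that the round trip returns ``'2'`` as a string rather than an integer:
a flat layout also needs a convention for value types, which an embedded YAML
document provides for free.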

A sample ConfigMap can be defined with the following fields.

.. code-block:: yaml

   apiVersion: v1
   kind: ConfigMap
   metadata:
     namespace: <namespace>
     name: <config-name>
   data:
     config.yaml: |-
       apiVersion: <object-version>
       kind: <kind>
       spec:
         <key>: <values>

**Use case 1:**

Configure and store the number of replicas for service-specific Deployments
found in the ``metalk8s-monitoring`` namespace using the ConfigMap format.

.. code-block:: yaml

   apiVersion: v1
   kind: ConfigMap
   metadata:
     namespace: metalk8s-monitoring
     name: metalk8s-grafana-config
   data:
     config.yaml: |-
       apiVersion: metalk8s.scality.com/v1alpha1
       kind: GrafanaConfig
       spec:
         deployment:
           replicas: 2

Non-goals
~~~~~~~~~

This section lists requirements stated above that the current design does not
cover and that will be addressed later:

- Persisting newly added Grafana dashboards or data sources, especially those
  added through the Grafana UI. Such modifications cannot be stored in
  ConfigMaps and will be handled later.

- Adding and editing Prometheus alert rules, as stated in the requirements, is
  also not covered by the chosen design and will need to be addressed
  differently. Even if we could use ConfigMaps for Prometheus rules, we prefer
  relying on the Prometheus Operator and its CRD (PrometheusRule).

Rejected design choices
~~~~~~~~~~~~~~~~~~~~~~~

Consul KV vs ConfigMap
~~~~~~~~~~~~~~~~~~~~~~

Consul KV offers a full-fledged KV store with a ``/kv`` endpoint that allows
CRUD operations on all KV data stored in it.
Consul KV also gives access to past versions of objects and supports
optimistic concurrency when manipulating multiple objects.

The Consul KV store was rejected because operating a full-fledged KV system,
including full backups and system restores, requires considerably more time
and effort than the ConfigMap approach.

Operator (Custom Controller) vs Salt
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Operators are useful in that they provide self-healing functionality on a
reactive basis. When a user changes a given configuration, it is easy to
reconcile and apply these changes to the in-cluster objects.

The Operator approach was rejected because it is much more complex and requires
much more effort to implement, and there is no real need for it: configuration
changes are infrequent (a typical MetalK8s admin might change the number of
replicas for a given deployment once every three months, or less often).
Having an operator watch for object changes therefore brings little benefit at
this point in time.

In the Salt approach, Salt formulas are designed to be idempotent, which
ensures that service configuration changes can be applied safely each time a
new configuration is introduced.
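
Idempotence here means that applying the same configuration twice leaves the
system in the same state as applying it once. A minimal sketch of that
property follows, assuming an in-memory dictionary stands in for the
Deployment object; the ``apply_replicas`` helper and the ``grafana`` name are
hypothetical, and a real Salt state would go through the Kubernetes API
instead.

.. code-block:: python

   def apply_replicas(deployment, desired):
       """Set the replica count; return True only when something changed."""
       current = deployment["spec"].get("replicas")
       if current == desired:
           return False  # already in the desired state: nothing to do
       deployment["spec"]["replicas"] = desired
       return True


   deployment = {"metadata": {"name": "grafana"}, "spec": {"replicas": 1}}
   print(apply_replicas(deployment, 2))  # True: the first run changes the object
   print(apply_replicas(deployment, 2))  # False: the second run is a no-op

Reporting whether a change was made mirrors how configuration management tools
distinguish a changed state from an already satisfied one.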

Implementation Details
----------------------

Iteration 1
^^^^^^^^^^^

- Define and deploy new ConfigMap stores that will hold cluster and service
  configurations as listed in the requirements. For each ConfigMap, define its
  schema, its default values, and how it impacts the configured services
- Template and render Deployment and Pod manifests that will make use of
  these persisted cluster and service configurations
- Document how to change cluster and service configurations using ``kubectl``
- Document the entire list of configurations which can be changed by the user

Iteration 2
^^^^^^^^^^^

- Provide a CLI tool for changing any of the cluster and service
  configurations:

  - Count of replicas for chosen Deployments (Prometheus)
  - Updating a Dex authentication connector (OpenLDAP, AD and
    staticUser store)
  - Updating the Alertmanager notification configuration

- Provide a UI for adding, updating and deleting service-specific
  configurations, for example the Dex-LDAP connector integration
- Provide a UI for listing the Dex authentication connectors available and
  supported in MetalK8s
- Provide a UI for enabling or disabling Dex authentication
  connectors (LDAP, Active Directory, StaticUser store)
- Add a UI for listing the Alertmanager notification systems MetalK8s
  will support (Slack, email)
- Provide a UI for adding, modifying and deleting Alertmanager
  configurations from the listing above

Documentation
-------------

In the Operational Guide:

* Document how to customize or change any given service setting using the CLI
  tool
* Document how to customize or change any given service setting using the UI
* Document the list of service settings which can be configured by the user

Test Plan
---------

- Add a test ensuring that update operations on user configurations are
  propagated down to the various services

- Other corner cases that require testing to reduce error-prone setups include:

  - Checking for invalid values in a user-defined configuration (e.g. setting
    the number of replicas to a string (``"two"``))
  - Checking for invalid formats in a user configuration
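
The invalid-value corner cases could be covered by a table-driven unit test
along the following lines. This is a sketch only: ``is_valid_replicas`` is a
hypothetical checker, and treating zero replicas as invalid is an assumption
made for this example rather than a MetalK8s rule.

.. code-block:: python

   import unittest


   def is_valid_replicas(value):
       """Hypothetical check: replicas must be a strictly positive integer."""
       # bool is a subclass of int in Python, so it is rejected explicitly.
       return (
           isinstance(value, int)
           and not isinstance(value, bool)
           and value > 0
       )


   class ReplicasValidationTest(unittest.TestCase):
       def test_rejects_invalid_values(self):
           for value in ("two", 0, -1, 1.5, None, True, [2]):
               with self.subTest(value=value):
                   self.assertFalse(is_valid_replicas(value))

       def test_accepts_positive_integers(self):
           for value in (1, 2, 10):
               with self.subTest(value=value):
                   self.assertTrue(is_valid_replicas(value))
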
11 changes: 11 additions & 0 deletions docs/glossary.rst
@@ -31,6 +31,17 @@ Glossary
and from where the cluster will be deployed to other machines. It also
serves as the entrypoint for upgrades of the cluster.

ConfigMap
A ConfigMap is a Kubernetes object that stores general configuration
information, such as environment variables, as key-value pairs.
ConfigMaps are namespaced objects. Once created, a ConfigMap can be updated,
and Pods that mount it as a volume see the new values without restarting
their containers. Values consumed as environment variables are only
refreshed when the container restarts.

|see K8s docs|
`ConfigMap <https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/#understanding-configmaps-and-pods/>`_.

Controller Manager
``kube-controller-manager``
The Kubernetes controller manager embeds the core control loops shipped