doc: Add a design doc for user customization & persistence
Currently, user-defined configurations, such as a changed number of
replicas for a given deployment in a MetalK8s cluster, are lost during
an upgrade or downgrade.

This document explains some of the design choices considered while
designing a simple tool for MetalK8s that guarantees that user-defined
configurations persist across upgrades and downgrades.

Closes: #2233
Ebaneck committed Feb 14, 2020
1 parent 17cbd14 commit 5d2a994
Showing 3 changed files with 183 additions and 0 deletions.
1 change: 1 addition & 0 deletions docs/developer/architecture/index.rst
@@ -7,3 +7,4 @@ Architecture Documents
deployment
monitoring
requirements
user_customization_and_persistence
170 changes: 170 additions & 0 deletions docs/developer/architecture/user_customization_and_persistence.rst
@@ -0,0 +1,170 @@
User Customization & Persistence
================================

Context
-------

.. todo::

This section will be handled by the requirements PR.


Design Choices
--------------

A :term:`ConfigMap` store is chosen as the unified data access and storage
medium for user-editable configurations in a MetalK8s cluster. Based on the
above requirements, it was chosen for the following reasons:

* Update operations on ConfigMaps are easy to support from both the CLI and
  the UI using our existing Python Kubernetes module.
* The design and implementation remain easy to adapt and change in cases
  where customer needs evolve rapidly.
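
As a rough illustration of the first point, an update through the upstream
Python ``kubernetes`` client could look like the sketch below. It assumes that
client is the module in question (this document does not say so), and
``build_patch`` and ``update_user_config`` are hypothetical names. The
``config.yaml`` key comes from the storage format described under
Implementation Details.

.. code-block:: python

   def build_patch(config_text):
       """Return a patch body that replaces only the ``config.yaml`` entry."""
       return {"data": {"config.yaml": config_text}}


   def update_user_config(namespace, name, config_text):
       """Patch the ConfigMap in place (requires access to a cluster)."""
       # Imported lazily so that build_patch() works without the client.
       from kubernetes import client, config

       # Assumption: credentials come from a local kubeconfig file.
       config.load_kube_config()
       api = client.CoreV1Api()
       return api.patch_namespaced_config_map(
           name=name, namespace=namespace, body=build_patch(config_text)
       )

Sending a patch rather than replacing the whole object leaves any other keys
of the ConfigMap untouched.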


Rejected design choices
~~~~~~~~~~~~~~~~~~~~~~~

Consul KV
~~~~~~~~~

This approach offers a full-fledged KV store with a ``/kv`` endpoint, which
allows CRUD operations on all KV data stored in it.
Consul KV also allows access to past versions of objects and uses optimistic
concurrency control when manipulating multiple objects.

Consul KV was rejected because operating a full-fledged KV system, including
tasks such as full backups and system restores, requires considerably more time
and effort than the ConfigMap KV store, which is simple and matches the stated
requirements.


Implementation Details
----------------------

Storage format
~~~~~~~~~~~~~~

A ConfigMap store can be defined with the following fields:

.. code-block:: yaml

   apiVersion: v1
   kind: ConfigMap
   metadata:
     namespace: <namespace>
     name: <config-name>
   data:
     config.yaml: |-
       apiVersion: <object-version>
       kind: <kind>
       spec:
         <key>: <values>

**Use case 1:**

Configure and store the number of replicas for service-specific Deployments
found in the ``metalk8s-monitoring`` namespace using the ConfigMap store format.

.. code-block:: yaml

   apiVersion: v1
   kind: ConfigMap
   metadata:
     namespace: metalk8s-monitoring
     name: metalk8s-grafana-userconfig
   data:
     config.yaml: |-
       apiVersion: metalk8s.scality.com/v1alpha1
       kind: GrafanaUserConfig
       spec:
         replicas: 2

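
The same object can also be produced programmatically. The following sketch
builds the Grafana example as a plain Python dictionary.
``render_user_config`` and ``build_configmap`` are hypothetical helpers, and
the hand-rolled rendering only handles a flat ``spec`` with scalar values,
which is enough for ``replicas`` but not for nested settings.

.. code-block:: python

   import json

   API_VERSION = "metalk8s.scality.com/v1alpha1"


   def render_user_config(kind, spec):
       """Render the embedded ``config.yaml`` document for a flat ``spec``."""
       lines = ["apiVersion: " + API_VERSION, "kind: " + kind, "spec:"]
       for key, value in spec.items():
           # Scalars are written as JSON literals, which are valid YAML too.
           lines.append("  {}: {}".format(key, json.dumps(value)))
       return "\n".join(lines)


   def build_configmap(namespace, name, kind, spec):
       """Wrap the rendered document in a ConfigMap manifest."""
       return {
           "apiVersion": "v1",
           "kind": "ConfigMap",
           "metadata": {"namespace": namespace, "name": name},
           "data": {"config.yaml": render_user_config(kind, spec)},
       }


   manifest = build_configmap(
       "metalk8s-monitoring",
       "metalk8s-grafana-userconfig",
       "GrafanaUserConfig",
       {"replicas": 2},
   )
   print(manifest["data"]["config.yaml"])
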
How it works
~~~~~~~~~~~~

Service pods and deployments will be configured to consume configuration data
directly from their respective minion external pillars.

During Bootstrap, these external pillar values will be pre-filled with default
values and the service consumers will be configured to use these values.

Once a ConfigMap KV store is updated by the user (say, the user changes the
number of replicas for Prometheus deployments to a new value), we apply a Salt
state that reads the ConfigMap object, validates both the schema and the new
values passed by the user, then re-renders the values into the pillars so that
the updated values are picked up by their consumers (this might require Pod
restarts for the changes to take effect).
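
The validate-then-render step can be pictured with the sketch below. It is not
the Salt state itself: it works on an already parsed ``config.yaml`` document,
and the ``DEFAULTS`` table, the rule applied to ``replicas`` and the function
names are assumptions made for illustration.

.. code-block:: python

   DEFAULTS = {"replicas": 1}  # hypothetical defaults pre-filled at bootstrap


   def validate_user_config(config, expected_kind):
       """Return the ``spec`` of a parsed user config, or raise ValueError."""
       if not isinstance(config, dict):
           raise ValueError("configuration must be a mapping")
       if config.get("kind") != expected_kind:
           raise ValueError("unexpected kind: {!r}".format(config.get("kind")))
       spec = config.get("spec") or {}
       if not isinstance(spec, dict):
           raise ValueError("spec must be a mapping")
       replicas = spec.get("replicas", DEFAULTS["replicas"])
       # bool is a subclass of int, so it has to be rejected explicitly.
       is_int = isinstance(replicas, int) and not isinstance(replicas, bool)
       if not is_int or replicas < 1:
           raise ValueError("replicas must be a positive integer")
       return spec


   def render_pillar(config, expected_kind):
       """Overlay validated user values on the defaults.

       When no user configuration exists (``config`` is None), the defaults
       are returned unchanged.
       """
       pillar = dict(DEFAULTS)
       if config is not None:
           pillar.update(validate_user_config(config, expected_kind))
       return pillar

Rejecting a bad value before it reaches the pillars means the consumers keep
running with their previous, known-good settings.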

SaltStack and Salt states are used to sync data and to update consumers with
new configuration changes, mainly because this flow (the configuration update
by the user and its propagation to the consumers) takes minimal effort to set
up. The K8s Operator pattern could replace Salt for synchronizing user-defined
configurations with their consumers, but that approach is much more complex and
requires much more effort to realize.

Iteration 1
~~~~~~~~~~~

- Define and deploy new namespace-bound ConfigMap stores that will hold
  user configurations as listed in the requirements
- Template and render Deployment and Pod manifests that will make use of
  these user-customizable configurations using default pillar values


Iteration 2
~~~~~~~~~~~

- Provide a CLI tool for changing any of the user configurations:

  - Count of replicas for chosen Deployments (Prometheus)
  - Updating a Dex authentication connector (OpenLDAP, AD and
    staticUser store)
  - Updating the Alertmanager notification configuration

- Provide a UI that allows Update operations on all user-customizable
  settings based on the above requirements
- Provide a UI for adding, updating and deleting service-specific
  configurations, for example the Dex-LDAP connector integration
- Provide a UI for listing the Dex authentication connectors that MetalK8s
  makes available/supports
- Provide a UI for enabling or disabling Dex authentication
  connectors (LDAP, Active Directory, StaticUser store)
- Provide a UI for changing the number of replicas for a chosen set
  of MetalK8s deployments (Prometheus, ...)
- Add a UI for listing the Alertmanager notification systems MetalK8s
  will support (Slack, email, HipChat)
- Provide a UI for adding, modifying and deleting Alertmanager
  configurations from the listing above

Documentation
-------------

In the Operational Guide:

* Document how to customize or change any given service settings using the CLI
tool
* Document how to customize or change any given service settings using the UI
interface

Test Plan
---------

- Dex Static User authentication is currently covered in our test suite. It
  makes sense to cover at least one other authentication connector, the
  easiest being LDAP, since OpenLDAP Docker images are readily available and
  automating this process is possible on Scality Cloud

- Add a test that ensures that update operations on user configurations are
  propagated down to the various services

- Other corner cases that require testing to reduce error-prone setups include:

  - Checking for invalid values in a user-defined configuration (e.g. setting
    the number of replicas to the string ``"two"``)
  - Checking for invalid formats in a user configuration
  - Checking that, when a user configuration is lost, we can revert to the
    default values in the pillars
12 changes: 12 additions & 0 deletions docs/glossary.rst
@@ -31,6 +31,18 @@ Glossary
and from where the cluster will be deployed to other machines. It also
serves as the entrypoint for upgrades of the cluster.


ConfigMap
A ConfigMap is a Kubernetes object that stores general configuration
information, such as environment variables, as key-value pairs.
ConfigMaps are namespaced objects. Once created, they can be updated, and
Pods that mount them as volumes pick up the changes automatically, without
the need to restart the containers that depend on them.

|see K8s docs|
`ConfigMap <https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/#understanding-configmaps-and-pods/>`_.

Controller Manager
``kube-controller-manager``
The Kubernetes controller manager embeds the core control loops shipped