doc: Add a design doc for user customization & persistence
Initially, user defined configurations such as changing the number of replicas for a given deployment in a MetalK8s cluster are lost during an upgrade/downgrade scenario. This document explains some of the design choices considered while designing a simplistic tool for MetalK8s that guarantees that user defined configurations are persisted throughout. Closes: #2233
Showing 3 changed files with 231 additions and 0 deletions.
@@ -7,3 +7,4 @@ Architecture Documents
 deployment
 monitoring
 requirements
+user_customization_and_persistence
218 changes: 218 additions & 0 deletions
docs/developer/architecture/user_customization_and_persistence.rst
@@ -0,0 +1,218 @@
User Customization & Persistence
================================

Context
-------

.. todo::

   This section will be handled by the requirements PR.


Design Choices
--------------

A :term:`ConfigMap` store is chosen as the unified data access and storage
medium for user-editable configurations in a MetalK8s cluster, based on the
above requirements, for the following reasons:

* Update operations on ConfigMaps are easy to support from both the CLI and
  the UI using our existing Python Kubernetes module.
* Guaranteed adaptability: the design and implementation are easy to change
  in cases where customer needs evolve rapidly.
* ConfigMap data is stored in the :term:`etcd` database, which is generally
  backed up. This ensures that user settings cannot be lost easily.

.. note::

   Newly added Grafana dashboards or datasources, especially those added via
   the Grafana UI, cannot be stored in ConfigMaps. To handle this particular
   use case, persistent storage volumes must be provisioned so that these
   settings persist across Pod restarts.


Rejected design choices
~~~~~~~~~~~~~~~~~~~~~~~

Consul KV
~~~~~~~~~

This approach offers a full-fledged KV store with a ``/kv`` endpoint which
allows CRUD operations on all KV data stored in it.
Consul KV also allows access to past versions of objects and supports
optimistic concurrency when manipulating multiple objects.

Consul KV was rejected because managing operations such as full backups and
system restores for a full-fledged KV system requires much more time and
effort than the ConfigMap KV store, which is simple and matches the stated
requirements.


Implementation Details
----------------------

Storage format
~~~~~~~~~~~~~~

A ConfigMap store is defined with the following fields:

.. code-block:: yaml

   apiVersion: v1
   kind: ConfigMap
   metadata:
     namespace: <namespace>
     name: <config-name>
   data:
     config.yaml: |-
       apiVersion: <object-version>
       kind: <kind>
       spec:
         <key>: <values>

**Use case 1:**

Configure and store the number of replicas for service-specific Deployments
found in the ``metalk8s-monitoring`` namespace using the ConfigMap store
format.

.. code-block:: yaml

   apiVersion: v1
   kind: ConfigMap
   metadata:
     namespace: metalk8s-monitoring
     name: metalk8s-grafana-userconfig
   data:
     config.yaml: |-
       apiVersion: metalk8s.scality.com/v1alpha1
       kind: GrafanaUserConfig
       spec:
         replicas: 2

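
As an illustration of how an update to such a store could be issued with the
Python Kubernetes client, here is a minimal sketch. The helper names
``build_userconfig_patch`` and ``apply_userconfig_patch`` are hypothetical and
not part of MetalK8s. The payload is assembled as plain text so that the
example needs no YAML library, and the second helper assumes a reachable
cluster and a local kubeconfig.

.. code-block:: python

   def build_userconfig_patch(kind, replicas,
                              api_version="metalk8s.scality.com/v1alpha1"):
       """Return a ConfigMap patch body carrying a new config.yaml payload."""
       config_yaml = "\n".join([
           f"apiVersion: {api_version}",
           f"kind: {kind}",
           "spec:",
           # int() rejects values such as "two" before anything is sent.
           f"  replicas: {int(replicas)}",
       ])
       return {"data": {"config.yaml": config_yaml}}


   def apply_userconfig_patch(name, namespace, patch):
       """Send the patch to the API server (requires a reachable cluster)."""
       # Imported lazily so that building a patch does not need the client.
       from kubernetes import client, config

       config.load_kube_config()
       client.CoreV1Api().patch_namespaced_config_map(
           name=name, namespace=namespace, body=patch
       )


   patch = build_userconfig_patch("GrafanaUserConfig", replicas=3)
   print(patch["data"]["config.yaml"])

Calling ``apply_userconfig_patch("metalk8s-grafana-userconfig",
"metalk8s-monitoring", patch)`` would then update the store from use case 1.
Keeping the patch construction separate from the API call makes the payload
easy to inspect and test without a cluster.
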

How it works
~~~~~~~~~~~~

Service Pods and Deployments will be configured to consume configuration data
directly from their respective minion external pillars.

During bootstrap, these external pillar values will be pre-filled with default
values, and the service consumers will be configured to use these values.

**Using Salt states**

Once a ConfigMap KV store is updated by the user (say, a user changes the
number of replicas for Prometheus deployments to a new value), perform the
following actions:

- Apply a Salt state that reads the ConfigMap object, validates its schema
  against MetalK8s-defined standards and checks the new values passed.
- If the ConfigMap object is valid, re-render the new values passed by the
  user into the pillars.
- Finally, make sure that the updated values are picked up by their
  respective consumers (this might require Pod restarts for changes to take
  effect).

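
The validate-then-render steps above can be sketched in plain Python. This is
an illustrative sketch only, not the actual Salt state: the names
``DEFAULT_SPEC``, ``validate_user_config`` and ``render_pillar`` are
hypothetical, the input is assumed to be the already-parsed ``config.yaml``
content, and treating any non-negative integer as a valid replica count is an
assumption.

.. code-block:: python

   DEFAULT_SPEC = {"replicas": 1}  # hypothetical bootstrap default


   def validate_user_config(config, expected_kind):
       """Return the spec mapping if config is well formed, else raise."""
       if not isinstance(config, dict):
           raise ValueError("user config must be a mapping")
       if config.get("kind") != expected_kind:
           raise ValueError(
               f"expected kind {expected_kind!r}, got {config.get('kind')!r}"
           )
       spec = config.get("spec")
       if not isinstance(spec, dict):
           raise ValueError("'spec' must be a mapping")
       if "replicas" in spec:
           replicas = spec["replicas"]
           # bool is a subclass of int in Python, so reject it explicitly.
           if (isinstance(replicas, bool) or not isinstance(replicas, int)
                   or replicas < 0):
               raise ValueError(
                   f"'replicas' must be a non-negative integer, "
                   f"got {replicas!r}"
               )
       return spec


   def render_pillar(config, expected_kind, defaults=DEFAULT_SPEC):
       """Merge validated user values over the bootstrap defaults."""
       if config is None:  # ConfigMap missing or lost: keep the defaults
           return dict(defaults)
       return {**defaults, **validate_user_config(config, expected_kind)}


   user_config = {
       "apiVersion": "metalk8s.scality.com/v1alpha1",
       "kind": "GrafanaUserConfig",
       "spec": {"replicas": 2},
   }
   print(render_pillar(user_config, "GrafanaUserConfig"))

Raising on the first invalid field keeps a bad configuration from ever
reaching the pillars, and passing ``None`` models the case where the user
configuration is absent and the defaults must be used.
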

Note that Salt states are used to sync data and to update consumers with new
configuration changes mainly because of the minimal effort it takes to set up
this flow (i.e. the configuration update by the user and its propagation to
the consumers). The K8s Operator pattern could be used instead to synchronize
user-defined configurations with their consumers.

The Operator approach is much more complex and requires much more effort to
realize. There is also no real need to apply changes this way because
configuration changes are infrequent (for a typical MetalK8s admin, changing
the number of replicas for a given deployment could happen once in 3 months).
As such, having an Operator watch for object changes is not very useful at
this point in time.

**Using Operator architecture (Custom Controllers)**

When using an Operator (a Custom Controller that works with CRDs), we create a
Custom Resource Definition (CRD) that references a ConfigMap. Once a ConfigMap
is updated by the user, the Operator is designed to perform the following
actions:

- The Operator is connected to the API server to watch for changes in the
  ConfigMap.
- If the Operator detects a modified ConfigMap, it then determines which
  course of action to take:

  - Extract the ConfigMap name and object fields
  - Extract the Pods associated with this ConfigMap based on its labels
  - Read and validate the ConfigMap data and schema
  - If the ConfigMap is valid, update the pillars and restart the
    respective Pods so that they pick up new configs from the pillars
  - If the ConfigMap is invalid, log the error and perform no further
    action, because applying a bad ConfigMap could lead to cluster outages

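
For comparison with the Salt-based flow, the decision logic of such a
controller could look roughly like the sketch below. It is a simplified
illustration, not an Operator implementation: ``handle_event`` and
``watch_configmaps`` are hypothetical names, the ``validate`` and ``apply``
callables stand in for schema validation and for the pillar update plus Pod
restart, and the watch loop assumes the controller runs inside the cluster.

.. code-block:: python

   import logging

   log = logging.getLogger("userconfig-operator")


   def handle_event(event, validate, apply):
       """Process one watch event and return the action that was taken."""
       if event["type"] != "MODIFIED":
           return "ignored"
       configmap = event["object"]
       name = configmap.metadata.name
       try:
           spec = validate(configmap.data)
       except ValueError as exc:
           # A bad ConfigMap is logged and never applied.
           log.error("ConfigMap %s rejected: %s", name, exc)
           return "rejected"
       apply(name, spec)
       return "applied"


   def watch_configmaps(namespace, validate, apply):
       """Stream ConfigMap events from the API server (requires a cluster)."""
       from kubernetes import client, config, watch

       config.load_incluster_config()
       v1 = client.CoreV1Api()
       stream = watch.Watch().stream(
           v1.list_namespaced_config_map, namespace=namespace
       )
       for event in stream:
           handle_event(event, validate, apply)

Even in this reduced form the controller needs a long-running watch and its
own deployment, which illustrates why the Salt state approach is the lower
effort option for infrequent changes.
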

Iteration 1
~~~~~~~~~~~

- Define and deploy new ConfigMap stores that will hold user configurations
  as listed in the requirements
- Template and render Deployment and Pod manifests that will make use of
  these user-customizable configurations using pillar values
- Document how to change user configurations using kubectl
- Create and deploy persistent storage volumes for Grafana dashboards and
  datasources
- Document how to create these persistent volumes for Grafana dashboards and
  datasources

Iteration 2
~~~~~~~~~~~

- Provide a CLI tool for changing any of the user configurations:

  - Count of replicas for chosen Deployments (Prometheus)
  - Updating a Dex authentication connector (OpenLDAP, AD and
    staticUser store)
  - Updating the Alertmanager notification configuration

- Provide a UI that allows Update operations on all user-customizable
  settings based on the above requirements
- Provide a UI for adding, updating and deleting service-specific
  configurations, for example the Dex-LDAP connector integration
- Provide a UI for listing the Dex authentication connectors available and
  supported in MetalK8s
- Provide a UI for enabling or disabling Dex authentication
  connectors (LDAP, Active Directory, StaticUser store)
- Provide a UI for changing the number of replicas for a chosen set
  of MetalK8s deployments (Prometheus, ...)
- Add a UI for listing the Alertmanager notification systems MetalK8s
  will support (Slack, email, HipChat)
- Provide a UI for adding, modifying and deleting Alertmanager
  configurations from the listing above


Documentation
-------------

In the Operational Guide:

* Document how to customize or change any given service settings using the CLI
  tool
* Document how to customize or change any given service settings using the UI

Test Plan
---------

- Dex static user authentication is currently covered in our test suite. It
  makes sense to cover at least one other authentication connector, the
  easiest being LDAP, since we readily have access to OpenLDAP Docker images
  and automating this process is possible on Scality Cloud

- Add a test that ensures that update operations on user configurations are
  propagated down to the various services

- Other corner cases that require testing to reduce error-prone setups include:

  - Checking for invalid values in a user-defined configuration (e.g. setting
    the number of replicas to a string such as ``"two"``)
  - Checking for invalid formats in a user configuration
  - Checking that, when a user loses a configuration, we can actually revert
    to default values within the pillars