Helm chart config for individual hubs as separate files rather than one cluster-level config #857

Closed
1 of 4 tasks
Tracked by #879
consideRatio opened this issue Nov 26, 2021 · 23 comments

@consideRatio
Contributor

consideRatio commented Nov 26, 2021

I think it is a recurring pain point that the helm chart config is embedded in config/hubs/<k8s-cluster-name>.cluster.yaml files, which don't follow the structure of a typical helm chart values file. I have found this troublesome during local development and debugging, and I'm now running into a situation where additional complexity would be required to add helm template validation logic to our CI system to resolve #279.

I'd like to propose that we extract the helm chart specific config from <k8s-cluster-name>.cluster.yaml into, for example, <hub-name>.hub.helm-chart-config.yaml.

I would argue that doing this will reduce the overall complexity of things as well, and that it would move us in the right direction with regards to ease of maintenance and supporting a practical right to replicate.

Tasks to close issue

  • Reach agreement that extracting the helm chart config into a standalone file is a good idea
  • Update the deployer script to work with standalone helm chart config files in one way or another
  • Update config/hubs/schema.yaml to reflect the new structure
  • Extract the helm chart configs into standalone files

Work blocked by this

I'll hold off on investing more time in trying to resolve #279 until we have agreed on what to do about having the helm chart config embedded in a non-helm-chart config yaml file.

If we agree to extract the helm chart specific config, I can keep working on a PR to close #279 that can be merged as soon as we have resolved this.

@sgibson91
Member

sgibson91 commented Nov 26, 2021

I think this issue will also need to be thought about in terms of #818.

The current proposal there is:

instead of having a big config file for the cluster, let's have a directory to represent the cluster plus 1) a cluster-specific yaml file and 2) several hub-specific yaml files for each staging/prod pair.

So a directory tree might look something like:

|-config/
  |-{{ cluster_name }}/
    |-prod.yaml
    |-staging.yaml
    |-cluster.yaml

@consideRatio consideRatio changed the title Separate helm chart config from <name>.cluster.yaml files into <name>.chart-config.yaml files? Separate helm chart config from <k8s-cluster-name>.cluster.yaml files into <hub-name>.hub.chart-config.yaml files? Nov 26, 2021
@consideRatio consideRatio changed the title Separate helm chart config from <k8s-cluster-name>.cluster.yaml files into <hub-name>.hub.chart-config.yaml files? Separate helm chart config from <k8s-cluster-name>.cluster.yaml files into <hub-name>.hub.helm-chart-config.yaml files? Nov 26, 2021
@consideRatio
Contributor Author

Thanks @sgibson91 for chiming in! I'll try to chime in at #818 regarding that as well, but will try to avoid mixing the scopes of these issues, which still represent somewhat different needs.

If we can agree that extracting the helm chart config into a standalone file is a good idea, it's a big step forward for this issue no matter what is done regarding the bigger structure refactoring in #818.

@choldgraf
Member

Just jumping in quickly from vacation to say that I think the proposal in #818 (comment) sounds reasonable to me.

It feels to me that resolving this issue / that comment could be considered a precursor to resolving #818 (or sub-task of resolving it). If we think we are heading in the direction of factoring out hub- and cluster-specific configuration into separate config, I'd be +1 on going for it now.

Anything that can lay a long-term groundwork that we are happy with is a good idea to do now, while there are relatively fewer unique clusters/hubs to maintain.

Can anybody give a strong reason why we want to keep the configuration combined? If not, I think we should just put this on our backlog and do it.

@yuvipanda
Member

Right now, we can use YAML anchors to refer to a prod hub that has the same config as a staging hub, like in the existing cluster config files. This is a common pattern, as we want to have a staging hub for each cluster we have. If we have separated files, how would we achieve this?
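
For reference, a minimal sketch of that anchor pattern inside a single cluster config file (the hub names and keys here are illustrative, not the actual config):

hubs:
  - name: staging
    config: &staging_config
      jupyterhub:
        singleuser:
          memory:
            limit: 1G
  - name: prod
    config: *staging_config  # prod reuses the staging hub's config via a YAML anchor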

@consideRatio
Contributor Author

This is a common pattern, as we want to have a staging hub for each cluster we have. If we have separated files, how would we achieve this?

I suggest that the deployer script is then made able to recognize multiple separate values files: just like helm upgrade --values file1.yaml --values file2.yaml can be used, the deployer script would be able to install a chart with an ordered list of values files.
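
As a rough illustration (the file names and chart path here are hypothetical), helm already accepts an ordered list of values files where later files take precedence on conflicting keys:

helm upgrade --install staging ./helm-charts/basehub \
  --values common.values.yaml \
  --values staging.values.yaml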

@damianavila
Contributor

This is interesting: https://stackoverflow.com/questions/44910886/pyyaml-include-file-and-yaml-aliases-anchors-references, although it feels unstable, not reliable enough...
Going forward with the R2R principle... maybe we should avoid running with this pattern? I know then you would have a new problem because you may want to keep things in sync... 🤔
BTW, @consideRatio, that seems an interesting alternative if we want to keep the pattern...

@consideRatio
Contributor Author

consideRatio commented Nov 29, 2021

@damianavila, what did "this pattern" refer to in "maybe we should avoid running with this pattern?"

If we want to have shared configuration between hubs etc, I'm thinking we can handle it by having shared values files rather than anchors in a single massive file, and to have the shared values files be used like helm upgrade ... --values file1.yaml --values file2.yaml etc.

For me, R2R is improved to a large extent by reduced complexity and ease of comprehension. I'm thinking that splitting apart the k8s cluster config files we currently have into helm chart native values files is an improvement regarding complexity and ease of comprehension.

@damianavila
Contributor

@damianavila, what did "this pattern" refer to in "maybe we should avoid running with this pattern?"

The pattern of relying on anchors to avoid duplication... I am wondering whether it makes sense to be repetitive for the sake of explicitness...

For me, R2R is improved to a large extent by reduced complexity and ease of comprehension. I'm thinking that splitting apart the k8s cluster config files we currently have into helm chart native values files is an improvement regarding complexity and ease of comprehension.

I totally agree with that! And I think your proposal with several "values" files is probably a good compromise between ease of comprehension and not being so repetitive...

@consideRatio
Contributor Author

Ah thanks for clarifying @damianavila!

@choldgraf
Member

Update: let's make a proposal

In our team meeting today, we agreed that we have had enough discussion around this, and that it's time to make a proposal for a configuration structure. @consideRatio and @sgibson91 were interested in working together to come up with something that we can then iterate on.

@choldgraf choldgraf changed the title Separate helm chart config from <k8s-cluster-name>.cluster.yaml files into <hub-name>.hub.helm-chart-config.yaml files? Helm chart config for individual hubs as separate files rather than one cluster-level config Jan 4, 2022
@consideRatio
Contributor Author

A summary of the work I see needed to resolve this and other related issues

This issue was created with one goal in mind, but I hope to see multiple goals addressed by the proposed changes to the configuration that the deployer script reads.

  1. To have pure helm chart values files, allowing us to run helm template --validate etc without trouble (see the example below the list).
  2. To enable GitHub workflows to define jobs that trigger effectively, so we don't redeploy too much or too little.
  3. To enable a dependency on miscellaneous helm charts, for example jupyterhub-ssh.
  4. To enable a dependency on a cluster support chart or similar.
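
For goal 1, for example, a standalone values file would let us or CI run something like the following (the chart path and file name are illustrative):

helm template staging ./helm-charts/basehub \
  --values staging.hub.values.yaml \
  --validate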

I'm not sure what changes to our custom configuration structure are needed, but I'd like us to consider these kinds of challenges when coming up with a config structure proposal.

@sgibson91
Member

RE: swapping YAML anchors for separate values files, I wonder if it's a good idea to replicate the common.yaml structure I've seen a lot where multiple hub configs are used: common.yaml represents config common to a group of hubs and is always loaded, then staging.yaml, prod.yaml, my-other-cool-hub.yaml each represent config specific to a single hub.

@consideRatio
Contributor Author

consideRatio commented Jan 4, 2022

About common.yaml, I'm very open to the idea of having such a file, but I don't want it to be used because of its name - rather, because it is explicitly listed as a values file for the hub.

So, concretely - the deployer script will deploy a helm chart based on explicitly specified values files, not a naming convention. I think that will handle situations where you, for example, want to have multiple common.yaml files that you opt into.

There is a restriction though: multiple values files that are coalesced will merge in the helm way, while an anchor can be used with a finer level of detail, for example to inject elements into a list - but that is a compromise I think we just have to accept no matter what.
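
A minimal sketch of that restriction, assuming two hypothetical values files that both set a list under the same key - when helm coalesces them, the later list replaces the earlier one wholesale, whereas within a single file an anchor could have been combined with additional list items:

# file1.yaml
singleuser:
  profileList:
    - display_name: "Small"

# file2.yaml
singleuser:
  profileList:
    - display_name: "Large"

# helm upgrade ... --values file1.yaml --values file2.yaml
# leaves profileList with only "Large": lists are replaced, not merged.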

@consideRatio
Contributor Author

Some concrete ideas

  1. What we have referred to as a "hub" is practically a helm chart release (an installation of a helm chart).
  2. Each hub / helm chart release explicitly lists its values files
  3. Each hub / helm chart release explicitly lists dependencies on other hub / helm chart releases
  4. We attempt to make the support deployment another hub / helm chart release that the hubs in the cluster explicitly depend on

(I'm taking a computer break for today now!)

@sgibson91
Member

So long as I can still write python deployer deploy ... HUB_NAME on the command line, and the deployer knows where to go for these explicit mappings, I am open to this suggestion. I foresee human error if I have to explicitly remember each values file I need to provide on the command line. We can then also check the mappings into version control, and hence roll back to a given config mapping via git, and CI can read them too.

I also think I need some help understanding/visualising your other suggestions and how they relate to just isolating config for validation or deployment. (But we can iterate on that!)

@consideRatio
Contributor Author

So long as I can still write python deployer deploy ... HUB_NAME on the command line, and the deployer knows where to go for these explicit mappings, I am open to this suggestion. I foresee human error if I have to explicitly remember each values file I need to provide on the command line

Absolutely, I'm thinking the values files are explicit because they are part of the deployer script's configuration.
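
A hypothetical sketch of what that could look like in a cluster.yaml read by the deployer (the key names are made up for illustration, not a final schema):

hubs:
  - name: staging
    helm_chart: basehub
    helm_chart_values_files:  # ordered; later files override earlier ones
      - common.values.yaml
      - staging.hub.values.yaml

A command like python deployer deploy ... staging could then look up this list and pass each file to helm as a --values flag, in order.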

@sgibson91
Member

The current proposal is outlined in https://hackmd.io/@29EjZnuAQx-ME7Olgi2JXw/Sk5S7xP6F and looks something like the following (exact details TBD):

config
└── clusters
    ├── cluster_1
    │   ├── cluster.yaml
    │   ├── prod.hub.values.yaml
    │   ├── staging.hub.values.yaml
    │   └── support.values.yaml
    └── cluster_2
        ├── cluster.yaml
        ├── prod.hub.values.yaml
        ├── staging.hub.values.yaml
        └── support.values.yaml

@choldgraf asked what a version for communities with more than one hub on a cluster might look like; possibly something like this:

config
└── clusters
    ├── cluster_1
    │   ├── community_A
    │   │   ├── staging.hub.values.yaml
    │   │   └── prod.hub.values.yaml
    │   ├── cluster.yaml
    │   ├── prod.hub.values.yaml
    │   ├── staging.hub.values.yaml
    │   └── support.values.yaml
    └── cluster_2
        ├── cluster.yaml
        ├── prod.hub.values.yaml
        ├── staging.hub.values.yaml
        └── support.values.yaml

I think we would have to be careful in terms of name clashing here though...

... Actually maybe a better way to address this is to have the filename convention {{ COMMUNITY_NAME }}.{{ HUB_NAME }}.values.yaml. This way we can:

  • Group hubs together by community based on a simple sort
  • Extract {{ COMMUNITY_NAME }}.{{ HUB_NAME }} into the name of the helm release (replacing the . with a -), as sketched below
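
As a rough sketch of that mapping (the file and release names are hypothetical):

community_A.staging.hub.values.yaml  ->  helm release community_A-staging
community_A.prod.hub.values.yaml     ->  helm release community_A-prod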

@damianavila
Contributor

... Actually maybe a better way to address this is to have the filename convention {{ COMMUNITY_NAME }}.{{ HUB_NAME }}.values.yaml.

In that case, I guess we would not have staging.hub.values.yaml or prod.hub.values.yaml files anymore, because all would be prefixed with a community name, right? Our hubs would be 2i2c ones, I suppose?

@sgibson91
Member

@damianavila The thought still needs fleshing out, but I don't think it's critical right now as we don't currently host any hubs following that pattern.

@choldgraf
Member

choldgraf commented Mar 1, 2022

I think that with the recent refactors, we could now close this one

some relevant PRs:

is that right?

@sgibson91
Member

I believe so, unless it is strongly felt that the support config should also be included under this umbrella?

@choldgraf
Member

Hmm - I'll leave the answer to that question up to the @2i2c-org/tech-team because I don't understand the implications. Is there a strong reason to block closing this issue on grouping the support config as well?

@consideRatio
Contributor Author

Let's close and start fresh for anything remaining!
