[Automated/Tracking] Manage cluster components using a GitOps approach with Flux #5

Closed
5 of 7 tasks
rossf7 opened this issue Oct 26, 2023 · 13 comments

Comments

@rossf7
Contributor

rossf7 commented Oct 26, 2023

Cluster Management

We want to use a GitOps approach for the components running in the cluster using Flux. This is for the minimal set of components that should always be running to support the pipeline.

This is so that it is:

  • Clear to all participants which components and versions are running in the cluster
  • Easier to contribute to technical tasks by submitting pull requests

The pipeline is responsible for installing the applications to be measured, e.g. Falco.

Requirements

The components to be installed are listed in the design doc

Phase 1: Base-level cluster components (MVP)

Phase 2: Gather idle metrics for Falco

Phase 3: Gather load-test metrics

More may be added as we continue to develop the pipeline.

Documentation

We should document this process as we go.

@nikimanoledaki
Contributor

nikimanoledaki commented Oct 27, 2023

This is great, thank you for opening the issue & listing the initial components!

As an example, I used Flux to deploy Kepler in this repo: https://github.com/nikimanoledaki/sustainability-journey-with-gitops

I ran the following bootstrap command to install and bootstrap Flux and specify that it should reconcile the repo's clusters/ dir:

curl -s https://fluxcd.io/install.sh | sudo bash
flux bootstrap github --owner=$GITHUB_USER --repository=green-reviews-tooling --path=clusters

Thankfully past me documented the steps in the README! 👍

At the time there wasn't a Helm Chart so I used Kustomize to deploy the k8s manifests but we should change that to use the Helm Chart, as you said :)
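
A minimal sketch of what the Helm-based setup might look like, assuming the Kepler chart is published at the URL below and that the Flux helm-controller is installed. The resource names, the kepler namespace, and the intervals are illustrative assumptions, not settings taken from this repo. The API versions shown are the GA ones; Flux releases before 2.3 serve these kinds as v1beta2 (HelmRepository) and v2beta1/v2beta2 (HelmRelease) instead.

```yaml
# Sketch only: chart URL, names, namespace, and intervals are assumptions.
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: kepler
  namespace: flux-system
spec:
  interval: 1h
  # Assumed location of the Kepler Helm chart repository.
  url: https://sustainable-computing-io.github.io/kepler-helm-chart
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: kepler
  namespace: flux-system
spec:
  interval: 30m
  # Install the chart into its own namespace and create it if missing.
  targetNamespace: kepler
  install:
    createNamespace: true
  chart:
    spec:
      chart: kepler
      sourceRef:
        kind: HelmRepository
        name: kepler
      # Pin a chart version here so the running version is visible in Git.
      # version: "<chart-version>"
```

Pinning the chart version in Git would support the goal of making it clear which component versions are running in the cluster.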

Docs for bootstrapping Flux with a GitHub repo: https://fluxcd.io/flux/installation/bootstrap/github/

Note: We'll also need to export a GitHub token before running the bootstrap command - I will create one and send it to you privately!

export GITHUB_TOKEN=<gh-token>

@nikimanoledaki
Contributor

nikimanoledaki commented Oct 31, 2023

We should also think about having multiple environments. I'm looking at the Flux docs on structuring repositories for guidance. Here are some ideas - they might not all be viable 🤔


Components/apps

Here is an initial idea that we can iterate on to deploy the individual components/apps:

├── apps
    ├── production
    └── development

apps/production would include:

  • Cilium
  • Kepler
  • Prometheus

apps/development could be for the manual pipeline that includes the above as well as Falco & the demo workload. In production, this would ideally be configured and maintained by the project maintainers.
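
As a rough sketch of how this layout could be wired up, one Flux Kustomization per environment can point at the matching apps/ directory. The file paths, the apps-production name, the interval, and the falco.yaml / demo-workload.yaml file names are hypothetical. The flux-system GitRepository is the source that flux bootstrap creates.

```yaml
# clusters/production/apps.yaml (hypothetical path)
# Tells Flux to reconcile everything under ./apps/production.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps-production
  namespace: flux-system
spec:
  interval: 10m0s
  path: ./apps/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system   # created by `flux bootstrap`
  wait: true
---
# apps/development/kustomization.yaml (hypothetical path)
# Reuses the production components and adds the extras for the manual pipeline.
# Assumes apps/production contains its own kustomization.yaml.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../production
  - falco.yaml           # hypothetical file
  - demo-workload.yaml   # hypothetical file
```

Building development on top of production keeps Cilium, Kepler, and Prometheus defined in one place, so the two environments cannot drift apart on the shared components.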


Infrastructure & cluster provisioning

We could potentially add the cluster and/or infrastructure provisioning as well:

├── infrastructure
│   └── equinix-metal
├── clusters
│   └── production

I'm not sure how well that would work with Ansible and/or OpenTofu. Terraform previously worked with the Flux TF Controller, but I don't know whether a similar integration exists for OpenTofu. I also don't know whether Flux would be necessary with Ansible, since Ansible is already an IaC tool (I have not worked with it before). Lots of questions here.


CNCF Projects

An idea for how we could deploy CNCF Projects:

├── cncf-projects
    ├── falco
    └── <next-project>

Each project could use Kustomize to point to the upstream configuration that is maintained by the CNCF project's maintainers. However, I'm not sure how/if that works with Ansible configuration. 🤔 The alternative would be to provide self-hosted GitHub Actions runners that project maintainers can use directly.
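
One way the Kustomize option could look is a kustomization.yaml that pulls in the upstream directory as a remote resource. This is a sketch under assumptions: the file path and the falco namespace are hypothetical, and the upstream URL and kustomize directory are borrowed from the Falco manifest posted later in this thread.

```yaml
# cncf-projects/falco/kustomization.yaml (hypothetical path)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
# Place all upstream resources in one namespace (assumed name).
namespace: falco
resources:
  # Remote resource format: <repo-url>//<directory>?ref=<git-ref>
  - https://github.com/falcosecurity/cncf-green-review-testing//kustomize?ref=main
```

With this shape the project maintainers keep ownership of the manifests upstream, and this repo only records which project and Git ref are deployed.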

@AntonioDiTuri
Contributor

I would be up for taking over this one. Can I get it assigned to me? @nikimanoledaki Should I ask you for the GitHub token?
I also wanted to ask what the final output should be: would a pull request with the needed folder structure and the steps followed to install Flux do?

@rossf7
Contributor Author

rossf7 commented Nov 2, 2023

@nikimanoledaki I like that directory structure with the environments and cncf-projects.

Also +1 for having the IaC code under infrastructure. I'll add a note to #1. The IaC code will need to bootstrap Flux, so we might run into a chicken-and-egg problem, but it would be nice to use Flux if we can.

@AntonioDiTuri Thanks, I think a pull request would be good. Then, depending on where we are with the IaC issue, we can see how to integrate both workstreams.

@nikimanoledaki
Contributor

nikimanoledaki commented Nov 3, 2023

Should I ask you for the GitHub token?

This is a good question. I'm not sure how we should manage this! There are risks if we use our own personal access tokens since the token needs repo-wide access. Any leak or sharing with other folks could give access to private repos that the user has access to.

A bot account could be an option. We would need to request this from the CNCF. Maybe there is one already.

Do you have any other ideas? 🤔

@leonardpahlke
Member

We should also think about having multiple environments

I would advocate for keeping everything as dev for now (there is no production yet).

@leonardpahlke
Member

We don't use personal access tokens in this project. We will go through the org instead. I will take a look at this after KubeCon.

@nikimanoledaki
Contributor

We should also think about having multiple environments

I would advocate for keeping everything as dev for now (there is no production yet).

Regarding this - we currently have the manual testing workflow (dev), and we will have the automated process later (prod). If dev/prod is misleading, we could rename these environments to something like manual/automated. I think it's worth planning for both in our repository structure. What do you all think? :) Let me know if I'm missing or misunderstanding something.

@nikimanoledaki
Contributor

Created this issue to request a PAT and unblock this: #7

@nikimanoledaki
Contributor

We have a fine-grained PAT - anyone who needs this can message @leonardpahlke or me (and the new leads soon!) 👍

@nikimanoledaki nikimanoledaki changed the title from "Manage cluster components using a GitOps approach with Flux" to "[Automation] Manage cluster components using a GitOps approach with Flux" Dec 11, 2023
@AntonioDiTuri AntonioDiTuri removed their assignment Dec 11, 2023
@nikimanoledaki nikimanoledaki changed the title from "[Automation] Manage cluster components using a GitOps approach with Flux" to "[Automated] Manage cluster components using a GitOps approach with Flux" Dec 11, 2023
@nikimanoledaki
Contributor

nikimanoledaki commented Dec 12, 2023

Heads-up that there is some progress on the Falco side thanks to @incertum's work to create the repo that will contain the DaemonSet/ConfigMaps needed to deploy Falco: falcosecurity/evolution#345

After that, we can add ./clusters/falco.yaml with the following:

---
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: falco-cncf-green-reviews-testing
  namespace: flux-system
spec:
  interval: 1m0s
  ref:
    branch: main
  url: https://github.com/falcosecurity/cncf-green-review-testing
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: falco-cncf-green-reviews-testing
  namespace: flux-system
spec:
  interval: 30m0s
  path: ./kustomize
  prune: true
  retryInterval: 2m0s
  sourceRef:
    kind: GitRepository
    name: falco-cncf-green-reviews-testing
  targetNamespace: falco
  timeout: 3m0s
  wait: true
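
One detail that may be worth handling alongside the manifest above: as far as I understand the Flux Kustomization API, kustomize-controller does not create the targetNamespace automatically, so the falco namespace needs to exist already or be defined in a manifest. A minimal sketch, assuming it is added to the same hypothetical ./clusters/falco.yaml file:

```yaml
# Hypothetical addition to ./clusters/falco.yaml.
# Defines the namespace that the Kustomization above targets.
---
apiVersion: v1
kind: Namespace
metadata:
  name: falco
```

If the upstream kustomize directory already defines this namespace, this extra manifest would not be needed.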

@nikimanoledaki nikimanoledaki moved this from Backlog to In Progress in TAG-Environmental-Sustainability Jan 9, 2024
@nikimanoledaki nikimanoledaki changed the title from "[Automated] Manage cluster components using a GitOps approach with Flux" to "[Tracking/Automated] Manage cluster components using a GitOps approach with Flux" Jan 10, 2024
@nikimanoledaki nikimanoledaki added the priority/critical Top priority label Jan 10, 2024
@nikimanoledaki nikimanoledaki changed the title from "[Tracking/Automated] Manage cluster components using a GitOps approach with Flux" to "[Automated/Tracking] Manage cluster components using a GitOps approach with Flux" Jan 10, 2024
@dipankardas011
Contributor

dipankardas011 commented Jan 20, 2024

Cluster Management

We want to use a GitOps approach for the components running in the cluster using Flux. This is for the minimal set of components that should always be running to support the pipeline.

This is so it is

  • Clear to all participants which components and versions are running in the cluster
  • Easier to contribute to technical tasks by submitting pull requests

The pipeline is responsible for installing the applications to be measured, e.g. Falco.

Requirements

The components to be installed are listed in the design doc

Phase 1: Base-level cluster components (MVP)

Phase 2: Gather idle metrics for Falco

Phase 3: Gather load-test metrics

More may be added as we continue to develop the pipeline.

Documentation

We should document this process as we go.

@rossf7 you might want to update the issue description for Cilium

@nikimanoledaki
Contributor

We can close this since it is mostly completed. We have the base cluster environment, which is our MVP.

There is an open PR for the microservice demo workload, but we're holding off on it since we're going to do idle measurements first.
Lastly, we can revisit the need for a load-testing tool later on.
