Utility Cluster #186
Comments
@runyontr what's the outcome you are looking for here? A new example that implements this architecture? Or just validation that this can all be done?
The outcome is threefold:
Edit: Okay, I think I'm done for now.

There's a lot here, such that I think we should break this down into multiple issues (I think I'm seeing somewhere around 6 issues).

Note: A lot of this will be assuming we have the native apply stuff done, and some of it might change as we finish out that work.

In no particular order:

Example on how to do HA bootstrapping
Once the native apply stuff is done, the initial images for the Hub's docker registry have to come from somewhere. Right now the plan is to use a bastion host where the Zarf binary is installed and when you Doing so would not be HA, so if that's desired we'd need to run multiple bastions, all with Zarf "embedded registries", behind a load balancer. We see this as an extreme use case, where your SLA has lots of 9's in it. My initial SWAG on the work here is a Terraform project with documentation that deploys the bastions and the load balancer to AWS. It would also have to include how to keep each bastion in sync when updates are needed.

Multi-Tenant "Utility Cluster" / "GitOps Service" (both registry & git server)
Need to talk through this more. The way I understand this is that User A has access to these X number of images and repos, and User B does not have access to User A's images and repos. They have access to their own images and repos that User A doesn't have access to. If that's the way you're thinking, we'll need to come together and decide what's in scope and get that feature request into the roadmap.

Example of an HA "Utility Cluster" / "GitOps Service" / "Hub Cluster"
This one looks pretty straightforward: we need an example people can look at that does a prod-like hub cluster with multiple k8s nodes and HA services for docker registry and Gitea (can Gitea do HA?).
It would likely be worth doing simply first without Big Bang, then followed up with an enhancement to run it in Big Bang (would require work to get the docker registry and gitea running in Big Bang, as right now they do not).
Maybe we don't use Dockerv2 and Gitea for a prod-like HA hub cluster? We could do a gitops-managed, HA deployment of other more prod-ready services. This would require that Zarf be modified to be able to push images and repos to any git service, not just Dockerv2 and Gitea.

Example of an HA "Spoke" / "Workload" cluster
Also pretty straightforward. We need an example of doing a multi-node workload cluster that talks to a multi-node "hub" cluster. It would likely be worth doing simply first without Big Bang, then followed up with an enhancement to run the workload on top of Big Bang (potential issues with Traefik, cluster policies, etc).

No non-Iron Bank images
See #214 and #215 (maybe #213 also, depending on how this all shakes out).

Example that "puts it all together"
Also straightforward. We need an example that composes all of this into a prod-like holistic system architecture.
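To make the multi-tenant question above concrete, here is a minimal sketch of the access rule as described: each user sees only their own images and repos. The `Tenant` class, the `can_access` helper, and every image and repo name are hypothetical and exist only for illustration; none of this is an existing Zarf feature or API.

```python
from dataclasses import dataclass, field


@dataclass
class Tenant:
    """Hypothetical tenant record: the images and repos one user may access."""
    name: str
    images: set[str] = field(default_factory=set)
    repos: set[str] = field(default_factory=set)


def can_access(tenant: Tenant, kind: str, resource: str) -> bool:
    """Return True only if the resource is in this tenant's own set."""
    if kind == "image":
        return resource in tenant.images
    if kind == "repo":
        return resource in tenant.repos
    raise ValueError(f"unknown resource kind: {kind}")


# Illustrative tenants; the names are made up.
user_a = Tenant("user-a", images={"team-a/app:1.0"}, repos={"team-a/config"})
user_b = Tenant("user-b", images={"team-b/api:2.3"}, repos={"team-b/charts"})

# User A can reach their own image, User B cannot reach User A's.
print(can_access(user_a, "image", "team-a/app:1.0"))
print(can_access(user_b, "image", "team-a/app:1.0"))
```

Whether the rule should be enforced in the registry and git server themselves or in front of them is part of the scoping discussion this item asks for.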
Is there a "utility cluster" example coming down the pipe? I think at this point it would basically be airgapping all git repos/images + throwing an ingress on the zarf deployment so the gitea and docker registry can be reached.
This is how Zarf worked < v0.15.0 out of the box, but we found it much more complicated for most of our use-cases. I think we could do a utility cluster example, which would essentially just add an ingress to expose the registry and git server. The utility cluster concept was really changed with the new design in v0.15.0, namely the "hub/spoke" model went away in favor of in-cluster services for the default configuration. Happy to discuss further or work together on an example, but our primary design now works off of this in-cluster architecture.
Yeah, this is why we are still using v0.14.0 for our use case - just works out of the box. Haven't had a ton of time to look at all the differences with the zarf init package and deploy traefik/some other ingress.
Yeah, one of the issues we ran into when Zarf went multi-distro is the ingress: it gets pretty complicated pretty fast and some distros don't even have one available out of the box. Prior to v0.15.0 Zarf was effectively a single-node K3s wrapper for air gap. It's certainly a lot more now, but if that's your exclusive use-case it might be possible to stay there for now. We haven't had the bandwidth to discuss backporting anything for that version as we're still pre-release, but I'm sure if someone in the community wanted to, we could support PRs to do that.
@jeff-mccoy @runyontr @RothAndrew Is this still a need? Given #186 (comment), this is either relevant to @ActionN and related work and should be an epic for a supportable use case, or is no longer relevant and should be closed. Labeling as an epic until I hear back from y'all.
As a system admin and application admin, I would like to not be dependent on connectivity to upstream repo1/registry1 or the DevSecOps platform my application is being developed on. This use case should also cover situations where my production environment is airgapped from my development environment.
Until #174 is closed, I'll be using the terms "Hub/Spoke/appliance" clusters in line with this comment: #174 (comment)
Architecture:
Proposed architecture, requirements and workflows:
Appliance
The appliance cluster(s) are responsible for providing the images and git repositories to support running the Hub cluster. Each node of this cluster is its own k3s appliance cluster that is stood up with the same zarf init.
Requirements:
To make these images/repos highly available, multiple appliance nodes will be stood up on individual VMs. They could be fronted by an explicit load balancer or exposed as an External IP Service.
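As a rough illustration of what a load balancer in front of the appliance nodes would be checking, the sketch below probes each node's registry and returns the first one that answers. It assumes each appliance exposes a registry speaking the Docker Registry HTTP API V2, where `GET /v2/` answers 200 (or 401 when auth is required). The helper names and the appliance URLs are hypothetical.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional


def registry_is_up(base_url: str, timeout: float = 2.0) -> bool:
    """Probe the registry's /v2/ endpoint; 200 or 401 both mean it is serving."""
    try:
        with urllib.request.urlopen(f"{base_url}/v2/", timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        return err.code == 401  # auth required, but the registry is answering
    except (urllib.error.URLError, OSError):
        return False  # unreachable, refused, or timed out


def first_healthy(
    urls: Iterable[str],
    probe: Callable[[str], bool] = registry_is_up,
) -> Optional[str]:
    """Return the first appliance URL whose probe succeeds, else None."""
    for url in urls:
        if probe(url):
            return url
    return None


# Hypothetical appliance node addresses, for illustration only.
APPLIANCES = ["http://appliance-1:5000", "http://appliance-2:5000"]
```

Calling `first_healthy(APPLIANCES)` performs real network probes; passing a custom `probe` makes the selection logic testable without any appliance running.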
Hub Cluster
The Hub cluster is responsible for providing the images and git repositories to support running the spoke clusters.
Requirements:
Spoke Cluster
Requirements:
Relates to #134
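The dependency chain in the architecture above (appliance nodes feed the Hub cluster, which feeds the spoke clusters) implies a bring-up order. A minimal sketch of that ordering, using Python's standard `graphlib` (Python 3.9+); the `DEPENDS_ON` mapping and `bring_up_order` helper are illustrative names, not part of Zarf:

```python
from graphlib import TopologicalSorter

# Each key lists the cluster tiers it pulls images and git repos from.
DEPENDS_ON = {
    "appliance": set(),      # seeded directly by zarf init
    "hub": {"appliance"},    # hub is supported by the appliance cluster(s)
    "spoke": {"hub"},        # spokes are supported by the hub
}


def bring_up_order(deps: dict[str, set[str]]) -> list[str]:
    """Return the tiers ordered so every tier comes after what it depends on."""
    return list(TopologicalSorter(deps).static_order())


print(bring_up_order(DEPENDS_ON))  # ['appliance', 'hub', 'spoke']
```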