# Direction for v5 of EKS Blueprints

## What Has Worked

- EKS Blueprints was started to [make it easier for customers to adopt Amazon Elastic Kubernetes Service (EKS)](https://aws.amazon.com/blogs/containers/bootstrapping-clusters-with-eks-blueprints/) in a shorter period of time. The project has been quite successful in this regard - customers report that EKS Blueprints has helped them go from zero to one or more clusters running applications in as little as one to two weeks.

- EKS Blueprints has also been successful in providing working examples that demonstrate architectural patterns and workload solutions. Some popular examples include:
  - Spark on EKS
  - Karpenter on EKS Fargate
  - Transparent encryption with WireGuard and Cilium
  - Fully serverless clusters with EKS Fargate

## What Has Not
- Scaling and managing the addons that are created by EKS Blueprints. With almost [1,200 projects on the CNCF landscape](https://landscape.cncf.io/), the number of different ways each project can be deployed on a cluster (e.g. - Datadog offers 5 different Helm charts for its service, Prometheus hosts over 30 Helm charts for its services), as well as the different tools used to provision addons (e.g. - Terraform, ArgoCD, FluxCD, etc.), supporting both the number of addons and their different forms has been extremely challenging for the team. Beyond the sheer number of addons, supporting the various configurations users wish to have exposed, and testing those configurations, is only compounded by the number of addons and their methods of creation.

- Managing resources provisioned on the cluster using Terraform. Terraform is a fantastic tool for provisioning infrastructure, and it is the tool of choice for many customers when it comes to creating resources in AWS. However, there are a number of downsides to Terraform when it comes to provisioning resources on a Kubernetes cluster. These include:

  - Ordering of dependencies. Terraform wants to evaluate the current state of what it controls and plan a series of actions to align the current state with the desired state *in one action*. It does this once for each `terraform plan` or `terraform apply`, and if any issues are encountered, it simply fails and halts execution. When Terraform cannot infer the ordering of dependencies across resources (i.e. - through passing outputs of parent resources to arguments of child resources using the Terraform `<resource>.<name>.<attribute>` syntax), it treats the resources as unrelated and attempts to provision them in parallel and asynchronously. Any resource left waiting on a dependency will eventually time out and fail, causing Terraform itself to time out and fail the apply. This is where the reconciliation loop of a Kubernetes controller or operator on the cluster is better suited - it continuously retries to reconcile state as dependencies are eventually resolved. (To be clear - the issue of dependency ordering still exists, but the controller/operator keeps retrying; on each retry some resources succeed, moving execution along with each cycle until everything is fully deployed. Terraform could do this if it kept retrying, but it does not do so today.)
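
    To illustrate, a minimal hedged sketch (resource and chart names hypothetical): Terraform only knows to sequence these two resources because the release references an attribute of the namespace; without such a reference (or an explicit `depends_on`), both would be provisioned in parallel.

    ```hcl
    resource "kubernetes_namespace" "addon" {
      metadata {
        name = "addon" # hypothetical namespace
      }
    }

    resource "helm_release" "addon" {
      name       = "addon"
      repository = "https://example.com/charts" # hypothetical chart repository
      chart      = "addon"

      # Implicit dependency: referencing the namespace attribute tells
      # Terraform to create the namespace before the release.
      namespace = kubernetes_namespace.addon.metadata[0].name
    }
    ```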
  - Publicly exposing access to the EKS endpoints in order to provision resources defined outside of the VPC onto the cluster. With Terraform, resource provisioning is a "push" model - Terraform "pushes" requests to the EKS API server to create resources. Coupled with the fact that the Terraform operation typically resides outside of the VPC where the cluster is running, this leads users to enable public access to the EKS endpoints in order to provision resources. The more widely accepted approach in the Kubernetes community, however, is GitOps, which uses a "pull" based model: an operator or controller running on the cluster pulls the resource definitions from a Git repository and reconciles state from within the cluster itself. This approach is more secure because it does not require public access to the EKS endpoints and instead relies on the cluster's internal network to communicate with the EKS API server.
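
    For context, a hedged sketch of the "push" model (cluster name hypothetical): the Helm provider is pointed at the cluster endpoint, so the machine running Terraform must be able to reach the EKS API server, which is what drives users toward enabling public endpoint access.

    ```hcl
    data "aws_eks_cluster" "this" {
      name = "my-cluster" # hypothetical cluster name
    }

    data "aws_eks_cluster_auth" "this" {
      name = "my-cluster"
    }

    provider "helm" {
      kubernetes {
        # Terraform "pushes" API requests to this endpoint; if Terraform
        # runs outside the VPC, the endpoint must be reachable publicly.
        host                   = data.aws_eks_cluster.this.endpoint
        cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
        token                  = data.aws_eks_cluster_auth.this.token
      }
    }
    ```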
  - The nesting of multiple sub-modules, as well as the need for a module at all in order to support an addon. Comparing the Terraform approach to addons with the GitOps approach, the Terraform approach has a glaring disadvantage: a module must be created that wraps the addon's Helm chart in order to provision the addon via Terraform, whereas with GitOps, users simply consume the charts from where they are stored. This creates a bottleneck on the team to review, test, and validate each new addon, plus the overhead of maintaining and updating those addons going forward. It also opens up more areas where breaking changes are encountered, which is further compounded by the fact that the Terraform addons are grouped under an "umbrella" module that obfuscates versioning.
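
    As a hedged illustration of the wrapper pattern (the pinned version is illustrative): under the Terraform approach, every addon needs a resource like this inside a reviewed and maintained module, whereas a GitOps controller consumes the same chart directly from its source.

    ```hcl
    resource "helm_release" "metrics_server" {
      name       = "metrics-server"
      repository = "https://kubernetes-sigs.github.io/metrics-server/"
      chart      = "metrics-server"
      namespace  = "kube-system"
      version    = "3.8.2" # the wrapper module pins and must maintain this
    }
    ```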
- Being able to support a combination of various tools, modules, frameworks, etc., to meet the needs of customers. The [`terraform-aws-eks`](https://github.com/terraform-aws-modules/terraform-aws-eks) module was created long before EKS Blueprints, and many customers had already adopted it for creating their clusters. In addition, Amazon has since adopted [`eksctl`](https://github.com/weaveworks/eksctl) as the official CLI for Amazon EKS. When EKS Blueprints was first announced, many customers asked whether they needed to abandon clusters created through those other tools in order to adopt EKS Blueprints. The answer is no - users can and should be able to use their existing clusters as well as the tools they used to create them; EKS Blueprints can augment that process through its supporting modules (addons, teams, etc.). This left the team with the question: why create a Terraform module for creating an EKS cluster when [`terraform-aws-eks`](https://github.com/terraform-aws-modules/terraform-aws-eks) already exists - especially since EKS Blueprints already uses that module for creating the control plane and security groups?
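
To make that concrete, a hedged sketch (names, versions, and variables hypothetical) of creating a cluster with the upstream module directly - the pattern v5 leans into:

```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 19.0" # illustrative version pin

  cluster_name    = "my-cluster"
  cluster_version = "1.27"

  vpc_id     = var.vpc_id
  subnet_ids = var.private_subnet_ids

  eks_managed_node_groups = {
    default = {
      instance_types = ["m5.large"]
      min_size       = 1
      max_size       = 3
      desired_size   = 2
    }
  }
}
```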
## What Is Changing

The direction for EKS Blueprints in v5 shifts away from providing an all-encompassing, monolithic "framework" and instead focuses on how users can organize a set of modular components to create the desired solution on Amazon EKS. This allows customers to use the components of their choosing in a way that is familiar to them and their organization, instead of having to adopt and conform to a framework.

With this shift, cluster creation will be removed from the project in favor of the [`terraform-aws-eks`](https://github.com/terraform-aws-modules/terraform-aws-eks) module, and the remaining modules will be moved out to their own repositories as standalone projects. This leaves the EKS Blueprints project as the canonical place where users can receive guidance on how to configure their clusters to meet a desired architecture, how best to set up their clusters following well-architected practices, and references on the various ways that different workloads can be deployed on Amazon EKS.
### Notable Changes

1. EKS Blueprints will remove its Amazon EKS cluster Terraform module components (control plane, EKS managed node group, self-managed node group, and Fargate profile modules) from the project. In their place, users are encouraged to utilize the [`terraform-aws-eks`](https://github.com/terraform-aws-modules/terraform-aws-eks) module, which meets or exceeds nearly all of the functionality of the EKS Blueprints v4.x cluster module. This includes the Terraform code contained at the root of the project as well as the `aws-eks-fargate-profiles`, `aws-eks-managed-node-groups`, `aws-eks-self-managed-node-groups`, and `launch-templates` modules, which will all be removed from the project.
2. The `aws-kms` module will be removed entirely. This was consumed in the root project module for cluster encryption. In its place, users can utilize the KMS key creation functionality of the [`terraform-aws-eks`](https://github.com/terraform-aws-modules/terraform-aws-eks) module, or the [`terraform-aws-kms`](https://github.com/terraform-aws-modules/terraform-aws-kms) module if they wish to manage the key outside of the creation of the cluster itself (see the sketch following this list).
3. The `emr-on-eks` module will be removed entirely; its replacement can be found in the addons under [`emr-on-eks`](https://github.com/aws-ia/terraform-aws-eks-blueprints/tree/main/modules/kubernetes-addons/emr-on-eks), and an example of its usage is available in the [`emr-on-eks-fargate`](https://github.com/aws-ia/terraform-aws-eks-blueprints/tree/main/examples/analytics/emr-on-eks-fargate) example.
4. The `irsa` and `helm-addon` modules will be removed entirely; a new external module, [`terraform-aws-eks-addon`](https://github.com/aws-ia/terraform-aws-eks-addon), has been released on the Terraform registry that replicates and replaces the functionality of these two modules. This allows users, as well as partners, to more easily create their own addons that are not natively supported by EKS Blueprints, following the same process that EKS Blueprints uses.
5. The `aws-eks-teams` module will be removed entirely; its replacement will be the new external module [`terraform-aws-eks-teams`](#TODO), which incorporates the changes customers have been asking for in https://github.com/aws-ia/terraform-aws-eks-blueprints/issues/842.
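
As a hedged sketch of the replacement path described in item 2 (argument names follow the upstream modules' documented inputs and may vary by version):

```hcl
# Option A: let the cluster module create and attach the encryption key.
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 19.0"

  cluster_name   = "my-cluster" # hypothetical
  create_kms_key = true
  # ...
}

# Option B: manage the key independently of the cluster lifecycle.
module "kms" {
  source  = "terraform-aws-modules/kms/aws"
  version = "~> 1.0"

  description = "EKS cluster encryption key"
  aliases     = ["eks/my-cluster"]
}
```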
### Resulting Project Structure

Previously, under the v4.x structure, the EKS Blueprints project was composed of various repositories across multiple AWS organizations, which looked roughly like the following:

#### v4.x Structure
```
├── aws-ia/
|   ├── terraform-aws-eks-blueprints/
|   |   ├── aws-auth-configmap.tf
|   |   ├── data.tf
|   |   ├── eks-worker.tf
|   |   ├── locals.tf
|   |   ├── main.tf
|   |   ├── outputs.tf
|   |   ├── variables.tf
|   |   ├── versions.tf
|   |   ├── examples/
|   |   └── modules/
|   |       ├── aws-eks-fargate-profiles/
|   |       ├── aws-eks-managed-node-groups/
|   |       ├── aws-eks-self-managed-node-groups/
|   |       ├── aws-eks-teams/
|   |       ├── aws-kms/
|   |       ├── emr-on-eks/
|   |       ├── irsa/
|   |       ├── kubernetes-addons/
|   |       └── launch-templates/
|   └── terraform-aws-eks-ack-addons/
├── awslabs/
|   ├── crossplane-on-eks/
|   └── data-on-eks/
├── aws-samples/
|   ├── eks-blueprints-add-ons/
|   └── eks-blueprints-workloads/
└── aws-observability/
    └── terraform-aws-observability-accelerator/
```

#### v5.x Structure
```
├── aws-ia/
|   ├── terraform-aws-eks-blueprints/
|   |   └── examples/
|   ├── eks-addons/                        # Contains addons in supported formats (Terraform, ArgoCD, and FluxCD (future))
|   ├── terraform-aws-eks-addon/           # Module for creating Terraform based addons (IRSA + Helm chart)
|   ├── terraform-aws-eks-ack-addons/
|   └── terraform-aws-eks-multi-tenancy/   # Module for creating Kubernetes multi-tenancy constructs
├── awslabs/
|   ├── crossplane-on-eks/
|   └── data-on-eks/
├── aws-samples/
|   └── eks-blueprints-workloads/
└── aws-observability/
    └── terraform-aws-observability-accelerator/
```
## What Can Users Expect

With these changes, the team intends to provide a better experience for users of the EKS Blueprints project, as well as new and improved reference architectures. Following the v5 changes, users can expect:
1. Improved quality of the examples provided - more information on the intent of the example, why it might be useful for users, in what scenarios the pattern is applicable, etc. Where applicable, architectural diagrams and supporting material will be provided to highlight the intent of the example and how it is constructed.
2. A clearer distinction between a blueprint and a usage reference. For example - the Karpenter on EKS Fargate blueprint should demonstrate all of the aspects that users should be aware of and consider in order to take full advantage of this pattern (recommended practices, observability, logging, monitoring, security, day 2 operations, etc.). This is what makes it a blueprint. In contrast, a usage reference would be an example that shows how users can pass configuration values to the Karpenter provisioner (a hedged sketch follows this list). Such an example is less focused on the holistic architecture and more focused on how one might configure Karpenter using the implementation. The EKS Blueprints repository will focus mostly on holistic architectures and patterns, and usage references will live in the repository that contains the relevant implementation definition (i.e. - the `eks-addons` repository, where the addon implementation is defined).
3. Faster and more responsive feedback. The first part of this will be improved documentation on how to contribute, which should help clarify whether a contribution is likely to be accepted before any effort is spent by the contributor. More broadly, the goal of v5 is to focus on the value-added benefits that EKS Blueprints was created to provide, as opposed to mass producing Helm chart wrappers (addons) and trying to keep up with that operationally intensive process.
4. Lastly, more examples and blueprints that demonstrate various architectures and workloads that run on top of Amazon EKS, as well as integrations with other AWS services.
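
As a hedged illustration of what such a usage reference might contain (chart location and value keys vary by Karpenter version), the example below does little more than show how configuration values are plumbed through:

```hcl
resource "helm_release" "karpenter" {
  name       = "karpenter"
  repository = "oci://public.ecr.aws/karpenter" # Karpenter's published chart location
  chart      = "karpenter"
  namespace  = "karpenter"

  # A usage reference focuses on passing values like these, not on the
  # surrounding architecture, observability, or day 2 operations.
  set {
    name  = "settings.aws.clusterName"
    value = "my-cluster" # hypothetical
  }
}
```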
## To Be Decided

- Which addons will be supported by the EKS Blueprints project, and what are the criteria for determining whether an addon will be supported? (Support here means the project provides the implementation, versus users needing to create the implementation on their own or through some other means.)

- What constitutes a blueprint? That is, what criteria are used in determining whether or not a blueprint should be created? What is the process for creating, and/or requesting, a blueprint?

- How does the team decide where a blueprint will be created/stored (i.e. - in which repository)?
---

I was perusing the branches and commit history (as you do) and could not help noticing this.

I am currently working on building out a "template" for building EKS clusters, and I always like to "think forwards", so I am following some of the ideas from this document, such as using the `terraform-aws-eks` module directly. However, I see that this readme mentions a new external module, and it would seem that it does not exist - or if it does, it is not public yet. Is the messaging here incorrect, in that you have not yet released it? Or have you released it, but it maybe did not get toggled to public yet?

Is there an official forum to discuss this upcoming version? I would be very interested to look at it and possibly contribute.
Well, thank you for perusing!

Yes, that module does exist today. It needs to go through AWS's internal review process before projects can be made public, but for all intents and purposes the module is complete and is just awaiting review before being published publicly.

Here are some snippets from its README for reference: