CFP: Update Cilium Helm install docs for EKS and the AWS VPC CNI #31041

Open
caleb-devops opened this issue Feb 28, 2024 · 10 comments
Labels
area/documentation Impacts the documentation, including textual changes, sphinx, or other doc generation code. help-wanted Please volunteer for this by adding yourself as an assignee! integration/cloud Related to integration with cloud environments such as AKS, EKS, GKE, etc. kind/cfp kind/feature This introduces new functionality.

Comments

@caleb-devops

caleb-devops commented Feb 28, 2024

Cilium Feature Proposal

Is your proposed feature related to a problem?

The documentation for installing Cilium on EKS with Helm currently recommends patching the VPC CNI with kubectl so that Cilium manages ENIs instead of the VPC CNI. While this works, it adds a manual step that prevents bootstrapping a Cilium EKS cluster using Terraform or eksctl.

# Relevant code
kubectl -n kube-system patch daemonset aws-node --type='strategic' -p='{"spec":{"template":{"spec":{"nodeSelector":{"io.cilium/aws-node-enabled":"true"}}}}}'

Describe the feature you'd like

Please update the docs to instead recommend using addon configuration values to patch the vpc-cni at the time it is deployed. Note that nodeSelector is not one of the addon's configurable values, so affinity must be used instead.

The VPC CNI can be configured not to run on Cilium-managed nodes using the following configuration values:

{"affinity":{"nodeAffinity":{"requiredDuringSchedulingIgnoredDuringExecution":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"io.cilium/aws-node-enabled","operator":"In","values":["true"]}]}]}}}}
@caleb-devops caleb-devops added the kind/feature This introduces new functionality. label Feb 28, 2024
@joestringer
Member

Sounds like this would be quite helpful. The next step would be creating a concrete PR proposal.

@joestringer joestringer added help-wanted Please volunteer for this by adding yourself as an assignee! area/documentation Impacts the documentation, including textual changes, sphinx, or other doc generation code. integration/cloud Related to integration with cloud environments such as AKS, EKS, GKE, etc. labels Mar 4, 2024
@Smana
Contributor

Smana commented Mar 26, 2024

Hi @caleb-devops, thanks for the tip, but when I apply this configuration before installing Cilium, the coredns addon doesn't start (obviously because no CNI is found).

@caleb-devops
Author

Hi @Smana. CoreDNS requires that the CNI is deployed, so with the vpc-cni configuration values in place, Cilium will need to be installed before CoreDNS can run. The recommended node taint should prevent other pods (like coredns) from being scheduled on the node until Cilium is deployed.

  taints:
   - key: "node.cilium.io/agent-not-ready"
     value: "true"
     effect: "NoExecute"
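
For Terraform users, the same taint could be expressed as a node group input of the terraform-aws-modules/eks/aws module. This is a minimal sketch rather than something taken from this thread: the node group name `cilium` and the map key `cilium_agent_not_ready` are hypothetical, other node group settings are omitted, and module versions differ in whether `taints` is a map or a list, so check the version in use. The effect uses the EKS API spelling `NO_EXECUTE` rather than the Kubernetes spelling `NoExecute`.

```hcl
# Sketch only: fragment of a terraform-aws-modules/eks/aws module block.
# The node group name "cilium" is hypothetical.
eks_managed_node_groups = {
  cilium = {
    taints = {
      cilium_agent_not_ready = {
        key    = "node.cilium.io/agent-not-ready"
        value  = "true"
        effect = "NO_EXECUTE" # EKS API spelling of NoExecute
      }
    }
  }
}
```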

@Smana
Contributor

Smana commented Mar 27, 2024

Thanks @caleb-devops. Actually, I already have a toleration. However, the Cilium install only starts after the EKS module deployment has finished (including CoreDNS, which is an EKS addon).

@caleb-devops
Author

caleb-devops commented Mar 28, 2024

@Smana you don't need to add the toleration to CoreDNS. Because CoreDNS relies on the CNI, it will need to be deployed after Cilium is installed. For the terraform-aws-modules/eks/aws module, try the following:

  1. Set vpc-cni configuration_values in the terraform-aws-modules/eks/aws module

      cluster_addons = {
        vpc-cni = {
          most_recent    = true
          # Deploy the addon before the node groups are created
          before_compute = true

          # Schedule aws-node only on nodes labeled
          # io.cilium/aws-node-enabled=true
          configuration_values = jsonencode({
            affinity = {
              nodeAffinity = {
                requiredDuringSchedulingIgnoredDuringExecution = {
                  nodeSelectorTerms = [{
                    matchExpressions = [{
                      key      = "io.cilium/aws-node-enabled"
                      operator = "In"
                      values   = ["true"]
                    }]
                  }]
                }
              }
            }
          })
        }
      }
  2. Install the Cilium Helm chart using the Terraform Helm provider

  3. Install remaining addons (I use the terraform-aws-eks-blueprints-addons module for this)
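
Steps 2 and 3 above could be sketched in Terraform roughly as follows. This is an illustration under stated assumptions, not the author's actual configuration: it assumes the EKS module is declared as `module "eks"`, that the Helm provider is already configured against the new cluster, and it uses a plain `aws_eks_addon` resource for CoreDNS as a simpler stand-in for the terraform-aws-eks-blueprints-addons module mentioned in step 3. The Helm values are the ENI-mode settings from the Cilium EKS Helm instructions and should be verified against the docs for the chart version being installed.

```hcl
# Sketch only: resource names "cilium" and "coredns" are illustrative.
resource "helm_release" "cilium" {
  name       = "cilium"
  repository = "https://helm.cilium.io/"
  chart      = "cilium"
  namespace  = "kube-system"

  # ENI-mode values; verify against the Cilium docs for your chart version.
  values = [yamlencode({
    eni                        = { enabled = true }
    ipam                       = { mode = "eni" }
    egressMasqueradeInterfaces = "eth0"
    routingMode                = "native"
  })]

  # Wait for the cluster (and the patched vpc-cni addon) to exist.
  depends_on = [module.eks]
}

# CoreDNS is installed only after Cilium, since it needs a working CNI.
resource "aws_eks_addon" "coredns" {
  cluster_name = module.eks.cluster_name
  addon_name   = "coredns"

  depends_on = [helm_release.cilium]
}
```

The explicit `depends_on` on the CoreDNS addon is what enforces the ordering discussed in this thread: if CoreDNS is instead declared inside `cluster_addons`, it is created as part of the EKS module, before the Cilium release can start.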


This issue has been automatically marked as stale because it has not
had recent activity. It will be closed if no further activity occurs.

@github-actions github-actions bot added stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. and removed stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. labels May 28, 2024
@caleb-devops
Author

The AWS EKS team will be adding an option to initialize a bare EKS cluster (without any addons) through aws/containers-roadmap#923. After they do, it should no longer be necessary to patch the VPC CNI to disable it.

@aanm aanm added the kind/cfp label Jun 20, 2024
@caleb-devops
Author

@truongnht

@caleb-devops may I know your eventual script to set up EKS together with Cilium in one go?

@jgalliers

@caleb-devops may I know your eventual script to set up EKS together with Cilium in one go?

@caleb-devops I am very interested in this too. I'm deploying a bare EKS cluster and hitting a strange ordering problem: CoreDNS refuses to become healthy, so the nodes stall in the NotReady state.


6 participants