
Transition to EKS managed addons (Enables NetworkPolicy enforcement via aws CNI) #4660

Closed · 14 tasks done
consideRatio opened this issue Aug 22, 2024 · 5 comments · Fixed by #5129

Comments

consideRatio (Contributor) commented Aug 22, 2024

There is a banner in https://eksctl.io/, linking to Cluster creation flexibility for networking add-ons:

[screenshot of the eksctl.io banner announcing cluster creation flexibility for networking add-ons]

I think we should act on this and transition from installing aws-cni, kube-proxy, and coredns as eksctl-managed (but EKS self-managed) addons with dedicated upgrade commands etc., towards installing them as EKS managed addons enabled via eksctl config.

Motivation

This is the new approach, and the old approach is being phased out, I think. The new approach is what the AWS docs describe, and only through it can we, for example, figure out how to enable network policy enforcement for the aws-cni plugin.

For example, consider this part of an eksctl config example in the AWS docs, which uses the new approach where vpc-cni, coredns, and kube-proxy are explicitly listed as EKS managed addons. This accepts configurationValues with enableNetworkPolicy, but the old eksctl self-installed vpc-cni addon can't be configured like this, which leaves us unable to enable network policy enforcement.

addons:
  - name: vpc-cni
    configurationValues: |-
      enableNetworkPolicy: "true"
  - name: coredns
  - name: kube-proxy
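
As a hedged aside (not from the AWS docs quoted above): my understanding is that when enableNetworkPolicy takes effect, the aws-node pods gain a second container running the VPC CNI's network policy agent, and a policy-endpoint CRD shows up in the cluster. Something like the following could be used as a sanity check, though the exact container and CRD names below are assumptions and may vary by CNI version:

# Assumed names: an "aws-network-policy-agent" container in the aws-node
# daemonset, and the policyendpoints.networking.k8s.aws CRD it installs.
kubectl -n kube-system get daemonset aws-node \
  -o jsonpath='{.spec.template.spec.containers[*].name}'
kubectl get crd policyendpoints.networking.k8s.aws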

Preliminary steps

EDIT: re-creation of addons could be one strategy to try, but also see #4660 (comment).

  • Ensure you have the latest eksctl version

  • Trial and verify re-creation of addons:
    This should be seen as disruptive maintenance, so don't do it in an AWS cluster that has active users now or that you believe soon will have them. It's not clear how to revert a change like this, because there is no good documentation on doing this transition.

    Documentation about re-creating addons is available here, and here is a key section screenshotted:
    [screenshot of the key section of the addon re-creation documentation]

    Note the resolveConflicts: overwrite config, it's very relevant for us! (See the sketch after this list.)

    Initial advice:

    • Before re-creating, open a terminal to watch the associated pods in kube-system so you get quick feedback when executing the re-creation command (the vpc-cni addon's pods are the aws-node pods).
    • Before re-creating, check the version of the software in the associated pods by inspecting the image in the pod's manifest. Ideally we don't have to pin a version, but it is good to know what we previously had before overwriting it.

    Addon specific steps

    • Trial re-creating the coredns addon
    • Verify that the coredns addon still functions OK, and try to resolve any issues otherwise
    • Trial re-creating the kube-proxy addon
    • Verify that the kube-proxy addon still functions OK, and try to resolve any issues otherwise
    • Trial re-creating the vpc-cni addon
    • Verify that the vpc-cni addon still functions OK, and try to resolve any issues otherwise
  • Update k8s upgrade docs about addon updating.
    I think this means reducing the four commands in the docs to just the last one, since that command updates all addons listed in the config.

  • Update terraform template for new clusters to include addons listing vpc-cni, coredns and kube-proxy
    I think this makes sense, but I'm not sure.

  • Apply the trialed transition to all other EKS clusters.
    EKS clusters can be listed by deployer config get-clusters --provider=aws.
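
To make the re-creation trial above concrete, here is a rough sketch, not a verified procedure: the cluster name and config filename are placeholders, and it assumes vpc-cni, coredns, and kube-proxy are listed under addons: in the cluster config, each with resolveConflicts: overwrite.

# Watch the addon pods for quick feedback while re-creating
# (the vpc-cni addon's pods are the aws-node pods)
kubectl -n kube-system get pods --watch

# Record the currently deployed images before overwriting anything
kubectl -n kube-system get daemonset aws-node kube-proxy -o wide
kubectl -n kube-system get deployment coredns -o wide

# Re-create the addons declared in the config as EKS managed addons
eksctl create addon --config-file=$CLUSTER_NAME.eksctl.yaml

# Confirm they are now listed as EKS managed addons
eksctl get addons --cluster=$CLUSTER_NAME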

Definition of done

  • Updated and applied all existing eksctl cluster configs to have addons listing vpc-cni, coredns, kube-proxy.
  • Updated k8s upgrade docs about addons upgrades
  • Updated new cluster template if needed
consideRatio (Contributor, Author) commented:

I think this work makes #4661 less relevant, but doing #4661 first can help reduce differences between clusters, which in turn helps this issue be worked on without hiccups.

I propose we do #4661 before this for that reason, then get this done, and then #4652 could potentially be resolved by applying a simple eksctl config addition to each of our EKS clusters.

 addons:
   - name: vpc-cni
+    configurationValues: |-
+      enableNetworkPolicy: "true"
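
Once that addition is applied, a quick smoke test could be worth doing. Here is a minimal sketch (the np-test namespace and the web/probe names are made up for illustration): a default-deny ingress policy in a scratch namespace should actually block traffic if enforcement is working.

kubectl create namespace np-test
kubectl -n np-test run web --image=nginx --port=80 --expose
# Deny all ingress traffic to pods in the namespace
kubectl -n np-test apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all-ingress
spec:
  podSelector: {}
  policyTypes: ["Ingress"]
EOF
# With enforcement enabled this should time out; without it, nginx responds
kubectl -n np-test run probe --rm -it --restart=Never --image=busybox -- \
  wget -qO- -T 5 http://web
kubectl delete namespace np-test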

consideRatio (Contributor, Author) commented:

I've not assigned an allocation to this. It could be internal engineering if it's considered routine maintenance effort, or driven by product dev if seen as a path towards NetworkPolicy enforcement.

yuvipanda (Member) commented:

Given various other priorities and commitments, let's roll this into the next round of EKS upgrades (when they happen in a few months).

yuvipanda (Member) commented:

I've put this on the internal engineering roadmap and prioritized it accordingly.

consideRatio (Contributor, Author) commented:

Looking at eksctl update addon --help for other reasons, I saw:

      --force                                          Force migrates an existing self-managed add-on to an EKS managed add-on

So maybe this can be done by simply declaring the addons explicitly, and then running eksctl update addon --force --config-file=$CLUSTER_NAME.eksctl.yaml.
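
If that works, it might be worth following up with a check that the migration and the network policy setting actually took effect; a small sketch (cluster name is a placeholder):

# The addons should now be listed as EKS managed addons
eksctl get addons --cluster=$CLUSTER_NAME

# And the vpc-cni addon should report the configurationValues we set
aws eks describe-addon --cluster-name $CLUSTER_NAME --addon-name vpc-cni \
  --query 'addon.configurationValues'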
