Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

workflows: update AKS workflows with new taints #584

Merged
merged 1 commit into from
Oct 20, 2021
Merged

Conversation

nbusseneau
Copy link
Member

Re-impacted from: cilium/cilium#17529

Context: we recommend users taint all nodepools with node.cilium.io/agent-not-ready=true:NoSchedule to prevent application pods from being managed by the default AKS CNI plugin.

To this end, the proposed workflow users should follow when installing Cilium into AKS was to replace the initial AKS node pool with a new tainted system node pool, as it is not possible to taint the initial AKS node pool, cf. Azure/AKS#1402.

AKS recently pushed a change on the API side that forbids setting up custom taints on system node pools, cf. Azure/AKS#2578.

It is not possible anymore for us to recommend users taint all nodepools with node.cilium.io/agent-not-ready=true:NoSchedule to prevent application pods from being managed by the default AKS CNI plugin.

To work around this new limitation, we propose the following workflow instead:

  • Replace the initial node pool with a system node pool tainted with CriticalAddonsOnly=true:NoSchedule, preventing application pods from being scheduled on it.
  • Create a secondary user node pool tainted with node.cilium.io/agent-not-ready=true:NoSchedule to prevent application pods from being scheduled on the user node pool until Cilium is ready to manage them.

@nbusseneau nbusseneau added the area/CI Continuous Integration testing issue or flake label Oct 15, 2021
@nbusseneau nbusseneau requested review from a team as code owners October 15, 2021 14:15
@nbusseneau nbusseneau requested a review from pchaigno October 15, 2021 14:15
@nbusseneau nbusseneau temporarily deployed to ci October 15, 2021 14:17 Inactive
@nbusseneau
Copy link
Member Author

Link to workflow run testing PR changes: https://github.com/cilium/cilium-cli/actions/runs/1346273735

@nbusseneau
Copy link
Member Author

@tklauser Weird issue while trying to run cilium install -- seems consistent:

2021-10-15T14:44:56.791125839Z 🔑 Generating certificates for Hubble...
2021-10-15T14:44:56.815733603Z 
2021-10-15T14:44:56.815775605Z Error: Unable to install Cilium: unable to create secret kube-system/hubble-server-certs: secrets "hubble-server-certs" already exists
2021-10-15T14:44:56.815802205Z ↩️ Rolling back installation...

This does not ring a bell to me: why would there already be a hubble-server-certs before install?

@tklauser
Copy link
Member

This does not ring a bell to me: why would there already be a hubble-server-certs before install?

Not sure why that would be to be honest, also it looks like the CA is already existing, which is somewhat unexpected in a fresh cluster:

 2021-10-15T14:44:56.791087238Z 🔑 Found existing CA in secret cilium-ca

Is it possible that we somehow ended up (re-)using an existing cluster where Cilium was previously installed and not properly uninstalled?

@nbusseneau
Copy link
Member Author

Is it possible that we somehow ended up (re-)using an existing cluster where Cilium was previously installed and not properly uninstalled?

Not possible as the clusters are unique per workflow run, so they are completely clean when created. I'll try to reproduce manually on Monday...

@tklauser
Copy link
Member

Is it possible that we somehow ended up (re-)using an existing cluster where Cilium was previously installed and not properly uninstalled?

Not possible as the clusters are unique per workflow run, so they are completely clean when created. I'll try to reproduce manually on Monday...

From the sysdump it looks like Cilium is already fully deployed before the installation of the agent DaemonSet and operator Deployment from the cilium-cli-install pod even started 🤔

NAMESPACE     NAME                                  READY   STATUS    RESTARTS   AGE     IP            NODE                                 NOMINATED NODE   READINESS GATES
kube-system   azure-cni-networkmonitor-jf7pc        1/1     Running   0          50s     10.240.0.66   aks-userpool-17851042-vmss000000     <none>           <none>
kube-system   azure-cni-networkmonitor-k4275        1/1     Running   0          48s     10.240.0.97   aks-userpool-17851042-vmss000001     <none>           <none>
kube-system   azure-cni-networkmonitor-nlkf5        1/1     Running   0          98s     10.240.0.35   aks-systempool-17851042-vmss000000   <none>           <none>
kube-system   azure-ip-masq-agent-jpjww             1/1     Running   0          48s     10.240.0.97   aks-userpool-17851042-vmss000001     <none>           <none>
kube-system   azure-ip-masq-agent-l89s7             1/1     Running   0          98s     10.240.0.35   aks-systempool-17851042-vmss000000   <none>           <none>
kube-system   azure-ip-masq-agent-zwc5f             1/1     Running   0          50s     10.240.0.66   aks-userpool-17851042-vmss000000     <none>           <none>
kube-system   cilium-8gqhc                          1/1     Running   0          50s     10.240.0.66   aks-userpool-17851042-vmss000000     <none>           <none>
kube-system   cilium-cli-install-qfxh7              0/1     Error     0          35s     10.240.0.66   aks-userpool-17851042-vmss000000     <none>           <none>
kube-system   cilium-gl87d                          1/1     Running   0          48s     10.240.0.97   aks-userpool-17851042-vmss000001     <none>           <none>
kube-system   cilium-operator-8ff45c8f8-lftqk       1/1     Running   0          35s     10.240.0.97   aks-userpool-17851042-vmss000001     <none>           <none>
kube-system   cilium-rqfs7                          1/1     Running   0          98s     10.240.0.35   aks-systempool-17851042-vmss000000   <none>           <none>
kube-system   coredns-autoscaler-54d55c8b75-przwk   1/1     Running   0          2m47s   10.240.0.45   aks-systempool-17851042-vmss000000   <none>           <none>
kube-system   coredns-d4866bcb7-f4jt4               1/1     Running   0          103s    10.240.0.57   aks-systempool-17851042-vmss000000   <none>           <none>
kube-system   coredns-d4866bcb7-fh6tz               0/1     Running   0          13s     10.240.0.54   aks-systempool-17851042-vmss000000   <none>           <none>
kube-system   kube-proxy-kps52                      1/1     Running   0          48s     10.240.0.97   aks-userpool-17851042-vmss000001     <none>           <none>
kube-system   kube-proxy-m5qfr                      1/1     Running   0          50s     10.240.0.66   aks-userpool-17851042-vmss000000     <none>           <none>
kube-system   kube-proxy-qj654                      1/1     Running   0          98s     10.240.0.35   aks-systempool-17851042-vmss000000   <none>           <none>
kube-system   metrics-server-569f6547dd-n6vfz       1/1     Running   0          2m47s   10.240.0.48   aks-systempool-17851042-vmss000000   <none>           <none>
kube-system   tunnelfront-6dc5bb978-m6hhv           1/1     Running   0          2m47s   10.240.0.42   aks-systempool-17851042-vmss000000   <none>           <none>

@nbusseneau nbusseneau marked this pull request as draft October 18, 2021 13:51
@nbusseneau nbusseneau temporarily deployed to ci October 18, 2021 15:07 Inactive
@nbusseneau
Copy link
Member Author

nbusseneau commented Oct 18, 2021

I found the issue. We use an in-cluster script to run cilium install and the pod running this script is scheduled on the initial system nodepool as it is the only one available at the time it is installed via Helm. But while this is happening, we are deleting the initial system nodepool and replacing it with our own, leading to the pod getting rescheduled on another node. The script then restarts from the beginning, and fails because the secret was already created as part of the previous run.

  • Immediate solution (implemented in next push): wait at least for the initial system nodepool to be deleted before installing our Helm chart.
  • Alternate solution: allow cilium install to install on top of a configuration where secrets are already setup. This would also help in case someone's installation is suddenly interrupted in the middle and re-run later.

@nbusseneau nbusseneau temporarily deployed to ci October 18, 2021 15:30 Inactive
@nbusseneau
Copy link
Member Author

New link to workflow run testing workflow changes: https://github.com/cilium/cilium-cli/actions/runs/1355353366
Test passed, removing temporary commit and draft status.

@nbusseneau nbusseneau marked this pull request as ready for review October 18, 2021 15:53
@nbusseneau nbusseneau temporarily deployed to ci October 18, 2021 15:53 Inactive
Copy link
Member

@pchaigno pchaigno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Immediate solution (implemented in next push): wait at least for the initial system nodepool to be deleted before installing our Helm chart.

Should we push this change to cilium/cilium as well for consistency?

Probably worth a comment in the code, since reading the rest, one would expect a --no-wait in the delete command as well.

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 18, 2021
@pchaigno pchaigno removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 18, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 18, 2021
@pchaigno pchaigno removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 18, 2021
@pchaigno pchaigno added the dont-merge/bad-bot To prevent MLH from marking ready-to-merge. label Oct 18, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 18, 2021
@pchaigno pchaigno removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 18, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 19, 2021
@pchaigno pchaigno removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 19, 2021
@nbusseneau
Copy link
Member Author

nbusseneau commented Oct 19, 2021

Should we push this change to cilium/cilium as well for consistency?

I would personally prefer not to. Arguments:

  • It is not necessary on cilium/cilium because we don't use in-cluster scripts (cilium install is run from the workflow and is thus not interrupted), so it would only slow down cilium/cilium AKS testing (not by much, but we probably gain something like 30 seconds out of it because the Cilium components get installed earlier).
  • Regarding consistency, I think keeping the test code consistent with the way users are supposed to install Cilium whenever possible is more important, and this would be with --no-wait as it is now in the Getting Started Guide.

Probably worth a comment in the code, since reading the rest, one would expect a --no-wait in the delete command as well.

Can do :)

Re-impacted from: cilium/cilium#17529

Context: we recommend users taint all nodepools with `node.cilium.io/agent-not-ready=true:NoSchedule`
to prevent application pods from being managed by the default AKS CNI
plugin.

To this end, the proposed workflow users should follow when installing
Cilium into AKS was to replace the initial AKS node pool with a new
tainted system node pool, as it is not possible to taint the initial AKS
node pool, cf. Azure/AKS#1402.

AKS recently pushed a change on the API side that forbids setting up
custom taints on system node pools, cf. Azure/AKS#2578.

It is not possible anymore for us to recommend users taint all nodepools
with `node.cilium.io/agent-not-ready=true:NoSchedule` to prevent
application pods from being managed by the default AKS CNI plugin.

To work around this new limitation, we propose the following workflow
instead:

- Replace the initial node pool with a system node pool tainted with
  `CriticalAddonsOnly=true:NoSchedule`, preventing application pods from
  being scheduled on it.
- Create a secondary user node pool tainted with `node.cilium.io/agent-not-ready=true:NoSchedule`
  to prevent application pods from being scheduled on the user node pool
  until Cilium is ready to manage them.

Signed-off-by: Nicolas Busseneau <[email protected]>
@nbusseneau nbusseneau temporarily deployed to ci October 19, 2021 13:58 Inactive
@nbusseneau
Copy link
Member Author

@pchaigno Added code comment linking to #589 for tracking purposes, PTAL.

@pchaigno pchaigno added ready-to-merge This PR has passed all tests and received consensus from code owners to merge. and removed dont-merge/bad-bot To prevent MLH from marking ready-to-merge. labels Oct 19, 2021
@nbusseneau
Copy link
Member Author

https://github.com/cilium/cilium-cli/actions/workflows/aks.yaml can be re-enabled once this is merged :)

@aanm aanm merged commit db746f8 into master Oct 20, 2021
@aanm aanm deleted the pr/fix-aks-workflow branch October 20, 2021 14:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/CI Continuous Integration testing issue or flake ready-to-merge This PR has passed all tests and received consensus from code owners to merge.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants