Should the nmi (Pod Managed Identity) add-on tolerate all taints? #2146
Comments
Ack, and thanks for the feedback. We will add tolerations for these taints to the nmi deployment in the upcoming release. cc @miwithro
This request would be very useful to us. We would like to make use of dedicated node pools, and at present the NMI DaemonSet is not deploying to our tainted node pool.
On the AKS side, we have already added tolerations to the nmi DaemonSet. @adammal, do these tolerations work for you?
Thanks for responding. Unfortunately, no. I've noticed that DaemonSets like kube-proxy and calico-node (for example) are configured with the broad tolerations sketched below, and those DaemonSets are deploying to the tainted node pool as expected. Should I be tainting differently?

Kubernetes version: v1.19.11
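For reference, a kube-proxy-style toleration block looks roughly like this; this is a sketch reconstructed from the tolerations named in the issue body, not the exact manifest quoted in the comment:

```yaml
# Kube-proxy-style tolerations (sketch): the two wildcard entries match
# any taint key, so the DaemonSet schedules onto nodes regardless of
# custom NoSchedule/NoExecute taints.
tolerations:
  - key: CriticalAddonsOnly
    operator: Exists
  - operator: Exists
    effect: NoExecute
  - operator: Exists
    effect: NoSchedule
```

Because the last two entries carry no key, they tolerate any taint with those effects, which is why these DaemonSets land on tainted node pools.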
I've also noticed that the same problem exists with the AKS-AzureKeyVaultSecretsProvider add-on: https://docs.microsoft.com/en-us/azure/aks/csi-secrets-store-driver. The two DaemonSets it establishes do not specify any tolerations and, as a result, do not deploy to tainted node pools either.
Hi @adammal, thanks for the feedback. I will update the nmi DaemonSet tolerations to align with kube-proxy. For the Secrets Store CSI driver, @ZeroMagic will update the toleration settings as well.
Thanks very much guys! Really appreciate it.
Thanks for reaching out. I'm closing this issue as it was marked with "Answer Provided" and it hasn't had activity for 2 days. |
My cluster has four node pools (one system, three user pools). Two of the three user node pools share a taint, and the remaining one has its own (different) taint, so every user node pool is tainted. In this cluster, all workloads need to carry the appropriate tolerations (one or both of those). I was caught by surprise when `nmi` was subject to the taints. All other system add-ons seem to have the standard `:NoExecute op=Exists`, `:NoSchedule op=Exists`, and `CriticalAddonsOnly op=Exists` tolerations, but `nmi` has no tolerations (outside of the default DaemonSet condition taints like node disk-pressure, pid-pressure, etc.).

Two observations:

1. `nmi` doesn't have the `CriticalAddonsOnly` toleration. For me this is fine, because if I taint the system node pool (as suggested by the docs), `nmi` won't be scheduled there, and I have no need for it there. However, I could see some wanting it on the system node pool (even if you shouldn't be running user workloads there).
2. `nmi` doesn't have the `:NoExecute op=Exists` and `:NoSchedule op=Exists` tolerations, so my node pool taints, used for the dedicated-nodes use case (vs. system capacity concerns), are blocking the deployment of `nmi` cluster-wide (see the sketch below).

Should `nmi` -- as deployed as an add-on -- be more permissive in its tolerations?

Kubernetes version: 1.20.2
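To make the dedicated-nodes scenario concrete, here is a minimal sketch; the taint key `dedicated`, the value `workload-a`, and the pod and image names are illustrative, not taken from the issue:

```yaml
# Assumes a user node pool tainted with dedicated=workload-a:NoSchedule.
# A workload pod needs a matching (or wildcard) toleration to land there:
apiVersion: v1
kind: Pod
metadata:
  name: workload-a-pod
spec:
  containers:
    - name: app
      image: mcr.microsoft.com/azuredocs/aks-helloworld:v1
  tolerations:
    - key: dedicated        # matches the node pool taint key
      operator: Equal
      value: workload-a
      effect: NoSchedule
```

A DaemonSet that carries neither this specific toleration nor a wildcard `operator: Exists` toleration is excluded from every tainted user pool, which is exactly what happens to `nmi` here.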