
CLI request: set --node-taints for primary node pool via CLI #1402

Closed
folkol opened this issue Jan 21, 2020 · 26 comments

Comments

@folkol

folkol commented Jan 21, 2020

What happened:

$ az aks create \
    --name foo \
    --resource-group bar \
    --node-taints "foo=bar:NoSchedule"
az: error: unrecognized arguments: --node-taints foo=bar:NoSchedule
usage: az [-h] [--verbose] [--debug]
          [--output {json,jsonc,table,tsv,yaml,none}] [--query JMESPATH]
          {aks} ...

What you expected to happen:

I expected to be able to taint the primary node pool in the same way additional node pools can be tainted:

$ az aks nodepool add \
    --name foo \
    --resource-group bar \
    --cluster-name baz \
    --node-taints "foo=bar:NoSchedule"

How to reproduce it (as minimally and precisely as possible):

See "What you expected".

Anything else we need to know?:

This is related to #1401.

@jluk
Contributor

jluk commented Jan 21, 2020

Edit: First pool taints can only be set during cluster creation, since taints are create-time only right now.

Thanks for the request. Today you can add your own taints to the first node pool when the cluster is being created. You are correct that this functionality is not in the CLI yet, but it can be set via an ARM template after the cluster/pool is created.

Could you share what taints you plan to apply to the primary pool and the scenario behind your tainting goals? Is it to try to isolate system pods (or the like) to the primary pool?

@folkol
Author

folkol commented Jan 22, 2020

I was not aware of the primary node pool taint through ARM templates — that's convenient. (We have instead tainted the individual nodes.)

Our goal with the taints is compartmentalization of the different components of our workload, and it is not directly related to the system pods. We wouldn't mind the system pods being scheduled on the same nodes as our public components — but right now it seems like we need one "untainted" node pool for some of the system pods (see the related ticket about this: #1401).
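For context on the compartmentalization approach: taints only repel pods that lack a matching toleration, so each workload destined for a tainted pool needs one. A minimal sketch, assuming the hypothetical `foo=bar:NoSchedule` taint from earlier in this thread (pod name and image are placeholders):

```yaml
# Pod carrying a toleration for the example taint, so the scheduler
# may place it onto nodes in the tainted pool. Pods without this
# toleration are kept off those nodes.
apiVersion: v1
kind: Pod
metadata:
  name: example-tolerating-pod
spec:
  containers:
    - name: app
      image: nginx  # placeholder image
  tolerations:
    - key: "foo"
      operator: "Equal"
      value: "bar"
      effect: "NoSchedule"
```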

@jluk jluk changed the title Feature request: --node-taints for primary node pool CLI request: set --node-taints for primary node pool via CLI Jan 23, 2020
@jluk
Contributor

jluk commented Jan 23, 2020

Manually adding node taints via kubectl won't guarantee persistence/reapplication of those taints on new nodes created during scale or upgrade events. If you rely on taints for scheduling logic, you should set them through the AKS API definition for an agent pool.

Will leave this ticket open for tracking this CLI gap and jump to the related issue.
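To illustrate the persistence difference described above (node and resource names are hypothetical, and both commands assume an existing cluster):

```shell
# Per-node taint: applied to one node object only. It is lost when
# that node is replaced during a scale or upgrade event, and new
# nodes in the pool never receive it.
kubectl taint nodes aks-nodepool1-12345678-vmss000000 foo=bar:NoSchedule

# Pool-level taint: part of the agent pool definition in the AKS
# API, so it is stamped onto every current and future node in the pool.
az aks nodepool add \
    --resource-group bar \
    --cluster-name baz \
    --name tainted \
    --node-taints "foo=bar:NoSchedule"
```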

@jluk jluk self-assigned this Jan 23, 2020
@ondrejhlavacek

@jluk How do you add a node taint to an existing node pool using ARM templates? I'm getting

Deployment failed. Correlation ID: ***. {
  "code": "PropertyChangeNotAllowed",
  "message": "Changing property 'properties.nodeTaints' is not allowed.",
  "target": "properties.nodeTaints"
}

@jluk
Contributor

jluk commented Jan 27, 2020

@ondrejhlavacek apologies, my previous comment was incorrect that you can do a post-create update of pool taints. I've updated it.

This property is only set during pool creation - so you will need to define the taints on the first agent pool when the cluster is created in the ARM template.
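A sketch of what that looks like in the cluster resource at create time (the `apiVersion`, VM size, and names below are illustrative assumptions, not taken from this thread — check the ARM reference for your API version):

```json
{
  "type": "Microsoft.ContainerService/managedClusters",
  "apiVersion": "2020-03-01",
  "name": "bar",
  "properties": {
    "agentPoolProfiles": [
      {
        "name": "nodepool1",
        "count": 3,
        "vmSize": "Standard_DS2_v2",
        "mode": "System",
        "nodeTaints": [
          "foo=bar:NoSchedule"
        ]
      }
    ]
  }
}
```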

@ondrejhlavacek

@jluk Thanks for the clarification! Why is this property read-only during update?

@jluk
Contributor

jluk commented Jan 27, 2020

Today it's not a mutable property because there is no reconciliation mechanism for cleanup/reapplication handling. That capability is on the backlog, but we need to finish some higher-priority work before supporting it.

@ernani

ernani commented Apr 7, 2020

Weird thing here: I've added taints to a node pool I just created, and the output format doesn't match the docs.

TL;DR: Any reason why in my case it shows "foo=bar:NoSchedule" while the doc shows "foo": "bar:NoSchedule"?

az aks nodepool add \
    --resource-group foo \
    --cluster-name bar \
    --name footaint \
    --node-count 1 \
    --node-taints "foo=bar:NoSchedule"

But after creating it, the list shows as:

...
"nodeTaints": [
    "foo=bar:NoSchedule"
],
...

Whereas in the doc linked here:
https://docs.microsoft.com/en-us/azure/aks/use-multiple-node-pools#specify-a-taint-label-or-tag-for-a-node-pool

It shows as:

...
"nodeTaints":  {
  "foo": "bar:NoSchedule"
},
...

@poochwashere

poochwashere commented May 19, 2020

@jluk Sorry for the silly question, but I have been trying to deploy an ARM template with "apiVersion": "2020-03-01" against "type": "Microsoft.ContainerService/managedClusters/agentPools", with the taint setting "nodeTaints": "CriticalAddonsOnly=true:NoSchedule", and I get this error no matter what I try.
Any clues would be GREATLY appreciated!

{
    "code": "UnmarshalError",
    "message": "UnmarshalEntity encountered error: json: cannot unmarshal string into Go struct field AgentPoolProperties.properties.nodeTaints of type []string."
}

@felipecruz91

felipecruz91 commented Jun 1, 2020

@poochwashere the nodeTaints property is defined as type array. Here's an example of how to use it in the ARM template:

"nodeTaints": [
    "CriticalAddonsOnly=true:NoSchedule"
]
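Put together with the agentPools resource from the earlier comment, the failing string value becomes a one-element array. A sketch under the same `2020-03-01` API version (the name, count, VM size, and mode below are illustrative assumptions):

```json
{
  "type": "Microsoft.ContainerService/managedClusters/agentPools",
  "apiVersion": "2020-03-01",
  "name": "myCluster/systempool",
  "properties": {
    "count": 1,
    "vmSize": "Standard_DS2_v2",
    "mode": "System",
    "nodeTaints": [
      "CriticalAddonsOnly=true:NoSchedule"
    ]
  }
}
```

The `UnmarshalError` above is the API rejecting a JSON string where a `[]string` is expected, which is why wrapping the value in brackets resolves it.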

@ghost ghost added the action-required label Jul 22, 2020
@ghost

ghost commented Jul 27, 2020

Action required from @Azure/aks-pm

@ghost ghost added the Needs Attention 👋 Issues needs attention/assignee/owner label Jul 27, 2020
@ghost

ghost commented Aug 6, 2020

Issue needing attention of @Azure/aks-leads

@palma21
Member

palma21 commented Aug 6, 2020

Moving to feature request to update taints.

@palma21 palma21 added feature-request Requested Features and removed Needs Attention 👋 Issues needs attention/assignee/owner action-required labels Aug 6, 2020
@jeanfrancoislarente

jeanfrancoislarente commented Sep 25, 2020

Any update on this @palma21?

We're trying to create a system-only node pool with taints at cluster creation, but taints are not supported during cluster creation and are not supported on nodepool update.

--node-taints CriticalAddonsOnly=true:NoSchedule

Thanks in advance!!

@palma21
Member

palma21 commented Sep 26, 2020

@jeanfrancoislarente I have a separate issue for that one which we are working on right now. #1833

Will provide updates there soon

@bplasmeijer

@jeanfrancoislarente I have a separate issue for that one which we are working on right now. #1833

Will provide updates there soon

Thanks @palma21

nbusseneau added a commit to cilium/cilium that referenced this issue Oct 4, 2021
Context: we recommend users taint all nodepools with
`node.cilium.io/agent-not-ready=true:NoSchedule` to prevent application
pods from being managed by the default AKS CNI plugin.

To this end, the proposed workflow users should follow when installing
Cilium into AKS was to replace the initial AKS node pool with a new
tainted system node pool, as it is not possible to taint the initial AKS
node pool, cf. Azure/AKS#1402

AKS recently pushed a change on the API side that forbids setting up
custom taints on system node pools, cf. Azure/AKS#2578

It is not possible anymore for us to recommend users taint all nodepools
with `node.cilium.io/agent-not-ready=true:NoSchedule` to prevent
application pods from being managed by the default AKS CNI plugin.

To work around this new limitation, we propose the following workflow
instead:

- Replace the initial node pool with a system node pool tainted with
  `CriticalAddonsOnly=true:NoSchedule`, preventing application pods from
  being scheduled on the system node pool.
- Create a secondary user node pool tainted with `node.cilium.io/agent-not-ready=true:NoSchedule`
  to prevent application pods from being scheduled on the user node pool
  until Cilium is ready to manage them.

Signed-off-by: Nicolas Busseneau <[email protected]>
nbusseneau added a commit to cilium/cilium that referenced this issue Oct 12, 2021
nbusseneau added a commit to cilium/cilium that referenced this issue Oct 13, 2021
aanm pushed a commit to cilium/cilium that referenced this issue Oct 13, 2021
nbusseneau added a commit to cilium/cilium-cli that referenced this issue Oct 15, 2021
joamaki pushed a commit to joamaki/cilium that referenced this issue Oct 18, 2021
nbusseneau added a commit to cilium/cilium-cli that referenced this issue Oct 18, 2021
joamaki pushed a commit to cilium/cilium that referenced this issue Oct 19, 2021
nbusseneau added a commit to cilium/cilium-cli that referenced this issue Oct 19, 2021
aanm pushed a commit to cilium/cilium-cli that referenced this issue Oct 20, 2021
@ramazankilimci

Is there any update on this?

@dmeytin

dmeytin commented Nov 23, 2021

+1 on this feature

@guiliguili

+1

@samuel-form3

+1

@haitch
Member

haitch commented Jan 13, 2022

We are working on a change to allow --node-taints to be updated without triggering a VMSS re-image. We expect to have it ready, and the CLI will be updated after that.

@michelefa1988

@haitch when will this fix be out? It is quite critical that we can prevent pods from being scheduled on the system node pool, especially when that single node is getting full.

@mezzofix

any updates on this ?

@ghost ghost added the action-required label Dec 10, 2022
@palma21
Member

palma21 commented Feb 3, 2023

All items in this issue should now be concluded.
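For readers landing here later: by the time this issue was closed, taints could be set on the first pool at create time and updated in place on an existing pool, along the lines of the commands below (verify the flags against your azure-cli version; support varies by release):

```shell
# Taint the initial (system) node pool at cluster creation
az aks create \
    --name foo \
    --resource-group bar \
    --nodepool-taints "CriticalAddonsOnly=true:NoSchedule"

# Update taints on an existing node pool without recreating it
az aks nodepool update \
    --resource-group bar \
    --cluster-name foo \
    --name nodepool1 \
    --node-taints "CriticalAddonsOnly=true:NoSchedule"
```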

@ghost

ghost commented Feb 11, 2023

Thank you for the feature request. I'm closing this issue as this feature has shipped and it hasn't had activity for 7 days.

@ghost ghost closed this as completed Feb 11, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Mar 13, 2023