Support Startup Taints #628

Closed
rustrial opened this issue Aug 20, 2021 · 38 comments · Fixed by #1727
Labels
feature (New feature or request) · kep (this feature requires a KEP)

Comments

@rustrial

rustrial commented Aug 20, 2021

Tell us about your request

As of today, if taints are defined in spec.Constraints.taints, the provisioner will not provision nodes for pods that do not have matching tolerations.

Taints will be applied to every node launched by the Provisioner. If specified, the provisioner will not provision nodes for pods that do not have matching tolerations.

However, some cluster add-ons (usually CNI plugins) use temporary taints on new nodes to make sure no Pods are scheduled onto those nodes until the CNI DaemonSet is ready on that node. Usually, those temporary node startup taints are then removed by the DaemonSet or by a corresponding controller. A prominent example is Cilium, which uses the node.cilium.io/agent-not-ready taint.

Adding a toleration to all Pods for such temporary node startup taints would of course defeat their purpose and might lead to dysfunctional Pods, as they would be scheduled and started before the CNI is properly set up on that node. With its current capabilities, karpenter is not powerful enough to support such a setup.

What do you want us to build?

Add something like spec.Constraints.provisioningTaints, which contains taints that must be applied to newly created nodes, but might later be removed by workloads and are ignored by the provisioner for scheduling purposes (see the sketch after the list below). Specifically, karpenter should support the following behaviour:

  • provisioner adds all provisioningTaints to newly created nodes and then forgets about them (does not re-add them if they are later removed by a CNI controller).
  • provisioner waits until all provisioningTaints are removed from a node before it schedules any Pods onto that node.
  • but otherwise provisioner ignores provisioningTaints when making scheduling decisions (i.e. Pods do not need a matching toleration).
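
For illustration, the field could mirror the shape of the existing spec.taints, along the lines of the sketch below (provisioningTaints is only a placeholder name, not an existing field):

spec:
  provisioningTaints:
    # applied at node creation, ignored for scheduling, expected to be removed later by the CNI
    - key: node.cilium.io/agent-not-ready
      value: "true"
      effect: NoSchedule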

Of course, the proposed solution is just a (viable) placeholder and maybe somebody has a better idea to resolve this problem, so feel free to propose other solutions.

Are you currently working around this issue?

Not using karpenter at all :-(

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@rustrial added the feature (New feature or request) label Aug 20, 2021
@ellistarn
Contributor

ellistarn commented Aug 20, 2021

Hey @rustrial, this is a very interesting use case. Thanks for writing it up! We've recently been rethinking the provisioner.spec.taints functionality. Specifically, we're thinking about generating taints in response to tolerations, instead of doing nothing. The primary reason for this is to enable use cases where pods wish to run in isolation and would define custom taints to do so. Presently, this would require creating additional provisioners and putting the taints there, which is a bit awkward and high touch.

For your use case, I think the problem is already solved, regardless of our behavior w.r.t taints. This behavior currently works with the AWS VPC CNI and Calico, though I haven't tested Cilium. Let me walk you through the provisioning process:

  • User creates a typical workload pod PodA (e.g. pause) from a deployment scale out, etc
  • Karpenter sees the pod watch event and begins the provisioning process
  • Karpenter computes the resource requirements of daemonsets + PodA and launches a node
  • Karpenter immediately creates a node object and binds the pod to the node, before the instance is online
  • KCM Node Lifecycle Controller taints the node with NotReady: NoExecute/NoSchedule
  • KCM Daemonset Lifecycle Controller creates a CNI Pod (cilium/awsvpc/calico/etc) and binds it to the node
  • The node connects to the API Server, sees PodA and the CNI Pod, and pulls the images
  • PodA does not execute because of the NotReady: NoExecute taint on the node
  • CNI Pod does execute because it tolerates NotReady: NoExecute
  • CNI Pod does node initialization logic and removes the NotReady taint from the node
  • PodA executes because the node no longer has the NoExecute taint

It's a bit complicated, but the key bit is the NoExecute taint created by the NodeLifecycleController. It's designed to be removed by some daemon pod (typically a CNI), which will cause other pods to execute. For any pods that don't rely on the CNI, you can always tolerate this taint.
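
For example, a pod that doesn't depend on the CNI could carry something like the following tolerations (illustrative pod spec fragment, not something Karpenter adds for you):

tolerations:
  - key: node.kubernetes.io/not-ready
    operator: Exists
    effect: NoExecute
  - key: node.kubernetes.io/unreachable
    operator: Exists
    effect: NoExecute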

It's not clear to me why cilium doesn't use this taint, and instead relies on its own "node.cilium.io/agent-not-ready". I'm curious if it already does this. There may be a really good reason for this, in which case, I like your idea about applying taints at provisioning time. I wonder if we could simplify how this is modeled and use the NoExecute semantic.

spec:
  taints:
    - key: node.cilium.io/agent-not-ready
      effect: NoExecute

Karpenter will apply this taint to all nodes for this provisioner, but it won't be used in scheduling calculations. I think this is roughly the semantic you want.

Thoughts?

@ellistarn
Contributor

ellistarn commented Aug 20, 2021

Checking back in. I played with

spec:
  taints:
    - key: node.cilium.io/agent-not-ready
      effect: NoExecute

It looks like the podlifecyclecontroller will evict pods immediately if a NoExecute taint exists on the node (unless they tolerate this taint, which by default they don't), so this path isn't going to work unless we fundamentally change up our binding behavior, which I'm not excited about. I'm still curious if Cilium supports the node readiness workflow mentioned above.

If not, we may need to consider something like readinessTaints as you're suggesting.

@rustrial
Author

@ellistarn Thanks for your feedback.

KCM Node Lifecycle Controller taints the node with NotReady: NoExecute/NoSchedule

Can you please provide some pointers to the documentation of the NotReady taint you mentioned? I am not aware of such a taint and would like to read up on this to figure out whether we can get this working for Cilium. I am only aware of the common taints mentioned in https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/#taint-nodes-by-condition and those taints are not treated specially by EKS.


It looks like the podlifecyclecontroller will evict pods immediately if a NoExecute taint exists on the node (unless they tolerate this taint, which by default they don't), so this path isn't going to work unless we fundamentally change up our binding behavior, which I'm not excited about.

I think I have an idea for a minimal change to the current Binding behaviour and already started to draft a PR, just to figure out whether this might work (at all).


If not, we may need to consider something like readinessTaints as you're suggesting.

I like the term readinessTaints, it's more descriptive than my placeholder name provisioningTaints.

@ellistarn
Contributor

ellistarn commented Aug 22, 2021

Here's an example node:

apiVersion: v1
kind: Node
metadata:
  creationTimestamp: "2021-08-22T16:45:01Z"
  finalizers:
  - karpenter.sh/termination
  labels:
    karpenter.sh/provisioner-name: default
  name: ip-192-168-58-236.us-west-2.compute.internal
spec:
  providerID: aws:///us-west-2d/i-08429eea8403fbe47
  taints:
  - effect: NoSchedule
    key: karpenter.sh/not-ready
  - effect: NoSchedule
    key: node.kubernetes.io/unreachable
    timeAdded: "2021-08-22T16:45:01Z"
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    timeAdded: "2021-08-22T16:45:01Z"
status:
  allocatable:
    cpu: "8"
    memory: 30310Mi
    pods: "58"
  conditions:
  - lastHeartbeatTime: "2021-08-22T16:45:01Z"
    lastTransitionTime: "2021-08-22T16:45:01Z"
    message: Kubelet never posted node status.
    reason: NodeStatusNeverUpdated
    status: Unknown
    type: Ready
  - lastHeartbeatTime: "2021-08-22T16:45:01Z"
    lastTransitionTime: "2021-08-22T16:45:01Z"
    message: Kubelet never posted node status.
    reason: NodeStatusNeverUpdated
    status: Unknown
    type: MemoryPressure
  - lastHeartbeatTime: "2021-08-22T16:45:01Z"
    lastTransitionTime: "2021-08-22T16:45:01Z"
    message: Kubelet never posted node status.
    reason: NodeStatusNeverUpdated
    status: Unknown
    type: DiskPressure
  - lastHeartbeatTime: "2021-08-22T16:45:01Z"
    lastTransitionTime: "2021-08-22T16:45:01Z"
    message: Kubelet never posted node status.
    reason: NodeStatusNeverUpdated
    status: Unknown
    type: PIDPressure

Until this condition is marked ready,

  - lastHeartbeatTime: "2021-08-22T16:45:01Z"
    lastTransitionTime: "2021-08-22T16:45:01Z"
    message: Kubelet never posted node status.
    reason: NodeStatusNeverUpdated
    status: Unknown
    type: Ready

these taints will be applied by the node lifecycle controller

  - effect: NoSchedule
    key: node.kubernetes.io/unreachable
    timeAdded: "2021-08-22T16:45:01Z"
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    timeAdded: "2021-08-22T16:45:01Z"

See: https://github.com/kubernetes/kubernetes/blob/release-1.17/pkg/controller/nodelifecycle/node_lifecycle_controller.go#L87.

@ellistarn
Contributor

ellistarn commented Aug 22, 2021

I think I have an idea for a minimal change to the current Binding behaviour and already started to draft a PR, just to figure out whether this might work (at all).

Do you mind sharing the idea? I might be able to help.

@ellistarn
Contributor

Did a little digging. The kubelet updates the runtime state for the network here:
https://github.com/kubernetes/kubernetes/blob/3a26b864f4e17a6b7e0e673f583a551820a25b36/pkg/kubelet/kubelet.go#L2334.

This then prevents the node status condition type: Ready from becoming true. So the CNI is required to make this condition true, which will in turn cause the taints to be removed.
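
For reference, until the CNI brings networking up, the kubelet typically reports the node's Ready condition roughly like this (exact wording varies by runtime and kubelet version):

conditions:
  - type: Ready
    status: "False"
    reason: KubeletNotReady
    message: 'container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:cni plugin not initialized'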

@rustrial
Author

rustrial commented Aug 23, 2021

@ellistarn I see, you are talking about https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/#taint-based-evictions and specifically about the following taints, which are set by the "node controller" based on the value of the node's Ready condition:

  • node.kubernetes.io/not-ready (if Ready == False )
  • node.kubernetes.io/unreachable (if Ready == Unknown)
  • no taint (if Ready == True)

I also understand that the node's Ready condition depends on the runtime's NetworkReady condition. However, I still don't see how a CNI plugin like Cilium could influence (set) the node's NetworkReady condition. A short search on the Cilium code base reveals no reference to "NetworkReady" and the same is true for AWS CNI plugin. Are we confusing node and pod networking status (conditions) here? A node might have well working and ready networking (e.g. AWS VPC ENI attached ...) while the CNI plugin, which is responsible for pod networking on that node (and not for the node's own networking needs), is not yet ready.


I think I have an idea for a minimal change to the current Binding behaviour and already started to draft a PR, just to figure out whether this might work (at all).

Do you mind sharing the idea? I might be able to help.

Sure, see #631

@ellistarn
Contributor

Hey @rustrial, I put some comments on the PR. What happens if you simply don't use this taint w/ Cilium? IIUC, it's to prevent a race condition when two CNIs are installed.

@bwagner5 added this to the v0.4.0 milestone Aug 23, 2021
@bwagner5 changed the title from "Support extra taints not considered when scheduling Pods" to "Support Cilium" Aug 23, 2021
@rustrial
Author

@ellistarn Actually, when not using the taint, we observe that occasionally some Pods are not properly managed by Cilium, but this could be caused by another race condition in Cilium and is not necessarily related to that taint. I am also curious what conclusion we get from the Slack thread you started on the Cilium #kubernetes channel; hopefully it will shed some light on this.

@ellistarn
Contributor

Actually, when not using the taint, we observe that occasionally some Pods are not properly managed by Cilium, but this could be caused by another race condition in Cilium and is not necessarily related to that taint.

Do you have any links to this?

@rustrial
Author

Do you have any links to this?

@ellistarn No, I have no link. I just assumed/hoped that this problem would be addressed by Cilium's node.cilium.io/agent-not-ready taint. I propose we focus on the Slack thread for now, trying to figure out when & why that taint is needed and whether we can live without it.

That said, I am convinced that even if it turns out that the Cilium taint is not strictly needed, sooner or later this request will come up again. There will always be another cluster add-on that needs more elaborate control over node readiness and thus will not work (well) with karpenter (as it is today).

@ellistarn
Contributor

I just discovered a new type of Taint Effect

	// NOT YET IMPLEMENTED. TODO: Uncomment field once it is implemented.
	// Like TaintEffectNoSchedule, but additionally do not allow pods submitted to
	// Kubelet without going through the scheduler to start.
	// Enforced by Kubelet and the scheduler.
	// TaintEffectNoScheduleNoAdmit TaintEffect = "NoScheduleNoAdmit"

If this is what I think it is, it means that if we bind the pods to the kubelet, the kubelet would not start the pod until the taints are removed. I think this is the exact behavior that we want.

@rustrial
Author

Looks promising (in theory), but I did not find any KEP or other detailed description of that Taint Effect. All I found are some comments on issues, the golang types, and this design document. None of these sources make it clear what exactly the behaviour of this effect will be, and it also looks like the effort is abandoned (it might not be implemented at all).

Need to do some more research, but I guess we have to ask around in the k8s community and find someone who can shed some light on this.

  • is this still on the roadmap and what is the timeline?
  • what is the exact behaviour? (i.e. can we bind a Pod to a node, and it will not be rejected but only gets started once the taint is removed?)
  • ...

@ellistarn
Contributor

ellistarn commented Aug 27, 2021

I might be interested in picking up this KEP and driving it. It provides exactly the semantic we need without polluting Karpenter's API or requiring changes to the optimistic scheduling logic. There are a ton of great benefits of pre-binding pods, such as preloading configmaps and images. I want to make sure we can keep those benefits while also supporting the ordered execution.

@bwagner5 changed the title from "Support Cilium" to "Support Startup Taints" Sep 23, 2021
@bwagner5 modified the milestones: v0.4.0, v0.5.0 Sep 23, 2021
@stevehipwell
Contributor

I think the CriticalAddonsOnly taint that gets added to EKS nodes (I have no idea by what or where it's documented) and is present as a toleration in certain AWS charts/manifests might be another edge case that needs considering. For this I think Karpenter just needs the ability to ignore such tolerations so it doesn't add them as taints to instances it provisions?

@ellistarn
Contributor

I'd need to dig into this CriticalAddonsOnly detail and where it's coming from. What's the problem that you're trying to solve?

@stevehipwell
Contributor

@ellistarn the docs say that tolerations on workloads will cause taints to be added? This toleration is quite widespread; take a look at the EBS CSI driver chart, for example.
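
For context, the toleration those charts ship looks roughly like this (exact shape may differ per chart):

tolerations:
  - key: CriticalAddonsOnly
    operator: Exists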

@ellistarn
Contributor

ellistarn commented Nov 4, 2021

Sorry if I'm being a bit dense, but what's the problem? If there are pending pods that need capacity (ebs csi driver, in this case), Karpenter will generate nodes with taints for those pods. Pods that do not have these tolerations will not be scheduled to these nodes. This ensures that ebs csi driver will only run alongside other workloads that have CriticalAddonsOnly.

Note: The csi driver includes both a daemonset (which karpenter won't schedule) and controller pods (which karpenter will).

@stevehipwell
Contributor

@ellistarn the point is that this toleration is there to allow scheduling, not to drive isolation. It would be undesirable to create a node with this taint for this workload. It's also quite common to tolerate certain taints on workloads that aren't directly targeted at them; these shouldn't create the taint either.

@ellistarn
Contributor

ellistarn commented Nov 4, 2021

I hear ya.

It would be undesirable to create a node with this taint for this workload.

Can you dig into this a little bit? Why don't you want this pod scheduled?

It's also quite common to tolerate certain taints on workloads that aren't directly targeted at them; these shouldn't create the taint either.

If there is existing capacity without taints, the kube scheduler will schedule them without any taint generation. I definitely agree that it's not necessary for karpenter to add the taints for the pod to be scheduled. However, by generating taints, you can ensure isolation via the pod spec, which wouldn't be possible otherwise without creating multiple provisioners.

This creates a tradeoff where nodes are packed less tightly (since pods without those tolerations could have been scheduled together), but enables the pod isolation feature.

I could potentially see a pod annotation like karpenter.sh/generate-taints=true. What do you think? Do you have any preference for polarity, i.e. whether it's enabled or disabled by default?
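
To make the idea concrete, an opted-in pod might look roughly like this (purely hypothetical: neither the annotation nor the example taint key exists today):

metadata:
  annotations:
    karpenter.sh/generate-taints: "true"
spec:
  tolerations:
    - key: example.com/dedicated # custom taint the pod wants generated on its node
      operator: Exists
      effect: NoSchedule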

@ellistarn
Contributor

ellistarn commented Nov 5, 2021

Thanks for the writeup -- I'll think deeply on this. Do you mind porting it to a new issue, since it seems like we're outside of the scope of the original ticket?

@stevehipwell
Contributor

Thank you for moving this for me @ellistarn I had a busy weekend and didn't get a chance.

@mattsre
Contributor

mattsre commented Feb 4, 2022

Just wanted to check on the status of this. Another Cilium user here, same problem as the original issue. Currently we work around this by simply restarting all pods on a node when the CNI pod finishes starting up, but this is obviously a hacky workaround we'd like to avoid keeping.

I noticed you specifically mentioned you don't want to complicate the API with new concepts, and was just wondering whether, in the couple of months since then, any better ideas had come to mind?

@ellistarn
Contributor

ellistarn commented Feb 4, 2022

Is it possible that this could be fixed in the cilium CNI? Karpenter expresses the intent that Pod A should run on node B. Other controllers should reconcile this intent.

@mattsre
Contributor

mattsre commented Feb 4, 2022

I don't think so. It looks like Cilium uses this taint specifically to block DaemonSets from getting scheduled since they tolerate specific taints - https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/#taints-and-tolerations
Relevant issue/proposal - cilium/cilium#16602

Do provisioners in Karpenter use different launch templates in AWS? If so, could we expose some way for the user to add custom args to bootstrap.sh ... --kubelet-extra-args? Karpenter internally could merge the args provided by the user with those that Karpenter already sets. This would allow adding taints or custom labels users might want.

If there's a concern about it being too specific to AWS, the functionality could be exposed purely as a way to pass args to the provisioned node's kubelet, and the implementation would be left up to the cloud provider.

@ellistarn
Contributor

ellistarn commented Feb 4, 2022

Users can already add custom taints via provisioner.taints which get merged in through kubelet args. This doesn't solve the problem of cilium failing to reconcile pod networking if it's installed after the pods are. IIUC, the kubelet doesn't even attempt to run the pods until the node is ready, and cilium is responsible for triggering node ready, so I want to dig a bit deeper into what the problem is 🤔

@mattsre
Contributor

mattsre commented Feb 4, 2022

Users can already add custom taints via provisioner.taints which get merged in through kubelet args.

Isn't the problem with this that Karpenter uses these taints for scheduling purposes? As in, the pod would have to tolerate these taints for a node to be provisioned. So we can't add the Cilium not-ready taint here, as pods would then have to tolerate it, essentially making the taint a no-op.

IIUC, the kubelet doesn't even attempt to run the pods until the node is ready, and cilium is responsible for triggering node ready

This might be true, I honestly haven't tested this

@ellistarn
Contributor

ellistarn commented Feb 4, 2022

Isn't the problem with this that Karpenter uses these taints for scheduling purposes? As in, the pod would have to tolerate these taints for a node to be provisioned. So we can't add the Cilium not-ready taint here, as pods would then have to tolerate it, essentially making the taint a no-op.

D'oh, yes. I lost the thread 🤦. I remember from some months ago that a user was running Cilium without these taints and things were working for them 🤔 It was a while ago though, so I might be misremembering.

@mKeRix
Contributor

mKeRix commented Feb 22, 2022

I also stumbled upon this problem when using Cilium (in aws-cni chaining mode) yesterday. As a workaround, I ended up utilizing Kyverno to set the node taint with a mutation policy. I chose to set the taint for all nodes, but in theory you could also write a selector that will only mutate nodes with the karpenter.sh/provisioner-name label. Note that for this to work you will need to remove [Node,*,*] from the resourceFilters in the default Kyverno config. My policy looks like this:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-cilium-readiness-taint
spec:
  rules:
  - name: add-cilium-readiness-taint
    match:
      resources:
        kinds:
        - Node
    preconditions:
      any:
      - key: '{{ request.operation }}'
        operator: Equals
        value: CREATE
    mutate:
      patchStrategicMerge:
        spec:
          taints:
          - key: 'node.cilium.io/agent-not-ready'
            value: 'true'
            effect: NoSchedule

Maybe this helps someone else, it's working out well for me. :)
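
In case it helps, the resourceFilters change mentioned above lives in Kyverno's ConfigMap (name and namespace depend on how Kyverno was installed); the only requirement is that the [Node,*,*] triple is absent from the filter string, e.g. (default value trimmed for brevity):

data:
  resourceFilters: '[Event,*,*][*,kube-system,*][*,kube-public,*][*,kube-node-lease,*]'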

@praveen-livspace
Contributor

I also stumbled upon this problem when using Cilium (in aws-cni chaining mode) yesterday. As a workaround, I ended up utilizing Kyverno to set the node taint with a mutation policy. [...] Maybe this helps someone else, it's working out well for me. :)

This looks like an awesome hack. Let me try doing this with OPA Gatekeeper.

@ellistarn added the kep (this feature requires a KEP) label Mar 22, 2022
@zswanson

zswanson commented Mar 31, 2022

I think there is a more generalized problem with daemonsets that need to be running, but that do not add or remove taints: the AWS controllers in particular (aws-vpc-cni, aws-ebs-csi-driver and aws-secretstore-csi-provider). None of them add/remove startup taints, but if a pod that uses their features is bound to a karpenter node, there is a race condition on startup that may leave the pod Pending for a long time. We've caught pods in this state multiple times, stuck for 15-20 minutes on a new node, complaining that resources/capabilities the daemonsets provide to the node are unavailable.

@zswanson

zswanson commented Apr 7, 2022

Commenting again on this to add that when we performed an EKS upgrade from 1.21 to 1.22, this became particularly noticeable once we started draining/cordoning nodes en masse to move to 1.22-compatible karpenter AMIs: a lot of pods were stuck in Pending, waiting for vpc-cni IP assignment or for Secret volumes they couldn't mount, because they started on new nodes before the relevant daemonsets were available.

@BryanStenson-okta
Contributor

Why not just provide provisioner.spec.taintsToIgnore which, optionally, contains a list of taints the provisioner ignores when provisioning new nodes?

If a would-be-created node has a taint in the provisioner.spec.taintsToIgnore list, this taint is excluded when considering if the provisioner should create another node. Being on the ignore list, karpenter could ignore this taint when bin-packing the pod to the node too, right?

@BryanStenson-okta
Contributor

@ellistarn curious if you've found anything here. Specifically, if adding a provisioner.spec.taintsToIgnore will make life difficult in other ways.

I'm stuck on a project that will require me to solve this, or swap Karpenter out for CAS. My strong preference is to stick with a "fixed" Karpenter -- and I have some bandwidth to implement this -- if the proposed solution makes sense.

@BryanStenson-okta
Contributor

Would love feedback on this ^^^ proposed solution. It seems very simple (maybe I'm missing something?).

@ellistarn
Contributor

We were chatting about this in standup this morning. We're playing with an idea that would enable us to not bind pods and allow the kube scheduler to do so. These startup taints could either be implemented as custom userdata or through something like taintsToIgnore (I'd probably name this startupTaints).

With just taintsToIgnore and without something where we don't bind, I don't see how this solves the problem.

@BryanStenson-okta
Contributor

I suppose I see "the problem" as Karpenter not adding nodes to the cluster. I believe taintsToIgnore itself solves this, as this will allow a provisioner to add more nodes to the cluster.

Having said that, there's absolutely a combination of taints which will cause lots of node churn -- that's probably not what we want either (#1671 (comment))

I'd like to hear more about ideas on how to delay or skip binding (although, by skipping binding, we lose some of the sugar Karpenter provides).
