diff --git a/website/content/en/docs/faqs.md b/website/content/en/docs/faqs.md index 537db5015652..07d342a798a2 100644 --- a/website/content/en/docs/faqs.md +++ b/website/content/en/docs/faqs.md @@ -6,43 +6,44 @@ weight: 30 ## General ### How does a Provisioner decide to manage a particular node? -Each node will have a set of predetermined Karpenter labels. Provisioners will use the `name` and `namespace` labels to distinguish between Provisioners. Furthermore, a Provisioner will only take action on a node based on the label that details what phase a node is in, e.g. a Provisioner will only consider a node for termination if its phase label says `"underutilized"`. +Karpenter will only take action on nodes that it provisions. All nodes launched by Karpenter will contain be labeled with `karpenter.sh/provisioner-name`. ## Compatibility ### Which Kubernetes versions does Karpenter support? Karpenter releases on a similar cadence to upstream Kubernetes releases. Currently, Karpenter is compatible with Kubernetes versions v1.19+. However, this may change in the future as Karpenter takes dependencies on new Kubernetes features. ### Can I use Karpenter alongside another node management solution? -Provisioners are designed to work alongside static capacity management solutions like EKS Managed Node Groups and EC2 Auto Scaling Groups. Some customers may choose to (1) manage the entirety of their capacity using Provisioner, others may prefer (2) a mixed model with both dynamic and statically managed capacity, some may prefer (3) a fully static approach. We anticipate that most customers will fall into bucket (2) in the short term, and (1) in the long term. +Provisioners are designed to work alongside static capacity management solutions like EKS Managed Node Groups and EC2 Auto Scaling Groups. Some customers may choose to (1) manage the entirety of their capacity using Provisioners, others may prefer (2) a mixed model with both dynamic and statically managed capacity, some may prefer (3) a fully static approach. We anticipate that most customers will fall into bucket (2) in the short term, and (1) in the long term. ### Can I use Karpenter with the Kubernetes Cluster Autoscaler? -Yes, with side effects. Karpenter is a Cluster Autoscaler replacement. Both systems scale up nodes in response to unschedulable pods. If configured together, both systems will race to launch new instances for these pods. Since Provisioners make binding decisions, Karpenter will typically win the scheduling race. In this case, the Cluster Autoscaler will eventually scale down the unnecessary capacity. If the Cluster Autoscaler is configured with Node Groups that have constraints that aren’t supported by any Provisioner, its behavior will continue unimpeded. +Yes, with side effects. Karpenter is a Cluster Autoscaler replacement. Both systems scale up nodes in response to unschedulable pods. If configured together, both systems will race to launch new instances for these pods. Since Karpenter makes binding decisions, Karpenter will typically win the scheduling race. In this case, the Cluster Autoscaler will eventually scale down the unnecessary capacity. If the Cluster Autoscaler is configured with Node Groups that support scheduling constraints that aren’t supported by any Provisioner, its behavior will continue unimpeded. +### Does Karpenter replace the Kube Scheduler? +No. Provisioners work in tandem with the Kube Scheduler. When capacity is unconstrained, the Kube Scheduler will schedule pods as usual. It may schedule pods to nodes managed by Provisioners or other types of capacity in the cluster. Provisioners only attempt to schedule pods when `type=PodScheduled,reason=Unschedulable`. In this case, Karpenter will make a provisioning decision, launch new capacity, and bind pods to the provisioned nodes. Unlike the Cluster Autoscaler, Karpenter does not wait for the Kube Scheduler to make a scheduling decision, as the decision is already made during the provisioning decision. It's possible that a node from another management solution, like the Cluster Autoscaler, could create a race between the `kube-scheduler` and Karpenter. In this case, the first binding call will win, although Karpenter will often win these race conditions due to its performance characteristics. If Karpenter loses this race, the node will eventually be cleaned up. ## Provisioning ### How should I define scheduling constraints? -Karpenter takes a layered approach to scheduling constraints. Each Cloud Provider has its own set of global defaults, which are overriden by defaults specified in the Provisioner, which are overridden by Pod scheduling constraints. This model requires minimal configuration for most use cases, and supports diverse workloads using a single Provisioner. -### Does Karpenter replace the Kube Scheduler? -No. Provisioners work in tandem with the Kube Scheduler. When capacity is unconstrained, the Kube Scheduler will schedule pods as usual. It may schedule pods to nodes managed by Provisioners or other types of capacity in the cluster. Provisioners only attempt to schedule pods when `type=PodScheduled,reason=Unschedulable`. In this case, they will make a provisioning decision, launch new capacity, and bind pods to the provisioned nodes. Provisioners do not wait for the Kube Scheduler to make a scheduling decision in this case, as the decision is already made by nature of making a provisioning decision. It's possible that a node from another management solution, like the Cluster Autoscaler, could create a race between the `kube-scheduler` and Karpenter. In this case, the first binding call will win, although Karpenter will often win these race conditions due to its performance characteristics. If Karpenter loses this race, the node will eventually be cleaned up. +Karpenter takes a layered approach to scheduling constraints. Karpenter comes with a set of global defaults, which may be overriden by Provisioner-level defaults. Further, these may be overriden by pod scheduling constraints. This model requires minimal configuration for most use cases, and supports diverse workloads using a single Provisioner. ### Does Karpenter support node selectors? -Yes. Node selectors are an opt-in mechanism which allow customers to specify the nodes on which a pod can scheduled. Provisioners recognize well-known node selectors on incoming pods and use them to constrain the nodes they generate. You can read more about the well-known node selectors Karpenter supports in the [Concepts](/docs/concepts/#well-known-labels) documentation. For example, well known selectors like `node.kubernetes.io/instance-type`, `topology.kubernetes.io/zone`, `kubernetes.io/os`, `kubernetes.io/arch` are supported, and will ensure that provisioned nodes are constrained accordingly. Additionally, customers may specify arbitrary labels, which will be automatically applied to every node launched by the Provisioner. +Yes. Node selectors are an opt-in mechanism which allow customers to specify the nodes on which a pod can scheduled. Karpenter recognizes [well-known node selectors](https://kubernetes.io/docs/reference/labels-annotations-taints/) on unschedulable pods and uses them to constrain the nodes it provisions. You can read more about the well-known node selectors supported by Karpenter in the [Concepts](/docs/concepts/#well-known-labels) documentation. For example, `node.kubernetes.io/instance-type`, `topology.kubernetes.io/zone`, `kubernetes.io/os`, `kubernetes.io/arch` are supported, and will ensure that provisioned nodes are constrained accordingly. Additionally, customers may specify arbitrary labels, which will be automatically applied to every node launched by the Provisioner. + ### Does Karpenter support taints? -Yes. Taints are an opt-out mechanism which allows customers to specify the nodes on which a pod cannot be scheduled. Unlike labels, Provisioners do not automatically taint nodes in response to pod tolerations, since pod tolerations do not require that corresponding taints exist. However, similar to labels, customers may specify taints for their Provisioner, which will automatically be applied to every node in the group. This means that if a Provisioner is configured with taints, any incoming pods will not be provisioned unless they tolerate the taints. +Yes. Taints are an opt-out mechanism which allows customers to specify the nodes on which a pod cannot be scheduled. Unlike node selectors, Karpenter does not automatically taint nodes in response to pod tolerations. Similar to node selectors, customers may specify taints on their Provisioner, which will be automatically added to every node it provisions. This means that if a Provisioner is configured with taints, any incoming pods will not be scheduled unless the taints are tolerated. ### Does Karpenter support topology spread constraints? -Yes. Provisioners respect `pod.spec.topologySpreadConstraints`. Allocating pods with these constraints may yield highly fragmented nodes, due to their strict nature and complexity of “online binpacking” algorithms. -### Does Karpenter support affinity? -No. Karpenter intentionally does not support affinity due the to [scalability limitations](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#inter-pod-affinity-and-anti-affinity) outlined by SIG Scalability. Instead, we recommend using node selectors or taints instead of node affinity and pod topology spread instead of pod affinity. Do you have a use case for affinity that we're missing? Open an issue in our [GitHub repo](https://github.com/awslabs/karpenter/issues/new/choose) and tell us about it! +Not yet. Karpenter plans to respect `pod.spec.topologySpreadConstraints` by v0.4.0. +### Does Karpenter support node affinity? +Not yet. Karpenter plans to respect `pod.spec.nodeAffinity` by v0.4.0. ### Does Karpenter support custom resource like accelerators or HPC? -Yes. Support for specific custom resources can be implemented by your cloud provider. +Yes. Support for specific custom resources may be implemented by cloud providers. The AWS Cloud Provider supports `nvidia.com/gpu`, `amd.com/gpu`, `aws.amazon.com/neuron`. ### Does Karpenter support daemonsets? -Yes. Provisioners factor in daemonset overhead into all allocation calculations. They also respect daemonset scheduling constraints, such as Nvidia’s GPU Driver Installer. -### Does Karpenter support multiple scheduling defaults? -Provisioners are heterogeneous, which means that the nodes they manage are spread across multiple availability zones, instance types, and capacity types. This flexibility reduces the need for a large number of groups. However, customers may find multiple groups to be useful for more advanced use cases. For example, customers can create multiple groups, and then use the node selector `karpenter.sh/provisioner-name` to target specific groups. This enables advanced use cases like resource isolation and sharding. -### What if my pod is schedulable for multiple Provisioners? -It's possible that unconstrained pods could flexibly schedule in multiple groups. In this case, Provisioners will race to create a scheduling lease for the pod before launching new nodes, which avoids unnecessary scale out. +Yes. Karpenter factors in daemonset overhead into all provisioning calculations. Daemonsets are only included in calculations if their scheduling constraints are applicable to the provisoned node. +### Does Karpenter support multiple Provisioners? +Each Provisioner is capable of defining heterogenous nodes across multiple availability zones, instance types, and capacity types. This flexibility reduces the need for a large number of Provisioners. However, customers may find multiple Provisioners to be useful for more advanced use cases, such as defining multiple sets of provisioning defaults in a single cluster. +### If multiple Provisioners are defined, which will my pod use? +By default, pods will use the rules defined by a Provisioner named `default`. This is analogous to the `default` scheduler. To select an alternative provisioner, use the node selector `karpenter.sh/provisioner-name: alternative-provisioner`. You must either define a default provisioner or explicitly specify `karpenter.sh/provisioner-name` node selector. ## Deprovisioning ### How does Karpenter decide which nodes it can terminate? -A provisioner will only take action on nodes that it manages. This means that a node will only be considered for termination if it is labeled underutilized by the provisioner that manages it. -### How do I know if a node is underutilized? -Nodes are labeled underutilized if they have 0 non-daemonset pods scheduled. We plan to include more use cases in the future. A node needs to be underutilized for a period of time before being considered for termination. +Nodes will only terminate nodes that it manages. Nodes will be considered for termination due to expiry or emptiness (see below). +### When does Karpenter terminate empty nodes? +Nodes are considered empty when they do not have any pods scheduled to them. Daemonsets pods and Failed pods are ignored. Karpenter will send a deletion request to the Kubernetes API, and graceful termination will be handled by termination finalizer. Karpenter will wait for the duration of `ttlSecondsAfterUnderutilized` to terminate an empty node. If `ttlSecondsAfterUnderutilized` is unset, **which it is by default**, Karpenter will not terminate nodes once they are empty. +### When does Karpenter terminate expired nodes? +Nodes are considered expired when the current time exceeds their creation time plus `ttlSecondsUntilExpired`. Karpenter will send a deletion request to the Kubernetes API, and graceful termination will be handled by termination finalizer. If `ttlSecondsUntilExpired` is unset, **which it is by default**, Karpenter will not terminate any nodes due to expiry. ### How does Karpenter terminate nodes? -Karpenter annotates nodes that are underutilized with a time to live (TTL). If the node remains underutilized after the TTL expires, Karpenter then [cordons](https://kubernetes.io/docs/concepts/architecture/nodes/#manual-node-administration) the node and uses the [Kubernetes Eviction API](https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/#eviction-api) to evict all non-daemonset pods. Once the node is empty, the node is terminated. -### Does Karpenter support Pod Disruption Budgets? -Yes. The Kubernetes Eviction API will not delete pods that violate a [Pod Disruption Budget (PDB)](https://kubernetes.io/docs/tasks/run-application/configure-pdb/). It also disallows eviction of any pod covered by multiple PDBs, so most users will want to avoid overlapping selectors. See [this](https://kubernetes.io/docs/concepts/workloads/pods/disruptions/#pod-disruption-budgets) for more. +Karpenter [cordons](https://kubernetes.io/docs/concepts/architecture/nodes/#manual-node-administration) nodes to be terminated and uses the [Kubernetes Eviction API](https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/#eviction-api) to evict all non-daemonset pods. After successful eviction of all non-daemonset pods, the node is terminated. If all the pods cannot be evicted, Karpenter won't forcibly terminate them and keep on trying to evict them. Karpenter respects [Pod Disruption Budgets (PDB)](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) by using the Kubernetes Eviction API. ### Does Karpenter support scale to zero? -Yes. Provisioners start at zero and launch or terminate nodes as necessary. We recommend that customers maintain a small amount of static capacity to bootstrap system controllers or run Karpenter outside of their cluster. +Yes. Karpenter only launches or terminates nodes as necessary based on aggregate pod resource requests. Karpenter will only retain nodes in your cluster as long as there are pods using them.