Karpenter cannot provision node with same instance type previously used by cluster-autoscaler #5676
Comments
Could you provide logs, your pod spec, and your EC2NodeClass? Using the provided NodePool and a pod with the same resource requests (the original 28, not 26) Karpenter successfully provisions a …
Hello @jmdeal,
Logs:
Pod specifications:
EC2NodeClass:
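(The attached logs, pod spec, and EC2NodeClass are collapsed above and not reproduced here. Purely as a reference point, a minimal pod of the shape discussed later in this thread might look like the sketch below; all names are hypothetical, and only the ~28Gi memory request is taken from the discussion.)

```yaml
# Illustrative only: resource names are hypothetical; the 28Gi request is the
# figure referenced later in this thread, not the reporter's actual manifest.
apiVersion: v1
kind: Pod
metadata:
  name: example-workload
spec:
  containers:
    - name: app
      image: public.ecr.aws/docker/library/busybox:latest
      command: ["sleep", "infinity"]
      resources:
        requests:
          memory: "28Gi"
        limits:
          memory: "28Gi"
```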
From doing the conversion and looking at Karpenter's defaults, it looks like DaemonSet requests are definitely pushing y'all over the limit here for what Karpenter thinks the instance type can provide. Just doing the conversion …
Were the logs that you printed above from your setup with …
Just from trying to repro this scenario, when I dropped the …

From looking over your configuration, you need to set …
Also, if you are trying to pass a "0" value through to …
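(For readers following along: the setting discussed throughout this thread, vmMemoryOverheadPercent, is exposed as a global Helm value in the v0.32.x chart, roughly as sketched below. The key name should be verified against the chart version in use; the value shown is only an example.)

```yaml
# Helm values sketch for the karpenter chart (v0.32.x). The 0.065 value is an
# arbitrary example; the default discussed later in this thread is 0.075.
# Verify the exact key against the chart version you run.
settings:
  vmMemoryOverheadPercent: 0.065
```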
Hello @jonathan-innis,
Ideally the worker node should be provisioned by the master NodePool (instance-family: m6a, instance-cpu: "8").
Sorry, did you update the …
Hello @jonathan-innis, …
Hello @jonathan-innis,
Here is the global configuration:
Interestingly the moment I first changed the …
Hello @jonathan-innis & @jmdeal,

This leaves us with the remaining question of why Karpenter (with the default settings, without the modification of the memory overhead parameter) considers the …

Additionally, is there any detailed documentation about the memory overhead and its usage?
The difference between the …
@jonathan-innis & @jmdeal, …
It is the percentage value that we lop off the top of the memory value that's presented from EC2 DescribeInstanceTypes. The issue here is that the capacity value presented by the kubelet is different from the memory presented by the DescribeInstanceTypes call. This is because some memory is taken away and dedicated to the OS, and some other memory is utilized by the hypervisor. The percentage amount taken away is unfortunately not a smooth curve across instance types, and I really think that the proper way to address this problem is just to generate data around it. If you take a look here: https://github.com/aws/karpenter-provider-aws/blob/33450d8f82ded870ce65fbde3cec14dbb2c04f50/pkg/providers/instancetype/zz_generated.memory_overhead.go you can see that I took a first pass at getting the overhead details by launching nodes with these instance types and then checking the difference between the DescribeInstanceTypes-reported value and the actual node-reported value.
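(To make the arithmetic behind that explanation concrete, here is a rough sketch of the estimate for m6a.2xlarge with the default overhead; the intermediate numbers are illustrative, not measured values.)

```yaml
# Illustrative numbers only (default vmMemoryOverheadPercent assumed to be 0.075):
#   EC2 DescribeInstanceTypes memory for m6a.2xlarge:   32768 MiB
#   minus 7.5% overhead estimate:                      - 2458 MiB  ->  ~30310 MiB estimated capacity
#   minus kube-reserved / system-reserved / eviction thresholds  ->  estimated allocatable
# The estimated allocatable must cover the pod's requests plus DaemonSet requests;
# if it falls short, the instance type is rejected with
# "no instance type which had enough resources and the required offering met the scheduling requirements".
```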
Hello @jonathan-innis, |
Description
Observed Behavior:
After the migration from cluster-autoscaler to Karpenter, Karpenter cannot provision a worker node with the exact same instance type (`m6a.2xlarge`) cluster-autoscaler was configured for, due to `no instance type which had enough resources and the required offering met the scheduling requirements`.

Scenario 1)
When removing the `instance-family` requirement (`m6a`) from the NodePool configuration and setting the `instance-memory` requirement (`32768`), Karpenter manages to provision a node from the `d` & `g` instance categories with the instance type `d3en.2xlarge` (Karpenter also considers `g4ad.2xlarge`, `g4dn.2xlarge` based on the logs), which has the exact same amount of vCPU & memory the `m6a.2xlarge` instance type has.

Scenario 2)
When removing the `instance-family` requirement (`m6a`) from the NodePool configuration and keeping the `instance-cpu` requirement (`8`), Karpenter manages to provision a node from the `r` instance category with the instance type `r6a.2xlarge`, which has the exact same amount of vCPU but double the memory (`64GB`) of the `m6a.2xlarge` instance type.

In case there is a worker node with instance type `m6a.2xlarge` already provisioned & available in the cluster, the default-scheduler manages to allocate the workload to this worker node; this also confirms that the `m6a.2xlarge` instance has the capacity to allocate our workload.

GitHub issue #1306 mentions similar behaviour; the same setup was tested with `vmMemoryOverheadPercent` set to `0.01` (from the `0.075` default) with no improvements.

Temporary workarounds:
- Lower the memory request by `2Gi` (to `26Gi`), wait for Karpenter to provision a worker node with the `m6a.2xlarge` instance type, then revert the memory request change
- Extend the `karpenter.k8s.aws/instance-cpu` requirement list with `"16"` to give Karpenter the possibility to (over)provision a worker node with the `m6a.4xlarge` instance type (see the sketch below)
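The second workaround expressed as a NodePool requirement entry; only the key and the added `"16"` value come from this issue, the surrounding structure is an assumed sketch:

```yaml
- key: karpenter.k8s.aws/instance-cpu
  operator: In
  values: ["8", "16"]   # adding "16" lets Karpenter (over)provision m6a.4xlarge
```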
Expected Behavior:
Karpenter should be able to use the same instance type that was allocated to our workload when cluster-autoscaler was used. As the default scheduler of Kubernetes can allocate the workload to a worker node with the `m6a.2xlarge` instance type, Karpenter should not block the provisioning of the worker node.

Reproduction Steps (Please include YAML):
"16"
m6a.4xlarge
instance type instead ofm6a.2xlarge
where the workload could previously fit when cluster-autoscaler was in useNodePool configuration:
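(The original NodePool manifest is not reproduced here. A minimal sketch of the shape described in this issue, using the Karpenter `v1beta1` API for v0.32.x and the `instance-family`/`instance-cpu` requirements named above; every other field is an assumption.)

```yaml
# Reconstructed for illustration only, not the reporter's actual manifest.
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: master               # "master" NodePool mentioned earlier in the thread
spec:
  template:
    spec:
      nodeClassRef:
        name: default        # assumed EC2NodeClass name
      requirements:
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values: ["m6a"]
        - key: karpenter.k8s.aws/instance-cpu
          operator: In
          values: ["8"]
```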
Versions:
- Karpenter (chart) version: `v0.32.1`
- Kubernetes version (`kubectl version`): `EKS 1.25`
Further information:
- Region: `eu-west-1`
- Node OS: `BottleRocket OS` (also tested AL2, no improvements)
- Upgrading from `v0.32.1` to `v0.32.7` was tested, same behaviour was observed
- A node with the `m6a.2xlarge` instance type can be successfully provisioned manually

If we summarise the findings, it is clearly visible that Karpenter considers some instance types unable to allocate our workload while others can, despite the fact that they have the same vCPU & memory specifications.
Can you please help us understand Karpenter's logic behind the instance choice?