[Bug] Karpenter v0.32.4 does not work when deployed via eksctl #7454
Comments
eksctl/pkg/cfn/builder/karpenter.go Lines 151 to 169 in 9575570
Looks like the controller policy is missing the
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
Hi @yuxiang-zhang, can you please assign this issue to me?
I fixed this issue in my cluster by manually adding these permissions to eksctl-KarpenterControllerPolicy-CLUSTERNAME policy:
These are apparently missing when configuring Karpenter by eksctl.
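The permission list from the comment above did not survive in this thread, so the following sketch uses only the one action named in the error log (`iam:GetInstanceProfile`). It shows how an extra policy document for the Karpenter controller role could be generated. The function name `policy_statement` is hypothetical and the wildcard `Resource` is an assumption for brevity; any further actions would have to come from the Karpenter documentation for the version in use.

```shell
# Hypothetical helper: print an IAM policy document that allows the given actions.
# Resource "*" is an assumption made for brevity; scope it down for real use.
policy_statement() {
  local actions="" a
  for a in "$@"; do
    # Build a comma-separated list of quoted action names.
    actions="${actions:+$actions, }\"$a\""
  done
  printf '{"Version": "2012-10-17", "Statement": [{"Effect": "Allow", "Action": [%s], "Resource": "*"}]}\n' "$actions"
}

# Only the action reported as denied in the Karpenter log is used here.
policy_statement iam:GetInstanceProfile
```

The generated document could then be attached to the controller role, for example as an inline policy via `aws iam put-role-policy` with `--role-name`, `--policy-name`, and `--policy-document` set for your cluster.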
Thanks @pstast for the policies. I have yet to raise a PR for the issue.
This issue was closed because it has been stalled for 5 days with no activity.
I ran into this issue with 0.36.2 as well, and @pstast's recommended fix resolved it for me.
Still present with Karpenter 0.37.0 and eksctl 0.183.0.
Summary:
The Karpenter deployment succeeds, but Karpenter fails to create new nodes.
What were you trying to accomplish?
An EKS cluster was created with eksctl version 0.167.0 from the following manifest:
eksctl create cluster -f ./eks-karpenter.yaml
The cluster creation finishes successfully. See logs below.
Apply the NodePool and EC2NodeClass, then create a deployment that requires a GPU. The pod enters the Pending state. Karpenter is expected to add a GPU node to the cluster.
What happened?
No nodes are added to the cluster.
The Karpenter pods are in the Running state.
The Karpenter pod logs show errors:
[karpenter-84bf6fff97-v5v2k] {"level":"ERROR","time":"2024-01-06T09:15:56.457Z","logger":"controller","message":"Reconciler error","commit":"fdf67d0","controller":"nodeclass","controllerGroup":"karpenter.k8s.aws","controllerKind":"EC2NodeClass","EC2NodeClass":{"name":"default"},"namespace":"","name":"default","reconcileID":"fe2de351-d378-4d82-aff7-556160f4d128","error":"creating instance profile, getting instance profile \"do-eks-yaml-karpenter_4067990795380418201\", AccessDenied: User: arn:aws:sts::<account_id>:assumed-role/eksctl-do-eks-yaml-karpenter-iamservice-role/1704531887056119458 is not authorized to perform: iam:GetInstanceProfile on resource: instance profile do-eks-yaml-karpenter_4067990795380418201 because no identity-based policy allows the iam:GetInstanceProfile action\n\tstatus code: 403, request id: f3a80d84-31cc-44ad-a6a4-91b4d3e56de3"}
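When the controller log is long, the denied IAM actions can be pulled out mechanically. This is a minimal sketch, assuming the log lines carry the same `is not authorized to perform: <action>` wording as the error above; the function name `denied_actions` is hypothetical.

```shell
# Hypothetical helper: read log text on stdin and print each distinct IAM
# action that follows the phrase "is not authorized to perform:".
denied_actions() {
  grep -o 'is not authorized to perform: [A-Za-z0-9-]*:[A-Za-z0-9]*' \
    | sed 's/.*perform: //' \
    | sort -u
}

# Possible usage (pod name is a placeholder):
#   kubectl -n karpenter logs <karpenter-pod> | denied_actions
```

For the error shown in this report, the only action listed would be `iam:GetInstanceProfile`; other missing permissions would show up only once Karpenter attempts the corresponding calls.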
How to reproduce it?
eksctl create cluster -f ./eks-karpenter.yaml
kubectl -n karpenter logs -f $(kubectl -n karpenter get pod | grep karpenter | head -n 1 | cut -d ' ' -f 1)
Logs
Cluster creation log:
Karpenter pod log:
Anything else we need to know?
To build the container image for the deployment, from the cloned project directory, execute the following commands:
cd Container-Root/eks/deployment/horizontal-pod-autoscaler/hpa-example
./build.sh
./push.sh
Older versions of Karpenter (e.g. 0.29.0) used with Provisioner and AWSNodeTemplate work as expected.
In this case the v1alpha5 API is used: https://github.com/aws-samples/aws-do-eks/blob/main/Container-Root/eks/deployment/karpenter/provisioner-deploy-v1alpha5.sh
Karpenter works as expected when the cluster is created without Karpenter and Karpenter v0.32.4 is then deployed by following the instructions here: https://karpenter.sh/v0.32/getting-started/getting-started-with-karpenter/#4-install-karpenter
It appears that eksctl lacks support for the Karpenter versions that use the v1beta1 API.
Versions