diff --git a/website/content/en/preview/getting-started/migrating-from-cas/_index.md b/website/content/en/preview/getting-started/migrating-from-cas/_index.md
new file mode 100644
index 000000000000..c13bac95f885
--- /dev/null
+++ b/website/content/en/preview/getting-started/migrating-from-cas/_index.md
@@ -0,0 +1,177 @@
+---
+title: "Migrating from Cluster Autoscaler"
+linkTitle: "Migrating from Cluster Autoscaler"
+weight: 10
+---
+
+This guide will show you how to switch from the [Kubernetes Cluster Autoscaler](https://github.com/kubernetes/autoscaler) to Karpenter for automatic node provisioning.
+We will make the following assumptions in this guide:
+
+* You will use an existing EKS cluster
+* You will use existing VPC and subnets
+* You will use existing security groups
+* Your nodes are part of one or more node groups
+* Your workloads have pod disruption budgets that adhere to [EKS best practices](https://aws.github.io/aws-eks-best-practices/karpenter/)
+* Your cluster has an [OIDC provider](https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html) for service accounts
+
+This guide will also assume you have the `aws` CLI installed.
+You can also perform many of these steps in the console, but we will use the command line for simplicity.
+
+## Create IAM roles
+
+To get started with our migration we first need to create two new IAM roles: one for the nodes provisioned with Karpenter and one for the Karpenter controller.
+
+To create the Karpenter node role we will use the following policy and commands.
+
+{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step01-node-iam.sh" language="bash" %}}
+
+Now attach the required policies to the role.
+
+{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step02-node-policies.sh" language="bash" %}}
+
+Next we need to create an IAM role that the Karpenter controller will use to provision new instances.
+The controller will be using [IAM Roles for Service Accounts (IRSA)](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html), which requires an OIDC endpoint.
+
+If you have another option for using IAM credentials with workloads (e.g. [kube2iam](https://github.com/jtblin/kube2iam)) your steps will be different.
+
+First we need to set the cluster name and look up the cluster endpoint, the OIDC endpoint, and your AWS account ID.
+
+{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step03-env.sh" language="bash" %}}
+
+Use that information to create our IAM role, inline policy, and trust relationship.
+
+{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step04-controller-iam.sh" language="bash" %}}
+
+## Add tags to subnets and security groups
+
+We need to add tags to our node group subnets so Karpenter will know which subnets to use.
+
+{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step05-tag-subnets.sh" language="bash" %}}
+
+Add tags to our security groups.
+This command only tags the security groups for the first node group in the cluster.
+If you have multiple node groups or multiple security groups you will need to decide which ones Karpenter should use.
+
+{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step06-tag-security-groups.sh" language="bash" %}}
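+
+To confirm the tags were applied (assuming `CLUSTER_NAME` is still set from the environment step above), you can filter subnets on the discovery tag:
+
+```bash
+aws ec2 describe-subnets \
+    --filters "Name=tag:karpenter.sh/discovery,Values=${CLUSTER_NAME}" \
+    --query 'Subnets[].SubnetId' --output text
+```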
+
+## Update aws-auth ConfigMap
+
+We need to allow nodes that are using the node IAM role we just created to join the cluster.
+To do that we have to modify the `aws-auth` ConfigMap in the cluster.
+
+{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step07-edit-aws-auth.sh" language="bash" %}}
+
+You will need to add a section to the mapRoles that looks something like this.
+Replace the `${ACCOUNT_NUMBER}` variable with your account number, but do not replace the `{{EC2PrivateDNSName}}`.
+
+```
+- groups:
+  - system:bootstrappers
+  - system:nodes
+  rolearn: arn:aws:iam::${ACCOUNT_NUMBER}:role/KarpenterInstanceNodeRole
+  username: system:node:{{EC2PrivateDNSName}}
+```
+
+The full aws-auth ConfigMap should have two groups: one for your Karpenter node role and one for your existing node group.
+
+## Deploy Karpenter
+
+First set the Karpenter release you want to deploy.
+
+```bash
+export KARPENTER_VERSION={{< param "latest_release_version" >}}
+```
+
+We can now generate a full Karpenter deployment yaml from the helm chart.
+
+{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step08-generate-chart.sh" language="bash" %}}
+
+Modify the following lines in the karpenter.yaml file.
+
+### Set node affinity
+
+Edit the karpenter.yaml file and find the Karpenter deployment's affinity rules.
+Modify the affinity so Karpenter will run on one of the existing node group nodes.
+
+The rules should look something like this.
+Replace the `ng-123456` value with your `${NODEGROUP}`.
+
+```
+affinity:
+  nodeAffinity:
+    requiredDuringSchedulingIgnoredDuringExecution:
+      nodeSelectorTerms:
+      - matchExpressions:
+        - key: karpenter.sh/provisioner-name
+          operator: DoesNotExist
+        - key: eks.amazonaws.com/nodegroup
+          operator: In
+          values:
+          - ng-123456
+```
+
+Now that our deployment is ready we can create the karpenter namespace, create the provisioner CRD, and then deploy the rest of the Karpenter resources.
+
+{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step09-deploy.sh" language="bash" %}}
+
+## Create default provisioner
+
+We need to create a default provisioner so Karpenter knows what types of nodes we want for unscheduled workloads.
+You can refer to some of the [example provisioners](https://github.com/aws/karpenter/tree{{< githubRelRef >}}examples/provisioner) for specific needs.
+
+{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step10-create-provisioner.sh" language="bash" %}}
+
+## Set nodeAffinity for critical workloads (optional)
+
+You may also want to set a nodeAffinity for other critical cluster workloads.
+
+Some examples are:
+
+* coredns
+* metrics-server
+
+You can edit them with `kubectl edit deploy ...` and add node affinity for your static node group instances.
+Modify the `ng-123456` value to match your `${NODEGROUP}`.
+
+```
+affinity:
+  nodeAffinity:
+    requiredDuringSchedulingIgnoredDuringExecution:
+      nodeSelectorTerms:
+      - matchExpressions:
+        - key: eks.amazonaws.com/nodegroup
+          operator: In
+          values:
+          - ng-123456
+```
+
+## Remove CAS
+
+Now that Karpenter is running we can disable the cluster autoscaler.
+To do that we will scale the number of replicas to zero.
+
+{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step11-scale-cas.sh" language="bash" %}}
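+
+If your cluster autoscaler runs as a Deployment named `cluster-autoscaler` in the `kube-system` namespace (adjust the name and namespace to match your install), the scale-down amounts to something like:
+
+```bash
+kubectl scale deploy/cluster-autoscaler -n kube-system --replicas=0
+```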
+
+To get rid of the instances that were added from the node group we can scale our node group down to a minimum size to support Karpenter and other critical services.
+We suggest a minimum of 2 nodes for the node group.
+
+> Note: If your workloads do not have [pod disruption budgets](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) set,
+> the following command **will cause workloads to be unavailable**.
+
+{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step12-scale-ng.sh" language="bash" %}}
+
+If you have a lot of nodes or workloads you may want to slowly step down your node groups by a few instances at a time.
+It is recommended to watch the transition carefully for workloads that may not have enough replicas running or disruption budgets configured.
+
+## Verify Karpenter
+
+As node group nodes are drained you can verify that Karpenter is creating nodes for your workloads.
+
+```bash
+kubectl logs -f -n karpenter -c controller -l app.kubernetes.io/name=karpenter
+```
+
+You should also see new nodes created in your cluster as the old nodes are removed.
+
+```bash
+kubectl get nodes
+```
\ No newline at end of file
diff --git a/website/content/en/preview/getting-started/migrating-from-cas/scripts/step01-node-iam.sh b/website/content/en/preview/getting-started/migrating-from-cas/scripts/step01-node-iam.sh
new file mode 100644
index 000000000000..9946254d957a
--- /dev/null
+++ b/website/content/en/preview/getting-started/migrating-from-cas/scripts/step01-node-iam.sh
@@ -0,0 +1,23 @@
+echo '{
+    "Version": "2012-10-17",
+    "Statement": [
+        {
+            "Effect": "Allow",
+            "Principal": {
+                "Service": "ec2.amazonaws.com"
+            },
+            "Action": "sts:AssumeRole"
+        }
+    ]
+}' > node-trust-policy.json
+
+aws iam create-role --role-name KarpenterInstanceNodeRole \
+    --assume-role-policy-document file://node-trust-policy.json
+
+# The helm chart step below references an instance profile by name
+# (aws.defaultInstanceProfile), so create one and attach the role to it.
+aws iam create-instance-profile \
+    --instance-profile-name KarpenterInstanceNodeRole
+aws iam add-role-to-instance-profile \
+    --instance-profile-name KarpenterInstanceNodeRole \
+    --role-name KarpenterInstanceNodeRole
diff --git a/website/content/en/preview/getting-started/migrating-from-cas/scripts/step02-node-policies.sh b/website/content/en/preview/getting-started/migrating-from-cas/scripts/step02-node-policies.sh
new file mode 100644
index 000000000000..c7cf247b86d0
--- /dev/null
+++ b/website/content/en/preview/getting-started/migrating-from-cas/scripts/step02-node-policies.sh
@@ -0,0 +1,11 @@
+aws iam attach-role-policy --role-name KarpenterInstanceNodeRole \
+    --policy-arn arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
+
+aws iam attach-role-policy --role-name KarpenterInstanceNodeRole \
+    --policy-arn arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
+
+aws iam attach-role-policy --role-name KarpenterInstanceNodeRole \
+    --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly
+
+aws iam attach-role-policy --role-name KarpenterInstanceNodeRole \
+    --policy-arn arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
diff --git a/website/content/en/preview/getting-started/migrating-from-cas/scripts/step03-env.sh b/website/content/en/preview/getting-started/migrating-from-cas/scripts/step03-env.sh
new file mode 100644
index 000000000000..77f2f11c9676
--- /dev/null
+++ b/website/content/en/preview/getting-started/migrating-from-cas/scripts/step03-env.sh
@@ -0,0 +1,14 @@
+# Set this to the name of your existing EKS cluster
+CLUSTER_NAME=
+CLUSTER_ENDPOINT=$(aws eks describe-cluster --name ${CLUSTER_NAME} \
+    --query "cluster.endpoint" --output text)
+OIDC_ENDPOINT=$(aws eks describe-cluster --name ${CLUSTER_NAME} \
+    --query "cluster.identity.oidc.issuer" --output text)
+AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query 'Account' \
+    --output text)
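+
+# The ${OIDC_ENDPOINT#*//} expansion used in the next step strips the
+# URL scheme, e.g. a hypothetical issuer
+#   https://oidc.eks.us-west-2.amazonaws.com/id/EXAMPLE1234
+# becomes
+#   oidc.eks.us-west-2.amazonaws.com/id/EXAMPLE1234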
diff --git a/website/content/en/preview/getting-started/migrating-from-cas/scripts/step04-controller-iam.sh b/website/content/en/preview/getting-started/migrating-from-cas/scripts/step04-controller-iam.sh
new file mode 100644
index 000000000000..1c0e6b49be80
--- /dev/null
+++ b/website/content/en/preview/getting-started/migrating-from-cas/scripts/step04-controller-iam.sh
@@ -0,0 +1,63 @@
+echo "{
+    \"Version\": \"2012-10-17\",
+    \"Statement\": [
+        {
+            \"Effect\": \"Allow\",
+            \"Principal\": {
+                \"Federated\": \"arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_ENDPOINT#*//}\"
+            },
+            \"Action\": \"sts:AssumeRoleWithWebIdentity\",
+            \"Condition\": {
+                \"StringEquals\": {
+                    \"${OIDC_ENDPOINT#*//}:aud\": \"sts.amazonaws.com\",
+                    \"${OIDC_ENDPOINT#*//}:sub\": \"system:serviceaccount:karpenter:karpenter\"
+                }
+            }
+        }
+    ]
+}" > controller-trust-policy.json
+
+aws iam create-role --role-name KarpenterController \
+    --assume-role-policy-document file://controller-trust-policy.json
+
+echo '{
+    "Statement": [
+        {
+            "Action": [
+                "ssm:GetParameter",
+                "iam:PassRole",
+                "ec2:RunInstances",
+                "ec2:DescribeSubnets",
+                "ec2:DescribeSecurityGroups",
+                "ec2:DescribeLaunchTemplates",
+                "ec2:DescribeInstances",
+                "ec2:DescribeInstanceTypes",
+                "ec2:DescribeInstanceTypeOfferings",
+                "ec2:DescribeAvailabilityZones",
+                "ec2:DeleteLaunchTemplate",
+                "ec2:CreateTags",
+                "ec2:CreateLaunchTemplate",
+                "ec2:CreateFleet"
+            ],
+            "Effect": "Allow",
+            "Resource": "*",
+            "Sid": "Karpenter"
+        },
+        {
+            "Action": "ec2:TerminateInstances",
+            "Condition": {
+                "StringLike": {
+                    "ec2:ResourceTag/Name": "*karpenter*"
+                }
+            },
+            "Effect": "Allow",
+            "Resource": "*",
+            "Sid": "ConditionalEC2Termination"
+        }
+    ],
+    "Version": "2012-10-17"
+}' > controller-policy.json
+
+aws iam put-role-policy --role-name KarpenterController \
+    --policy-name KarpenterController \
+    --policy-document file://controller-policy.json
diff --git a/website/content/en/preview/getting-started/migrating-from-cas/scripts/step05-tag-subnets.sh b/website/content/en/preview/getting-started/migrating-from-cas/scripts/step05-tag-subnets.sh
new file mode 100644
index 000000000000..de972ea2bddd
--- /dev/null
+++ b/website/content/en/preview/getting-started/migrating-from-cas/scripts/step05-tag-subnets.sh
@@ -0,0 +1,6 @@
+for NODEGROUP in $(aws eks list-nodegroups --cluster-name ${CLUSTER_NAME} \
+    --query 'nodegroups' --output text); do aws ec2 create-tags \
+        --tags "Key=karpenter.sh/discovery,Value=${CLUSTER_NAME}" \
+        --resources $(aws eks describe-nodegroup --cluster-name ${CLUSTER_NAME} \
+        --nodegroup-name $NODEGROUP --query 'nodegroup.subnets' --output text)
+done
diff --git a/website/content/en/preview/getting-started/migrating-from-cas/scripts/step06-tag-security-groups.sh b/website/content/en/preview/getting-started/migrating-from-cas/scripts/step06-tag-security-groups.sh
new file mode 100644
index 000000000000..7bc204c4a365
--- /dev/null
+++ b/website/content/en/preview/getting-started/migrating-from-cas/scripts/step06-tag-security-groups.sh
@@ -0,0 +1,20 @@
+NODEGROUP=$(aws eks list-nodegroups --cluster-name ${CLUSTER_NAME} \
+    --query 'nodegroups[0]' --output text)
+
+LAUNCH_TEMPLATE=$(aws eks describe-nodegroup --cluster-name ${CLUSTER_NAME} \
+    --nodegroup-name ${NODEGROUP} --query 'nodegroup.launchTemplate.{id:id,version:version}' \
+    --output text | tr -s "\t" ",")
+
+SECURITY_GROUPS=$(aws ec2 describe-launch-template-versions \
+    --launch-template-id ${LAUNCH_TEMPLATE%,*} --versions ${LAUNCH_TEMPLATE#*,} \
+    --query 'LaunchTemplateVersions[0].LaunchTemplateData.SecurityGroupIds' \
+    --output text)
+
+aws ec2 create-tags \
+    --tags "Key=karpenter.sh/discovery,Value=${CLUSTER_NAME}" \
+    --resources ${SECURITY_GROUPS}
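+
+# LAUNCH_TEMPLATE above holds "<id>,<version>" (the tab in the CLI output
+# is squeezed to a comma); ${LAUNCH_TEMPLATE%,*} keeps the id and
+# ${LAUNCH_TEMPLATE#*,} keeps the version, e.g. a hypothetical value
+# "lt-0123456789abcdef0,3" splits into "lt-0123456789abcdef0" and "3".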
diff --git a/website/content/en/preview/getting-started/migrating-from-cas/scripts/step07-edit-aws-auth.sh b/website/content/en/preview/getting-started/migrating-from-cas/scripts/step07-edit-aws-auth.sh
new file mode 100644
index 000000000000..740eae66f345
--- /dev/null
+++ b/website/content/en/preview/getting-started/migrating-from-cas/scripts/step07-edit-aws-auth.sh
@@ -0,0 +1 @@
+kubectl edit configmap aws-auth -n kube-system
diff --git a/website/content/en/preview/getting-started/migrating-from-cas/scripts/step08-generate-chart.sh b/website/content/en/preview/getting-started/migrating-from-cas/scripts/step08-generate-chart.sh
new file mode 100644
index 000000000000..95be5c20a610
--- /dev/null
+++ b/website/content/en/preview/getting-started/migrating-from-cas/scripts/step08-generate-chart.sh
@@ -0,0 +1,7 @@
+helm template --namespace karpenter \
+    karpenter karpenter/karpenter \
+    --set aws.defaultInstanceProfile=KarpenterInstanceNodeRole \
+    --set clusterEndpoint="${CLUSTER_ENDPOINT}" \
+    --set clusterName=${CLUSTER_NAME} \
+    --set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"="arn:aws:iam::${AWS_ACCOUNT_ID}:role/KarpenterController" \
+    --version ${KARPENTER_VERSION} > karpenter.yaml
diff --git a/website/content/en/preview/getting-started/migrating-from-cas/scripts/step09-deploy.sh b/website/content/en/preview/getting-started/migrating-from-cas/scripts/step09-deploy.sh
new file mode 100644
index 000000000000..c62f9f6536a5
--- /dev/null
+++ b/website/content/en/preview/getting-started/migrating-from-cas/scripts/step09-deploy.sh
@@ -0,0 +1,4 @@
+kubectl create namespace karpenter
+kubectl create -f \
+    https://raw.githubusercontent.com/aws/karpenter/${KARPENTER_VERSION}/charts/karpenter/crds/karpenter.sh_provisioners.yaml
+kubectl apply -f karpenter.yaml
diff --git a/website/content/en/preview/getting-started/migrating-from-cas/scripts/step10-create-provisioner.sh b/website/content/en/preview/getting-started/migrating-from-cas/scripts/step10-create-provisioner.sh
new file mode 100644
index 000000000000..344fc8fb7978
--- /dev/null
+++ b/website/content/en/preview/getting-started/migrating-from-cas/scripts/step10-create-provisioner.sh
@@ -0,0 +1,12 @@
+cat <
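+# NOTE: the heredoc above is truncated in this diff. A minimal default
+# Provisioner consistent with the karpenter.sh/discovery tags applied in
+# steps 05 and 06 might look like this (a sketch, not necessarily the
+# original file contents):
+cat <<EOF | kubectl apply -f -
+apiVersion: karpenter.sh/v1alpha5
+kind: Provisioner
+metadata:
+  name: default
+spec:
+  provider:
+    subnetSelector:
+      karpenter.sh/discovery: ${CLUSTER_NAME}
+    securityGroupSelector:
+      karpenter.sh/discovery: ${CLUSTER_NAME}
+  ttlSecondsAfterEmpty: 30
+EOF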