diff --git a/content/karpenter/010_prerequisites/attach_workspaceiam.md b/content/karpenter/010_prerequisites/attach_workspaceiam.md deleted file mode 100644 index a206d197..00000000 --- a/content/karpenter/010_prerequisites/attach_workspaceiam.md +++ /dev/null @@ -1,15 +0,0 @@ ---- -title: "Attach the IAM role to your Workspace" -chapter: false -weight: 50 ---- - -{{% notice note %}} -**Select the tab** and follow the specific instructions depending on whether you are… -{{% /notice %}} - - -{{< tabs name="Region" >}} - {{< tab name="...ON YOUR OWN" include="on_your_own_updateiam.md" />}} - {{< tab name="...AT AN AWS EVENT" include="at_an_aws_updateiam.md" />}} -{{< /tabs >}} \ No newline at end of file diff --git a/content/karpenter/010_prerequisites/aws_event.md b/content/karpenter/010_prerequisites/aws_event.md index 94cdbca0..f342f489 100644 --- a/content/karpenter/010_prerequisites/aws_event.md +++ b/content/karpenter/010_prerequisites/aws_event.md @@ -24,36 +24,20 @@ If you are at an AWS event, an AWS account was created for you to use throughout You are now logged in to the AWS console in an account that was created for you, and will be available only throughout the workshop run time. {{% notice info %}} -In the interest of time for shorter events we sometimes deploy the resources required as a prerequisite for you. If you were told so, please review the cloudformation outputs of the stack that was deployed by **expanding the instructions below**. +In the interest of time we have deployed everything required to run Karpenter for this workshop. All the prerequisites and dependencies have been deployed. The resources deployed can be found in this CloudFormation template (**[eks-spot-workshop-quickstart-cnf.yml](https://raw.githubusercontent.com/awslabs/ec2-spot-workshops/master/content/using_ec2_spot_instances_with_eks/010_prerequisites/prerequisites.files/eks-spot-workshop-quickstart-cnf.yml)**). The template deploys resources such as (a) an [AWS Cloud9](https://console.aws.amazon.com/cloud9) workspace with all the dependencies and IAM privileges to run the workshop, (b) an EKS cluster with the name `eksworkshop-eksctl` and (c) an [EKS managed node group](https://docs.aws.amazon.com/eks/latest/userguide/managed-node-groups.html) with 2 on-demand instances. {{% /notice %}} -{{%expand "Click to reveal detailed instructions" %}} +#### Getting access to Cloud9 -#### What resources are already deployed {#resources_deployed} - -We have deployed the below resources required to get started with the workshop using a CloudFormation Template (**[eks-spot-workshop-quickstarter-cnf.yml](https://raw.githubusercontent.com/awslabs/ec2-spot-workshops/master/content/using_ec2_spot_instances_with_eks/010_prerequisites/prerequisites.files/eks-spot-workshop-quickstart-cnf.yml)**), Please reference the below resources created by the stack. - -+ An [AWS Cloud9](https://console.aws.amazon.com/cloud9) workspace with - - An IAM role created and attached to the workspace with Administrator access - - Kubernetes tools installed (kubectl, jq and envsubst) - - awscli upgraded to v2 - - Created and imported a key pair to Amazon EC2 - - [eksctl](https://eksctl.io/) installed, The official CLI for Amazon EKS - -+ An EKS cluster with the name `eksworkshop-eksctl` and a [EKS managed node group](https://docs.aws.amazon.com/eks/latest/userguide/managed-node-groups.html) with 2 on-demand instances. 
- - -#### Use your resources - -In this workshop, you'll need to reference the resources created by the CloudFormation stack that we setup for you. +In this workshop, you'll need to reference the resources created by the CloudFormation stack. 1. On the [AWS CloudFormation console](https://console.aws.amazon.com/cloudformation), select the stack name that starts with **mod-** in the list. -1. In the stack details pane, click the **Outputs** tab. +2. In the stack details pane, click the **Outputs** tab. ![cnf_output](/images/karpenter/prerequisites/cnf_output.png) -It is recommended that you keep this window open so you can easily refer to the outputs and resources throughout the workshop. +It is recommended that you keep this tab / window open so you can easily refer to the outputs and resources throughout the workshop. {{% notice info %}} you will notice additional Cloudformation stacks were also deployed which is the result of the stack that starts with **mod-**. One to deploy the Cloud9 Workspace and two other to create the EKS cluster and managed nodegroup. @@ -78,9 +62,7 @@ aws sts get-caller-identity {{% insert-md-from-file file="karpenter/010_prerequisites/at_an_aws_validaterole.md" %}} -Since we have already setup the prerequisites, **you can head straight to [Test the Cluster]({{< relref "/karpenter/020_eksctl/test.md" >}})** - -{{% /expand%}} +You are now ready to **[Test the Cluster]({{< relref "/karpenter/test.md" >}})** diff --git a/content/karpenter/010_prerequisites/awscli.md b/content/karpenter/010_prerequisites/awscli.md deleted file mode 100644 index 3aeb060e..00000000 --- a/content/karpenter/010_prerequisites/awscli.md +++ /dev/null @@ -1,28 +0,0 @@ ---- -title: "Update to the latest AWS CLI" -chapter: false -weight: 45 -comment: default install now includes aws-cli/1.15.83 ---- - -{{% notice tip %}} -For this workshop, please ignore warnings about the version of pip being used. -{{% /notice %}} - -1. Run the following command to view the current version of aws-cli: -``` -aws --version -``` - -1. Update to the latest version: -``` -curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" -unzip awscliv2.zip -sudo ./aws/install -. ~/.bash_profile -``` - -1. Confirm you have a newer version: -``` -aws --version -``` diff --git a/content/karpenter/010_prerequisites/k8stools.md b/content/karpenter/010_prerequisites/k8stools.md deleted file mode 100644 index 2721ed50..00000000 --- a/content/karpenter/010_prerequisites/k8stools.md +++ /dev/null @@ -1,58 +0,0 @@ ---- -title: "Install Kubernetes Tools" -chapter: false -weight: 40 ---- - -Amazon EKS clusters require kubectl and kubelet binaries and the aws-cli or aws-iam-authenticator -binary to allow IAM authentication for your Kubernetes cluster. - -{{% notice tip %}} -In this workshop we will give you the commands to download the Linux -binaries. If you are running Mac OSX / Windows, please [see the official EKS docs -for the download links.](https://docs.aws.amazon.com/eks/latest/userguide/getting-started.html) -{{% /notice %}} - -#### Install kubectl - -``` -export KUBECTL_VERSION=v1.21.2 -sudo curl --silent --location -o /usr/local/bin/kubectl https://storage.googleapis.com/kubernetes-release/release/${KUBECTL_VERSION}/bin/linux/amd64/kubectl -sudo chmod +x /usr/local/bin/kubectl -``` - -#### Enable Kubectl bash_completion - -``` -kubectl completion bash >> ~/.bash_completion -. /etc/profile.d/bash_completion.sh -. 
~/.bash_completion -``` - -#### Set the AWS Load Balancer Controller version - -``` -echo 'export LBC_VERSION="v2.3.0"' >> ~/.bash_profile -. ~/.bash_profile -``` - -#### Install JQ and envsubst -``` -sudo yum -y install jq gettext bash-completion moreutils -``` - -#### Installing YQ for Yaml processing - -``` -echo 'yq() { - docker run --rm -i -v "${PWD}":/workdir mikefarah/yq "$@" -}' | tee -a ~/.bashrc && source ~/.bashrc -``` - -#### Verify the binaries are in the path and executable -``` -for command in kubectl jq envsubst - do - which $command &>/dev/null && echo "$command in path" || echo "$command NOT FOUND" - done -``` \ No newline at end of file diff --git a/content/karpenter/010_prerequisites/prerequisites.files/eks-spot-workshop-quickstart-cnf.yml b/content/karpenter/010_prerequisites/prerequisites.files/eks-spot-workshop-quickstart-cnf.yml index 2f413105..024f3b35 100644 --- a/content/karpenter/010_prerequisites/prerequisites.files/eks-spot-workshop-quickstart-cnf.yml +++ b/content/karpenter/010_prerequisites/prerequisites.files/eks-spot-workshop-quickstart-cnf.yml @@ -3,7 +3,7 @@ AWSTemplateFormatVersion: '2010-09-09' Description: AWS CloudFormation template to create a Cloud9 environment setup with kubectl, eksctl and an EKS cluster with a managed node group. Please allow ~20min for the EKS cluster to be ready. Metadata: Author: - Description: Sandeep Palavalasa + Description: Carlos Rueda License: Description: 'Copyright 2020 Amazon.com, Inc. and its affiliates. All Rights Reserved. @@ -20,29 +20,30 @@ Parameters: C9InstanceType: Description: Example Cloud9 instance type Type: String - Default: t2.micro + Default: t3.micro AllowedValues: - - t2.micro + - t3.micro + - m5.large ConstraintDescription: Must be a valid Cloud9 instance type C9KubectlVersion: Description: Cloud9 instance kubectl version Type: String - Default: v1.21.2 + Default: v1.23.7 ConstraintDescription: Must be a valid kubectl version C9KubectlVersionTEST: Description: Cloud9 instance kubectl version Type: String - Default: v1.21.2 + Default: v1.23.7 ConstraintDescription: Must be a valid kubectl version C9EKSctlVersion: Description: Cloud9 instance eksctl version Type: String - Default: v0.68.0 + Default: v0.110.0 ConstraintDescription: Must be a valid eksctl version EKSClusterVersion: Description: EKS Cluster Version Type: String - Default: 1.21 + Default: 1.23 ConstraintDescription: Must be a valid eks version EKSClusterName: Description: EKS Cluster Name @@ -160,7 +161,7 @@ Resources: Fn::GetAtt: - C9LambdaExecutionRole - Arn - Runtime: python3.6 + Runtime: python3.9 MemorySize: 256 Timeout: '600' Code: @@ -281,6 +282,7 @@ Resources: - !Sub sed -i.bak -e 's/--AZB--/${AWS::Region}b/' /home/ec2-user/environment/eksworkshop.yaml - !Sub sed -i.bak -e 's/--EKS_VERSION--/"'"${EKSClusterVersion}"'"/' /home/ec2-user/environment/eksworkshop.yaml - sudo -H -u ec2-user /usr/local/bin/eksctl create cluster -f /home/ec2-user/environment/eksworkshop.yaml + - sudo -H -u ec2-user /usr/local/bin/aws eks update-kubeconfig --name eksworkshop-eksctl - sudo -H -u ec2-user /usr/local/bin/kubectl get nodes C9BootstrapAssociation: diff --git a/content/karpenter/010_prerequisites/self_paced.md b/content/karpenter/010_prerequisites/self_paced.md index e5d5ce68..bb1194b3 100644 --- a/content/karpenter/010_prerequisites/self_paced.md +++ b/content/karpenter/010_prerequisites/self_paced.md @@ -8,7 +8,10 @@ weight: 10 Only complete this section if you are running the workshop on your own. 
If you are at an AWS hosted event (such as re:Invent, Kubecon, Immersion Day, etc), go to [Start the workshop at an AWS event]({{< ref "/karpenter/010_prerequisites/aws_event.md" >}}). {{% /notice %}} -### Running the workshop on your own +## Running the workshop on your own + + +### Creating an account to run the workshop {{% notice warning %}} Your account must have the ability to create new IAM roles and scope other IAM permissions. @@ -33,5 +36,53 @@ as an IAM user with administrator access to the AWS account: 1. Take note of the login URL and save: ![Login URL](/images/karpenter/prerequisites/iam-4-save-url.png) +### Deploying CloudFormation + +In the interest of time and to focus just on Karpenter, we will install everything required to run this Karpenter workshop using CloudFormation. + +1. Download this CloudFormation template locally into a file (**[eks-spot-workshop-quickstart-cnf.yml](https://raw.githubusercontent.com/awslabs/ec2-spot-workshops/master/content/using_ec2_spot_instances_with_eks/010_prerequisites/prerequisites.files/eks-spot-workshop-quickstart-cnf.yml)**). + +1. Go into the CloudFormation console and select the creation of a new stack. Select **Template is ready**, then **Upload a template file**, select the file that you downloaded to your computer and click **Next** + +1. Fill in the **Stack Name** using 'karpenter-workshop', leave all the settings in the parameters section with the default parameters and click **Next** + +1. In **Configure stack options** just scroll to the bottom of the page and click **Next** + +1. Finally, in the **Review karpenter-workshop** page, go to the bottom of the page, tick the acknowledgement in the `Capabilities` section (*I acknowledge that AWS CloudFormation might create IAM resources.*) and then click **Create stack** + +{{% notice warning %}} +The deployment of this stack may take up to 20 minutes. You should wait until all the resources in the CloudFormation stack have been created before you start the rest of the workshop. The template deploys resources such as (a) an [AWS Cloud9](https://console.aws.amazon.com/cloud9) workspace with all the dependencies and IAM privileges to run the workshop, (b) an EKS cluster with the name `eksworkshop-eksctl` and (c) an [EKS managed node group](https://docs.aws.amazon.com/eks/latest/userguide/managed-node-groups.html) with 2 on-demand instances. +{{% /notice %}} + +### Checking the completion of the stack deployment + +One way to check that your stack has been fully deployed is to verify that all the CloudFormation resources are green and marked as succeeded in the CloudFormation dashboard; this should look similar to the state below. + +![cnf_output](/images/karpenter/prerequisites/cfn_stak_completion.png) + +#### Getting access to Cloud9 + +In this workshop, you'll need to reference the resources created by the CloudFormation stack. + +1. On the [AWS CloudFormation console](https://console.aws.amazon.com/cloudformation), select the stack name that starts with **mod-** in the list. + +2. In the stack details pane, click the **Outputs** tab. + +![cnf_output](/images/karpenter/prerequisites/cnf_output.png) + +It is recommended that you keep this tab / window open so you can easily refer to the outputs and resources throughout the workshop. + +{{% notice info %}} +You will notice that additional CloudFormation stacks were also deployed as a result of the stack that starts with **mod-**: one to deploy the Cloud9 workspace and two others to create the EKS cluster and managed nodegroup. 
+{{% /notice %}} + +#### Launch your Cloud9 workspace + +- Click on the url against `Cloud9IDE` from the outputs + +{{% insert-md-from-file file="karpenter/010_prerequisites/workspace_at_launch.md" %}} + +{{% insert-md-from-file file="karpenter/010_prerequisites/update_workspace_settings.md" %}} + -Once you have completed the step above, **you can head straight to [Create a Workspace]({{< ref "/karpenter/010_prerequisites/workspace.md" >}})** \ No newline at end of file +You are now ready to **[Test the Cluster]({{< relref "/karpenter/test.md" >}})** \ No newline at end of file diff --git a/content/karpenter/010_prerequisites/sshkey.md b/content/karpenter/010_prerequisites/sshkey.md deleted file mode 100644 index 2878c344..00000000 --- a/content/karpenter/010_prerequisites/sshkey.md +++ /dev/null @@ -1,25 +0,0 @@ ---- -title: "Create an SSH key" -chapter: false -weight: 80 ---- - -{{% notice info %}} -Starting from here, when you see command to be entered such as below, you will enter these commands into Cloud9 IDE. You can use the **Copy to clipboard** feature (right hand upper corner) to simply copy and paste into Cloud9. In order to paste, you can use Ctrl + V for Windows or Command + V for Mac. -{{% /notice %}} - -Please run this command to generate SSH Key in Cloud9. This key will be used on the worker node instances to allow ssh access if necessary. - -``` -ssh-keygen -``` - -{{% notice tip %}} -Press `enter` 3 times to take the default choices -{{% /notice %}} - -Upload the public key to your EC2 region: - -``` -aws ec2 import-key-pair --key-name "eksworkshop" --public-key-material fileb://~/.ssh/id_rsa.pub -``` diff --git a/content/karpenter/010_prerequisites/update_workspaceiam.md b/content/karpenter/010_prerequisites/update_workspaceiam.md deleted file mode 100644 index 2ff11a27..00000000 --- a/content/karpenter/010_prerequisites/update_workspaceiam.md +++ /dev/null @@ -1,52 +0,0 @@ ---- -title: "Update IAM settings for your Workspace" -chapter: false -weight: 60 ---- - -{{% notice info %}} -**Note**: Cloud9 normally manages IAM credentials dynamically. This isn't currently compatible with the EKS IAM authentication, so we will disable it and rely on the IAM role instead. -{{% /notice %}} - -- Return to your workspace and click the sprocket, or launch a new tab to open the Preferences tab -- Select **AWS SETTINGS** -- Turn off **AWS managed temporary credentials** -- Close the Preferences tab -![c9disableiam](/images/karpenter/prerequisites/c9disableiam.png) - -To ensure temporary credentials aren't already in place we will also remove -any existing credentials file: -``` -rm -vf ${HOME}/.aws/credentials -``` - -We should configure our aws cli with our current region as default: -``` -export ACCOUNT_ID=$(aws sts get-caller-identity --output text --query Account) -export AWS_REGION=$(curl -s 169.254.169.254/latest/dynamic/instance-identity/document | jq -r '.region') - -echo "export ACCOUNT_ID=${ACCOUNT_ID}" >> ~/.bash_profile -echo "export AWS_REGION=${AWS_REGION}" >> ~/.bash_profile -aws configure set default.region ${AWS_REGION} -aws configure get default.region -``` - -### Validate the IAM role {#validate_iam} - -Use the [GetCallerIdentity](https://docs.aws.amazon.com/cli/latest/reference/sts/get-caller-identity.html) CLI command to validate that the Cloud9 IDE is using the correct IAM role. 
- -``` -aws sts get-caller-identity - -``` - -{{% notice note %}} -**Select the tab** and validate the assumed role… -{{% /notice %}} - -{{< tabs name="Region" >}} - {{< tab name="...AT AN AWS EVENT" include="at_an_aws_validaterole.md" />}} - {{< tab name="...ON YOUR OWN" include="on_your_own_validaterole.md" />}} - -{{< /tabs >}} - diff --git a/content/karpenter/010_prerequisites/workspace.md b/content/karpenter/010_prerequisites/workspace.md deleted file mode 100644 index 8c64c21a..00000000 --- a/content/karpenter/010_prerequisites/workspace.md +++ /dev/null @@ -1,43 +0,0 @@ ---- -title: "Create a Workspace" -chapter: false -weight: 30 ---- - -{{% notice warning %}} -If you are running the workshop on your own, the Cloud9 workspace should be built by an IAM user with Administrator privileges, not the root account user. Please ensure you are logged in as an IAM user, not the root -account user. -{{% /notice %}} - -{{% notice info %}} -If you are at an AWS hosted event (such as re:Invent, Kubecon, Immersion Day, or any other event hosted by -an AWS employee) follow the instructions on the region that should be used to launch resources -{{% /notice %}} - -{{% notice tip %}} -Ad blockers, javascript disablers, and tracking blockers should be disabled for -the cloud9 domain, or connecting to the workspace might be impacted. -Cloud9 requires third-party-cookies. You can whitelist the [specific domains]( https://docs.aws.amazon.com/cloud9/latest/user-guide/troubleshooting.html#troubleshooting-env-loading). -{{% /notice %}} - -### Launch Cloud9 in your closest region: - -{{< tabs name="Region" >}} - {{< tab name="N. Virginia" include="us-east-1.md" />}} - {{< tab name="Oregon" include="us-west-2.md" />}} - {{< tab name="Ireland" include="eu-west-1.md" />}} - {{< tab name="Ohio" include="us-east-2.md" />}} - {{< tab name="Singapore" include="ap-southeast-1.md" />}} -{{< /tabs >}} - -- Select **Create environment** -- Name it **eksworkshop**, and take all other defaults -- When it comes up, customize the environment by closing the **welcome tab** -and **lower work area**, and opening a new **terminal** tab in the main work area: -![c9before](/images/using_ec2_spot_instances_with_eks/prerequisites/c9before.png) - -- Your workspace should now look like this: -![c9after](/images/using_ec2_spot_instances_with_eks/prerequisites/c9after.png) - -- If you like this theme, you can choose it yourself by selecting **View / Themes / Solarized / Solarized Dark** -in the Cloud9 workspace menu. diff --git a/content/karpenter/020_eksctl/_index.md b/content/karpenter/020_eksctl/_index.md deleted file mode 100644 index 2b8cac9f..00000000 --- a/content/karpenter/020_eksctl/_index.md +++ /dev/null @@ -1,13 +0,0 @@ ---- -title: "Launch using eksctl" -chapter: true -weight: 20 ---- - -# Launch using [eksctl](https://eksctl.io/) - -[eksctl](https://eksctl.io) is the official CLI for Amazon EKS. It is written in Go, and uses CloudFormation. Eksctl is a tool jointly developed by AWS and [Weaveworks](https://weave.works) that automates much of the experience of creating EKS clusters. - -In this module, we will use eksctl to launch and configure our EKS cluster and nodes. 
- -{{< youtube jGrdVSlIkNQ >}} diff --git a/content/karpenter/020_eksctl/create_eks_cluster_eksctl_command.md b/content/karpenter/020_eksctl/create_eks_cluster_eksctl_command.md deleted file mode 100644 index a3576d43..00000000 --- a/content/karpenter/020_eksctl/create_eks_cluster_eksctl_command.md +++ /dev/null @@ -1,58 +0,0 @@ ---- -title: "Create EKS cluster Command" -chapter: false -disableToc: true -hidden: true ---- - -Create an eksctl deployment file (eksworkshop.yaml) to create an EKS cluster: - - -``` -cat << EOF > eksworkshop.yaml ---- -apiVersion: eksctl.io/v1alpha5 -kind: ClusterConfig - -metadata: - name: eksworkshop-eksctl - region: ${AWS_REGION} - version: "1.21" - tags: - karpenter.sh/discovery: ${CLUSTER_NAME} -iam: - withOIDC: true -managedNodeGroups: -- amiFamily: AmazonLinux2 - instanceType: m5.large - name: mng-od-m5large - desiredCapacity: 2 - maxSize: 3 - minSize: 0 - labels: - alpha.eksctl.io/cluster-name: ${CLUSTER_NAME} - alpha.eksctl.io/nodegroup-name: mng-od-m5large - intent: control-apps - tags: - alpha.eksctl.io/nodegroup-name: mng-od-m5large - alpha.eksctl.io/nodegroup-type: managed - k8s.io/cluster-autoscaler/node-template/label/intent: control-apps - iam: - withAddonPolicies: - autoScaler: true - cloudWatch: true - albIngress: true - privateNetworking: true - -EOF -``` - -Next, use the file you created as the input for the eksctl cluster creation. - -``` -eksctl create cluster -f eksworkshop.yaml -``` - -{{% notice note %}} -Launching EKS and all the dependencies will take approximately 15 minutes -{{% /notice %}} \ No newline at end of file diff --git a/content/karpenter/020_eksctl/launcheks.md b/content/karpenter/020_eksctl/launcheks.md deleted file mode 100644 index 0c387cd3..00000000 --- a/content/karpenter/020_eksctl/launcheks.md +++ /dev/null @@ -1,95 +0,0 @@ ---- -title: "Launch EKS" -date: 2018-08-07T13:34:24-07:00 -weight: 20 ---- - - -{{% notice warning %}} -**DO NOT PROCEED** with this step unless you have [validated the IAM role]({{< relref "../010_prerequisites/update_workspaceiam.md#validate_iam" >}}) in use by the Cloud9 IDE. You will not be able to run the necessary kubectl commands in the later modules unless the EKS cluster is built using the IAM role. -{{% /notice %}} - -#### Challenge: -**How do I check the IAM role on the workspace?** - -{{%expand "Expand here to see the solution" %}} - -### Validate the IAM role {#validate_iam} - -Use the [GetCallerIdentity](https://docs.aws.amazon.com/cli/latest/reference/sts/get-caller-identity.html) CLI command to validate that the Cloud9 IDE is using the correct IAM role. - -``` -aws sts get-caller-identity - -``` - -You can verify what the output an correct role shoulld be in the **[validate the IAM role section]({{< relref "../010_prerequisites/update_workspaceiam.md" >}})**. If you do see the correct role, proceed to next step to create an EKS cluster. 
-{{% /expand %}} - - -### Create an EKS cluster - -Create an eksctl deployment file (eksworkshop.yaml) to create an EKS cluster: - - -``` -cat << EOF > eksworkshop.yaml ---- -apiVersion: eksctl.io/v1alpha5 -kind: ClusterConfig - -metadata: - name: eksworkshop-eksctl - region: ${AWS_REGION} - version: "1.21" - tags: - karpenter.sh/discovery: eksworkshop-eksctl -iam: - withOIDC: true -managedNodeGroups: -- amiFamily: AmazonLinux2 - instanceType: m5.large - name: mng-od-m5large - desiredCapacity: 2 - maxSize: 3 - minSize: 0 - labels: - alpha.eksctl.io/cluster-name: eksworkshop-eksctl - alpha.eksctl.io/nodegroup-name: mng-od-m5large - intent: control-apps - tags: - alpha.eksctl.io/nodegroup-name: mng-od-m5large - alpha.eksctl.io/nodegroup-type: managed - k8s.io/cluster-autoscaler/node-template/label/intent: control-apps - iam: - withAddonPolicies: - autoScaler: true - cloudWatch: true - albIngress: true - privateNetworking: true - -EOF -``` - -Next, use the file you created as the input for the eksctl cluster creation. - -``` -eksctl create cluster -f eksworkshop.yaml -``` - -{{% notice info %}} -Launching EKS and all the dependencies will take approximately 15 minutes -{{% /notice %}} - -`eksctl create cluster` command allows you to create the cluster and managed nodegroups in sequence. There are a few things to note in the configuration that we just used to create the cluster and a managed nodegroup. - - * Resources created by `eksctl` have the tag `karpenter.sh/discovery` with the cluster name as the value. We'll need this later. - * Nodegroup configurations are set under the **managedNodeGroups** section, this indicates that the node group is managed by EKS. - * Nodegroup instance type is **m5.large** with **minSize** to 0, **maxSize** to 3 and **desiredCapacity** to 2. This nodegroup has capacity type set to On-Demand Instances by default. - - * Notice that the we add 3 node labels: - * **alpha.eksctl.io/cluster-name**, to indicate the nodes belong to **eksworkshop-eksctl** cluster. - * **alpha.eksctl.io/nodegroup-name**, to indicate the nodes belong to **mng-od-m5large** nodegroup. - * **intent**, to allow you to deploy control applications on nodes that have been labeled with value **control-apps** - - * Amazon EKS adds an additional Kubernetes label **eks.amazonaws.com/capacityType: ON_DEMAND**, to all On-Demand Instances in your managed node group. You can use this label to schedule stateful applications on On-Demand nodes. 
\ No newline at end of file diff --git a/content/karpenter/020_eksctl/prerequisites.md b/content/karpenter/020_eksctl/prerequisites.md deleted file mode 100644 index c7f3acb9..00000000 --- a/content/karpenter/020_eksctl/prerequisites.md +++ /dev/null @@ -1,18 +0,0 @@ ---- -title: "Prerequisites" -date: 2018-08-07T13:31:55-07:00 -weight: 10 ---- - -For this module, we need to download the [eksctl](https://eksctl.io/) binary: -``` -export EKSCTL_VERSION=v0.68.0 -curl --silent --location "https://github.com/weaveworks/eksctl/releases/download/${EKSCTL_VERSION}/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp - -sudo mv -v /tmp/eksctl /usr/local/bin -``` - -Confirm the eksctl command works: -``` -eksctl version -``` diff --git a/content/karpenter/030_k8s_tools/_index.md b/content/karpenter/040_k8s_tools/_index.md similarity index 96% rename from content/karpenter/030_k8s_tools/_index.md rename to content/karpenter/040_k8s_tools/_index.md index fe2618ad..51d29dcd 100644 --- a/content/karpenter/030_k8s_tools/_index.md +++ b/content/karpenter/040_k8s_tools/_index.md @@ -1,7 +1,7 @@ --- title: "Install Kubernetes Tools" chapter: true -weight: 30 +weight: 40 --- # Install Kubernetes tools diff --git a/content/karpenter/030_k8s_tools/deploy_metric_server.md b/content/karpenter/040_k8s_tools/deploy_metric_server.md similarity index 96% rename from content/karpenter/030_k8s_tools/deploy_metric_server.md rename to content/karpenter/040_k8s_tools/deploy_metric_server.md index 5b873c13..a59fac11 100644 --- a/content/karpenter/030_k8s_tools/deploy_metric_server.md +++ b/content/karpenter/040_k8s_tools/deploy_metric_server.md @@ -9,7 +9,7 @@ weight: 20 Metrics Server is a scalable, efficient source of container resource metrics for Kubernetes built-in autoscaling pipelines. These metrics will drive the scaling behavior of the [deployments](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/). We will deploy the metrics server using [Kubernetes Metrics Server](https://github.com/kubernetes-sigs/metrics-server). ```sh -kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.6.0/components.yaml +kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.6.1/components.yaml ``` Lets' verify the status of the metrics-server `APIService` diff --git a/content/karpenter/030_k8s_tools/helm_deploy.md b/content/karpenter/040_k8s_tools/helm_deploy.md similarity index 75% rename from content/karpenter/030_k8s_tools/helm_deploy.md rename to content/karpenter/040_k8s_tools/helm_deploy.md index 6b2cbe90..1ed31fee 100644 --- a/content/karpenter/030_k8s_tools/helm_deploy.md +++ b/content/karpenter/040_k8s_tools/helm_deploy.md @@ -25,31 +25,20 @@ Before we can get started configuring Helm, we'll need to first install the command line tools that you will interact with. To do this, run the following: ``` -export DESIRED_VERSION=v3.8.2 +export DESIRED_VERSION=v3.9.4 curl -sSL https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash ``` -We can verify the version - -``` -helm version --short -``` - -Let's configure our first Chart repository. Chart repositories are similar to -APT or yum repositories that you might be familiar with on Linux, or Taps for -Homebrew on macOS. +{{% notice note %}} +Note we are using a version relatively old of Helm and the same will apply to the load of the stable repo +This is temporary required to load kube-ops-view. 
+{{% /notice %}} -Download the `stable` repository so we have something to start with: -``` -helm repo add stable https://charts.helm.sh/stable/ -helm repo update -``` - -Once this is installed, we will be able to list the charts you can install: +We can verify the version ``` -helm search repo stable +helm version --short ``` Finally, let's configure Bash completion for the `helm` command: diff --git a/content/karpenter/030_k8s_tools/install_kube_ops_view.md b/content/karpenter/040_k8s_tools/install_kube_ops_view.md similarity index 51% rename from content/karpenter/030_k8s_tools/install_kube_ops_view.md rename to content/karpenter/040_k8s_tools/install_kube_ops_view.md index 1b26231f..1427479b 100644 --- a/content/karpenter/030_k8s_tools/install_kube_ops_view.md +++ b/content/karpenter/040_k8s_tools/install_kube_ops_view.md @@ -5,36 +5,30 @@ weight: 30 --- Now that we have helm installed, we are ready to use the stable helm catalog and install tools -that will help with understanding our cluster setup in a visual way. The first of those tools that we are going to install is [Kube-ops-view](https://github.com/hjacobs/kube-ops-view) from **[Henning Jacobs](https://github.com/hjacobs)**. +In this step we will install [Kube-ops-view](https://github.com/hjacobs/kube-ops-view) from **[Henning Jacobs](https://github.com/hjacobs)**. Kube-ops-view will help with understanding our cluster setup in a visual way -The following line updates the stable helm repository and then installs kube-ops-view using a LoadBalancer Service type and creating a RBAC (Resource Base Access Control) entry for the read-only service account to read nodes and pods information from the cluster. +The following lines download the spec required to deploy kube-ops-view using a LoadBalancer Service type and creating a RBAC (Resource Base Access Control) entry for the read-only service account to read nodes and pods information from the cluster. ``` -helm install kube-ops-view \ -stable/kube-ops-view \ ---set service.type=LoadBalancer \ ---set nodeSelector.intent=control-apps \ ---version 1.2.4 \ ---set rbac.create=True +mkdir $HOME/environment/kube-ops-view +for file in kustomization.yaml rbac.yaml deployment.yaml service.yaml; do curl "https://raw.githubusercontent.com/awslabs/ec2-spot-workshops/master/content/karpenter/030_k8s_tools/k8_tools.files/kube_ops_view/${file}" > $HOME/environment/kube-ops-view/${file}; done +kubectl apply -k $HOME/environment/kube-ops-view ``` -The execution above installs kube-ops-view exposing it through a Service using the LoadBalancer type. -A successful execution of the command will display the set of resources created and will prompt some advice asking you to use `kubectl proxy` and a local URL for the service. Given we are using the type LoadBalancer for our service, we can disregard this; Instead we will point our browser to the external load balancer. - {{% notice warning %}} -Monitoring and visualization shouldn't be typically be exposed publicly unless the service is properly secured and provide methods for authentication and authorization. You can still deploy kube-ops-view using a Service of type **ClusterIP** by removing the `--set service.type=LoadBalancer` section and using `kubectl proxy`. 
Kube-ops-view does also [support Oauth 2](https://github.com/hjacobs/kube-ops-view#configuration) +Monitoring and visualization shouldn't be typically be exposed publicly unless the service is properly secured and provide methods for authentication and authorization. You can still deploy kube-ops-view as Service of type **ClusterIP** by removing the `--set service.type=LoadBalancer` section and using `kubectl proxy`. Kube-ops-view does also [support Oauth 2](https://github.com/hjacobs/kube-ops-view#configuration) {{% /notice %}} To check the chart was installed successfully: ``` -helm list +kubectl get svc ``` should display : ``` -NAME NAMESPACE REVISION UPDATED STATUS CHART -kube-ops-view default 1 2020-11-20 05:16:47 deployed kube-ops-view-1.2.4 +NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE +kube-ops-view LoadBalancer 10.100.162.132 addb6e7f91aae4b0dbd6f5833f9750c3-1014347204.eu-west-1.elb.amazonaws.com 80:31628/TCP 3m58s ``` With this we can explore kube-ops-view output by checking the details about the newly service created. diff --git a/content/karpenter/040_k8s_tools/k8_tools.files/kube_ops_view/deployment.yaml b/content/karpenter/040_k8s_tools/k8_tools.files/kube_ops_view/deployment.yaml new file mode 100644 index 00000000..6fc8f9ae --- /dev/null +++ b/content/karpenter/040_k8s_tools/k8_tools.files/kube_ops_view/deployment.yaml @@ -0,0 +1,53 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + labels: + application: kube-ops-view + component: frontend + name: kube-ops-view +spec: + replicas: 1 + selector: + matchLabels: + application: kube-ops-view + component: frontend + template: + metadata: + labels: + application: kube-ops-view + component: frontend + spec: + nodeSelector: + intent: control-apps + serviceAccountName: kube-ops-view + containers: + - name: service + image: hjacobs/kube-ops-view:20.4.0 + ports: + - containerPort: 8080 + protocol: TCP + readinessProbe: + httpGet: + path: /health + port: 8080 + initialDelaySeconds: 5 + timeoutSeconds: 1 + livenessProbe: + httpGet: + path: /health + port: 8080 + initialDelaySeconds: 30 + periodSeconds: 30 + timeoutSeconds: 10 + failureThreshold: 5 + resources: + limits: + cpu: 400m + memory: 400Mi + requests: + cpu: 400m + memory: 400Mi + securityContext: + readOnlyRootFilesystem: true + runAsNonRoot: true + runAsUser: 1000 \ No newline at end of file diff --git a/content/karpenter/040_k8s_tools/k8_tools.files/kube_ops_view/kustomization.yaml b/content/karpenter/040_k8s_tools/k8_tools.files/kube_ops_view/kustomization.yaml new file mode 100644 index 00000000..bc60c0b4 --- /dev/null +++ b/content/karpenter/040_k8s_tools/k8_tools.files/kube_ops_view/kustomization.yaml @@ -0,0 +1,6 @@ +apiVersion: kustomize.config.k8s.io/v1beta1 +kind: Kustomization +resources: + - rbac.yaml + - deployment.yaml + - service.yaml diff --git a/content/karpenter/040_k8s_tools/k8_tools.files/kube_ops_view/rbac.yaml b/content/karpenter/040_k8s_tools/k8_tools.files/kube_ops_view/rbac.yaml new file mode 100644 index 00000000..6e2f2fa5 --- /dev/null +++ b/content/karpenter/040_k8s_tools/k8_tools.files/kube_ops_view/rbac.yaml @@ -0,0 +1,33 @@ +--- +apiVersion: v1 +kind: ServiceAccount +metadata: + name: kube-ops-view +--- +kind: ClusterRole +apiVersion: rbac.authorization.k8s.io/v1 +metadata: + name: kube-ops-view +rules: +- apiGroups: [""] + resources: ["nodes", "pods"] + verbs: + - list +- apiGroups: ["metrics.k8s.io"] + resources: ["nodes", "pods"] + verbs: + - get + - list +--- +kind: ClusterRoleBinding +apiVersion: 
rbac.authorization.k8s.io/v1 +metadata: + name: kube-ops-view +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: ClusterRole + name: kube-ops-view +subjects: +- kind: ServiceAccount + name: kube-ops-view + namespace: default \ No newline at end of file diff --git a/content/karpenter/040_k8s_tools/k8_tools.files/kube_ops_view/service.yaml b/content/karpenter/040_k8s_tools/k8_tools.files/kube_ops_view/service.yaml new file mode 100644 index 00000000..4c6ab902 --- /dev/null +++ b/content/karpenter/040_k8s_tools/k8_tools.files/kube_ops_view/service.yaml @@ -0,0 +1,16 @@ +apiVersion: v1 +kind: Service +metadata: + labels: + application: kube-ops-view + component: frontend + name: kube-ops-view +spec: + selector: + application: kube-ops-view + component: frontend + type: LoadBalancer + ports: + - port: 80 + protocol: TCP + targetPort: 8080 \ No newline at end of file diff --git a/content/karpenter/040_karpenter/_index.md b/content/karpenter/050_karpenter/_index.md similarity index 84% rename from content/karpenter/040_karpenter/_index.md rename to content/karpenter/050_karpenter/_index.md index fad1f29f..235a5403 100644 --- a/content/karpenter/040_karpenter/_index.md +++ b/content/karpenter/050_karpenter/_index.md @@ -2,10 +2,10 @@ title: "Karpenter" titleMenu: "Karpenter" chapter: true -weight: 40 +weight: 50 draft: false --- -In this section we will setup Karpenter. Karpenter is an open-source autoscaling project built for Kubernetes. Karpenter is designed to provide the right compute resources to match your application’s needs in seconds, instead of minutes by observing the aggregate resource requests of unschedulable pods and makes decisions to launch and terminate nodes to minimize scheduling latencies. +In this section we will setup Karpenter. Karpenter is an open-source autoscaling project built for Kubernetes. Karpenter is designed to provide the right compute resources to match your application’s needs in seconds, instead of minutes by observing the aggregate resource requests of unschedulable pods and makes decisions to launch and terminate nodes to optimize the cluster cost. ![Karpenter](/images/karpenter/karpenter_banner.png) diff --git a/content/karpenter/040_karpenter/advanced_provisioner.md b/content/karpenter/050_karpenter/advanced_provisioner.md similarity index 95% rename from content/karpenter/040_karpenter/advanced_provisioner.md rename to content/karpenter/050_karpenter/advanced_provisioner.md index fba8ddc1..3e643145 100644 --- a/content/karpenter/040_karpenter/advanced_provisioner.md +++ b/content/karpenter/050_karpenter/advanced_provisioner.md @@ -1,7 +1,7 @@ --- title: "Deploying Multiple Provisioners" date: 2021-11-07T11:05:19-07:00 -weight: 50 +weight: 60 draft: false --- @@ -28,6 +28,9 @@ kind: Provisioner metadata: name: default spec: + consolidation: + enabled: true + weight: 100 labels: intent: apps requirements: @@ -41,7 +44,6 @@ spec: resources: cpu: 1000 memory: 1000Gi - ttlSecondsAfterEmpty: 30 ttlSecondsUntilExpired: 2592000 providerRef: name: default @@ -126,6 +128,9 @@ Let's spend some time covering a few points in the Provisioners configuration. * The `team1` Provisioner does define a different `AWSNodeTemplate` and changes the AMI from the default [EKS optimized AMI](https://docs.aws.amazon.com/eks/latest/userguide/eks-optimized-ami.html) to [bottlerocket](https://aws.amazon.com/bottlerocket/). It does also adapts the UserData bootstrapping for this particular provider. +* The `default` Provisioner is setting up a weight of 100. 
The evaluation of provisioners can use weights, this is useful to force scenarios where you want karpenter to evaluate a provisioner before other. The higher the weightthe higher the priority in the evaluation. The first provisioner to match the workload is the one that gets used. + + {{% notice note %}} If Karpenter encounters a taint in the Provisioner that is not tolerated by a Pod, Karpenter won’t use that Provisioner to provision the pod. It is recommended to create Provisioners that are mutually exclusive. So no Pod should match multiple Provisioners. If multiple Provisioners are matched, Karpenter will randomly choose which to use. {{% /notice %}} diff --git a/content/karpenter/040_karpenter/automatic_node_provisioning.md b/content/karpenter/050_karpenter/automatic_node_provisioning.md similarity index 73% rename from content/karpenter/040_karpenter/automatic_node_provisioning.md rename to content/karpenter/050_karpenter/automatic_node_provisioning.md index 945b8a13..066cb4f9 100644 --- a/content/karpenter/040_karpenter/automatic_node_provisioning.md +++ b/content/karpenter/050_karpenter/automatic_node_provisioning.md @@ -96,24 +96,25 @@ echo type: $(kubectl describe node --selector=intent=apps | grep "beta.kubernete There is something even more interesting to learn about how the node was provisioned. Check out Karpenter logs and look at the new Karpenter created. The lines should be similar to the ones below ```bash -2022-07-01T03:00:19.634Z INFO controller.provisioning Found 1 provisionable pod(s) {"commit": "1f7a67b"} -2022-07-01T03:00:19.634Z INFO controller.provisioning Computed 1 new node(s) will fit 1 pod(s) {"commit": "1f7a67b"} -2022-07-01T03:00:19.790Z DEBUG controller.provisioning.cloudprovider Discovered subnets: [subnet-0e528fbbaf13542c2 (eu-west-1b) subnet-0a9bd9b668d8ae58d (eu-west-1a) subnet-03aec03eee186dc42 (eu-west-1a) subnet-03ff683f2535bcd8d (eu-west-1b)] {"commit": "1f7a67b", "provisioner": "default"} -2022-07-01T03:00:19.871Z DEBUG controller.provisioning.cloudprovider Discovered security groups: [sg-076f0ca74b68addb2 sg-09176f21ae53f5d60] {"commit": "1f7a67b", "provisioner": "default"} -2022-07-01T03:00:19.873Z DEBUG controller.provisioning.cloudprovider Discovered kubernetes version 1.21 {"commit": "1f7a67b", "provisioner": "default"} -2022-07-01T03:00:19.928Z DEBUG controller.provisioning.cloudprovider Discovered ami-0413b176c68479e84 for query "/aws/service/eks/optimized-ami/1.21/amazon-linux-2/recommended/image_id" {"commit": "1f7a67b", "provisioner": "default"} -2022-07-01T03:00:19.972Z DEBUG controller.provisioning.cloudprovider Discovered launch template Karpenter-eksworkshop-eksctl-12663282710833670681 {"commit": "1f7a67b", "provisioner": "default"} -2022-07-01T03:00:23.013Z INFO controller.provisioning.cloudprovider Launched instance: i-05e8535378b1caf35, hostname: ip-192-168-36-234.eu-west-1.compute.internal, type: c5a.xlarge, zone: eu-west-1b, capacityType: spot {"commit": "1f7a67b", "provisioner": "default"} -2022-07-01T03:00:23.042Z INFO controller.provisioning Created node with 1 pods requesting {"cpu":"1125m","memory":"1536Mi","pods":"3"} from types t3a.xlarge, c6a.xlarge, c5a.xlarge, t3.xlarge, c6i.xlarge and 333 other(s) {"commit": "1f7a67b", "provisioner": "default"} -2022-07-01T03:00:23.042Z INFO controller.provisioning Waiting for unschedulable pods {"commit": "1f7a67b"} -2022-07-01T03:00:23.042Z DEBUG controller.events Normal {"commit": "1f7a67b", "object": 
{"kind":"Pod","namespace":"default","name":"inflate-b9d769f59-rcjnj","uid":"e0f98d1d-eaf6-46ff-9ea0-4d66a6842815","apiVersion":"v1","resourceVersion":"20925"}, "reason": "NominatePod", "message": "Pod should schedule on ip-192-168-36-234.eu-west-1.compute.internal"} +2022-09-05T02:23:43.907Z DEBUG controller.consolidation Discovered 542 EC2 instance types {"commit": "b157d45"} +2022-09-05T02:23:44.070Z DEBUG controller.consolidation Discovered subnets: [subnet-085b9778ddacc06bb (eu-west-1b) subnet-0c6313dad0015b677 (eu-west-1a) subnet-02b72b91f674af299 (eu-west-1a) subnet-0471989b1d4fe9e0a (eu-west-1b)] {"commit": "b157d45"} +2022-09-05T02:23:44.188Z DEBUG controller.consolidation Discovered EC2 instance types zonal offerings for subnets {"alpha.eksctl.io/cluster-name":"eksworkshop-eksctl"} {"commit": "b157d45"} +2022-09-05T02:33:19.634Z DEBUG controller.provisioning 27 out of 509 instance types were excluded because they would breach provisioner limits {"commit": "b157d45"} +2022-09-05T02:33:19.640Z INFO controller.provisioning Found 1 provisionable pod(s) {"commit": "b157d45"} +2022-09-05T02:33:19.640Z INFO controller.provisioning Computed 1 new node(s) will fit 1 pod(s) {"commit": "b157d45"} +2022-09-05T02:33:19.648Z INFO controller.provisioning Launching node with 1 pods requesting {"cpu":"1125m","memory":"1536Mi","pods":"3"} from types t3a.xlarge, t3.xlarge, c6i.xlarge, c6id.xlarge, c6a.xlarge and 332 other(s) {"commit": "b157d45", "provisioner": "default"} +2022-09-05T02:33:19.742Z DEBUG controller.provisioning.cloudprovider Discovered security groups: [sg-06f776cc53ed4b025 sg-0bb5f95986167d336] {"commit": "b157d45", "provisioner": "default"} +2022-09-05T02:33:19.744Z DEBUG controller.provisioning.cloudprovider Discovered kubernetes version 1.23 {"commit": "b157d45", "provisioner": "default"} +2022-09-05T02:33:19.810Z DEBUG controller.provisioning.cloudprovider Discovered ami-044d355a56926f0c6 for query "/aws/service/eks/optimized-ami/1.23/amazon-linux-2/recommended/image_id" {"commit": "b157d45", "provisioner": "default"} +2022-09-05T02:33:19.981Z DEBUG controller.provisioning.cloudprovider Created launch template, Karpenter-eksworkshop-eksctl-6351194516503745500 {"commit": "b157d45", "provisioner": "default"} +2022-09-05T02:33:22.282Z INFO controller.provisioning.cloudprovider Launched instance: i-091de07c985bd851e, hostname: ip-192-168-61-204.eu-west-1.compute.internal, type: c5.xlarge, zone: eu-west-1b, capacityType: spot {"commit": "b157d45", "provisioner": "default"} ``` We explained earlier on about group-less cluster scalers and how that simplifies operations and maintenance. Let's deep dive for a second into this concept. Notice how Karpenter picks up the instance from a diversified selection of instances. In this case it selected the following instances: ``` -from types t3a.xlarge, c6a.xlarge, c5a.xlarge, t3.xlarge, c6i.xlarge and 333 other(s) +Launching node with 1 pods requesting {"cpu":"1125m","memory":"1536Mi","pods":"3"} from types t3a.xlarge, t3.xlarge, c6i.xlarge, c6id.xlarge, c6a.xlarge and 332 other(s) ``` **Note** how the types, 'nano', 'micro', 'small', 'medium', 'large', where filtered for this selection. While our recommendation is to diversify on as many instances as possible, there are cases where provisioners may want to filter smaller (or specific) instances types. @@ -123,7 +124,7 @@ Instances types might be different depending on the region selected. 
All this instances are the suitable instances that reduce the waste of resources (memory and CPU) for the pod submitted. If you are interested in Algorithms, internally Karpenter is using a [First Fit Decreasing (FFD)](https://en.wikipedia.org/wiki/Bin_packing_problem#First_Fit_Decreasing_(FFD)) approach. Note however this can change in the future. -We did set Karpenter Provisioner to use [EC2 Spot instances](https://aws.amazon.com/ec2/spot/), and there was no `instance-types` [requirement section in the Provisioner to filter the type of instances](https://karpenter.sh/v0.10.0/provisioner/#instance-types). This means that Karpenter will use the default value of instances types to use. The default value includes all instance types with the exclusion of metal (non-virtualized), [non-HVM](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/virtualization_types.html), and GPU instances.Internally Karpenter used **EC2 Fleet in Instant mode** to provision the instances. You can read more about EC2 Fleet Instant mode [**here**](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instant-fleet.html). Here are a few properties to mention about EC2 Fleet instant mode that are key for Karpenter. +We did set Karpenter Provisioner to use [EC2 Spot instances](https://aws.amazon.com/ec2/spot/), and there was no `instance-types` [requirement section in the Provisioner to filter the type of instances](https://karpenter.sh/v0.16.1/provisioner/#instance-types). This means that Karpenter will use the default value of instances types to use. The default value includes all instance types with the exclusion of metal (non-virtualized), [non-HVM](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/virtualization_types.html), and GPU instances.Internally Karpenter used **EC2 Fleet in Instant mode** to provision the instances. You can read more about EC2 Fleet Instant mode [**here**](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instant-fleet.html). Here are a few properties to mention about EC2 Fleet instant mode that are key for Karpenter. * EC2 Fleet instant mode provides a synchronous call to procure instances, including EC2 Spot, this simplifies and avoid error when provisioning instances. For those of you familiar with [Cluster Autoscaler on AWS](https://github.com/kubernetes/autoscaler/blob/c4b56ea56136681e8a8ff654dfcd813c0d459442/cluster-autoscaler/cloudprovider/aws/auto_scaling_groups.go#L33-L36), you may know about how it uses `i-placeholder` to coordinate instances that have been created in asynchronous ways. @@ -153,7 +154,7 @@ Let's now focus in a few of those parameters starting with the Labels: Labels: ... intent=apps karpenter.sh/capacity-type=spot - node.kubernetes.io/instance-type=t3.medium + node.kubernetes.io/instance-type=c5.xlarge topology.kubernetes.io/region=eu-west-1 topology.kubernetes.io/zone=eu-west-1a karpenter.sh/provisioner-name=default @@ -170,9 +171,11 @@ Another thing to note from the node description is the following section: ```bash System Info: ... + OS Image: Amazon Linux 2 Operating System: linux Architecture: amd64 - Container Runtime Version: containerd://1.4.6 + Container Runtime Version: containerd://1.6.6 + Kubelet Version: v1.23.9-eks-ba74326 ... ``` @@ -229,22 +232,18 @@ This will set a few pods pending. 
Karpenter will get the pending pod signal and ```bash -2022-07-01T03:13:32.754Z INFO controller.provisioning Found 7 provisionable pod(s) {"commit": "1f7a67b"} -2022-07-01T03:13:32.754Z INFO controller.provisioning Computed 1 new node(s) will fit 7 pod(s) {"commit": "1f7a67b"} -2022-07-01T03:13:32.824Z DEBUG controller.provisioning.cloudprovider Discovered subnets: [subnet-0e528fbbaf13542c2 (eu-west-1b) subnet-0a9bd9b668d8ae58d (eu-west-1a) subnet-03aec03eee186dc42 (eu-west-1a) subnet-03ff683f2535bcd8d (eu-west-1b)] {"commit": "1f7a67b", "provisioner": "default"} -2022-07-01T03:13:32.867Z DEBUG controller.provisioning.cloudprovider Discovered security groups: [sg-076f0ca74b68addb2 sg-09176f21ae53f5d60] {"commit": "1f7a67b", "provisioner": "default"} -2022-07-01T03:13:32.868Z DEBUG controller.provisioning.cloudprovider Discovered kubernetes version 1.21 {"commit": "1f7a67b", "provisioner": "default"} -2022-07-01T03:13:32.929Z DEBUG controller.provisioning.cloudprovider Discovered ami-0413b176c68479e84 for query "/aws/service/eks/optimized-ami/1.21/amazon-linux-2/recommended/image_id" {"commit": "1f7a67b", "provisioner": "default"} -2022-07-01T03:13:33.105Z DEBUG controller.provisioning.cloudprovider Created launch template, Karpenter-eksworkshop-eksctl-12663282710833670681 {"commit": "1f7a67b", "provisioner": "default"} -2022-07-01T03:13:35.047Z INFO controller.provisioning.cloudprovider Launched instance: i-004d9de653118ae9d, hostname: ip-192-168-27-254.eu-west-1.compute.internal, type: t3a.2xlarge, zone: eu-west-1a, capacityType: spot {"commit": "1f7a67b", "provisioner": "default"} -2022-07-01T03:13:35.074Z INFO controller.provisioning Created node with 7 pods requesting {"cpu":"7125m","memory":"10752Mi","pods":"9"} from types t3a.2xlarge, c6a.2xlarge, c5a.2xlarge, t3.2xlarge, c6i.2xlarge and 276 other(s) {"commit": "1f7a67b", "provisioner": "default"} -2022-07-01T03:13:35.074Z INFO controller.provisioning Waiting for unschedulable pods {"commit": "1f7a67b"} +2022-09-05T02:40:15.714Z DEBUG controller.provisioning 27 out of 509 instance types were excluded because they would breach provisioner limits {"commit": "b157d45"} +2022-09-05T02:40:15.769Z INFO controller.provisioning Found 7 provisionable pod(s) {"commit": "b157d45"} +2022-09-05T02:40:15.769Z INFO controller.provisioning Computed 1 new node(s) will fit 7 pod(s) {"commit": "b157d45"} +2022-09-05T02:40:15.784Z INFO controller.provisioning Launching node with 7 pods requesting {"cpu":"7125m","memory":"10752Mi","pods":"9"} from types inf1.2xlarge, c3.2xlarge, r3.2xlarge, c5a.2xlarge, t3a.2xlarge and 280 other(s) {"commit": "b157d45", "provisioner": "default"} +2022-09-05T02:40:16.111Z DEBUG controller.provisioning.cloudprovider Created launch template, Karpenter-eksworkshop-eksctl-6351194516503745500 {"commit": "b157d45", "provisioner": "default"} +2022-09-05T02:40:18.115Z INFO controller.provisioning.cloudprovider Launched instance: i-0081fc25504eacc93, hostname: ip-192-168-16-71.eu-west-1.compute.internal, type: c5a.2xlarge, zone: eu-west-1a, capacityType: spot {"commit": "b157d45", "provisioner": "default"} ``` Indeed the instances selected this time are larger ! 
The instances selected in this example were: ```bash -from types t3a.2xlarge, c6a.2xlarge, c5a.2xlarge, t3.2xlarge, c6i.2xlarge and 276 other(s) +Launching node with 7 pods requesting {"cpu":"7125m","memory":"10752Mi","pods":"9"} from types inf1.2xlarge, c3.2xlarge, r3.2xlarge, c5a.2xlarge, t3a.2xlarge and 280 other(s) ``` @@ -273,15 +272,18 @@ Let's cover the second reason why we started with 0 replicas and why we also end {{% /expand %}} -## What Have we learned in this section : +## What Have we learned in this section: In this section we have learned: * Karpenter scales up nodes in a group-less approach. Karpenter select which nodes to scale , based on the number of pending pods and the *Provisioner* configuration. It selects how the best instances for the workload should look like, and then provisions those instances. This is unlike what Cluster Autoscaler does. In the case of Cluster Autoscaler, first all existing node group are evaluated and to find which one is the best placed to scale, given the Pod constraints. -* Karpenter uses cordon and drain [best practices](https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/) to terminate nodes. The configuration of when a node is terminated can be controlled with `ttlSecondsAfterEmpty` - * Karpenter can scale-out from zero when applications have available working pods and scale-in to zero when there are no running jobs or pods. * Provisioners can be setup to define governance and rules that define how nodes will be provisioned within a cluster partition. We can setup requirements such as `karpenter.sh/capacity-type` to allow on-demand and spot instances or use `karpenter.k8s.aws/instance-size` to filter smaller sizes. The full list of supported labels is available **[here](https://karpenter.sh/v0.13.1/tasks/scheduling/#selecting-nodes)** +* Karpenter uses cordon and drain [best practices](https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/) to terminate nodes. The configuration of when a node is terminated can be controlled with `ttlSecondsAfterEmpty` + + +The ability to terminate nodes only when they are completely idle is ideal for clusters or provisioners used by **batch** workloads. This is controlled by the setting `ttlSecondsAfterEmpty`. In batch workloads you want to ideally let all the kubernetes `jobs` to complete and for a node to be idle before removing the node. This behaviour is not ideal in scenarios where the workload are long running stateless micro-services. Under this conditions the best approach is to use Karpenter **consolidation** functionality. Let's explore how consolidation works in the next section. + diff --git a/content/karpenter/050_karpenter/consolidation.md b/content/karpenter/050_karpenter/consolidation.md new file mode 100644 index 00000000..7d05303c --- /dev/null +++ b/content/karpenter/050_karpenter/consolidation.md @@ -0,0 +1,283 @@ +--- +title: "Consolidation" +date: 2021-11-07T11:05:19-07:00 +weight: 50 +draft: false +--- + +In the previous section we did set the default provisioner configured with a specific `ttlSecondsAfterEmpty`. This instructs Karpenter to remove nodes after `ttlSecondsAfterEmpty` of a node being empty. Note Karpenter will take Daemonset into consideration.We also know that nodes can be removed when they reach the `ttlSecondsUntilExpired`. This is ideal to force node termination on the cluster while bringing new nodes that will pick up the latest AMI's. 
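Before changing anything, it can help to confirm what the `default` Provisioner currently has configured. The short check below is only a sketch and assumes the Provisioner is still named `default`, as in the previous sections:

```
# Print the deprovisioning-related fields of the default Provisioner;
# an empty value simply means that the field is not set.
kubectl get provisioner default -o jsonpath='ttlSecondsAfterEmpty: {.spec.ttlSecondsAfterEmpty}{"\n"}ttlSecondsUntilExpired: {.spec.ttlSecondsUntilExpired}{"\n"}consolidation: {.spec.consolidation}{"\n"}'
```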
+ +{{% notice note %}} +Automated deprovisioning is configured through the ProvisionerSpec `.ttlSecondsAfterEmpty`, `.ttlSecondsUntilExpired` and `.consolidation.enabled` fields. If these are not configured, Karpenter will not default values for them and will not terminate nodes. +{{% /notice %}} + +There is another way to configure Karpenter to deprovision nodes called **Consolidation**. This mode is preferred for workloads such as microservices and is imcompatible with setting up the `ttlSecondsAfterEmpty` . When set in consolidation mode Karpenter works to actively reduce cluster cost by identifying when nodes can be removed as their workloads will run on other nodes in the cluster and when nodes can be replaced with cheaper variants due to a change in the workloads. + +Before we proceed to see how Consolidation works, let's change the default provisioner configuration: +``` +cat <> ~/.bash_profile TEMPOUT=$(mktemp) curl -fsSL https://karpenter.sh/"${KARPENTER_VERSION}"/getting-started/getting-started-with-eksctl/cloudformation.yaml > $TEMPOUT \ diff --git a/content/karpenter/040_karpenter/set_up_the_provisioner.md b/content/karpenter/050_karpenter/set_up_the_provisioner.md similarity index 88% rename from content/karpenter/040_karpenter/set_up_the_provisioner.md rename to content/karpenter/050_karpenter/set_up_the_provisioner.md index e5d20340..9acb50fd 100644 --- a/content/karpenter/040_karpenter/set_up_the_provisioner.md +++ b/content/karpenter/050_karpenter/set_up_the_provisioner.md @@ -75,7 +75,7 @@ The configuration for the provider is split into two parts. The first one define {{% notice info %}} -Karpenter has been designed to be generic and support other Cloud and Infrastructure providers. At the moment of writing this workshop (**Karpenter 0.13.1**) main implementation and Provisioner available is on AWS. You can read more about the **[configuration available for the AWS Provisioner here](https://karpenter.sh/v0.13.1/aws/)** +Karpenter has been designed to be generic and support other Cloud and Infrastructure providers. At the moment of writing this workshop (**Karpenter 0.16.1**) main implementation and Provisioner available is on AWS. You can read more about the **[configuration available for the AWS Provisioner here](https://karpenter.sh/v0.16.1/aws/)**. {{% /notice %}} ## Displaying Karpenter Logs @@ -84,10 +84,16 @@ Karpenter has been designed to be generic and support other Cloud and Infrastruc You can create a new terminal window within Cloud9 and leave the command below running so you can come back to that terminal every time you want to look for what Karpenter is doing. {{% /notice %}} -To read Karpenter logs from the console you can run the following command. +To read karpenter logs you first need to find the pod that act as elected leader and get the logs out from it. The following line setup an alias that you can use to automate that. The alias just looks for the headers of all the Karpenter controller logs, search for the pod that has the elected leader message and start streaming the line. 
``` -kubectl logs -f deployment/karpenter -c controller -n karpenter +alias kl='for pod in $(kubectl get pods -n karpenter | grep karpenter | awk NF=1) ; do if [[ $(kubectl logs ${pod} -c controller -n karpenter --limit-bytes=4096) =~ .*acquired.* ]]; then kubectl logs ${pod} -c controller -n karpenter -f --tail=20; fi; done' +``` + +From now on to invoke the alias and get the logs we can just use + +``` +kl ``` {{% notice info %}} diff --git a/content/karpenter/040_karpenter/using_alternative_provisioners.md b/content/karpenter/050_karpenter/using_alternative_provisioners.md similarity index 75% rename from content/karpenter/040_karpenter/using_alternative_provisioners.md rename to content/karpenter/050_karpenter/using_alternative_provisioners.md index 6f2e0bb4..14d388ba 100644 --- a/content/karpenter/040_karpenter/using_alternative_provisioners.md +++ b/content/karpenter/050_karpenter/using_alternative_provisioners.md @@ -1,7 +1,7 @@ --- title: "Using Alternative Provisioners" date: 2021-11-07T11:05:19-07:00 -weight: 80 +weight: 90 draft: false --- @@ -125,26 +125,26 @@ But there is something that does not match with what we have seen so far with Ka Well, let's check first Karpenter log. ``` -kubectl logs -f deployment/karpenter -c controller -n karpenter +alias kl='for pod in $(kubectl get pods -n karpenter | grep karpenter | awk NF=1) ; do if [[ $(kubectl logs ${pod} -c controller -n karpenter --limit-bytes=4096) =~ .*acquired.* ]]; then kubectl logs ${pod} -c controller -n karpenter -f --tail=20; fi; done' +kl ``` The output of Karpenter should look similar to the one below ``` ... -2022-07-01T04:12:15.781Z INFO controller.provisioning Found 4 provisionable pod(s) {"commit": "1f7a67b"} -2022-07-01T04:12:15.781Z INFO controller.provisioning Computed 2 new node(s) will fit 4 pod(s) {"commit": "1f7a67b"} -2022-07-01T04:12:15.967Z DEBUG controller.provisioning.cloudprovider Discovered subnets: [subnet-0e528fbbaf13542c2 (eu-west-1b) subnet-0a9bd9b668d8ae58d (eu-west-1a) subnet-03aec03eee186dc42 (eu-west-1a) subnet-03ff683f2535bcd8d (eu-west-1b)] {"commit": "1f7a67b", "provisioner": "team1"} -2022-07-01T04:12:16.063Z DEBUG controller.provisioning.cloudprovider Discovered security groups: [sg-076f0ca74b68addb2 sg-09176f21ae53f5d60] {"commit": "1f7a67b", "provisioner": "team1"} -2022-07-01T04:12:16.071Z DEBUG controller.provisioning.cloudprovider Discovered kubernetes version 1.21 {"commit": "1f7a67b", "provisioner": "team1"} -2022-07-01T04:12:16.179Z DEBUG controller.provisioning.cloudprovider Discovered ami-015933fe34749f648 for query "/aws/service/bottlerocket/aws-k8s-1.21/x86_64/latest/image_id" {"commit": "1f7a67b", "provisioner": "team1"} -2022-07-01T04:12:16.456Z DEBUG controller.provisioning.cloudprovider Created launch template, Karpenter-eksworkshop-eksctl-641081096202606695 {"commit": "1f7a67b", "provisioner": "team1"} -2022-07-01T04:12:17.277Z DEBUG controller.node-state Discovered 531 EC2 instance types {"commit": "1f7a67b", "node": "ip-192-168-25-60.eu-west-1.compute.internal"} -2022-07-01T04:12:17.418Z DEBUG controller.node-state Discovered EC2 instance types zonal offerings {"commit": "1f7a67b", "node": "ip-192-168-25-60.eu-west-1.compute.internal"} -2022-07-01T04:12:18.287Z INFO controller.provisioning.cloudprovider Launched instance: i-0e81a84185e589749, hostname: ip-192-168-37-210.eu-west-1.compute.internal, type: t3a.xlarge, zone: eu-west-1b, capacityType: on-demand {"commit": "1f7a67b", "provisioner": "team1"} -2022-07-01T04:12:18.302Z INFO 
controller.provisioning.cloudprovider Launched instance: i-03c9fc74527b401f4, hostname: ip-192-168-7-134.eu-west-1.compute.internal, type: t3a.xlarge, zone: eu-west-1a, capacityType: on-demand {"commit": "1f7a67b", "provisioner": "team1"} -2022-07-01T04:12:18.306Z INFO controller.provisioning Created node with 2 pods requesting {"cpu":"2125m","memory":"512M","pods":"4"} from types t3a.xlarge, c6a.xlarge, c5a.xlarge, c6i.xlarge, t3.xlarge and 315 other(s) {"commit": "1f7a67b", "provisioner": "team1"} -2022-07-01T04:12:18.306Z DEBUG controller.events Normal {"commit": "1f7a67b", "object": {"kind":"Pod","namespace":"default","name":"inflate-team1-865b77c748-dp9k5","uid":"5b682809-1ae9-4ed2-85c9-451abc11cf75","apiVersion":"v1","resourceVersion":"43463"}, "reason": "NominatePod", "message": "Pod should schedule on ip-192-168-37-210.eu-west-1.compute.internal"} +2022-09-05T11:11:33.993Z DEBUG controller.provisioning 27 out of 509 instance types were excluded because they would breach provisioner limits {"commit": "b157d45"} +2022-09-05T11:11:33.993Z DEBUG controller.provisioning 27 out of 509 instance types were excluded because they would breach provisioner limits {"commit": "b157d45"} +2022-09-05T11:11:33.999Z DEBUG controller.provisioning 27 out of 509 instance types were excluded because they would breach provisioner limits {"commit": "b157d45"} +2022-09-05T11:11:33.999Z DEBUG controller.provisioning 381 out of 509 instance types were excluded because they would breach provisioner limits {"commit": "b157d45"} +2022-09-05T11:11:34.006Z INFO controller.provisioning Found 4 provisionable pod(s) {"commit": "b157d45"} +2022-09-05T11:11:34.006Z INFO controller.provisioning Computed 2 new node(s) will fit 4 pod(s) {"commit": "b157d45"} +2022-09-05T11:11:34.007Z INFO controller.provisioning Launching node with 2 pods requesting {"cpu":"2125m","memory":"512M","pods":"4"} from types t3a.xlarge, c6a.xlarge, c5a.xlarge, t3.xlarge, c6i.xlarge and 35 other(s) {"commit": "b157d45", "provisioner": "team1"} +2022-09-05T11:11:34.014Z INFO controller.provisioning Launching node with 2 pods requesting {"cpu":"2125m","memory":"512M","pods":"4"} from types t3a.xlarge, c6a.xlarge, c5a.xlarge, t3.xlarge, c6i.xlarge and 325 other(s) {"commit": "b157d45", "provisioner": "team1"} +2022-09-05T11:11:34.342Z DEBUG controller.provisioning.cloudprovider Discovered launch template Karpenter-eksworkshop-eksctl-14752700009555043417 {"commit": "b157d45", "provisioner": "team1"} +2022-09-05T11:11:36.601Z DEBUG controller.provisioning.cloudprovider InsufficientInstanceCapacity for offering { instanceType: t3a.xlarge, zone: eu-west-1b, capacityType: on-demand }, avoiding for 3m0s {"commit": "b157d45", "provisioner": "team1"} +2022-09-05T11:11:36.748Z INFO controller.provisioning.cloudprovider Launched instance: i-0b44228e7195f7588, hostname: ip-192-168-42-207.eu-west-1.compute.internal, type: c6a.xlarge, zone: eu-west-1b, capacityType: on-demand {"commit": "b157d45", "provisioner": "team1"} +2022-09-05T11:11:38.400Z INFO controller.provisioning.cloudprovider Launched instance: i-0e5173a4f48019515, hostname: ip-192-168-31-229.eu-west-1.compute.internal, type: t3a.xlarge, zone: eu-west-1a, capacityType: on-demand {"commit": "b157d45", "provisioner": "team1"} ... 
``` diff --git a/content/karpenter/050_scaling/_index.md b/content/karpenter/060_scaling/_index.md similarity index 99% rename from content/karpenter/050_scaling/_index.md rename to content/karpenter/060_scaling/_index.md index 0f4fc965..019f0f4c 100644 --- a/content/karpenter/050_scaling/_index.md +++ b/content/karpenter/060_scaling/_index.md @@ -1,7 +1,7 @@ --- title: "Scaling App and Cluster" chapter: true -weight: 50 +weight: 60 --- # Implement AutoScaling with HPA and Karpenter diff --git a/content/karpenter/050_scaling/build_and_push_to_ecr.md b/content/karpenter/060_scaling/build_and_push_to_ecr.md similarity index 100% rename from content/karpenter/050_scaling/build_and_push_to_ecr.md rename to content/karpenter/060_scaling/build_and_push_to_ecr.md diff --git a/content/karpenter/050_scaling/deploy_hpa.md b/content/karpenter/060_scaling/deploy_hpa.md similarity index 100% rename from content/karpenter/050_scaling/deploy_hpa.md rename to content/karpenter/060_scaling/deploy_hpa.md diff --git a/content/karpenter/060_scaling/fis_experiment.md b/content/karpenter/060_scaling/fis_experiment.md new file mode 100644 index 00000000..d7f22aed --- /dev/null +++ b/content/karpenter/060_scaling/fis_experiment.md @@ -0,0 +1,190 @@ +--- +title: "Use FIS to Interrupt a Spot Instance" +date: 2022-08-31T13:12:00-07:00 +weight: 50 +--- + +During this workshop we have been making extensive use of Spot Instances. A common question from Spot users is how to reproduce the effects of an instance interruption, so they can verify whether an application degrades or has issues when Spot Instances are terminated and replaced by instances from other pools where capacity is available. + +In this section, you're going to create and run an experiment to [trigger the interruption of Amazon EC2 Spot Instances using AWS Fault Injection Simulator (FIS)](https://aws.amazon.com/blogs/compute/implementing-interruption-tolerance-in-amazon-ec2-spot-with-aws-fault-injection-simulator/). When using Spot Instances, you need to be prepared to be interrupted. With FIS, you can test the resiliency of your workload and validate that your application reacts to the interruption notices that EC2 sends before terminating your instances. You can target individual Spot Instances or a subset of instances in clusters managed by services that tag your instances, such as ASG, EC2 Fleet, and EKS. + +#### What do you need to get started? + +Before you start launching Spot interruptions with FIS, you need to create an experiment template. This is where you define which resources you want to interrupt (the targets) and when you want to interrupt them.
+ +Let's create a CloudFormation template which creates the IAM role (`FISSpotRole`) with the minimum permissions FIS needs to interrupt an instance, and the experiment template (`FISExperimentTemplate`) you're going to use to trigger a Spot interruption: + +``` +export FIS_EXP_NAME=fis-karpenter-spot-interruption +cat <<EoF > fis-karpenter.yaml +AWSTemplateFormatVersion: 2010-09-09 +Description: FIS for Spot Instances +Parameters: + InstancesToInterrupt: + Description: Number of instances to interrupt + Default: 1 + Type: Number + + DurationBeforeInterruption: + Description: Number of minutes before the interruption + Default: 3 + Type: Number + +Resources: + + FISSpotRole: + Type: AWS::IAM::Role + Properties: + AssumeRolePolicyDocument: + Statement: + - Effect: Allow + Principal: + Service: [fis.amazonaws.com] + Action: ["sts:AssumeRole"] + Path: / + Policies: + - PolicyName: root + PolicyDocument: + Version: "2012-10-17" + Statement: + - Effect: Allow + Action: 'ec2:DescribeInstances' + Resource: '*' + - Effect: Allow + Action: 'ec2:SendSpotInstanceInterruptions' + Resource: 'arn:aws:ec2:*:*:instance/*' + + FISExperimentTemplate: + Type: AWS::FIS::ExperimentTemplate + Properties: + Description: "Interrupt a spot instance with EKS label intent:apps" + Targets: + SpotIntances: + ResourceTags: + IntentLabel: apps + Filters: + - Path: State.Name + Values: + - running + ResourceType: aws:ec2:spot-instance + SelectionMode: !Join ["", ["COUNT(", !Ref InstancesToInterrupt, ")"]] + Actions: + interrupt: + ActionId: "aws:ec2:send-spot-instance-interruptions" + Description: "Interrupt a Spot instance" + Parameters: + durationBeforeInterruption: !Join ["", ["PT", !Ref DurationBeforeInterruption, "M"]] + Targets: + SpotInstances: SpotIntances + StopConditions: + - Source: none + RoleArn: !GetAtt FISSpotRole.Arn + Tags: + Name: "${FIS_EXP_NAME}" + +Outputs: + FISExperimentID: + Value: !GetAtt FISExperimentTemplate.Id +EoF +``` + +Here are some important notes about the template: + +* You can configure how many instances you want to interrupt with the `InstancesToInterrupt` parameter. The template defines that it's going to interrupt **one** instance. +* You can also configure how long FIS waits before terminating the instance with the `DurationBeforeInterruption` parameter; in this template it defaults to **three** minutes. Shortly after you launch the experiment, the targeted instance receives the standard two-minute Spot interruption notice and is then terminated. +* The most important section is `Targets` in the experiment template. Under `ResourceTags` we have `IntentLabel: apps`, which tells the experiment to only select from the EKS nodes we have labeled with `intent: apps`. If there is more than one instance still running with this label, the instance to be interrupted will be **chosen randomly**; you can list the candidate nodes with the check shown after this list.
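Before running the experiment it can help to confirm that there are actually candidate instances to interrupt. Below is a quick check, assuming the `intent: apps` node label and the well-known `karpenter.sh/capacity-type` label used earlier in the workshop:

```
# List Spot nodes labeled intent=apps; FIS picks its target from the
# EC2 Spot Instances backing one of these nodes.
kubectl get nodes \
  -l intent=apps,karpenter.sh/capacity-type=spot \
  -L karpenter.sh/capacity-type,node.kubernetes.io/instance-type,topology.kubernetes.io/zone
```

If the list is empty, the experiment will fail with the `Target resolution returned empty set` reason described further down.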
+ +#### Create the EC2 Spot Interruption Experiment with FIS + +Run the following commands to create the FIS experiment from your template; it will take a few moments for them to complete: + +``` +aws cloudformation create-stack --stack-name $FIS_EXP_NAME --template-body file://fis-karpenter.yaml --capabilities CAPABILITY_NAMED_IAM +aws cloudformation wait stack-create-complete --stack-name $FIS_EXP_NAME +``` + +#### Run the Spot Interruption Experiment + +You can run the Spot interruption experiment by issuing the following commands: + +``` +FIS_EXP_TEMP_ID=$(aws cloudformation describe-stacks --stack-name $FIS_EXP_NAME --query "Stacks[0].Outputs[?OutputKey=='FISExperimentID'].OutputValue" --output text) +FIS_EXP_ID=$(aws fis start-experiment --experiment-template-id $FIS_EXP_TEMP_ID --no-cli-pager --query "experiment.id" --output text) +``` + +In a few seconds the experiment should complete. This means one of your instances has received a two-minute instance interruption notice and will be terminated. You can see the status of the experiment by running: + +``` +aws fis get-experiment --id $FIS_EXP_ID --no-cli-pager +``` + +If the experiment completed successfully you should see a response like this: + +``` +{ + "experiment": { + + ... + + "state": { + "status": "completed", + "reason": "Experiment completed." + }, + "targets": { + "SpotIntances": { + "resourceType": "aws:ec2:spot-instance", + "resourceTags": { + "IntentLabel": "apps" + }, + "filters": [ + { + "path": "State.Name", + "values": [ + "running" + ] + } + ], + "selectionMode": "COUNT(1)" + } + }, + + ... + + } +} +``` + +If `status` is listed as `running`, wait a few seconds and run the command again. If `status` is listed as `failed` with `reason` as `Target resolution returned empty set`, it means you do not have any Spot Instances running with the `intent: apps` label, so no instance was selected for termination. + +You can watch how your cluster reacts to the notice with kube-ops-view. Recall you can get the URL for your kube-ops-view by running: + +``` +kubectl get svc kube-ops-view | tail -n 1 | awk '{ print "Kube-ops-view URL = http://"$4 }' +``` + +{{% notice note %}} +You can interrupt more instances by running the experiment multiple times and watching how your cluster reacts; just reissue this command: +``` +FIS_EXP_ID=$(aws fis start-experiment --experiment-template-id $FIS_EXP_TEMP_ID --no-cli-pager --query "experiment.id" --output text) +``` +{{% /notice %}} + +## What Have we learned in this section: + +In this section we have learned: + +* We have built a container image using a multi-stage approach and uploaded the resulting microservice to Amazon Elastic Container Registry (ECR). + +* We have deployed a Monte Carlo microservice applying all the lessons learned from the previous section. + +* We have set up the Horizontal Pod Autoscaler (HPA) to scale our Monte Carlo microservice whenever the average CPU percentage exceeds 50%. We configured it to scale from 3 replicas to 100 replicas. + +* We have sent requests to the Monte Carlo microservice to stress the CPU of the Pods where it runs. We saw dynamic scaling with HPA and Karpenter in action and now know how we can apply these techniques to our Kubernetes cluster. + +* We have created a FIS experiment and ran it to interrupt one of our Spot Instances. We watched how the cluster responded using the visual web tool kube-ops-view. + + +{{% notice info %}} +Congratulations! You have completed the dynamic scaling section of this workshop.
+In the next sections we will collect our conclusions and clean up the setup. +{{% /notice %}} diff --git a/content/karpenter/050_scaling/monte_carlo_pi.md b/content/karpenter/060_scaling/monte_carlo_pi.md similarity index 95% rename from content/karpenter/050_scaling/monte_carlo_pi.md rename to content/karpenter/060_scaling/monte_carlo_pi.md index 35b9a3b1..14d10e5d 100644 --- a/content/karpenter/050_scaling/monte_carlo_pi.md +++ b/content/karpenter/060_scaling/monte_carlo_pi.md @@ -104,7 +104,8 @@ kubectl describe provisioner default We can confirm the statements above by checking Karpenter logs using the following command. By now you should be very familiar with the log lines expected. ``` -kubectl logs -f deployment/karpenter -c controller -n karpenter +alias kl='for pod in $(kubectl get pods -n karpenter | grep karpenter | awk NF=1) ; do if [[ $(kubectl logs ${pod} -c controller -n karpenter --limit-bytes=4096) =~ .*acquired.* ]]; then kubectl logs ${pod} -c controller -n karpenter -f --tail=20; fi; done' +kl ``` Or by runnint the following command to verify the details of the Spot instance created. diff --git a/content/karpenter/050_scaling/test_hpa.md b/content/karpenter/060_scaling/test_hpa.md similarity index 84% rename from content/karpenter/050_scaling/test_hpa.md rename to content/karpenter/060_scaling/test_hpa.md index 1844fd52..307d8cdf 100644 --- a/content/karpenter/050_scaling/test_hpa.md +++ b/content/karpenter/060_scaling/test_hpa.md @@ -102,22 +102,3 @@ or kubectl top pods ``` {{% /expand %}} - - -## What Have we learned in this section : - -In this section we have learned: - -* We have built an container image using a multi-stage approach and uploaded the resulting microservice into Amazon Elastic Container Registry (ECR). - -* We have deployed a Monte Carlo Microservice applying all the lessons learned from the previous section. - -* We have set up the Horizontal Pod Autoscaler (HPA) to scale our Monte Carlo microservice whenever the average CPU percentage exceeds 50%, We configured it to scale from 3 replicas to 100 replicas - -* We have sent request to the Monte Carlo microservice to stress the CPU of the Pods where it runs. We saw in action dynamic scaling with HPA and Karpenter and now know can we appy this techniques to our kubernetes cluster - - -{{% notice info %}} -Congratulations ! You have completed the dynamic scaling section of this workshop. -In the next sections we will collect our conclusions and clean up the setup. -{{% /notice %}} \ No newline at end of file diff --git a/content/karpenter/200_cleanup/_index.md b/content/karpenter/200_cleanup/_index.md index 24997f6d..d610f7e5 100644 --- a/content/karpenter/200_cleanup/_index.md +++ b/content/karpenter/200_cleanup/_index.md @@ -10,6 +10,11 @@ If you're running in an account that was created for you as part of an AWS event If you're running in your own account, make sure you run through these steps to make sure you don't encounter unwanted costs. 
{{% /notice %}} +## Removing the CloudFormation stack used for FIS +``` +aws cloudformation delete-stack --stack-name $FIS_EXP_NAME +``` + ## Cleaning up HPA, CA, and the Microservice ``` cd ~/environment @@ -22,8 +27,8 @@ kubectl delete -f inflate-spot.yaml kubectl delete -f inflate.yaml helm uninstall aws-node-termination-handler --namespace kube-system helm uninstall karpenter -n karpenter -helm uninstall kube-ops-view -kubectl delete -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.6.0/components.yaml +kubectl delete -k $HOME/environment/kube-ops-view +kubectl delete -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.6.1/components.yaml ``` ## Removing the cluster, Managed node groups and Karpenter pre-requisites diff --git a/content/karpenter/300_conclusion/conclusion.md b/content/karpenter/300_conclusion/conclusion.md index df8ae168..4f4ff5ab 100644 --- a/content/karpenter/300_conclusion/conclusion.md +++ b/content/karpenter/300_conclusion/conclusion.md @@ -14,6 +14,7 @@ In the session, we have: - We have learned how Karpenter support custom AMI's and bootsrapping. - We learned how Karpenter uses well-known labels and acts on them procuring capacity that meets criterias such as which architecture to use, which type of instances (On-Demand or Spot) to use. - We learned how Karpenter applies best practices for large scale deployment by diversifying and using allocation strategies for both on demand instances and EC2 Spot instances, we also learned applications have still full control and can set Node Selectors such as `node.kubernetes.io/instance-type: m5.2xlarge` or `topology.kubernetes.io/zone=us-east-1c` to specify explicitely what instance type to use or which AZ an application must be deployed in. +- Learned how deprovisioning works in Karpenter and how to set up the Cluster Consolidation option. - Configured a DaemonSet using **AWS-Node-Termination-Handler** to handle spot interruptions gracefully. We also learned that in future version the integration with the termination controller will be proactive in handling Spot Terminations and Rebalance recommendations. # EC2 Spot Savings diff --git a/content/karpenter/_index.md b/content/karpenter/_index.md index 824f5b03..d3d86d9b 100644 --- a/content/karpenter/_index.md +++ b/content/karpenter/_index.md @@ -11,7 +11,7 @@ In this workshop, you will learn how to provision, manage, and maintain your Kub On EKS we will run a small EKS managed node groups, to deploy a minimum set of On-Demand instances that we will use to deploy controllers. After that we will use Karpenter to deploy a mix of On-Demand and Spot instances to showcase a few of the benefits of running a group-less auto scaler. EC2 Spot Instances allow you to architect for optimizations on cost and scale. -This workshop is originally based on AWS [EKS Workshop](https://eksworkshop.com/)but expands and focuses on how efficient Flexible Compute can be implemented using Karpenter. You can find there more modules and learn about other Amazon Elastic Kubernetes Service best practices. +This workshop is originally based on AWS [EKS Workshop](https://eksworkshop.com/) but expands and focuses on how efficient Flexible Compute can be implemented using Karpenter. You can find there more modules and learn about other Amazon Elastic Kubernetes Service best practices. {{% notice note %}} In this workshop we will not cover the introduction to EKS. 
We expect users of this workshop to understand about Kubernetes, Horizontal Pod Autoscaler and Cluster Autoscaler. Please refer to the **[Containers with EKS](using_ec2_spot_instances_with_eks/005_introduction.html)** workshops diff --git a/content/karpenter/020_eksctl/console_credentials.md b/content/karpenter/console_credentials.md similarity index 99% rename from content/karpenter/020_eksctl/console_credentials.md rename to content/karpenter/console_credentials.md index c6efbfc6..9b890e00 100644 --- a/content/karpenter/020_eksctl/console_credentials.md +++ b/content/karpenter/console_credentials.md @@ -1,7 +1,7 @@ --- title: "EKS Console Credentials" date: 2018-08-07T13:36:57-07:00 -weight: 40 +weight: 30 --- In this section we will set up the configuration you need to explore the Elastic Kubernetes Service (EKS) section in the AWS Console and the properties of the newly created EKS cluster. diff --git a/content/karpenter/020_eksctl/test.md b/content/karpenter/test.md similarity index 99% rename from content/karpenter/020_eksctl/test.md rename to content/karpenter/test.md index ff1f14dc..8f35d89c 100644 --- a/content/karpenter/020_eksctl/test.md +++ b/content/karpenter/test.md @@ -1,7 +1,7 @@ --- title: "Test the Cluster" date: 2018-08-07T13:36:57-07:00 -weight: 30 +weight: 20 --- ## Test the cluster: Confirm your Nodes, if we see 2 nodes then we know we have authenticated correctly: diff --git a/static/images/karpenter/prerequisites/cfn_stak_completion.png b/static/images/karpenter/prerequisites/cfn_stak_completion.png new file mode 100644 index 00000000..a986f784 Binary files /dev/null and b/static/images/karpenter/prerequisites/cfn_stak_completion.png differ