Upgrade to EKS 1.15 + eksctl 0.16.0 + AWS CLI 2.07 + CA 1.15.5 (#46)
Bump all versions to 1.15
Bump AWS CLI to v2.07
The OnDemand nodegroup now runs as a Managed Nodegroup instead of an unmanaged one, so that we can explain the differences between Managed Nodegroups, Unmanaged Nodegroups, and EKS Fargate.
Changed the front page of the EKS workshop
Changed the installation order of the Metrics Server to accommodate installing kube-resource-report at an early stage.
Aside from Kube-ops-view, kube-resource-report is now installed to show cost allocation and the savings coming from EC2 Spot nodes
Changed the allocation strategy to capacity-optimized
Changed some of the messaging to highlight capacity-optimized benefits
Added extra information and changes to remove references to the Spot Instance termination handler; using AWS Node Termination Handler instead.
The Jenkins change to capacity-optimized is still manual, but there is a reference to the configuration that should be used instead; this should probably change in the future.
ruecarlo authored Apr 8, 2020
1 parent 7083211 commit 6ce0b32
Showing 20 changed files with 143 additions and 286 deletions.
11 changes: 5 additions & 6 deletions content/_index.md
@@ -43,13 +43,12 @@ In this workshop you will assume the role of a data engineer, tasked with cost o
costs for running Spark applications, using Amazon EMR and EC2 Spot Instances.
{{< /card >}}

{{< card workshop
"launching_ec2_spot_instances"
"Launching EC2 Spot Instances"
"Amazon-EC2_Spot-Instance_light-bg.png"
{{< card important_workshop
"using_ec2_spot_instances_with_eks"
"Using Spot Instances with EKS"
"Amazon-Elastic-Container-Service-for-Kubernetes.svg"
>}}
In this workshop you will explore different ways of requesting Amazon EC2 Spot Instances
and understand how to qualify workloads for EC2 Spot.
In this workshop, you will learn how to provision, manage, and maintain Kubernetes clusters of any scale with Amazon EKS on Spot Instances, architecting for cost optimization and scale.
{{< /card >}}

{{< card workshop
4 changes: 3 additions & 1 deletion content/using_ec2_spot_instances_with_eks/cleanup.md
@@ -18,7 +18,7 @@ Before you clean up the resources and complete the workshop, you may want to rev
kubectl delete hpa monte-carlo-pi-service
kubectl delete -f ~/environment/cluster-autoscaler/cluster_autoscaler.yml
kubectl delete -f monte-carlo-pi-service.yml
helm delete --purge kube-ops-view metrics-server
helm delete --purge kube-ops-view kube-resource-report metrics-server
```

## Removing EKS nodegroups
@@ -28,6 +28,8 @@ od_nodegroup=$(eksctl get nodegroup --cluster eksworkshop-eksctl | tail -n 1 | a
eksctl delete nodegroup --cluster eksworkshop-eksctl --name $od_nodegroup
```
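The `od_nodegroup` line above extracts the nodegroup name by piping the tabular `eksctl get nodegroup` output through `tail` and `awk`. A quick way to sanity-check that pipeline is to run it against canned output; the rows and column layout below are illustrative, not real eksctl output:

```
# Canned output shaped like `eksctl get nodegroup` (illustrative values).
sample_output="CLUSTER NODEGROUP CREATED
eksworkshop-eksctl ng-od-1 2020-04-08T00:00:00Z"

# Same idea as the cleanup step: take the last row, print the nodegroup column.
od_nodegroup=$(printf '%s\n' "$sample_output" | tail -n 1 | awk '{ print $2 }')
echo "$od_nodegroup"
```

Against a live cluster you would feed the real `eksctl get nodegroup --cluster eksworkshop-eksctl` output into the same pipeline instead of the canned text.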

This operation may take some time. Once it completes, you can proceed to delete the cluster.

## Removing the cluster
```
eksctl delete cluster --name eksworkshop-eksctl
14 changes: 12 additions & 2 deletions content/using_ec2_spot_instances_with_eks/eksctl/launcheks.md
@@ -41,13 +41,23 @@ The following command will create an EKS cluster with the name `eksworkshop-eksctl`. It will also create a nodegroup with 2 on-demand instances.

```
eksctl create cluster --version=1.14 --name=eksworkshop-eksctl --nodes=2 --alb-ingress-access --region=${AWS_REGION} --node-labels="lifecycle=OnDemand,intent=control-apps" --asg-access
eksctl create cluster --version=1.15 --name=eksworkshop-eksctl --managed --nodes=2 --alb-ingress-access --region=${AWS_REGION} --node-labels="lifecycle=OnDemand,intent=control-apps" --asg-access
```

eksctl allows us to pass parameters to initialize the cluster. While initializing the cluster, eksctl does also allow us to create a nodegroup. The nodegroup will have two m5.large nodes and it will bootstrap with the labels **lifecycle=OnDemand** and **intent=control-apps**.
eksctl allows us to pass parameters to initialize the cluster. While initializing the cluster, eksctl also allows us to create nodegroups.

The managed nodegroup will have two m5.large nodes and it will bootstrap with the labels **lifecycle=OnDemand** and **intent=control-apps**.

{{% notice info %}}
Launching EKS and all the dependencies will take approximately **15 minutes**
{{% /notice %}}

The command above created a **Managed Nodegroup**. [Amazon EKS managed node groups](https://docs.aws.amazon.com/eks/latest/userguide/managed-node-groups.html) automate the provisioning and lifecycle management of nodes. Managed Nodegroups use the latest [EKS-optimized AMIs](https://docs.aws.amazon.com/eks/latest/userguide/eks-optimized-ami.html). The nodes run in your AWS account, provisioned as part of an EC2 Auto Scaling group that is managed for you by Amazon EKS. This means EKS takes care of the lifecycle management and undifferentiated heavy lifting of operations such as updating nodes, handling terminations, and gracefully draining nodes to ensure that your applications stay available.
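Once the cluster is up, `kubectl get nodes --show-labels` should show the bootstrap labels on each node. As a sketch of what to look for (the node names and columns below are made up, not real kubectl output), counting the OnDemand nodes from that listing works like this:

```
# Simulated `kubectl get nodes --show-labels` rows (node names are made up).
nodes="ip-192-168-1-10.ec2.internal Ready <none> 5m v1.15.10 intent=control-apps,lifecycle=OnDemand
ip-192-168-2-20.ec2.internal Ready <none> 5m v1.15.10 intent=apps,lifecycle=Ec2Spot"

# Count the nodes carrying the lifecycle=OnDemand label set by the nodegroup.
count=$(printf '%s\n' "$nodes" | grep -c 'lifecycle=OnDemand')
echo "$count"
```

On a live cluster, pipe the real `kubectl get nodes --show-labels` output through the same `grep -c` filter.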

@@ -6,7 +6,7 @@ weight: 10

For this module, we need to download the [eksctl](https://eksctl.io/) binary:
```
export EKSCTL_VERSION=0.13.0
export EKSCTL_VERSION=0.16.0
curl --silent --location "https://github.com/weaveworks/eksctl/releases/download/${EKSCTL_VERSION}/eksctl_Linux_amd64.tar.gz" | tar xz -C /tmp
sudo mv -v /tmp/eksctl /usr/local/bin
```
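If eksctl is already present, it can help to compare the reported version against the pinned one before re-downloading. A minimal version comparison using `sort -V` (this helper is a sketch and not part of the workshop scripts; in a real check `installed` would come from parsing `eksctl version` output):

```
# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

required="0.16.0"
# Placeholder value; on a real machine, parse it from `eksctl version`.
installed="0.16.0"
if version_ge "$installed" "$required"; then
  echo "eksctl is recent enough"
else
  echo "please upgrade eksctl"
fi
```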
@@ -0,0 +1,32 @@
---
title: "Deploy the Metrics Server"
date: 2020-03-07T08:30:11-07:00
weight: 20
---

### Deploy the Metrics Server
Metrics Server is a cluster-wide aggregator of resource usage data. These metrics will drive the scaling behavior of the [deployments](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/). We will deploy the metrics server using `Helm` configured earlier in this workshop.

```
helm install stable/metrics-server \
--name metrics-server \
--version 2.10.0 \
--namespace metrics
```

### Confirm the Metrics API is available.

Return to the terminal in the Cloud9 Environment
```
kubectl get apiservice v1beta1.metrics.k8s.io -o yaml
```
If all is well, you should see a status message similar to the one below in the response
```
status:
conditions:
- lastTransitionTime: 2018-10-15T15:13:13Z
message: all checks passed
reason: Passed
status: "True"
type: Available
```
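Instead of eyeballing the YAML, the Available condition can be asserted with `grep`. The status block below is canned to match the sample output above; against a live cluster you would pipe the `kubectl get apiservice` command into the same checks:

```
# Canned status block matching the sample kubectl output above.
status='status:
  conditions:
  - lastTransitionTime: 2018-10-15T15:13:13Z
    message: all checks passed
    reason: Passed
    status: "True"
    type: Available'

# Succeed only when an Available condition reports "True".
if printf '%s\n' "$status" | grep -q 'type: Available' \
   && printf '%s\n' "$status" | grep -q 'status: "True"'; then
  echo "metrics API available"
fi
```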
@@ -1,17 +1,21 @@
---
title: "Install Kube-ops-view"
date: 2018-08-07T08:30:11-07:00
weight: 20
weight: 30
---

Now that we have helm installed, we are ready to use the stable helm catalog and install tools
that will help with understanding our cluster setup in a visual way. The first of those tools that we are going to install is [Kube-ops-view](https://github.com/hjacobs/kube-ops-view) from Henning Jacobs.
that will help with understanding our cluster setup in a visual way. The first of those tools that we are going to install is [Kube-ops-view](https://github.com/hjacobs/kube-ops-view) from **[Henning Jacobs](https://github.com/hjacobs)**.

The following lines update the stable helm repository and then install kube-ops-view using a LoadBalancer Service type, creating an RBAC (Role-Based Access Control) entry so the read-only service account can read node and pod information from the cluster.

```
helm repo update
helm install stable/kube-ops-view --name kube-ops-view --set service.type=LoadBalancer --set rbac.create=True
helm install stable/kube-ops-view \
--name kube-ops-view \
--set service.type=LoadBalancer \
--set nodeSelector.intent=control-apps \
--set rbac.create=True
```

The command above installs kube-ops-view, exposing it through a Service of type LoadBalancer.
@@ -56,4 +60,45 @@ Spend some time checking the state and properties of your EKS cluster.

![kube-ops-view](/images/using_ec2_spot_instances_with_eks/helm/kube-ops-view-legend.png)

### Exercise

{{% notice info %}}
In this exercise we will install and explore another great tool, **[kube-resource-report](https://github.com/hjacobs/kube-resource-report)** by [Henning Jacobs](https://github.com/hjacobs). Kube-resource-report generates a utilization report and associates a cost to namespaces, applications and pods. Kube-resource-report also takes Spot savings into consideration: it uses the average of the [describe-spot-price-history](https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeSpotPriceHistory.html) values reported over the last three days to estimate the cost of EC2 Spot nodes.
{{% /notice %}}

* Now that we have a way to visualize our cluster with kube-ops-view, how about visualizing the estimated cost used by our cluster namespaces, applications and pods? Follow the instructions described at **[kube-resource-report](https://github.com/hjacobs/kube-resource-report)** github repository and figure out how to deploy the helm chart with the right required parameters. (links to hints: [1](https://helm.sh/docs/chart_template_guide/values_files/), [2](https://github.com/hjacobs/kube-resource-report/blob/master/chart/kube-resource-report/values.yaml), [3](https://github.com/hjacobs/kube-resource-report/blob/master/chart/kube-resource-report/templates/deployment.yaml), [4](https://github.com/hjacobs/kube-resource-report/blob/master/chart/kube-resource-report/templates/service.yaml))


{{%expand "Show me the solution" %}}
Execute the following command in your Cloud9 terminal
```
git clone https://github.com/hjacobs/kube-resource-report
helm install --name kube-resource-report \
--set service.type=LoadBalancer \
--set service.port=80 \
--set container.port=8080 \
--set rbac.create=true \
--set nodeSelector.intent=control-apps \
kube-resource-report/chart/kube-resource-report
```

This installs the chart with the right setup and ports, and identifies the label *aws.amazon.com/spot*, which, when defined on a resource, is used to extract the EC2 Spot price history associated with it. Note that during the rest of the workshop we will still use the `lifecycle` label to identify Spot Instances, and only use `aws.amazon.com/spot` to showcase the integration with kube-resource-report.

Once installed, you should be able to get the Service/LoadBalancer URL using:
```
kubectl get svc kube-resource-report | tail -n 1 | awk '{ print "Kube-resource-report URL = http://"$4 }'
```
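The one-liner above builds the URL from the fourth column (EXTERNAL-IP) of the Service listing. A canned example of the same pipeline (the hostname below is made up):

```
# Canned `kubectl get svc kube-resource-report` output (hostname is made up).
svc="NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-resource-report LoadBalancer 10.100.0.42 a1b2c3-1234.us-east-1.elb.amazonaws.com 80:30080/TCP 4m"

# Take the last row and build the URL from the EXTERNAL-IP column.
url=$(printf '%s\n' "$svc" | tail -n 1 | awk '{ print "http://"$4 }')
echo "Kube-resource-report URL = $url"
```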
{{% notice note %}}
You may need to refresh the page and clear your browser cache. Creating and setting up the LoadBalancer may take a few minutes; it usually takes around four minutes before kube-resource-report becomes reachable.
{{% /notice %}}

Kube-resource-report keeps track of the cluster over time. Furthermore, it identifies EC2 Spot nodes and uses the [AWS Spot price history API](https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeSpotPriceHistory.html) to calculate the current price of the EC2 Spot Instances and attribute the correct cost.

![kube-resource-reports](/images/using_ec2_spot_instances_with_eks/helm/kube-resource-reports.png)

{{% /expand %}}

The result of this exercise should show kube-resource-report's estimated cost of your cluster, as well as the utilization of its different components.
Expand Up @@ -44,6 +44,10 @@ If you are using CloudFormation or Terraform for your production EKS clusters (o

**Question**: Is the new instance type in the ASG different from the one that existed before? If so, it means that the previous instance type was selected as the cheapest option, while the new instance type was selected from the capacity pool that is least likely to be interrupted.

{{% notice info %}}
While in this exercise we manually modified the allocation strategy through the EC2 Management Console, starting from version 0.16.0 eksctl supports the **capacity-optimized** allocation strategy. In fact, you can go back to the original nodegroups that we created and see how this was implemented with eksctl.
{{% /notice %}}
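As a sketch of how that looks in an eksctl cluster config file (the nodegroup name, sizes, and instance types below are illustrative; the fields follow eksctl's `instancesDistribution` schema):

```
nodeGroups:
  - name: ng-spot
    minSize: 0
    maxSize: 5
    instancesDistribution:
      instanceTypes: ["m5.large", "m4.large", "m5a.large"]
      onDemandBaseCapacity: 0
      onDemandPercentageAboveBaseCapacity: 0
      spotAllocationStrategy: "capacity-optimized"
```

With `spotAllocationStrategy: "capacity-optimized"`, the underlying Auto Scaling group picks Spot capacity from the pools least likely to be interrupted rather than the cheapest ones.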


#### Increasing resilience: Automatic Jenkins job retries
We can configure Jenkins to automatically retry running jobs in case of failures. One possible failure would be when a Jenkins agent is running a job on an EC2 Spot Instance that is going to be terminated due to an EC2 Spot Interruption, when EC2 needs the capacity back. To configure automatic retries for jobs, follow these steps:
@@ -31,9 +31,11 @@ nodeGroups:
labels:
lifecycle: Ec2Spot
intent: jenkins-agents
aws.amazon.com/spot: "true"
tags:
k8s.io/cluster-autoscaler/node-template/label/lifecycle: Ec2Spot
k8s.io/cluster-autoscaler/node-template/label/intent: jenkins-agents
k8s.io/cluster-autoscaler/node-template/label/aws.amazon.com/spot: "true"
EoF
```

@@ -17,7 +17,10 @@ aws --version

1. Update to the latest version:
```
pip install --user --upgrade awscli
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
. ~/.bash_profile
```

1. Confirm you have a newer version:
@@ -15,7 +15,7 @@ for the download links.](https://docs.aws.amazon.com/eks/latest/userguide/gettin

#### Install kubectl
```
export KUBECTL_VERSION=v1.14.9
export KUBECTL_VERSION=v1.15.10
sudo curl --silent --location -o /usr/local/bin/kubectl https://storage.googleapis.com/kubernetes-release/release/${KUBECTL_VERSION}/bin/linux/amd64/kubectl
sudo chmod +x /usr/local/bin/kubectl
```
@@ -126,7 +126,7 @@ spec:
nodeSelector:
intent: control-apps
containers:
- image: k8s.gcr.io/cluster-autoscaler:v1.14.6
- image: k8s.gcr.io/cluster-autoscaler:v1.15.5
name: cluster-autoscaler
resources:
limits:
@@ -51,7 +51,8 @@ Using the file browser on the left, open **cluster-autoscaler/cluster_autoscaler

* **Save** the file

This command contains all of the configuration for the Cluster Autoscaler. Each `--nodes` entry defines a new Autoscaling Group mapping to a Cluster Autoscaler nodegroup. Cluster Autoscaler will consider the nodegroups selected when scaling the cluster. The syntax of the line is minimum nodes **(1)**, max nodes **(5)** and **ASG Name**.
This command contains all of the configuration for the Cluster Autoscaler. Each `--nodes` entry defines a new Autoscaling Group mapping to a Cluster Autoscaler nodegroup. Cluster Autoscaler will consider the nodegroups selected when scaling the cluster. The syntax of the line is minimum nodes **(0)**, max nodes **(5)** and **ASG Name**.
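Each `--nodes` entry therefore packs three colon-separated fields. Splitting one such entry shows them (the ASG name below is made up for illustration):

```
# An example --nodes value in min:max:ASG-name form (ASG name is made up).
entry="0:5:eksctl-eksworkshop-nodegroup-spot-NodeGroup-1ABCDEF"

min=$(printf '%s' "$entry" | cut -d: -f1)
max=$(printf '%s' "$entry" | cut -d: -f2)
asg=$(printf '%s' "$entry" | cut -d: -f3)
echo "min=$min max=$max asg=$asg"
```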


### Deploy the Cluster Autoscaler

29 changes: 0 additions & 29 deletions content/using_ec2_spot_instances_with_eks/scaling/deploy_hpa.md
@@ -7,39 +7,10 @@ weight: 40
So far we have scaled the number of replicas manually. We have also built an understanding of how Cluster Autoscaler scales the cluster.
In this section we will deploy the **[Horizontal Pod Autoscaler (HPA)](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/)** and a rule to scale our application once it reaches a CPU threshold. The Horizontal Pod Autoscaler automatically scales the number of pods in a replication controller, deployment or replica set based on observed CPU or memory utilization.
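As a sketch, an HPA resource targeting the workshop's deployment could look like the following (the target name and thresholds are illustrative; the workshop's own manifest or `kubectl autoscale` invocation may differ):

```
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: monte-carlo-pi-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: monte-carlo-pi-service
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
```

The controller compares observed CPU utilization against `targetCPUUtilizationPercentage` and adjusts the replica count between `minReplicas` and `maxReplicas`.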

For HPA to evaluate metrics, we must first deploy the Metrics Server!

### Deploy the Metrics Server
Metrics Server is a cluster-wide aggregator of resource usage data. These metrics will drive the scaling behavior of the [deployments](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/). We will deploy the metrics server using `Helm` configured earlier in this workshop.

```
helm install stable/metrics-server \
--name metrics-server \
--version 2.8.3 \
--namespace metrics
```

{{% notice note %}}
Horizontal Pod Autoscaler is more versatile than just scaling on CPU and memory. There are other projects, different from the Metrics Server, that can be considered when looking to scale on the back of other metrics. For example, [prometheus-adapter](https://github.com/helm/charts/tree/master/stable/prometheus-adapter) can be used with custom metrics imported from [prometheus](https://prometheus.io/)
{{% /notice %}}

### Confirm the Metrics API is available.

Return to the terminal in the Cloud9 Environment
```
kubectl get apiservice v1beta1.metrics.k8s.io -o yaml
```
If all is well, you should see a status message similar to the one below in the response
```
status:
conditions:
- lastTransitionTime: 2018-10-15T15:13:13Z
message: all checks passed
reason: Passed
status: "True"
type: Available
```


### Create an HPA resource associated with the Monte Carlo Pi Service
