Upgrade to EKS 1.15 + eksctl 0.16.0 + AWS CLI 2.07 + CA 1.15.5 (#46)
Bump all versions to 1.15
Bump AWS CLI to v2.07
The OnDemand nodegroup now runs as a Managed Nodegroup instead of an unmanaged one, so that we can explain the differences between Managed Nodegroups, Unmanaged Nodegroups, and EKS Fargate.
Changed the front page of the EKS workshop
Changed the installation order of the Metrics Server to accommodate installing kube-resource-report at an early stage.
Aside from Kube-ops-view, kube-resource-report is now installed to show cost allocation and the savings coming from EC2 Spot nodes
Changed the allocation strategy to capacity-optimized
Changed some of the messaging to highlight capacity-optimized benefits
Added extra information and changes to remove references to the Spot Instance termination handler; using AWS Node Termination Handler instead.
The Jenkins change to capacity-optimized is still manual, but there is a reference to the configuration that should be used instead; this should probably change in the future.
ruecarlo authored Apr 8, 2020
1 parent 7083211 commit 6ce0b32
Showing 20 changed files with 143 additions and 286 deletions.
11 changes: 5 additions & 6 deletions content/_index.md
@@ -43,13 +43,12 @@ In this workshop you will assume the role of a data engineer, tasked with cost o
costs for running Spark applications, using Amazon EMR and EC2 Spot Instances.
{{< /card >}}

{{< card workshop
"launching_ec2_spot_instances"
"Launching EC2 Spot Instances"
"Amazon-EC2_Spot-Instance_light-bg.png"
{{< card important_workshop
"using_ec2_spot_instances_with_eks"
"Using Spot Instances with EKS"
"Amazon-Elastic-Container-Service-for-Kubernetes.svg"
>}}
In this workshop you will explore different ways of requesting Amazon EC2 Spot Instances
and understand how to qualify workloads for EC2 Spot.
In this workshop, you will learn how to provision, manage, and maintain Kubernetes clusters of any scale with Amazon EKS on Spot Instances, architecting for cost optimization and scale.
{{< /card >}}

{{< card workshop
4 changes: 3 additions & 1 deletion content/using_ec2_spot_instances_with_eks/cleanup.md
@@ -18,7 +18,7 @@ Before you clean up the resources and complete the workshop, you may want to rev
kubectl delete hpa monte-carlo-pi-service
kubectl delete -f ~/environment/cluster-autoscaler/cluster_autoscaler.yml
kubectl delete -f monte-carlo-pi-service.yml
helm delete --purge kube-ops-view metrics-server
helm delete --purge kube-ops-view kube-resource-report metrics-server
```

## Removing EKS nodegroups
@@ -28,6 +28,8 @@ od_nodegroup=$(eksctl get nodegroup --cluster eksworkshop-eksctl | tail -n 1 | a
eksctl delete nodegroup --cluster eksworkshop-eksctl --name $od_nodegroup
```
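The `od_nodegroup` line above extracts the nodegroup name by piping the tabular `eksctl get nodegroup` output through `tail` and `awk`. A quick way to sanity-check that pipeline is to run it against canned output; the rows and column layout below are illustrative, not real eksctl output:

```
# Canned output shaped like `eksctl get nodegroup` (illustrative values).
sample_output="CLUSTER NODEGROUP CREATED
eksworkshop-eksctl ng-od-1 2020-04-08T00:00:00Z"

# Same idea as the cleanup step: take the last row, print the nodegroup column.
od_nodegroup=$(printf '%s\n' "$sample_output" | tail -n 1 | awk '{ print $2 }')
echo "$od_nodegroup"
```

Against a live cluster you would feed the real `eksctl get nodegroup --cluster eksworkshop-eksctl` output into the same pipeline instead of the canned text.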

This operation may take some time. Once it completes, you can proceed to delete the cluster.

## Removing the cluster
```
eksctl delete cluster --name eksworkshop-eksctl
14 changes: 12 additions & 2 deletions content/using_ec2_spot_instances_with_eks/eksctl/launcheks.md
@@ -41,13 +41,23 @@ The following command will create an EKS cluster with the name `eksworkshop-eksctl`. It will also create a nodegroup with 2 on-demand instances.

```
eksctl create cluster --version=1.14 --name=eksworkshop-eksctl --nodes=2 --alb-ingress-access --region=${AWS_REGION} --node-labels="lifecycle=OnDemand,intent=control-apps" --asg-access
eksctl create cluster --version=1.15 --name=eksworkshop-eksctl --managed --nodes=2 --alb-ingress-access --region=${AWS_REGION} --node-labels="lifecycle=OnDemand,intent=control-apps" --asg-access
```

eksctl allows us to pass parameters to initialize the cluster. While initializing the cluster, eksctl does also allow us to create a nodegroup. The nodegroup will have two m5.large nodes and it will bootstrap with the labels **lifecycle=OnDemand** and **intent=control-apps**.
eksctl allows us to pass parameters to initialize the cluster. While initializing the cluster, eksctl also allows us to create nodegroups.

The managed nodegroup will have two m5.large nodes and it will bootstrap with the labels **lifecycle=OnDemand** and **intent=control-apps**.

{{% notice info %}}
Launching EKS and all the dependencies will take approximately **15 minutes**
{{% /notice %}}

The command above created a **Managed Nodegroup**. [Amazon EKS managed node groups](https://docs.aws.amazon.com/eks/latest/userguide/managed-node-groups.html) automate the provisioning and lifecycle management of nodes. Managed Nodegroups use the latest [EKS-optimized AMIs](https://docs.aws.amazon.com/eks/latest/userguide/eks-optimized-ami.html). The nodes run in your AWS account, provisioned as part of an EC2 Auto Scaling group that is managed for you by Amazon EKS. This means EKS takes care of the lifecycle management and undifferentiated heavy lifting of operations such as updating nodes, handling terminations, and gracefully draining nodes to ensure that your applications stay available.
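Once the cluster is up, `kubectl get nodes --show-labels` should show the bootstrap labels on each node. As a sketch of what to look for (the node names and columns below are made up, not real kubectl output), counting the OnDemand nodes from that listing works like this:

```
# Simulated `kubectl get nodes --show-labels` rows (node names are made up).
nodes="ip-192-168-1-10.ec2.internal Ready <none> 5m v1.15.10 intent=control-apps,lifecycle=OnDemand
ip-192-168-2-20.ec2.internal Ready <none> 5m v1.15.10 intent=apps,lifecycle=Ec2Spot"

# Count the nodes carrying the lifecycle=OnDemand label set by the nodegroup.
count=$(printf '%s\n' "$nodes" | grep -c 'lifecycle=OnDemand')
echo "$count"
```

On a live cluster, pipe the real `kubectl get nodes --show-labels` output through the same `grep -c` filter.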

@@ -6,7 +6,7 @@ weight: 10

For this module, we need to download the [eksctl](https://eksctl.io/) binary:
```
export EKSCTL_VERSION=0.13.0
export EKSCTL_VERSION=0.16.0
curl --silent --location "https://github.com/weaveworks/eksctl/releases/download/${EKSCTL_VERSION}/eksctl_Linux_amd64.tar.gz" | tar xz -C /tmp
sudo mv -v /tmp/eksctl /usr/local/bin
```
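If eksctl is already present, it can help to compare the reported version against the pinned one before re-downloading. A minimal version comparison using `sort -V` (this helper is a sketch and not part of the workshop scripts; in a real check `installed` would come from parsing `eksctl version` output):

```
# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

required="0.16.0"
# Placeholder value; on a real machine, parse it from `eksctl version`.
installed="0.16.0"
if version_ge "$installed" "$required"; then
  echo "eksctl is recent enough"
else
  echo "please upgrade eksctl"
fi
```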
@@ -0,0 +1,32 @@
---
title: "Deploy the Metrics Server"
date: 2020-03-07T08:30:11-07:00
weight: 20
---

### Deploy the Metrics Server
Metrics Server is a cluster-wide aggregator of resource usage data. These metrics will drive the scaling behavior of the [deployments](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/). We will deploy the metrics server using `Helm` configured earlier in this workshop.

```
helm install stable/metrics-server \
--name metrics-server \
--version 2.10.0 \
--namespace metrics
```

### Confirm the Metrics API is available.

Return to the terminal in the Cloud9 Environment
```
kubectl get apiservice v1beta1.metrics.k8s.io -o yaml
```
If all is well, you should see a status message similar to the one below in the response
```
status:
conditions:
- lastTransitionTime: 2018-10-15T15:13:13Z
message: all checks passed
reason: Passed
status: "True"
type: Available
```
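Instead of eyeballing the YAML, the Available condition can be asserted with `grep`. The status block below is canned to match the sample output above; against a live cluster you would pipe the `kubectl get apiservice` command into the same checks:

```
# Canned status block matching the sample kubectl output above.
status='status:
  conditions:
  - lastTransitionTime: 2018-10-15T15:13:13Z
    message: all checks passed
    reason: Passed
    status: "True"
    type: Available'

# Succeed only when an Available condition reports "True".
if printf '%s\n' "$status" | grep -q 'type: Available' \
   && printf '%s\n' "$status" | grep -q 'status: "True"'; then
  echo "metrics API available"
fi
```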
@@ -1,17 +1,21 @@
---
title: "Install Kube-ops-view"
date: 2018-08-07T08:30:11-07:00
weight: 20
weight: 30
---

Now that we have helm installed, we are ready to use the stable helm catalog and install tools
that will help with understanding our cluster setup in a visual way. The first of those tools that we are going to install is [Kube-ops-view](https://github.com/hjacobs/kube-ops-view) from Henning Jacobs.
that will help with understanding our cluster setup in a visual way. The first of those tools that we are going to install is [Kube-ops-view](https://github.com/hjacobs/kube-ops-view) from **[Henning Jacobs](https://github.com/hjacobs)**.

The following lines update the stable helm repository and then install kube-ops-view using a LoadBalancer Service type, creating an RBAC (Role-Based Access Control) entry so the read-only service account can read node and pod information from the cluster.

```
helm repo update
helm install stable/kube-ops-view --name kube-ops-view --set service.type=LoadBalancer --set rbac.create=True
helm install stable/kube-ops-view \
--name kube-ops-view \
--set service.type=LoadBalancer \
--set nodeSelector.intent=control-apps \
--set rbac.create=True
```

The command above installs kube-ops-view, exposing it through a Service of type LoadBalancer.
@@ -56,4 +60,45 @@ Spend some time checking the state and properties of your EKS cluster.

![kube-ops-view](/images/using_ec2_spot_instances_with_eks/helm/kube-ops-view-legend.png)

### Exercise

{{% notice info %}}
In this exercise we will install and explore another great tool, **[kube-resource-report](https://github.com/hjacobs/kube-resource-report)** by [Henning Jacobs](https://github.com/hjacobs). Kube-resource-report generates a utilization report and associates a cost to namespaces, applications and pods. Kube-resource-report also takes Spot savings into consideration: it uses the average of the [describe-spot-price-history](https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeSpotPriceHistory.html) values reported over the last three days to estimate the cost of EC2 Spot nodes.
{{% /notice %}}

* Now that we have a way to visualize our cluster with kube-ops-view, how about visualizing the estimated cost used by our cluster namespaces, applications and pods? Follow the instructions described at **[kube-resource-report](https://github.com/hjacobs/kube-resource-report)** github repository and figure out how to deploy the helm chart with the right required parameters. (links to hints: [1](https://helm.sh/docs/chart_template_guide/values_files/), [2](https://github.com/hjacobs/kube-resource-report/blob/master/chart/kube-resource-report/values.yaml), [3](https://github.com/hjacobs/kube-resource-report/blob/master/chart/kube-resource-report/templates/deployment.yaml), [4](https://github.com/hjacobs/kube-resource-report/blob/master/chart/kube-resource-report/templates/service.yaml))


{{%expand "Show me the solution" %}}
Execute the following command in your Cloud9 terminal
```
git clone https://github.com/hjacobs/kube-resource-report
helm install --name kube-resource-report \
--set service.type=LoadBalancer \
--set service.port=80 \
--set container.port=8080 \
--set rbac.create=true \
--set nodeSelector.intent=control-apps \
kube-resource-report/chart/kube-resource-report
```

This installs the chart with the right setup and ports, and identifies the label *aws.amazon.com/spot*, which, when defined on a resource, is used to extract the EC2 Spot price history associated with it. Note that during the rest of the workshop we will still use the `lifecycle` label to identify Spot Instances, and only use `aws.amazon.com/spot` to showcase the integration with kube-resource-report.

Once installed, you should be able to get the Service/LoadBalancer URL using:
```
kubectl get svc kube-resource-report | tail -n 1 | awk '{ print "Kube-resource-report URL = http://"$4 }'
```
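The one-liner above builds the URL from the fourth column (EXTERNAL-IP) of the Service listing. A canned example of the same pipeline (the hostname below is made up):

```
# Canned `kubectl get svc kube-resource-report` output (hostname is made up).
svc="NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-resource-report LoadBalancer 10.100.0.42 a1b2c3-1234.us-east-1.elb.amazonaws.com 80:30080/TCP 4m"

# Take the last row and build the URL from the EXTERNAL-IP column.
url=$(printf '%s\n' "$svc" | tail -n 1 | awk '{ print "http://"$4 }')
echo "Kube-resource-report URL = $url"
```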
{{% notice note %}}
You may need to refresh the page and clear your browser cache. Creating and setting up the LoadBalancer may take a few minutes; it usually takes around four minutes before kube-resource-report becomes reachable.
{{% /notice %}}

Kube-resource-report keeps track of the cluster over time. Furthermore, it identifies EC2 Spot nodes and uses the [AWS Spot price history API](https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeSpotPriceHistory.html) to calculate the current price of the EC2 Spot Instances and attribute the correct cost.

![kube-resource-reports](/images/using_ec2_spot_instances_with_eks/helm/kube-resource-reports.png)

{{% /expand %}}

The result of this exercise should show kube-resource-report's estimated cost of your cluster, as well as the utilization of its different components.
Expand Up @@ -44,6 +44,10 @@ If you are using CloudFormation or Terraform for your production EKS clusters (o

**Question**: Is the new instance type in the ASG different from the one that existed before? If so, it means that the previous instance type was selected as the cheapest option, while the new instance type was selected from the capacity pool that is least likely to be interrupted.

{{% notice info %}}
While in this exercise we manually modified the allocation strategy through the EC2 Management Console, starting from version 0.16.0 eksctl supports the **capacity-optimized** allocation strategy. In fact, you can go back to the original nodegroups that we created and see how this was implemented with eksctl.
{{% /notice %}}
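As a sketch of how that looks in an eksctl cluster config file (the nodegroup name, sizes, and instance types below are illustrative; the fields follow eksctl's `instancesDistribution` schema):

```
nodeGroups:
  - name: ng-spot
    minSize: 0
    maxSize: 5
    instancesDistribution:
      instanceTypes: ["m5.large", "m4.large", "m5a.large"]
      onDemandBaseCapacity: 0
      onDemandPercentageAboveBaseCapacity: 0
      spotAllocationStrategy: "capacity-optimized"
```

With `spotAllocationStrategy: "capacity-optimized"`, the underlying Auto Scaling group picks Spot capacity from the pools least likely to be interrupted rather than the cheapest ones.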


#### Increasing resilience: Automatic Jenkins job retries
We can configure Jenkins to automatically retry running jobs in case of failures. One possible failure would be when a Jenkins agent is running a job on an EC2 Spot Instance that is going to be terminated due to an EC2 Spot Interruption, when EC2 needs the capacity back. To configure automatic retries for jobs, follow these steps:
@@ -31,9 +31,11 @@ nodeGroups:
labels:
lifecycle: Ec2Spot
intent: jenkins-agents
aws.amazon.com/spot: "true"
tags:
k8s.io/cluster-autoscaler/node-template/label/lifecycle: Ec2Spot
k8s.io/cluster-autoscaler/node-template/label/intent: jenkins-agents
k8s.io/cluster-autoscaler/node-template/label/aws.amazon.com/spot: "true"
EoF
```

@@ -17,7 +17,10 @@ aws --version

1. Update to the latest version:
```
pip install --user --upgrade awscli
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
. ~/.bash_profile
```

1. Confirm you have a newer version:
@@ -15,7 +15,7 @@ for the download links.](https://docs.aws.amazon.com/eks/latest/userguide/gettin

#### Install kubectl
```
export KUBECTL_VERSION=v1.14.9
export KUBECTL_VERSION=v1.15.10
sudo curl --silent --location -o /usr/local/bin/kubectl https://storage.googleapis.com/kubernetes-release/release/${KUBECTL_VERSION}/bin/linux/amd64/kubectl
sudo chmod +x /usr/local/bin/kubectl
```
@@ -126,7 +126,7 @@ spec:
nodeSelector:
intent: control-apps
containers:
- image: k8s.gcr.io/cluster-autoscaler:v1.14.6
- image: k8s.gcr.io/cluster-autoscaler:v1.15.5
name: cluster-autoscaler
resources:
limits:
@@ -51,7 +51,8 @@ Using the file browser on the left, open **cluster-autoscaler/cluster_autoscaler

* **Save** the file

This command contains all of the configuration for the Cluster Autoscaler. Each `--nodes` entry defines a new Autoscaling Group mapping to a Cluster Autoscaler nodegroup. Cluster Autoscaler will consider the nodegroups selected when scaling the cluster. The syntax of the line is minimum nodes **(1)**, max nodes **(5)** and **ASG Name**.
This command contains all of the configuration for the Cluster Autoscaler. Each `--nodes` entry defines a new Autoscaling Group mapping to a Cluster Autoscaler nodegroup. Cluster Autoscaler will consider the nodegroups selected when scaling the cluster. The syntax of the line is minimum nodes **(0)**, max nodes **(5)** and **ASG Name**.
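Each `--nodes` entry therefore packs three colon-separated fields. Splitting one such entry shows them (the ASG name below is made up for illustration):

```
# An example --nodes value in min:max:ASG-name form (ASG name is made up).
entry="0:5:eksctl-eksworkshop-nodegroup-spot-NodeGroup-1ABCDEF"

min=$(printf '%s' "$entry" | cut -d: -f1)
max=$(printf '%s' "$entry" | cut -d: -f2)
asg=$(printf '%s' "$entry" | cut -d: -f3)
echo "min=$min max=$max asg=$asg"
```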


### Deploy the Cluster Autoscaler

29 changes: 0 additions & 29 deletions content/using_ec2_spot_instances_with_eks/scaling/deploy_hpa.md
@@ -7,39 +7,10 @@ weight: 40
So far we have scaled the number of replicas manually. We have also built an understanding of how Cluster Autoscaler scales the cluster.
In this section we will deploy the **[Horizontal Pod Autoscaler (HPA)](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/)** and a rule to scale our application once it reaches a CPU threshold. The Horizontal Pod Autoscaler automatically scales the number of pods in a replication controller, deployment or replica set based on observed CPU or memory utilization.
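As a sketch, an HPA resource targeting the workshop's deployment could look like the following (the target name and thresholds are illustrative; the workshop's own manifest or `kubectl autoscale` invocation may differ):

```
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: monte-carlo-pi-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: monte-carlo-pi-service
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
```

The controller compares observed CPU utilization against `targetCPUUtilizationPercentage` and adjusts the replica count between `minReplicas` and `maxReplicas`.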

For HPA to evaluate metrics, we must first deploy the Metrics Server!

### Deploy the Metrics Server
Metrics Server is a cluster-wide aggregator of resource usage data. These metrics will drive the scaling behavior of the [deployments](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/). We will deploy the metrics server using `Helm` configured earlier in this workshop.

```
helm install stable/metrics-server \
--name metrics-server \
--version 2.8.3 \
--namespace metrics
```

{{% notice note %}}
Horizontal Pod Autoscaler is more versatile than just scaling on CPU and memory. There are other projects, different from the Metrics Server, that can be considered when looking to scale on the back of other metrics. For example, [prometheus-adapter](https://github.com/helm/charts/tree/master/stable/prometheus-adapter) can be used with custom metrics imported from [prometheus](https://prometheus.io/)
{{% /notice %}}

### Confirm the Metrics API is available.

Return to the terminal in the Cloud9 Environment
```
kubectl get apiservice v1beta1.metrics.k8s.io -o yaml
```
If all is well, you should see a status message similar to the one below in the response
```
status:
conditions:
- lastTransitionTime: 2018-10-15T15:13:13Z
message: all checks passed
reason: Passed
status: "True"
type: Available
```


### Create an HPA resource associated with the Monte Carlo Pi Service
