chore(*): updates screenshots and install instructions
sacksi28 committed Aug 1, 2022
1 parent b271450 commit 017c42e
Showing 15 changed files with 155 additions and 118 deletions.
title: "EMR Instance Fleets"
weight: 30
---
The **[instance fleet](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-instance-fleet.html)** configuration for Amazon EMR clusters lets you select a wide variety of provisioning options for Amazon EC2 instances, and helps you develop a flexible and elastic resourcing strategy for each node type in your cluster. You can have only one instance fleet per master, core, and task node type. In an instance fleet configuration, you specify a *target capacity* for **[On-Demand Instances](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-on-demand-instances.html)** and **[Spot Instances](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-instances.html)** within each fleet.

When adopting Spot Instances into your workload, it is recommended to be flexible in your selection of instance types across families, generations, sizes, and Availability Zones. With higher instance type diversification, Amazon EMR has more capacity pools to allocate capacity from, and chooses the Spot Instances that are least likely to be interrupted.

With EMR instance fleets, you specify target capacities for On-Demand Instances and Spot Instances within each fleet (Master, Core, Task). When the cluster launches, Amazon EMR provisions instances until the targets are fulfilled. You can specify up to five EC2 instance types per fleet for Amazon EMR to use when fulfilling the targets. You can also select multiple subnets for different Availability Zones.
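As a sketch, these targets map onto the `InstanceFleetConfig` structure accepted by the EMR RunJobFlow API (for example via boto3's `run_job_flow`); the instance types, weights, and capacities below are illustrative only, not a recommendation:

```python
# Illustrative instance fleet definition in the shape of the Amazon EMR
# RunJobFlow API (InstanceFleetConfig). All values are example values.
task_fleet = {
    "Name": "TaskFleet",
    "InstanceFleetType": "TASK",
    "TargetOnDemandCapacity": 0,
    "TargetSpotCapacity": 32,
    # Up to five instance types in the default fleet configuration.
    "InstanceTypeConfigs": [
        {"InstanceType": "r4.xlarge",  "WeightedCapacity": 4},
        {"InstanceType": "r5.xlarge",  "WeightedCapacity": 4},
        {"InstanceType": "r4.2xlarge", "WeightedCapacity": 8},
        {"InstanceType": "r5.2xlarge", "WeightedCapacity": 8},
        {"InstanceType": "r5.4xlarge", "WeightedCapacity": 16},
    ],
}

# EMR provisions instances until their weighted capacities reach the target:
# 8 * r4.xlarge (8 * 4 units) is one mix that satisfies the Spot target.
assert 8 * 4 == task_fleet["TargetSpotCapacity"]
assert len(task_fleet["InstanceTypeConfigs"]) <= 5
print("Spot target:", task_fleet["TargetSpotCapacity"])
```
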

{{% notice note %}}
EMR allows a maximum of five instance types per fleet when you use the default Amazon EMR cluster instance fleet configuration. However, you can specify a maximum of 15 instance types per fleet when you create a cluster using the EMR console and enable the **[allocation strategy](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-instance-fleet.html#emr-instance-fleet-allocation-strategy)** option. This limit is further increased to 30 when you create a cluster using the AWS CLI or Amazon EMR API and enable the allocation strategy option.
{{% /notice %}}

As an enhancement to the default EMR instance fleets cluster configuration, the allocation strategy feature is available in EMR version **5.12.1 and later**. With allocation strategy, an EMR instance fleet provisions:

* On-Demand Instances using a lowest-price strategy, which launches the lowest-priced On-Demand Instances first.
* Spot Instances using a **[capacity-optimized](https://aws.amazon.com/about-aws/whats-new/2020/06/amazon-emr-uses-real-time-capacity-insights-to-provision-spot-instances-to-lower-cost-and-interruption/)** allocation strategy, which allocates instances from the most-available Spot Instance pools and lowers the chance of interruptions. This allocation strategy is appropriate for workloads that have a higher cost of interruption, such as persistent EMR clusters running Apache Spark, Apache Hive, and Presto.

These options do not exist within the default EMR configuration option, uniform instance groups, hence we recommend using instance fleets with Spot Instances.
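In the EMR API, enabling allocation strategy corresponds to the `LaunchSpecifications` block of the fleet configuration; a sketch with illustrative timeout values:

```python
# Illustrative LaunchSpecifications for an instance fleet with allocation
# strategy enabled (EMR InstanceFleetConfig shape; values are examples).
launch_specifications = {
    "OnDemandSpecification": {
        # Launch the lowest-priced On-Demand Instances first.
        "AllocationStrategy": "lowest-price",
    },
    "SpotSpecification": {
        # Allocate from the most-available Spot capacity pools.
        "AllocationStrategy": "capacity-optimized",
        "TimeoutDurationMinutes": 10,
        "TimeoutAction": "SWITCH_TO_ON_DEMAND",
    },
}
print(launch_specifications["SpotSpecification"]["AllocationStrategy"])
```
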

{{% notice info %}}
**[Click here](https://aws.amazon.com/blogs/big-data/optimizing-amazon-emr-for-resilience-and-cost-with-capacity-optimized-spot-instances/)** for an in-depth blog post about capacity-optimized allocation strategy for Amazon EMR instance fleets.
Now, let's look at the **Spark History Server** application user interface:
### Using CloudWatch Metrics
EMR emits several useful metrics to CloudWatch metrics. You can use the AWS Management Console to look at the metrics in two ways:
1. In the EMR console, under the **Monitoring** tab in your cluster's page
2. By browsing to the CloudWatch service, and under Metrics, searching for the name of your cluster (copy it from the EMR Management Console) and clicking **EMR > Job Flow Metrics**
{{% notice note %}}
The metrics will take a few minutes to populate.
{{% /notice %}}
Some notable metrics:
* **AppsRunning** - you should see 1 since we only submitted one step to the cluster.
* **ContainerAllocated** - this represents the number of containers that are running on the core and task fleets. These would be the Spark executors and the Spark driver.
* **MemoryAllocatedMB** & **MemoryAvailableMB** - you can graph them both to see how much memory the cluster is actually consuming for the wordcount Spark application, out of the memory that the instances have.
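If you prefer to pull these metrics programmatically, the same data is available through the CloudWatch `GetMetricStatistics` API; a sketch of the request parameters you could pass to boto3's `cloudwatch.get_metric_statistics(**params)` (the `JobFlowId` value is a placeholder for your cluster ID):

```python
from datetime import datetime, timedelta, timezone

# Illustrative parameters for CloudWatch GetMetricStatistics. EMR publishes
# cluster metrics in the AWS/ElasticMapReduce namespace, keyed by JobFlowId.
now = datetime.now(timezone.utc)
params = {
    "Namespace": "AWS/ElasticMapReduce",
    "MetricName": "ContainerAllocated",
    "Dimensions": [{"Name": "JobFlowId", "Value": "j-XXXXXXXXXXXXX"}],  # placeholder
    "StartTime": now - timedelta(hours=1),
    "EndTime": now,
    "Period": 300,              # EMR emits metrics at five-minute granularity
    "Statistics": ["Average"],
}
print(params["Namespace"], params["MetricName"])
```
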
#### Managed Scaling in Action
You enabled managed cluster scaling and EMR scaled out to 64 Spot units in the task fleet. EMR could have launched either 16 * xlarge (running one executor per xlarge), 8 * 2xlarge instances (running 2 executors per 2xlarge), or 4 * 4xlarge instances (running 4 executors per 4xlarge), so the task fleet provides 16 executors / containers to the cluster. The core fleet launched one xlarge instance and it will run one executor / container, so in total 17 executors / containers will be running in the cluster.
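The arithmetic above can be sanity-checked in a few lines (a sketch assuming one executor per 4 weighted units, i.e. per xlarge-equivalent, as in this workshop):

```python
# Sanity check of the scale-out math: 64 Spot units in the task fleet,
# one executor per 4 weighted units (one executor per xlarge-equivalent).
spot_units = 64
units_per_executor = 4
task_executors = spot_units // units_per_executor   # 16 task executors
total_executors = task_executors + 1                # + 1 on the core fleet
print(task_executors, total_executors)              # -> 16 17
```
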
1. In your EMR cluster page, in the AWS Management Console, go to the **Steps** tab.
1. Go to the **Events** tab to see the scaling events.
![scalingEvent](/images/running-emr-spark-apps-on-spot/emrsparkscalingevent.png)
EMR Managed cluster scaling constantly monitors [key metrics](https://docs.aws.amazon.com/emr/latest/ManagementGuide/managed-scaling-metrics.html) and automatically increases or decreases the number of instances or units in your cluster based on workload.
{{%expand "Question: Did you see more than 17 containers in CloudWatch Metrics and in YARN ResourceManager? if so, do you know why? Click to expand the answer" %}}
Your Spark application was configured to run in Cluster mode, meaning that the **Spark driver is running on the Core node**. Since it is counted as a container, this adds a container to our count, but it is not an executor.
{{% /expand%}}
title: "Fleet configuration options"
weight: 85
---

While our cluster is starting (7-8 minutes) and the step is running (4-10 minutes depending on the instance types that were selected) let's take the time to look at some of the EMR instance fleets configurations we didn't dive into when starting the cluster.

![fleetconfigs](/images/running-emr-spark-apps-on-spot/emrinstancefleets-core.png)

#### Maximum Spot price
Since Nov 2017, Amazon EC2 Spot Instances [changed the pricing model and bidding was eliminated](https://aws.amazon.com/blogs/compute/new-amazon-ec2-spot-pricing/). We have an optional "Max-price" field for our Spot requests, which would limit how much we're willing to pay for the instance. It is recommended to leave this value at 100% of the On-Demand price, in order to avoid limiting our instance diversification. We are going to pay the Spot market price regardless of the maximum price that we specify, and setting a higher max price does not increase the chance of getting Spot capacity nor does it decrease the chance of getting your Spot Instances interrupted when EC2 needs the capacity back. You can see the current Spot price in the AWS Management Console under **EC2 >> Spot Requests >> Pricing History**.
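For reference, in the EMR API this console field corresponds to `BidPriceAsPercentageOfOnDemandPrice` on each instance type config; a sketch with illustrative values:

```python
# Illustrative InstanceTypeConfig entry: the console "Max-price" field maps
# to BidPriceAsPercentageOfOnDemandPrice in the EMR API. Leaving it at 100
# (the default) avoids limiting diversification; you pay the current Spot
# market price regardless of this cap.
instance_type_config = {
    "InstanceType": "r4.xlarge",                    # example instance type
    "WeightedCapacity": 4,
    "BidPriceAsPercentageOfOnDemandPrice": 100.0,
}
print(instance_type_config["BidPriceAsPercentageOfOnDemandPrice"])
```
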

#### Each instance counts as X units
This configuration allows us to give each instance type in our diversified fleet a weight that will count towards our **Target capacity**. By default, this weight is configured as the number of YARN VCores that the instance type has (this typically equates to the number of EC2 vCPUs). For example, r4.xlarge has a default weight of 4 units and you have set the **Target capacity** for the task fleet to 32. If EMR picks r4.xlarge as the most available Spot Instance, then 8 * r4.xlarge instances will be launched by EMR in the task fleet.

If your Spark application is memory driven, you can set the **Target capacity** to the total amount of memory you want the cluster to run with. You can change the **Each instance counts as** field to the total memory of the instance, leaving aside some memory for the operating system and other processes. For example, for the r4.xlarge you can set **Each instance counts as** 18 units and set **Target capacity** to 144. If EMR picks r4.xlarge as the most available Spot Instance, then 8 * r4.xlarge instances will be launched by EMR in the task fleet. Since your executor size is 18 GB, one executor will run on each r4.xlarge instance.
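The weighted-capacity arithmetic in both examples can be checked directly (a sketch; `instances_needed` is a helper name of ours, not an EMR API):

```python
import math

# Weighted-capacity arithmetic for the two examples above. EMR launches
# instances of a type until the target capacity is met or exceeded.
def instances_needed(target_capacity, weight_per_instance):
    return math.ceil(target_capacity / weight_per_instance)

# vCPU-based weighting: r4.xlarge counts as 4 units, target of 32 units.
assert instances_needed(32, 4) == 8
# Memory-based weighting: r4.xlarge counts as 18 units, target of 144 units.
assert instances_needed(144, 18) == 8
print(instances_needed(32, 4), instances_needed(144, 18))  # -> 8 8
```
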

#### Provisioning timeout
You can specify that after a set number of minutes, if EMR is unable to provision your selected Spot Instances due to lack of capacity, it will either launch On-Demand Instances instead or terminate the cluster. This can be decided according to the business requirements of the cluster or Spark application: if it is SLA-bound and should complete even at On-Demand price, then the "Switch to On-Demand" option might be suitable. However, make sure you diversify the instance types in the fleet when looking to use Spot Instances, before you fail over to On-Demand.
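Both behaviours are expressed in the fleet's `SpotSpecification`; a sketch (the durations are illustrative):

```python
# Illustrative SpotSpecification blocks showing the two provisioning timeout
# behaviours (EMR InstanceFleetConfig shape; durations are examples).
sla_bound_spec = {
    "TimeoutDurationMinutes": 15,
    "TimeoutAction": "SWITCH_TO_ON_DEMAND",  # must finish, even at On-Demand price
}
cost_bound_spec = {
    "TimeoutDurationMinutes": 15,
    "TimeoutAction": "TERMINATE_CLUSTER",    # give up rather than pay On-Demand
}
print(sla_bound_spec["TimeoutAction"], cost_bound_spec["TimeoutAction"])
```
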