diff --git a/content/running_spark_apps_with_emr_on_spot_instances/cloud9-awscli.md b/content/running_spark_apps_with_emr_on_spot_instances/cloud9-awscli.md
index 80810716..a6427cb9 100644
--- a/content/running_spark_apps_with_emr_on_spot_instances/cloud9-awscli.md
+++ b/content/running_spark_apps_with_emr_on_spot_instances/cloud9-awscli.md
@@ -10,26 +10,32 @@ For this workshop, please ignore warnings about the version of pip being used.
 {{% /notice %}}
 
 1. Uninstall the AWS CLI 1.x by running:
-```bash
-sudo pip uninstall -y awscli
-```
+   ```bash
+   sudo pip uninstall -y awscli
+   ```
 
 1. Install the AWS CLI 2.x by running the following command:
-```
-curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
-unzip awscliv2.zip
-sudo ./aws/install
-. ~/.bash_profile
-```
+   ```
+   curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
+   unzip awscliv2.zip
+   sudo ./aws/install
+   . ~/.bash_profile
+   ```
 
 1. Confirm you have a newer version:
-```
-aws --version
-```
+   ```
+   aws --version
+   ```
 
 1. Create an SSH Key pair so you can then SSH into the EMR cluster
-```bash
-aws ec2 create-key-pair --key-name emr-workshop-key-pair --query "KeyMaterial" --output text > emr-workshop-key-pair.pem
-chmod 400 emr-workshop-key-pair.pem
-```
+   ```bash
+   aws ec2 create-key-pair --key-name emr-workshop-key-pair --query "KeyMaterial" --output text > emr-workshop-key-pair.pem
+   chmod 400 emr-workshop-key-pair.pem
+   ```
+
+1. Install jq
+
+   ```bash
+   sudo yum -y install jq
+   ```
\ No newline at end of file
diff --git a/content/running_spark_apps_with_emr_on_spot_instances/examining_cluster.md b/content/running_spark_apps_with_emr_on_spot_instances/examining_cluster.md
index 96a5be75..377a973f 100644
--- a/content/running_spark_apps_with_emr_on_spot_instances/examining_cluster.md
+++ b/content/running_spark_apps_with_emr_on_spot_instances/examining_cluster.md
@@ -19,17 +19,19 @@ To connect to the web interfaces running on our EMR cluster you need to use SSH
 First, we need to grant SSH access from the Cloud9 environment to the EMR cluster master node:
 
 1. In your EMR cluster page, in the AWS Management Console, go to the **Summary** tab
-2. Click on the ID of the security group in **Security groups for Master**
-3. Check the Security Group with the name **ElasticMapReduce-master**
-4. In the lower pane, click the **Inbound tab** and click the **Edit inbound rules**
-5. Click **Add Rule**. Under Type, select **SSH**, under Source, select **Custom**. As the Cloud9 environment and the EMR cluster are on the default VPC, introduce the CIDR of your Default VPC (e.g. 172.16.0.0/16). To check your VPC CIDR, go to the [VPC console](https://console.aws.amazon.com/vpc/home?#) and look for the CIDR of the **Default VPC**.
-6. Click **Save**
+1. Click on the ID of the security group in **Security groups for Master**
+1. Check the Security Group with the name **ElasticMapReduce-master**
+1. In the lower pane, click the **Inbound tab** and click the **Edit inbound rules**
+1. Click **Add Rule**. Under Type, select **SSH**, and under Source, select **Custom**. As the Cloud9 environment and the EMR cluster are on the default VPC, enter the CIDR of your Default VPC (e.g. 172.31.0.0/16). To check your VPC CIDR, go to the [VPC console](https://console.aws.amazon.com/vpc/home?#) and look for the CIDR of the **Default VPC**.
+1. Click **Save**
 
-At this stage, we'll be able to ssh into the EMR master node. First we will access the Ganglia web interface to look at cluster metrics:
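+If you prefer to script this step from your Cloud9 terminal, a rule like the one below can be added with the AWS CLI. This is only a sketch: `sg-xxxxxxxx` and `172.31.0.0/16` are placeholders for your **ElasticMapReduce-master** security group ID and your default VPC CIDR.
+
+```bash
+# Placeholders: use the ID of the ElasticMapReduce-master security group
+# and the CIDR block of your default VPC from the console steps above
+aws ec2 authorize-security-group-ingress \
+    --group-id sg-xxxxxxxx \
+    --protocol tcp \
+    --port 22 \
+    --cidr 172.31.0.0/16
+```
+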
+At this stage, we'll be able to ssh into the EMR master node.
+
+#### Access Ganglia web interface to look at cluster metrics
 
 1. Go to the EMR Management Console, click on your cluster, and open the **Application user interfaces** tab. You'll see the list of on-cluster application interfaces.
-2. Copy the master node DNS name from one of the interface urls, it will look like ec2.xx-xxx-xxx-xxx..compute.amazonaws.com
-3. Establish an SSH tunnel to port 80, where Ganglia is bound, executing the below command on your Cloud9 environment (update the command with your master node DNS name):
+1. Copy the master node DNS name from one of the interface URLs; it will look like ec2-xx-xxx-xxx-xxx.compute-1.amazonaws.com
+1. Establish an SSH tunnel to port 80, where Ganglia is bound, by executing the command below in your Cloud9 environment (update the command with your master node DNS name):
 
 ```
 ssh -i ~/environment/emr-workshop-key-pair.pem -N -L 8080:ec2-###-##-##-###.compute-1.amazonaws.com:80 hadoop@ec2-###-##-##-###.compute-1.amazonaws.com
@@ -44,27 +46,43 @@ At this stage, we'll be able to ssh into the EMR master node. First we will acce
 
 Are you sure you want to continue connecting (yes/no)?
 ```
 
-4. Now, on your Cloud9 environment, click on the "Preview" menu on the top and then click on "Preview Running Application". You'll see a browser window opening on the environment with an Apache test page. on the URL, append /ganglia/ to access the Ganglia Interface. The url will look like https://xxxxxx.vfs.cloud9.eu-west-1.amazonaws.com/ganglia/.
-![Cloud9-Ganglia](/images/running-emr-spark-apps-on-spot/cloud9-ganglia.png)
-5. Click on the button next to "Browser" (arrow inside a box) to open Ganglia in a dedicated browser page.Have a look around. Take notice of the heatmap (**Server Load Distribution**). Notable graphs are:
-* **Cluster CPU last hour** - this will show you the CPU utilization that our Spark application consumed on our EMR cluster. you should see that utilization varied and reached around 70%.
-* **Cluster Memory last hour** - this will show you how much memory we started the cluster with, and how much Spark actually consumed.
+1. Now, in your Cloud9 environment, click on the "Preview" menu at the top and then click on "Preview Running Application". You'll see a browser window opening in the Cloud9 environment with an Apache test page. In the URL, append /ganglia/ to access the Ganglia interface.
+
+   The URL will look like https://xxxxxx.vfs.cloud9.eu-west-1.amazonaws.com/ganglia/
+
+   ![Cloud9-Ganglia](/images/running-emr-spark-apps-on-spot/cloud9-ganglia.png)
+
+1. Click on the button next to "Browser" (arrow inside a box) to open Ganglia in a dedicated browser page. Have a look around. Take notice of the heatmap (**Server Load Distribution**). Notable graphs are:
+
+    * **Cluster CPU last hour** - this will show you the CPU utilization that our Spark application consumed on our EMR cluster. You should see that utilization varied and reached around 70%.
+    * **Cluster Memory last hour** - this will show you how much memory we started the cluster with, and how much Spark actually consumed.
+
+   ![Cloud9-Ganglia-Pop-Out](/images/running-emr-spark-apps-on-spot/cloud9-ganglia-pop-out.png)
+
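+Instead of copying the master node DNS name from the console each time, you can also look it up from your Cloud9 terminal with the AWS CLI and jq (which you installed earlier in the workshop). The sketch below assumes the workshop cluster is the only active cluster in the account; its output is the value to use in place of ec2-###-##-##-###.compute-1.amazonaws.com in the SSH tunnel commands in this and the following sections.
+
+```bash
+# Assumes the workshop cluster is the only cluster in an active state
+CLUSTER_ID=$(aws emr list-clusters --active --query "Clusters[0].Id" --output text)
+# Print the master node public DNS name used in the SSH tunnel commands
+aws emr describe-cluster --cluster-id "$CLUSTER_ID" | jq -r '.Cluster.MasterPublicDnsName'
+```
+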
+#### Access Resource Manager application user interface
+
+{{% notice note %}}
+If Cloud9 fails to show a preview of the application in the pop-up window, click on the pop-out link shown in the screenshot below to access the application user interfaces in a new browser window:
+![Cloud9-Preview-Fail](/images/running-emr-spark-apps-on-spot/cloud9-previewfail.png)
+{{% /notice %}}
 
-Now, let's look at the **Resource Manager** application user interface.
-1. Go to the Cloud9 terminal where you have established the ssh connection, and press ctrl+c to close it.
-1. Create an SSH tunnel to the cluster master node on port 8088 by running this command (update the command with your master node DNS name):
+1. Go to the Cloud9 terminal where you have established the SSH tunnel, and press Ctrl+C to close the tunnel.
+1. Create a new SSH tunnel to the cluster master node on port 8088 by running this command (update the command with your master node DNS name):
 
 ```
 ssh -i ~/environment/emr-workshop-key-pair.pem -N -L 8080:ec2-###-##-##-###.compute-1.amazonaws.com:8088 hadoop@ec2-###-##-##-###.compute-1.amazonaws.com
 ```
 
 1. Now, on your browser, update the URL to "/cluster" i.e. https://xxxxxx.vfs.cloud9.eu-west-1.amazonaws.com/cluster
+
+   ![Cloud9-Resource-Manager](/images/running-emr-spark-apps-on-spot/cloud9-resource-manager.png)
+
 1. On the left pane, click Nodes, and in the node table, you should see the number of containers that each node ran.
 
-Now, let's look at **Spark History Server** application user interface:
+#### Access Spark History Server application user interface
 
-1. Go to the Cloud9 terminal where you have established the ssh connection, and press ctrl+c to close it.
+1. Go to the Cloud9 terminal where you have established the SSH tunnel, and press Ctrl+C to close the tunnel.
 1. Create an SSH tunnel to the cluster master node on port 18080 by running this command (update the command with your master node DNS name):
 
 ```
@@ -72,14 +90,19 @@ Now, let's look at **Spark History Server** application user interface:
 ```
 
 1. Now, on your browser, go to the base URL of your Cloud9 environment i.e. https://xxxxxx.vfs.cloud9.eu-west-1.amazonaws.com/
+
+   ![Cloud9-Spark-History-Server](/images/running-emr-spark-apps-on-spot/cloud9-spark-history-server.png)
+
 1. Click on the App ID in the table (where App Name = Amazon reviews word count) and go to the **Executors** tab
 1. You can again see the number of executors that are running in your EMR cluster under the **Executors table**
 
 ### Using CloudWatch Metrics
+
 EMR emits several useful metrics to CloudWatch metrics. You can use the AWS Management Console to look at the metrics in two ways:
+
 1. In the EMR console, under the **Monitoring** tab in your cluster's page
-2. By browsing to the CloudWatch service, and under Metrics, searching for the name of your cluster (copy it from the EMR Management Console) and clicking **EMR > Job Flow Metrics**
+1. By browsing to the CloudWatch service, and under Metrics, searching for the name of your cluster (copy it from the EMR Management Console) and clicking **EMR > Job Flow Metrics**
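+
+You can also pull the same metrics from the command line. The sketch below retrieves the **ContainerAllocated** metric for the last hour with the AWS CLI; the cluster ID is a placeholder to replace with your own.
+
+```bash
+# EMR publishes cluster metrics in the AWS/ElasticMapReduce namespace,
+# keyed by the JobFlowId dimension. j-XXXXXXXXXXXXX is a placeholder for your cluster ID.
+aws cloudwatch get-metric-statistics \
+    --namespace AWS/ElasticMapReduce \
+    --metric-name ContainerAllocated \
+    --dimensions Name=JobFlowId,Value=j-XXXXXXXXXXXXX \
+    --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
+    --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
+    --period 300 \
+    --statistics Average
+```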
 
 {{% notice note %}}
 The metrics will take a few minutes to populate.
@@ -91,7 +114,7 @@ Some notable metrics:
 * **ContainerAllocated** - this represents the number of containers that are running on core and task fleets. These would be the Spark executors and the Spark Driver.
 * **Memory allocated MB** & **Memory available MB** - you can graph them both to see how much memory the cluster is actually consuming for the wordcount Spark application out of the memory that the instances have.
 
-#### Managed Scaling in Action
+### Managed Scaling in Action
 
 You enabled managed cluster scaling and EMR scaled out to 64 Spot units in the task fleet. EMR could have launched either 16 * xlarge (running one executor per xlarge) or 8 * 2xlarge instances (running 2 executors per 2xlarge) or 4 * 4xlarge instances (running 4 executors per 4xlarge), so the task fleet provides 16 executors / containers to the cluster. The core fleet launched one xlarge instance and it will run one executor / container, so in total 17 executors / containers will be running in the cluster.
 
diff --git a/static/images/running-emr-spark-apps-on-spot/cloud9-ganglia-pop-out.png b/static/images/running-emr-spark-apps-on-spot/cloud9-ganglia-pop-out.png
new file mode 100644
index 00000000..287599ad
Binary files /dev/null and b/static/images/running-emr-spark-apps-on-spot/cloud9-ganglia-pop-out.png differ
diff --git a/static/images/running-emr-spark-apps-on-spot/cloud9-previewfail.png b/static/images/running-emr-spark-apps-on-spot/cloud9-previewfail.png
new file mode 100644
index 00000000..58748302
Binary files /dev/null and b/static/images/running-emr-spark-apps-on-spot/cloud9-previewfail.png differ
diff --git a/static/images/running-emr-spark-apps-on-spot/cloud9-resource-manager.png b/static/images/running-emr-spark-apps-on-spot/cloud9-resource-manager.png
new file mode 100644
index 00000000..22c5bf92
Binary files /dev/null and b/static/images/running-emr-spark-apps-on-spot/cloud9-resource-manager.png differ
diff --git a/static/images/running-emr-spark-apps-on-spot/cloud9-spark-history-server.png b/static/images/running-emr-spark-apps-on-spot/cloud9-spark-history-server.png
new file mode 100644
index 00000000..e6fa906c
Binary files /dev/null and b/static/images/running-emr-spark-apps-on-spot/cloud9-spark-history-server.png differ