EMR Capacity Optimized and Instance Selector CLI #130

Merged
merged 26 commits into from
Jan 8, 2021

@jagpk jagpk commented Dec 15, 2020

Issue #, if available:
resolves #87
resolves #98
resolves #86

Description of changes:

  1. Added a Cloud9 IDE to install and run the amazon-ec2-instance-selector CLI.
  2. Added EMR allocation strategies: Capacity Optimized for Spot and Lowest Price for On-Demand.
  3. Updated "Selecting instance types" to use the AWS Instance Selector CLI.
  4. Updated the task fleet instance sizes to range from xlarge to 4xlarge, and the task fleet diversification to 15 instance types.
  5. Changed the task fleet Spot units from 40 to 32, so that the target is a multiple of 16 vCores (supports 4xlarge).
  6. Updated the screenshots for the Master fleet (c4 and c5 are not available in all AZs) and for the example task fleets with the 12 instance types.
  7. Removed I3 instances, as they are on average 30% more expensive than R family instances.
  8. Updated the 'fleet config' page and the 'examine cluster' page to show examples with 32 task nodes.
  9. Updated the Savings Summary to reflect the latest console UI changes.
  10. Updated the cliexport image to reflect the changes related to an EMR cluster with allocation strategies.
  11. Updated the cleanup steps to delete the Cloud9 environment.
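
The reasoning behind item 5 can be checked with a short sketch. It assumes, as a simplification that is not stated in this PR, that each instance counts toward the Spot units by its vCPU count (4, 8 and 16 for xlarge, 2xlarge and 4xlarge in the R family) and that a fleet is filled with a single size. The helper name `instances_needed` is hypothetical.

```python
# vCPU counts for the sizes named above (R family: xlarge=4, 2xlarge=8, 4xlarge=16).
# Assumption: an instance's weight in the fleet equals its vCPU count.
VCPUS = {"xlarge": 4, "2xlarge": 8, "4xlarge": 16}


def instances_needed(target_units: int, weight: int) -> tuple[int, int]:
    """Return (instance count, units of overshoot) to cover target_units with one size."""
    count = -(-target_units // weight)  # ceiling division
    return count, count * weight - target_units


for target in (40, 32):
    for size, weight in VCPUS.items():
        count, overshoot = instances_needed(target, weight)
        print(f"target={target} size={size}: {count} instances, overshoot={overshoot}")
```

Under these assumptions a target of 40 cannot be met exactly with 4xlarge instances (three of them give 48 units), while a target of 32 is met exactly by every size in the range.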

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@jagpk jagpk changed the title from "EMR Instance" to "EMR Capacity Optimized and Instance Selector CLI" Dec 15, 2020
While a cluster is running, if Amazon EC2 reclaims a Spot Instance or if an instance fails, Amazon EMR tries to replace the instance with any of the instance types that you specify in your fleet. This makes it easier to regain capacity in case some of the instances get interrupted by EC2 when it needs the Spot capacity back.\
These options do not exist within the default EMR configuration option "Uniform Instance Groups", hence we will be using EMR Instance Fleets only.

As an enhancement to the default EMR instance fleets cluster configuration, the allocation strategy feature is available in EMR version **5.12.1 and later**. With allocation strategy -/

This is not well formatted: all the '-/' and '/' are causing confusion... I'm assuming you were looking for a bullet point list, but you got this:

image

* Instances which have vCPU to Memory ratio of 1:8, same as R Instance family\
* Instances with CPU Architecture x86_64 and no GPU Instances\
* Instances that belong to current generation\
* Instances types that are not supported by EMR such as R5N, R5ad and R5b. Enhanced z, I and D Instance families, which are priced higher than R family. So basically, adding a deny list with the regular expression `.*n.*|.*ad.*|.*b.*|^[zid].*`.

Again, the bullet list is not well formatted. Please read the section on bullet-point lists at https://guides.github.com/features/mastering-markdown/ and amend accordingly. The `\` characters are causing issues with the line breaks in some cases.

image

In this case, this section should be changed to:

For the purpose of this workshop, we will select instances based on the criteria below:

 * Instances that have minimum 4 vCPUs and maximum 16 vCPUs
 * Instances which have vCPU to Memory ratio of 1:8, same as R Instance family
 * Instances with CPU Architecture x86_64 and no GPU Instances
 * Instances that belong to current generation
 * Excluding instance types that are not supported by EMR, such as R5n, R5ad and R5b, and excluding the enhanced z, I and D instance families, which are priced higher than the R family. In practice, this means adding a deny list with the regular expression `.*n.*|.*ad.*|.*b.*|^[zid].*`.
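
To see what the deny-list expression in the last bullet excludes, here is a minimal Python sketch. It assumes the pattern is applied as an unanchored search against lowercase instance type names such as `r5n.xlarge`. The candidate list is a hypothetical sample, not the output of the Instance Selector CLI, and `is_denied` is an illustrative helper.

```python
import re

# Deny-list pattern quoted from the criteria above.
DENY_LIST = re.compile(r".*n.*|.*ad.*|.*b.*|^[zid].*")


def is_denied(instance_type: str) -> bool:
    """Return True if the instance type name matches the deny list."""
    # Assumption: the pattern is searched anywhere in the lowercase type name.
    return DENY_LIST.search(instance_type) is not None


# Hypothetical sample of instance type names, for illustration only.
candidates = [
    "r5.xlarge", "r5a.2xlarge", "r5d.4xlarge",    # expected to pass
    "r5n.xlarge", "r5ad.xlarge", "r5b.xlarge",    # contain "n", "ad", "b"
    "z1d.xlarge", "i3.xlarge", "d2.xlarge",       # start with z, i or d
]

allowed = [t for t in candidates if not is_denied(t)]
print(allowed)  # ['r5.xlarge', 'r5a.2xlarge', 'r5d.4xlarge']
```

Note that `r5d` types pass this pattern, because `.*ad.*` only matches the literal sequence "ad" and `^[zid].*` only looks at the first character of the name.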

@ruecarlo ruecarlo left a comment

Awesome change... So much easier to read ! all well formatted :)

@ruecarlo ruecarlo merged commit 77479c0 into awslabs:master Jan 8, 2021