diff --git a/content/ecs-spot-capacity-providers/architecture.md b/content/ecs-spot-capacity-providers/architecture.md index 078b22db..d2c5758f 100644 --- a/content/ecs-spot-capacity-providers/architecture.md +++ b/content/ecs-spot-capacity-providers/architecture.md @@ -24,4 +24,4 @@ Here is the overall architecture of what you will be building throughout this wo #### Here is a diagram of the resulting architecture: -![Overall Architecture](/images/ecs-spot-capacity-providers/architecture1.png) \ No newline at end of file +![Overall Architecture](/images/ecs-spot-capacity-providers/architecture1.png) diff --git a/content/ecs-spot-capacity-providers/module-1/_index.md b/content/ecs-spot-capacity-providers/module-1/_index.md index 5e67931b..fc0150a9 100644 --- a/content/ecs-spot-capacity-providers/module-1/_index.md +++ b/content/ecs-spot-capacity-providers/module-1/_index.md @@ -41,4 +41,5 @@ The strategy sets FARGATE as the default capacity provider. That means if there Click _***Update Cluster***_ on the top right corner to see default Capacity Provider Strategy. As shown base=1 is set for FARGATE Capacity Provider. -![ECS Cluster](/images/ecs-spot-capacity-providers/c2.png) \ No newline at end of file +![ECS Cluster](/images/ecs-spot-capacity-providers/c2.png) + diff --git a/content/ecs-spot-capacity-providers/module-1/attach_iam_role.md b/content/ecs-spot-capacity-providers/module-1/attach_iam_role.md new file mode 100644 index 00000000..369eb703 --- /dev/null +++ b/content/ecs-spot-capacity-providers/module-1/attach_iam_role.md @@ -0,0 +1,54 @@ +--- +title: "Attach the IAM role to your Workspace" +chapter: true +weight: 20 +--- + +### Attach the IAM role to your Workspace + +- Follow [this deep link to find your Cloud9 EC2 instance](https://console.aws.amazon.com/ec2/v2/home?#Instances:tag:Name=aws-cloud9-.*workshop.*;sort=desc:launchTime) +- Select the instance, then choose **Actions / Instance Settings / Attach/Replace IAM Role** +- Choose **ecsspotworkshop-admin** from the **IAM Role** drop down, and select **Apply** +- Return to your workspace and click the sprocket, or launch a new tab to open the Preferences tab +- Select **AWS SETTINGS** +- Turn off **AWS managed temporary credentials** +- Close the Preferences tab +- To ensure temporary credentials aren't already in place we will also remove any existing credentials file: +``` +rm -vf ${HOME}/.aws/credentials +``` + +- We should configure our aws cli with our current region as default: +``` +export ACCOUNT\_ID=$(aws sts get-caller-identity --output text --query Account) + export AWS\_REGION=$(curl -s 169.254.169.254/latest/dynamic/instance-identity/document | jq -r '.region') + echo "export ACCOUNT\_ID=${ACCOUNT\_ID}" \>\> ~/.bash\_profile + echo "export AWS\_REGION=${AWS\_REGION}" \>\> ~/.bash\_profile + aws configure set default.region ${AWS\_REGION} + aws configure get default.region +``` + +- Use the [GetCallerIdentity](https://docs.aws.amazon.com/cli/latest/reference/sts/get-caller-identity.html) CLI command to validate that the Cloud9 IDE is using the correct IAM role. 
+ +``` +aws sts get-caller-identity +``` +- The output assumed-role name should contain: + +``` +{ + "Account": "000474600478", + "UserId": "AROAQAHCJ2QPAONSHPAXY:i-01ad7d6cd53ba8945", + "Arn": "arn:aws:sts::000474600478:assumed-role/ecsspotworkshop-admin/i-01ad7d6cd53ba8945" + } +``` + + + +#### Attach IAM role to your Cloud 9 Environment: +![Cloud 9 Environment](/images/ecs-spot-capacity-providers/iam_attach_role.png) + + + + +Now you are done with Module-1, Proceed to Module-2 of this workshop. \ No newline at end of file diff --git a/content/ecs-spot-capacity-providers/module-1/cli_setup.md b/content/ecs-spot-capacity-providers/module-1/cli_setup.md new file mode 100644 index 00000000..14c7f11d --- /dev/null +++ b/content/ecs-spot-capacity-providers/module-1/cli_setup.md @@ -0,0 +1,40 @@ +--- +title: "CLI Setup" +chapter: true +weight: 10 +--- + +### Setup AWS CLI and other tools + +Make sure the latest version of the AWS CLI is installed by running: + +``` +sudo pip install -U awscli +``` +Install dependencies for use in the workshop by running: + +``` +sudo yum -y install jq gettext +``` + +### Clone the GitHub repo + +In order to execute the steps in the workshop, you'll need to clone the workshop GitHub repo. + +In the Cloud9 IDE terminal, run the following command: + +(remove before prod0 +``` +git clone https://github.com/jalawala/ec2-spot-workshops.git +``` +``` +git clone https://github.com/awslabs/ec2-spot-workshops.git +``` +Change into the workshop directory: + +``` +cd ec2-spot-workshops/workshops/ecs-spot-capacity-providers +``` + +Feel free to browse around. You can also browse the directory structure in the **Environment** tab on the left, and even edit files directly there by double clicking on them. + diff --git a/content/ecs-spot-capacity-providers/module-1/create_iam_role.md b/content/ecs-spot-capacity-providers/module-1/create_iam_role.md new file mode 100644 index 00000000..45614b72 --- /dev/null +++ b/content/ecs-spot-capacity-providers/module-1/create_iam_role.md @@ -0,0 +1,43 @@ +--- +title: "Create IAM roles for your Workspace" +chapter: true +weight: 15 +--- + +### Create IAM roles for your Workspace + + +In order to work with ECS from our workstation, we will need the appropriate permissions for our developer workstation instance. + +1. Go to the [IAM Console](https://console.aws.amazon.com/iam/home), **Roles** > **Create New Role > AWS Service > EC2.** We will later assign this role to our workstation instance. +1. Click **Next: Permissions.** Confirm that **AdministratorAccess** is checked (TBD: to restrict needed permissions only) +1. Click **Next:Tags** Take the defaults, and click **Next: Review** to review. +1. Enter **ecsspotworkshop-admin** for the Name, and click **Create role**. + +
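+If you prefer to script this step, a roughly equivalent AWS CLI sketch is shown below. It is illustrative only: the role and instance profile names are assumed to match the console steps above, and it assumes no role with this name exists yet.
+
+```
+# Sketch of a CLI alternative to the console steps above (illustrative)
+cat > ecsspotworkshop-admin-trust.json <<'EOF'
+{
+  "Version": "2012-10-17",
+  "Statement": [
+    { "Effect": "Allow", "Principal": { "Service": "ec2.amazonaws.com" }, "Action": "sts:AssumeRole" }
+  ]
+}
+EOF
+aws iam create-role --role-name ecsspotworkshop-admin --assume-role-policy-document file://ecsspotworkshop-admin-trust.json
+aws iam attach-role-policy --role-name ecsspotworkshop-admin --policy-arn arn:aws:iam::aws:policy/AdministratorAccess
+aws iam create-instance-profile --instance-profile-name ecsspotworkshop-admin
+aws iam add-role-to-instance-profile --instance-profile-name ecsspotworkshop-admin --role-name ecsspotworkshop-admin
+```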
When creating a capacity provider, you can optionally enable managed scaling. When managed scaling is enabled, Amazon ECS manages the scale-in and scale-out actions of the Auto Scaling group. On your behalf, Amazon ECS creates an AWS Auto Scaling scaling plan with a target tracking scaling policy based on the target capacity value you specify. Amazon ECS then associates this scaling plan with your Auto Scaling group. For each of the capacity providers with managed scaling enabled, an Amazon ECS managed CloudWatch metric with the prefix AWS/ECS/ManagedScaling is created along with two CloudWatch alarms. The CloudWatch metrics and alarms are used to monitor the container instance capacity in your Auto Scaling groups and will trigger the Auto Scaling group to scale in and scale out as needed. -
\ No newline at end of file + + diff --git a/content/ecs-spot-capacity-providers/module-2/fargate_service.md b/content/ecs-spot-capacity-providers/module-2/fargate_service.md new file mode 100644 index 00000000..f5d1ea10 --- /dev/null +++ b/content/ecs-spot-capacity-providers/module-2/fargate_service.md @@ -0,0 +1,123 @@ +--- +title: "Create ECS Fargate Services" +chapter: true +weight: 2 +--- + +### Create ECS Fargate Services + +In this section, we will create 3 ECS Services to show how tasks can be deployed across FARGATE and FARGATE\_SPOT capacity providers(CP). + + +| **Service Name** | **No. of Tasks** | **No. of Tasks on FARGATE CP** | **Number of Tasks on FARGATE_SPOT CP** | **CP Strategy** | +| --- | --- |--- |--- |--- | +| **webapp-fargate-service-fargate** | 2 | 2 | 0 | FARGATE Capacity Provider weight =1 | +| **fargate-service-fargate-spot** | 2 | 0 | 2 | FARGATE_SPOT Capacity Provider weight =1 | +| **fargate-service-fargate-mix** | 4 | 3 | 1 | FARGATE Capacity Provider weight =3 FARGATE_SPOT Capacity Provider weight =1 | + +We will be creating the ECS services and tasks in the new VPC we created in the Module-1 i.e. **Quick-Start-VPC** + +So let's first find the default public subnets created in this VPC. You can find the subnet IDs in this VPC in the AWS console as shown below, under the VPC service. + +Alternatively you can run the below command to list all the subnets in this VPC + +``` +aws ec2 describe-subnets --filters "Name=tag:aws:cloudformation:stack-name,Values=Quick-Start-VPC" | jq -r '.Subnets[].SubnetId' +``` + +The output from above command looks like below. + +``` +subnet-07a877ee28959daa3 +subnet-015fc3e06f653980a +subnet-003ef0ebc04c89b2d +``` + +Run the below command to set a variable for the subnets. We will use this variable in other steps. + +``` +export PUBLIC\_SUBNET\_LIST="subnet-07a877ee28959daa3,subnet-015fc3e06f653980a,subnet-003ef0ebc04c89b2d" +``` + +Now let's find the default security group created in this VPC. You can find it in the AWS console as follows. + +You can also run the below command to list the default security group in this VPC + +``` +export VPC\_ID=$(aws ec2 describe-vpcs --filters "Name=tag:aws:cloudformation:stack-name,Values=Quick-Start-VPC" | jq -r '.Vpcs[0].VpcId') + echo "Quick Start VPC ID is $VPC\_ID" +``` + +The output from above command looks like below. + +``` +Quick Start VPC ID is vpc-0a2fc4f24cbfab696 +``` + +``` +export SECURITY\_GROUP=$( aws ec2 describe-security-groups --filters "Name=vpc-id,Values=$VPC\_ID" | jq -r '.SecurityGroups[0].GroupId') + echo "Default Security group is $SECURITY\_GROUP" +``` + +The output from above command looks like below. + +``` +Default Security group is sg-03ccfca80f9fddf4d +``` + +Deploy the service **webapp-fargate-service-fargate** using below command. + +``` +aws ecs create-service \ + --capacity-provider-strategy capacityProvider=FARGATE,weight=1 \ + --cluster EcsSpotWorkshopCluster \ + --service-name webapp-fargate-service-fargate \ + --task-definition webapp-fargate-task:1 \ + --desired-count 2 \ + --region $AWS\_REGION \ + --network-configuration "awsvpcConfiguration={subnets=[$PUBLIC\_SUBNET\_LIST],securityGroups=[$SECURITY\_GROUP],assignPublicIp="ENABLED"}" + +``` +Note the capacity provider strategy used for this service. It provides weight only for FARGATE capacity provider. This strategy overrides the default capacity provider strategy which is set to FARGATE capacity provider. 
+ +That means ECS schedules all of the tasks (2 in this case) in service on the FARGATE Capacity providers. + +Deploy the service **webapp-fargate-service-fargate-spot** using below command + +``` +aws ecs create-service \ + --capacity-provider-strategy capacityProvider=FARGATE\_SPOT,weight=1 \ + --cluster EcsSpotWorkshopCluster \ + --service-name webapp-fargate-service-fargate-spot \ + --task-definition webapp-fargate-task:1 \ + --desired-count 2\ + --region $AWS\_REGION \ + --network-configuration "awsvpcConfiguration={subnets=[$PUBLIC\_SUBNET\_LIST],securityGroups=[$SECURITY\_GROUP],assignPublicIp="ENABLED"}" +``` + +Note the capacity provider strategy used for this service. It provides weight only for FARGATE\_SPOT capacity provider. This strategy overrides the default capacity provider strategy which is set to FARGATE capacity provider. + +That means ECS schedules all of the tasks (2 in this case) in service on the FARGATE\_SPOT Capacity providers. + +Deploy the service **webapp-fargate-service-fargate-mix** using below command + +``` +aws ecs create-service \ + --capacity-provider-strategy capacityProvider=FARGATE,weight=3 capacityProvider=FARGATE\_SPOT,weight=1 \ + --cluster EcsSpotWorkshopCluster \ + --service-name webapp-fargate-service-fargate-mix \ + --task-definition webapp-fargate-task:1 \ + --desired-count 4\ + --region $AWS\_REGION \ + --network-configuration "awsvpcConfiguration={subnets=[$PUBLIC\_SUBNET\_LIST],securityGroups=[$SECURITY\_GROUP],assignPublicIp="ENABLED"}" +``` + +Note the capacity provider strategy used for this service. It provides a weight of 3 to FARGATE and 1 to FARGATE\_SPOT capacity provider. This strategy overrides the default capacity provider strategy which is set to FARGATE capacity provider. + +That means ECS schedules splits the total tasks (4 in this case) in 3:1 ratio between FARGATE and FARGATE\_SPOT Capacity providers. + +But how do you verify if ECS really scheduled the tasks in this way? + +Click on the service **webapp-fargate-service-fargate-mix** and select Tasks Tab + +Click on each task and note the Capacity Provider \ No newline at end of file diff --git a/content/ecs-spot-capacity-providers/module-2/fargate_task.md b/content/ecs-spot-capacity-providers/module-2/fargate_task.md new file mode 100644 index 00000000..6cc986ff --- /dev/null +++ b/content/ecs-spot-capacity-providers/module-2/fargate_task.md @@ -0,0 +1,19 @@ +--- +title: "Create ECS Fargate Tasks" +chapter: true +weight: 1 +--- + +### Create ECS Fargate Tasks + +In this section, we will create a task definition for for tasks to be launched on the Fargate Capacity Providers. + +Run the below command to create the task definition + +``` +aws ecs register-task-definition --cli-input-json file://webapp-fargate-task.jso +``` + +The task will look like this in console + +PIC: TBD \ No newline at end of file diff --git a/content/ecs-spot-capacity-providers/module-3/_index.md b/content/ecs-spot-capacity-providers/module-3/_index.md index 2fb15a23..1fd249af 100644 --- a/content/ecs-spot-capacity-providers/module-3/_index.md +++ b/content/ecs-spot-capacity-providers/module-3/_index.md @@ -83,3 +83,4 @@ To ensure that your containers exit gracefully before the task stops, the follow is using. Specifying a stopTimeout value gives you time between the moment the task state change event is received and the point at which the container is forcefully stopped. • The **SIGTERM** signal must be received from within the container to perform any cleanup actions. 
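+To make the second point concrete, below is a minimal, illustrative entrypoint script that traps **SIGTERM** and runs cleanup before exiting. This is a sketch and is not part of the workshop's container image; the cleanup logic is application specific.
+
+```
+#!/bin/bash
+# Illustrative only: trap SIGTERM so cleanup can run within the stopTimeout window
+cleanup() {
+  echo "SIGTERM received, running cleanup before exit..."
+  # application-specific cleanup goes here (drain connections, flush state, etc.)
+  exit 0
+}
+trap cleanup SIGTERM
+echo "application started"
+# keep the shell responsive to signals while idling
+while true; do sleep 1 & wait $!; done
+```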
+
diff --git a/content/ecs-spot-capacity-providers/module-3/asg_with_od.md b/content/ecs-spot-capacity-providers/module-3/asg_with_od.md
new file mode 100644
index 00000000..60ee8777
--- /dev/null
+++ b/content/ecs-spot-capacity-providers/module-3/asg_with_od.md
@@ -0,0 +1,95 @@
+---
+title: "Creating an Auto Scaling Group (ASG) with EC2 On-Demand Instances"
+chapter: true
+weight: 5
+---
+
+
+### Creating an Auto Scaling Group (ASG) with EC2 On-Demand Instances
+
+In this section, we will create an EC2 Auto Scaling Group for On-Demand Instances using the Launch Template created in the previous section.
+
+Copy the file **templates/asg.json** for the Auto Scaling group configuration.
+
+```
+cp templates/asg.json .
+```
+
+Take a moment to look at asg.json to see the various configuration options in the ASG.
+
+Set the following variables for the Auto Scaling configuration:
+
+```
+export ASG_NAME=ecs-spot-workshop-asg-od
+export OD_PERCENTAGE=100 # Note that ASG will have 100% On-Demand, 0% Spot
+```
+
+Set the ARN of the auto scaling service-linked role (created in Module-1).
+
+Note: Replace **\<AWS Account ID\>** with your AWS account ID.
+
+```
+export SERVICE_ROLE_ARN="arn:aws:iam::<AWS Account ID>:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling_ec2"
+```
+
+Run the following command to substitute the template with actual values from the global variables:
+
+```
+sed -i -e "s#%ASG_NAME%#$ASG_NAME#g" -e "s#%OD_PERCENTAGE%#$OD_PERCENTAGE#g" -e "s#%PUBLIC_SUBNET_LIST%#$PUBLIC_SUBNET_LIST#g" -e "s#%SERVICE_ROLE_ARN%#$SERVICE_ROLE_ARN#g" asg.json
+```
+
+Create the ASG for the On-Demand Instances:
+
+```
+aws autoscaling create-auto-scaling-group --cli-input-json file://asg.json
+ASG_ARN=$(aws autoscaling describe-auto-scaling-groups --auto-scaling-group-names $ASG_NAME | jq -r '.AutoScalingGroups[0].AutoScalingGroupARN')
+echo "$ASG_NAME ARN=$ASG_ARN"
+```
+
+The output of the above command looks like below:
+
+```
+ecs-spot-workshop-asg-od ARN=arn:aws:autoscaling:us-east-1:000474600478:autoScalingGroup:1e9de503-068e-4d78-8272-82536fc92d14:autoScalingGroupName/ecs-spot-workshop-asg-od
+```
+
+The above Auto Scaling group looks like below in the console.
+
+### Creating a Capacity Provider using the above ASG with EC2 On-Demand Instances
+
+A capacity provider is used in association with a cluster to determine the infrastructure that a task runs on.
+
+Copy the template file **templates/ecs-capacityprovider.json** to the current directory.
+
+```
+cp -Rfp templates/ecs-capacityprovider.json .
+```
+
+Take a moment to look at ecs-capacityprovider.json to see the various configuration options in the Capacity Provider. When creating a capacity provider, you specify the following details:
+
+1. An Auto Scaling group Amazon Resource Name (ARN)
+1. Whether or not to enable managed scaling. When managed scaling is enabled, Amazon ECS manages the scale-in and scale-out actions of the Auto Scaling group through the use of AWS Auto Scaling scaling plans. When managed scaling is disabled, you manage your Auto Scaling groups yourself.
+1. Whether or not to enable managed termination protection. When managed termination protection is enabled, Amazon ECS prevents Amazon EC2 instances that contain tasks and that are in an Auto Scaling group from being terminated during a scale-in action.
+Managed termination protection can only be enabled if the Auto Scaling group also has instance protection from scale-in enabled.
+
+Run the below commands to replace the configuration values in the template file.
+
+```
+export CAPACITY_PROVIDER_NAME=od-capacity_provider
+sed -i -e "s#%CAPACITY_PROVIDER_NAME%#$CAPACITY_PROVIDER_NAME#g" -e "s#%ASG_ARN%#$ASG_ARN#g" ecs-capacityprovider.json
+```
+
+Create the On-Demand Capacity Provider with the Auto Scaling group:
+
+```
+CAPACITY_PROVIDER_ARN=$(aws ecs create-capacity-provider --cli-input-json file://ecs-capacityprovider.json | jq -r '.capacityProvider.capacityProviderArn')
+echo "$CAPACITY_PROVIDER_NAME ARN=$CAPACITY_PROVIDER_ARN"
+```
+
+The output of the above command looks like:
+
+```
+od-capacity_provider ARN=arn:aws:ecs:us-east-1:000474600478:capacity-provider/od-capacity_provider
+```
diff --git a/content/ecs-spot-capacity-providers/module-3/asg_with_spot.md b/content/ecs-spot-capacity-providers/module-3/asg_with_spot.md
new file mode 100644
index 00000000..616696d1
--- /dev/null
+++ b/content/ecs-spot-capacity-providers/module-3/asg_with_spot.md
@@ -0,0 +1,60 @@
+---
+title: "Creating an Auto Scaling Group (ASG) with EC2 Spot Instances"
+chapter: true
+weight: 10
+---
+
+### Creating an Auto Scaling Group (ASG) with EC2 Spot Instances
+
+In this section, let us create an Auto Scaling group for EC2 Spot Instances using the Launch Template created in the previous section. This procedure is exactly the same as in the previous section, except for a few changes specific to the configuration for EC2 Spot Instances.
+
+One of the best practices for the adoption of Spot Instances is to diversify the EC2 instances across different instance types and Availability Zones, in order to tap into multiple spare capacity pools. An ASG currently supports up to 20 different instance type configurations for diversification.
+
+One key criterion for choosing the instance size is the ECS task vCPU and memory limit configuration. For example, look at the ECS task resource limits in the file **webapp-ec2-task.json**:
+
+_**"cpu": "256", "memory": "1024"**_
+
+This means the vCPU:memory ratio is **1:4** (256 CPU units is 0.25 vCPU and 1024 MB is 1 GB, i.e. 1 vCPU for every 4 GB). So it would be ideal to select instance sizes that satisfy this ratio. The lowest instance size that satisfies it is large. Note that there may be bigger sizes that also satisfy the 1:4 ratio, but in this workshop let's select the smallest size, i.e. large, to illustrate EC2 Spot diversification.
+
+So let's select different instance types and generations of the large size using the Instance Types view within the AWS EC2 console, as follows.
+
+We selected 10 different instance types, as seen in asg.json, but you can configure up to 20 different instance types in an Auto Scaling group.
+
+Copy the file **templates/asg.json** for the Auto Scaling group configuration.
+
+```
+cp templates/asg.json .
+```
+
+Take a moment to look at asg.json to see the various configuration options in the ASG.
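+As an optional check, you can print the Spot-related settings from the template with jq. The field names below assume asg.json follows the standard create-auto-scaling-group CLI input shape; adjust them if the workshop template uses different keys.
+
+```
+jq '.MixedInstancesPolicy.InstancesDistribution, .MixedInstancesPolicy.LaunchTemplate.Overrides' asg.json
+```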
+
+Set the following variables for the Auto Scaling configuration:
+
+```
+export ASG_NAME=ecs-spot-workshop-asg-spot
+export OD_PERCENTAGE=0 # Note that ASG will have 0% On-Demand, 100% Spot
+```
+
+Run the following commands to substitute the template with actual values from the global variables:
+
+```
+sed -i -e "s#%ASG_NAME%#$ASG_NAME#g" -e "s#%OD_PERCENTAGE%#$OD_PERCENTAGE#g" -e "s#%PUBLIC_SUBNET_LIST%#$PUBLIC_SUBNET_LIST#g" -e "s#%SERVICE_ROLE_ARN%#$SERVICE_ROLE_ARN#g" asg.json
+```
+
+Create the Auto Scaling group for EC2 Spot:
+
+```
+aws autoscaling create-auto-scaling-group --cli-input-json file://asg.json
+ASG_ARN=$(aws autoscaling describe-auto-scaling-groups --auto-scaling-group-names $ASG_NAME | jq -r '.AutoScalingGroups[0].AutoScalingGroupARN')
+echo "$ASG_NAME ARN=$ASG_ARN"
+```
+
+The output for the above command looks like this:
+
+```
+ecs-spot-workshop-asg-spot ARN=arn:aws:autoscaling:us-east-1:000474600478:autoScalingGroup:dd7a67e0-4df0-4cda-98d7-7e13c36dec5b:autoScalingGroupName/ecs-spot-workshop-asg-spot
+```
+
+The above Auto Scaling group looks like below in the console.
+
+Image: TBD
\ No newline at end of file
diff --git a/content/ecs-spot-capacity-providers/module-3/cp_with_ec2spot.md b/content/ecs-spot-capacity-providers/module-3/cp_with_ec2spot.md
new file mode 100644
index 00000000..8218e3cb
--- /dev/null
+++ b/content/ecs-spot-capacity-providers/module-3/cp_with_ec2spot.md
@@ -0,0 +1,56 @@
+---
+title: "Creating a Capacity Provider using ASG with EC2 Spot instances"
+chapter: true
+weight: 15
+---
+
+### Creating a Capacity Provider using ASG with EC2 Spot instances
+
+A capacity provider is used in association with a cluster to determine the infrastructure that a task runs on.
+
+Copy the template file **templates/ecs-capacityprovider.json** to the current directory.
+
+```
+cp -Rfp templates/ecs-capacityprovider.json .
+```
+
+Run the following commands to substitute the template with actual values from the global variables:
+
+```
+export CAPACITY_PROVIDER_NAME=ec2spot-capacity_provider
+sed -i -e "s#%CAPACITY_PROVIDER_NAME%#$CAPACITY_PROVIDER_NAME#g" -e "s#%ASG_ARN%#$ASG_ARN#g" ecs-capacityprovider.json
+```
+
+Create the EC2 Spot Capacity Provider with the Auto Scaling group:
+
+```
+CAPACITY_PROVIDER_ARN=$(aws ecs create-capacity-provider --cli-input-json file://ecs-capacityprovider.json | jq -r '.capacityProvider.capacityProviderArn')
+echo "$CAPACITY_PROVIDER_NAME ARN=$CAPACITY_PROVIDER_ARN"
+```
+
+The output of the above command looks like:
+
+```
+ec2spot-capacity_provider ARN=arn:aws:ecs:us-east-1:000474600478:capacity-provider/ec2spot-capacity_provider
+```
+
+### Update ECS Cluster with Auto Scaling Capacity Providers
+
+So far we created two Auto Scaling Capacity Providers. Now let's update our existing ECS Cluster with these Capacity Providers.
+
+Run the following command to update the ECS cluster:
+
+```
+aws ecs put-cluster-capacity-providers \
+ --cluster EcsSpotWorkshopCluster \
+ --capacity-providers FARGATE FARGATE_SPOT od-capacity_provider ec2spot-capacity_provider \
+ --default-capacity-provider-strategy capacityProvider=od-capacity_provider,base=1,weight=1 \
+ --region $AWS_REGION
+```
+
+The ECS cluster should now contain 4 Capacity Providers: 2 from Auto Scaling groups (1 for On-Demand and 1 for Spot), 1 for FARGATE and 1 for FARGATE_SPOT.
+
+Also note the default capacity provider strategy used in the above command. It sets base=1 and weight=1 for the On-Demand Auto Scaling group Capacity Provider. This overrides the previous default capacity provider strategy, which was set to the FARGATE capacity provider.
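+You can also confirm this from the command line; the following prints the cluster's registered capacity providers and its default strategy:
+
+```
+aws ecs describe-clusters --clusters EcsSpotWorkshopCluster --region $AWS_REGION | jq '.clusters[0].capacityProviders, .clusters[0].defaultCapacityProviderStrategy'
+```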
+ +Click on the **Update Cluster** on the top right corner to see default Capacity Provider Strategy. As shown base=1 is set for OD Capacity Provider. + +That means if there is no capacity provider strategy specified during the deploying Tasks/Services, ECS by default chooses the OD Capacity Provider to launch them. + +Click on Cancel as we don't want to change the default strategy for now. diff --git a/content/ecs-spot-capacity-providers/module-3/create_ec2_launch_template.md b/content/ecs-spot-capacity-providers/module-3/create_ec2_launch_template.md new file mode 100644 index 00000000..a32c5fc7 --- /dev/null +++ b/content/ecs-spot-capacity-providers/module-3/create_ec2_launch_template.md @@ -0,0 +1,96 @@ +--- +title: "Create an EC2 launch template" +chapter: true +weight: 1 +--- + +### Create an EC2 launch template + +EC2 Launch Templates reduce the number of steps required to create an instance by capturing all launch parameters within one resource. + +You can create a launch template that contains the configuration information to launch an instance. Launch templates enable you to store launch parameters so that you do not have to specify them every time you launch an instance. For example, a launch template can contain the ECS optimized AMI, instance type, User data section, Instance Profile / Role and network settings that you typically use to launch instances. When you launch an instance using the Amazon EC2 console, an AWS SDK, or a command line tool, you can specify the launch template to use. Instance user data required to bootstrap the instance into the ECS Cluster. + +You will create a launch template to specify configuration parameters for launching instances in this workshop. + +Copy the template file **templates/user-data.txt** to the current directory, + +``` +cp templates/user-data.txt . + +``` + +Take a moment to look at the user data script to see the bootstrapping actions that is performing. Also notice ECS auto draining is enabled in the configuration + +``` +echo "ECS_ENABLE_SPOT_INSTANCE_DRAINING=true" >> /etc/ecs/ecs.config +``` + +Set the following variables for the resources be used in creating the launch template in this workshop. + +Set the ARN of the IAM role **ecslabinstanceprofile** created in Module-1 + +Get your AWS account id with below command. This is needed in the next step. + +``` +echo $ACCOUNT_ID +``` + +Note: Replace the **AWS Acount ID** with your AWS account in the below command. + +``` +export IAM_INSTANT_PROFILE_ARN=arn:aws:iam::$ACCOUNT_ID :instance-profile/ecslabinstanceprofile +``` + +It is recommended to use the latest ECS Optimized AMI which contains the ECS container agent. This is used to join the ECS cluster/ + +``` +export AMI_ID=$(aws ssm get-parameters --names /aws/service/ecs/optimized-ami/amazon-linux-2/recommended | jq -r 'last(.Parameters[]).Value' | jq -r '.image\id') + echo "Latest ECS Optimized Amazon AMI\ID is $AMI_ID" +``` + +The output from above command looks like below. + +``` +Latest ECS Optimized Amazon AMI_ID is ami-07a63940735aebd38 +``` + +copy the template file **templates/launch-template-data.json** to the current directory, + +``` +cp templates/launch-template-data.json . +``` + +Run the following commands to substitute the template with actual values from the variables. 
+ +``` +sed -i -e "s#%instanceProfile%#$IAM_INSTANT_PROFILE_ARN#g" -e "s#%instanceSecurityGroup%#$SECURITY_GROUP#g" -e "s#%ami-id%#$AMI_ID#g" -e "s#%UserData%#$(cat user-data.txt | base64 --wrap=0)#g" launch-template-data.json +``` + + +Now let is create the launch template + +``` +LAUCH_TEMPLATE_ID=$(aws ec2 create-launch-template --launch-template-name ecs-spot-workshop-lt --version-description 1 --launch-template-data file://launch-template-data.json | jq -r '.LaunchTemplate.LaunchTemplateId') + echo "Amazon LAUCH_TEMPLATE_ID is $LAUCH_TEMPLATE_ID" +``` + + +The output from above command looks like this, you can also view this Launch Template in the Console. + +``` +Amazon LAUCH_TEMPLATE_ID is lt-023e2e52afc51d7ed +``` + + +Verify that the contents of the launch template are correct: + +``` +aws ec2 describe-launch-template-versions --launch-template-name ecs-spot-workshop-lt +``` + + +Verify that the contents of the launch template user data are correct: + +``` +aws ec2 describe-launch-template-versions --launch-template-name ecs-spot-workshop-lt--output json | jq -r '.LaunchTemplateVersions[].LaunchTemplateData.UserData' | base64 --decode +``` \ No newline at end of file diff --git a/content/ecs-spot-capacity-providers/module-3/ecs_ec2_service.md b/content/ecs-spot-capacity-providers/module-3/ecs_ec2_service.md new file mode 100644 index 00000000..32d844e4 --- /dev/null +++ b/content/ecs-spot-capacity-providers/module-3/ecs_ec2_service.md @@ -0,0 +1,86 @@ +--- +title: "Create ECS EC2 Services" +chapter: true +weight: 25 +--- + +### Create ECS EC2 Services + +In this section, we will create 3 ECS Services to show how tasks can be deployed across On-demand and EC2 Spot based Auto Scaling Capacity providers. + +| **Service Name** | **Number of Tasks** | **Number of Tasks on On-demand ASG Capacity Provider** | **Number of Tasks on EC2 Spot ASG Capacity Provider** | **Capacity Provider Strategy** | +| --- | --- | --- | --- | --- | +| **webapp-ec2-service-od** | 2 | 2 | 0 | OD Capacity Provider weight =1 | +| **webapp-ec2-service-spot** | 2 | 0 | 2 | Spot Capacity Provider weight =1 | +| **webapp-ec2-service-mix** | 6 | 2 | 4 | OD Capacity Provider weight =1 Spot Capacity Provider weight =3 | + +Deploy the service **webapp-ec2-service-od** using below command. + +``` +aws ecs create-service \ + --capacity-provider-strategy capacityProvider=od-capacity_provider,weight=1 \ + --cluster EcsSpotWorkshopCluster\ + --service-name webapp-ec2-service-od\ + --task-definition webapp-ec2-task:1 \ + --desired-count 2\ + --region $AWS_REGION +``` + +Note the capacity provider strategy used for this service. It provides weight only for On-demand based ASG capacity provider. This strategy overrides the default capacity provider strategy which is set to On-demand ASG capacity provider. + +That means ECS schedules all of the tasks (2 in this case) in service on the On-demand ASG Capacity providers. + +Note this ASG does not have any instances launched since the desired capacity is set to Zero. Since this ECS service deployment needs 2 tasks to be placed in the ECS cluster, it triggers CloudWatch alarms for Cluster Capacity. Based on the specified weight for Capacity Providers, the OD Capacity Provider in this case (i.e. corresponding Auto scaling group) scales 2 instances to schedule 2 tasks for this service. + +Notice the change in the desired capacity in the On-demand Auto Scaling Group + +Deploy the service **webapp-ec2-service-spot** using below command. 
+ +``` +aws ecs create-service \ + --capacity-provider-strategy capacityProvider=ec2spot-capacity_provider,weight=1 \ + --cluster EcsSpotWorkshopCluster\ + --service-name webapp-ec2-service-spot\ + --task-definition webapp-ec2-task:1 \ + --desired-count 2\ + --region $AWS_REGION +``` +Note the capacity provider strategy used for this service. It provides weight only for EC2 Spot based ASG capacity provider. This strategy overrides the default capacity provider strategy which is set to On-demand ASG capacity provider. + +That means ECS schedules all of the tasks (2 in this case) in service on the EC2 Spot ASG Capacity providers. + +Note this ASG does not have any instances launched since the desired capacity is set to Zero. Since this ECS service deployment needs 2 tasks to be placed in the ECS cluster, it triggers Cloud watch alarms for Cluster Capacity. Based on the specified weightage for Capacity Providers, the Spot Capacity Provider in this case (i.e. corresponding Auto scaling group) scales 2 instances to schedule 2 tasks for this service. + +Notice the change in the desired capacity in the Spot Auto Scaling Group + +Deploy the service **webapp-ec2-service-mix** using below command + +``` +aws ecs create-service \ + --capacity-provider-strategy capacityProvider=od-capacity_provider,weight=1 \ + capacityProvider=ec2spot-capacity_provider,weight=3 \ + --cluster EcsSpotWorkshopCluster\ + --service-name webapp-ec2-service-mix\ + --task-definition webapp-ec2-task:1 \ + --desired-count 6\ + --region $AWS_REGION +``` +Note the capacity provider strategy used for this service. It provides weight of 1 to On-demand based ASG capacity provider and weight of 3 to EC2 Spot based ASG capacity provider. This strategy overides the default capacity provider strategy which is set to On-demand ASG capacity provider. + +That means ECS schedules splits total number of the tasks (6 in this case) in service in 1:3 ration which means 2 tasks on On-demand based ASG capacity provider and 4 tasks on EC2 Spot ASG Capacity provider. + +Note that On-demand and Spot bases ASGs already have few instances launched for the ECS services launched earlier. Since this ECS service deployment needs 6 tasks to be placed in the ECS cluster, it triggers Cloud watch alarms for Cluster Capacity. Based on the specified weight (i.e. OD=1, Spot=3) for Capacity Providers, Both OD and Spot Capacity Provider in this case (i.e. corresponding Auto scaling groups) scales additional instances in Spot and OD instances. + +Now look at the AWS console to see these 6 tasks running for this service. + +But how do we know if ECS really satisfy our tasks placement requirement of 2 tasks on OD and 4 on Spot instances? + +To check which task is placed on which instance type (OD or Spot), click on the Task Id. + +As shown task Id 0c6ca084-12a4-4469-a3b5-bbb0ad3c7bc3 is placed on OD Capacity Provider + +Check the remaining 5 tasks and check if ECS confirms to our Task Placement strategy. 
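+Rather than clicking through every task, you can also list each task in the service together with the capacity provider it landed on. The snippet below is a sketch; it assumes the service has at most 100 tasks (the limit for a single describe-tasks call).
+
+```
+TASK_ARNS=$(aws ecs list-tasks --cluster EcsSpotWorkshopCluster --service-name webapp-ec2-service-mix --region $AWS_REGION --query 'taskArns[]' --output text)
+aws ecs describe-tasks --cluster EcsSpotWorkshopCluster --tasks $TASK_ARNS --region $AWS_REGION | jq -r '.tasks[] | "\(.taskArn | split("/") | last)  \(.capacityProviderName)"'
+```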
+ +The final view of the Capacity Providers looks like below + +Image: TBD \ No newline at end of file diff --git a/content/ecs-spot-capacity-providers/module-3/tasks_for_ec2_cp.md b/content/ecs-spot-capacity-providers/module-3/tasks_for_ec2_cp.md new file mode 100644 index 00000000..faf5a0e3 --- /dev/null +++ b/content/ecs-spot-capacity-providers/module-3/tasks_for_ec2_cp.md @@ -0,0 +1,20 @@ +--- +title: "Create ECS Tasks for EC2 Capacity Providers" +chapter: true +weight: 20 +--- + +### Create ECS Tasks for EC2 Capacity Providers + +In this section, we will create a task definition for for tasks to be launched on the Auto Scaling Capacity Providers. + +Run the below command to create the task definition + +``` +aws ecs register-task-definition --cli-input-json file://webapp-ec2-task.json +WEBAPP_EC2_TASK_DEF=$(cat webapp-ec2-task.json | jq -r '.family') +``` + +The task will look like this in console + +Image: TBD \ No newline at end of file diff --git a/content/ecs-spot-capacity-providers/module-4/_index.md b/content/ecs-spot-capacity-providers/module-4/_index.md new file mode 100644 index 00000000..f92e72e2 --- /dev/null +++ b/content/ecs-spot-capacity-providers/module-4/_index.md @@ -0,0 +1,35 @@ +--- +title: "Module-4 Cluster Monitoring and Spot Interruption Handling" +chapter: true +weight: 40 +--- + +## **Module-4 Cluster Monitoring and Spot Interruption Handling** + +### ECS Cluster Monitoring using Container Insights + +Use CloudWatch Container Insights to collect, aggregate, and summarize metrics and logs from your containerized applications and microservices. Container Insights is available for Amazon Elastic Container Service, Amazon Elastic Kubernetes Service, and Kubernetes platforms on Amazon EC2. The metrics include utilization for resources such as CPU, memory, disk, and network. Container Insights also provides diagnostic information, such as container restart failures, to help you isolate issues and resolve them quickly. You can also set CloudWatch alarms on metrics that Container Insights collects + +Run the below command to enable the container insights to the existing cluster. Container Insights collects metrics at the cluster, task, and service levels. + +``` +aws ecs update-cluster-settings --cluster EcsSpotWorkshopCluster --settings name=containerInsights,value=enabled +``` + +To deploy the CloudWatch agent to collect instance-level metrics from Amazon ECS clusters that + +are hosted on EC2 instance, use a quick start setup with a default configuration, + +``` +export ClusterName=EcsSpotWorkshopCluster +export Region="$AWS_REGION" +aws cloudformation create-stack --stack-name CWAgentECS-${ClusterName}-${Region} \ + --template-body https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/ecs-task-definition-templates/deployment-mode/daemon-service/cwagent-ecs-instance-metric/cloudformation-quickstart/cwagent-ecs-instance-metric-cfn.json \ + --parameters ParameterKey=ClusterName,ParameterValue=${ClusterName} ParameterKey=CreateIAMRoles,ParameterValue=True \ + --capabilities CAPABILITY_NAMED_IAM \ + --region ${Region} +``` + +The container insigts metrics for this cluster will be available in cloud watch. + +Amazon EC2 terminates your Spot Instance when it needs the capacity back. Amazon EC2 provides a Spot Instance interruption notice, which gives the instance a two-minute warning before it is interrupted. 
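+The next section covers how this warning is delivered. As a quick illustration, from inside a Spot Instance you can query the pending interruption notice directly from instance metadata; the call returns an HTTP 404 until a notice has actually been issued.
+
+```
+# Returns HTTP 404 until EC2 has issued an interruption notice for this instance
+curl -s http://169.254.169.254/latest/meta-data/spot/instance-action
+```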
diff --git a/content/ecs-spot-capacity-providers/module-4/spot_inturruption_handling.md b/content/ecs-spot-capacity-providers/module-4/spot_inturruption_handling.md new file mode 100644 index 00000000..367aa7ee --- /dev/null +++ b/content/ecs-spot-capacity-providers/module-4/spot_inturruption_handling.md @@ -0,0 +1,41 @@ +--- +title: "Spot Interruption Handling on EC2 Spot Instances" +chapter: true +weight: 5 +--- + +### Spot Interruption Handling on EC2 Spot Instances + +When Amazon EC2 is going to interrupt your Spot Instance, the interruption notification will be available in two ways + +**Amazon CloudWatch Events** + +EC2 service emits an event two minutes prior to the actual interruption. This event can be detected by Amazon CloudWatch Events. + +**instance-action in the MetaData service (IMDS)** + +If your Spot Instance is marked to be stopped or terminated by the Spot service, the instance-action + +item is present in your instance metadata. + +### Spot Interruption Handling on ECS Fargate Spot + +When tasks using Fargate Spot capacity are stopped due to a Spot interruption, a two-minute warning + +is sent before a task is stopped. The warning is sent as a task state change event to Amazon EventBridge + +and a SIGTERM signal to the running task. When using Fargate Spot as part of a service, the service + +scheduler will receive the interruption signal and attempt to launch additional tasks on Fargate Spot if + +capacity is available. + +To ensure that your containers exit gracefully before the task stops, the following can be configured: + +• A stopTimeout value of 120 seconds or less can be specified in the container definition that the task + +is using. Specifying a stopTimeout value gives you time between the moment the task state change + +event is received and the point at which the container is forcefully stopped. + +• The SIGTERM signal must be received from within the container to perform any cleanup actions. diff --git a/content/ecs-spot-capacity-providers/modules.md b/content/ecs-spot-capacity-providers/modules.md index cb48d7cc..c69b45ff 100644 --- a/content/ecs-spot-capacity-providers/modules.md +++ b/content/ecs-spot-capacity-providers/modules.md @@ -4,7 +4,6 @@ chapter: false weight: 13 --- - This workshop has been broken down into modules. These modules are designed to be completed in sequence. If you are reading this at a live AWS event, the workshop attendants will give you a high level run down of the labs. Then it’s up to you to follow the instructions below to complete the labs. @@ -14,4 +13,5 @@ These modules are designed to be completed in sequence. If you are reading this | --- | --- | | **Module-1** | Saving costs using AWS Fargate Spot Capacity Providers | | **Module-2** | Saving costs using EC2 spot with Auto Scaling Group Capacity Providers | -| **Module-3** | Spot Interruption Handling | \ No newline at end of file +| **Module-3** | Spot Interruption Handling | + diff --git a/content/ecs-spot-capacity-providers/prerequisites/_index.md b/content/ecs-spot-capacity-providers/prerequisites/_index.md index 4915dfd1..edd222c5 100644 --- a/content/ecs-spot-capacity-providers/prerequisites/_index.md +++ b/content/ecs-spot-capacity-providers/prerequisites/_index.md @@ -25,3 +25,4 @@ The command starts after **$**. Words that are ***UPPER_ITALIC_BOLD*** indicate 1. This workshop is self-paced. The instructions will walk you through achieving the workshop’s goal using the AWS Management Console. 2. 
While the workshop provides step by step instructions, *please do take a moment to look around and understand what is happening at each step* as this will enhance your learning experience. The workshop is meant as a getting started guide, but you will learn the most by digesting each of the steps and thinking about how they would apply in your own environment and in your own organization. You can even consider experimenting with the steps to challenge yourself. + diff --git a/workshops/ecs-deep-learning-workshop/lab-1-setup/cfn-templates/ecs-deep-learning-workshop.yaml b/workshops/ecs-deep-learning-workshop/lab-1-setup/cfn-templates/ecs-deep-learning-workshop.yaml deleted file mode 100644 index e8200e51..00000000 --- a/workshops/ecs-deep-learning-workshop/lab-1-setup/cfn-templates/ecs-deep-learning-workshop.yaml +++ /dev/null @@ -1,813 +0,0 @@ ---- -AWSTemplateFormatVersion: 2010-09-09 -Description: Environment for running ECS Deep Learning Workshop -Mappings: - CidrMappings: - public-subnet-1: - CIDR: 10.0.1.0/24 - public-subnet-2: - CIDR: 10.0.2.0/24 - vpc: - CIDR: 10.0.0.0/16 - ECSAmi: - ap-northeast-1: - AMI: ami-0d5f884dada5562c6 - ap-northeast-2: - AMI: ami-0060ad36f655af38b - ap-south-1: - AMI: ami-056a07eb5b1d13734 - ap-southeast-1: - AMI: ami-065c0bd2832a70f9d - ap-southeast-2: - AMI: ami-0aa8b7a8042811ddf - ca-central-1: - AMI: ami-0d50dee936e241e7e - eu-central-1: - AMI: ami-03804565a6baf6d30 - eu-west-1: - AMI: ami-0dbcd2533bc72c3f6 - eu-west-2: - AMI: ami-005307409c5f6e76c - eu-west-3: - AMI: ami-024c0b7d07abc6526 - sa-east-1: - AMI: ami-0078e33a9103e1e58 - us-east-1: - AMI: ami-0254e5972ebcd132c - us-east-2: - AMI: ami-0a0d2004b44b9287c - us-gov-west-1: - AMI: ami-a842dcc9 - us-west-1: - AMI: ami-0de5608ca20c07aa2 - us-west-2: - AMI: ami-093381d21a4fc38d1 -Outputs: - awsRegionName: - Description: The name of the AWS Region your template was launched in - Value: - Ref: AWS::Region - cloudWatchLogsGroupName: - Description: Name of the CloudWatch Logs Group - Value: - Ref: cloudWatchLogsGroup - ecrRepositoryName: - Description: The name of the ECR Repository - Value: - Ref: ecrRepository - ecsClusterName: - Description: The name of the ECS Cluster - Value: - Ref: ecsCluster - inputBucketName: - Description: The name of the input S3 Bucket - Value: - Ref: inputBucket - outputBucketName: - Description: The name of the output S3 Bucket - Value: - Ref: outputBucket - spotFleetName: - Description: The name of the Spot Fleet - Value: - Ref: spotFleet -Parameters: - KeyName: - Description: Name of an existing EC2 KeyPair to enable SSH access to the EC2 instances - Type: AWS::EC2::KeyPair::KeyName - SourceCidr: - Default: 0.0.0.0/0 - Description: Optional - CIDR/IP range for instance ssh access - defaults to 0.0.0.0/0 - Type: String -Resources: - attachGateway: - DependsOn: - - vpc - - internetGateway - Properties: - InternetGatewayId: - Ref: internetGateway - VpcId: - Ref: vpc - Type: AWS::EC2::VPCGatewayAttachment - cloudWatchLogsGroup: - Properties: - RetentionInDays: 7 - Type: AWS::Logs::LogGroup - ecrRepository: - Type: AWS::ECR::Repository - ecsCluster: - Type: AWS::ECS::Cluster - inputBucket: - Type: AWS::S3::Bucket - internetGateway: - DependsOn: - - vpc - Type: AWS::EC2::InternetGateway - outputBucket: - Type: AWS::S3::Bucket - predictTaskDefinition: - Properties: - ContainerDefinitions: - - Command: - - DATE=`date -Iseconds` && echo "running predict_imagenet.py $IMAGEURL" && /usr/local/bin/predict_imagenet.py - $IMAGEURL | tee results && echo "results being written to 
s3://$OUTPUTBUCKET/predict_imagenet.results.$HOSTNAME.$DATE.txt" - && aws s3 cp results s3://$OUTPUTBUCKET/predict_imagenet.results.$HOSTNAME.$DATE.txt - && echo "Task complete!" - EntryPoint: - - /bin/bash - - -c - Environment: - - Name: IMAGEURL - Value: https://images-na.ssl-images-amazon.com/images/G/01/img15/pet-products/small-tiles/23695_pets_vertical_store_dogs_small_tile_8._CB312176604_.jpg - - Name: OUTPUTBUCKET - Value: - Ref: outputBucket - - Name: AWS_DEFAULT_REGION - Value: - Ref: AWS::Region - Image: - Fn::Join: - - '' - - - Ref: AWS::AccountId - - .dkr.ecr. - - Ref: AWS::Region - - .amazonaws.com/ - - Ref: ecrRepository - - :latest - LogConfiguration: - LogDriver: awslogs - Options: - awslogs-group: - Ref: cloudWatchLogsGroup - awslogs-region: - Ref: AWS::Region - awslogs-stream-prefix: predict_imagenet - Memory: '2048' - Name: ecs-deep-learning-workshop - Privileged: 'true' - Type: AWS::ECS::TaskDefinition - publicRoute: - DependsOn: - - publicRouteTable - - attachGateway - Properties: - DestinationCidrBlock: 0.0.0.0/0 - GatewayId: - Ref: internetGateway - RouteTableId: - Ref: publicRouteTable - Type: AWS::EC2::Route - publicRouteTable: - DependsOn: - - vpc - - attachGateway - Properties: - Tags: - - Key: Name - Value: Public Route Table - VpcId: - Ref: vpc - Type: AWS::EC2::RouteTable - publicSubnet1: - DependsOn: attachGateway - Properties: - AvailabilityZone: - Fn::Select: - - 0 - - Fn::GetAZs: - Ref: AWS::Region - CidrBlock: - Fn::FindInMap: - - CidrMappings - - public-subnet-1 - - CIDR - MapPublicIpOnLaunch: true - Tags: - - Key: Name - Value: Public Subnet 1 - VpcId: - Ref: vpc - Type: AWS::EC2::Subnet - publicSubnet1RouteTableAssociation: - DependsOn: - - publicRouteTable - - publicSubnet1 - - attachGateway - Properties: - RouteTableId: - Ref: publicRouteTable - SubnetId: - Ref: publicSubnet1 - Type: AWS::EC2::SubnetRouteTableAssociation - publicSubnet2: - DependsOn: attachGateway - Properties: - AvailabilityZone: - Fn::Select: - - 1 - - Fn::GetAZs: - Ref: AWS::Region - CidrBlock: - Fn::FindInMap: - - CidrMappings - - public-subnet-2 - - CIDR - MapPublicIpOnLaunch: true - Tags: - - Key: Name - Value: Public Subnet 2 - VpcId: - Ref: vpc - Type: AWS::EC2::Subnet - publicSubnet2RouteTableAssociation: - DependsOn: - - publicRouteTable - - publicSubnet2 - - attachGateway - Properties: - RouteTableId: - Ref: publicRouteTable - SubnetId: - Ref: publicSubnet2 - Type: AWS::EC2::SubnetRouteTableAssociation - scalableTarget: - DependsOn: - - spotFleet - - spotFleetAutoscaleRole - Properties: - MaxCapacity: 1 - MinCapacity: 1 - ResourceId: - Fn::Join: - - / - - - spot-fleet-request - - Ref: spotFleet - RoleARN: - Fn::GetAtt: - - spotFleetAutoscaleRole - - Arn - ScalableDimension: ec2:spot-fleet-request:TargetCapacity - ServiceNamespace: ec2 - Type: AWS::ApplicationAutoScaling::ScalableTarget - scalingPolicy: - Properties: - PolicyName: - Fn::Join: - - '-' - - - Ref: AWS::StackName - - StepPolicy - PolicyType: StepScaling - ScalingTargetId: - Ref: scalableTarget - StepScalingPolicyConfiguration: - AdjustmentType: PercentChangeInCapacity - Cooldown: 30 - MetricAggregationType: Average - StepAdjustments: - - MetricIntervalLowerBound: 0 - ScalingAdjustment: 100 - Type: AWS::ApplicationAutoScaling::ScalingPolicy - securityGroup: - Properties: - GroupDescription: Spot Fleet Instance Security Group - SecurityGroupIngress: - - CidrIp: - Ref: SourceCidr - FromPort: 22 - IpProtocol: tcp - ToPort: 22 - - CidrIp: 0.0.0.0/0 - FromPort: 80 - IpProtocol: tcp - ToPort: 80 - VpcId: - Ref: vpc - 
Type: AWS::EC2::SecurityGroup - spotFleet: - DependsOn: - - spotFleetRole - - spotFleetInstanceProfile - - ecsCluster - Properties: - SpotFleetRequestConfigData: - AllocationStrategy: diversified - IamFleetRole: - Fn::GetAtt: - - spotFleetRole - - Arn - LaunchSpecifications: - - IamInstanceProfile: - Arn: - Fn::GetAtt: - - spotFleetInstanceProfile - - Arn - ImageId: - Fn::FindInMap: - - ECSAmi - - Ref: AWS::Region - - AMI - InstanceType: m4.large - KeyName: - Ref: KeyName - Monitoring: - Enabled: true - SecurityGroups: - - GroupId: - Ref: securityGroup - SubnetId: - Fn::Join: - - ',' - - - Ref: publicSubnet1 - - Ref: publicSubnet2 - UserData: - Fn::Base64: - Fn::Sub: '#!/bin/bash -xe - - yum -y --security update - - yum -y update ecs-init - - service docker restart - - yum -y install aws-cli git emacs nano aws-cfn-bootstrap - - echo ECS_CLUSTER=${ecsCluster} >> /etc/ecs/ecs.config - - echo ECS_AVAILABLE_LOGGING_DRIVERS=[\"json-file\",\"awslogs\"] >> /etc/ecs/ecs.config - - su - ec2-user -c "aws configure set default.region ${AWS::Region}" - - mkdir /home/ec2-user/.docker - - cat << EOF > /home/ec2-user/.docker/config.json - - { - - "credsStore": "ecr-login" - - } - - EOF - - chown -R ec2-user. /home/ec2-user/.docker - - git clone https://github.com/awslabs/amazon-ecr-credential-helper.git - - cd amazon-ecr-credential-helper && make docker && cp bin/local/docker-credential-ecr-login - /usr/local/bin/ - - INSTANCE_ID=$(curl 169.254.169.254/latest/meta-data/instance-id 2>/dev/null) - - /opt/aws/bin/cfn-signal -s true -i $INSTANCE_ID "${spotFleetWaitConditionHandle}" - - ' - - IamInstanceProfile: - Arn: - Fn::GetAtt: - - spotFleetInstanceProfile - - Arn - ImageId: - Fn::FindInMap: - - ECSAmi - - Ref: AWS::Region - - AMI - InstanceType: m4.xlarge - KeyName: - Ref: KeyName - Monitoring: - Enabled: true - SecurityGroups: - - GroupId: - Ref: securityGroup - SubnetId: - Fn::Join: - - ',' - - - Ref: publicSubnet1 - - Ref: publicSubnet2 - UserData: - Fn::Base64: - Fn::Sub: '#!/bin/bash -xe - - yum -y --security update - - yum -y update ecs-init - - service docker restart - - yum -y install aws-cli git emacs nano aws-cfn-bootstrap - - echo ECS_CLUSTER=${ecsCluster} >> /etc/ecs/ecs.config - - echo ECS_AVAILABLE_LOGGING_DRIVERS=[\"json-file\",\"awslogs\"] >> /etc/ecs/ecs.config - - su - ec2-user -c "aws configure set default.region ${AWS::Region}" - - mkdir /home/ec2-user/.docker - - cat << EOF > /home/ec2-user/.docker/config.json - - { - - "credsStore": "ecr-login" - - } - - EOF - - chown -R ec2-user. 
/home/ec2-user/.docker - - git clone https://github.com/awslabs/amazon-ecr-credential-helper.git - - cd amazon-ecr-credential-helper && make docker && cp bin/local/docker-credential-ecr-login - /usr/local/bin/ - - INSTANCE_ID=$(curl 169.254.169.254/latest/meta-data/instance-id 2>/dev/null) - - /opt/aws/bin/cfn-signal -s true -i $INSTANCE_ID "${spotFleetWaitConditionHandle}" - - ' - - IamInstanceProfile: - Arn: - Fn::GetAtt: - - spotFleetInstanceProfile - - Arn - ImageId: - Fn::FindInMap: - - ECSAmi - - Ref: AWS::Region - - AMI - InstanceType: c4.large - KeyName: - Ref: KeyName - Monitoring: - Enabled: true - SecurityGroups: - - GroupId: - Ref: securityGroup - SubnetId: - Fn::Join: - - ',' - - - Ref: publicSubnet1 - - Ref: publicSubnet2 - UserData: - Fn::Base64: - Fn::Sub: '#!/bin/bash -xe - - yum -y --security update - - yum -y update ecs-init - - service docker restart - - yum -y install aws-cli git emacs nano aws-cfn-bootstrap - - echo ECS_CLUSTER=${ecsCluster} >> /etc/ecs/ecs.config - - echo ECS_AVAILABLE_LOGGING_DRIVERS=[\"json-file\",\"awslogs\"] >> /etc/ecs/ecs.config - - su - ec2-user -c "aws configure set default.region ${AWS::Region}" - - mkdir /home/ec2-user/.docker - - cat << EOF > /home/ec2-user/.docker/config.json - - { - - "credsStore": "ecr-login" - - } - - EOF - - chown -R ec2-user. /home/ec2-user/.docker - - git clone https://github.com/awslabs/amazon-ecr-credential-helper.git - - cd amazon-ecr-credential-helper && make docker && cp bin/local/docker-credential-ecr-login - /usr/local/bin/ - - INSTANCE_ID=$(curl 169.254.169.254/latest/meta-data/instance-id 2>/dev/null) - - /opt/aws/bin/cfn-signal -s true -i $INSTANCE_ID "${spotFleetWaitConditionHandle}" - - ' - - IamInstanceProfile: - Arn: - Fn::GetAtt: - - spotFleetInstanceProfile - - Arn - ImageId: - Fn::FindInMap: - - ECSAmi - - Ref: AWS::Region - - AMI - InstanceType: c4.xlarge - KeyName: - Ref: KeyName - Monitoring: - Enabled: true - SecurityGroups: - - GroupId: - Ref: securityGroup - SubnetId: - Fn::Join: - - ',' - - - Ref: publicSubnet1 - - Ref: publicSubnet2 - UserData: - Fn::Base64: - Fn::Sub: '#!/bin/bash -xe - - yum -y --security update - - yum -y update ecs-init - - service docker restart - - yum -y install aws-cli git emacs nano aws-cfn-bootstrap - - echo ECS_CLUSTER=${ecsCluster} >> /etc/ecs/ecs.config - - echo ECS_AVAILABLE_LOGGING_DRIVERS=[\"json-file\",\"awslogs\"] >> /etc/ecs/ecs.config - - su - ec2-user -c "aws configure set default.region ${AWS::Region}" - - mkdir /home/ec2-user/.docker - - cat << EOF > /home/ec2-user/.docker/config.json - - { - - "credsStore": "ecr-login" - - } - - EOF - - chown -R ec2-user. 
/home/ec2-user/.docker - - git clone https://github.com/awslabs/amazon-ecr-credential-helper.git - - cd amazon-ecr-credential-helper && make docker && cp bin/local/docker-credential-ecr-login - /usr/local/bin/ - - INSTANCE_ID=$(curl 169.254.169.254/latest/meta-data/instance-id 2>/dev/null) - - /opt/aws/bin/cfn-signal -s true -i $INSTANCE_ID "${spotFleetWaitConditionHandle}" - - ' - - IamInstanceProfile: - Arn: - Fn::GetAtt: - - spotFleetInstanceProfile - - Arn - ImageId: - Fn::FindInMap: - - ECSAmi - - Ref: AWS::Region - - AMI - InstanceType: r3.large - KeyName: - Ref: KeyName - Monitoring: - Enabled: true - SecurityGroups: - - GroupId: - Ref: securityGroup - SubnetId: - Fn::Join: - - ',' - - - Ref: publicSubnet1 - - Ref: publicSubnet2 - UserData: - Fn::Base64: - Fn::Sub: '#!/bin/bash -xe - - yum -y --security update - - yum -y update ecs-init - - service docker restart - - yum -y install aws-cli git emacs nano aws-cfn-bootstrap - - echo ECS_CLUSTER=${ecsCluster} >> /etc/ecs/ecs.config - - echo ECS_AVAILABLE_LOGGING_DRIVERS=[\"json-file\",\"awslogs\"] >> /etc/ecs/ecs.config - - su - ec2-user -c "aws configure set default.region ${AWS::Region}" - - mkdir /home/ec2-user/.docker - - cat << EOF > /home/ec2-user/.docker/config.json - - { - - "credsStore": "ecr-login" - - } - - EOF - - chown -R ec2-user. /home/ec2-user/.docker - - git clone https://github.com/awslabs/amazon-ecr-credential-helper.git - - cd amazon-ecr-credential-helper && make docker && cp bin/local/docker-credential-ecr-login - /usr/local/bin/ - - INSTANCE_ID=$(curl 169.254.169.254/latest/meta-data/instance-id 2>/dev/null) - - /opt/aws/bin/cfn-signal -s true -i $INSTANCE_ID "${spotFleetWaitConditionHandle}" - - ' - - IamInstanceProfile: - Arn: - Fn::GetAtt: - - spotFleetInstanceProfile - - Arn - ImageId: - Fn::FindInMap: - - ECSAmi - - Ref: AWS::Region - - AMI - InstanceType: r3.xlarge - KeyName: - Ref: KeyName - Monitoring: - Enabled: true - SecurityGroups: - - GroupId: - Ref: securityGroup - SubnetId: - Fn::Join: - - ',' - - - Ref: publicSubnet1 - - Ref: publicSubnet2 - UserData: - Fn::Base64: - Fn::Sub: '#!/bin/bash -xe - - yum -y --security update - - yum -y update ecs-init - - service docker restart - - yum -y install aws-cli git emacs nano aws-cfn-bootstrap - - echo ECS_CLUSTER=${ecsCluster} >> /etc/ecs/ecs.config - - echo ECS_AVAILABLE_LOGGING_DRIVERS=[\"json-file\",\"awslogs\"] >> /etc/ecs/ecs.config - - su - ec2-user -c "aws configure set default.region ${AWS::Region}" - - mkdir /home/ec2-user/.docker - - cat << EOF > /home/ec2-user/.docker/config.json - - { - - "credsStore": "ecr-login" - - } - - EOF - - chown -R ec2-user. 
/home/ec2-user/.docker - - git clone https://github.com/awslabs/amazon-ecr-credential-helper.git - - cd amazon-ecr-credential-helper && make docker && cp bin/local/docker-credential-ecr-login - /usr/local/bin/ - - INSTANCE_ID=$(curl 169.254.169.254/latest/meta-data/instance-id 2>/dev/null) - - /opt/aws/bin/cfn-signal -s true -i $INSTANCE_ID "${spotFleetWaitConditionHandle}" - - ' - TargetCapacity: 1 - TerminateInstancesWithExpiration: true - Type: AWS::EC2::SpotFleet - spotFleetAutoscaleRole: - Properties: - AssumeRolePolicyDocument: - Statement: - - Action: - - sts:AssumeRole - Effect: Allow - Principal: - Service: - - application-autoscaling.amazonaws.com - Version: 2012-10-17 - ManagedPolicyArns: - - arn:aws:iam::aws:policy/service-role/AmazonEC2SpotFleetAutoscaleRole - Path: / - Type: AWS::IAM::Role - spotFleetInstanceProfile: - DependsOn: - - spotFleetInstanceRole - Properties: - Path: / - Roles: - - Ref: spotFleetInstanceRole - Type: AWS::IAM::InstanceProfile - spotFleetInstanceRole: - Properties: - AssumeRolePolicyDocument: - Statement: - - Action: - - sts:AssumeRole - Effect: Allow - Principal: - Service: - - ec2.amazonaws.com - Version: 2012-10-17 - ManagedPolicyArns: - - arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role - Path: / - Policies: - - PolicyDocument: - Statement: - - Action: s3:ListBucket - Effect: Allow - Resource: - Fn::Join: - - '' - - - 'arn:aws:s3:::' - - Ref: outputBucket - - Action: - - s3:PutObject - - s3:GetObject - - s3:DeleteObject - Effect: Allow - Resource: - Fn::Join: - - '' - - - 'arn:aws:s3:::' - - Ref: outputBucket - - /* - - Action: - - ecr:DescribeRepositories - - ecr:ListImages - - ecr:InitiateLayerUpload - - ecr:UploadLayerPart - - ecr:CompleteLayerUpload - - ecr:PutImage - Effect: Allow - Resource: - Fn::Join: - - '' - - - 'arn:aws:ecr:' - - Ref: AWS::Region - - ':' - - Ref: AWS::AccountId - - :repository/ - - Ref: ecrRepository - Version: '2012-10-17' - PolicyName: - Fn::Join: - - '-' - - - Ref: AWS::StackName - - ecs-deep-learning-workshop-role - Type: AWS::IAM::Role - spotFleetRole: - Properties: - AssumeRolePolicyDocument: - Statement: - - Action: - - sts:AssumeRole - Effect: Allow - Principal: - Service: - - spotfleet.amazonaws.com - Version: 2012-10-17 - ManagedPolicyArns: - - arn:aws:iam::aws:policy/service-role/AmazonEC2SpotFleetRole - Path: / - Type: AWS::IAM::Role - spotFleetWaitCondition: - DependsOn: spotFleetWaitConditionHandle - Properties: - Count: 1 - Handle: - Ref: spotFleetWaitConditionHandle - Timeout: 900 - Type: AWS::CloudFormation::WaitCondition - spotFleetWaitConditionHandle: - Type: AWS::CloudFormation::WaitConditionHandle - vpc: - Properties: - CidrBlock: - Fn::FindInMap: - - CidrMappings - - vpc - - CIDR - EnableDnsHostnames: true - EnableDnsSupport: true - Tags: - - Key: Name - Value: VPC for ECS Deep Learning Workshop - Type: AWS::EC2::VPC -... 
diff --git a/workshops/ecs-deep-learning-workshop/lab-2-build/mxnet/Dockerfile b/workshops/ecs-deep-learning-workshop/lab-2-build/mxnet/Dockerfile deleted file mode 100644 index 56dd0a6d..00000000 --- a/workshops/ecs-deep-learning-workshop/lab-2-build/mxnet/Dockerfile +++ /dev/null @@ -1,38 +0,0 @@ -FROM mxnet/python - -ENV DEBIAN_FRONTEND noninteractive - -RUN apt-get -y update -RUN apt-get -y install git \ - python-opencv \ - build-essential \ - python3-dev \ - python3-tk - -RUN pip install opencv-python dumb-init awscli matplotlib - -ENV WORKSHOPDIR /root/ecs-deep-learning-workshop -RUN mkdir ${WORKSHOPDIR} - -RUN cd ${WORKSHOPDIR} \ - && git clone --recursive https://github.com/apache/incubator-mxnet.git mxnet - -COPY predict_imagenet.py /usr/local/bin/ - -RUN pip install jupyter - -RUN jupyter-notebook --generate-config --allow-root \ - && sed -i "s/#c.NotebookApp.ip = 'localhost'/c.NotebookApp.ip = '*'/g" /root/.jupyter/jupyter_notebook_config.py \ - && sed -i "s/#c.NotebookApp.allow_remote_access = False/c.NotebookApp.allow_remote_access = True/g" /root/.jupyter/jupyter_notebook_config.py - -ARG PASSWORD - -RUN python3 -c "from notebook.auth import passwd;print(passwd('${PASSWORD}') if '${PASSWORD}' != '' else 'sha1:c6bd96fb0824:6654e9eabfc54d0b3d0715ddf9561bed18e09b82')" > ${WORKSHOPDIR}/password_temp - -RUN sed -i "s/#c.NotebookApp.password = ''/c.NotebookApp.password = '$(cat ${WORKSHOPDIR}/password_temp)'/g" /root/.jupyter/jupyter_notebook_config.py - -RUN rm ${WORKSHOPDIR}/password_temp - -WORKDIR ${WORKSHOPDIR} -EXPOSE 8888 -CMD ["/usr/local/bin/dumb-init", "/usr/local/bin/jupyter-notebook", "--no-browser", "--allow-root"] diff --git a/workshops/ecs-deep-learning-workshop/lab-2-build/mxnet/predict_imagenet.py b/workshops/ecs-deep-learning-workshop/lab-2-build/mxnet/predict_imagenet.py deleted file mode 100755 index 15d76a25..00000000 --- a/workshops/ecs-deep-learning-workshop/lab-2-build/mxnet/predict_imagenet.py +++ /dev/null @@ -1,67 +0,0 @@ -#!/usr/bin/env python3 - -from __future__ import print_function -import os, sys, urllib.request - -if len(sys.argv) < 2: - print("Usage:", sys.argv[0], "