From ac4a5def5698a5ef8a77f9c7bff4fcfbc7563782 Mon Sep 17 00:00:00 2001 From: Rajesh Kesaraju Date: Tue, 13 Oct 2020 18:35:35 -0400 Subject: [PATCH 1/4] Cleaned Module-2 Deleted unwanted files on module-2 --- .../module-2/_index.md | 9 +- .../module-2/asg_with_od.md | 91 ------------------ .../module-2/asg_with_spot.md | 53 ----------- .../module-2/cp_with_ec2spot.md | 53 ----------- .../module-2/create_ec2_launch_template.md | 93 ------------------- .../module-2/ecs_ec2_service.md | 79 ---------------- .../module-2/tasks_for_ec2_cp.md | 13 --- 7 files changed, 2 insertions(+), 389 deletions(-) delete mode 100644 content/ecs-spot-capacity-providers/module-2/asg_with_od.md delete mode 100644 content/ecs-spot-capacity-providers/module-2/asg_with_spot.md delete mode 100644 content/ecs-spot-capacity-providers/module-2/cp_with_ec2spot.md delete mode 100644 content/ecs-spot-capacity-providers/module-2/create_ec2_launch_template.md delete mode 100644 content/ecs-spot-capacity-providers/module-2/ecs_ec2_service.md delete mode 100644 content/ecs-spot-capacity-providers/module-2/tasks_for_ec2_cp.md diff --git a/content/ecs-spot-capacity-providers/module-2/_index.md b/content/ecs-spot-capacity-providers/module-2/_index.md index f125ebf3..b8d67195 100644 --- a/content/ecs-spot-capacity-providers/module-2/_index.md +++ b/content/ecs-spot-capacity-providers/module-2/_index.md @@ -10,14 +10,9 @@ Amazon EC2 terminates your Spot Instance when it needs the capacity back. Amazon When Amazon EC2 is going to interrupt your Spot Instance, the interruption notification will be available in two ways -- ***Amazon EventBridge Events*** +1. ***Amazon EventBridge Events:*** EC2 service emits an event two minutes prior to the actual interruption. This event can be detected by Amazon CloudWatch Events. - -EC2 service emits an event two minutes prior to the actual interruption. This event can be detected by Amazon CloudWatch Events. 
- -- ***Instance-action in the MetaData service (IMDS)*** - -If your Spot Instance is marked to be stopped or terminated by the Spot service, the instance-action item is present in your instance metadata. +1. ***Instance-action in the MetaData service (IMDS):*** If your Spot Instance is marked to be stopped or terminated by the Spot service, the instance-action item is present in your instance metadata. look at the user data section in the Launch template configuration. diff --git a/content/ecs-spot-capacity-providers/module-2/asg_with_od.md b/content/ecs-spot-capacity-providers/module-2/asg_with_od.md deleted file mode 100644 index 22c2620a..00000000 --- a/content/ecs-spot-capacity-providers/module-2/asg_with_od.md +++ /dev/null @@ -1,91 +0,0 @@ ---- -title: "Creating an Auto Scaling Group (ASG) with EC2 On-Demand Instances" -weight: 5 ---- - -In this section, we will create an EC2 Auto Scaling Group for On-Demand Instances using the Launch Template created in previous section. - -Copy the file **templates/asg.json** for the Auto scaling group configuration. - -``` -cp templates/asg.json . -``` - -Take a moment to look at the user asg.json to see various configuration options in the ASG. 
- -Set the following variables for auto scaling configuration - -``` -export ASG_NAME=ecs-spot-workshop-asg-od - export OD_PERCENTAGE=100 # Note that ASG will have 100% On-Demand, 0% Spot -``` - -Set the auto scaling service linked role ARN - -Note: Replace the **\<AWS Acount ID\>** with your AWS account - -``` -export SERVICE_ROLE_ARN="arn:aws:iam::\<AWS Account ID\>:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling_ec2" -``` - -Run the following command to substitute the template with actual values from the global variables - -``` -sed -i -e "s#%ASG_NAME%#$ASG_NAME#g" -e "s#%OD_PERCENTAGE%#$OD_PERCENTAGE#g" -e "s#%PUBLIC_SUBNET_LIST%#$PUBLIC_SUBNET_LIST#g" -e "s#%SERVICE_ROLE_ARN%#$SERVICE_ROLE_ARN#g" asg.json -``` - -Create the ASG for the On Demand Instances - -``` -aws autoscaling create-auto-scaling-group --cli-input-json file://asg.json - ASG_ARN=$(aws autoscaling describe-auto-scaling-groups --auto-scaling-group-name $ASG_NAME_OD | jq -r '.AutoScalingGroups[0].AutoScalingGroupARN') -echo "$ASG_NAME_OD ARN=$ASG_ARN" -``` - -The output of the above command looks like below - -``` -ecs-spot-workshop-asg-od ARN=arn:aws:autoscaling:us-east-1:000474600478:autoScalingGroup:1e9de503-068e-4d78-8272-82536fc92d14:autoScalingGroupName/ecs-spot-workshop-asg-od -``` - -The above auto scaling group looks like below in the console - - -### Creating a Capacity Provider using above ASG with EC2 On-demand instances. - -A capacity provider is used in association with a cluster to determine the infrastructure that a task runs - -on. - -Copy the template file **templates/ecs-capacityprovider.json** to the current directory. - -``` -cp -Rfp templates/ecs-capacityprovider.json . -``` - -Take a moment to look at the user ecs-capacityprovider.json to see various configuration options in the Capacity Provider. When creating a capacity provider, you specify the following details: - -1. An Auto Scaling group Amazon Resource Name (ARN) - -1. 
Whether or not to enable managed scaling. When managed scaling is enabled, Amazon ECS manages the scale-in and scale-out actions of the Auto Scaling group through the use of AWS Auto Scaling scaling plans. When managed scaling is disabled, you manage your Auto Scaling groups yourself. -1. Whether or not to enable managed termination protection. When managed termination protection is enabled, Amazon ECS prevents Amazon EC2 instances that contain tasks and that are in an Auto Scaling group from being terminated during a scale-in action. Managed termination protection can only be enabled if the Auto Scaling group also has instance protection from scale in enabled - -Run below commands to replace the configuration values in the template file. - -``` -export CAPACITY_PROVIDER_NAME=od-capacity_provider - sed -i -e "s#%CAPACITY_PROVIDER_NAME%#$CAPACITY_PROVIDER_NAME#g" -e "s#%ASG_ARN%#$ASG_ARN#g" ecs-capacityprovider.json -``` - -Create the On-Demand Capacity Provider with Auto scaling group - -``` -CAPACITY_PROVIDER_ARN=$(aws ecs create-capacity-provider --cli-input-json file://ecs-capacityprovider.json | jq -r '.capacityProvider.capacityProviderArn') - echo "$OD_CAPACITY_PROVIDER_NAME ARN=$CAPACITY_PROVIDER_ARN" -``` - -The output of the above command looks like - -``` -od-capacity_provider ARN=arn:aws:ecs:us-east-1:000474600478:capacity-provider/od-capacity_provider -``` diff --git a/content/ecs-spot-capacity-providers/module-2/asg_with_spot.md b/content/ecs-spot-capacity-providers/module-2/asg_with_spot.md deleted file mode 100644 index 9aac7a0f..00000000 --- a/content/ecs-spot-capacity-providers/module-2/asg_with_spot.md +++ /dev/null @@ -1,53 +0,0 @@ ---- -title: "Creating an Auto Scaling Group (ASG) with EC2 Spot Instances" -weight: 10 ---- - -In this section, let us create an Auto Scaling group for EC2 Spot Instances using the Launch Template created in previous section. 
This procedure is exactly same as the previous section except the few changes specific to the configuration for EC2 Spot instances. - -One of the best practices for adoption of Spot Instances is to diversify the EC2 instances across different instance types and availability zones, in order to tap into multiple spare capacity pools. The ASG currently will support up to 20 different instance type configurations for diversification. - -One key criteria for choosing the instance size can be based on the ECS Task vCPU and Memory limit configuration. For example, look at the ECS task resource limits in the file **webapp-ec2-task.json** - -_**"cpu": "256", "memory": "1024"**_ - -This means the ratio for vCPU:Memory is **1:4**. So it would be ideal to select instance size which satisfy this criteria. The instance lowest size which satisfy this critera are of large size. Please note there may be bigger sizes which satisfy 1:4 ratio. But in this workshop, let's select the smallest size i.e. large to illustrate the aspect of EC2 spot diversification. - -So let's select different instance types and generations for large size using the Instance Types console within the AWS EC2 console as follows. - -We selected 10 different instant types as seen asg.json but you can configure up to 20 different instance types in an Autoscaling group. - -Copy the file **templates/asg.json** for the Auto scaling group configuration. - -``` -cp templates/asg.json . -``` - -Take a moment to look at the user asg.json to see various configuration options in the ASG. 
- -Set the following variables for auto scaling configuration - -``` -export ASG'NAME=ecs-spot-workshop-asg-spot - export OD'PERCENTAGE=0 # Note that ASG will have 0% On-Demand, 100% Spot -``` - -Run the following commands to substitute the template with actual values from the global variables - -``` -sed -i -e "s#%ASG'NAME%#$ASG'NAME#g" -e "s#%OD'PERCENTAGE%#$OD'PERCENTAGE#g" -e "s#%PUBLIC'SUBNET'LIST%#$PUBLIC'SUBNET'LIST#g" -e "s#%SERVICE'ROLE'ARN%#$SERVICE'ROLE'ARN#g" asg.json -``` - -Create the Auto scaling group for EC2 spot - -``` -aws autoscaling create-auto-scaling-group --cli-input-json file://asg.json - ASG'ARN=$(aws autoscaling describe-auto-scaling-groups --auto-scaling-group-name $ASG'NAME'SPOT | jq -r '.AutoScalingGroups[0].AutoScalingGroupARN') - echo "$ASG'NAME'SPOT ARN=$ASG'ARN" -``` - -The output for the above command looks like this - -``` -ecs-spot-workshop-asg-spot ARN=arn:aws:autoscaling:us-east-1:000474600478:autoScalingGroup:dd7a67e0-4df0-4cda-98d7-7e13c36dec5b:autoScalingGroupName/ecs-spot-workshop-asg-spot -``` diff --git a/content/ecs-spot-capacity-providers/module-2/cp_with_ec2spot.md b/content/ecs-spot-capacity-providers/module-2/cp_with_ec2spot.md deleted file mode 100644 index 0c04057c..00000000 --- a/content/ecs-spot-capacity-providers/module-2/cp_with_ec2spot.md +++ /dev/null @@ -1,53 +0,0 @@ ---- -title: "Creating a Capacity Provider using ASG with EC2 Spot instances" -weight: 15 ---- - -A capacity provider is used in association with a cluster to determine the infrastructure that a task runs on. - -Copy the template file **templates/ecs-capacityprovider.json** to the current directory. - -``` -cp -Rfp templates/ecs-capacityprovider.json . 
-``` - -Run the following commands to substitute the template with actual values from the global variables - -``` -export CAPACITY_PROVIDER_NAME=ec2spot-capacity_provider - sed -i -e "s#%CAPACITY_PROVIDER_NAME%#$CAPACITY_PROVIDER_NAME#g" -e "s#%ASG_ARN%#$ASG_ARN#g" ecs-capacityprovider.json -``` -``` -CAPACITY_PROVIDER_ARN=$(aws ecs create-capacity-provider --cli-input-json file://ecs-capacityprovider.json | jq -r '.capacityProvider.capacityProviderArn') - echo "$SPOT_CAPACITY_PROVIDER_NAME ARN=$CAPACITY_PROVIDER_ARN" -``` - -The output of the above command looks like - -``` -spot-capacity_provider ARN=arn:aws:ecs:us-east-1:000474600478:capacity-provider/ec2spot-capacity_provider -``` - -### Update ECS Cluster with Auto Scaling Capacity Providers - -So far we created two Auto Scaling Capacity Providers. Now let's update our existing ECS Cluster with these Capacity Providers. - -Run the following command to create the ECS Cluster - -``` -aws ecs put-cluster-capacity-providers \ - --cluster EcsSpotWorkshopCluster \ - --capacity-providers FARGATE FARGATE_SPOT od-capacity_provider ec2spot-capacity_provider \ - --default-capacity-provider-strategy capacityProvider=od-capacity_provider,base=1,weight=1 \ - --region $AWS_REGION -``` - -The ECS cluster should now contain 4 Capacity Providers: 2 from Auto Scaling groups (1 for OD and 1 for Spot), 1 from FARGATE and 1 from FARGATE_SPOT - -Also note the default capacity provider strategy used in the above command. It sets base=1 and weight=1 for On-demand Auto Scaling Group Capacity Provider. This will override the previous default capacity strategy which is set to FARGATE capacity provider. - -Click on the **Update Cluster** on the top right corner to see default Capacity Provider Strategy. As shown base=1 is set for OD Capacity Provider. - -That means if there is no capacity provider strategy specified during the deploying Tasks/Services, ECS by default chooses the OD Capacity Provider to launch them. 
- -Click on Cancel as we don't want to change the default strategy for now. diff --git a/content/ecs-spot-capacity-providers/module-2/create_ec2_launch_template.md b/content/ecs-spot-capacity-providers/module-2/create_ec2_launch_template.md deleted file mode 100644 index 88344d0a..00000000 --- a/content/ecs-spot-capacity-providers/module-2/create_ec2_launch_template.md +++ /dev/null @@ -1,93 +0,0 @@ ---- -title: "Create an EC2 launch template" -weight: 1 ---- - -EC2 Launch Templates reduce the number of steps required to create an instance by capturing all launch parameters within one resource. - -You can create a launch template that contains the configuration information to launch an instance. Launch templates enable you to store launch parameters so that you do not have to specify them every time you launch an instance. For example, a launch template can contain the ECS optimized AMI, instance type, User data section, Instance Profile / Role and network settings that you typically use to launch instances. When you launch an instance using the Amazon EC2 console, an AWS SDK, or a command line tool, you can specify the launch template to use. Instance user data required to bootstrap the instance into the ECS Cluster. - -You will create a launch template to specify configuration parameters for launching instances in this workshop. - -Copy the template file **templates/user-data.txt** to the current directory, - -``` -cp templates/user-data.txt . - -``` - -Take a moment to look at the user data script to see the bootstrapping actions that is performing. Also notice ECS auto draining is enabled in the configuration - -``` -echo "ECS_ENABLE_SPOT_INSTANCE_DRAINING=true" >> /etc/ecs/ecs.config -``` - -Set the following variables for the resources be used in creating the launch template in this workshop. - -Set the ARN of the IAM role **ecslabinstanceprofile** created in Module-1 - -Get your AWS account id with below command. This is needed in the next step. 
- -``` -echo $ACCOUNT_ID -``` - -Note: Replace the **AWS Acount ID** with your AWS account in the below command. - -``` -export IAM_INSTANT_PROFILE_ARN=arn:aws:iam::$ACCOUNT_ID :instance-profile/ecslabinstanceprofile -``` - -It is recommended to use the latest ECS Optimized AMI which contains the ECS container agent. This is used to join the ECS cluster/ - -``` -export AMI_ID=$(aws ssm get-parameters --names /aws/service/ecs/optimized-ami/amazon-linux-2/recommended | jq -r 'last(.Parameters[]).Value' | jq -r '.image\id') - echo "Latest ECS Optimized Amazon AMI\ID is $AMI_ID" -``` - -The output from above command looks like below. - -``` -Latest ECS Optimized Amazon AMI_ID is ami-07a63940735aebd38 -``` - -copy the template file **templates/launch-template-data.json** to the current directory, - -``` -cp templates/launch-template-data.json . -``` - -Run the following commands to substitute the template with actual values from the variables. - -``` -sed -i -e "s#%instanceProfile%#$IAM_INSTANT_PROFILE_ARN#g" -e "s#%instanceSecurityGroup%#$SECURITY_GROUP#g" -e "s#%ami-id%#$AMI_ID#g" -e "s#%UserData%#$(cat user-data.txt | base64 --wrap=0)#g" launch-template-data.json -``` - - -Now let is create the launch template - -``` -LAUCH_TEMPLATE_ID=$(aws ec2 create-launch-template --launch-template-name ecs-spot-workshop-lt --version-description 1 --launch-template-data file://launch-template-data.json | jq -r '.LaunchTemplate.LaunchTemplateId') - echo "Amazon LAUCH_TEMPLATE_ID is $LAUCH_TEMPLATE_ID" -``` - - -The output from above command looks like this, you can also view this Launch Template in the Console. 
- -``` -Amazon LAUCH_TEMPLATE_ID is lt-023e2e52afc51d7ed -``` - - -Verify that the contents of the launch template are correct: - -``` -aws ec2 describe-launch-template-versions --launch-template-name ecs-spot-workshop-lt -``` - - -Verify that the contents of the launch template user data are correct: - -``` -aws ec2 describe-launch-template-versions --launch-template-name ecs-spot-workshop-lt--output json | jq -r '.LaunchTemplateVersions[].LaunchTemplateData.UserData' | base64 --decode -``` \ No newline at end of file diff --git a/content/ecs-spot-capacity-providers/module-2/ecs_ec2_service.md b/content/ecs-spot-capacity-providers/module-2/ecs_ec2_service.md deleted file mode 100644 index 25c7a7c0..00000000 --- a/content/ecs-spot-capacity-providers/module-2/ecs_ec2_service.md +++ /dev/null @@ -1,79 +0,0 @@ ---- -title: "Create ECS EC2 Services" -weight: 25 ---- - -In this section, we will create 3 ECS Services to show how tasks can be deployed across On-demand and EC2 Spot based Auto Scaling Capacity providers. - -| **Service Name** | **Number of Tasks** | **Number of Tasks on On-demand ASG Capacity Provider** | **Number of Tasks on EC2 Spot ASG Capacity Provider** | **Capacity Provider Strategy** | -| --- | --- | --- | --- | --- | -| **webapp-ec2-service-od** | 2 | 2 | 0 | OD Capacity Provider weight =1 | -| **webapp-ec2-service-spot** | 2 | 0 | 2 | Spot Capacity Provider weight =1 | -| **webapp-ec2-service-mix** | 6 | 2 | 4 | OD Capacity Provider weight =1 Spot Capacity Provider weight =3 | - -Deploy the service **webapp-ec2-service-od** using below command. - -``` -aws ecs create-service \ - --capacity-provider-strategy capacityProvider=od-capacity_provider,weight=1 \ - --cluster EcsSpotWorkshopCluster\ - --service-name webapp-ec2-service-od\ - --task-definition webapp-ec2-task:1 \ - --desired-count 2\ - --region $AWS_REGION -``` - -Note the capacity provider strategy used for this service. It provides weight only for On-demand based ASG capacity provider. 
This strategy overrides the default capacity provider strategy which is set to On-demand ASG capacity provider. - -That means ECS schedules all of the tasks (2 in this case) in service on the On-demand ASG Capacity providers. - -Note this ASG does not have any instances launched since the desired capacity is set to Zero. Since this ECS service deployment needs 2 tasks to be placed in the ECS cluster, it triggers CloudWatch alarms for Cluster Capacity. Based on the specified weight for Capacity Providers, the OD Capacity Provider in this case (i.e. corresponding Auto scaling group) scales 2 instances to schedule 2 tasks for this service. - -Notice the change in the desired capacity in the On-demand Auto Scaling Group - -Deploy the service **webapp-ec2-service-spot** using below command. - -``` -aws ecs create-service \ - --capacity-provider-strategy capacityProvider=ec2spot-capacity_provider,weight=1 \ - --cluster EcsSpotWorkshopCluster\ - --service-name webapp-ec2-service-spot\ - --task-definition webapp-ec2-task:1 \ - --desired-count 2\ - --region $AWS_REGION -``` -Note the capacity provider strategy used for this service. It provides weight only for EC2 Spot based ASG capacity provider. This strategy overrides the default capacity provider strategy which is set to On-demand ASG capacity provider. - -That means ECS schedules all of the tasks (2 in this case) in service on the EC2 Spot ASG Capacity providers. - -Note this ASG does not have any instances launched since the desired capacity is set to Zero. Since this ECS service deployment needs 2 tasks to be placed in the ECS cluster, it triggers Cloud watch alarms for Cluster Capacity. Based on the specified weightage for Capacity Providers, the Spot Capacity Provider in this case (i.e. corresponding Auto scaling group) scales 2 instances to schedule 2 tasks for this service. 
- -Notice the change in the desired capacity in the Spot Auto Scaling Group - -Deploy the service **webapp-ec2-service-mix** using below command - -``` -aws ecs create-service \ - --capacity-provider-strategy capacityProvider=od-capacity_provider,weight=1 \ - capacityProvider=ec2spot-capacity_provider,weight=3 \ - --cluster EcsSpotWorkshopCluster\ - --service-name webapp-ec2-service-mix\ - --task-definition webapp-ec2-task:1 \ - --desired-count 6\ - --region $AWS_REGION -``` -Note the capacity provider strategy used for this service. It provides weight of 1 to On-demand based ASG capacity provider and weight of 3 to EC2 Spot based ASG capacity provider. This strategy overides the default capacity provider strategy which is set to On-demand ASG capacity provider. - -That means ECS schedules splits total number of the tasks (6 in this case) in service in 1:3 ration which means 2 tasks on On-demand based ASG capacity provider and 4 tasks on EC2 Spot ASG Capacity provider. - -Note that On-demand and Spot bases ASGs already have few instances launched for the ECS services launched earlier. Since this ECS service deployment needs 6 tasks to be placed in the ECS cluster, it triggers Cloud watch alarms for Cluster Capacity. Based on the specified weight (i.e. OD=1, Spot=3) for Capacity Providers, Both OD and Spot Capacity Provider in this case (i.e. corresponding Auto scaling groups) scales additional instances in Spot and OD instances. - -Now look at the AWS console to see these 6 tasks running for this service. - -But how do we know if ECS really satisfy our tasks placement requirement of 2 tasks on OD and 4 on Spot instances? - -To check which task is placed on which instance type (OD or Spot), click on the Task Id. - -As shown task Id 0c6ca084-12a4-4469-a3b5-bbb0ad3c7bc3 is placed on OD Capacity Provider - -Check the remaining 5 tasks and check if ECS confirms to our Task Placement strategy. 
diff --git a/content/ecs-spot-capacity-providers/module-2/tasks_for_ec2_cp.md b/content/ecs-spot-capacity-providers/module-2/tasks_for_ec2_cp.md deleted file mode 100644 index d4fe734c..00000000 --- a/content/ecs-spot-capacity-providers/module-2/tasks_for_ec2_cp.md +++ /dev/null @@ -1,13 +0,0 @@ ---- -title: "Create ECS Tasks for EC2 Capacity Providers" -weight: 20 ---- - -In this section, we will create a task definition for for tasks to be launched on the Auto Scaling Capacity Providers. - -Run the below command to create the task definition - -``` -aws ecs register-task-definition --cli-input-json file://webapp-ec2-task.json -WEBAPP_EC2_TASK_DEF=$(cat webapp-ec2-task.json | jq -r '.family') -``` From e1dfa0f411b3435da3141f3e7c0f634e2d7cc54f Mon Sep 17 00:00:00 2001 From: Rajesh Kesaraju Date: Thu, 15 Oct 2020 13:28:11 -0400 Subject: [PATCH 2/4] Moved Module-2 inturrption handling page to first section module. Moved Module-2 inturrption handling page to first section module. --- .../module-1/spot_inturruption_handling.md | 75 +++++++++++++++++++ 1 file changed, 75 insertions(+) create mode 100644 content/ecs-spot-capacity-providers/module-1/spot_inturruption_handling.md diff --git a/content/ecs-spot-capacity-providers/module-1/spot_inturruption_handling.md b/content/ecs-spot-capacity-providers/module-1/spot_inturruption_handling.md new file mode 100644 index 00000000..be20ea10 --- /dev/null +++ b/content/ecs-spot-capacity-providers/module-1/spot_inturruption_handling.md @@ -0,0 +1,75 @@ +--- +title: "Inturruption Handling On EC2 Spot Instances" +weight: 80 +--- + +Amazon EC2 terminates your Spot Instance when it needs the capacity back. Amazon EC2 provides a Spot Instance interruption notice, which gives the instance a two-minute warning before it is interrupted. + +When Amazon EC2 is going to interrupt your Spot Instance, the interruption notification will be available in two ways + +1. 
***Amazon EventBridge Events:*** EC2 service emits an event two minutes prior to the actual interruption. This event can be detected by Amazon CloudWatch Events. + +1. ***Instance-action in the MetaData service (IMDS):*** If your Spot Instance is marked to be stopped or terminated by the Spot service, the instance-action item is present in your instance metadata. + +look at the user data section in the Launch template configuration. + +``` +echo "ECS_ENABLE_SPOT_INSTANCE_DRAINING=true" >> /etc/ecs/ecs.config +``` + +The above configuration enables automatic draining of spot instances at the time of spot interruption notice. The ECS container agent runnining on the ECS container instances handles the interruption using the Instance Metadata service. + +If the application can also handle the interruption to implement any checkpointing or saving the data. The web application (app.py) we used to buld docker image in the Module-2 shows two ways to handle the spot interruption within a docker container. + +In the first method, it check the instance metadata service for spot interruption and display a message to web page notifying the users. + +Note: The ECS tasks should not be accessing EC2 metadata. For security reasons, this should be blocked this in a Prod environment. + +``` +URL = "http://169.254.169.254/latest/meta-data/spot/termination-time" +SpotInt = requests.get(URL) +if SpotInt.status_code == 200: + response += "This Spot Instance Got Interruption and Termination Date is {}".format(SpotInt.text) +``` + +In the second method, it listens to the **SIGTERM** signal. The ECS container agent calls StopTask API to stop all the tasks running on the Spot Instance. + +When StopTask is called on a task, the equivalent of docker stop is issued to the containers running in the task. This results in a **SIGTERM** value and a default 30-second timeout, after which the SIGKILL value is sent and the containers are forcibly stopped. If the container handles the **SIGTERM** value gracefully and exits within 30 seconds from receiving it, no SIGKILL value is sent. + + +The application can listen to the **SIGTERM** signal and handle the interruption gracefully. + +``` +class Ec2SpotInterruptionHandler: + signals = { + signal.SIGINT: 'SIGINT', + signal.SIGTERM: 'SIGTERM' + } + +def __init__(self): + signal.signal(signal.SIGINT, self.exit_gracefully) + signal.signal(signal.SIGTERM, self.exit_gracefully) + +def exit_gracefully(self, signum, frame): + print("\nReceived {} signal".format(self.signals[signum])) + if self.signals[signum] == 'SIGTERM': + print("Looks like there is a Spot Interruption. Let's wrap up the processing to avoid forceful killing of the applucation in next 30 sec ...") +``` + +Spot Interruption Handling on ECS Fargate Spot +--- + +When tasks using Fargate Spot capacity are stopped due to a Spot interruption, a two-minute warning is sent before a task is stopped. The warning is sent as a task state change event to Amazon EventBridge +and a SIGTERM signal to the running task. When using Fargate Spot as part of a service, the service +scheduler will receive the interruption signal and attempt to launch additional tasks on Fargate Spot if +capacity is available. + +To ensure that your containers exit gracefully before the task stops, the following can be configured: + +• A stopTimeout value of 120 seconds or less can be specified in the container definition that the task +is using.
Specifying a stopTimeout value gives you time between the moment the task state change event is received and the point at which the container is forcefully stopped. + +• The **SIGTERM** signal must be received from within the container to perform any cleanup actions. + + +***Congratulations !!!*** you have successfully completed the workshop module and learnt how to create ASG CPs and schedule ECS services across Spot and On-demand CPs. You may proceed to optional module using Fargate Spot Capacity Providers. \ No newline at end of file From 4faa695780f666fe804d2a8fbeb9754f41020e0d Mon Sep 17 00:00:00 2001 From: Rajesh Kesaraju Date: Thu, 15 Oct 2020 16:31:01 -0400 Subject: [PATCH 3/4] Updated to reflect changes Updated to reflect old file changes to this new file. --- .../module-1/spot_inturruption_handling.md | 39 +++++++++---------- 1 file changed, 19 insertions(+), 20 deletions(-) diff --git a/content/ecs-spot-capacity-providers/module-1/spot_inturruption_handling.md b/content/ecs-spot-capacity-providers/module-1/spot_inturruption_handling.md index be20ea10..05044f76 100644 --- a/content/ecs-spot-capacity-providers/module-1/spot_inturruption_handling.md +++ b/content/ecs-spot-capacity-providers/module-1/spot_inturruption_handling.md @@ -1,45 +1,46 @@ --- -title: "Inturruption Handling On EC2 Spot Instances" -weight: 80 +title: "Module-2: Spot Interruption Handling" +weight: 40 +--- + +Inturruption Handling On EC2 Spot Instances --- Amazon EC2 terminates your Spot Instance when it needs the capacity back. Amazon EC2 provides a Spot Instance interruption notice, which gives the instance a two-minute warning before it is interrupted. -When Amazon EC2 is going to interrupt your Spot Instance, the interruption notification will be available in two ways +When Amazon EC2 is going to interrupt your Spot Instance, the interruption notification will be available in two ways: 1. 
***Amazon EventBridge Events:*** EC2 service emits an event two minutes prior to the actual interruption. This event can be detected by Amazon CloudWatch Events. -1. ***Instance-action in the MetaData service (IMDS):*** If your Spot Instance is marked to be stopped or terminated by the Spot service, the instance-action item is present in your instance metadata. - -look at the user data section in the Launch template configuration. +1. ***EC2 Instance Metadata service (IMDS):*** If your Spot Instance is marked for termination by EC2, the instance-action item is present in your instance metadata. -``` +In the Launch Template configuration, we added: +```plaintext echo "ECS_ENABLE_SPOT_INSTANCE_DRAINING=true" >> /etc/ecs/ecs.config ``` +When Amazon ECS Spot Instance draining is enabled on the instance, ECS receives the Spot Instance interruption notice and places the instance in DRAINING status. When a container instance is set to DRAINING, Amazon ECS prevents new tasks from being scheduled for placement on the container instance [Click here](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/container-instance-spot.html) to learn more. -The above configuration enables automatic draining of spot instances at the time of spot interruption notice. The ECS container agent runnining on the ECS container instances handles the interruption using the Instance Metadata service. +The web application (app.py) we used to buld docker image in this module shows two ways to handle the EC2 Spot interruption within a docker container. This allows you to perform actions such as preventing the processing of new work, checkpointing the progress of a batch job, or gracefully exiting the application to complete tasks such as ensuring database connections are properly closed -If the application can also handle the interruption to implement any checkpointing or saving the data. 
The web application (app.py) we used to buld docker image in the Module-2 shows two ways to handle the spot interruption within a docker container. +In the first method, it checks the instance metadata service for spot interruption and display a message to web page notifying the users (this is, of course, just a demonstration and not for real-life scenarios). -In the first method, it check the instance metadata service for spot interruption and display a message to web page notifying the users. +{{% notice warning %}} +In a production environment, you should not provide access from the ECS tasks to the IMDS. This is done in this workshop for simplification purposes. +{{% /notice %}} -Note: The ECS tasks should not be accessing EC2 metadata. For security reasons, this should be blocked this in a Prod environment. -``` +```plaintext URL = "http://169.254.169.254/latest/meta-data/spot/termination-time" SpotInt = requests.get(URL) if SpotInt.status_code == 200: - response += "This Spot Instance Got Interruption and Termination Date is {}".format(SpotInt.text) + response += "This Spot Instance will be terminated at: {}".format(SpotInt.text) ``` -In the second method, it listens to the **SIGTERM** signal. The ECS container agent calls StopTask API to stop all the tasks running on the Spot Instance. +In the second method, it listens to the **SIGTERM** signal. The ECS container agent calls the StopTask API to stop all the tasks running on the Spot Instance. When StopTask is called on a task, the equivalent of docker stop is issued to the containers running in the task. This results in a **SIGTERM** value and a default 30-second timeout, after which the SIGKILL value is sent and the containers are forcibly stopped. If the container handles the **SIGTERM** value gracefully and exits within 30 seconds from receiving it, no SIGKILL value is sent. - -The application can listen to the **SIGTERM** signal and handle the interruption gracefully. - -``` +```python class Ec2SpotInterruptionHandler: signals = { signal.SIGINT: 'SIGINT', @@ -71,5 +72,3 @@ is using. Specifying a stopTimeout value gives you time between the moment the t • The **SIGTERM** signal must be received from within the container to perform any cleanup actions. - -***Congratulations !!!*** you have successfully completed the workshop module and learnt how to create ASG CPs and schedule ECS services across Spot and On-demand CPs. You may proceed to optional module using Fargate Spot Capacity Providers. \ No newline at end of file From 9b506e03247e09205e67dd7314c56000ff750161 Mon Sep 17 00:00:00 2001 From: Rajesh Kesaraju Date: Thu, 15 Oct 2020 16:51:09 -0400 Subject: [PATCH 4/4] Module-2 restructured with Optional Fargate We'll find a batter way to identify main flow and optional flow. Until then leaving module terminology.
--- .../module-1/spot_inturruption_handling.md | 7 +- .../module-2/_index.md | 76 ++++++------------- .../{module-3 => module-2}/fargate_service.md | 2 +- .../{module-3 => module-2}/fargate_task.md | 0 .../module-3/_index.md | 42 ---------- .../ecs-spot-capacity-providers/modules.md | 3 +- 6 files changed, 26 insertions(+), 104 deletions(-) rename content/ecs-spot-capacity-providers/{module-3 => module-2}/fargate_service.md (98%) rename content/ecs-spot-capacity-providers/{module-3 => module-2}/fargate_task.md (100%) delete mode 100644 content/ecs-spot-capacity-providers/module-3/_index.md diff --git a/content/ecs-spot-capacity-providers/module-1/spot_inturruption_handling.md b/content/ecs-spot-capacity-providers/module-1/spot_inturruption_handling.md index 05044f76..ff1773fa 100644 --- a/content/ecs-spot-capacity-providers/module-1/spot_inturruption_handling.md +++ b/content/ecs-spot-capacity-providers/module-1/spot_inturruption_handling.md @@ -1,9 +1,6 @@ --- -title: "Module-2: Spot Interruption Handling" -weight: 40 ---- - -Inturruption Handling On EC2 Spot Instances +title: "Inturruption Handling On EC2 Spot Instances" +weight: 80 --- Amazon EC2 terminates your Spot Instance when it needs the capacity back. Amazon EC2 provides a Spot Instance interruption notice, which gives the instance a two-minute warning before it is interrupted. diff --git a/content/ecs-spot-capacity-providers/module-2/_index.md b/content/ecs-spot-capacity-providers/module-2/_index.md index 05044f76..23931cd4 100644 --- a/content/ecs-spot-capacity-providers/module-2/_index.md +++ b/content/ecs-spot-capacity-providers/module-2/_index.md @@ -1,74 +1,42 @@ --- -title: "Module-2: Spot Interruption Handling" +title: "Module-2 (Optional): Saving costs using AWS Fargate Spot Capacity Providers" weight: 40 --- -Inturruption Handling On EC2 Spot Instances +AWS Fargate Capacity Providers --- -Amazon EC2 terminates your Spot Instance when it needs the capacity back. 
Amazon EC2 provides a Spot Instance interruption notice, which gives the instance a two-minute warning before it is interrupted. +Amazon ECS cluster capacity providers enable you to use both Fargate and Fargate Spot capacity with your Amazon ECS tasks. With Fargate Spot you can run interruption tolerant Amazon ECS tasks at a discounted rate compared to the Fargate price. Fargate Spot runs tasks on spare compute capacity. When AWS needs the capacity back, your tasks will be interrupted with a two-minute warning -When Amazon EC2 is going to interrupt your Spot Instance, the interruption notification will be available in two ways: +Creating a New ECS Cluster That Uses Fargate Capacity Providers +--- -1. ***Amazon EventBridge Events:*** EC2 service emits an event two minutes prior to the actual interruption. This event can be detected by Amazon CloudWatch Events. +When a new Amazon ECS cluster is created, you specify one or more capacity providers to associate with the cluster. The associated capacity providers determine the infrastructure to run your tasks on. Set the following global variables for the names of resources be created in this workshop -1. ***EC2 Instance Metadata service (IMDS):*** If your Spot Instance is marked for termination by EC2, the instance-action item is present in your instance metadata. +Run the following command to create a new cluster and associate both the Fargate and Fargate Spot capacity providers with it. -In the Launch Template configuration, we added: -```plaintext -echo "ECS_ENABLE_SPOT_INSTANCE_DRAINING=true" >> /etc/ecs/ecs.config ``` -When Amazon ECS Spot Instance draining is enabled on the instance, ECS receives the Spot Instance interruption notice and places the instance in DRAINING status. 
When a container instance is set to DRAINING, Amazon ECS prevents new tasks from being scheduled for placement on the container instance [Click here](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/container-instance-spot.html) to learn more. - -The web application (app.py) we used to buld docker image in this module shows two ways to handle the EC2 Spot interruption within a docker container. This allows you to perform actions such as preventing the processing of new work, checkpointing the progress of a batch job, or gracefully exiting the application to complete tasks such as ensuring database connections are properly closed - -In the first method, it checks the instance metadata service for spot interruption and display a message to web page notifying the users (this is, of course, just a demonstration and not for real-life scenarios). - -{{% notice warning %}} -In a production environment, you should not provide access from the ECS tasks to the IMDS. This is done in this workshop for simplification purposes. -{{% /notice %}} - - -```plaintext -URL = "http://169.254.169.254/latest/meta-data/spot/termination-time" -SpotInt = requests.get(URL) -if SpotInt.status_code == 200: - response += "

This Spot Instance will be terminated at: {}


".format(SpotInt.text) +aws ecs create-cluster \ +--cluster-name EcsSpotWorkshop \ +--capacity-providers FARGATE FARGATE_SPOT \ +--region $AWS_REGION \ +--default-capacity-provider-strategy capacityProvider=FARGATE,base=1,weight=1 ``` +If the above command fails with below error, run the command again. It should create the cluster now. -In the second method, it listens to the **SIGTERM** signal. The ECS container agent calls the StopTask API to stop all the tasks running on the Spot Instance. - -When StopTask is called on a task, the equivalent of docker stop is issued to the containers running in the task. This results in a **SIGTERM** value and a default 30-second timeout, after which the SIGKILL value is sent and the containers are forcibly stopped. If the container handles the **SIGTERM** value gracefully and exits within 30 seconds from receiving it, no SIGKILL value is sent. - -```python -class Ec2SpotInterruptionHandler: - signals = { - signal.SIGINT: 'SIGINT', - signal.SIGTERM: 'SIGTERM' - } - -def __init__(self): - signal.signal(signal.SIGINT, self.exit_gracefully) - signal.signal(signal.SIGTERM, self.exit_gracefully) - -def exit_gracefully(self, signum, frame): - print("\nReceived {} signal".format(self.signals[signum])) - if self.signals[signum] == 'SIGTERM': - print("Looks like there is a Spot Interruption. Let's wrap up the processing to avoid forceful killing of the applucation in next 30 sec ...") +``` +“An error occurred (InvalidParameterException) when calling the CreateCluster operation: Unable to assume the service linked role. Please verify that the ECS service linked role exists.“ ``` -Spot Interruption Handling on ECS Fargate Spot ---- +The ECS cluster will look like below in the AWS Console. 
Select ECS in **Services** and click on **Clusters** on left panel + +![ECS Cluster](/images/ecs-spot-capacity-providers/c1.png) -When tasks using Fargate Spot capacity are stopped due to a Spot interruption, a two-minute warning is sent before a task is stopped. The warning is sent as a task state change event to Amazon EventBridge -and a SIGTERM signal to the running task. When using Fargate Spot as part of a service, the service -scheduler will receive the interruption signal and attempt to launch additional tasks on Fargate Spot if -capacity is available. +Note that above ECS cluster create command also specifies a default capacity provider strategy. -To ensure that your containers exit gracefully before the task stops, the following can be configured: +The strategy sets FARGATE as the default capacity provider. That means if there is no capacity provider strategy specified during the deployment of Tasks/Services, ECS by default chooses the FARGATE Capacity Provider to launch them. -• A stopTimeout value of 120 seconds or less can be specified in the container definition that the task -is using. Specifying a stopTimeout value gives you time between the moment the task state change event is received and the point at which the container is forcefully stopped. +Click _***Update Cluster***_ on the top right corner to see default Capacity Provider Strategy. As shown base=1 is set for FARGATE Capacity Provider. -• The **SIGTERM** signal must be received from within the container to perform any cleanup actions. 
+![ECS Cluster](/images/ecs-spot-capacity-providers/c2.png) diff --git a/content/ecs-spot-capacity-providers/module-3/fargate_service.md b/content/ecs-spot-capacity-providers/module-2/fargate_service.md similarity index 98% rename from content/ecs-spot-capacity-providers/module-3/fargate_service.md rename to content/ecs-spot-capacity-providers/module-2/fargate_service.md index 31a87576..585fc1ed 100644 --- a/content/ecs-spot-capacity-providers/module-3/fargate_service.md +++ b/content/ecs-spot-capacity-providers/module-2/fargate_service.md @@ -97,4 +97,4 @@ As you see 3 tasks were placed on FARGATE and 1 is placed on FARGATE_SPOT Capaci ***Optional Exercise:*** Try changing the Capacity Provider Strategy by assigning different weightrs to FARGATE and FARGATE_SPOT Capacity Providers and update the service. -***Congratulations !!!*** you have successfully completed Module-3. +***Congratulations !!!*** you have successfully completed the workshop!!!. diff --git a/content/ecs-spot-capacity-providers/module-3/fargate_task.md b/content/ecs-spot-capacity-providers/module-2/fargate_task.md similarity index 100% rename from content/ecs-spot-capacity-providers/module-3/fargate_task.md rename to content/ecs-spot-capacity-providers/module-2/fargate_task.md diff --git a/content/ecs-spot-capacity-providers/module-3/_index.md b/content/ecs-spot-capacity-providers/module-3/_index.md deleted file mode 100644 index e3fbf823..00000000 --- a/content/ecs-spot-capacity-providers/module-3/_index.md +++ /dev/null @@ -1,42 +0,0 @@ ---- -title: "Module-3 (Optional): Saving costs using AWS Fargate Spot Capacity Providers" -weight: 40 ---- - -AWS Fargate Capacity Providers ---- - -Amazon ECS cluster capacity providers enable you to use both Fargate and Fargate Spot capacity with your Amazon ECS tasks. With Fargate Spot you can run interruption tolerant Amazon ECS tasks at a discounted rate compared to the Fargate price. Fargate Spot runs tasks on spare compute capacity. 
When AWS needs the capacity back, your tasks will be interrupted with a two-minute warning - -Creating a New ECS Cluster That Uses Fargate Capacity Providers ---- - -When a new Amazon ECS cluster is created, you specify one or more capacity providers to associate with the cluster. The associated capacity providers determine the infrastructure to run your tasks on. Set the following global variables for the names of resources be created in this workshop - -Run the following command to create a new cluster and associate both the Fargate and Fargate Spot capacity providers with it. - -``` -aws ecs create-cluster \ ---cluster-name EcsSpotWorkshop \ ---capacity-providers FARGATE FARGATE_SPOT \ ---region $AWS_REGION \ ---default-capacity-provider-strategy capacityProvider=FARGATE,base=1,weight=1 -``` -If the above command fails with below error, run the command again. It should create the cluster now. - -``` -“An error occurred (InvalidParameterException) when calling the CreateCluster operation: Unable to assume the service linked role. Please verify that the ECS service linked role exists.“ -``` - -The ECS cluster will look like below in the AWS Console. Select ECS in **Services** and click on **Clusters** on left panel - -![ECS Cluster](/images/ecs-spot-capacity-providers/c1.png) - -Note that above ECS cluster create command also specifies a default capacity provider strategy. - -The strategy sets FARGATE as the default capacity provider. That means if there is no capacity provider strategy specified during the deployment of Tasks/Services, ECS by default chooses the FARGATE Capacity Provider to launch them. - -Click _***Update Cluster***_ on the top right corner to see default Capacity Provider Strategy. As shown base=1 is set for FARGATE Capacity Provider. 
- -![ECS Cluster](/images/ecs-spot-capacity-providers/c2.png) - diff --git a/content/ecs-spot-capacity-providers/modules.md b/content/ecs-spot-capacity-providers/modules.md index 6ce13e6d..74828964 100644 --- a/content/ecs-spot-capacity-providers/modules.md +++ b/content/ecs-spot-capacity-providers/modules.md @@ -11,5 +11,4 @@ These modules are designed to be completed in sequence. If you are reading this | Modules | Description | | --- | --- | | **Module-1** | Cost optimizing ECS using Spot Instances with Auto Scaling groups Capacity Providers | -| **Module-2** | Handling EC2 Spot Interruptions | -| **Module-3 (Optional)** | Cost optimizing ECS using AWS Fargate Spot | +| **Module-2 (Optional)** | Cost optimizing ECS using AWS Fargate Spot |
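For reference, the SIGTERM handling described in the interruption-handling page of this series can be sketched as a small self-contained script. This is a minimal sketch, not the workshop's actual app.py: the class name `SpotInterruptionHandler`, the `install` method, and the `process` work loop are hypothetical names chosen for illustration. It assumes the container's main process is the Python interpreter, so that the SIGTERM issued on StopTask reaches this handler.

```python
import signal


class SpotInterruptionHandler:
    """Remembers the first SIGINT/SIGTERM so the work loop can stop cleanly."""

    signals = {signal.SIGINT: "SIGINT", signal.SIGTERM: "SIGTERM"}

    def __init__(self):
        # Name of the signal that was received, or None if none arrived yet.
        self.received = None

    def install(self):
        # Register the handler; signal.signal must be called from the main thread.
        for signum in self.signals:
            signal.signal(signum, self.exit_gracefully)

    def exit_gracefully(self, signum, frame):
        # Only record the signal here; the work loop decides how to wind down.
        self.received = self.signals[signum]
        print("Received {} signal".format(self.received))


def process(items, handler):
    """Hypothetical work loop: stops picking up new items once a signal arrived."""
    done = []
    for item in items:
        if handler.received is not None:
            break
        done.append(item)
    return done
```

In a container, `install()` would be called once at startup. The handler only sets a flag, which leaves the application free to finish the item in flight, checkpoint, and close connections within the stop timeout before SIGKILL is sent.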