Skip to content

Commit

Permalink
feat(ecs): support Fargate and Fargate spot capacity providers
Browse files Browse the repository at this point in the history
  • Loading branch information
SoManyHs committed Feb 8, 2021
1 parent 6de792c commit 45c5693
Show file tree
Hide file tree
Showing 17 changed files with 1,039 additions and 22 deletions.
75 changes: 75 additions & 0 deletions design/aws-ecs/aws-ecs-fargate-capacity-providers.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
# Fargate Spot Capacity Provider support in the CDK

## Objective

Since Capacity Providers are now supported in CloudFormation, incorporating support for Fargate Spot capacity has been one of the [top asks](https://github.com/aws/aws-cdk/issues?q=is%3Aissue+is%3Aopen+label%3A%40aws-cdk%2Faws-ecs+sort%3Areactions-%2B1-desc) for the ECS CDK module, with over 60 customer reactions. While there are still some outstanding issues regarding capacity provider support in general, specifically regarding cyclic workflows with named clusters (See: [CFN issue](http://%20https//github.com/aws/containers-roadmap/issues/631#issuecomment-702580141)), we should be able to move ahead with supporting `FARGATE` and `FARGATE_SPOT` capacity providers with our existing FargateService construct.

See: https://github.com/aws/aws-cdk/issues/5850

## CloudFormation Requirements

### Cluster

A list of capacity providers (specifically, `FARGATE` and `FARGATE_SPOT`) need to be specified on the cluster itself as part of the [CapacityProviders](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-ecs-cluster.html#cfn-ecs-cluster-capacityproviders) field.

Additionally, there is a [DefaultCapacityProviderStrategy](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-ecs-cluster.html#cfn-ecs-cluster-defaultcapacityproviderstrategy) on the cluster. While it is considered best practice to specify one if using capacity providers, this may not be necessary when only using Fargate capacity providers.

### Service

The [CapacityProviderStrategy](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-ecs-service.html#cfn-ecs-service-capacityproviderstrategy) field will need to be added to the Service construct. This would be a list of capacity provider strategies (aka [CapacityProviderStrategyItem](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-ecs-service-capacityproviderstrategyitem.html) in CFN) used for the service.

_Note_: It may be more readable to name the `CapacityProviderStrategy` field on the service to *CapacityProviderStrategies*, which would be a list of *CapacityProviderStrategy* objects that correspond to the CFN `CapacityProviderStrategyItem`.


## Proposed solution

### User Experience

The most straightforward solution would be to add the *capacityProviders* field on cluster, which the customer would have to set to the Fargate capacity providers (`FARGATE` and `FARGATE_SPOT`), and then specify the *capacityProviderStrategies* field on the FargateService with one or more strategies that use the Fargate capacity providers.

Example:

```ts
const stack = new cdk.Stack();
const vpc = new ec2.Vpc(stack, 'MyVpc', {});
const cluster = new ecs.Cluster(stack, 'EcsCluster', {
vpc,
*capacityProviders: ['FARGATE', 'FARGATE_SPOT'],*
});

const taskDefinition = new ecs.FargateTaskDefinition(stack, 'FargateTaskDef');

const container = taskDefinition.addContainer('web', {
image: ecs.ContainerImage.fromRegistry('amazon/amazon-ecs-sample'),
memoryLimitMiB: 512,
});
container.addPortMappings({ containerPort: 8000 });

new ecs.FargateService(stack, 'FargateService', {
cluster,
taskDefinition,
*capacityProviderStrategies**: [
{
capacityProvider: 'FARGATE_SPOT',
weight: 2,
},
{
capacityProvider: 'FARGATE',
weight: 1,
}
],*
});
```

The type for the *capacityProviders* field on a *Cluster* would be a list of string literals. An alternative that ensures type safety is to have `FARGATE` and `FARGATE_SPOT` as enum values; however, this would make it potentially more difficult to support Autoscaling Group capacity providers in the future, since [capacity providers](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/cluster-capacity-providers.html) of that type would have be specified by their capacity provider name (as a string literal).

The type for the *capacityProviderStrategies* field on a *Service* would be a list of [*CapacityProviderStrategy*](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-ecs-service-capacityproviderstrategyitem.html) objects, taking the form:

{"[Base](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-ecs-service-capacityproviderstrategyitem.html#cfn-ecs-service-capacityproviderstrategyitem-base)" : Integer, "[CapacityProvider](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-ecs-service-capacityproviderstrategyitem.html#cfn-ecs-service-capacityproviderstrategyitem-capacityprovider)" : String, "[Weight](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-ecs-service-capacityproviderstrategyitem.html#cfn-ecs-service-capacityproviderstrategyitem-weight)" : Integer }

This new field would be added to the BaseService, not only for better extensibility when we add support for ASG capacity providers, but also to facilitate construction, since the FargateService extends the BaseService and would necessarily call super into the BaseService constructor.

### Alternatives
One alternative could be to provide a more magical experience by populating the capacityProviders field under the hood (for example, by modifying the cluster if capacityProviderStrategies is set on a FargateService). However, it’s unclear if this may lead to undesired behavior, especially if the customer is using an imported cluster. There is also the slight disadvantage of this being a less consistent behavior with how ASG capacity providers will be set in the future, and would break from the general pattern of setting resource fields at construction time.

Another option would be to create a new FargateCluster resource, that would have the two Fargate capacity providers set by default. The main advantage with this alternative would be that it would be consistent with the current Console experience, which sets the Fargate capacity providers for you if you choose the “Networking Only” cluster template via the cluster wizard. The downside is that it would be a more restrictive resource model that would go back on the decision to have a single generic ECS Cluster resource that could potentially contain both Fargate and EC2 services or tasks.
87 changes: 87 additions & 0 deletions packages/@aws-cdk/aws-ecs/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -670,3 +670,90 @@ taskDefinition.addContainer('TheContainer', {
})
});
```

## Capacity Providers

Currently, only `FARGATE` and `FARGATE_SPOT` capacity providers are supported.

To enable capacity providers on your cluster, set the `capacityProviders` field
to [`FARGATE`, `FARGATE_SPOT`]. Then, specify capacity provider strategies on
the `capacityProviderStrategies` field for your Fargate Service.

```ts
import * as cdk from '@aws-cdk/core';
import * as ec2 from '@aws-cdk/aws-ec2';
import * as ecs from '../../lib';

const app = new cdk.App();
const stack = new cdk.Stack(app, 'aws-ecs-integ-capacity-provider');

const vpc = new ec2.Vpc(stack, 'Vpc', { maxAzs: 2 });

const cluster = new ecs.Cluster(stack, 'FargateCPCluster', {
vpc,
capacityProviders: ['FARGATE', 'FARGATE_SPOT'],
});

const taskDefinition = new ecs.FargateTaskDefinition(stack, 'TaskDef');

taskDefinition.addContainer('web', {
image: ecs.ContainerImage.fromRegistry('amazon/amazon-ecs-sample'),
});

new ecs.FargateService(stack, 'FargateService', {
cluster,
taskDefinition,
capacityProviderStrategies: [
{
capacityProvider: 'FARGATE_SPOT',
weight: 2,
},
{
capacityProvider: 'FARGATE',
weight: 1,
}
],
});

app.synth();
```

Alternatively, you can specify a defaultCapacityProviderStrategy on the cluster
itself using either of the Fargate capacity providers. Any service launched
inthe cluster that does not specify its own capacityProviderStrategies will use
the default.

```ts
import * as cdk from '@aws-cdk/core';
import * as ec2 from '@aws-cdk/aws-ec2';
import * as ecs from '../../lib';

const app = new cdk.App();
const stack = new cdk.Stack(app, 'aws-ecs-integ-capacity-provider');

const vpc = new ec2.Vpc(stack, 'Vpc', { maxAzs: 2 });

const defaultCapacityProviderStrategy = [{
capacityProvider: 'FARGATE_SPOT',
weight: 2,
}];

const cluster = new ecs.Cluster(stack, 'EcsCluster', {
vpc,
defaultCapacityProviderStrategy,
capacityProviders: ['FARGATE', 'FARGATE_SPOT'],
});

const taskDefinition = new ecs.FargateTaskDefinition(stack, 'TaskDef');

taskDefinition.addContainer('web', {
image: ecs.ContainerImage.fromRegistry('amazon/amazon-ecs-sample'),
});

new ecs.FargateService(stack, 'FargateService', {
cluster,
taskDefinition,
});

app.synth();
```
25 changes: 23 additions & 2 deletions packages/@aws-cdk/aws-ecs/lib/base/base-service.ts
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ import * as cloudmap from '@aws-cdk/aws-servicediscovery';
import { Annotations, Duration, IResolvable, IResource, Lazy, Resource, Stack } from '@aws-cdk/core';
import { Construct } from 'constructs';
import { LoadBalancerTargetOptions, NetworkMode, TaskDefinition } from '../base/task-definition';
import { ICluster } from '../cluster';
import { ICluster, CapacityProviderStrategy } from '../cluster';
import { Protocol } from '../container-definition';
import { CfnService } from '../ecs.generated';
import { ScalableTaskCount } from './scalable-task-count';
Expand Down Expand Up @@ -181,6 +181,15 @@ export interface BaseServiceOptions {
* @default - disabled
*/
readonly circuitBreaker?: DeploymentCircuitBreaker;

/**
* A list of Capacity Provider strategies used to place a service.
*
* @default - the defaultCapacityProviderStrategy of the cluster the service is launched in if specified, otherwise
* undefined
*
*/
readonly capacityProviderStrategies?: CapacityProviderStrategy[];
}

/**
Expand All @@ -191,6 +200,10 @@ export interface BaseServiceProps extends BaseServiceOptions {
/**
* The launch type on which to run your service.
*
* LaunchType will be omitted if capacity provider strategies are specified on the service.
*
* @see - https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-ecs-service.html#cfn-ecs-service-capacityproviderstrategy
*
* Valid values are: LaunchType.ECS or LaunchType.FARGATE
*/
readonly launchType: LaunchType;
Expand Down Expand Up @@ -356,6 +369,13 @@ export abstract class BaseService extends Resource

this.taskDefinition = taskDefinition;

// launchType will set to undefined if using external DeploymentController or capacityProviderStrategies (whether
// specified on service itself or as default strategy on cluster)
const launchType = props.deploymentController?.type === DeploymentControllerType.EXTERNAL ||
props.capacityProviderStrategies !== undefined ||
props.cluster.defaultCapacityProviderStrategy !== undefined ?
undefined : props.launchType;

this.resource = new CfnService(this, 'Service', {
desiredCount: props.desiredCount,
serviceName: this.physicalName,
Expand All @@ -371,7 +391,8 @@ export abstract class BaseService extends Resource
propagateTags: props.propagateTags === PropagatedTagSource.NONE ? undefined : props.propagateTags,
enableEcsManagedTags: props.enableECSManagedTags ?? false,
deploymentController: props.deploymentController,
launchType: props.deploymentController?.type === DeploymentControllerType.EXTERNAL ? undefined : props.launchType,
launchType: launchType,
capacityProviderStrategy: props.capacityProviderStrategies,
healthCheckGracePeriodSeconds: this.evaluateHealthGracePeriod(props.healthCheckGracePeriod),
/* role: never specified, supplanted by Service Linked Role */
networkConfiguration: Lazy.any({ produce: () => this.networkConfiguration }, { omitEmptyArray: true }),
Expand Down
60 changes: 60 additions & 0 deletions packages/@aws-cdk/aws-ecs/lib/cluster.ts
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,20 @@ export interface ClusterProps {
*/
readonly capacity?: AddCapacityOptions;

/**
* The capacity providers to add to the cluster
*
* @default - None. Currently only FARGATE and FARGATE_SPOT are supported.
*/
readonly capacityProviders?: string[];

/**
* The default capacity provider strategy for the cluster.
*
* @default - no default strategy
*/
readonly defaultCapacityProviderStrategy?: CapacityProviderStrategy[];

/**
* If true CloudWatch Container Insights will be enabled for the cluster
*
Expand Down Expand Up @@ -101,6 +115,18 @@ export class Cluster extends Resource implements ICluster {
*/
public readonly clusterName: string;

/**
* The capacity providers associated with the cluster.
*/
public readonly capacityProviders?: string[];

/**
* The default capacity provider strategy for the cluster.
*
* @default - no default strategy
*/
public readonly defaultCapacityProviderStrategy?: CapacityProviderStrategy[];

/**
* The AWS Cloud Map namespace to associate with the cluster.
*/
Expand Down Expand Up @@ -137,6 +163,8 @@ export class Cluster extends Resource implements ICluster {
const cluster = new CfnCluster(this, 'Resource', {
clusterName: this.physicalName,
clusterSettings,
defaultCapacityProviderStrategy: props.defaultCapacityProviderStrategy,
capacityProviders: props.capacityProviders,
});

this.clusterArn = this.getResourceArnAttribute(cluster.attrArn, {
Expand All @@ -148,6 +176,8 @@ export class Cluster extends Resource implements ICluster {

this.vpc = props.vpc || new ec2.Vpc(this, 'Vpc', { maxAzs: 2 });

this.capacityProviders = props.capacityProviders;

this._defaultCloudMapNamespace = props.defaultCloudMapNamespace !== undefined
? this.addDefaultCloudMapNamespace(props.defaultCloudMapNamespace)
: undefined;
Expand Down Expand Up @@ -692,6 +722,11 @@ export interface ICluster extends IResource {
*/
readonly defaultCloudMapNamespace?: cloudmap.INamespace;

/**
* The default capacity provider strategy for the cluster.
*/
readonly defaultCapacityProviderStrategy?: CapacityProviderStrategy[];

/**
* The autoscaling group added to the cluster if capacity is associated to the cluster
*/
Expand Down Expand Up @@ -934,3 +969,28 @@ enum ContainerInsights {
*/
DISABLED = 'disabled',
}

/**
* A Capacity Provider strategy to use for the service.
*/
export interface CapacityProviderStrategy {
/**
* The name of the Capacity Provider. Currently only FARGATE and FARGATE_SPOT are supported.
*/
readonly capacityProvider: string;

/**
* The base value designates how many tasks, at a minimum, to run on the specified capacity provider. Only one
* capacity provider in a capacity provider strategy can have a base defined. If no value is specified, the default
* value of 0 is used.
*
* @default - none
*/
readonly base?: number;

/**
* The weight value designates the relative percentage of the total number of tasks launched that should use the specified
capacity provider. The weight value is taken into consideration after the base value, if defined, is satisfied.
*/
readonly weight: number;
}
2 changes: 2 additions & 0 deletions packages/@aws-cdk/aws-ecs/lib/fargate/fargate-service.ts
Original file line number Diff line number Diff line change
Expand Up @@ -67,6 +67,7 @@ export interface FargateServiceProps extends BaseServiceOptions {
* @default PropagatedTagSource.NONE
*/
readonly propagateTaskTagsFrom?: PropagatedTagSource;

}

/**
Expand Down Expand Up @@ -153,6 +154,7 @@ export class FargateService extends BaseService implements IFargateService {
...props,
desiredCount: props.desiredCount,
launchType: LaunchType.FARGATE,
capacityProviderStrategies: props.capacityProviderStrategies,
propagateTags: propagateTagsFromSource,
enableECSManagedTags: props.enableECSManagedTags,
}, {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -698,21 +698,21 @@
"Code": {
"ZipFile": "import boto3, json, os, time\n\necs = boto3.client('ecs')\nautoscaling = boto3.client('autoscaling')\n\n\ndef lambda_handler(event, context):\n print(json.dumps(event))\n cluster = os.environ['CLUSTER']\n snsTopicArn = event['Records'][0]['Sns']['TopicArn']\n lifecycle_event = json.loads(event['Records'][0]['Sns']['Message'])\n instance_id = lifecycle_event.get('EC2InstanceId')\n if not instance_id:\n print('Got event without EC2InstanceId: %s', json.dumps(event))\n return\n\n instance_arn = container_instance_arn(cluster, instance_id)\n print('Instance %s has container instance ARN %s' % (lifecycle_event['EC2InstanceId'], instance_arn))\n\n if not instance_arn:\n return\n\n while has_tasks(cluster, instance_arn):\n time.sleep(10)\n\n try:\n print('Terminating instance %s' % instance_id)\n autoscaling.complete_lifecycle_action(\n LifecycleActionResult='CONTINUE',\n **pick(lifecycle_event, 'LifecycleHookName', 'LifecycleActionToken', 'AutoScalingGroupName'))\n except Exception as e:\n # Lifecycle action may have already completed.\n print(str(e))\n\n\ndef container_instance_arn(cluster, instance_id):\n \"\"\"Turn an instance ID into a container instance ARN.\"\"\"\n arns = ecs.list_container_instances(cluster=cluster, filter='ec2InstanceId==' + instance_id)['containerInstanceArns']\n if not arns:\n return None\n return arns[0]\n\n\ndef has_tasks(cluster, instance_arn):\n \"\"\"Return True if the instance is running tasks for the given cluster.\"\"\"\n instances = ecs.describe_container_instances(cluster=cluster, containerInstances=[instance_arn])['containerInstances']\n if not instances:\n return False\n instance = instances[0]\n\n if instance['status'] == 'ACTIVE':\n # Start draining, then try again later\n set_container_instance_to_draining(cluster, instance_arn)\n return True\n\n tasks = instance['runningTasksCount'] + instance['pendingTasksCount']\n print('Instance %s has %s tasks' % (instance_arn, tasks))\n\n return tasks > 0\n\n\ndef set_container_instance_to_draining(cluster, instance_arn):\n ecs.update_container_instances_state(\n cluster=cluster,\n containerInstances=[instance_arn], status='DRAINING')\n\n\ndef pick(dct, *keys):\n \"\"\"Pick a subset of a dict.\"\"\"\n return {k: v for k, v in dct.items() if k in keys}\n"
},
"Handler": "index.lambda_handler",
"Role": {
"Fn::GetAtt": [
"EcsClusterDefaultAutoScalingGroupDrainECSHookFunctionServiceRole94543EDA",
"Arn"
]
},
"Runtime": "python3.6",
"Environment": {
"Variables": {
"CLUSTER": {
"Ref": "EcsCluster97242B84"
}
}
},
"Handler": "index.lambda_handler",
"Runtime": "python3.6",
"Tags": [
{
"Key": "Name",
Expand Down
Loading

0 comments on commit 45c5693

Please sign in to comment.