Integrated AWS Step Functions for workflow orchestration #179

Merged: 11 commits, Aug 3, 2022
4 changes: 2 additions & 2 deletions content/rendering-with-batch/_index.md
@@ -7,11 +7,11 @@ pre: "<b>9. </b>"
---

{{% notice info %}}
The estimated completion time of this lab is **60 minutes**. Please note that rendering the animation presented below can incur in costs up to **$15**.
The estimated completion time of this lab is **90 minutes**. Please note that rendering the animation presented below can incur in costs up to **$15**.
{{% /notice %}}
## Overview

In this workshop you will learn to submit jobs with [AWS Batch](https://aws.amazon.com/batch/) following Spot best practices to [render](https://en.wikipedia.org/wiki/Rendering_(computer_graphics)) a [Blender](https://www.blender.org/) file in a distributed way. You will be creating a docker container and publishing it in Amazon Elastic Container Registry (ECR). Then you will use that container in AWS Batch using a mix of EC2 On-Demand and Spot instances. [Spot instances](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-instances.html) are EC2 spare capacity offered at steep discounts compared to On-Demand instances and are a cost-effective choice for applications that can be interrupted, what makes them well-suited for the batch processing that we will run. After going through all the sections, you will have the following pipeline created:
In this workshop you will learn to submit jobs with [AWS Batch](https://aws.amazon.com/batch/) following Spot best practices to [render](https://en.wikipedia.org/wiki/Rendering_(computer_graphics)) a [Blender](https://www.blender.org/) file in a distributed way. You will be creating a docker container and publishing it in Amazon Elastic Container Registry (ECR). Then you will use that container in AWS Batch using a mix of EC2 On-Demand and Spot instances. [Spot instances](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-instances.html) are EC2 spare capacity offered at steep discounts compared to On-Demand instances and are a cost-effective choice for applications that can be interrupted, what makes them well-suited for the batch processing that we will run. After going through all the sections, you will have the following pipeline created, orchestrated by AWS Step Functions:

Review comment (Contributor):
Please use Spot Instances and On-Demand Instances instead of Spot instances and On-Demand instances.


1. A python script downloads the Blender file from S3 to extract the number of frames from the Blender project.
2. The script submits a batch job using an `array job` with as many tasks as number of frames. It also submits a single stitching job using [FFmpeg](https://ffmpeg.org/) to create a final video file.
257 changes: 0 additions & 257 deletions content/rendering-with-batch/batch/batch.files/job_submission.py

This file was deleted.

Binary file not shown.
4 changes: 3 additions & 1 deletion content/rendering-with-batch/batch/job_definition.md
@@ -18,7 +18,8 @@ cat <<EoF > job-definition-config.json
"containerProperties": {
"image": "${IMAGE}",
"vcpus": 1,
"memory": 8000
"memory": 8000,
"command": ["Ref::action", "-i", "Ref::inputUri", "-o", "Ref::outputUri", "-f", "Ref::framesPerJob"]
},
"retryStrategy": {
"attempts": 3
@@ -36,6 +37,7 @@ Let's explore the configuration parameters in the structure:
- **image**: the image used to start a container, this value is passed directly to the Docker daemon.
- **vcpus**: The number of vCPUs reserved for the job. Each vCPU is equivalent to 1,024 CPU shares.
- **memory**: hard limit (in MiB) for a container. If your container attempts to exceed the specified number, it's terminated.
- **command**: this is the command that will be executed in the container when the job is started. It has placeholders for some parameters that will be substituted when submitting the job using AWS Batch.
- **platformCapabilities**: the platform capabilities required by the job definition. Either `EC2` or `FARGATE`.
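
To make the **command** placeholders concrete, here is a minimal local sketch of how `Ref::` tokens could be replaced by job parameters at submission time. It only mimics the substitution in plain Python and is not AWS Batch's implementation. The function name `resolve_command`, the parameter values, and the bucket name are hypothetical, and leaving an unmatched placeholder unchanged is a choice made for this sketch rather than documented behavior.

```python
from typing import Dict, List

REF_PREFIX = "Ref::"


def resolve_command(command: List[str], parameters: Dict[str, str]) -> List[str]:
    """Replace whole-token Ref::name placeholders with parameter values.

    Hypothetical helper for illustration only. Tokens without a matching
    parameter are left unchanged (an assumption of this sketch).
    """
    resolved = []
    for token in command:
        if token.startswith(REF_PREFIX):
            name = token[len(REF_PREFIX):]
            resolved.append(parameters.get(name, token))
        else:
            resolved.append(token)
    return resolved


# Command taken from the job definition in this diff.
command = ["Ref::action", "-i", "Ref::inputUri", "-o", "Ref::outputUri", "-f", "Ref::framesPerJob"]

# Hypothetical parameter values; job parameters are passed as strings.
parameters = {
    "action": "render",
    "inputUri": "s3://example-bucket/scene.blend",
    "outputUri": "s3://example-bucket/frames",
    "framesPerJob": "1",
}

print(resolve_command(command, parameters))
```

With these assumed values, the container would receive the arguments `render -i s3://example-bucket/scene.blend -o s3://example-bucket/frames -f 1`.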

{{% notice info %}}
3 changes: 2 additions & 1 deletion content/rendering-with-batch/batch/job_queue.md
@@ -42,7 +42,8 @@ All the compute environments within a queue must be either (`SPOT` and/or `EC2`)
Execute this command to create the job queue. To learn more about this API, see [create-job-queue CLI command reference](https://docs.aws.amazon.com/cli/latest/reference/batch/create-job-queue.html).

```
aws batch create-job-queue --cli-input-json file://job-queue-config.json
export JOB_QUEUE_ARN=$(aws batch create-job-queue --cli-input-json file://job-queue-config.json | jq -r '.jobQueueArn')
echo "Job queue Arn: ${JOB_QUEUE_ARN}"
```
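
For readers less familiar with `jq`, the extraction done by `jq -r '.jobQueueArn'` can be sketched in Python as follows. This is an illustration under assumptions: the helper name `extract_job_queue_arn` is hypothetical, and the sample response (queue name, account ID, region) is made up and only mirrors the `jobQueueName` and `jobQueueArn` fields that the CLI returns.

```python
import json


def extract_job_queue_arn(cli_output: str) -> str:
    """Return the jobQueueArn field from create-job-queue JSON output.

    Hypothetical helper; raises KeyError if the field is missing.
    """
    response = json.loads(cli_output)
    return response["jobQueueArn"]


# Made-up sample response for illustration, not real output.
sample_output = """
{
    "jobQueueName": "ExampleQueue",
    "jobQueueArn": "arn:aws:batch:us-east-1:123456789012:job-queue/ExampleQueue"
}
"""

print("Job queue Arn:", extract_job_queue_arn(sample_output))
```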

Next, you are going to create a **Job Definition** that will be used to submit jobs.