Deep Learning Base AMI (Amazon Linux)
The Deep Learning Base AMI comes with a CUDA 9 environment configured by default, but it can easily be switched to CUDA 8: https://docs.aws.amazon.com/dlami/latest/devguide/tutorial-base.html
We'll prebuild nvidia-docker 2 into this base AMI.
Following the document "Creating a GPU Workload AMI", install nvidia-docker and the GPU driver on the instance, then create an AMI from it.
nvidia-docker is a tool implemented as a wrapper around Docker: as long as the NVIDIA driver is installed on the host OS, a container that contains the CUDA Toolkit can use the GPU.
More detail: https://www.nvidia.com/object/docker-container.html
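On the AMI, after the NVIDIA driver and the nvidia-docker2 package are installed, registering nvidia as Docker's default runtime lets every container the ECS agent starts see the GPU without extra per-task flags. A sketch of /etc/docker/daemon.json, assuming the standard nvidia-container-runtime install path:

```json
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
```

After restarting the Docker daemon, running a CUDA container with nvidia-smi should list the GPU; once it does, the instance is ready to be imaged into the AMI.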
Following the document "Creating a Cluster", create an ECS cluster; it's basically fine to create it with all the default values.
ECS actually creates a CloudFormation stack behind the scenes. To change the instance type launched in the cluster and the underlying AMI, you need to update that CloudFormation stack.
On the AWS CloudFormation console, select EC2ContainerService-${cluster name} (my cluster name is ECS-GPU-Cluster) and choose Update Stack.
In "Specify stack details", update EcsAmiId and EcsInstanceType.
Note: in this guide my EcsAmiId is ami-020d1cd527153432a and my EcsInstanceType is p2.xlarge; please substitute your own values.
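The same update can also be driven from the CLI with `aws cloudformation update-stack --use-previous-template`. A hypothetical parameters file using the example values above (substitute your own AMI ID and instance type):

```json
[
  { "ParameterKey": "EcsAmiId", "ParameterValue": "ami-020d1cd527153432a" },
  { "ParameterKey": "EcsInstanceType", "ParameterValue": "p2.xlarge" }
]
```

Any other stack parameter you don't want to change should be passed with `"UsePreviousValue": true`, so the rest of the stack is left untouched.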
Create the Task Definition and choose the EC2 launch type.
Note: Fargate does not currently support GPU instances.
There are no other special steps to call out; just follow the document "Creating a Task Definition" as usual.
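As a smoke test, a minimal EC2-launch-type task definition that simply runs nvidia-smi works well. This is a sketch: the family and container names are my own choices, and GPU access relies on nvidia being the default Docker runtime on the container instance, not on anything in the task definition itself:

```json
{
  "family": "gpu-smoke-test",
  "requiresCompatibilities": ["EC2"],
  "containerDefinitions": [
    {
      "name": "nvidia-smi",
      "image": "nvidia/cuda:9.0-base",
      "memory": 128,
      "essential": true,
      "command": ["nvidia-smi"]
    }
  ]
}
```

If the task's logs show the nvidia-smi device table, the cluster is passing the GPU through to containers correctly.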
With this, the environment for GPU computation on ECS is done.
Scale your instances and desired task count to at least 1, start your service, and enjoy the ride.