Update ECS Deep Learning Workshop (#4)
* Updating ECS DL Workshop, previous content was making reference to python 2, moved to python 3
* Fix reference to image
ivallhon authored and ruecarlo committed Jul 3, 2019
1 parent 7822fee commit 964bb4f
Showing 10 changed files with 31 additions and 28 deletions.
2 changes: 1 addition & 1 deletion content/ecs-deep-learning-workshop/lab3.md
@@ -6,7 +6,7 @@ weight = 130
Now that you have an MXNet image ready to go, the next step is to create a task definition. A task definition specifies the parameters and requirements ECS uses to run your container, e.g. the Docker image, cpu/memory resource requirements, and host:container port mappings. You'll notice that the parameters in the task definition closely match options passed to a Docker run command. Task definitions are very flexible and can be used to deploy multiple containers that are linked together, for example an application server and a database. In this workshop, we will focus on deploying a single container.
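Since the task-definition parameters map closely to `docker run` options, it can help to see one spelled out. The following is an illustrative sketch only, not the workshop's actual definition: the family name, port numbers, and memory value are hypothetical, and the image placeholder follows the **AWS_ACCOUNT_ID**/**AWS_REGION**/**ECR_REPOSITORY** format used later in the labs.

```python
# Hypothetical minimal single-container ECS task definition, expressed as the
# JSON document you could pass to register-task-definition. All names and
# values here are illustrative, not the workshop's exact configuration.
import json

task_definition = {
    "family": "mxnet",  # analogous to naming a docker run invocation
    "containerDefinitions": [
        {
            "name": "mxnet",
            "image": "AWS_ACCOUNT_ID.dkr.ecr.AWS_REGION.amazonaws.com/ECR_REPOSITORY:latest",
            "memory": 1024,  # memory limit in MiB, like docker run --memory
            "portMappings": [
                # host:container port mapping, like docker run -p 80:8888
                {"hostPort": 80, "containerPort": 8888}
            ],
        }
    ],
}

print(json.dumps(task_definition, indent=2))
```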

\
1. Open the EC2 Container Service dashboard, click on **Task Definitions** in the left menu, and click **Create new Task Definition**.
1. Open the EC2 Container Service dashboard, click on **Task Definitions** in the left menu, and click **Create new Task Definition**. Then select **EC2** as launch type compatibility.

*Note: You'll notice there's a task definition already there in the list. Ignore this until you reach lab 5.*

21 changes: 14 additions & 7 deletions content/ecs-deep-learning-workshop/lab4.md
@@ -25,7 +25,7 @@ Now that you're in the container, you can feel free to navigate around. It shoul

$ cd /root/ecs-deep-learning-workshop/mxnet/example/image-classification/

$ python train_mnist.py --lr-factor 1
$ python3 train_mnist.py --lr-factor 1

You will start to see output right away. It will look something like:

@@ -45,24 +45,31 @@ As you should be able to tell, logging into a machine, then dropping into a shel

### Prediction
\
Since training a model can be resource intensive and a lengthy process, you will run through an example that uses a pre-trained model built from the full [ImageNet](http://image-net.org/) dataset, which is a collection of over 10 million images with thousands of classes for those images. This example is presented as a Juypter notebook, so you can interactively walk through the example.
Since training a model can be resource intensive and a lengthy process, you will run through an example that uses a pre-trained model built from the full [ImageNet](http://image-net.org/) dataset, which is a collection of over 10 million images with thousands of classes for those images. This example is available [here](https://github.com/apache/incubator-mxnet/blob/master/docs/tutorials/python/predict_image.md) and we will create a new Jupyter notebook to go through it.

If you're new to Jupyter, it is essentially a web application that allows you to interactively step through blocks of code. You can edit the code as needed, and there is a play button that lets you step through the cells. Cells that do not contain code have no effect, so you can hit play to pass through them.

\
1. Open a web browser and visit this URL to access the Jupyter notebook for the demo:

http://__EC2_PUBLIC_DNS_NAME__/notebooks/mxnet-notebooks/python/tutorials/predict_imagenet.ipynb
http://__EC2_PUBLIC_DNS_NAME__/tree/mxnet/docs/tutorials/python

\
2. Play through the cells to run through this example, which loads and prepares the pre-trained model as well as provide methods to load images into the model to predict its classification. If you've never used Jupyter before, you're probably wonder how you know something is happening. Cells with code are denoted on the left with "In [n]" where n is simply a cell number. When you play a cell that requires processing time, the number will show an asterisk.
2. Click on the **New** drop-down button on the right side, and then Python 3 to create a new notebook.

**IMPORTANT:** In cell 2, the default context is to use gpu, but in the case of this workshop, we're using cpu resources so change the text "gpu" to "cpu". Being able to switch between using cpu and gpu is a great feature of this library. See the following screenshot which illustrates where to change from gpu to cpu; also highlighted in the screenshot is the play button which lets you run the cells. While deep learning performance is better on gpu, you can make use of cpu resources in dev/test environments to keep costs down.
![Jupyter Notebook - Create](/images/ecs-deep-learning-workshop/new-jupyter-notebook.png)

![](/images/ecs-deep-learning-workshop/jupyter-notebook-predict.png)
\
3. Then, in the notebook, copy and paste the code blocks from the [example](https://github.com/apache/incubator-mxnet/blob/master/docs/tutorials/python/predict_image.md) and click Run to execute each block as you paste it into a cell. The code loads and prepares the pre-trained model and provides methods to load images into the model to predict their classification. If you've never used Jupyter before, you're probably wondering how you know something is happening. Cells with code are denoted on the left with "In [n]", where n is simply a cell number. When you play a cell that requires processing time, the number shows an asterisk.

See the following screenshot, which illustrates the notebook and the play button that lets you run code in the cells as you paste it.

![Jupyter Notebook - Predict](/images/ecs-deep-learning-workshop/jupyter-notebook-predict.png)

**IMPORTANT**: In the second code block, you will see we are setting the context to cpu, since for this workshop we're using cpu resources. When using an instance type with a gpu, you can switch the context to gpu. Being able to switch between cpu and gpu is a great feature of this library. While deep learning performance is better on a gpu, you can make use of cpu resources in dev/test environments to keep costs down.
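The cpu/gpu switch described above is a one-line change in MXNet. Here is a sketch, guarded so it degrades gracefully when MXNet is not installed; it assumes MXNet's `mx.cpu()`/`mx.gpu()` context factories and the `mx.context.num_gpus()` helper available in MXNet 1.3+.

```python
# Sketch of MXNet context selection: prefer a GPU when one is present,
# otherwise fall back to CPU. Wrapped in try/except so the snippet also
# runs in an environment where mxnet is not installed.
try:
    import mxnet as mx
    ctx = mx.gpu() if mx.context.num_gpus() > 0 else mx.cpu()
except Exception:
    ctx = None  # mxnet not available in this environment

print("compute context:", ctx)
```

In the workshop's notebook, passing the chosen context when binding the model is all it takes to move the computation between cpu and gpu.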

\
3. Once you've stepped through the two examples at the end of the notebook, try feeding arbitrary images to see how well the model performs. Remember that Jupyter notebooks let you input your own code in a cell and run it, so feel free to experiment.
4. Once you've stepped through the two examples at the end of the notebook, try feeding arbitrary images to see how well the model performs. Remember that Jupyter notebooks let you input your own code in a cell and run it, so feel free to experiment.



10 changes: 5 additions & 5 deletions content/ecs-deep-learning-workshop/lab5.md
@@ -12,21 +12,21 @@ At this point, you've run through training and prediction examples using the com

### Training task

1. Open the EC2 Container Service dashboard, click on **Task Definitions** in the left menu, and click **Create new Task Definition**.
1. Open the EC2 Container Service dashboard, click on **Task Definitions** in the left menu, and click **Create new Task Definition**. Select **EC2** as Launch compatibility and click Next step.

2. Name your task definition, e.g. "mxnet-train".

3. Click on **Add container** and complete the Standard fields in the Add container window. Provide a name for your container, e.g. "mxnet-train". The image field is the same container image that you deployed previously. As a reminder, the format is equivalent to the registry/repository:tag format used in lab 2, step 6, i.e. **AWS_ACCOUNT_ID**.dkr.ecr.**AWS_REGION**.amazonaws.com/**ECR_REPOSITORY**:latest.

Set the memory to "1024". Leave the port mapping blank because you will not be starting the Jupyter process, and instead running a command to perform the training.
Set the memory to a soft limit of "1024". Leave the port mapping blank because you will not be starting the Jupyter process; instead you will run a command to perform the training.

Scroll down to the **Advanced Container** configuration section, and in the **Entry point** field, type:

/bin/bash, -c

In the Command field, type:

DATE=`date -Iseconds` && echo \\\"running train_mnist.py\\\" && cd /root/ecs-deep-learning-workshop/mxnet/example/image-classification/ && python train_mnist.py --lr-factor 1|& tee results && echo \\\"results being written to s3://$OUTPUTBUCKET/train_mnist.results.$HOSTNAME.$DATE.txt\\\" && aws s3 cp results s3://$OUTPUTBUCKET/train_mnist.results.$HOSTNAME.$DATE.txt && echo \\\"Task complete!\\\"
DATE=`date -Iseconds` && echo \\\"running train_mnist.py\\\" && cd /root/ecs-deep-learning-workshop/mxnet/example/image-classification/ && python3 train_mnist.py --lr-factor 1|& tee results && echo \\\"results being written to s3://$OUTPUTBUCKET/train_mnist.results.$HOSTNAME.$DATE.txt\\\" && aws s3 cp results s3://$OUTPUTBUCKET/train_mnist.results.$HOSTNAME.$DATE.txt && echo \\\"Task complete!\\\"
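The Command above stitches together a timestamped S3 key from the hostname and the date before uploading the results. For clarity, roughly the same naming scheme in Python (the shell's `date -Iseconds` uses the local timezone; UTC is used here for simplicity):

```python
# Mirrors the key built by the shell command:
#   train_mnist.results.$HOSTNAME.$DATE.txt   with DATE=`date -Iseconds`
import socket
from datetime import datetime, timezone

date = datetime.now(timezone.utc).isoformat(timespec="seconds")
hostname = socket.gethostname()
key = f"train_mnist.results.{hostname}.{date}.txt"

# The bucket itself comes from the OUTPUTBUCKET environment variable.
print("s3://$OUTPUTBUCKET/" + key)
```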

The command references an OUTPUTBUCKET environment variable, which you can set in **Env variables**. Set the key to "OUTPUTBUCKET" and the value to the S3 output bucket created by CloudFormation. You can find the value of your S3 output bucket in the CloudFormation stack outputs tab, under **outputBucketName**. Set "AWS_DEFAULT_REGION" to the value of **awsRegionName** from the same outputs tab.

@@ -50,13 +50,13 @@ Click **Add** to save this configuration and add it to the task definition. Click

### Prediction task

1. Return to the **Task Definitions** page, and click **Create new Task Definition**.
1. Return to the **Task Definitions** page, and click **Create new Task Definition**. Select **EC2** as Launch compatibility and click Next step.

2. Name your task definition, e.g. "mxnet-predict".

3. Click on **Add container** and complete the Standard fields in the Add container window. Provide a name for your container, e.g. "mxnet-predict". The image field is the same container image that you deployed previously. As a reminder, the format is equivalent to the registry/repository:tag format used in lab 2, step 6, i.e. **AWS_ACCOUNT_ID**.dkr.ecr.**AWS_REGION**.amazonaws.com/**ECR_REPOSITORY**:latest.

Set the memory to "1024". Leave the port mapping blank because you will not be starting the Jupyter process, and instead running a command to perform the training.
Set the memory to a soft limit of "1024". Leave the port mapping blank because you will not be starting the Jupyter process; instead you will run a command to perform the prediction.

Scroll down to the **Advanced Container configuration** section, and in the **Entry point** field, type:

5 binary files are not shown.
17 changes: 7 additions & 10 deletions workshops/ecs-deep-learning-workshop/lab-2-build/mxnet/Dockerfile
@@ -6,19 +6,16 @@ RUN apt-get -y update
RUN apt-get -y install git \
python-opencv \
build-essential \
python-dev \
python-tk
python3-dev \
python3-tk

RUN pip install dumb-init
RUN pip install awscli
RUN pip install matplotlib
RUN pip install opencv-python dumb-init awscli matplotlib

ENV WORKSHOPDIR /root/ecs-deep-learning-workshop
RUN mkdir ${WORKSHOPDIR}
RUN mkdir ${WORKSHOPDIR}

RUN cd ${WORKSHOPDIR} \
&& git clone --recursive https://github.com/apache/incubator-mxnet.git mxnet \
&& git clone https://github.com/dmlc/mxnet-notebooks.git
&& git clone --recursive https://github.com/apache/incubator-mxnet.git mxnet

COPY predict_imagenet.py /usr/local/bin/

@@ -30,9 +27,9 @@ RUN jupyter-notebook --generate-config --allow-root \

ARG PASSWORD

RUN python -c "from notebook.auth import passwd;print(passwd('${PASSWORD}') if '${PASSWORD}' != '' else 'sha1:c6bd96fb0824:6654e9eabfc54d0b3d0715ddf9561bed18e09b82')" > ${WORKSHOPDIR}/password_temp
RUN python3 -c "from notebook.auth import passwd;print(passwd('${PASSWORD}') if '${PASSWORD}' != '' else 'sha1:c6bd96fb0824:6654e9eabfc54d0b3d0715ddf9561bed18e09b82')" > ${WORKSHOPDIR}/password_temp

RUN sed -i "s/#c.NotebookApp.password = u''/c.NotebookApp.password = u'$(cat ${WORKSHOPDIR}/password_temp)'/g" /root/.jupyter/jupyter_notebook_config.py
RUN sed -i "s/#c.NotebookApp.password = ''/c.NotebookApp.password = '$(cat ${WORKSHOPDIR}/password_temp)'/g" /root/.jupyter/jupyter_notebook_config.py

RUN rm ${WORKSHOPDIR}/password_temp

@@ -1,7 +1,7 @@
#!/usr/bin/env ipython
#!/usr/bin/env python3

from __future__ import print_function
import os, sys, urllib
import os, sys, urllib.request

if len(sys.argv) < 2:
print("Usage:", sys.argv[0], "<url>")
@@ -14,7 +14,7 @@
def download(url,prefix=''):
filename = prefix+url.split("/")[-1]
if not os.path.exists(filename):
urllib.urlretrieve(url, filename)
urllib.request.urlretrieve(url, filename)

path='http://data.mxnet.io/models/imagenet-11k/'
download(path+'resnet-152/resnet-152-symbol.json', 'full-')
@@ -30,7 +30,6 @@ def download(url,prefix=''):
mod.bind(for_training=False, data_shapes=[('data', (1,3,224,224))])
mod.set_params(arg_params, aux_params)

get_ipython().magic(u'matplotlib auto')
import matplotlib
matplotlib.rc("savefig", dpi=100)
import cv2
@@ -40,7 +39,7 @@ def download(url,prefix=''):

def get_image(url, show=True):
filename = url.split("/")[-1]
urllib.urlretrieve(url, filename)
urllib.request.urlretrieve(url, filename)
img = cv2.imread(filename)
if img is None:
print('failed to download ' + url)
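The Python 3 port in the hunks above replaces `urllib.urlretrieve` with `urllib.request.urlretrieve`; the surrounding filename logic in the `download` helper is unchanged. It derives the local filename from the last path segment of the URL plus an optional prefix, which can be checked without any network access:

```python
# Filename derivation used by the download() helper: the last URL path
# segment, with an optional prefix ('full-' for the pre-trained model files).
url = "http://data.mxnet.io/models/imagenet-11k/resnet-152/resnet-152-symbol.json"
filename = "full-" + url.split("/")[-1]
print(filename)  # full-resnet-152-symbol.json
```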
