From acc61dc1d8476d57b7fcf598508348919bd410d9 Mon Sep 17 00:00:00 2001 From: Shreya Pandit Date: Fri, 29 Oct 2021 09:22:57 -0700 Subject: [PATCH] removes redundant notebook --- ...e-with-ml-model-from-aws-marketplace.ipynb | 507 ------------------ 1 file changed, 507 deletions(-) delete mode 100644 aws_marketplace/using_data/using_data_with_ml_model/using-dataset-product-from-aws-data-exchange-with-ml-model-from-aws-marketplace.ipynb diff --git a/aws_marketplace/using_data/using_data_with_ml_model/using-dataset-product-from-aws-data-exchange-with-ml-model-from-aws-marketplace.ipynb b/aws_marketplace/using_data/using_data_with_ml_model/using-dataset-product-from-aws-data-exchange-with-ml-model-from-aws-marketplace.ipynb deleted file mode 100644 index 419fbcb49f..0000000000 --- a/aws_marketplace/using_data/using_data_with_ml_model/using-dataset-product-from-aws-data-exchange-with-ml-model-from-aws-marketplace.ipynb +++ /dev/null @@ -1,507 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "33386bb3", - "metadata": {}, - "source": [ - "## Using dataset product from AWS Data Exchange with ML model from AWS Marketplace\n", - "\n", - "This sample notebook shows how to perform machine learning on third-party datasets from [AWS Data Exchange](https://aws.amazon.com/data-exchange/) using a pre-trained ML model.\n", - "\n", - "In this notebook, you will subscribe to a dataset listed by Shutterstock on AWS Data Exchange. You will then export the dataset to an S3 bucket and download it to your local environment. You will also subscribe to Resnet 18, an open ML model from AWS Marketplace, and deploy it in the form of an Amazon SageMaker endpoint. Finally, you will perform inference.\n", - "\n", - "\n", - "### Contents:\n", - "* [Pre-requisites](#Pre-requisites)\n", - "* [Introduction](#Introduction)\n", - "* [Explore dataset](#Explore-dataset)\n", - "* [Perform inference](#Perform-inference)\n", - "* [Cleanup](#Cleanup)\n", - "\n", - "\n", - "#### Usage instructions\n", - "You can run this notebook one cell at a time (by pressing Shift+Enter to run a cell)." - ] - }, - { - "cell_type": "markdown", - "id": "b230f601", - "metadata": {}, - "source": [ - "### Pre-requisites:\n", - "\n", - "#### Pre-requisite 1:\n", - "This sample notebook assumes a subscription to the [500 Image & Metadata Free Sample dataset](https://console.aws.amazon.com/dataexchange/home?region=us-east-1#/products/prodview-2h52yl4q6jrjw) has been created and the data has been exported into an S3 bucket.\n", - "\n", - "If you have not done this already, please follow these steps: \n", - "\n", - "#### Subscribe to data from AWS Data Exchange:\n", - "1. Open the [500 Image & Metadata Free Sample dataset](https://console.aws.amazon.com/dataexchange/home?region=us-east-1#/products/prodview-2h52yl4q6jrjw) from the AWS Data Exchange console.\n", - "2. Read the overview and other information such as pricing, usage, and support.\n", - "3. Choose __Continue to Subscribe__.\n", - "4. If your organization agrees to the subscription terms, pricing information, and data subscription agreement, then review/update the renewal settings and choose __Subscribe__.\n", - "5. Once the subscription has been created successfully (this step may take 5-10 minutes), you will find the dataset listed in the [__Subscriptions__](https://console.aws.amazon.com/dataexchange/home?region=us-east-1#/subscriptions) section of the console.\n",
- "6. From the [subscription page](https://console.aws.amazon.com/dataexchange/home?region=us-east-1#/subscriptions), open the **Shutterstock dataset**, and for this use case, choose the __retail_trials-bathbodyworks__ dataset.\n", - "7. Select the revision and then choose **Export to Amazon S3**.\n", - "8. Select an appropriate bucket and start the export. Once the export job has completed, open the S3 bucket you chose in the preceding step, copy the S3 URL of the data folder by choosing **Copy S3 URI**, and specify it in the following cell." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c042e2bb", - "metadata": {}, - "outputs": [], - "source": [ - "# Please specify the S3 location to which the dataset has been exported.\n", - "dataset_export_location = \"\"\n", - "# dataset_export_location='s3://bucket/adx_free_data_sample/'" - ] - }, - { - "cell_type": "markdown", - "id": "edd03dc7", - "metadata": {}, - "source": [ - "#### Pre-requisite 2:\n", - "\n", - "This sample notebook assumes a subscription to the [Resnet 18 ML Model](https://aws.amazon.com/marketplace/pp/prodview-rte234xioxzqu) has been created and an endpoint has been deployed. If you have not done this already, please follow these steps:\n", - "\n", - "\n", - "#### Subscribe to and deploy the ML model from AWS Marketplace:\n", - "1. Open the [Resnet 18 ML Model listing](https://aws.amazon.com/marketplace/pp/prodview-rte234xioxzqu) on AWS Marketplace.\n", - "2. Read the **Highlights** section and then the **Product overview** section of the listing.\n", - "3. View the **Usage information** and then the **Additional resources**.\n", - "4. Note the supported instance types.\n", - "5. Next, choose **Continue to subscribe**.\n", - "6. Review the **End user license agreement**, **support terms**, and **pricing information**.\n", - "7. If your organization agrees with the EULA, pricing information, and support terms, choose **Accept Offer**.\n", - "8. Choose **Continue to Configuration**.\n", - "9. Leave **AWS CloudFormation** as the selected option. If this is the first time you are using Amazon SageMaker, \n", - "under *Configure for AWS CloudFormation*, choose **Create and use a new service role** and **Any S3 bucket**, and then select **Launch CloudFormation Template**.\n", - "10. In the CloudFormation console, choose **Create Stack**.\n", - "11. After you have launched the AWS CloudFormation template, wait for the newly launched AWS CloudFormation stack's status to change to **CREATE_COMPLETE**.\n", - "12. Open the **Outputs** tab of the CloudFormation stack, copy the value corresponding to **EndpointName**, and specify it in the following cell.\n",
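- "\n", - "Optional check (not part of the original walkthrough): before moving on, you can confirm that the endpoint created by the CloudFormation stack is in service. This is a minimal sketch that assumes the endpoint is named `Endpoint-ResNet-18-1` (the default used in the next cell) and that your boto3 credentials and region are already configured:\n", - "\n", - "```python\n", - "import boto3\n", - "\n", - "sm_client = boto3.client(\"sagemaker\")\n", - "# Describe the endpoint and print its status; it should be 'InService' before you run inference.\n", - "status = sm_client.describe_endpoint(EndpointName=\"Endpoint-ResNet-18-1\")[\"EndpointStatus\"]\n", - "print(status)\n", - "```"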
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f75db16a", - "metadata": {}, - "outputs": [], - "source": [ - "endpoint_name = \"Endpoint-ResNet-18-1\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "41a4a77a", - "metadata": {}, - "outputs": [], - "source": [ - "# Import the libraries used in this notebook.\n", - "import os\n", - "import json\n", - "\n", - "import pandas as pd\n", - "\n", - "import boto3\n", - "import sagemaker\n", - "from IPython.display import Image\n", - "\n", - "s3 = boto3.client(\"s3\")\n", - "sagemaker_session = sagemaker.Session()\n", - "region = sagemaker_session.boto_region_name\n", - "runtime = boto3.client(\"runtime.sagemaker\")\n", - "\n", - "content_type = \"application/x-image\"\n", - "predictions = []\n", - "\n", - "# Download the ImageNet class labels so that prediction indexes can be mapped to readable names.\n", - "s3_bucket = f\"jumpstart-cache-prod-{region}\"\n", - "s3.download_file(s3_bucket, \"inference-notebook-assets/ImageNetLabels.txt\", \"ImageNetLabels.txt\")\n", - "with open(\"ImageNetLabels.txt\", \"r\") as file:\n", - "    class_id_to_label = file.read().splitlines()[1::]" - ] - }, - { - "cell_type": "markdown", - "id": "f8b3497a", - "metadata": {}, - "source": [ - "### Introduction\n", - "\n", - "You work for a super cool startup, which lets you bring your pet to the office. The startup is expanding, and the culture is pretty friendly. Your office is on a large campus provided by a tech incubator. The campus itself is well equipped with security cameras.\n", - "\n", - "You bring your little shih-tzu dog, affectionately called Toffee, to work. Because of his friendly nature, he quickly becomes the most popular dog on the entire campus. He loves visiting all his friends, and you have to find Toffee every day before leaving work.\n", - "\n", - "Since the campus is large, it is hard to physically go everywhere and find your dog. You typically end up in the security office, going through hundreds of camera feeds to find Toffee before you can leave for the day.\n", - "\n", - "In this workshop, you will develop new skills that you can use to build software that the security team can use to help people find their dogs. For this workshop, you don’t need to worry about finding a campus and setting up cameras. Shutterstock has provided a dataset that you will use for the analysis. As part of the prerequisites of this notebook, you should already have subscribed to the dataset and specified the S3 location in the dataset_export_location variable." - ] - }, - { - "cell_type": "markdown", - "id": "f1a4996d", - "metadata": {}, - "source": [ - "### Explore dataset" - ] - }, - { - "cell_type": "markdown", - "id": "ba59a3c5", - "metadata": {}, - "source": [ - "Next, you will load the dataset from S3 into your local execution environment." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ef6f579f", - "metadata": {}, - "outputs": [], - "source": [ - "!aws s3 sync $dataset_export_location data" - ] - }, - { - "cell_type": "markdown", - "id": "38ffb9e2", - "metadata": {}, - "source": [ - "Load the camera footage into a dictionary so you can easily do a lookup.\n",
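- "\n", - "Optionally (this quick peek is not part of the original notebook), you can first check what file types the export contains, assuming the `aws s3 sync` command above has placed everything under the local `data/` folder:\n", - "\n", - "```python\n", - "import collections\n", - "import os\n", - "\n", - "# Count files by extension to see what the exported dataset contains.\n", - "ext_counts = collections.Counter(\n", - "    os.path.splitext(f)[1].lower() for _, _, files in os.walk(\"data\") for f in files\n", - ")\n", - "print(ext_counts)\n", - "```"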
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "596ecf8b", - "metadata": {}, - "outputs": [], - "source": [ - "camera_footage = {}\n", - "counter = 1\n", - "for subdir, dirs, files in os.walk(\"data\"):\n", - "    for file in files:\n", - "        camera_footage[counter] = subdir + \"/\" + file\n", - "        counter = counter + 1\n", - "\n", - "print(\"Total \", (counter - 1), \" cameras were found\")\n", - "\n", - "\n", - "def get_camera_id(value):\n", - "    # Return the camera id whose footage file path contains the given file name.\n", - "    for key, val in camera_footage.items():\n", - "        if value in val:\n", - "            return key" - ] - }, - { - "cell_type": "markdown", - "id": "11fa6f4c", - "metadata": {}, - "source": [ - "See what the footage from camera #1 looks like." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1562bd13", - "metadata": {}, - "outputs": [], - "source": [ - "def show_cam_footage(camera_id):\n", - "    return Image(url=camera_footage[camera_id], width=400, height=800)\n", - "\n", - "\n", - "camera_id = get_camera_id(\"1634351818.jpg\")\n", - "show_cam_footage(camera_id)" - ] - }, - { - "cell_type": "markdown", - "id": "cd916d1a", - "metadata": {}, - "source": [ - "It looks like you are looking at the camera located in the campus grocery store. Try footage from another camera." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f0e726ad", - "metadata": {}, - "outputs": [], - "source": [ - "camera_id = get_camera_id(\"1821728006.jpg\")\n", - "show_cam_footage(camera_id)" - ] - }, - { - "cell_type": "markdown", - "id": "4d39abf6", - "metadata": {}, - "source": [ - "That's Stacy from your team giving a treat to her golden retriever!" - ] - }, - { - "cell_type": "markdown", - "id": "197e5ccd", - "metadata": {}, - "source": [ - "Now you need to identify a way to catalog all the different dogs and cats so that you can look them up easily. For this purpose, you will use an ML model that can identify 1000 different image classes, including many popular dog and cat breeds, as shown in the following table.\n", - "\n", - "\n", - "| Class | Category |\n", - "|------------------|----------|\n", - "| redbone | dog |\n", - "| shih-tzu | dog |\n", - "| collie | dog |\n", - "| basset | dog |\n", - "| malamute | dog |\n", - "| beagle | dog |\n", - "| pug | dog |\n", - "| golden retriever | dog |\n", - "| tabby | cat |\n", - "| siamese cat | cat |" - ] - }, - { - "cell_type": "markdown", - "id": "b011bd61", - "metadata": {}, - "source": [ - "### Perform inference" - ] - }, - { - "cell_type": "markdown", - "id": "01aead45", - "metadata": {}, - "source": [ - "As part of Pre-requisite 2, you have already deployed the ML model and configured the endpoint name in the `endpoint_name` variable. Now you are ready to perform inference.\n",
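- "\n", - "Before looping over every camera, you may want a quick smoke test against a single image. This sketch is not part of the original notebook; it reuses the `runtime`, `content_type`, and `camera_footage` objects defined above and assumes the model returns one probability per ImageNet class:\n", - "\n", - "```python\n", - "# Send one image to the endpoint and check the shape of the response.\n", - "with open(camera_footage[1], \"rb\") as f:\n", - "    response = runtime.invoke_endpoint(\n", - "        EndpointName=endpoint_name, ContentType=content_type, Body=f.read()\n", - "    )\n", - "probabilities = json.loads(response[\"Body\"].read())\n", - "print(len(probabilities))  # one probability per class\n", - "```"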
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "deb8a0a4", - "metadata": {}, - "outputs": [], - "source": [ - "# The following method sends picture corresponding to camera_id specified to the ML model\n", - "# and returns you the classes found.\n", - "\n", - "\n", - "def perform_inference(camera_id):\n", - "\n", - " with open(camera_footage[camera_id], \"rb\") as file:\n", - " body = file.read()\n", - "\n", - " # Perform inference by calling invoke_endpoint API\n", - " response = runtime.invoke_endpoint(\n", - " EndpointName=endpoint_name, ContentType=content_type, Body=body\n", - " )\n", - "\n", - " # Parse the inference response and load top 10 classes found into a dictionary.\n", - " prediction = json.loads(response[\"Body\"].read())\n", - " prediction_ids = sorted(\n", - " range(len(prediction)), key=lambda index: prediction[index], reverse=True\n", - " )[:10]\n", - " for id in prediction_ids:\n", - " predictions.append([camera_id, class_id_to_label[id].lower(), 100 * prediction[id]])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "13c592fe", - "metadata": {}, - "outputs": [], - "source": [ - "# Perform inference on all cameras\n", - "for id in camera_footage:\n", - " perform_inference(id)\n", - "\n", - "# Load the inference results into a pandas datafram so you can easily look it up.\n", - "df = pd.DataFrame(predictions, columns=[\"camera_id\", \"entity\", \"probability_measure\"])" - ] - }, - { - "cell_type": "markdown", - "id": "04dc9563", - "metadata": {}, - "source": [ - "Now that our catalog containing image classes for all cameras is ready, you can look-up the classes identified by the Resnet-18 machine learning model." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d8ce325b", - "metadata": {}, - "outputs": [], - "source": [ - "print(\"-------------------------------------------------\")\n", - "print(\"Image classes summary for cam-\", camera_id)\n", - "print(\"-------------------------------------------------\")\n", - "print(df[df[\"camera_id\"] == camera_id])\n", - "show_cam_footage(camera_id)" - ] - }, - { - "cell_type": "markdown", - "id": "8d3cf380", - "metadata": {}, - "source": [ - "You can see how the ML model was able to identify the golden retriever with high probability measure value." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ae444514", - "metadata": {}, - "outputs": [], - "source": [ - "# Following function accepts the pet catagory and returns results\n", - "# that meet the probability_measure threshold.\n", - "def find_my_pet(catagory, probability_measure):\n", - " images = []\n", - " entries = df[\n", - " (df[\"entity\"] == catagory) & (df[\"probability_measure\"] > probability_measure)\n", - " ].sort_values(\"probability_measure\", ascending=False)\n", - " for entry in entries.iterrows():\n", - " print(\n", - " \"Camera-id:\"\n", - " + str(entry[1][\"camera_id\"])\n", - " + \" -> \"\n", - " + str(entry[1][\"probability_measure\"])\n", - " )\n", - " display(Image(url=camera_footage[entry[1][\"camera_id\"]], width=400, height=800))" - ] - }, - { - "cell_type": "markdown", - "id": "a1fcdab0", - "metadata": {}, - "source": [ - "Now its time to find Toffee. Specify a **pet_category** and a **probability_measure** value to see all cameras that have the specified pet." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "afa5cba2", - "metadata": {}, - "outputs": [], - "source": [ - "pet_category = \"shih-tzu\"\n", - "probability_measure_threshold = 10\n", - "find_my_pet(pet_category, probability_measure_threshold)" - ] - }, - { - "cell_type": "markdown", - "id": "3c5cb197", - "metadata": {}, - "source": [ - "You can now try specifying different values for the **pet_category** and **probability_measure** variables to see how the model behaves." - ] - }, - { - "cell_type": "markdown", - "id": "101be7f8", - "metadata": {}, - "source": [ - "Congratulations! You have learned how pre-trained ML models can be used to extract insights from data." - ] - }, - { - "cell_type": "markdown", - "id": "f3fe07a9", - "metadata": {}, - "source": [ - "### Next Steps\n", - "As a next step, I recommend that you:\n", - "1. Explore [AWS Data Exchange](https://console.aws.amazon.com/dataexchange/home?region=us-east-1#/products) and identify a dataset that will help you solve your business problems. If you can't find the dataset you are looking for, you can also [request dataset products](https://console.aws.amazon.com/dataexchange/home?region=us-east-1#/products/product-request).\n", - "2. Explore [ML models from AWS Marketplace](https://aws.amazon.com/marketplace/search/results?page=1&filters=fulfillment_options&fulfillment_options=SageMaker&ref_=header_nav_dm_sagemaker) and identify which ML model can help you build differentiating features. If you have any questions or need a custom ML model, you can contact the AWS Marketplace team at aws-mp-bd-ml@amazon.com." - ] - }, - { - "cell_type": "markdown", - "id": "a05174b2", - "metadata": {}, - "source": [ - "### Cleanup" - ] - }, - { - "cell_type": "markdown", - "id": "4d1aa396", - "metadata": {}, - "source": [ - "To avoid charges to your AWS account when you are not running inference, you need to delete the endpoint. You will not be charged for keeping the endpoint configuration or model.\n", - "\n", - "You can visit the AWS CloudFormation console to delete the stack you created.\n" - ] - }, - { - "cell_type": "markdown", - "id": "09392f63", - "metadata": {}, - "source": [ - "Finally, if the AWS Marketplace subscription was created just for this experiment and you want to unsubscribe from the product, follow these steps.\n", - "Before you cancel the subscription, ensure that you do not have any [deployable model](https://console.aws.amazon.com/sagemaker/home#/models) created from the model package or using the algorithm. Note - You can find this information by looking at the container name associated with the model.\n", - "\n", - "**Steps to unsubscribe from a product on AWS Marketplace**:\n", - "1. Navigate to the __Machine Learning__ tab on the [__Your Software subscriptions__](https://aws.amazon.com/marketplace/ai/library?productType=ml&ref_=lbr_tab_ml) page.\n",
- "2. Locate the listing that you want to cancel the subscription for, and then choose __Cancel Subscription__.\n", - "\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "38ebd1f8", - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "conda_python3", - "language": "python", - "name": "conda_python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.13" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -}