Merge pull request aws#1 from aws/master

update
seanpmorgan · Dec 9, 2020 · 2ec897c
2 parents 25b0fad + 265536f
commit 2ec897c
Showing 441 changed files with 52,983 additions and 12,710 deletions.
diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS
@@ -7,4 +7,8 @@
 #
 # @See https://help.github.com/articles/about-codeowners/
 
-/sagemaker-experiments/* @aws/sagemakerexperimentsadmin
+/sagemaker-experiments/* @aws/sagemakerexperimentsadmin
+/sagemaker-lineage/* @aws/sagemakerexperimentsadmin
+
+# Community contributed
+/contrib/ @aws/sagemaker-notebook-sas
diff --git a/.gitignore b/.gitignore
@@ -3,3 +3,5 @@
 **/__pycache__
 **/.aws-sam
 .DS_Store
+
+**/_build
diff --git a/.readthedocs.yml b/.readthedocs.yml
@@ -0,0 +1,15 @@
+# ReadTheDocs environment customization to allow us to use conda to install
+# libraries which have C dependencies for the doc build. See:
+# https://docs.readthedocs.io/en/latest/config-file/v2.html
+
+version: 2
+
+conda:
+  environment: environment.yml
+
+python:
+  version: 3.6
+
+sphinx:
+  configuration: conf.py
+  fail_on_warning: false
diff --git a/Makefile b/Makefile
@@ -0,0 +1,20 @@
+# Minimal makefile for Sphinx documentation
+#
+
+# You can set these variables from the command line, and also
+# from the environment for the first two.
+SPHINXOPTS    ?=
+SPHINXBUILD   ?= sphinx-build
+SOURCEDIR     = .
+BUILDDIR      = _build
+
+# Put it first so that "make" without argument is like "make help".
+help:
+	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+.PHONY: help Makefile
+
+# Catch-all target: route all unknown targets to Sphinx using the new
+# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
+%: Makefile
+	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
diff --git a/README.md b/README.md
@@ -77,7 +77,6 @@ The following provide examples demonstrating different capabilities of Amazon Sa
 - [Knapsack Problem](reinforcement_learning/rl_knapsack_coach_custom) demonstrates how to solve the knapsack problem using a custom environment.
 - [Mountain Car](reinforcement_learning/rl_mountain_car_coach_gymEnv) Mountain car is a classic RL problem. This notebook explains how to solve this using the OpenAI Gym environment.
 - [Distributed Neural Network Compression](reinforcement_learning/rl_network_compression_ray_custom) This notebook explains how to compress ResNets using RL, using a custom environment and the RLLib toolkit.
-- [Turtlebot Tracker](reinforcement_learning/rl_objecttracker_robomaker_coach_gazebo) This notebook demonstrates object tracking using AWS Robomaker and RL Coach in the Gazebo environment.
 - [Portfolio Management](reinforcement_learning/rl_portfolio_management_coach_customEnv) This notebook uses a custom Gym environment to manage multiple financial investments.
 - [Autoscaling](reinforcement_learning/rl_predictive_autoscaling_coach_customEnv) demonstrates how to adjust load depending on demand. This uses RL Coach and a custom environment.
 - [Roboschool](reinforcement_learning/rl_roboschool_ray) is an open source physics simulator that is commonly used to train RL policies for robotic systems. This notebook demonstrates training a few agents using it.
@@ -141,7 +140,7 @@ These examples provide you an introduction to how to use Neo to optimizes deep l
 - [Distributed TensorFlow](sagemaker_neo_compilation_jobs/tensorflow_distributed_mnist) Adapts form [tensorflow mnist](sagemaker-python-sdk/tensorflow_distributed_mnist) including Neo API and comparsion between the baseline
 - [Predicting Customer Churn](sagemaker_neo_compilation_jobs/xgboost_customer_churn) Adapts form [xgboost customer churn](introduction_to_applying_machine_learning/xgboost_customer_churn) including Neo API and comparsion between the baseline
 
-### Amazon SageMaker Procesing
+### Amazon SageMaker Processing
 
 These examples show you how to use SageMaker Processing jobs to run data processing workloads.
 
@@ -207,6 +206,7 @@ These examples show you how to use model-packages and algorithms from AWS Market
 	- [Using models for extracting vehicle metadata](aws_marketplace/using_model_packages/auto_insurance) provides a detailed walkthrough on how to use pre-trained models from AWS Marketplace for extracting metadata for a sample use-case of auto-insurance claim processing.
 	- [Using models for identifying non-compliance at a workplace](aws_marketplace/using_model_packages/improving_industrial_workplace_safety) provides a detailed walkthrough on how to use pre-trained models from AWS Marketplace for extracting metadata for a sample use-case of generating summary reports for identifying non-compliance at a construction/industrial workplace.
 	- [Extracting insights from your credit card statements](aws_marketplace/using_model_packages/financial_transaction_processing) provides a detailed walkthrough on how to use pre-trained models from AWS Marketplace for efficiently processing financial transaction logs.
+	- [Creative writing using GPT-2 Text Generation](aws_marketplace/using_model_packages/creative-writing-using-gpt-2-text-generation) will show you how to use AWS Marketplace GPT-2-XL pre-trained model on Amazon SageMaker to generate text based on your prompt to help you author prose and poetry.
 
 
 

diff --git a/_static/js/analytics.js b/_static/js/analytics.js
@@ -0,0 +1,2 @@
+console.log("Starting analytics...");
+var s_code=s.t();if(s_code)document.write(s_code)
diff --git a/_static/product-icon_Amazon_SageMaker_lockup_centered_squid_ink.png b/_static/product-icon_Amazon_SageMaker_lockup_centered_squid_ink.png
diff --git a/_static/sagemaker_gears.jpg b/_static/sagemaker_gears.jpg
diff --git a/advanced_functionality/autogluon-tabular/AutoGluon_Tabular_SageMaker.ipynb b/advanced_functionality/autogluon-tabular/AutoGluon_Tabular_SageMaker.ipynb
@@ -43,17 +43,18 @@
    },
    "outputs": [],
    "source": [
-    "# Imports\n",
     "import os\n",
     "import boto3\n",
     "import sagemaker\n",
     "from time import sleep\n",
     "from collections import Counter\n",
     "import numpy as np\n",
     "import pandas as pd\n",
-    "from sagemaker import get_execution_role, local, Model, utils, fw_utils, s3\n",
+    "from sagemaker import get_execution_role, local, Model, utils, s3\n",
     "from sagemaker.estimator import Estimator\n",
-    "from sagemaker.predictor import RealTimePredictor, csv_serializer, StringDeserializer\n",
+    "from sagemaker.predictor import Predictor\n",
+    "from sagemaker.serializers import CSVSerializer\n",
+    "from sagemaker.deserializers import StringDeserializer\n",
     "from sklearn.metrics import accuracy_score, classification_report\n",
     "from IPython.core.display import display, HTML\n",
     "from IPython.core.interactiveshell import InteractiveShell\n",
@@ -74,9 +75,10 @@
     "    \"sts\", region_name=region, endpoint_url=utils.sts_regional_endpoint(region)\n",
     "    )\n",
     "account = client.get_caller_identity()['Account']\n",
-    "ecr_uri_prefix = utils.get_ecr_image_uri_prefix(account, region)\n",
-    "registry_id = fw_utils._registry_id(region, 'mxnet', 'py3', account, '1.6.0')\n",
-    "registry_uri = utils.get_ecr_image_uri_prefix(registry_id, region)"
+    "\n",
+    "registry_uri_training = sagemaker.image_uris.retrieve('mxnet', region, version= '1.6.0', py_version='py3', instance_type='ml.m5.2xlarge', image_scope='training')\n",
+    "registry_uri_inference = sagemaker.image_uris.retrieve('mxnet', region, version= '1.6.0', py_version='py3', instance_type='ml.m5.2xlarge', image_scope='inference')\n",
+    "ecr_uri_prefix = account +'.'+'.'.join(registry_uri_training.split('/')[0].split('.')[1:])"
    ]
   },
   {
@@ -94,32 +96,7 @@
     "Collapsed": "false"
    },
    "source": [
-    "First, build autogluon package to copy into docker image."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-    "Collapsed": "false"
-   },
-   "outputs": [],
-   "source": [
-    "if not os.path.exists('package'):\n",
-    "    !pip install PrettyTable -t package\n",
-    "    !pip install --upgrade boto3 -t package\n",
-    "    !pip install bokeh -t package\n",
-    "    !pip install --upgrade matplotlib -t package\n",
-    "    !pip install autogluon -t package"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "Collapsed": "false"
-   },
-   "source": [
-    "Now build the training/inference image and push to ECR"
+    "Build the training/inference image and push to ECR"
    ]
   },
   {
@@ -142,8 +119,8 @@
    },
    "outputs": [],
    "source": [
-    "!./container-training/build_push_training.sh {account} {region} {training_algorithm_name} {ecr_uri_prefix} {registry_id} {registry_uri}\n",
-    "!./container-inference/build_push_inference.sh {account} {region} {inference_algorithm_name} {ecr_uri_prefix} {registry_id} {registry_uri}"
+    "!/bin/bash ./container-training/build_push_training.sh {account} {region} {training_algorithm_name} {ecr_uri_prefix} {registry_uri_training.split('/')[0].split('.')[0]} {registry_uri_training}\n",
+    "!/bin/bash ./container-inference/build_push_inference.sh {account} {region} {inference_algorithm_name} {ecr_uri_prefix} {registry_uri_training.split('/')[0].split('.')[0]} {registry_uri_inference}"
    ]
   },
   {
@@ -316,7 +293,12 @@
     "hyperparameters = {\n",
     "  'fit_args': fit_args,\n",
     "  'feature_importance': True\n",
-    "}"
+    "}\n",
+    "\n",
+    "tags = [{\n",
+    "    'Key' : 'AlgorithmName',\n",
+    "    'Value' : 'AutoGluon-Tabular'\n",
+    "}]"
    ]
   },
   {
@@ -348,19 +330,38 @@
     "\n",
     "ecr_image = f'{ecr_uri_prefix}/{training_algorithm_name}:latest'\n",
     "\n",
-    "estimator = Estimator(image_name=ecr_image,\n",
+    "estimator = Estimator(image_uri=ecr_image,\n",
     "                      role=role,\n",
-    "                      train_instance_count=1,\n",
-    "                      train_instance_type=instance_type,\n",
+    "                      instance_count=1,\n",
+    "                      instance_type=instance_type,\n",
     "                      hyperparameters=hyperparameters,\n",
-    "                      train_volume_size=100)\n",
+    "                      volume_size=100,\n",
+    "                      tags=tags)\n",
     "\n",
     "# Set inputs. Test data is optional, but requires a label column.\n",
     "inputs = {'training': train_s3_path, 'testing': test_s3_path}\n",
     "\n",
     "estimator.fit(inputs)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Review the performance of the trained model"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from utils.ag_utils import launch_viewer\n",
+    "\n",
+    "launch_viewer(is_debug=False)"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {
@@ -379,10 +380,10 @@
    "outputs": [],
    "source": [
     "# Create predictor object\n",
-    "class AutoGluonTabularPredictor(RealTimePredictor):\n",
+    "class AutoGluonTabularPredictor(Predictor):\n",
     "    def __init__(self, *args, **kwargs):\n",
-    "        super().__init__(*args, content_type='text/csv', \n",
-    "                         serializer=csv_serializer, \n",
+    "        super().__init__(*args, \n",
+    "                         serializer=CSVSerializer(), \n",
     "                         deserializer=StringDeserializer(), **kwargs)"
    ]
   },
@@ -397,10 +398,10 @@
     "ecr_image = f'{ecr_uri_prefix}/{inference_algorithm_name}:latest'\n",
     "\n",
     "if instance_type == 'local':\n",
-    "    model = estimator.create_model(image=ecr_image, role=role)\n",
+    "    model = estimator.create_model(image_uri=ecr_image, role=role)\n",
     "else:\n",
     "    model_uri = os.path.join(estimator.output_path, estimator._current_job_name, \"output\", \"model.tar.gz\")\n",
-    "    model = Model(model_uri, ecr_image, role=role, sagemaker_session=session, predictor_cls=AutoGluonTabularPredictor)"
+    "    model = Model(ecr_image, model_data=model_uri, role=role, sagemaker_session=session, predictor_cls=AutoGluonTabularPredictor)"
    ]
   },
   {

diff --git a/advanced_functionality/autogluon-tabular/container-inference/Dockerfile.inference b/advanced_functionality/autogluon-tabular/container-inference/Dockerfile.inference
@@ -1,9 +1,8 @@
 ARG REGISTRY_URI
-FROM ${REGISTRY_URI}/mxnet-inference:1.6.0-cpu-py3
+FROM ${REGISTRY_URI}
 
-RUN pip install --upgrade pip
-
-COPY package/ /opt/ml/code/package/
+RUN pip install autogluon
+RUN pip install PrettyTable
 
 # Defines inference.py as script entrypoint
 ENV SAGEMAKER_PROGRAM inference.py
diff --git a/advanced_functionality/autogluon-tabular/container-training/Dockerfile.training b/advanced_functionality/autogluon-tabular/container-training/Dockerfile.training
@@ -1,14 +1,24 @@
 ARG REGISTRY_URI
-FROM ${REGISTRY_URI}/mxnet-training:1.6.0-cpu-py3
+FROM ${REGISTRY_URI}
 
-RUN pip install --upgrade pip
+RUN pip install autogluon
+RUN pip install PrettyTable
+RUN pip install bokeh
+
+RUN apt-get update \
+  && apt-get install -y --no-install-recommends graphviz libgraphviz-dev pkg-config \
+  && rm -rf /var/lib/apt/lists/* \
+  && pip install pygraphviz
+
 ENV PATH="/opt/ml/code:${PATH}"
 
 # Copies the training code inside the container
-COPY package/ /opt/ml/code/package/
 COPY container-training/train.py /opt/ml/code/train.py
 COPY container-training/inference.py /opt/ml/code/inference.py
 
+# Install seaborn for plot
+RUN pip install seaborn
+
 # this environment variable is used by the SageMaker PyTorch container to determine our user code directory.
 ENV SAGEMAKER_SUBMIT_DIRECTORY /opt/ml/code
 

diff --git a/advanced_functionality/autogluon-tabular/container-training/inference.py b/advanced_functionality/autogluon-tabular/container-training/inference.py
@@ -10,8 +10,6 @@
 
 warnings.filterwarnings('ignore', category=FutureWarning)
 
-sys.path.append(os.path.join(os.path.dirname(__file__), '/opt/ml/code/package'))
-
 import numpy as np
 import pandas as pd
 import pickle
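
Review note on the AutoGluon notebook change: the new cells derive both the ECR prefix for the caller's account and the base image's registry ID by splitting the URI returned by `sagemaker.image_uris.retrieve`. The same string handling can be sketched in plain Python as below. The helper names and the sample URI and account numbers are hypothetical, chosen only for illustration; the sketch assumes the URI has the form `<registry_id>.dkr.ecr.<region>.<domain>/<repository>:<tag>`.

```python
def split_registry_uri(image_uri):
    """Split an image URI into (registry_id, registry_domain).

    Mirrors the notebook's expressions:
      image_uri.split('/')[0].split('.')[0]   -> registry ID
      image_uri.split('/')[0].split('.')[1:]  -> remaining host labels
    """
    host = image_uri.split('/')[0]          # drop the repository and tag
    labels = host.split('.')
    registry_id = labels[0]                 # account that owns the base image
    registry_domain = '.'.join(labels[1:])  # e.g. dkr.ecr.<region>.amazonaws.com
    return registry_id, registry_domain


def ecr_uri_prefix(account, image_uri):
    """Build the ECR host for `account` in the same region as `image_uri`."""
    _, registry_domain = split_registry_uri(image_uri)
    return account + '.' + registry_domain


# Hypothetical sample values, not taken from the diff.
sample_uri = '111122223333.dkr.ecr.us-west-2.amazonaws.com/mxnet-training:1.6.0-cpu-py3'
registry_id, registry_domain = split_registry_uri(sample_uri)
prefix = ecr_uri_prefix('123456789012', sample_uri)
```

Pulling the split into one helper would also avoid repeating `registry_uri_training.split('/')[0].split('.')[0]` in both build commands, though that is a suggestion rather than something this merge does.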