diff --git a/sagemaker-pipelines/time_series_forecasting/amazon_forecast_pipeline/sm_pipeline_with_amazon_forecast.ipynb b/sagemaker-pipelines/time_series_forecasting/amazon_forecast_pipeline/sm_pipeline_with_amazon_forecast.ipynb index 4904fbbfe7..c49389ab19 100644 --- a/sagemaker-pipelines/time_series_forecasting/amazon_forecast_pipeline/sm_pipeline_with_amazon_forecast.ipynb +++ b/sagemaker-pipelines/time_series_forecasting/amazon_forecast_pipeline/sm_pipeline_with_amazon_forecast.ipynb @@ -2,7 +2,7 @@ "cells": [ { "cell_type": "markdown", - "id": "0c9ee48f", + "id": "91ee6b6d", "metadata": {}, "source": [ "# Creating an Amazon Forecast Predictor with SageMaker Pipelines\n", @@ -27,7 +27,7 @@ { "cell_type": "code", "execution_count": null, - "id": "932dc4d0", + "id": "86a4678e", "metadata": {}, "outputs": [], "source": [ @@ -52,7 +52,7 @@ }, { "cell_type": "markdown", - "id": "de12d5a8", + "id": "b0125efc", "metadata": {}, "source": [ "Finally, you will need the following trust policies." @@ -61,7 +61,7 @@ { "cell_type": "code", "execution_count": null, - "id": "957d4d1f", + "id": "8050bbca", "metadata": {}, "outputs": [], "source": [ @@ -81,7 +81,7 @@ }, { "cell_type": "markdown", - "id": "45298f90", + "id": "9ed30cce", "metadata": {}, "source": [ "## Prerequisites\n", @@ -95,7 +95,17 @@ { "cell_type": "code", "execution_count": null, - "id": "9ab8df52", + "id": "1d137518", + "metadata": {}, + "outputs": [], + "source": [ + "! 
pip install sagemaker==2.93.0" ] }, { "cell_type": "code", "execution_count": null, "id": "b763c7b0", "metadata": {}, "outputs": [], "source": [ @@ -135,7 +145,7 @@ { "cell_type": "code", "execution_count": null, - "id": "51f2beea", + "id": "ad1e02a1", "metadata": {}, "outputs": [], "source": [ @@ -189,7 +199,7 @@ }, { "cell_type": "markdown", - "id": "fff1a8b5", + "id": "23cb4c19", "metadata": {}, "source": [ "## Dataset\n", @@ -200,7 +210,7 @@ { "cell_type": "code", "execution_count": null, - "id": "e4367c4a", + "id": "d71ab7f2", "metadata": {}, "outputs": [], "source": [ @@ -226,7 +236,7 @@ { "cell_type": "code", "execution_count": null, - "id": "5a4a1d4a", + "id": "8ce30a2a", "metadata": {}, "outputs": [], "source": [ @@ -243,7 +253,7 @@ }, { "cell_type": "markdown", - "id": "db24153a", + "id": "f58aea79", "metadata": {}, "source": [ "The dataset spans January 01, 2011, to January 01, 2015. We only use about two and a half weeks of hourly data to train Amazon Forecast. \n", @@ -253,7 +263,7 @@ { "cell_type": "code", "execution_count": null, - "id": "ac099d3c", + "id": "724eee5f", "metadata": {}, "outputs": [], "source": [ @@ -262,7 +272,7 @@ }, { "cell_type": "markdown", - "id": "d114bd69", + "id": "18b87844", "metadata": {}, "source": [ "Next, we define parameters that can be set for the execution of the pipeline. They serve as variables. 
We define the following:\n", @@ -286,7 +296,7 @@ { "cell_type": "code", "execution_count": null, - "id": "f1d86c83", + "id": "52ba0c45", "metadata": {}, "outputs": [], "source": [ @@ -294,7 +304,6 @@ "processing_instance_type = ParameterString(\n", " name=\"ProcessingInstanceType\", default_value=\"ml.m5.large\"\n", ")\n", - "training_instance_count = ParameterInteger(name=\"TrainingInstanceCount\", default_value=1)\n", "training_instance_type = ParameterString(name=\"TrainingInstanceType\", default_value=\"ml.m5.large\")\n", "\n", "input_train = ParameterString(\n", @@ -312,7 +321,7 @@ }, { "cell_type": "markdown", - "id": "eff2dad9", + "id": "3a2ee68c", "metadata": {}, "source": [ "We use an updated [SKLearnProcessor](https://sagemaker.readthedocs.io/en/stable/frameworks/sklearn/sagemaker.sklearn.html#sagemaker.sklearn.processing.SKLearnProcessor) to run Python scripts that build a dataset group and train an Amazon Forecast predictor using `boto3`. In the next chunk, we create a `ScriptProcessor`, which is essentially an SKLearnProcessor with updated `boto3` and `botocore` (as built above) that we use in the next steps. " ] }, { "cell_type": "code", "execution_count": null, - "id": "11e82c55", + "id": "130c2059", "metadata": {}, "outputs": [], "source": [ @@ -321,7 +330,7 @@ { "cell_type": "code", "execution_count": null, - "id": "88fb293a", + "id": "2abf7b80", "metadata": {}, "outputs": [], "source": [ @@ -336,7 +345,7 @@ }, { "cell_type": "markdown", - "id": "5d40d2b1", + "id": "26bd50c0", "metadata": {}, "source": [ "First, we preprocess the data using an Amazon SageMaker [ProcessingStep](https://sagemaker.readthedocs.io/en/stable/workflows/pipelines/sagemaker.workflow.pipelines.html?highlight=ProcessingStep#sagemaker.workflow.steps.ProcessingStep) that provides a containerized execution environment to run the `preprocess.py` script." 
@@ -362,7 +371,7 @@ { "cell_type": "code", "execution_count": null, - "id": "b5d84ca3", + "id": "aa0259f4", "metadata": {}, "outputs": [], "source": [ @@ -383,7 +392,7 @@ }, { "cell_type": "markdown", - "id": "6d4b1540", + "id": "6e05150d", "metadata": {}, "source": [ "The next step is to train and evaluate the forecasting model by calling Amazon Forecast using `boto3`. We create an `SKLearn` estimator that we use in the next `TrainingStep` to run the script `train.py`. \n", @@ -394,7 +403,7 @@ { "cell_type": "code", "execution_count": null, - "id": "b80ada3f", + "id": "95177b2f", "metadata": {}, "outputs": [], "source": [ @@ -425,7 +434,7 @@ { "cell_type": "code", "execution_count": null, - "id": "b52d154a", + "id": "cf10e258", "metadata": {}, "outputs": [], "source": [ @@ -433,7 +442,6 @@ " entry_point=\"train.py\",\n", " role=role_arn,\n", " image_uri=container_image_uri,\n", - " instance_count=training_instance_count,\n", " instance_type=training_instance_type,\n", " sagemaker_session=sagemaker_session,\n", " base_job_name=\"forecast-train\",\n", @@ -446,7 +454,7 @@ { "cell_type": "code", "execution_count": null, - "id": "f5ddecce", + "id": "82f5536a", "metadata": {}, "outputs": [], "source": [ @@ -455,7 +463,7 @@ }, { "cell_type": "markdown", - "id": "29f0d4d4", + "id": "867f2daf", "metadata": {}, "source": [ "The third step is an Amazon SageMaker ProcessingStep that either deletes the Amazon Forecast model or keeps it running, using the script `conditional_delete.py`. 
If the error reported after training exceeds the threshold you specify for your chosen metric, this step deletes all the resources created by Amazon Forecast that are related to the pipeline's execution.\n", @@ -465,7 +473,7 @@ { "cell_type": "code", "execution_count": null, - "id": "43c79816", + "id": "f6122249", "metadata": {}, "outputs": [], "source": [ @@ -492,7 +500,7 @@ }, { "cell_type": "markdown", - "id": "41ef4915", + "id": "991697b7", "metadata": {}, "source": [ "Finally, we combine all the steps and define our pipeline." ] }, { "cell_type": "code", "execution_count": null, - "id": "7cf7b196", + "id": "fdc925a3", "metadata": {}, "outputs": [], "source": [ @@ -513,7 +521,6 @@ " parameters=[\n", " processing_instance_type,\n", " processing_instance_count,\n", - " training_instance_count,\n", " training_instance_type,\n", " input_train,\n", " forecast_horizon,\n", @@ -532,7 +539,7 @@ }, { "cell_type": "markdown", - "id": "c838b490", + "id": "681b8721", "metadata": {}, "source": [ "Once the pipeline is successfully defined, we can start the execution." 
@@ -541,7 +548,7 @@ { "cell_type": "code", "execution_count": null, - "id": "1cbe62f1", + "id": "5b375f45", "metadata": {}, "outputs": [], "source": [ @@ -551,7 +558,7 @@ { "cell_type": "code", "execution_count": null, - "id": "35e9c22d", + "id": "b2fec897", "metadata": {}, "outputs": [], "source": [ @@ -561,7 +568,7 @@ { "cell_type": "code", "execution_count": null, - "id": "ccb70b34", + "id": "72464cc3", "metadata": {}, "outputs": [], "source": [ @@ -571,7 +578,7 @@ { "cell_type": "code", "execution_count": null, - "id": "a20d8f39", + "id": "e66f34a3", "metadata": {}, "outputs": [], "source": [ @@ -580,7 +587,7 @@ }, { "cell_type": "markdown", - "id": "1e285dfd", + "id": "c5c56dff", "metadata": {}, "source": [ "## Experiments Tracking\n", @@ -602,7 +609,7 @@ }, { "cell_type": "markdown", - "id": "067b7888", + "id": "a0030897", "metadata": {}, "source": [ "## Conclusion" @@ -610,7 +617,7 @@ }, { "cell_type": "markdown", - "id": "40a6ba7e", + "id": "132ad067", "metadata": {}, "source": [ "In this notebook we have seen how to create a SageMaker Pipeline to train an Amazon Forecast predictor on your own dataset with a target and related time series." 
@@ -618,7 +625,7 @@ }, { "cell_type": "markdown", - "id": "93c99720", + "id": "d10d8baf", "metadata": {}, "source": [ "## Clean up\n", @@ -629,7 +636,7 @@ { "cell_type": "code", "execution_count": null, - "id": "bc956081", + "id": "c8665320", "metadata": {}, "outputs": [], "source": [ @@ -654,7 +661,7 @@ { "cell_type": "code", "execution_count": null, - "id": "4c2e6e8a", + "id": "6a234cbf", "metadata": {}, "outputs": [], "source": [ @@ -670,7 +677,7 @@ { "cell_type": "code", "execution_count": null, - "id": "e50ca583", + "id": "b269f192", "metadata": {}, "outputs": [], "source": [ @@ -680,7 +687,7 @@ { "cell_type": "code", "execution_count": null, - "id": "82e5928a", + "id": "a8828ef6", "metadata": {}, "outputs": [], "source": [ @@ -690,7 +697,7 @@ { "cell_type": "code", "execution_count": null, - "id": "44212447", + "id": "abeda944", "metadata": {}, "outputs": [], "source": [ @@ -708,7 +715,7 @@ { "cell_type": "code", "execution_count": null, - "id": "41649cfd", + "id": "d64d4ae5", "metadata": {}, "outputs": [], "source": [ @@ -720,7 +727,7 @@ { "cell_type": "code", "execution_count": null, - "id": "a9fd48dc", + "id": "33d2861f", "metadata": {}, "outputs": [], "source": [ @@ -733,7 +740,7 @@ { "cell_type": "code", "execution_count": null, - "id": "cc00b557", + "id": "d0e4bd9d", "metadata": {}, "outputs": [], "source": [ @@ -744,7 +751,7 @@ { "cell_type": "code", "execution_count": null, - "id": "336300e0", + "id": "d9968b15", "metadata": {}, "outputs": [], "source": [ @@ -756,7 +763,7 @@ { "cell_type": "code", "execution_count": null, - "id": "8eed39f2", + "id": "1440c84e", "metadata": {}, "outputs": [], "source": [ @@ -767,9 +774,9 @@ "metadata": { "instance_type": "ml.t3.medium", "kernelspec": { - "display_name": "Python 3 (ipykernel)", + "display_name": "conda_python3", "language": "python", - "name": "python3" + "name": "conda_python3" }, "language_info": { "codemirror_mode": { @@ -781,7 +788,7 @@ "name": "python", "nbconvert_exporter": "python", 
"pygments_lexer": "ipython3", - "version": "3.7.11" + "version": "3.8.12" } }, "nbformat": 4,