fix: Fix pipeline notebook instance_type #3435

Merged
merged 1 commit on Jun 1, 2022
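Every file in this PR gets the same treatment: the `ProcessingInstanceType` / `TrainingInstanceType` `ParameterString` is deleted, the `ml.*` instance type is passed to the processor or estimator as a plain string, and the parameter is dropped from the `Pipeline(parameters=[...])` list and from the `pipeline.start(...)` overrides. Below is a minimal sketch of the before/after pattern, based on the hunks that follow; it is not an exact excerpt from any one notebook, and `role` and `sagemaker_session` stand in for objects each notebook defines earlier.

# Before: instance type exposed as a pipeline parameter
from sagemaker.workflow.parameters import ParameterInteger, ParameterString
from sagemaker.sklearn.processing import SKLearnProcessor

processing_instance_count = ParameterInteger(name="ProcessingInstanceCount", default_value=1)
processing_instance_type = ParameterString(
    name="ProcessingInstanceType", default_value="ml.m5.xlarge"
)

sklearn_processor = SKLearnProcessor(
    framework_version="0.23-1",
    instance_type=processing_instance_type,   # parameter reference
    instance_count=processing_instance_count,
    base_job_name="comprehend-process",
    role=role,                                 # assumed: defined earlier in the notebook
    sagemaker_session=sagemaker_session,       # assumed: defined earlier in the notebook
)

# After: instance type hardcoded, ProcessingInstanceType parameter removed
sklearn_processor = SKLearnProcessor(
    framework_version="0.23-1",
    instance_type="ml.m5.xlarge",
    instance_count=processing_instance_count,
    base_job_name="comprehend-process",
    role=role,
    sagemaker_session=sagemaker_session,
)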
@@ -199,7 +199,6 @@
"source": [
"Next, we define parameters that can be set for the execution of the pipeline. They serve as variables. We define the following:\n",
"\n",
"- `ProcessingInstanceType`: The number of processing instances to use for the execution of the pipeline\n",
"- `TrainData`: Location of the training data in S3\n",
"- `TestData`: Location of the test data in S3\n",
"- `RoleArn`: ARN (Amazon Resource Name) of the role used for pipeline execution\n",
@@ -216,9 +215,6 @@
"outputs": [],
"source": [
"processing_instance_count = ParameterInteger(name=\"ProcessingInstanceCount\", default_value=1)\n",
"processing_instance_type = ParameterString(\n",
" name=\"ProcessingInstanceType\", default_value=\"ml.m5.xlarge\"\n",
")\n",
"\n",
"input_train = ParameterString(\n",
" name=\"TrainData\",\n",
@@ -250,7 +246,7 @@
"source": [
"sklearn_processor = SKLearnProcessor(\n",
" framework_version=\"0.23-1\",\n",
" instance_type=processing_instance_type,\n",
" instance_type=\"ml.m5.xlarge\",\n",
" instance_count=processing_instance_count,\n",
" base_job_name=\"comprehend-process\",\n",
" sagemaker_session=sagemaker_session,\n",
@@ -497,7 +493,6 @@
"pipeline = Pipeline(\n",
" name=pipeline_name,\n",
" parameters=[\n",
" processing_instance_type,\n",
" processing_instance_count,\n",
" input_train,\n",
" input_test,\n",
@@ -211,7 +211,6 @@
"\n",
"The parameters defined in this workflow include:\n",
"\n",
"* `processing_instance_type` - The `ml.*` instance type of the processing job.\n",
"* `processing_instance_count` - The instance count of the processing job.\n",
"* `instance_type` - The `ml.*` instance type of the training job.\n",
"* `model_approval_status` - The approval status to register with the trained model for CI/CD purposes (\"PendingManualApproval\" is the default).\n",
@@ -234,9 +233,6 @@
"\n",
"\n",
"processing_instance_count = ParameterInteger(name=\"ProcessingInstanceCount\", default_value=1)\n",
"processing_instance_type = ParameterString(\n",
" name=\"ProcessingInstanceType\", default_value=\"ml.m5.xlarge\"\n",
")\n",
"instance_type = ParameterString(name=\"TrainingInstanceType\", default_value=\"ml.m5.xlarge\")\n",
"model_approval_status = ParameterString(\n",
" name=\"ModelApprovalStatus\", default_value=\"PendingManualApproval\"\n",
@@ -392,7 +388,7 @@
"\n",
"You also specify the `framework_version` to use throughout this notebook.\n",
"\n",
"Note the `processing_instance_type` and `processing_instance_count` parameters used by the processor instance."
"Note the `processing_instance_count` parameter used by the processor instance."
]
},
{
@@ -408,7 +404,7 @@
"\n",
"sklearn_processor = SKLearnProcessor(\n",
" framework_version=framework_version,\n",
" instance_type=processing_instance_type,\n",
" instance_type=\"ml.m5.xlarge\",\n",
" instance_count=processing_instance_count,\n",
" base_job_name=\"sklearn-abalone-process\",\n",
" role=role,\n",
@@ -626,9 +622,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, create an instance of a `ScriptProcessor` processor and use it in the `ProcessingStep`.\n",
"\n",
"Note the `processing_instance_type` parameter passed into the processor."
"Next, create an instance of a `ScriptProcessor` processor and use it in the `ProcessingStep`."
]
},
{
@@ -643,7 +637,7 @@
"script_eval = ScriptProcessor(\n",
" image_uri=image_uri,\n",
" command=[\"python3\"],\n",
" instance_type=processing_instance_type,\n",
" instance_type=\"ml.m5.xlarge\",\n",
" instance_count=1,\n",
" base_job_name=\"script-abalone-eval\",\n",
" role=role,\n",
@@ -983,7 +977,6 @@
"pipeline = Pipeline(\n",
" name=pipeline_name,\n",
" parameters=[\n",
" processing_instance_type,\n",
" processing_instance_count,\n",
" instance_type,\n",
" model_approval_status,\n",
@@ -1179,7 +1172,6 @@
"source": [
"execution = pipeline.start(\n",
" parameters=dict(\n",
" ProcessingInstanceType=\"ml.c5.xlarge\",\n",
" ModelApprovalStatus=\"Approved\",\n",
" )\n",
")"
@@ -1272,9 +1264,9 @@
"metadata": {
"instance_type": "ml.t3.medium",
"kernelspec": {
"display_name": "Python 3 (Data Science)",
"display_name": "Python 3",
"language": "python",
"name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-east-2:429704687514:image/datascience-1.0"
"name": "python3"
},
"language_info": {
"codemirror_mode": {
@@ -1286,9 +1278,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
"version": "3.6.14"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
@@ -941,8 +941,7 @@
"output_data = ParameterString(\n",
" name=\"OutputData\", default_value=f\"s3://{default_bucket}/{taxi_prefix}_output/\"\n",
")\n",
"training_instance_count = ParameterInteger(name=\"TrainingInstanceCount\", default_value=1)\n",
"training_instance_type = ParameterString(name=\"TrainingInstanceType\", default_value=\"ml.c5.xlarge\")"
"training_instance_count = ParameterInteger(name=\"TrainingInstanceCount\", default_value=1)"
]
},
{
@@ -1036,7 +1035,7 @@
" sagemaker.get_execution_role(),\n",
" output_path=\"s3://{}/{}/output\".format(default_bucket, model_prefix),\n",
" instance_count=training_instance_count,\n",
" instance_type=training_instance_type,\n",
" instance_type=\"ml.c5.xlarge\",\n",
" sagemaker_session=session,\n",
")\n",
"\n",
@@ -1229,7 +1228,6 @@
" name=pipeline_name,\n",
" parameters=[\n",
" input_data,\n",
" training_instance_type,\n",
" training_instance_count,\n",
" id_out,\n",
" ],\n",
@@ -90,9 +90,6 @@ def get_pipeline(

# Parameters for pipeline execution
processing_instance_count = ParameterInteger(name="ProcessingInstanceCount", default_value=1)
processing_instance_type = ParameterString(
name="ProcessingInstanceType", default_value="ml.m5.xlarge"
)
training_instance_type = ParameterString(
name="TrainingInstanceType", default_value="ml.m5.xlarge"
)
@@ -108,7 +105,7 @@
# Processing step for feature engineering
sklearn_processor = SKLearnProcessor(
framework_version="0.23-1",
instance_type=processing_instance_type,
instance_type="ml.m5.xlarge",
instance_count=processing_instance_count,
base_job_name=f"{base_job_prefix}/sklearn-CustomerChurn-preprocess", # choose any name
sagemaker_session=sagemaker_session,
@@ -133,7 +130,7 @@
region=region,
version="1.0-1",
py_version="py3",
instance_type=training_instance_type,
instance_type="ml.m5.xlarge",
)
xgb_train = Estimator(
image_uri=image_uri,
@@ -177,7 +174,7 @@
script_eval = ScriptProcessor(
image_uri=image_uri,
command=["python3"],
instance_type=processing_instance_type,
instance_type="ml.m5.xlarge",
instance_count=1,
base_job_name=f"{base_job_prefix}/script-CustomerChurn-eval",
sagemaker_session=sagemaker_session,
@@ -254,7 +251,6 @@
pipeline = Pipeline(
name=pipeline_name,
parameters=[
processing_instance_type,
processing_instance_count,
training_instance_type,
model_approval_status,
@@ -246,7 +246,7 @@
"dictionary whose names are the parameter names, and whose values are the primitive values to use as overrides of the defaults.\n",
"\n",
"Of particular note, based on the performance of the model, we may want to kick off another pipeline execution, but this \n",
"time on a compute-optimized instance type and set the model approval status automatically be \"Approved\". This means \n",
"time set the model approval status automatically be \"Approved\". This means\n",
"that the model package version generated by the `RegisterModel` step will automatically be ready for deployment through \n",
"CI/CD pipelines, such as with SageMaker Projects.\n",
"\n",
@@ -255,7 +254,6 @@
"\n",
"execution = pipeline.start(\n",
" parameters=dict(\n",
" ProcessingInstanceType=\"ml.c5.xlarge\",\n",
" ModelApprovalStatus=\"Approved\",\n",
" )\n",
")\n",
@@ -285,17 +284,8 @@
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
},
"pycharm": {
"stem_cell": {
"cell_type": "raw",
"metadata": {
"collapsed": false
},
"source": []
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}
@@ -204,9 +204,6 @@
"s3_prefix = \"lambda-step-pipeline\"\n",
"\n",
"processing_instance_count = ParameterInteger(name=\"ProcessingInstanceCount\", default_value=1)\n",
"processing_instance_type = ParameterString(\n",
" name=\"ProcessingInstanceType\", default_value=\"ml.m5.xlarge\"\n",
")\n",
"training_instance_type = ParameterString(name=\"TrainingInstanceType\", default_value=\"ml.m5.xlarge\")\n",
"model_approval_status = ParameterString(\n",
" name=\"ModelApprovalStatus\", default_value=\"PendingManualApproval\"\n",
@@ -378,7 +375,7 @@
"\n",
"sklearn_processor = SKLearnProcessor(\n",
" framework_version=\"0.23-1\",\n",
" instance_type=processing_instance_type,\n",
" instance_type=\"ml.m5.xlarge\",\n",
" instance_count=processing_instance_count,\n",
" base_job_name=f\"{base_job_prefix}/sklearn-abalone-preprocess\",\n",
" sagemaker_session=sagemaker_session,\n",
@@ -557,7 +554,7 @@
"script_eval = ScriptProcessor(\n",
" image_uri=image_uri,\n",
" command=[\"python3\"],\n",
" instance_type=processing_instance_type,\n",
" instance_type=\"ml.m5.xlarge\",\n",
" instance_count=1,\n",
" base_job_name=f\"{prefix}/{base_job_prefix}/sklearn-abalone-preprocess\",\n",
" sagemaker_session=sagemaker_session,\n",
@@ -802,7 +799,6 @@
"pipeline = Pipeline(\n",
" name=pipeline_name,\n",
" parameters=[\n",
" processing_instance_type,\n",
" processing_instance_count,\n",
" training_instance_type,\n",
" input_data,\n",
@@ -238,9 +238,6 @@
"outputs": [],
"source": [
"processing_instance_count = ParameterInteger(name=\"ProcessingInstanceCount\", default_value=1)\n",
"processing_instance_type = ParameterString(\n",
" name=\"ProcessingInstanceType\", default_value=\"ml.m5.xlarge\"\n",
")\n",
"training_instance_type = ParameterString(name=\"TrainingInstanceType\", default_value=\"ml.m5.xlarge\")\n",
"model_approval_status = ParameterString(\n",
" name=\"ModelApprovalStatus\", default_value=\"PendingManualApproval\"\n",
@@ -454,7 +451,7 @@
"source": [
"sklearn_processor = SKLearnProcessor(\n",
" framework_version=\"0.23-1\",\n",
" instance_type=processing_instance_type,\n",
" instance_type=\"ml.m5.xlarge\",\n",
" instance_count=processing_instance_count,\n",
" base_job_name=f\"{base_job_prefix}/sklearn-abalone-preprocess\",\n",
" sagemaker_session=sagemaker_session,\n",
@@ -975,7 +972,7 @@
"script_eval = ScriptProcessor(\n",
" image_uri=image_uri,\n",
" command=[\"python3\"],\n",
" instance_type=processing_instance_type,\n",
" instance_type=\"ml.m5.xlarge\",\n",
" instance_count=1,\n",
" base_job_name=f\"{base_job_prefix}/script-abalone-eval\",\n",
" sagemaker_session=sagemaker_session,\n",
@@ -1169,7 +1166,6 @@
"pipeline = Pipeline(\n",
" name=pipeline_name,\n",
" parameters=[\n",
" processing_instance_type,\n",
" processing_instance_count,\n",
" training_instance_type,\n",
" model_approval_status,\n",
@@ -285,13 +285,7 @@
"# raw input data\n",
"input_data = ParameterString(name=\"InputData\", default_value=raw_s3)\n",
"\n",
"# processing step parameters\n",
"processing_instance_type = ParameterString(\n",
" name=\"ProcessingInstanceType\", default_value=\"ml.m5.large\"\n",
")\n",
"\n",
"# training step parameters\n",
"training_instance_type = ParameterString(name=\"TrainingInstanceType\", default_value=\"ml.m5.large\")\n",
"training_epochs = ParameterString(name=\"TrainingEpochs\", default_value=\"100\")\n",
"\n",
"# model performance step parameters\n",
@@ -376,7 +370,7 @@
"sklearn_processor = SKLearnProcessor(\n",
" framework_version=framework_version,\n",
" role=role,\n",
" instance_type=processing_instance_type,\n",
" instance_type=\"ml.m5.large\",\n",
" instance_count=1,\n",
" base_job_name=\"tf2-california-housing-processing-job\",\n",
")\n",
@@ -428,7 +422,7 @@
"tf2_estimator = TensorFlow(\n",
" source_dir=\"code\",\n",
" entry_point=\"train.py\",\n",
" instance_type=training_instance_type,\n",
" instance_type=\"ml.m5.large\",\n",
" instance_count=1,\n",
" framework_version=tensorflow_version,\n",
" role=role,\n",
@@ -536,7 +530,7 @@
"# The object contains information about what container to use, what instance type etc.\n",
"evaluate_model_processor = SKLearnProcessor(\n",
" framework_version=framework_version,\n",
" instance_type=processing_instance_type,\n",
" instance_type=\"ml.m5.large\",\n",
" instance_count=1,\n",
" base_job_name=\"tf2-california-housing-evaluate\",\n",
" role=role,\n",
@@ -778,7 +772,7 @@
"step_create_model = CreateModelStep(\n",
" name=\"Create-California-Housing-Model\",\n",
" model=model,\n",
" inputs=sagemaker.inputs.CreateModelInput(instance_type=endpoint_instance_type),\n",
" inputs=sagemaker.inputs.CreateModelInput(instance_type=\"ml.m5.large\"),\n",
")"
]
},
@@ -1005,8 +999,6 @@
"pipeline = Pipeline(\n",
" name=pipeline_name,\n",
" parameters=[\n",
" processing_instance_type,\n",
" training_instance_type,\n",
" input_data,\n",
" training_epochs,\n",
" accuracy_mse_threshold,\n",
@@ -1278,9 +1270,9 @@
"metadata": {
"instance_type": "ml.t3.medium",
"kernelspec": {
"display_name": "Python 3 (Data Science)",
"display_name": "Python 3",
"language": "python",
"name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-east-1:081325390199:image/datascience-1.0"
"name": "python3"
},
"language_info": {
"codemirror_mode": {
@@ -1292,7 +1284,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.10"
"version": "3.6.14"
}
},
"nbformat": 4,