diff --git a/sagemaker_batch_transform/introduction_to_batch_transform/batch_transform_pca_dbscan_movie_clusters.ipynb b/sagemaker_batch_transform/introduction_to_batch_transform/batch_transform_pca_dbscan_movie_clusters.ipynb
index 7393604be8..420935fe05 100644
--- a/sagemaker_batch_transform/introduction_to_batch_transform/batch_transform_pca_dbscan_movie_clusters.ipynb
+++ b/sagemaker_batch_transform/introduction_to_batch_transform/batch_transform_pca_dbscan_movie_clusters.ipynb
@@ -261,7 +261,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Now, we'll setup to split our dataset into train and test. Dimensionality reduction and clustering don't always require a holdout set to test accuracy, but it will allow us to illustrate how batch prediction might be used when new data arrives. In this case, our test dataset will be a simple 1% sample of items."
+    "Now, we'll setup to split our dataset into train and test. Dimensionality reduction and clustering don't always require a holdout set to test accuracy, but it will allow us to illustrate how batch prediction might be used when new data arrives. In this case, our test dataset will be a simple 0.5% sample of items."
    ]
   },
   {
@@ -270,7 +270,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "test_products = products.sample(frac=0.01)\n",
+    "test_products = products.sample(frac=0.005)\n",
     "train_products = products[~(products.index.isin(test_products.index))]"
    ]
   },