
Commit

update parameters for albert-base-v2 notebook and reformat it
Bruce Zhang committed Sep 14, 2022
1 parent 1b3e9ca commit 5adc710
Showing 3 changed files with 19 additions and 41 deletions.
First changed file:

@@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Compile and Train a Hugging Face Transformers Trainer Model for Question and Answering with the SQuAD dataset"
"# Compile and Train a Hugging Face Transformers Trainer Model for Question and Answering with the `SQuAD` dataset"
]
},
{
@@ -15,7 +15,7 @@
"2. [Introduction](#Introduction) \n",
"3. [SageMaker Environment and Permissions](#SageMaker-Environment-and-Permissions)\n",
" 1. [Installation](#Installation)\n",
"4. [Loading the SQuAD dataset](#Loading-the-SQuAD-dataset)\n",
"4. [Loading the `SQuAD` dataset](#Loading-the-SQuAD-dataset)\n",
"5. [Preprocessing](#Preprocessing) \n",
"6. [SageMaker Training Job](#SageMaker-Training-Job) \n",
" 1. [Training with Native PyTorch](#Training-with-Native-PyTorch) \n",
@@ -38,7 +38,7 @@
"\n",
"## Introduction\n",
"\n",
"This example notebook demonstrates how to compile and fine-tune a question and answering NLP task. We use Hugging Face's `transformers` and `datasets` libraries with Amazon Sagemaker Training Compiler to accelerate fine-tuning of a pre-trained transformer model on question and answering. In particular, the pre-trained model will be fine-tuned using the `SQuAD` dataset. To get started, we need to set up the environment with a few prerequisite steps to add permissions, configurations, and so on. \n",
"This example notebook demonstrates how to compile and fine-tune a question and answering NLP task. We use Hugging Face's `transformers` and `datasets` libraries with Amazon SageMaker Training Compiler to accelerate fine-tuning of a pre-trained transformer model on question and answering. In particular, the pre-trained model will be fine-tuned using the `SQuAD` dataset. To get started, we need to set up the environment with a few prerequisite steps to add permissions, configurations, and so on. \n",
"\n",
"**NOTE:** You can run this demo in SageMaker Studio, SageMaker notebook instances, or your local machine with AWS CLI set up. If using SageMaker Studio or SageMaker notebook instances, make sure you choose one of the PyTorch-based kernels, `Python 3 (PyTorch x.y Python 3.x CPU Optimized)` or `conda_pytorch_p36` respectively.\n",
"\n",
@@ -67,7 +67,7 @@
"metadata": {},
"outputs": [],
"source": [
"!pip install \"sagemaker>=2.108.0\" botocore boto3 awscli s3fs typing-extensions --upgrade"
"!pip install \"sagemaker>=2.108.0\" botocore boto3 awscli s3fs typing-extensions \"torch==1.11.0\" --upgrade"
]
},
{
@@ -98,7 +98,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Copy and run the following code if you need to upgrade ipywidgets for `datasets` library and restart kernel. This is only needed when prerpocessing is done in the notebook.\n",
"Copy and run the following code if you need to upgrade `ipywidgets` for `datasets` library and restart kernel. This is only needed when prepocessing is done in the notebook.\n",
"\n",
"```python\n",
"%%capture\n",
@@ -120,7 +120,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note:** If you are going to use Sagemaker in a local environment. You need access to an IAM Role with the required permissions for SageMaker. To learn more, see [SageMaker Roles](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html)."
"**Note:** If you are going to use SageMaker in a local environment. You need access to an IAM Role with the required permissions for SageMaker. To learn more, see [SageMaker Roles](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html)."
]
},
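To make the IAM-role note above concrete, here is a minimal, hedged sketch of resolving the execution role both inside SageMaker and from a local machine; the fallback role name is a placeholder, not part of the original notebook.

```python
# Hedged sketch: resolve the SageMaker execution role. Inside SageMaker Studio or
# a notebook instance, get_execution_role() works directly; on a local machine it
# raises a ValueError, so we fall back to looking up a role by name.
import boto3
import sagemaker

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client("iam")
    # "AmazonSageMaker-ExecutionRole" is a placeholder; use your own role name.
    role = iam.get_role(RoleName="AmazonSageMaker-ExecutionRole")["Role"]["Arn"]

print(f"Using execution role: {role}")
```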
{
@@ -153,7 +153,7 @@
"id": "whPRbBNbIrIl"
},
"source": [
"## Loading the SQuAD dataset"
"## Loading the `SQuAD` dataset"
]
},
{
@@ -169,10 +169,10 @@
"\n",
"If you'd like to try other training datasets later, you can simply use this method.\n",
"\n",
"For this example notebook, we prepared the SQuAD v1.1 dataset in the public SageMaker sample file S3 bucket. The following code cells show how you can directly load the dataset and convert to a HuggingFace DatasetDict.\n",
"For this example notebook, we prepared the `SQuAD v1.1 dataset` in the public SageMaker sample file S3 bucket. The following code cells show how you can directly load the dataset and convert to a `HuggingFace DatasetDict`.\n",
"\n",
"\n",
"**NOTE:** The [SQuAD dataset](https://rajpurkar.github.io/SQuAD-explorer/) is under the [CC BY-SA 4.0 license terms](https://creativecommons.org/licenses/by-sa/4.0/)."
"**NOTE:** The [`SQuAD` dataset](https://rajpurkar.github.io/SQuAD-explorer/) is under the [CC BY-SA 4.0 license terms](https://creativecommons.org/licenses/by-sa/4.0/)."
]
},
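As a rough illustration of the loading approach described above (the bucket name, object keys, and flattening details are assumptions, not the notebook's code), the dataset files could be pulled from S3 and wrapped in a `DatasetDict` like this:

```python
# Hedged sketch: download the prepared SQuAD v1.1 JSON files from a public S3
# bucket and wrap them in a Hugging Face DatasetDict. The bucket name and keys
# below are placeholders; adjust them to where the files actually live.
import boto3
from datasets import DatasetDict, load_dataset

s3 = boto3.client("s3")
s3.download_file("sagemaker-sample-files", "datasets/text/squad/train-v1.1.json", "train-v1.1.json")
s3.download_file("sagemaker-sample-files", "datasets/text/squad/dev-v1.1.json", "dev-v1.1.json")

# SQuAD JSON stores articles under a top-level "data" field; further flattening
# into (question, context, answer) rows depends on how preprocessing is done.
datasets = DatasetDict(
    {
        "train": load_dataset("json", data_files="train-v1.1.json", field="data", split="train"),
        "validation": load_dataset("json", data_files="dev-v1.1.json", field="data", split="train"),
    }
)
print(datasets)
```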
{
@@ -322,7 +322,7 @@
"id": "Vl6IidfdIrJK"
},
"source": [
"The following assertion ensures that our tokenizer is a fast tokenizers (backed by Rust) from the 🤗 Tokenizers library. Those fast tokenizers are available for almost all models, and we will need some of the special features they have for our preprocessing."
"The following assertion ensures that our tokenizer is a fast tokenizer (backed by Rust) from the 🤗 Tokenizers library. Those fast tokenizers are available for almost all models, and we will need some of the special features they have for our preprocessing."
]
},
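A minimal sketch of that assertion, assuming the `albert-base-v2` checkpoint named in the conclusion is the model being tokenized:

```python
# Hedged sketch of the fast-tokenizer check described above.
import transformers
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("albert-base-v2")

# PreTrainedTokenizerFast is the Rust-backed base class from the Tokenizers library.
assert isinstance(tokenizer, transformers.PreTrainedTokenizerFast)
```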
{
@@ -515,7 +515,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Before we kick off our SageMaker training job we need to transfer our dataset to S3 so the training job can download it from S3."
"Before we kick off our SageMaker training job we need to transfer our dataset to S3, so the training job can download it from S3."
]
},
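One way this upload might look (a sketch only; it assumes `train_dataset` and `validation_dataset` already exist and that a recent `datasets` release with `s3fs`/`fsspec` support is installed):

```python
# Hedged sketch: save the processed datasets directly to S3 so the training job
# can download them. Bucket and prefix come from the default SageMaker session;
# the prefix below is illustrative.
import sagemaker

sess = sagemaker.Session()
prefix = "samples/datasets/squad"

training_input_path = f"s3://{sess.default_bucket()}/{prefix}/train"
eval_input_path = f"s3://{sess.default_bucket()}/{prefix}/eval"

# Recent datasets releases can write s3:// URIs through fsspec/s3fs, which the
# notebook installs; older releases need an explicit filesystem argument.
train_dataset.save_to_disk(training_input_path)
validation_dataset.save_to_disk(eval_input_path)

print(f"Uploaded training data to {training_input_path}")
print(f"Uploaded validation data to {eval_input_path}")
```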
{
@@ -583,11 +583,11 @@
"source": [
"Below, we run a native PyTorch training job with the `PyTorch` estimator on a `ml.p3.2xlarge` instance. \n",
"\n",
"We run a batch size of 28 on our native training job and 52 on our Training Compiler training job to make an apples to apples comparision. These batch sizes along with the max_length variable get us close to 100% GPU memory utilization.\n",
"We run a batch size of 28 on our native training job and 52 on our Training Compiler training job to make an apple to apple comparison. These batch sizes along with the max_length variable get us close to 100% GPU memory utilization.\n",
"\n",
"We recommend using the tested batch size that's provided at [Tested Models](https://docs.aws.amazon.com/sagemaker/latest/dg/training-compiler-support.html#training-compiler-tested-models) in the *SageMaker Training Compiler Developer Guide*.\n",
"\n",
"![gpu mem](images/gpumem.png)"
"![`GPU MEM`](images/gpumem.png)"
]
},
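For orientation, here is a hedged sketch of what the native baseline estimator could look like; the entry-point script name, framework versions, and hyperparameters are illustrative assumptions rather than the notebook's exact values.

```python
# Hedged sketch of the native (uncompiled) baseline training job.
import sagemaker
from sagemaker.huggingface import HuggingFace

role = sagemaker.get_execution_role()

native_estimator = HuggingFace(
    entry_point="qa_trainer.py",            # hypothetical training script name
    source_dir="./scripts",
    instance_type="ml.p3.2xlarge",
    instance_count=1,
    role=role,
    transformers_version="4.21",            # versions are assumptions; match a real HF DLC
    pytorch_version="1.11",
    py_version="py38",
    hyperparameters={
        "model_name_or_path": "albert-base-v2",
        "per_device_train_batch_size": 28,  # native batch size from the text
        "max_length": 512,                  # illustrative value
        "epochs": 2,                        # illustrative value
    },
)

# native_estimator.fit({"train": training_input_path, "eval": eval_input_path}, wait=False)
```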
{
@@ -669,6 +669,7 @@
"outputs": [],
"source": [
"from sagemaker.huggingface import HuggingFace, TrainingCompilerConfig\n",
"\n",
"# an updated max batch size that can fit into GPU memory with compiler\n",
"batch_size = 52\n",
"\n",
@@ -920,13 +921,6 @@
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![throughput](images/throughput.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -943,9 +937,7 @@
"outputs": [],
"source": [
"sm = boto3.client(\"sagemaker\")\n",
"native_job = sm.describe_training_job(\n",
" TrainingJobName=native_estimator.latest_training_job.name\n",
")\n",
"native_job = sm.describe_training_job(TrainingJobName=native_estimator.latest_training_job.name)\n",
"\n",
"compile_job = sm.describe_training_job(TrainingJobName=compile_estimator.latest_training_job.name)\n",
"\n",
@@ -965,13 +957,6 @@
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![training time](images/trainingtime.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -1001,20 +986,13 @@
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![loss](images/loss.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conclusion\n",
"\n",
"In this example, we fine-tuned an [ALBERT model](https://huggingface.co/albert-base-v2) (`albert-base-v2`) with the SQuAD dataset and compared a native training job with a SageMaker Training Compiler training job. The Training Compiler job has `86% higher throughput` and `40% quicker training` time while training loss was equal with the native pytorch training job."
"In this example, we fine-tuned an [ALBERT model](https://huggingface.co/albert-base-v2) (`albert-base-v2`) with the `SQuAD` dataset and compared a native training job with a SageMaker Training Compiler training job. The Training Compiler job has `93% higher throughput` and `38% quicker training` time while training loss was equal with the native PyTorch training job."
]
},
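As a hedged sketch of where such percentages could come from (the field name follows the `DescribeTrainingJob` API; `native_job` and `compile_job` are the job descriptions fetched earlier in the notebook):

```python
# Hedged sketch: derive a headline speedup figure from the two job descriptions.
native_seconds = native_job["BillableTimeInSeconds"]
compile_seconds = compile_job["BillableTimeInSeconds"]

time_saved_pct = 100 * (native_seconds - compile_seconds) / native_seconds
print(f"Training Compiler job finished {time_saved_pct:.0f}% faster (billable time) than the native job.")
```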
{
Second changed file:

@@ -112,7 +112,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Copy and run the following code if you need to upgrade \"ipywidgets\" for `datasets` library and restart kernel. This is only needed when preprocessing is done in the notebook.\n",
"Copy and run the following code if you need to upgrade `ipywidgets` for `datasets` library and restart kernel. This is only needed when preprocessing is done in the notebook.\n",
"\n",
"```python\n",
"%%capture\n",
Third changed file:

@@ -99,7 +99,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Copy and run the following code if you need to upgrade `ipywidgets` for `datasets` library and restart kernel. This is only needed when prepocessing is done in the notebook.\n",
"Copy and run the following code if you need to upgrade `ipywidgets` for `datasets` library and restart kernel. This is only needed when preprocessing is done in the notebook.\n",
"\n",
"```python\n",
"%%capture\n",
@@ -302,7 +302,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Set up an option for fine-tuning or full training. `FINE_TUNING = 1` is for fine-tuning and it will use `fine_tune_with_huggingface.py`. `FINE_TUNING = 0` is for full training and it will use `full_train_roberta_with_huggingface.py`."
"Set up an option for fine-tuning or full training. `FINE_TUNING = 1` is for fine-tuning, and it will use `fine_tune_with_huggingface.py`. `FINE_TUNING = 0` is for full training, and it will use `full_train_roberta_with_huggingface.py`."
]
},
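A minimal sketch of that toggle (the two script names come from the text above; everything else is illustrative):

```python
# Hedged sketch: choose the training script based on the FINE_TUNING flag.
FINE_TUNING = 1  # 1 = fine-tune, 0 = full training

TRAINING_SCRIPT = (
    "fine_tune_with_huggingface.py"
    if FINE_TUNING == 1
    else "full_train_roberta_with_huggingface.py"
)
print(f"Using training script: {TRAINING_SCRIPT}")
```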
{
