Commit

edit code format
Bruce Zhang committed Sep 22, 2022
1 parent 618b49d commit 5c64ca5
Showing 3 changed files with 36 additions and 43 deletions.
@@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Compile and Train a Hugging Face Transformers Trainer Model for Question and Answering with the `SQuAD` dataset"
"# Compile and Train a Hugging Face Transformers Trainer Model for Question and Answering with the SQuAD dataset"
]
},
{
@@ -15,7 +15,7 @@
"2. [Introduction](#Introduction) \n",
"3. [SageMaker Environment and Permissions](#SageMaker-Environment-and-Permissions)\n",
" 1. [Installation](#Installation)\n",
"4. [Loading the `SQuAD` dataset](#Loading-the-SQuAD-dataset)\n",
"4. [Loading the SQuAD dataset](#Loading-the-SQuAD-dataset)\n",
"5. [Preprocessing](#Preprocessing) \n",
"6. [SageMaker Training Job](#SageMaker-Training-Job) \n",
" 1. [Training with Native PyTorch](#Training-with-Native-PyTorch) \n",
@@ -38,11 +38,11 @@
"\n",
"## Introduction\n",
"\n",
"This example notebook demonstrates how to compile and fine-tune a question and answering NLP task. We use Hugging Face's `transformers` and `datasets` libraries with Amazon SageMaker Training Compiler to accelerate fine-tuning of a pre-trained transformer model on question and answering. In particular, the pre-trained model will be fine-tuned using the `SQuAD` dataset. To get started, we need to set up the environment with a few prerequisite steps to add permissions, configurations, and so on. \n",
"This example notebook demonstrates how to compile and fine-tune a question and answering NLP task. We use HuggingFace's transformers and datasets libraries with Amazon SageMaker Training Compiler to accelerate fine-tuning of a pre-trained transformer model on question and answering. In particular, the pre-trained model will be fine-tuned using the SQuAD dataset. To get started, we need to set up the environment with a few prerequisite steps to add permissions, configurations, and so on. \n",
"\n",
"**NOTE:** You can run this demo in SageMaker Studio, SageMaker notebook instances, or your local machine with AWS CLI set up. If using SageMaker Studio or SageMaker notebook instances, make sure you choose one of the PyTorch-based kernels, `Python 3 (PyTorch x.y Python 3.x CPU Optimized)` or `conda_pytorch_p36` respectively.\n",
"**NOTE:** You can run this demo in SageMaker Studio, SageMaker notebook instances, or your local machine with AWS CLI set up. If using SageMaker Studio or SageMaker notebook instances, make sure you choose one of the PyTorch-based kernels, Python 3 (PyTorch x.y Python 3.x CPU Optimized) or conda_pytorch_p36 respectively.\n",
"\n",
"**NOTE:** This notebook uses two `ml.p3.2xlarge` instances that have single GPU. If you don't have enough quota, see [Request a service quota increase for SageMaker resources](https://docs.aws.amazon.com/sagemaker/latest/dg/regions-quotas.html#service-limit-increase-request-procedure). "
"**NOTE:** This notebook uses two ml.p3.2xlarge instances that have single GPU. If you don't have enough quota, see [Request a service quota increase for SageMaker resources](https://docs.aws.amazon.com/sagemaker/latest/dg/regions-quotas.html#service-limit-increase-request-procedure). "
]
},
{
@@ -153,7 +153,7 @@
"id": "whPRbBNbIrIl"
},
"source": [
"## Loading the `SQuAD` dataset"
"## Loading the SQuAD dataset"
]
},
{
@@ -169,10 +169,10 @@
"\n",
"If you'd like to try other training datasets later, you can simply use this method.\n",
"\n",
"For this example notebook, we prepared the `SQuAD v1.1 dataset` in the public SageMaker sample file S3 bucket. The following code cells show how you can directly load the dataset and convert to a `HuggingFace DatasetDict`.\n",
"For this example notebook, we prepared the SQuAD v1.1 dataset in the public SageMaker sample file S3 bucket. The following code cells show how you can directly load the dataset and convert to a HuggingFace DatasetDict.\n",
"\n",
"\n",
"**NOTE:** The [`SQuAD` dataset](https://rajpurkar.github.io/SQuAD-explorer/) is under the [CC BY-SA 4.0 license terms](https://creativecommons.org/licenses/by-sa/4.0/)."
"**NOTE:** The [SQuAD dataset](https://rajpurkar.github.io/SQuAD-explorer/) is under the [CC BY-SA 4.0 license terms](https://creativecommons.org/licenses/by-sa/4.0/)."
]
},
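For reference, here is a minimal sketch of what such loading cells might look like; the bucket name, object keys, and local file names are placeholders rather than the notebook's actual values:

```python
# Sketch only: download SQuAD v1.1 from a public S3 bucket and wrap it in a
# Hugging Face DatasetDict. Bucket and key names are illustrative placeholders.
import boto3
from datasets import load_dataset

s3 = boto3.client("s3")
s3.download_file("sagemaker-sample-files", "datasets/text/squad/train-v1.1.json", "train-v1.1.json")
s3.download_file("sagemaker-sample-files", "datasets/text/squad/dev-v1.1.json", "dev-v1.1.json")

# SQuAD ships as nested JSON under a top-level "data" field; the json builder
# returns a DatasetDict with one split per entry in data_files. The notebook's
# Preprocessing section then turns these nested records into
# question/context/answer examples before tokenization.
datasets = load_dataset(
    "json",
    data_files={"train": "train-v1.1.json", "validation": "dev-v1.1.json"},
    field="data",
)
print(datasets)
```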
{
@@ -563,11 +563,11 @@
"source": [
"## SageMaker Training Job\n",
"\n",
"To create a SageMaker training job, we use a `HuggingFace`/`PyTorch` estimator. Using the estimator, you can define which fine-tuning script should SageMaker use through `entry_point`, which `instance_type` to use for training, which `hyperparameters` to pass, and so on.\n",
"To create a SageMaker training job, we use a HuggingFace/PyTorch estimator. Using the estimator, you can define which fine-tuning script should SageMaker use through entry_point, which instance_type to use for training, which hyperparameters to pass, and so on.\n",
"\n",
"When a SageMaker training job starts, SageMaker takes care of starting and managing all the required machine learning instances, picks up the `HuggingFace` Deep Learning Container, uploads your training script, and downloads the data from `sagemaker_session_bucket` into the container at `/opt/ml/input/data`.\n",
"When a SageMaker training job starts, SageMaker takes care of starting and managing all the required machine learning instances, picks up the HuggingFace Deep Learning Container, uploads your training script, and downloads the data from sagemaker_session_bucket into the container at /opt/ml/input/data.\n",
"\n",
"In the following section, you learn how to set up two versions of the SageMaker `HuggingFace`/`PyTorch` estimator, a native one without the compiler and an optimized one with the compiler."
"In the following section, you learn how to set up two versions of the SageMaker HuggingFace/PyTorch estimator, a native one without the compiler and an optimized one with the compiler."
]
},
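As a rough sketch of the compiler-enabled variant (the script name, framework versions, and hyperparameter values below are assumptions, not the notebook's exact arguments):

```python
# Sketch only: a Training Compiler-enabled HuggingFace estimator. Script name,
# DLC versions, and hyperparameter values are assumptions; the batch size of 52
# and the albert-base-v2 model name come from the surrounding text.
import sagemaker
from sagemaker.huggingface import HuggingFace, TrainingCompilerConfig

optimized_estimator = HuggingFace(
    entry_point="qa_trainer.py",               # your fine-tuning script
    instance_type="ml.p3.2xlarge",             # single-GPU instance used in this notebook
    instance_count=1,
    role=sagemaker.get_execution_role(),
    transformers_version="4.21",               # pick a combination supported by Training Compiler
    pytorch_version="1.11",
    py_version="py38",
    compiler_config=TrainingCompilerConfig(),  # turns the compiler on
    hyperparameters={"epochs": 2, "train_batch_size": 52, "model_name": "albert-base-v2"},
)
```

The native baseline is essentially the same call without compiler_config and, in this notebook, with a smaller batch size.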
{
@@ -581,7 +581,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Below, we run a native PyTorch training job with the `PyTorch` estimator on a `ml.p3.2xlarge` instance. \n",
"Below, we run a native PyTorch training job with the PyTorch estimator on a ml.p3.2xlarge instance. \n",
"\n",
"We run a batch size of 28 on our native training job and 52 on our Training Compiler training job to make an apple to apple comparison. These batch sizes along with the max_length variable get us close to 100% GPU memory utilization.\n",
"\n",
@@ -1037,7 +1037,7 @@
"source": [
"## Conclusion\n",
"\n",
"In this example, we fine-tuned an [ALBERT model](https://huggingface.co/albert-base-v2) (`albert-base-v2`) with the `SQuAD` dataset and compared a native training job with a SageMaker Training Compiler training job. The Training Compiler job has `93% higher throughput` and `38% quicker training` time while training loss was equal with the native PyTorch training job."
"In this example, we fine-tuned an [ALBERT model](https://huggingface.co/albert-base-v2) (albert-base-v2) with the SQuAD dataset and compared a native training job with a SageMaker Training Compiler training job. The Training Compiler job has 93% higher throughput and 38% quicker training time while training loss was equal with the native PyTorch training job."
]
},
{
@@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Compile and Train a Hugging Face Transformer ``BERT`` Model with the SST Dataset using SageMaker Training Compiler"
"# Compile and Train a Hugging Face Transformer BERT Model with the SST Dataset using SageMaker Training Compiler"
]
},
{
@@ -50,13 +50,13 @@
"source": [
"## Introduction\n",
"\n",
"This notebook is an end-to-end binary text classification example. In this demo, we use the Hugging Face's `transformers` and `datasets` libraries with SageMaker Training Compiler to compile and fine-tune a pre-trained transformer for binary text classification. In particular, the pre-trained model will be fine-tuned using the `Stanford Sentiment Treebank (SST)` dataset. To get started, you need to set up the environment with a few prerequisite steps, for permissions, configurations, and so on. \n",
"This notebook is an end-to-end binary text classification example. In this demo, we use the Hugging Face's transformers and datasets libraries with SageMaker Training Compiler to compile and fine-tune a pre-trained transformer for binary text classification. In particular, the pre-trained model will be fine-tuned using the Stanford Sentiment Treebank (SST) dataset. To get started, you need to set up the environment with a few prerequisite steps, for permissions, configurations, and so on. \n",
"\n",
"![image.png](attachment:image.png)\n",
"\n",
"**NOTE:** You can run this demo in SageMaker Studio, SageMaker notebook instances, or your local machine with AWS CLI set up. If using SageMaker Studio or SageMaker notebook instances, make sure you choose one of the PyTorch-based kernels, `Python 3 (PyTorch x.y Python 3.x CPU Optimized)` or `conda_pytorch_p36` respectively.\n",
"**NOTE:** You can run this demo in SageMaker Studio, SageMaker notebook instances, or your local machine with AWS CLI set up. If using SageMaker Studio or SageMaker notebook instances, make sure you choose one of the PyTorch-based kernels, Python 3 (PyTorch x.y Python 3.x CPU Optimized) or conda_pytorch_p36 respectively.\n",
"\n",
"**NOTE:** This notebook uses two `ml.p3.2xlarge` instances that have single GPU. If you don't have enough quota, see [Request a service quota increase for SageMaker resources](https://docs.aws.amazon.com/sagemaker/latest/dg/regions-quotas.html#service-limit-increase-request-procedure). "
"**NOTE:** This notebook uses two ml.p3.2xlarge instances that have single GPU. If you don't have enough quota, see [Request a service quota increase for SageMaker resources](https://docs.aws.amazon.com/sagemaker/latest/dg/regions-quotas.html#service-limit-increase-request-procedure). "
]
},
{
@@ -112,7 +112,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Copy and run the following code if you need to upgrade `ipywidgets` for `datasets` library and restart kernel. This is only needed when preprocessing is done in the notebook.\n",
"Copy and run the following code if you need to upgrade ipywidgets for datasets library and restart kernel. This is only needed when preprocessing is done in the notebook.\n",
"\n",
"```python\n",
"%%capture\n",
@@ -176,7 +176,7 @@
"\n",
"If you'd like to try other training datasets later, you can simply use this method.\n",
"\n",
"For this example notebook, we prepared the [SST2 dataset](https://www.tensorflow.org/datasets/catalog/glue#gluesst2) in the public SageMaker sample S3 bucket. The following code cells show how you can directly load the dataset and convert to a `HuggingFace DatasetDict`."
"For this example notebook, we prepared the [SST2 dataset](https://www.tensorflow.org/datasets/catalog/glue#gluesst2) in the public SageMaker sample S3 bucket. The following code cells show how you can directly load the dataset and convert to a HuggingFace DatasetDict."
]
},
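For reference, a minimal sketch of such loading cells; the S3 prefix, file names, separator, and column names are assumptions for illustration:

```python
# Sketch only: read the prepared SST2 files from S3 with pandas (s3fs must be
# installed for s3:// paths) and convert them into a Hugging Face DatasetDict.
import pandas as pd
from datasets import Dataset, DatasetDict

base = "s3://sagemaker-sample-files/datasets/text/SST2"   # assumed prefix
train_df = pd.read_csv(f"{base}/sst2.train", sep="\t", names=["label", "text"])
test_df = pd.read_csv(f"{base}/sst2.test", sep="\t", names=["label", "text"])

datasets = DatasetDict({
    "train": Dataset.from_pandas(train_df),
    "test": Dataset.from_pandas(test_df),
})
print(datasets)
```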
{
@@ -343,7 +343,7 @@
"source": [
"### Uploading data to `sagemaker_session_bucket`\n",
"\n",
"After we processed the `datasets` we are going to use the new `FileSystem` [integration](https://huggingface.co/docs/datasets/filesystems.html) to upload our dataset to S3."
"After we processed the datasets we are going to use the new FileSystem [integration](https://huggingface.co/docs/datasets/filesystems.html) to upload our dataset to S3."
]
},
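A sketch of that upload step, following the pattern in the linked datasets FileSystem documentation. The S3 prefixes are placeholders, train_dataset and test_dataset stand for the processed splits produced by the preprocessing cells, and newer datasets releases pass storage_options instead of fs:

```python
# Sketch only: save the processed splits directly to the SageMaker session bucket.
import sagemaker
from datasets.filesystems import S3FileSystem

sess = sagemaker.Session()
bucket = sess.default_bucket()                            # the sagemaker_session_bucket
s3 = S3FileSystem()

training_input_path = f"s3://{bucket}/samples/datasets/sst2/train"
test_input_path = f"s3://{bucket}/samples/datasets/sst2/test"

train_dataset.save_to_disk(training_input_path, fs=s3)    # processed train split
test_dataset.save_to_disk(test_input_path, fs=s3)         # processed test split
```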
{
@@ -375,11 +375,11 @@
"source": [
"## SageMaker Training Job\n",
"\n",
"To create a SageMaker training job, we use a `HuggingFace/PyTorch` estimator. Using the estimator, you can define which fine-tuning script should SageMaker use through `entry_point`, which `instance_type` to use for training, which `hyperparameters` to pass, and so on.\n",
"To create a SageMaker training job, we use a HuggingFace/PyTorch estimator. Using the estimator, you can define which fine-tuning script should SageMaker use through entry_point, which instance_type to use for training, which hyperparameters to pass, and so on.\n",
"\n",
"When a SageMaker training job starts, SageMaker takes care of starting and managing all the required machine learning instances, picks up the `HuggingFace` Deep Learning Container, uploads your training script, and downloads the data from `sagemaker_session_bucket` into the container at `/opt/ml/input/data`.\n",
"When a SageMaker training job starts, SageMaker takes care of starting and managing all the required machine learning instances, picks up the HuggingFace Deep Learning Container, uploads your training script, and downloads the data from sagemaker_session_bucket into the container at /opt/ml/input/data.\n",
"\n",
"In the following section, you learn how to set up two versions of the SageMaker `HuggingFace/PyTorch` estimator, a native one without the compiler and an optimized one with the compiler."
"In the following section, you learn how to set up two versions of the SageMaker HuggingFace/PyTorch estimator, a native one without the compiler and an optimized one with the compiler."
]
},
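To make the flow concrete, a sketch of launching both jobs on the uploaded data. It assumes estimators configured as in the earlier sketches and the S3 paths from the upload step; the channel names are illustrative:

```python
# Sketch only: start the native and compiler-enabled jobs. Each channel is
# downloaded into the container under /opt/ml/input/data/<channel> and exposed
# to the training script via SM_CHANNEL_TRAIN / SM_CHANNEL_TEST.
data_channels = {
    "train": training_input_path,   # s3://.../sst2/train from the upload step
    "test": test_input_path,        # s3://.../sst2/test
}

native_estimator.fit(data_channels, wait=False)      # baseline without the compiler
optimized_estimator.fit(data_channels, wait=False)   # SageMaker Training Compiler job
```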
{