diff --git a/01-introduction.html b/01-introduction.html
index b50c8622..867839ba 100644
--- a/01-introduction.html
+++ b/01-introduction.html
@@ -167,7 +167,6 @@ <h2 class="accordion-header chapters" id="flush-headingEleven">
     <div id="flush-collapsecurrent" class="accordion-collapse collapse show" aria-labelledby="flush-headingcurrent" data-bs-parent="#accordionFlushcurrent">
       <div class="accordion-body">
         <ul><li><a href="#what-is-machine-learning">What is machine learning?</a></li>
-<li><a href="#training-data">Training Data</a></li>
 <li><a href="#deep-learning-machine-learning-and-artificial-intelligence">Deep Learning, Machine Learning and Artificial Intelligence</a></li>
 <li><a href="#what-is-image-classification">What is image classification?</a></li>
 <li><a href="#deep-learning-workflow">Deep Learning Workflow</a></li>
@@ -280,7 +279,7 @@ <h2 class="accordion-header" id="flush-headingTwelve">
       </div>
       <hr></nav><main id="main-content" class="main-content"><div class="container lesson-content">
         <h1>Introduction to Deep Learning</h1>
-        <p>Last updated on 2024-02-14 |
+        <p>Last updated on 2024-02-23 |
         
         <a href="https://https://github.com/erinmgraham/icwithcnn/edit/main/episodes/01-introduction.html" class="external-link">Edit this page <i aria-hidden="true" data-feather="edit"></i></a></p>
         
@@ -330,15 +329,9 @@ <h3 class="card-title">Objectives</h3>
 are many more.</p>
 <p>The techniques break down into two broad categories, predictors and
 classifiers. Predictors are used to predict a value (or set of values)
-given a set of inputs, for example trying to predict the cost of
-something given the economic conditions and the cost of raw materials or
-predicting a country’s GDP given its life expectancy. Classifiers try to
-classify data into different categories, or assign a label; for example,
-deciding what characters are visible in a picture of some writing or if
-an email or text message is spam or not.</p>
-</section><section id="training-data"><h2 class="section-heading">Training Data<a class="anchor" aria-label="anchor" href="#training-data"></a>
-</h2>
-<hr class="half-width"><p>Many, but not all, machine learning systems “learn” by taking a
+given a set of inputs whereas classifiers try to classify data into
+different categories, or assign a label.</p>
+<p>Many, but not all, machine learning systems “learn” by taking a
 series of input data and output data and using it to form a model. The
 maths behind the machine learning doesn’t care what the data is as long
 as it can represented numerically or categorised. Some examples might
@@ -377,18 +370,18 @@ <h3 class="callout-title">Callout<a class="anchor" aria-label="anchor" href="#ca
 <div class="callout-content">
 <p>Concept: Differentiation between traditional Machine Learning models
 and Deep Learning models:</p>
-<p><strong>Traditional ML algorithms</strong> can only use one (possibly
-two layers) of data transformation to calculate an output (shallow
-models). With high dimensional data and growing feature space (possible
-set of values for any given feature), shallow models quickly run out of
-layers to calculate outputs.</p>
-<p><strong>Deep neural networks</strong> (constructed with multiple
-layers of neurons) are the extension of shallow models with three
-layers: input, hidden, and outputs layers. The hidden layer is where
-learning takes place. As a result, deep learning is best applied to
-large datasets for training and prediction. As observations and feature
-inputs decrease, shallow ML approaches begin to perform noticeably
-better.</p>
+<p><strong>Traditional ML algorithms</strong>, known as shallow models,
+are limited to one or two layers of data transformation to generate an
+output. When dealing with complex, high-dimensional data and a growing
+feature space (i.e. many attributes and an expanding set of potential
+values for each feature), these shallow models become limited in their
+ability to compute accurate outputs.</p>
+<p><strong>Deep neural networks</strong> are the extension of shallow
+models with three types of layers: input, hidden, and output. The hidden
+layers are where learning takes place. As a result, deep learning is
+best applied to large datasets for training and prediction. As
+observations and feature inputs decrease, shallow ML approaches begin to
+perform noticeably better.</p>
 </div>
 </div>
 </div>
@@ -418,10 +411,11 @@ <h3 class="callout-title">Callout<a class="anchor" aria-label="anchor" href="#ca
 <li>
 <strong>Security and Surveillance</strong>: Detecting anomalies or
 unauthorised objects in security footage.</li>
-</ul><p>Convolutional Neural Networks (CNNs) have become a cornerstone in
-image classification due to their ability to automatically learn
-hierarchical features from images and achieve remarkable performance on
-a wide range of tasks.</p>
+</ul><p>A Convolutional Neural Network (CNN) is a Deep Learning algorithm
+that has become a cornerstone in image classification due to its ability
+to automatically learn features from images in a hierarchical fashion
+(i.e. each layer builds upon what was learned by the previous layer). It
+can achieve remarkable performance on a wide range of tasks.</p>
 </section><section id="deep-learning-workflow"><h2 class="section-heading">Deep Learning Workflow<a class="anchor" aria-label="anchor" href="#deep-learning-workflow"></a>
 </h2>
 <hr class="half-width"><p>To apply Deep Learning to a problem there are several steps to go
@@ -450,12 +444,12 @@ <h3 id="step-3--prepare-data">Step 3. Prepare data<a class="anchor" aria-label="
 the data structure will be explored in <a href="02-image-data">Episode
 02 Introduction to Image Data</a>.</p>
 <p>For this lesson, we will use an existing image dataset known as
-CIFAR-10. We will introduce this dataset and the different data
-preparation tasks in more detail in the next episode but for this
-introduction, we want to divide the data into <strong>training</strong>,
-<strong>validation</strong>, and <strong>test</strong> subsets;
-normalise the image pixel values to be between 0 and 1; and one-hot
-encode our image labels.</p>
+CIFAR-10 (Canadian Institute for Advanced Research). We will introduce
+this dataset and the different data preparation tasks in more detail in
+the next episode but for this introduction, we want to divide the data
+into <strong>training</strong>, <strong>validation</strong>, and
+<strong>test</strong> subsets; normalise the image pixel values to be
+between 0 and 1; and one-hot encode our image labels.</p>
 <div class="section level4">
 <h4 id="preparing-the-code">Preparing the code<a class="anchor" aria-label="anchor" href="#preparing-the-code"></a></h4>
 <p>It is the goal of this training workshop to produce a Deep Learning
@@ -467,28 +461,27 @@ <h4 id="preparing-the-code">Preparing the code<a class="anchor" aria-label="anch
 <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
 </h3>
 <pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="co"># load the required packages</span></span>
-<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> tensorflow <span class="im">import</span> keras <span class="co"># library for neural networks </span></span>
-<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> sklearn.model_selection <span class="im">import</span> train_test_split <span class="co"># library for splitting data into sets</span></span>
-<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> matplotlib.pyplot <span class="im">as</span> plt <span class="co"># library for plotting</span></span>
-<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> numpy <span class="im">as</span> np <span class="co"># library for working with images as arrays</span></span>
-<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a><span class="co"># load the CIFAR-10 dataset included with the keras library</span></span>
-<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>(train_images, train_labels), (test_images, test_labels) <span class="op">=</span> keras.datasets.cifar10.load_data()</span>
-<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a><span class="co"># normalise the RGB values to be between 0 and 1</span></span>
-<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a>train_images <span class="op">=</span> train_images <span class="op">/</span> <span class="fl">255.0</span></span>
-<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a>test_images <span class="op">=</span> test_images <span class="op">/</span> <span class="fl">255.0</span></span>
-<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a><span class="co"># create a list of class names</span></span>
-<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a>class_names <span class="op">=</span> [<span class="st">'airplane'</span>, <span class="st">'automobile'</span>, <span class="st">'bird'</span>, <span class="st">'cat'</span>, <span class="st">'deer'</span>, <span class="st">'dog'</span>, <span class="st">'frog'</span>, <span class="st">'horse'</span>, <span class="st">'ship'</span>, <span class="st">'truck'</span>]</span>
-<span id="cb1-16"><a href="#cb1-16" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb1-17"><a href="#cb1-17" aria-hidden="true" tabindex="-1"></a><span class="co"># one-hot encode labels</span></span>
-<span id="cb1-18"><a href="#cb1-18" aria-hidden="true" tabindex="-1"></a>train_labels <span class="op">=</span> keras.utils.to_categorical(train_labels, <span class="bu">len</span>(class_names))</span>
-<span id="cb1-19"><a href="#cb1-19" aria-hidden="true" tabindex="-1"></a>val_labels <span class="op">=</span> keras.utils.to_categorical(val_labels, <span class="bu">len</span>(class_names))</span>
-<span id="cb1-20"><a href="#cb1-20" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb1-21"><a href="#cb1-21" aria-hidden="true" tabindex="-1"></a><span class="co"># split the training data into training and validation sets</span></span>
-<span id="cb1-22"><a href="#cb1-22" aria-hidden="true" tabindex="-1"></a><span class="co"># </span><span class="al">NOTE</span><span class="co"> the function is train_test_split() but we are using it to split train into train and validation</span></span>
-<span id="cb1-23"><a href="#cb1-23" aria-hidden="true" tabindex="-1"></a>train_images, val_images, train_labels, val_labels <span class="op">=</span> train_test_split(train_images, train_labels, test_size<span class="op">=</span><span class="fl">0.2</span>, random_state<span class="op">=</span><span class="dv">42</span>)</span></code></pre>
+<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> tensorflow <span class="im">import</span> keras <span class="co"># for neural networks </span></span>
+<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> sklearn.model_selection <span class="im">import</span> train_test_split <span class="co"># for splitting data into sets</span></span>
+<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> matplotlib.pyplot <span class="im">as</span> plt <span class="co"># for plotting</span></span>
+<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a><span class="co"># load the CIFAR-10 dataset included with keras</span></span>
+<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>(train_images, train_labels), (test_images, test_labels) <span class="op">=</span> keras.datasets.cifar10.load_data()</span>
+<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a><span class="co"># normalise the RGB values to be between 0 and 1</span></span>
+<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a>train_images <span class="op">=</span> train_images <span class="op">/</span> <span class="fl">255.0</span></span>
+<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a>test_images <span class="op">=</span> test_images <span class="op">/</span> <span class="fl">255.0</span></span>
+<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a><span class="co"># create a list of class names</span></span>
+<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a>class_names <span class="op">=</span> [<span class="st">'airplane'</span>, <span class="st">'automobile'</span>, <span class="st">'bird'</span>, <span class="st">'cat'</span>, <span class="st">'deer'</span>, <span class="st">'dog'</span>, <span class="st">'frog'</span>, <span class="st">'horse'</span>, <span class="st">'ship'</span>, <span class="st">'truck'</span>]</span>
+<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-16"><a href="#cb1-16" aria-hidden="true" tabindex="-1"></a><span class="co"># one-hot encode labels</span></span>
+<span id="cb1-17"><a href="#cb1-17" aria-hidden="true" tabindex="-1"></a>train_labels <span class="op">=</span> keras.utils.to_categorical(train_labels, <span class="bu">len</span>(class_names))</span>
+<span id="cb1-18"><a href="#cb1-18" aria-hidden="true" tabindex="-1"></a>test_labels <span class="op">=</span> keras.utils.to_categorical(test_labels, <span class="bu">len</span>(class_names))</span>
+<span id="cb1-19"><a href="#cb1-19" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-20"><a href="#cb1-20" aria-hidden="true" tabindex="-1"></a><span class="co"># split the training data into training and validation sets</span></span>
+<span id="cb1-21"><a href="#cb1-21" aria-hidden="true" tabindex="-1"></a><span class="co"># </span><span class="al">NOTE</span><span class="co"> the function is train_test_split() but we are using it to split train into train and validation</span></span>
+<span id="cb1-22"><a href="#cb1-22" aria-hidden="true" tabindex="-1"></a>train_images, val_images, train_labels, val_labels <span class="op">=</span> train_test_split(train_images, train_labels, test_size<span class="op">=</span><span class="fl">0.2</span>, random_state<span class="op">=</span><span class="dv">42</span>)</span></code></pre>
 </div>
 <div id="challenge-examine-the-cifar-10-dataset" class="callout challenge">
 <div class="callout-square">
@@ -543,7 +536,7 @@ <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"><
 <span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a><span class="co"># plot a subset of the images </span></span>
 <span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(<span class="dv">25</span>):</span>
 <span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a>    plt.subplot(<span class="dv">5</span>,<span class="dv">5</span>,i<span class="op">+</span><span class="dv">1</span>)</span>
-<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>    plt.imshow(train_images[i], cmap<span class="op">=</span>plt.cm.binary)</span>
+<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>    plt.imshow(train_images[i])</span>
 <span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a>    plt.axis(<span class="st">'off'</span>)</span>
 <span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a>    plt.title(class_names[train_labels[i,].argmax()])</span>
 <span id="cb4-10"><a href="#cb4-10" aria-hidden="true" tabindex="-1"></a>plt.show()</span></code></pre>
@@ -646,35 +639,35 @@ <h4 id="what-does-this-output-mean">What does this output mean?<a class="anchor"
 <p>This output printed during the fit phase, i.e. training the model
 against known image labels, can be broken down as follows:</p>
 <ul><li><p><code>Epoch</code> describes the number of full passes over all
-<em>training data</em>. In the output above there are <strong>1250
-training observations</strong>. This number is calculated as the total
-number of images used as input divided by the batch size (40000/32). An
-epoch will conclude and move to the next epoch after a training pass
-over all observations.</p></li>
-<li><p><code>loss</code> and <code>val_loss</code> can be considered as
-related. Where <code>loss</code> is a value the model will attempt to
-minimise, and is the distance between the true label of an image and the
-models prediction. Minimising this distance is where <em>learning</em>
-occurs to adjust weights and bias which reduce <code>loss</code>. On the
-other hand <code>val_loss</code> is a value calculated against the
-validation data and is a measurement of the models performance against
-<strong>unseen data</strong>. Both values are a summation of errors made
-for each example when fitting to the training or validation
-sets.</p></li>
-<li><p><code>accuracy</code> and <code>val_accuracy</code> can also be
-considered as related. Unlike <code>loss</code> and
-<code>val_loss</code>, these values are a percentage and are only
-revelant to <strong>classification problems</strong>. The
-<code>val_accuracy</code> score can be used to communicate a percentage
-value of model effectiveness on unseen data.</p></li>
+<em>training data</em>.</p></li>
+<li><p>In the output above, there are <strong>1250</strong> batches
+(steps) to complete each epoch. This number is calculated as the total
+number of images used as input divided by the batch size (40000/32).
+After 1250 batches, all training images will have been seen once and the
+model moves on to the next epoch.</p></li>
+<li><p><code>loss</code> is a value the model will attempt to minimise
+and is a measure of the dissimilarity or error between the true label of
+an image and the model prediction. Minimising this error is where
+<em>learning</em> occurs: weights and biases are adjusted to reduce
+<code>loss</code>.</p></li>
+<li><p><code>val_loss</code> is a value calculated against the
+validation data and is a measure of the model’s performance against
+unseen data.</p></li>
+<li><p>Both values are a summation of errors made during each
+epoch.</p></li>
+<li><p><code>accuracy</code> and <code>val_accuracy</code> values are a
+percentage and are only relevant to <strong>classification
+problems</strong>.</p></li>
+<li><p>The <code>val_accuracy</code> score can be used to communicate a
+model’s effectiveness on unseen data.</p></li>
 </ul></div>
 </div>
 <div class="section level3">
 <h3 id="step-7--perform-a-predictionclassification">Step 7. Perform a Prediction/Classification<a class="anchor" aria-label="anchor" href="#step-7--perform-a-predictionclassification"></a></h3>
 <p>After training the network we can use it to perform predictions. This
-is the mode you would use the network in after you have fully trained it
-to a satisfactory performance. Doing predictions on a special hold-out
-set is used in the next step to measure the performance of the
+is how you would use the network after you have fully trained it to a
+satisfactory performance. The predictions performed here on a special
+hold-out set are used in the next step to measure the performance of the
 network.</p>
 <div class="codewrapper sourceCode" id="cb9">
 <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
@@ -686,7 +679,7 @@ <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"><
 <span id="cb9-5"><a href="#cb9-5" aria-hidden="true" tabindex="-1"></a><span class="bu">print</span>(<span class="st">'The class with the highest predicted probability is: '</span>, class_names[result_intro.argmax()])</span>
 <span id="cb9-6"><a href="#cb9-6" aria-hidden="true" tabindex="-1"></a></span>
 <span id="cb9-7"><a href="#cb9-7" aria-hidden="true" tabindex="-1"></a><span class="co"># plot the image with its true label</span></span>
-<span id="cb9-8"><a href="#cb9-8" aria-hidden="true" tabindex="-1"></a>plt.imshow(test_images[<span class="dv">0</span>], cmap<span class="op">=</span>plt.cm.binary)</span>
+<span id="cb9-8"><a href="#cb9-8" aria-hidden="true" tabindex="-1"></a>plt.imshow(test_images[<span class="dv">0</span>])</span>
 <span id="cb9-9"><a href="#cb9-9" aria-hidden="true" tabindex="-1"></a>plt.title(<span class="st">'True class:'</span> <span class="op">+</span> class_names[test_labels[<span class="dv">0</span>,].argmax()])</span>
 <span id="cb9-10"><a href="#cb9-10" aria-hidden="true" tabindex="-1"></a>plt.show()</span></code></pre>
 </div>
@@ -696,7 +689,14 @@ <h3 class="code-label">OUTPUT<i aria-hidden="true" data-feather="chevron-left"><
 <pre class="output" tabindex="0"><code>The predicted probability of each class is:  [[0.0074 0.0006 0.0456 0.525  0.0036 0.1062 0.0162 0.0006 0.2908 0.004 ]]
 The class with the highest predicted probability is:  cat</code></pre>
 </div>
-<figure><img src="fig/01_test_image.png" alt="poor resolution image of a cat" class="figure mx-auto d-block"></figure><div id="callout2" class="callout callout">
+<figure><img src="fig/01_test_image.png" alt="poor resolution image of a cat" class="figure mx-auto d-block"></figure><p>Congratulations, you just created your first image classification
+model and used it to classify an image!</p>
+<p>Was the classification correct? Why might it be incorrect and what
+can we do about it?</p>
+<p>There are many ways to try to improve the accuracy of our model, such
+as adding or removing layers to the model definition and fine-tuning the
+hyperparameters, which takes us to the next steps in our workflow.</p>
+<div id="callout2" class="callout callout">
 <div class="callout-square">
 <i class="callout-icon" data-feather="bell"></i>
 </div>
@@ -705,11 +705,11 @@ <h3 class="callout-title">Callout<a class="anchor" aria-label="anchor" href="#ca
 </h3>
 <div class="callout-content">
 <p>My result is different!</p>
-<p>While the neural network itself is deterministic, various factors in
-the training process, system setup, and data variability can lead to
-small variations in the output. These variations are usually minor and
-should not significantly impact the overall performance or behavior of
-the model.</p>
+<p>While the neural network itself is deterministic (i.e. without
+randomness), various factors in the training process, system setup, and
+data variability can lead to small variations in the output. These
+variations are usually minor and should not significantly impact the
+overall performance or behavior of the model.</p>
 <p>If you are finding significant differences in the model predictions,
 this could be a sign the model is not fully converged. “Convergence”
 refers to the point where the model has reached an optimal or
@@ -717,13 +717,6 @@ <h3 class="callout-title">Callout<a class="anchor" aria-label="anchor" href="#ca
 </div>
 </div>
 </div>
-<p>Congratulations, you just created your first image classification
-model and used it to classify an image!</p>
-<p>Was the classification correct? Why might it be incorrect and what
-can we do about?</p>
-<p>There are many ways to try to improve the accuracy of our model, such
-as adding or removing layers to the model definition and fine-tuning the
-hyperparameters, which takes us to the next steps in our workflow.</p>
 </div>
 <div class="section level3">
 <h3 id="step-8--measure-performance">Step 8. Measure Performance<a class="anchor" aria-label="anchor" href="#step-8--measure-performance"></a></h3>
@@ -741,7 +734,7 @@ <h3 id="step-9--tune-hyperparameters">Step 9. Tune Hyperparameters<a class="anch
 designing a neural network but also choosing the best values for various
 hyperparameters that govern the training process.</p>
 <p><strong>Hyperparameters</strong> are all the parameters set by the
-person configuring the machine learning instead of those learned by the
+person configuring the model as opposed to those learned by the
 algorithm itself. These hyperparameters can include the learning rate,
 the number of layers in the network, the number of neurons per layer,
 and many more. Hyperparameter tuning refers to the process of
@@ -753,12 +746,12 @@ <h3 id="step-9--tune-hyperparameters">Step 9. Tune Hyperparameters<a class="anch
 <div class="section level3">
 <h3 id="step-10--share-model">Step 10. Share Model<a class="anchor" aria-label="anchor" href="#step-10--share-model"></a></h3>
 <p>Now that we have a trained network that performs at a level we are
-happy with we can go and use it on real data to perform a prediction. At
-this point we might want to consider publishing a file with both the
-architecture of our network and the weights which it has learned
-(assuming we did not use a pre-trained network). This will allow others
-to use it as as pre-trained network for their own purposes and for them
-to (mostly) reproduce our result.</p>
+happy with we can go and use it on real live data to perform a
+prediction. At this point we might want to consider publishing a file
+with both the architecture of our network and the weights which it has
+learned (assuming we did not use a pre-trained network). This will allow
+others to use it as a pre-trained network for their own purposes and
+for them to (mostly) reproduce our result.</p>
 <p>To share the model we must save it first:</p>
 <div class="codewrapper sourceCode" id="cb11">
 <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
@@ -853,7 +846,7 @@ <h3 class="callout-title">Key Points<a class="anchor" aria-label="anchor" href="
   "url": "https://.github.io/github.com/01-introduction.html",
   "identifier": "https://.github.io/github.com/01-introduction.html",
   "dateCreated": "2023-05-03",
-  "dateModified": "2024-02-14",
+  "dateModified": "2024-02-23",
   "datePublished": "2024-02-23"
 }
 
diff --git a/04-fit-cnn.html b/04-fit-cnn.html
index 2c8aa2fd..48a200b3 100644
--- a/04-fit-cnn.html
+++ b/04-fit-cnn.html
@@ -403,7 +403,7 @@ <h4 id="optimizer">Optimizer<a class="anchor" aria-label="anchor" href="#optimiz
   <h3 class="accordion-header" id="headingSpoiler1">
 <div class="note-square"><i aria-hidden="true" class="callout-icon" data-feather="eye"></i></div>WANT TO KNOW MORE: Learning Rate</h3>
 </button>
-<div id="collapseSpoiler1" class="accordion-collapse collapse" aria-labelledby="headingSpoiler1" data-bs-parent="#accordionSpoiler1">
+<div id="collapseSpoiler1" class="accordion-collapse collapse" data-bs-parent="#accordionSpoiler1" aria-labelledby="headingSpoiler1">
 <div class="accordion-body">
 <p>ChatGPT</p>
 <p><strong>Learning rate</strong> is a hyperparameter that determines
@@ -502,7 +502,7 @@ <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"><
   <h3 class="accordion-header" id="headingSpoiler2">
 <div class="note-square"><i aria-hidden="true" class="callout-icon" data-feather="eye"></i></div>WANT TO KNOW MORE: Batch size</h3>
 </button>
-<div id="collapseSpoiler2" class="accordion-collapse collapse" aria-labelledby="headingSpoiler2" data-bs-parent="#accordionSpoiler2">
+<div id="collapseSpoiler2" class="accordion-collapse collapse" data-bs-parent="#accordionSpoiler2" aria-labelledby="headingSpoiler2">
 <div class="accordion-body">
 <p>ChatGPT</p>
 <p>The choice of batch size can have various implications, and there are
@@ -588,7 +588,7 @@ <h3 class="callout-title">Inspect the Training Curve<a class="anchor" aria-label
 <button class="accordion-button solution-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSolution1" aria-expanded="false" aria-controls="collapseSolution1">
   <h4 class="accordion-header" id="headingSolution1">Show me the solution</h4>
 </button>
-<div id="collapseSolution1" class="accordion-collapse collapse" aria-labelledby="headingSolution1" data-bs-parent="#accordionSolution1">
+<div id="collapseSolution1" class="accordion-collapse collapse" data-bs-parent="#accordionSolution1" aria-labelledby="headingSolution1">
 <div class="accordion-body">
 <ol style="list-style-type: decimal"><li>The loss curve should drop quite quickly in a smooth line with
 little jitter. The accuracy should increase quite quickly in a smooth
@@ -630,7 +630,7 @@ <h4 class="accordion-header" id="headingSolution1">Show me the solution</h4>
   <h3 class="accordion-header" id="headingSpoiler3">
 <div class="note-square"><i aria-hidden="true" class="callout-icon" data-feather="eye"></i></div>WANT TO KNOW MORE: What is underfitting?</h3>
 </button>
-<div id="collapseSpoiler3" class="accordion-collapse collapse" aria-labelledby="headingSpoiler3" data-bs-parent="#accordionSpoiler3">
+<div id="collapseSpoiler3" class="accordion-collapse collapse" data-bs-parent="#accordionSpoiler3" aria-labelledby="headingSpoiler3">
 <div class="accordion-body">
 <p>Underfitting occurs when the model is too simple or lacks the
 capacity to capture the underlying patterns and relationships present in
@@ -804,7 +804,7 @@ <h3 class="callout-title">Does adding a Dropout Layer improve our
 <button class="accordion-button solution-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSolution2" aria-expanded="false" aria-controls="collapseSolution2">
   <h4 class="accordion-header" id="headingSolution2">Show me the solution</h4>
 </button>
-<div id="collapseSolution2" class="accordion-collapse collapse" aria-labelledby="headingSolution2" data-bs-parent="#accordionSolution2">
+<div id="collapseSolution2" class="accordion-collapse collapse" data-bs-parent="#accordionSolution2" aria-labelledby="headingSolution2">
 <div class="accordion-body">
 <div class="codewrapper sourceCode" id="cb9">
 <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
@@ -856,7 +856,7 @@ <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"><
   <h3 class="accordion-header" id="headingSpoiler4">
 <div class="note-square"><i aria-hidden="true" class="callout-icon" data-feather="eye"></i></div>WANT TO KNOW MORE: Regularization methods for Convolutional Neural Networks (CNNs)</h3>
 </button>
-<div id="collapseSpoiler4" class="accordion-collapse collapse" aria-labelledby="headingSpoiler4" data-bs-parent="#accordionSpoiler4">
+<div id="collapseSpoiler4" class="accordion-collapse collapse" data-bs-parent="#accordionSpoiler4" aria-labelledby="headingSpoiler4">
 <div class="accordion-body">
 <p>ChatGPT</p>
 <p><strong>Regularization</strong> methods introduce constraints or
diff --git a/aio.html b/aio.html
index f0d4ce38..21c73504 100644
--- a/aio.html
+++ b/aio.html
@@ -593,7 +593,7 @@ <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"><
  -->
 </section></section><section id="aio-01-introduction"><p>Content from <a href="01-introduction.html">Introduction to Deep Learning</a></p>
 <hr>
-<p>Last updated on 2024-02-14 |
+<p>Last updated on 2024-02-23 |
         
         <a href="https://https://github.com/erinmgraham/icwithcnn/edit/main/episodes/01-introduction.html" class="external-link">Edit this page <i aria-hidden="true" data-feather="edit"></i></a></p>
 <div class="text-end">
@@ -642,15 +642,8 @@ <h3 class="card-title">Objectives</h3>
 are many more.</p>
 <p>The techniques break down into two broad categories, predictors and
 classifiers. Predictors are used to predict a value (or set of values)
-given a set of inputs, for example trying to predict the cost of
-something given the economic conditions and the cost of raw materials or
-predicting a country’s GDP given its life expectancy. Classifiers try to
-classify data into different categories, or assign a label; for example,
-deciding what characters are visible in a picture of some writing or if
-an email or text message is spam or not.</p>
-</section><section id="training-data"><h2 class="section-heading">Training Data<a class="anchor" aria-label="anchor" href="#training-data"></a>
-</h2>
-<hr class="half-width">
+given a set of inputs, whereas classifiers try to classify data into
+different categories or assign a label.</p>
 <p>Many, but not all, machine learning systems “learn” by taking a
 series of input data and output data and using it to form a model. The
 maths behind the machine learning doesn’t care what the data is as long
@@ -693,18 +686,18 @@ <h3 class="callout-title">Callout<a class="anchor" aria-label="anchor" href="#ca
 <div class="callout-content">
 <p>Concept: Differentiation between traditional Machine Learning models
 and Deep Learning models:</p>
-<p><strong>Traditional ML algorithms</strong> can only use one (possibly
-two layers) of data transformation to calculate an output (shallow
-models). With high dimensional data and growing feature space (possible
-set of values for any given feature), shallow models quickly run out of
-layers to calculate outputs.</p>
-<p><strong>Deep neural networks</strong> (constructed with multiple
-layers of neurons) are the extension of shallow models with three
-layers: input, hidden, and outputs layers. The hidden layer is where
-learning takes place. As a result, deep learning is best applied to
-large datasets for training and prediction. As observations and feature
-inputs decrease, shallow ML approaches begin to perform noticeably
-better.</p>
+<p><strong>Traditional ML algorithms</strong>, known as shallow models,
+are limited to just one or maybe two layers of data transformation to
+generate an output. When dealing with complex data featuring high
+dimensions and growing feature space (i.e. many attributes and an
+expanding set of potential values for each feature), these shallow
+models become limited in their ability to compute accurate outputs.</p>
+<p><strong>Deep neural networks</strong> are the extension of shallow
+models with three layers: input, hidden, and output layers. The hidden
+layer(s) are where learning takes place. As a result, deep learning is
+best applied to large datasets for training and prediction. As
+observations and feature inputs decrease, shallow ML approaches begin to
+perform noticeably better.</p>
 </div>
 </div>
 </div>
@@ -737,10 +730,11 @@ <h3 class="callout-title">Callout<a class="anchor" aria-label="anchor" href="#ca
 <strong>Security and Surveillance</strong>: Detecting anomalies or
 unauthorised objects in security footage.</li>
 </ul>
-<p>Convolutional Neural Networks (CNNs) have become a cornerstone in
-image classification due to their ability to automatically learn
-hierarchical features from images and achieve remarkable performance on
-a wide range of tasks.</p>
+<p>A Convolutional Neural Network (CNN) is a Deep Learning algorithm
+that has become a cornerstone in image classification due to its ability
+to automatically learn features from images in a hierarchical fashion
+(i.e. each layer builds upon what was learned by the previous layer). It
+can achieve remarkable performance on a wide range of tasks.</p>
 </section><section id="deep-learning-workflow"><h2 class="section-heading">Deep Learning Workflow<a class="anchor" aria-label="anchor" href="#deep-learning-workflow"></a>
 </h2>
 <hr class="half-width">
@@ -773,12 +767,12 @@ <h3 id="step-3--prepare-data">Step 3. Prepare data<a class="anchor" aria-label="
 the data structure will be explored in <a href="02-image-data">Episode
 02 Introduction to Image Data</a>.</p>
 <p>For this lesson, we will use an existing image dataset known as
-CIFAR-10. We will introduce this dataset and the different data
-preparation tasks in more detail in the next episode but for this
-introduction, we want to divide the data into <strong>training</strong>,
-<strong>validation</strong>, and <strong>test</strong> subsets;
-normalise the image pixel values to be between 0 and 1; and one-hot
-encode our image labels.</p>
+CIFAR-10 (Canadian Institute for Advanced Research). We will introduce
+this dataset and the different data preparation tasks in more detail in
+the next episode but for this introduction, we want to divide the data
+into <strong>training</strong>, <strong>validation</strong>, and
+<strong>test</strong> subsets; normalise the image pixel values to be
+between 0 and 1; and one-hot encode our image labels.</p>
 <div class="section level4">
 <h4 id="preparing-the-code">Preparing the code<a class="anchor" aria-label="anchor" href="#preparing-the-code"></a>
 </h4>
@@ -791,28 +785,27 @@ <h4 id="preparing-the-code">Preparing the code<a class="anchor" aria-label="anch
 <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
 </h3>
 <pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="co"># load the required packages</span></span>
-<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> tensorflow <span class="im">import</span> keras <span class="co"># library for neural networks </span></span>
-<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> sklearn.model_selection <span class="im">import</span> train_test_split <span class="co"># library for splitting data into sets</span></span>
-<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> matplotlib.pyplot <span class="im">as</span> plt <span class="co"># library for plotting</span></span>
-<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> numpy <span class="im">as</span> np <span class="co"># library for working with images as arrays</span></span>
-<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a><span class="co"># load the CIFAR-10 dataset included with the keras library</span></span>
-<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>(train_images, train_labels), (test_images, test_labels) <span class="op">=</span> keras.datasets.cifar10.load_data()</span>
-<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a><span class="co"># normalise the RGB values to be between 0 and 1</span></span>
-<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a>train_images <span class="op">=</span> train_images <span class="op">/</span> <span class="fl">255.0</span></span>
-<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a>test_images <span class="op">=</span> test_images <span class="op">/</span> <span class="fl">255.0</span></span>
-<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a><span class="co"># create a list of class names</span></span>
-<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a>class_names <span class="op">=</span> [<span class="st">'airplane'</span>, <span class="st">'automobile'</span>, <span class="st">'bird'</span>, <span class="st">'cat'</span>, <span class="st">'deer'</span>, <span class="st">'dog'</span>, <span class="st">'frog'</span>, <span class="st">'horse'</span>, <span class="st">'ship'</span>, <span class="st">'truck'</span>]</span>
-<span id="cb1-16"><a href="#cb1-16" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb1-17"><a href="#cb1-17" aria-hidden="true" tabindex="-1"></a><span class="co"># one-hot encode labels</span></span>
-<span id="cb1-18"><a href="#cb1-18" aria-hidden="true" tabindex="-1"></a>train_labels <span class="op">=</span> keras.utils.to_categorical(train_labels, <span class="bu">len</span>(class_names))</span>
-<span id="cb1-19"><a href="#cb1-19" aria-hidden="true" tabindex="-1"></a>val_labels <span class="op">=</span> keras.utils.to_categorical(val_labels, <span class="bu">len</span>(class_names))</span>
-<span id="cb1-20"><a href="#cb1-20" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb1-21"><a href="#cb1-21" aria-hidden="true" tabindex="-1"></a><span class="co"># split the training data into training and validation sets</span></span>
-<span id="cb1-22"><a href="#cb1-22" aria-hidden="true" tabindex="-1"></a><span class="co"># </span><span class="al">NOTE</span><span class="co"> the function is train_test_split() but we are using it to split train into train and validation</span></span>
-<span id="cb1-23"><a href="#cb1-23" aria-hidden="true" tabindex="-1"></a>train_images, val_images, train_labels, val_labels <span class="op">=</span> train_test_split(train_images, train_labels, test_size<span class="op">=</span><span class="fl">0.2</span>, random_state<span class="op">=</span><span class="dv">42</span>)</span></code></pre>
+<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> tensorflow <span class="im">import</span> keras <span class="co"># for neural networks </span></span>
+<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> sklearn.model_selection <span class="im">import</span> train_test_split <span class="co"># for splitting data into sets</span></span>
+<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> matplotlib.pyplot <span class="im">as</span> plt <span class="co"># for plotting</span></span>
+<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a><span class="co"># load the CIFAR-10 dataset included with keras</span></span>
+<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>(train_images, train_labels), (test_images, test_labels) <span class="op">=</span> keras.datasets.cifar10.load_data()</span>
+<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a><span class="co"># normalise the RGB values to be between 0 and 1</span></span>
+<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a>train_images <span class="op">=</span> train_images <span class="op">/</span> <span class="fl">255.0</span></span>
+<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a>test_images <span class="op">=</span> test_images <span class="op">/</span> <span class="fl">255.0</span></span>
+<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a><span class="co"># create a list of class names</span></span>
+<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a>class_names <span class="op">=</span> [<span class="st">'airplane'</span>, <span class="st">'automobile'</span>, <span class="st">'bird'</span>, <span class="st">'cat'</span>, <span class="st">'deer'</span>, <span class="st">'dog'</span>, <span class="st">'frog'</span>, <span class="st">'horse'</span>, <span class="st">'ship'</span>, <span class="st">'truck'</span>]</span>
+<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-16"><a href="#cb1-16" aria-hidden="true" tabindex="-1"></a><span class="co"># split the training data into training and validation sets</span></span>
+<span id="cb1-17"><a href="#cb1-17" aria-hidden="true" tabindex="-1"></a><span class="co"># </span><span class="al">NOTE</span><span class="co"> the function is train_test_split() but we are using it to split train into train and validation</span></span>
+<span id="cb1-18"><a href="#cb1-18" aria-hidden="true" tabindex="-1"></a>train_images, val_images, train_labels, val_labels <span class="op">=</span> train_test_split(train_images, train_labels, test_size<span class="op">=</span><span class="fl">0.2</span>, random_state<span class="op">=</span><span class="dv">42</span>)</span>
+<span id="cb1-19"><a href="#cb1-19" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-20"><a href="#cb1-20" aria-hidden="true" tabindex="-1"></a><span class="co"># one-hot encode labels (after the split, so val_labels exists)</span></span>
+<span id="cb1-21"><a href="#cb1-21" aria-hidden="true" tabindex="-1"></a>train_labels <span class="op">=</span> keras.utils.to_categorical(train_labels, <span class="bu">len</span>(class_names))</span>
+<span id="cb1-22"><a href="#cb1-22" aria-hidden="true" tabindex="-1"></a>val_labels <span class="op">=</span> keras.utils.to_categorical(val_labels, <span class="bu">len</span>(class_names))</span></code></pre>
 </div>
 <div id="challenge-examine-the-cifar-10-dataset" class="callout challenge">
 <div class="callout-square">
@@ -868,7 +861,7 @@ <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"><
 <span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a><span class="co"># plot a subset of the images </span></span>
 <span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(<span class="dv">25</span>):</span>
 <span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a>    plt.subplot(<span class="dv">5</span>,<span class="dv">5</span>,i<span class="op">+</span><span class="dv">1</span>)</span>
-<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>    plt.imshow(train_images[i], cmap<span class="op">=</span>plt.cm.binary)</span>
+<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>    plt.imshow(train_images[i])</span>
 <span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a>    plt.axis(<span class="st">'off'</span>)</span>
 <span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a>    plt.title(class_names[train_labels[i,].argmax()])</span>
 <span id="cb4-10"><a href="#cb4-10" aria-hidden="true" tabindex="-1"></a>plt.show()</span></code></pre>
@@ -978,27 +971,27 @@ <h4 id="what-does-this-output-mean">What does this output mean?<a class="anchor"
 against known image labels, can be broken down as follows:</p>
 <ul>
 <li><p><code>Epoch</code> describes the number of full passes over all
-<em>training data</em>. In the output above there are <strong>1250
-training observations</strong>. This number is calculated as the total
-number of images used as input divided by the batch size (40000/32). An
-epoch will conclude and move to the next epoch after a training pass
-over all observations.</p></li>
-<li><p><code>loss</code> and <code>val_loss</code> can be considered as
-related. Where <code>loss</code> is a value the model will attempt to
-minimise, and is the distance between the true label of an image and the
-models prediction. Minimising this distance is where <em>learning</em>
-occurs to adjust weights and bias which reduce <code>loss</code>. On the
-other hand <code>val_loss</code> is a value calculated against the
-validation data and is a measurement of the models performance against
-<strong>unseen data</strong>. Both values are a summation of errors made
-for each example when fitting to the training or validation
-sets.</p></li>
-<li><p><code>accuracy</code> and <code>val_accuracy</code> can also be
-considered as related. Unlike <code>loss</code> and
-<code>val_loss</code>, these values are a percentage and are only
-revelant to <strong>classification problems</strong>. The
-<code>val_accuracy</code> score can be used to communicate a percentage
-value of model effectiveness on unseen data.</p></li>
+<em>training data</em>.</p></li>
+<li><p>In the output above, there are <strong>1250</strong> batches
+(steps) to complete each epoch. This number is calculated as the total
+number of images used as input divided by the batch size (40000/32).
+After 1250 batches, all training images will have been seen once and the
+model moves on to the next epoch.</p></li>
+<li><p><code>loss</code> is a value the model will attempt to minimise
+and is a measure of the dissimilarity or error between the true label of
+an image and the model prediction. Minimising this error is where
+<em>learning</em> occurs: weights and biases are adjusted to reduce
+<code>loss</code>.</p></li>
+<li><p><code>val_loss</code> is a value calculated against the
+validation data and is a measure of the model’s performance against
+unseen data.</p></li>
+<li><p>Both values are a summation of errors made during each
+epoch.</p></li>
+<li><p><code>accuracy</code> and <code>val_accuracy</code> values are a
+percentage and are only relevant to <strong>classification
+problems</strong>.</p></li>
+<li><p>The <code>val_accuracy</code> score can be used to communicate a
+model’s effectiveness on unseen data.</p></li>
 </ul>
 </div>
 </div>
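The figure of 1250 batches per epoch in the hunk above follows directly from the numbers the text gives (40000 training images, batch size 32). A minimal sketch of that arithmetic is shown here; the helper name `steps_per_epoch` is hypothetical, and rounding up is an assumption that only matters when the image count is not an exact multiple of the batch size, in which case the last batch is smaller.

```python
import math


def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to see every training image once."""
    return math.ceil(num_images / batch_size)


# values taken from the lesson text: 40000 training images, batch size 32
steps = steps_per_epoch(40000, 32)
print(steps)
```

With a different batch size the step count changes accordingly, for example 64 images per batch would halve the number of steps.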
@@ -1006,9 +999,9 @@ <h4 id="what-does-this-output-mean">What does this output mean?<a class="anchor"
 <h3 id="step-7--perform-a-predictionclassification">Step 7. Perform a Prediction/Classification<a class="anchor" aria-label="anchor" href="#step-7--perform-a-predictionclassification"></a>
 </h3>
 <p>After training the network we can use it to perform predictions. This
-is the mode you would use the network in after you have fully trained it
-to a satisfactory performance. Doing predictions on a special hold-out
-set is used in the next step to measure the performance of the
+is how you would use the network after you have fully trained it to a
+satisfactory performance. The predictions performed here on a special
+hold-out set are used in the next step to measure the performance of the
 network.</p>
 <div class="codewrapper sourceCode" id="cb9">
 <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
@@ -1020,7 +1013,7 @@ <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"><
 <span id="cb9-5"><a href="#cb9-5" aria-hidden="true" tabindex="-1"></a><span class="bu">print</span>(<span class="st">'The class with the highest predicted probability is: '</span>, class_names[result_intro.argmax()])</span>
 <span id="cb9-6"><a href="#cb9-6" aria-hidden="true" tabindex="-1"></a></span>
 <span id="cb9-7"><a href="#cb9-7" aria-hidden="true" tabindex="-1"></a><span class="co"># plot the image with its true label</span></span>
-<span id="cb9-8"><a href="#cb9-8" aria-hidden="true" tabindex="-1"></a>plt.imshow(test_images[<span class="dv">0</span>], cmap<span class="op">=</span>plt.cm.binary)</span>
+<span id="cb9-8"><a href="#cb9-8" aria-hidden="true" tabindex="-1"></a>plt.imshow(test_images[<span class="dv">0</span>])</span>
 <span id="cb9-9"><a href="#cb9-9" aria-hidden="true" tabindex="-1"></a>plt.title(<span class="st">'True class:'</span> <span class="op">+</span> class_names[test_labels[<span class="dv">0</span>,].argmax()])</span>
 <span id="cb9-10"><a href="#cb9-10" aria-hidden="true" tabindex="-1"></a>plt.show()</span></code></pre>
 </div>
@@ -1030,7 +1023,14 @@ <h3 class="code-label">OUTPUT<i aria-hidden="true" data-feather="chevron-left"><
 <pre class="output" tabindex="0"><code>The predicted probability of each class is:  [[0.0074 0.0006 0.0456 0.525  0.0036 0.1062 0.0162 0.0006 0.2908 0.004 ]]
 The class with the highest predicted probability is:  cat</code></pre>
 </div>
-<figure><img src="fig/01_test_image.png" alt="poor resolution image of a cat" class="figure mx-auto d-block"></figure><div id="callout2" class="callout callout">
+<figure><img src="fig/01_test_image.png" alt="poor resolution image of a cat" class="figure mx-auto d-block"></figure><p>Congratulations, you just created your first image classification
+model and used it to classify an image!</p>
+<p>Was the classification correct? Why might it be incorrect and what
+can we do about it?</p>
+<p>There are many ways to try to improve the accuracy of our model, such
+as adding or removing layers to the model definition and fine-tuning the
+hyperparameters, which takes us to the next steps in our workflow.</p>
+<div id="callout2" class="callout callout">
 <div class="callout-square">
 <i class="callout-icon" data-feather="bell"></i>
 </div>
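The prediction output in the hunk above lists one probability per class and then reports the class with the highest value. As a rough sketch of that last step, the following reproduces it in plain Python using the probabilities printed in the lesson output; the variable names are illustrative, and the lesson code itself uses NumPy's `argmax()` on the model result.

```python
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

# probabilities copied from the OUTPUT block in the lesson
probabilities = [0.0074, 0.0006, 0.0456, 0.525, 0.0036,
                 0.1062, 0.0162, 0.0006, 0.2908, 0.004]

# index of the largest probability, equivalent to argmax on a 1-D array
best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
predicted_class = class_names[best_index]
print(predicted_class)
```

Looking at the full list rather than only the winner is often informative: here the second-largest probability belongs to `ship`, which hints at how confident the model is in its answer.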
@@ -1039,11 +1039,11 @@ <h3 class="callout-title">Callout<a class="anchor" aria-label="anchor" href="#ca
 </h3>
 <div class="callout-content">
 <p>My result is different!</p>
-<p>While the neural network itself is deterministic, various factors in
-the training process, system setup, and data variability can lead to
-small variations in the output. These variations are usually minor and
-should not significantly impact the overall performance or behavior of
-the model.</p>
+<p>While the neural network itself is deterministic (i.e. without
+randomness), various factors in the training process, system setup, and
+data variability can lead to small variations in the output. These
+variations are usually minor and should not significantly impact the
+overall performance or behavior of the model.</p>
 <p>If you are finding significant differences in the model predictions,
 this could be a sign the model is not fully converged. “Convergence”
 refers to the point where the model has reached an optimal or
@@ -1051,13 +1051,6 @@ <h3 class="callout-title">Callout<a class="anchor" aria-label="anchor" href="#ca
 </div>
 </div>
 </div>
-<p>Congratulations, you just created your first image classification
-model and used it to classify an image!</p>
-<p>Was the classification correct? Why might it be incorrect and what
-can we do about?</p>
-<p>There are many ways to try to improve the accuracy of our model, such
-as adding or removing layers to the model definition and fine-tuning the
-hyperparameters, which takes us to the next steps in our workflow.</p>
 </div>
 <div class="section level3">
 <h3 id="step-8--measure-performance">Step 8. Measure Performance<a class="anchor" aria-label="anchor" href="#step-8--measure-performance"></a>
@@ -1077,7 +1070,7 @@ <h3 id="step-9--tune-hyperparameters">Step 9. Tune Hyperparameters<a class="anch
 designing a neural network but also choosing the best values for various
 hyperparameters that govern the training process.</p>
 <p><strong>Hyperparameters</strong> are all the parameters set by the
-person configuring the machine learning instead of those learned by the
+person configuring the model as opposed to those learned by the
 algorithm itself. These hyperparameters can include the learning rate,
 the number of layers in the network, the number of neurons per layer,
 and many more. Hyperparameter tuning refers to the process of
@@ -1090,12 +1083,12 @@ <h3 id="step-9--tune-hyperparameters">Step 9. Tune Hyperparameters<a class="anch
 <h3 id="step-10--share-model">Step 10. Share Model<a class="anchor" aria-label="anchor" href="#step-10--share-model"></a>
 </h3>
 <p>Now that we have a trained network that performs at a level we are
-happy with we can go and use it on real data to perform a prediction. At
-this point we might want to consider publishing a file with both the
-architecture of our network and the weights which it has learned
-(assuming we did not use a pre-trained network). This will allow others
-to use it as as pre-trained network for their own purposes and for them
-to (mostly) reproduce our result.</p>
+happy with we can go and use it on real live data to perform a
+prediction. At this point we might want to consider publishing a file
+with both the architecture of our network and the weights which it has
+learned (assuming we did not use a pre-trained network). This will allow
+others to use it as a pre-trained network for their own purposes and
+for them to (mostly) reproduce our result.</p>
 <p>To share the model we must save it first:</p>
 <div class="codewrapper sourceCode" id="cb11">
 <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
@@ -2712,7 +2705,7 @@ <h4 id="optimizer">Optimizer<a class="anchor" aria-label="anchor" href="#optimiz
   <h3 class="accordion-header" id="headingSpoiler1">
 <div class="note-square"><i aria-hidden="true" class="callout-icon" data-feather="eye"></i></div>WANT TO KNOW MORE: Learning Rate</h3>
 </button>
-<div id="collapseSpoiler1" class="accordion-collapse collapse" aria-labelledby="headingSpoiler1" data-bs-parent="#accordionSpoiler1">
+<div id="collapseSpoiler1" class="accordion-collapse collapse" data-bs-parent="#accordionSpoiler1" aria-labelledby="headingSpoiler1">
 <div class="accordion-body">
 <p>ChatGPT</p>
 <p><strong>Learning rate</strong> is a hyperparameter that determines
@@ -2813,7 +2806,7 @@ <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"><
   <h3 class="accordion-header" id="headingSpoiler2">
 <div class="note-square"><i aria-hidden="true" class="callout-icon" data-feather="eye"></i></div>WANT TO KNOW MORE: Batch size</h3>
 </button>
-<div id="collapseSpoiler2" class="accordion-collapse collapse" aria-labelledby="headingSpoiler2" data-bs-parent="#accordionSpoiler2">
+<div id="collapseSpoiler2" class="accordion-collapse collapse" data-bs-parent="#accordionSpoiler2" aria-labelledby="headingSpoiler2">
 <div class="accordion-body">
 <p>ChatGPT</p>
 <p>The choice of batch size can have various implications, and there are
@@ -2906,7 +2899,7 @@ <h3 class="callout-title">Inspect the Training Curve<a class="anchor" aria-label
 <button class="accordion-button solution-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSolution1" aria-expanded="false" aria-controls="collapseSolution1">
   <h4 class="accordion-header" id="headingSolution1">Show me the solution</h4>
 </button>
-<div id="collapseSolution1" class="accordion-collapse collapse" aria-labelledby="headingSolution1" data-bs-parent="#accordionSolution1">
+<div id="collapseSolution1" class="accordion-collapse collapse" data-bs-parent="#accordionSolution1" aria-labelledby="headingSolution1">
 <div class="accordion-body">
 <ol style="list-style-type: decimal">
 <li>The loss curve should drop quite quickly in a smooth line with
@@ -2954,7 +2947,7 @@ <h4 class="accordion-header" id="headingSolution1">Show me the solution</h4>
   <h3 class="accordion-header" id="headingSpoiler3">
 <div class="note-square"><i aria-hidden="true" class="callout-icon" data-feather="eye"></i></div>WANT TO KNOW MORE: What is underfitting?</h3>
 </button>
-<div id="collapseSpoiler3" class="accordion-collapse collapse" aria-labelledby="headingSpoiler3" data-bs-parent="#accordionSpoiler3">
+<div id="collapseSpoiler3" class="accordion-collapse collapse" data-bs-parent="#accordionSpoiler3" aria-labelledby="headingSpoiler3">
 <div class="accordion-body">
 <p>Underfitting occurs when the model is too simple or lacks the
 capacity to capture the underlying patterns and relationships present in
@@ -3134,7 +3127,7 @@ <h3 class="callout-title">Does adding a Dropout Layer improve our
 <button class="accordion-button solution-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSolution2" aria-expanded="false" aria-controls="collapseSolution2">
   <h4 class="accordion-header" id="headingSolution2">Show me the solution</h4>
 </button>
-<div id="collapseSolution2" class="accordion-collapse collapse" aria-labelledby="headingSolution2" data-bs-parent="#accordionSolution2">
+<div id="collapseSolution2" class="accordion-collapse collapse" data-bs-parent="#accordionSolution2" aria-labelledby="headingSolution2">
 <div class="accordion-body">
 <div class="codewrapper sourceCode" id="cb9">
 <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
@@ -3186,7 +3179,7 @@ <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"><
   <h3 class="accordion-header" id="headingSpoiler4">
 <div class="note-square"><i aria-hidden="true" class="callout-icon" data-feather="eye"></i></div>WANT TO KNOW MORE: Regularization methods for Convolutional Neural Networks (CNNs)</h3>
 </button>
-<div id="collapseSpoiler4" class="accordion-collapse collapse" aria-labelledby="headingSpoiler4" data-bs-parent="#accordionSpoiler4">
+<div id="collapseSpoiler4" class="accordion-collapse collapse" data-bs-parent="#accordionSpoiler4" aria-labelledby="headingSpoiler4">
 <div class="accordion-body">
 <p>ChatGPT</p>
 <p><strong>Regularization</strong> methods introduce constraints or
diff --git a/index.html b/index.html
index 4db3946f..62b70a7a 100644
--- a/index.html
+++ b/index.html
@@ -337,7 +337,7 @@ <h3 class="callout-title">Install Python Using Anaconda<a class="anchor" aria-la
 <button class="accordion-button solution-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSolution1" aria-expanded="false" aria-controls="collapseSolution1">
   <h4 class="accordion-header" id="headingSolution1">Windows</h4>
 </button>
-<div id="collapseSolution1" class="accordion-collapse collapse" data-bs-parent="#accordionSolution1" aria-labelledby="headingSolution1">
+<div id="collapseSolution1" class="accordion-collapse collapse" aria-labelledby="headingSolution1" data-bs-parent="#accordionSolution1">
 <div class="accordion-body">
 <p>Check out the <a href="https://www.youtube.com/watch?v=xxQ0mzZ8UvA" class="external-link">Windows - Video
 tutorial</a> or:</p>
@@ -356,7 +356,7 @@ <h4 class="accordion-header" id="headingSolution1">Windows</h4>
 <button class="accordion-button solution-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSolution2" aria-expanded="false" aria-controls="collapseSolution2">
   <h4 class="accordion-header" id="headingSolution2">MacOS</h4>
 </button>
-<div id="collapseSolution2" class="accordion-collapse collapse" data-bs-parent="#accordionSolution2" aria-labelledby="headingSolution2">
+<div id="collapseSolution2" class="accordion-collapse collapse" aria-labelledby="headingSolution2" data-bs-parent="#accordionSolution2">
 <div class="accordion-body">
 <p>Check out the <a href="https://www.youtube.com/watch?v=TcSAln46u9U" class="external-link">Mac OS X - Video
 tutorial</a> or:</p>
@@ -374,7 +374,7 @@ <h4 class="accordion-header" id="headingSolution2">MacOS</h4>
 <button class="accordion-button solution-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSolution3" aria-expanded="false" aria-controls="collapseSolution3">
   <h4 class="accordion-header" id="headingSolution3">Linux</h4>
 </button>
-<div id="collapseSolution3" class="accordion-collapse collapse" data-bs-parent="#accordionSolution3" aria-labelledby="headingSolution3">
+<div id="collapseSolution3" class="accordion-collapse collapse" aria-labelledby="headingSolution3" data-bs-parent="#accordionSolution3">
 <div class="accordion-body">
 <p>Note the following installation steps require you to work from the
 shell. If you run into any difficulties, please request help before the
diff --git a/instructor/01-introduction.html b/instructor/01-introduction.html
index 80cd775c..9ce52329 100644
--- a/instructor/01-introduction.html
+++ b/instructor/01-introduction.html
@@ -167,7 +167,6 @@ <h2 class="accordion-header chapters" id="flush-headingEleven">
     <div id="flush-collapsecurrent" class="accordion-collapse collapse show" aria-labelledby="flush-headingcurrent" data-bs-parent="#accordionFlushcurrent">
       <div class="accordion-body">
         <ul><li><a href="#what-is-machine-learning">What is machine learning?</a></li>
-<li><a href="#training-data">Training Data</a></li>
 <li><a href="#deep-learning-machine-learning-and-artificial-intelligence">Deep Learning, Machine Learning and Artificial Intelligence</a></li>
 <li><a href="#what-is-image-classification">What is image classification?</a></li>
 <li><a href="#deep-learning-workflow">Deep Learning Workflow</a></li>
@@ -280,7 +279,7 @@ <h2 class="accordion-header" id="flush-headingTwelve">
       </div>
       <hr></nav><main id="main-content" class="main-content"><div class="container lesson-content">
         <h1>Introduction to Deep Learning</h1>
-        <p>Last updated on 2024-02-14 |
+        <p>Last updated on 2024-02-23 |
         
         <a href="https://https://github.com/erinmgraham/icwithcnn/edit/main/episodes/01-introduction.html" class="external-link">Edit this page <i aria-hidden="true" data-feather="edit"></i></a></p>
         
@@ -332,15 +331,9 @@ <h3 class="card-title">Objectives</h3>
 are many more.</p>
 <p>The techniques break down into two broad categories, predictors and
 classifiers. Predictors are used to predict a value (or set of values)
-given a set of inputs, for example trying to predict the cost of
-something given the economic conditions and the cost of raw materials or
-predicting a country’s GDP given its life expectancy. Classifiers try to
-classify data into different categories, or assign a label; for example,
-deciding what characters are visible in a picture of some writing or if
-an email or text message is spam or not.</p>
-</section><section id="training-data"><h2 class="section-heading">Training Data<a class="anchor" aria-label="anchor" href="#training-data"></a>
-</h2>
-<hr class="half-width"><p>Many, but not all, machine learning systems “learn” by taking a
+given a set of inputs whereas classifiers try to classify data into
+different categories, or assign a label.</p>
+<p>Many, but not all, machine learning systems “learn” by taking a
 series of input data and output data and using it to form a model. The
 maths behind the machine learning doesn’t care what the data is as long
 as it can be represented numerically or categorised. Some examples might
@@ -379,18 +372,18 @@ <h3 class="callout-title">Callout<a class="anchor" aria-label="anchor" href="#ca
 <div class="callout-content">
 <p>Concept: Differentiation between traditional Machine Learning models
 and Deep Learning models:</p>
-<p><strong>Traditional ML algorithms</strong> can only use one (possibly
-two layers) of data transformation to calculate an output (shallow
-models). With high dimensional data and growing feature space (possible
-set of values for any given feature), shallow models quickly run out of
-layers to calculate outputs.</p>
-<p><strong>Deep neural networks</strong> (constructed with multiple
-layers of neurons) are the extension of shallow models with three
-layers: input, hidden, and outputs layers. The hidden layer is where
-learning takes place. As a result, deep learning is best applied to
-large datasets for training and prediction. As observations and feature
-inputs decrease, shallow ML approaches begin to perform noticeably
-better.</p>
+<p><strong>Traditional ML algorithms</strong>, known as shallow models,
+are limited to just one or maybe two layers of data transformation to
+generate an output. When dealing with complex data featuring high
+dimensions and growing feature space (i.e. many attributes and an
+expanding set of potential values for each feature), these shallow
+models become limited in their ability to compute accurate outputs.</p>
+<p><strong>Deep neural networks</strong> are the extension of shallow
+models with three layers: input, hidden, and output layers. The hidden
+layer(s) is where learning takes place. As a result, deep learning is
+best applied to large datasets for training and prediction. As
+observations and feature inputs decrease, shallow ML approaches begin to
+perform noticeably better.</p>
 </div>
 </div>
 </div>
@@ -420,10 +413,11 @@ <h3 class="callout-title">Callout<a class="anchor" aria-label="anchor" href="#ca
 <li>
 <strong>Security and Surveillance</strong>: Detecting anomalies or
 unauthorised objects in security footage.</li>
-</ul><p>Convolutional Neural Networks (CNNs) have become a cornerstone in
-image classification due to their ability to automatically learn
-hierarchical features from images and achieve remarkable performance on
-a wide range of tasks.</p>
+</ul><p>A Convolutional Neural Network (CNN) is a Deep Learning algorithm
+that has become a cornerstone in image classification due to its ability
+to automatically learn features from images in a hierarchical fashion
+(i.e. each layer builds upon what was learned by the previous layer). It
+can achieve remarkable performance on a wide range of tasks.</p>
 </section><section id="deep-learning-workflow"><h2 class="section-heading">Deep Learning Workflow<a class="anchor" aria-label="anchor" href="#deep-learning-workflow"></a>
 </h2>
 <hr class="half-width"><p>To apply Deep Learning to a problem there are several steps to go
@@ -452,12 +446,12 @@ <h3 id="step-3--prepare-data">Step 3. Prepare data<a class="anchor" aria-label="
 the data structure will be explored in <a href="02-image-data">Episode
 02 Introduction to Image Data</a>.</p>
 <p>For this lesson, we will use an existing image dataset known as
-CIFAR-10. We will introduce this dataset and the different data
-preparation tasks in more detail in the next episode but for this
-introduction, we want to divide the data into <strong>training</strong>,
-<strong>validation</strong>, and <strong>test</strong> subsets;
-normalise the image pixel values to be between 0 and 1; and one-hot
-encode our image labels.</p>
+CIFAR-10 (Canadian Institute for Advanced Research). We will introduce
+this dataset and the different data preparation tasks in more detail in
+the next episode but for this introduction, we want to divide the data
+into <strong>training</strong>, <strong>validation</strong>, and
+<strong>test</strong> subsets; normalise the image pixel values to be
+between 0 and 1; and one-hot encode our image labels.</p>
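<p>To make the last two preparation tasks concrete, here is a minimal sketch using NumPy only. The toy arrays and the names <code>pixels</code> and <code>labels</code> are illustrative assumptions and not part of the lesson code, which uses <code>keras.utils.to_categorical()</code> for the encoding step.</p>

```python
import numpy as np

# hypothetical toy data: three 8-bit pixel values and three integer class labels
pixels = np.array([0, 51, 255], dtype=np.uint8)
labels = np.array([3, 0, 9])
num_classes = 10  # CIFAR-10 has ten classes

# normalise: dividing by 255 maps 8-bit values into the range 0 to 1
scaled = pixels / 255.0

# one-hot encode: row i of the identity matrix is the encoding of class i
one_hot = np.eye(num_classes)[labels]

print(scaled)
print(one_hot[0])
```

<p>Each one-hot row contains a single 1 at the position of its class, so calling <code>argmax()</code> on a row recovers the original integer label.</p>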
 <div class="section level4">
 <h4 id="preparing-the-code">Preparing the code<a class="anchor" aria-label="anchor" href="#preparing-the-code"></a></h4>
 <p>It is the goal of this training workshop to produce a Deep Learning
@@ -469,28 +463,27 @@ <h4 id="preparing-the-code">Preparing the code<a class="anchor" aria-label="anch
 <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
 </h3>
 <pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="co"># load the required packages</span></span>
-<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> tensorflow <span class="im">import</span> keras <span class="co"># library for neural networks </span></span>
-<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> sklearn.model_selection <span class="im">import</span> train_test_split <span class="co"># library for splitting data into sets</span></span>
-<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> matplotlib.pyplot <span class="im">as</span> plt <span class="co"># library for plotting</span></span>
-<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> numpy <span class="im">as</span> np <span class="co"># library for working with images as arrays</span></span>
-<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a><span class="co"># load the CIFAR-10 dataset included with the keras library</span></span>
-<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>(train_images, train_labels), (test_images, test_labels) <span class="op">=</span> keras.datasets.cifar10.load_data()</span>
-<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a><span class="co"># normalise the RGB values to be between 0 and 1</span></span>
-<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a>train_images <span class="op">=</span> train_images <span class="op">/</span> <span class="fl">255.0</span></span>
-<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a>test_images <span class="op">=</span> test_images <span class="op">/</span> <span class="fl">255.0</span></span>
-<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a><span class="co"># create a list of class names</span></span>
-<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a>class_names <span class="op">=</span> [<span class="st">'airplane'</span>, <span class="st">'automobile'</span>, <span class="st">'bird'</span>, <span class="st">'cat'</span>, <span class="st">'deer'</span>, <span class="st">'dog'</span>, <span class="st">'frog'</span>, <span class="st">'horse'</span>, <span class="st">'ship'</span>, <span class="st">'truck'</span>]</span>
-<span id="cb1-16"><a href="#cb1-16" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb1-17"><a href="#cb1-17" aria-hidden="true" tabindex="-1"></a><span class="co"># one-hot encode labels</span></span>
-<span id="cb1-18"><a href="#cb1-18" aria-hidden="true" tabindex="-1"></a>train_labels <span class="op">=</span> keras.utils.to_categorical(train_labels, <span class="bu">len</span>(class_names))</span>
-<span id="cb1-19"><a href="#cb1-19" aria-hidden="true" tabindex="-1"></a>val_labels <span class="op">=</span> keras.utils.to_categorical(val_labels, <span class="bu">len</span>(class_names))</span>
-<span id="cb1-20"><a href="#cb1-20" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb1-21"><a href="#cb1-21" aria-hidden="true" tabindex="-1"></a><span class="co"># split the training data into training and validation sets</span></span>
-<span id="cb1-22"><a href="#cb1-22" aria-hidden="true" tabindex="-1"></a><span class="co"># </span><span class="al">NOTE</span><span class="co"> the function is train_test_split() but we are using it to split train into train and validation</span></span>
-<span id="cb1-23"><a href="#cb1-23" aria-hidden="true" tabindex="-1"></a>train_images, val_images, train_labels, val_labels <span class="op">=</span> train_test_split(train_images, train_labels, test_size<span class="op">=</span><span class="fl">0.2</span>, random_state<span class="op">=</span><span class="dv">42</span>)</span></code></pre>
+<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> tensorflow <span class="im">import</span> keras <span class="co"># for neural networks </span></span>
+<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> sklearn.model_selection <span class="im">import</span> train_test_split <span class="co"># for splitting data into sets</span></span>
+<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> matplotlib.pyplot <span class="im">as</span> plt <span class="co"># for plotting</span></span>
+<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a><span class="co"># load the CIFAR-10 dataset included with keras</span></span>
+<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>(train_images, train_labels), (test_images, test_labels) <span class="op">=</span> keras.datasets.cifar10.load_data()</span>
+<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a><span class="co"># normalise the RGB values to be between 0 and 1</span></span>
+<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a>train_images <span class="op">=</span> train_images <span class="op">/</span> <span class="fl">255.0</span></span>
+<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a>test_images <span class="op">=</span> test_images <span class="op">/</span> <span class="fl">255.0</span></span>
+<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a><span class="co"># create a list of class names</span></span>
+<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a>class_names <span class="op">=</span> [<span class="st">'airplane'</span>, <span class="st">'automobile'</span>, <span class="st">'bird'</span>, <span class="st">'cat'</span>, <span class="st">'deer'</span>, <span class="st">'dog'</span>, <span class="st">'frog'</span>, <span class="st">'horse'</span>, <span class="st">'ship'</span>, <span class="st">'truck'</span>]</span>
+<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-16"><a href="#cb1-16" aria-hidden="true" tabindex="-1"></a><span class="co"># one-hot encode labels</span></span>
+<span id="cb1-17"><a href="#cb1-17" aria-hidden="true" tabindex="-1"></a>train_labels <span class="op">=</span> keras.utils.to_categorical(train_labels, <span class="bu">len</span>(class_names))</span>
+<span id="cb1-18"><a href="#cb1-18" aria-hidden="true" tabindex="-1"></a>test_labels <span class="op">=</span> keras.utils.to_categorical(test_labels, <span class="bu">len</span>(class_names))</span>
+<span id="cb1-19"><a href="#cb1-19" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-20"><a href="#cb1-20" aria-hidden="true" tabindex="-1"></a><span class="co"># split the training data into training and validation sets</span></span>
+<span id="cb1-21"><a href="#cb1-21" aria-hidden="true" tabindex="-1"></a><span class="co"># </span><span class="al">NOTE</span><span class="co"> the function is train_test_split() but we are using it to split train into train and validation</span></span>
+<span id="cb1-22"><a href="#cb1-22" aria-hidden="true" tabindex="-1"></a>train_images, val_images, train_labels, val_labels <span class="op">=</span> train_test_split(train_images, train_labels, test_size<span class="op">=</span><span class="fl">0.2</span>, random_state<span class="op">=</span><span class="dv">42</span>)</span></code></pre>
 </div>
 <div id="challenge-examine-the-cifar-10-dataset" class="callout challenge">
 <div class="callout-square">
@@ -545,7 +538,7 @@ <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"><
 <span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a><span class="co"># plot a subset of the images </span></span>
 <span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(<span class="dv">25</span>):</span>
 <span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a>    plt.subplot(<span class="dv">5</span>,<span class="dv">5</span>,i<span class="op">+</span><span class="dv">1</span>)</span>
-<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>    plt.imshow(train_images[i], cmap<span class="op">=</span>plt.cm.binary)</span>
+<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>    plt.imshow(train_images[i])</span>
 <span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a>    plt.axis(<span class="st">'off'</span>)</span>
 <span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a>    plt.title(class_names[train_labels[i,].argmax()])</span>
 <span id="cb4-10"><a href="#cb4-10" aria-hidden="true" tabindex="-1"></a>plt.show()</span></code></pre>
@@ -648,35 +641,35 @@ <h4 id="what-does-this-output-mean">What does this output mean?<a class="anchor"
 <p>This output printed during the fit phase, i.e. training the model
 against known image labels, can be broken down as follows:</p>
 <ul><li><p><code>Epoch</code> describes the number of full passes over all
-<em>training data</em>. In the output above there are <strong>1250
-training observations</strong>. This number is calculated as the total
-number of images used as input divided by the batch size (40000/32). An
-epoch will conclude and move to the next epoch after a training pass
-over all observations.</p></li>
-<li><p><code>loss</code> and <code>val_loss</code> can be considered as
-related. Where <code>loss</code> is a value the model will attempt to
-minimise, and is the distance between the true label of an image and the
-models prediction. Minimising this distance is where <em>learning</em>
-occurs to adjust weights and bias which reduce <code>loss</code>. On the
-other hand <code>val_loss</code> is a value calculated against the
-validation data and is a measurement of the models performance against
-<strong>unseen data</strong>. Both values are a summation of errors made
-for each example when fitting to the training or validation
-sets.</p></li>
-<li><p><code>accuracy</code> and <code>val_accuracy</code> can also be
-considered as related. Unlike <code>loss</code> and
-<code>val_loss</code>, these values are a percentage and are only
-revelant to <strong>classification problems</strong>. The
-<code>val_accuracy</code> score can be used to communicate a percentage
-value of model effectiveness on unseen data.</p></li>
+<em>training data</em>.</p></li>
+<li><p>In the output above, there are <strong>1250</strong> batches
+(steps) to complete each epoch. This number is calculated as the total
+number of images used as input divided by the batch size (40000/32).
+After 1250 batches, all training images will have been seen once and the
+model moves on to the next epoch.</p></li>
+<li><p><code>loss</code> is a value the model will attempt to minimise
+and is a measure of the dissimilarity or error between the true label of
+an image and the model prediction. Minimising this distance is where
+<em>learning</em> occurs to adjust weights and bias which reduce
+<code>loss</code>.</p></li>
+<li><p><code>val_loss</code> is a value calculated against the
+validation data and is a measure of the model’s performance against
+unseen data.</p></li>
+<li><p>Both values are a summation of errors made during each
+epoch.</p></li>
+<li><p><code>accuracy</code> and <code>val_accuracy</code> values are a
+percentage and are only relevant to <strong>classification
+problems</strong>.</p></li>
+<li><p>The <code>val_accuracy</code> score can be used to communicate a
+model’s effectiveness on unseen data.</p></li>
 </ul></div>
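<p>The batch count above can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions: the variable names are illustrative, 50000 is the size of the CIFAR-10 training set before the 20% validation split, and 32 is the batch size used in the calculation above.</p>

```python
import math

total_train_images = 50000  # CIFAR-10 training images before the split
validation_percent = 20     # matches test_size=0.2 in train_test_split()
batch_size = 32             # batch size from the calculation above (40000/32)

# images left for training after holding out the validation set
n_validation = total_train_images * validation_percent // 100
n_train = total_train_images - n_validation

# a final partial batch still counts as a step, hence the ceiling
steps_per_epoch = math.ceil(n_train / batch_size)

print(n_train, steps_per_epoch)
```

<p>With these numbers the division is exact, so every batch in an epoch holds the full 32 images.</p>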
 </div>
 <div class="section level3">
 <h3 id="step-7--perform-a-predictionclassification">Step 7. Perform a Prediction/Classification<a class="anchor" aria-label="anchor" href="#step-7--perform-a-predictionclassification"></a></h3>
 <p>After training the network we can use it to perform predictions. This
-is the mode you would use the network in after you have fully trained it
-to a satisfactory performance. Doing predictions on a special hold-out
-set is used in the next step to measure the performance of the
+is how you would use the network after you have fully trained it to a
+satisfactory performance. The predictions performed here on a special
+hold-out set are used in the next step to measure the performance of the
 network.</p>
 <div class="codewrapper sourceCode" id="cb9">
 <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
@@ -688,7 +681,7 @@ <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"><
 <span id="cb9-5"><a href="#cb9-5" aria-hidden="true" tabindex="-1"></a><span class="bu">print</span>(<span class="st">'The class with the highest predicted probability is: '</span>, class_names[result_intro.argmax()])</span>
 <span id="cb9-6"><a href="#cb9-6" aria-hidden="true" tabindex="-1"></a></span>
 <span id="cb9-7"><a href="#cb9-7" aria-hidden="true" tabindex="-1"></a><span class="co"># plot the image with its true label</span></span>
-<span id="cb9-8"><a href="#cb9-8" aria-hidden="true" tabindex="-1"></a>plt.imshow(test_images[<span class="dv">0</span>], cmap<span class="op">=</span>plt.cm.binary)</span>
+<span id="cb9-8"><a href="#cb9-8" aria-hidden="true" tabindex="-1"></a>plt.imshow(test_images[<span class="dv">0</span>])</span>
 <span id="cb9-9"><a href="#cb9-9" aria-hidden="true" tabindex="-1"></a>plt.title(<span class="st">'True class:'</span> <span class="op">+</span> class_names[test_labels[<span class="dv">0</span>,].argmax()])</span>
 <span id="cb9-10"><a href="#cb9-10" aria-hidden="true" tabindex="-1"></a>plt.show()</span></code></pre>
 </div>
@@ -698,7 +691,14 @@ <h3 class="code-label">OUTPUT<i aria-hidden="true" data-feather="chevron-left"><
 <pre class="output" tabindex="0"><code>The predicted probability of each class is:  [[0.0074 0.0006 0.0456 0.525  0.0036 0.1062 0.0162 0.0006 0.2908 0.004 ]]
 The class with the highest predicted probability is:  cat</code></pre>
 </div>
-<figure><img src="../fig/01_test_image.png" alt="poor resolution image of a cat" class="figure mx-auto d-block"></figure><div id="callout2" class="callout callout">
+<figure><img src="../fig/01_test_image.png" alt="poor resolution image of a cat" class="figure mx-auto d-block"></figure><p>Congratulations, you just created your first image classification
+model and used it to classify an image!</p>
+<p>Was the classification correct? Why might it be incorrect and what
+can we do about it?</p>
+<p>There are many ways to try to improve the accuracy of our model, such
+as adding or removing layers to the model definition and fine-tuning the
+hyperparameters, which takes us to the next steps in our workflow.</p>
+<div id="callout2" class="callout callout">
 <div class="callout-square">
 <i class="callout-icon" data-feather="bell"></i>
 </div>
@@ -707,11 +707,11 @@ <h3 class="callout-title">Callout<a class="anchor" aria-label="anchor" href="#ca
 </h3>
 <div class="callout-content">
 <p>My result is different!</p>
-<p>While the neural network itself is deterministic, various factors in
-the training process, system setup, and data variability can lead to
-small variations in the output. These variations are usually minor and
-should not significantly impact the overall performance or behavior of
-the model.</p>
+<p>While the neural network itself is deterministic (i.e. without
+randomness), various factors in the training process, system setup, and
+data variability can lead to small variations in the output. These
+variations are usually minor and should not significantly impact the
+overall performance or behavior of the model.</p>
 <p>If you are finding significant differences in the model predictions,
 this could be a sign the model is not fully converged. “Convergence”
 refers to the point where the model has reached an optimal or
@@ -719,13 +719,6 @@ <h3 class="callout-title">Callout<a class="anchor" aria-label="anchor" href="#ca
 </div>
 </div>
 </div>
-<p>Congratulations, you just created your first image classification
-model and used it to classify an image!</p>
-<p>Was the classification correct? Why might it be incorrect and what
-can we do about?</p>
-<p>There are many ways to try to improve the accuracy of our model, such
-as adding or removing layers to the model definition and fine-tuning the
-hyperparameters, which takes us to the next steps in our workflow.</p>
 </div>
 <div class="section level3">
 <h3 id="step-8--measure-performance">Step 8. Measure Performance<a class="anchor" aria-label="anchor" href="#step-8--measure-performance"></a></h3>
@@ -743,7 +736,7 @@ <h3 id="step-9--tune-hyperparameters">Step 9. Tune Hyperparameters<a class="anch
 designing a neural network but also choosing the best values for various
 hyperparameters that govern the training process.</p>
 <p><strong>Hyperparameters</strong> are all the parameters set by the
-person configuring the machine learning instead of those learned by the
+person configuring the model as opposed to those learned by the
 algorithm itself. These hyperparameters can include the learning rate,
 the number of layers in the network, the number of neurons per layer,
 and many more. Hyperparameter tuning refers to the process of
@@ -755,12 +748,12 @@ <h3 id="step-9--tune-hyperparameters">Step 9. Tune Hyperparameters<a class="anch
 <div class="section level3">
 <h3 id="step-10--share-model">Step 10. Share Model<a class="anchor" aria-label="anchor" href="#step-10--share-model"></a></h3>
 <p>Now that we have a trained network that performs at a level we are
-happy with we can go and use it on real data to perform a prediction. At
-this point we might want to consider publishing a file with both the
-architecture of our network and the weights which it has learned
-(assuming we did not use a pre-trained network). This will allow others
-to use it as as pre-trained network for their own purposes and for them
-to (mostly) reproduce our result.</p>
+happy with we can go and use it on real live data to perform a
+prediction. At this point we might want to consider publishing a file
+with both the architecture of our network and the weights which it has
+learned (assuming we did not use a pre-trained network). This will allow
+others to use it as a pre-trained network for their own purposes and
+for them to (mostly) reproduce our result.</p>
 <p>To share the model we must save it first:</p>
 <div class="codewrapper sourceCode" id="cb11">
 <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
@@ -855,7 +848,7 @@ <h3 class="callout-title">Key Points<a class="anchor" aria-label="anchor" href="
   "url": "https://.github.io/github.com/instructor/01-introduction.html",
   "identifier": "https://.github.io/github.com/instructor/01-introduction.html",
   "dateCreated": "2023-05-03",
-  "dateModified": "2024-02-14",
+  "dateModified": "2024-02-23",
   "datePublished": "2024-02-23"
 }
 
diff --git a/instructor/04-fit-cnn.html b/instructor/04-fit-cnn.html
index 269840a6..d28efb8e 100644
--- a/instructor/04-fit-cnn.html
+++ b/instructor/04-fit-cnn.html
@@ -405,7 +405,7 @@ <h4 id="optimizer">Optimizer<a class="anchor" aria-label="anchor" href="#optimiz
   <h3 class="accordion-header" id="headingSpoiler1">
 <div class="note-square"><i aria-hidden="true" class="callout-icon" data-feather="eye"></i></div>WANT TO KNOW MORE: Learning Rate</h3>
 </button>
-<div id="collapseSpoiler1" class="accordion-collapse collapse" aria-labelledby="headingSpoiler1" data-bs-parent="#accordionSpoiler1">
+<div id="collapseSpoiler1" class="accordion-collapse collapse" data-bs-parent="#accordionSpoiler1" aria-labelledby="headingSpoiler1">
 <div class="accordion-body">
 <p>ChatGPT</p>
 <p><strong>Learning rate</strong> is a hyperparameter that determines
@@ -504,7 +504,7 @@ <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"><
   <h3 class="accordion-header" id="headingSpoiler2">
 <div class="note-square"><i aria-hidden="true" class="callout-icon" data-feather="eye"></i></div>WANT TO KNOW MORE: Batch size</h3>
 </button>
-<div id="collapseSpoiler2" class="accordion-collapse collapse" aria-labelledby="headingSpoiler2" data-bs-parent="#accordionSpoiler2">
+<div id="collapseSpoiler2" class="accordion-collapse collapse" data-bs-parent="#accordionSpoiler2" aria-labelledby="headingSpoiler2">
 <div class="accordion-body">
 <p>ChatGPT</p>
 <p>The choice of batch size can have various implications, and there are
@@ -590,7 +590,7 @@ <h3 class="callout-title">Inspect the Training Curve<a class="anchor" aria-label
 <button class="accordion-button solution-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSolution1" aria-expanded="false" aria-controls="collapseSolution1">
   <h4 class="accordion-header" id="headingSolution1">Show me the solution</h4>
 </button>
-<div id="collapseSolution1" class="accordion-collapse collapse" aria-labelledby="headingSolution1" data-bs-parent="#accordionSolution1">
+<div id="collapseSolution1" class="accordion-collapse collapse" data-bs-parent="#accordionSolution1" aria-labelledby="headingSolution1">
 <div class="accordion-body">
 <ol style="list-style-type: decimal"><li>The loss curve should drop quite quickly in a smooth line with
 little jitter. The accuracy should increase quite quickly in a smooth
@@ -632,7 +632,7 @@ <h4 class="accordion-header" id="headingSolution1">Show me the solution</h4>
   <h3 class="accordion-header" id="headingSpoiler3">
 <div class="note-square"><i aria-hidden="true" class="callout-icon" data-feather="eye"></i></div>WANT TO KNOW MORE: What is underfitting?</h3>
 </button>
-<div id="collapseSpoiler3" class="accordion-collapse collapse" aria-labelledby="headingSpoiler3" data-bs-parent="#accordionSpoiler3">
+<div id="collapseSpoiler3" class="accordion-collapse collapse" data-bs-parent="#accordionSpoiler3" aria-labelledby="headingSpoiler3">
 <div class="accordion-body">
 <p>Underfitting occurs when the model is too simple or lacks the
 capacity to capture the underlying patterns and relationships present in
@@ -806,7 +806,7 @@ <h3 class="callout-title">Does adding a Dropout Layer improve our
 <button class="accordion-button solution-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSolution2" aria-expanded="false" aria-controls="collapseSolution2">
   <h4 class="accordion-header" id="headingSolution2">Show me the solution</h4>
 </button>
-<div id="collapseSolution2" class="accordion-collapse collapse" aria-labelledby="headingSolution2" data-bs-parent="#accordionSolution2">
+<div id="collapseSolution2" class="accordion-collapse collapse" data-bs-parent="#accordionSolution2" aria-labelledby="headingSolution2">
 <div class="accordion-body">
 <div class="codewrapper sourceCode" id="cb9">
 <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
@@ -858,7 +858,7 @@ <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"><
   <h3 class="accordion-header" id="headingSpoiler4">
 <div class="note-square"><i aria-hidden="true" class="callout-icon" data-feather="eye"></i></div>WANT TO KNOW MORE: Regularization methods for Convolutional Neural Networks (CNNs)</h3>
 </button>
-<div id="collapseSpoiler4" class="accordion-collapse collapse" aria-labelledby="headingSpoiler4" data-bs-parent="#accordionSpoiler4">
+<div id="collapseSpoiler4" class="accordion-collapse collapse" data-bs-parent="#accordionSpoiler4" aria-labelledby="headingSpoiler4">
 <div class="accordion-body">
 <p>ChatGPT</p>
 <p><strong>Regularization</strong> methods introduce constraints or
diff --git a/instructor/aio.html b/instructor/aio.html
index f2133bde..ebf187fa 100644
--- a/instructor/aio.html
+++ b/instructor/aio.html
@@ -595,7 +595,7 @@ <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"><
  -->
 </section></section><section id="aio-01-introduction"><p>Content from <a href="01-introduction.html">Introduction to Deep Learning</a></p>
 <hr>
-<p>Last updated on 2024-02-14 |
+<p>Last updated on 2024-02-23 |
         
         <a href="https://https://github.com/erinmgraham/icwithcnn/edit/main/episodes/01-introduction.html" class="external-link">Edit this page <i aria-hidden="true" data-feather="edit"></i></a></p>
 <p>Estimated time: <i aria-hidden="true" data-feather="clock"></i> 10 minutes</p>
@@ -645,15 +645,8 @@ <h3 class="card-title">Objectives</h3>
 are many more.</p>
 <p>The techniques break down into two broad categories, predictors and
 classifiers. Predictors are used to predict a value (or set of values)
-given a set of inputs, for example trying to predict the cost of
-something given the economic conditions and the cost of raw materials or
-predicting a country’s GDP given its life expectancy. Classifiers try to
-classify data into different categories, or assign a label; for example,
-deciding what characters are visible in a picture of some writing or if
-an email or text message is spam or not.</p>
-</section><section id="training-data"><h2 class="section-heading">Training Data<a class="anchor" aria-label="anchor" href="#training-data"></a>
-</h2>
-<hr class="half-width">
+given a set of inputs, whereas classifiers try to classify data into
+different categories, or assign a label.</p>
 <p>Many, but not all, machine learning systems “learn” by taking a
 series of input data and output data and using it to form a model. The
 maths behind the machine learning doesn’t care what the data is as long
@@ -696,18 +689,18 @@ <h3 class="callout-title">Callout<a class="anchor" aria-label="anchor" href="#ca
 <div class="callout-content">
 <p>Concept: Differentiation between traditional Machine Learning models
 and Deep Learning models:</p>
-<p><strong>Traditional ML algorithms</strong> can only use one (possibly
-two layers) of data transformation to calculate an output (shallow
-models). With high dimensional data and growing feature space (possible
-set of values for any given feature), shallow models quickly run out of
-layers to calculate outputs.</p>
-<p><strong>Deep neural networks</strong> (constructed with multiple
-layers of neurons) are the extension of shallow models with three
-layers: input, hidden, and outputs layers. The hidden layer is where
-learning takes place. As a result, deep learning is best applied to
-large datasets for training and prediction. As observations and feature
-inputs decrease, shallow ML approaches begin to perform noticeably
-better.</p>
+<p><strong>Traditional ML algorithms</strong>, known as shallow models,
+are limited to just one or maybe two layers of data transformation to
+generate an output. When dealing with complex data featuring high
+dimensions and growing feature space (i.e. many attributes and an
+expanding set of potential values for each feature), these shallow
+models become limited in their ability to compute accurate outputs.</p>
+<p><strong>Deep neural networks</strong> are the extension of shallow
+models with three types of layers: input, hidden, and output. The hidden
+layers are where learning takes place. As a result, deep learning is
+best applied to large datasets for training and prediction. As
+observations and feature inputs decrease, shallow ML approaches begin to
+perform noticeably better.</p>
 </div>
 </div>
 </div>
@@ -740,10 +733,11 @@ <h3 class="callout-title">Callout<a class="anchor" aria-label="anchor" href="#ca
 <strong>Security and Surveillance</strong>: Detecting anomalies or
 unauthorised objects in security footage.</li>
 </ul>
-<p>Convolutional Neural Networks (CNNs) have become a cornerstone in
-image classification due to their ability to automatically learn
-hierarchical features from images and achieve remarkable performance on
-a wide range of tasks.</p>
+<p>A Convolutional Neural Network (CNN) is a Deep Learning algorithm
+that has become a cornerstone in image classification due to its ability
+to automatically learn features from images in a hierarchical fashion
+(i.e. each layer builds upon what was learned by the previous layer). It
+can achieve remarkable performance on a wide range of tasks.</p>
 </section><section id="deep-learning-workflow"><h2 class="section-heading">Deep Learning Workflow<a class="anchor" aria-label="anchor" href="#deep-learning-workflow"></a>
 </h2>
 <hr class="half-width">
@@ -776,12 +770,12 @@ <h3 id="step-3--prepare-data">Step 3. Prepare data<a class="anchor" aria-label="
 the data structure will be explored in <a href="02-image-data">Episode
 02 Introduction to Image Data</a>.</p>
 <p>For this lesson, we will use an existing image dataset known as
-CIFAR-10. We will introduce this dataset and the different data
-preparation tasks in more detail in the next episode but for this
-introduction, we want to divide the data into <strong>training</strong>,
-<strong>validation</strong>, and <strong>test</strong> subsets;
-normalise the image pixel values to be between 0 and 1; and one-hot
-encode our image labels.</p>
+CIFAR-10 (Canadian Institute for Advanced Research). We will introduce
+this dataset and the different data preparation tasks in more detail in
+the next episode but for this introduction, we want to divide the data
+into <strong>training</strong>, <strong>validation</strong>, and
+<strong>test</strong> subsets; normalise the image pixel values to be
+between 0 and 1; and one-hot encode our image labels.</p>
 <div class="section level4">
 <h4 id="preparing-the-code">Preparing the code<a class="anchor" aria-label="anchor" href="#preparing-the-code"></a>
 </h4>
@@ -794,28 +788,27 @@ <h4 id="preparing-the-code">Preparing the code<a class="anchor" aria-label="anch
 <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
 </h3>
 <pre class="sourceCode python" tabindex="0"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="co"># load the required packages</span></span>
-<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> tensorflow <span class="im">import</span> keras <span class="co"># library for neural networks </span></span>
-<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> sklearn.model_selection <span class="im">import</span> train_test_split <span class="co"># library for splitting data into sets</span></span>
-<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> matplotlib.pyplot <span class="im">as</span> plt <span class="co"># library for plotting</span></span>
-<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> numpy <span class="im">as</span> np <span class="co"># library for working with images as arrays</span></span>
-<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a><span class="co"># load the CIFAR-10 dataset included with the keras library</span></span>
-<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>(train_images, train_labels), (test_images, test_labels) <span class="op">=</span> keras.datasets.cifar10.load_data()</span>
-<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a><span class="co"># normalise the RGB values to be between 0 and 1</span></span>
-<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a>train_images <span class="op">=</span> train_images <span class="op">/</span> <span class="fl">255.0</span></span>
-<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a>test_images <span class="op">=</span> test_images <span class="op">/</span> <span class="fl">255.0</span></span>
-<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a><span class="co"># create a list of class names</span></span>
-<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a>class_names <span class="op">=</span> [<span class="st">'airplane'</span>, <span class="st">'automobile'</span>, <span class="st">'bird'</span>, <span class="st">'cat'</span>, <span class="st">'deer'</span>, <span class="st">'dog'</span>, <span class="st">'frog'</span>, <span class="st">'horse'</span>, <span class="st">'ship'</span>, <span class="st">'truck'</span>]</span>
-<span id="cb1-16"><a href="#cb1-16" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb1-17"><a href="#cb1-17" aria-hidden="true" tabindex="-1"></a><span class="co"># one-hot encode labels</span></span>
-<span id="cb1-18"><a href="#cb1-18" aria-hidden="true" tabindex="-1"></a>train_labels <span class="op">=</span> keras.utils.to_categorical(train_labels, <span class="bu">len</span>(class_names))</span>
-<span id="cb1-19"><a href="#cb1-19" aria-hidden="true" tabindex="-1"></a>val_labels <span class="op">=</span> keras.utils.to_categorical(val_labels, <span class="bu">len</span>(class_names))</span>
-<span id="cb1-20"><a href="#cb1-20" aria-hidden="true" tabindex="-1"></a></span>
-<span id="cb1-21"><a href="#cb1-21" aria-hidden="true" tabindex="-1"></a><span class="co"># split the training data into training and validation sets</span></span>
-<span id="cb1-22"><a href="#cb1-22" aria-hidden="true" tabindex="-1"></a><span class="co"># </span><span class="al">NOTE</span><span class="co"> the function is train_test_split() but we are using it to split train into train and validation</span></span>
-<span id="cb1-23"><a href="#cb1-23" aria-hidden="true" tabindex="-1"></a>train_images, val_images, train_labels, val_labels <span class="op">=</span> train_test_split(train_images, train_labels, test_size<span class="op">=</span><span class="fl">0.2</span>, random_state<span class="op">=</span><span class="dv">42</span>)</span></code></pre>
+<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> tensorflow <span class="im">import</span> keras <span class="co"># for neural networks </span></span>
+<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> sklearn.model_selection <span class="im">import</span> train_test_split <span class="co"># for splitting data into sets</span></span>
+<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> matplotlib.pyplot <span class="im">as</span> plt <span class="co"># for plotting</span></span>
+<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a><span class="co"># load the CIFAR-10 dataset included with keras</span></span>
+<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>(train_images, train_labels), (test_images, test_labels) <span class="op">=</span> keras.datasets.cifar10.load_data()</span>
+<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a><span class="co"># normalise the RGB values to be between 0 and 1</span></span>
+<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a>train_images <span class="op">=</span> train_images <span class="op">/</span> <span class="fl">255.0</span></span>
+<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a>test_images <span class="op">=</span> test_images <span class="op">/</span> <span class="fl">255.0</span></span>
+<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a><span class="co"># create a list of class names</span></span>
+<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a>class_names <span class="op">=</span> [<span class="st">'airplane'</span>, <span class="st">'automobile'</span>, <span class="st">'bird'</span>, <span class="st">'cat'</span>, <span class="st">'deer'</span>, <span class="st">'dog'</span>, <span class="st">'frog'</span>, <span class="st">'horse'</span>, <span class="st">'ship'</span>, <span class="st">'truck'</span>]</span>
+<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-16"><a href="#cb1-16" aria-hidden="true" tabindex="-1"></a><span class="co"># one-hot encode labels</span></span>
+<span id="cb1-17"><a href="#cb1-17" aria-hidden="true" tabindex="-1"></a>train_labels <span class="op">=</span> keras.utils.to_categorical(train_labels, <span class="bu">len</span>(class_names))</span>
+<span id="cb1-18"><a href="#cb1-18" aria-hidden="true" tabindex="-1"></a>test_labels <span class="op">=</span> keras.utils.to_categorical(test_labels, <span class="bu">len</span>(class_names))</span>
+<span id="cb1-19"><a href="#cb1-19" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-20"><a href="#cb1-20" aria-hidden="true" tabindex="-1"></a><span class="co"># split the training data into training and validation sets</span></span>
+<span id="cb1-21"><a href="#cb1-21" aria-hidden="true" tabindex="-1"></a><span class="co"># </span><span class="al">NOTE</span><span class="co"> the function is train_test_split() but we are using it to split train into train and validation</span></span>
+<span id="cb1-22"><a href="#cb1-22" aria-hidden="true" tabindex="-1"></a>train_images, val_images, train_labels, val_labels <span class="op">=</span> train_test_split(train_images, train_labels, test_size<span class="op">=</span><span class="fl">0.2</span>, random_state<span class="op">=</span><span class="dv">42</span>)</span></code></pre>
 </div>
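As a rough illustration of what the three preparation steps do to the arrays, here is a minimal sketch that uses only NumPy on a small synthetic dataset. The random images and labels, and names such as `images_norm` and `labels_onehot`, are assumptions made for this example; they are not part of the lesson code, which uses `keras` and `train_test_split()` instead.

```python
import numpy as np

# hypothetical stand-in for CIFAR-10: 10 random RGB images with integer labels
rng = np.random.default_rng(42)
images = rng.integers(0, 256, size=(10, 32, 32, 3), dtype=np.uint8)
labels = rng.integers(0, 10, size=(10, 1))

# normalise the RGB values from the range 0-255 to the range 0-1
images_norm = images / 255.0

# one-hot encode: row i contains a single 1 in column labels[i]
n_classes = 10
labels_onehot = np.eye(n_classes)[labels.ravel()]

# hold out 20% of the examples as a validation set
n_val = int(0.2 * len(images_norm))
order = rng.permutation(len(images_norm))
val_idx, train_idx = order[:n_val], order[n_val:]
train_x, val_x = images_norm[train_idx], images_norm[val_idx]
train_y, val_y = labels_onehot[train_idx], labels_onehot[val_idx]

print(train_x.shape, val_x.shape)
print(train_y.shape, val_y.shape)
```

Because the labels are encoded before the split, both the training and validation labels come out of the split already one-hot encoded.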
 <div id="challenge-examine-the-cifar-10-dataset" class="callout challenge">
 <div class="callout-square">
@@ -871,7 +864,7 @@ <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"><
 <span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a><span class="co"># plot a subset of the images </span></span>
 <span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i <span class="kw">in</span> <span class="bu">range</span>(<span class="dv">25</span>):</span>
 <span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a>    plt.subplot(<span class="dv">5</span>,<span class="dv">5</span>,i<span class="op">+</span><span class="dv">1</span>)</span>
-<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>    plt.imshow(train_images[i], cmap<span class="op">=</span>plt.cm.binary)</span>
+<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>    plt.imshow(train_images[i])</span>
 <span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a>    plt.axis(<span class="st">'off'</span>)</span>
 <span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a>    plt.title(class_names[train_labels[i,].argmax()])</span>
 <span id="cb4-10"><a href="#cb4-10" aria-hidden="true" tabindex="-1"></a>plt.show()</span></code></pre>
@@ -981,27 +974,27 @@ <h4 id="what-does-this-output-mean">What does this output mean?<a class="anchor"
 against known image labels, can be broken down as follows:</p>
 <ul>
 <li><p><code>Epoch</code> describes the number of full passes over all
-<em>training data</em>. In the output above there are <strong>1250
-training observations</strong>. This number is calculated as the total
-number of images used as input divided by the batch size (40000/32). An
-epoch will conclude and move to the next epoch after a training pass
-over all observations.</p></li>
-<li><p><code>loss</code> and <code>val_loss</code> can be considered as
-related. Where <code>loss</code> is a value the model will attempt to
-minimise, and is the distance between the true label of an image and the
-models prediction. Minimising this distance is where <em>learning</em>
-occurs to adjust weights and bias which reduce <code>loss</code>. On the
-other hand <code>val_loss</code> is a value calculated against the
-validation data and is a measurement of the models performance against
-<strong>unseen data</strong>. Both values are a summation of errors made
-for each example when fitting to the training or validation
-sets.</p></li>
-<li><p><code>accuracy</code> and <code>val_accuracy</code> can also be
-considered as related. Unlike <code>loss</code> and
-<code>val_loss</code>, these values are a percentage and are only
-revelant to <strong>classification problems</strong>. The
-<code>val_accuracy</code> score can be used to communicate a percentage
-value of model effectiveness on unseen data.</p></li>
+<em>training data</em>.</p></li>
+<li><p>In the output above, there are <strong>1250</strong> batches
+(steps) to complete each epoch. This number is calculated as the total
+number of images used as input divided by the batch size (40000/32).
+After 1250 batches, all training images will have been seen once and the
+model moves on to the next epoch.</p></li>
+<li><p><code>loss</code> is a value the model will attempt to minimise
+and is a measure of the dissimilarity or error between the true label of
+an image and the model prediction. Minimising this error is where
+<em>learning</em> occurs: the weights and biases are adjusted to reduce
+<code>loss</code>.</p></li>
+<li><p><code>val_loss</code> is a value calculated against the
+validation data and is a measure of the model’s performance against
+unseen data.</p></li>
+<li><p>Both values are a summation of errors made during each
+epoch.</p></li>
+<li><p><code>accuracy</code> and <code>val_accuracy</code> values are a
+percentage and are only relevant to <strong>classification
+problems</strong>.</p></li>
+<li><p>The <code>val_accuracy</code> score can be used to communicate a
+model’s effectiveness on unseen data.</p></li>
 </ul>
 </div>
 </div>
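The batch count quoted above can be checked with a couple of lines of arithmetic. This is only a sketch: the variable names are made up for the example, and rounding up with `math.ceil` assumes that a final, partially filled batch still counts as one step.

```python
import math

n_train_images = 40000  # images left for training after the validation split
batch_size = 32         # batch size quoted in the text

# number of batches (steps) needed to see every training image once
steps_per_epoch = math.ceil(n_train_images / batch_size)
print(steps_per_epoch)  # 1250

# without a validation split the count is no longer a whole number of batches
steps_full = math.ceil(50000 / batch_size)
print(steps_full)  # 1563
```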
@@ -1009,9 +1002,9 @@ <h4 id="what-does-this-output-mean">What does this output mean?<a class="anchor"
 <h3 id="step-7--perform-a-predictionclassification">Step 7. Perform a Prediction/Classification<a class="anchor" aria-label="anchor" href="#step-7--perform-a-predictionclassification"></a>
 </h3>
 <p>After training the network we can use it to perform predictions. This
-is the mode you would use the network in after you have fully trained it
-to a satisfactory performance. Doing predictions on a special hold-out
-set is used in the next step to measure the performance of the
+is how you would use the network after you have fully trained it to a
+satisfactory performance. The predictions performed here on a special
+hold-out set are used in the next step to measure the performance of the
 network.</p>
 <div class="codewrapper sourceCode" id="cb9">
 <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
@@ -1023,7 +1016,7 @@ <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"><
 <span id="cb9-5"><a href="#cb9-5" aria-hidden="true" tabindex="-1"></a><span class="bu">print</span>(<span class="st">'The class with the highest predicted probability is: '</span>, class_names[result_intro.argmax()])</span>
 <span id="cb9-6"><a href="#cb9-6" aria-hidden="true" tabindex="-1"></a></span>
 <span id="cb9-7"><a href="#cb9-7" aria-hidden="true" tabindex="-1"></a><span class="co"># plot the image with its true label</span></span>
-<span id="cb9-8"><a href="#cb9-8" aria-hidden="true" tabindex="-1"></a>plt.imshow(test_images[<span class="dv">0</span>], cmap<span class="op">=</span>plt.cm.binary)</span>
+<span id="cb9-8"><a href="#cb9-8" aria-hidden="true" tabindex="-1"></a>plt.imshow(test_images[<span class="dv">0</span>])</span>
 <span id="cb9-9"><a href="#cb9-9" aria-hidden="true" tabindex="-1"></a>plt.title(<span class="st">'True class:'</span> <span class="op">+</span> class_names[test_labels[<span class="dv">0</span>,].argmax()])</span>
 <span id="cb9-10"><a href="#cb9-10" aria-hidden="true" tabindex="-1"></a>plt.show()</span></code></pre>
 </div>
@@ -1033,7 +1026,14 @@ <h3 class="code-label">OUTPUT<i aria-hidden="true" data-feather="chevron-left"><
 <pre class="output" tabindex="0"><code>The predicted probability of each class is:  [[0.0074 0.0006 0.0456 0.525  0.0036 0.1062 0.0162 0.0006 0.2908 0.004 ]]
 The class with the highest predicted probability is:  cat</code></pre>
 </div>
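To see how the class name is derived from the probabilities, the sketch below copies the values from the output above into a NumPy array and looks up the largest one. The names `probs` and `top3` are assumptions for this example; in the lesson the array comes from the model prediction.

```python
import numpy as np

class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

# predicted probabilities copied from the output above (one row per image)
probs = np.array([[0.0074, 0.0006, 0.0456, 0.525, 0.0036,
                   0.1062, 0.0162, 0.0006, 0.2908, 0.004]])

# argmax returns the position of the largest probability
best = probs.argmax()
print(class_names[best])  # cat

# the three most likely classes, from most to least probable
top3 = probs[0].argsort()[::-1][:3]
print([class_names[i] for i in top3])  # ['cat', 'ship', 'dog']
```

The probabilities sum to 1, so the second and third entries give a sense of how confident the model is in its first choice.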
-<figure><img src="../fig/01_test_image.png" alt="poor resolution image of a cat" class="figure mx-auto d-block"></figure><div id="callout2" class="callout callout">
+<figure><img src="../fig/01_test_image.png" alt="poor resolution image of a cat" class="figure mx-auto d-block"></figure><p>Congratulations, you just created your first image classification
+model and used it to classify an image!</p>
+<p>Was the classification correct? Why might it be incorrect and what
+can we do about it?</p>
+<p>There are many ways to try to improve the accuracy of our model, such
+as adding or removing layers to the model definition and fine-tuning the
+hyperparameters, which takes us to the next steps in our workflow.</p>
+<div id="callout2" class="callout callout">
 <div class="callout-square">
 <i class="callout-icon" data-feather="bell"></i>
 </div>
@@ -1042,11 +1042,11 @@ <h3 class="callout-title">Callout<a class="anchor" aria-label="anchor" href="#ca
 </h3>
 <div class="callout-content">
 <p>My result is different!</p>
-<p>While the neural network itself is deterministic, various factors in
-the training process, system setup, and data variability can lead to
-small variations in the output. These variations are usually minor and
-should not significantly impact the overall performance or behavior of
-the model.</p>
+<p>While the neural network itself is deterministic (i.e. without
+randomness), various factors in the training process, system setup, and
+data variability can lead to small variations in the output. These
+variations are usually minor and should not significantly impact the
+overall performance or behavior of the model.</p>
 <p>If you are finding significant differences in the model predictions,
 this could be a sign the model is not fully converged. “Convergence”
 refers to the point where the model has reached an optimal or
@@ -1054,13 +1054,6 @@ <h3 class="callout-title">Callout<a class="anchor" aria-label="anchor" href="#ca
 </div>
 </div>
 </div>
-<p>Congratulations, you just created your first image classification
-model and used it to classify an image!</p>
-<p>Was the classification correct? Why might it be incorrect and what
-can we do about?</p>
-<p>There are many ways to try to improve the accuracy of our model, such
-as adding or removing layers to the model definition and fine-tuning the
-hyperparameters, which takes us to the next steps in our workflow.</p>
 </div>
 <div class="section level3">
 <h3 id="step-8--measure-performance">Step 8. Measure Performance<a class="anchor" aria-label="anchor" href="#step-8--measure-performance"></a>
@@ -1080,7 +1073,7 @@ <h3 id="step-9--tune-hyperparameters">Step 9. Tune Hyperparameters<a class="anch
 designing a neural network but also choosing the best values for various
 hyperparameters that govern the training process.</p>
 <p><strong>Hyperparameters</strong> are all the parameters set by the
-person configuring the machine learning instead of those learned by the
+person configuring the model as opposed to those learned by the
 algorithm itself. These hyperparameters can include the learning rate,
 the number of layers in the network, the number of neurons per layer,
 and many more. Hyperparameter tuning refers to the process of
@@ -1093,12 +1086,12 @@ <h3 id="step-9--tune-hyperparameters">Step 9. Tune Hyperparameters<a class="anch
 <h3 id="step-10--share-model">Step 10. Share Model<a class="anchor" aria-label="anchor" href="#step-10--share-model"></a>
 </h3>
 <p>Now that we have a trained network that performs at a level we are
-happy with we can go and use it on real data to perform a prediction. At
-this point we might want to consider publishing a file with both the
-architecture of our network and the weights which it has learned
-(assuming we did not use a pre-trained network). This will allow others
-to use it as as pre-trained network for their own purposes and for them
-to (mostly) reproduce our result.</p>
+happy with we can go and use it on real live data to perform a
+prediction. At this point we might want to consider publishing a file
+with both the architecture of our network and the weights which it has
+learned (assuming we did not use a pre-trained network). This will allow
+others to use it as a pre-trained network for their own purposes and
+for them to (mostly) reproduce our result.</p>
 <p>To share the model we must save it first:</p>
 <div class="codewrapper sourceCode" id="cb11">
 <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
@@ -2718,7 +2711,7 @@ <h4 id="optimizer">Optimizer<a class="anchor" aria-label="anchor" href="#optimiz
   <h3 class="accordion-header" id="headingSpoiler1">
 <div class="note-square"><i aria-hidden="true" class="callout-icon" data-feather="eye"></i></div>WANT TO KNOW MORE: Learning Rate</h3>
 </button>
-<div id="collapseSpoiler1" class="accordion-collapse collapse" aria-labelledby="headingSpoiler1" data-bs-parent="#accordionSpoiler1">
+<div id="collapseSpoiler1" class="accordion-collapse collapse" data-bs-parent="#accordionSpoiler1" aria-labelledby="headingSpoiler1">
 <div class="accordion-body">
 <p>ChatGPT</p>
 <p><strong>Learning rate</strong> is a hyperparameter that determines
@@ -2819,7 +2812,7 @@ <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"><
   <h3 class="accordion-header" id="headingSpoiler2">
 <div class="note-square"><i aria-hidden="true" class="callout-icon" data-feather="eye"></i></div>WANT TO KNOW MORE: Batch size</h3>
 </button>
-<div id="collapseSpoiler2" class="accordion-collapse collapse" aria-labelledby="headingSpoiler2" data-bs-parent="#accordionSpoiler2">
+<div id="collapseSpoiler2" class="accordion-collapse collapse" data-bs-parent="#accordionSpoiler2" aria-labelledby="headingSpoiler2">
 <div class="accordion-body">
 <p>ChatGPT</p>
 <p>The choice of batch size can have various implications, and there are
@@ -2912,7 +2905,7 @@ <h3 class="callout-title">Inspect the Training Curve<a class="anchor" aria-label
 <button class="accordion-button solution-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSolution1" aria-expanded="false" aria-controls="collapseSolution1">
   <h4 class="accordion-header" id="headingSolution1">Show me the solution</h4>
 </button>
-<div id="collapseSolution1" class="accordion-collapse collapse" aria-labelledby="headingSolution1" data-bs-parent="#accordionSolution1">
+<div id="collapseSolution1" class="accordion-collapse collapse" data-bs-parent="#accordionSolution1" aria-labelledby="headingSolution1">
 <div class="accordion-body">
 <ol style="list-style-type: decimal">
 <li>The loss curve should drop quite quickly in a smooth line with
@@ -2960,7 +2953,7 @@ <h4 class="accordion-header" id="headingSolution1">Show me the solution</h4>
   <h3 class="accordion-header" id="headingSpoiler3">
 <div class="note-square"><i aria-hidden="true" class="callout-icon" data-feather="eye"></i></div>WANT TO KNOW MORE: What is underfitting?</h3>
 </button>
-<div id="collapseSpoiler3" class="accordion-collapse collapse" aria-labelledby="headingSpoiler3" data-bs-parent="#accordionSpoiler3">
+<div id="collapseSpoiler3" class="accordion-collapse collapse" data-bs-parent="#accordionSpoiler3" aria-labelledby="headingSpoiler3">
 <div class="accordion-body">
 <p>Underfitting occurs when the model is too simple or lacks the
 capacity to capture the underlying patterns and relationships present in
@@ -3140,7 +3133,7 @@ <h3 class="callout-title">Does adding a Dropout Layer improve our
 <button class="accordion-button solution-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSolution2" aria-expanded="false" aria-controls="collapseSolution2">
   <h4 class="accordion-header" id="headingSolution2">Show me the solution</h4>
 </button>
-<div id="collapseSolution2" class="accordion-collapse collapse" aria-labelledby="headingSolution2" data-bs-parent="#accordionSolution2">
+<div id="collapseSolution2" class="accordion-collapse collapse" data-bs-parent="#accordionSolution2" aria-labelledby="headingSolution2">
 <div class="accordion-body">
 <div class="codewrapper sourceCode" id="cb9">
 <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"></i><i aria-hidden="true" data-feather="chevron-right"></i>
@@ -3192,7 +3185,7 @@ <h3 class="code-label">PYTHON<i aria-hidden="true" data-feather="chevron-left"><
   <h3 class="accordion-header" id="headingSpoiler4">
 <div class="note-square"><i aria-hidden="true" class="callout-icon" data-feather="eye"></i></div>WANT TO KNOW MORE: Regularization methods for Convolutional Neural Networks (CNNs)</h3>
 </button>
-<div id="collapseSpoiler4" class="accordion-collapse collapse" aria-labelledby="headingSpoiler4" data-bs-parent="#accordionSpoiler4">
+<div id="collapseSpoiler4" class="accordion-collapse collapse" data-bs-parent="#accordionSpoiler4" aria-labelledby="headingSpoiler4">
 <div class="accordion-body">
 <p>ChatGPT</p>
 <p><strong>Regularization</strong> methods introduce constraints or
diff --git a/instructor/index.html b/instructor/index.html
index 2e4d8f51..fe8ef877 100644
--- a/instructor/index.html
+++ b/instructor/index.html
@@ -415,7 +415,7 @@ <h3 class="callout-title">Install Python Using Anaconda<a class="anchor" aria-la
 <button class="accordion-button solution-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSolution1" aria-expanded="false" aria-controls="collapseSolution1">
   <h4 class="accordion-header" id="headingSolution1">Windows</h4>
 </button>
-<div id="collapseSolution1" class="accordion-collapse collapse" data-bs-parent="#accordionSolution1" aria-labelledby="headingSolution1">
+<div id="collapseSolution1" class="accordion-collapse collapse" aria-labelledby="headingSolution1" data-bs-parent="#accordionSolution1">
 <div class="accordion-body">
 <p>Check out the <a href="https://www.youtube.com/watch?v=xxQ0mzZ8UvA" class="external-link">Windows - Video
 tutorial</a> or:</p>
@@ -434,7 +434,7 @@ <h4 class="accordion-header" id="headingSolution1">Windows</h4>
 <button class="accordion-button solution-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSolution2" aria-expanded="false" aria-controls="collapseSolution2">
   <h4 class="accordion-header" id="headingSolution2">MacOS</h4>
 </button>
-<div id="collapseSolution2" class="accordion-collapse collapse" data-bs-parent="#accordionSolution2" aria-labelledby="headingSolution2">
+<div id="collapseSolution2" class="accordion-collapse collapse" aria-labelledby="headingSolution2" data-bs-parent="#accordionSolution2">
 <div class="accordion-body">
 <p>Check out the <a href="https://www.youtube.com/watch?v=TcSAln46u9U" class="external-link">Mac OS X - Video
 tutorial</a> or:</p>
@@ -452,7 +452,7 @@ <h4 class="accordion-header" id="headingSolution2">MacOS</h4>
 <button class="accordion-button solution-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSolution3" aria-expanded="false" aria-controls="collapseSolution3">
   <h4 class="accordion-header" id="headingSolution3">Linux</h4>
 </button>
-<div id="collapseSolution3" class="accordion-collapse collapse" data-bs-parent="#accordionSolution3" aria-labelledby="headingSolution3">
+<div id="collapseSolution3" class="accordion-collapse collapse" aria-labelledby="headingSolution3" data-bs-parent="#accordionSolution3">
 <div class="accordion-body">
 <p>Note the following installation steps require you to work from the
 shell. If you run into any difficulties, please request help before the
diff --git a/md5sum.txt b/md5sum.txt
index b6994aba..4c6cff98 100644
--- a/md5sum.txt
+++ b/md5sum.txt
@@ -5,7 +5,7 @@
 "index.md" "a02c9c785ed98ddd84fe3d34ddb12fcd" "site/built/index.md" "2023-08-22"
 "links.md" "8184cf4149eafbf03ce8da8ff0778c14" "site/built/links.md" "2023-08-22"
 "episodes/setup-gpu.md" "fad3fa7635f7b6b718dd14b33e40c6cf" "site/built/setup-gpu.md" "2024-02-23"
-"episodes/01-introduction.md" "84b5ec3f43b005b9d30416e976888d09" "site/built/01-introduction.md" "2024-02-14"
+"episodes/01-introduction.md" "cbad161595e756a7d132b3c7c56ccf00" "site/built/01-introduction.md" "2024-02-23"
 "episodes/02-image-data.md" "a5d98377866103717f42b2a73e375eb0" "site/built/02-image-data.md" "2024-02-09"
 "episodes/03-build-cnn.md" "0e2a686fd64ddf99d1af787ca5d1679f" "site/built/03-build-cnn.md" "2024-02-09"
 "episodes/04-fit-cnn.md" "7244a8847b43022d3f85da74f5f5946c" "site/built/04-fit-cnn.md" "2024-02-08"
diff --git a/pkgdown.yml b/pkgdown.yml
index c4d44dc7..44d169a5 100644
--- a/pkgdown.yml
+++ b/pkgdown.yml
@@ -2,5 +2,5 @@ pandoc: 2.19.2
 pkgdown: 2.0.7
 pkgdown_sha: ~
 articles: {}
-last_built: 2024-02-23T05:47Z
+last_built: 2024-02-23T08:08Z