diff --git a/episodes/1-introduction.Rmd b/episodes/1-introduction.Rmd index f4eb7648..72342af3 100644 --- a/episodes/1-introduction.Rmd +++ b/episodes/1-introduction.Rmd @@ -40,7 +40,7 @@ Deep learning (DL) is just one of many techniques collectively known as machine The image below shows some differences between artificial intelligence, machine learning and deep learning. -![](fig/01_AI_ML_DL_differences.jpg){ +![](fig/01_AI_ML_DL_differences.png){ alt='An infographic showing the relation of artificial intelligence, machine learning, and deep learning. Deep learning is a specific subset of machine learning algorithms. Machine learning is one of the approaches to artificial intelligence.' width='60%' } @@ -204,7 +204,7 @@ b. This solves the XOR logical problem, the output is 1 if only one of the two i ::: ##### What makes deep learning deep learning? -Neural networks aren't a new technique, they have been around since the late 1940s. But until around 2010 neural networks tended to be quite small, consisting of only 10s or perhaps 100s of neurons. This limited them to only solving quite basic problems. Around 2010, improvements in computing power and the algorithms for training the networks made much larger and more powerful networks practical. These are known as deep neural networks or deep learning. +Neural networks are not a new technique; they have been around since the late 1940s. But until around 2010 neural networks tended to be quite small, consisting of only 10s or perhaps 100s of neurons. This limited them to only solving quite basic problems. Around 2010, improvements in computing power and the algorithms for training the networks made much larger and more powerful networks practical. These are known as deep neural networks or deep learning. Deep learning requires extensive training using example data which shows the network what output it should produce for a given input. 
One common application of deep learning is [classifying](https://glosario.carpentries.org/en/#classification) images. Here the network will be trained by being "shown" a series of images and told what they contain. Once the network is trained it should be able to take another image and correctly classify its contents. diff --git a/episodes/2-keras.Rmd b/episodes/2-keras.Rmd index 884eac62..aeca9cb4 100644 --- a/episodes/2-keras.Rmd +++ b/episodes/2-keras.Rmd @@ -269,6 +269,7 @@ To split the cleaned dataset into a training and test set we will use a very con function from sklearn called `train_test_split`. This function takes a number of parameters which are extensively explained in [the scikit-learn documentation](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html) : + - The first two parameters are the dataset (in our case `features`) and the corresponding targets (i.e. defined as target). - Next is the named parameter `test_size` this is the fraction of the dataset that is used for testing, in this case `0.2` means 20% of the data will be used for testing. @@ -332,7 +333,7 @@ So, to get truly replicable deep learning pipelines you need to run the notebook ### Build a neural network from scratch -Now we will build a neural network from scratch, which is surprisingly straightforward using Keras. +We will now build a simple neural network from scratch using Keras. With Keras you compose a neural network by creating layers and linking them together. For now we will only use one type of layer called a fully connected @@ -555,6 +556,14 @@ sns.lineplot(x=history.epoch, y=history.history['loss']) ``` ![][training_curve] +::: callout +## I get a different plot +It could be that you get a different plot than the one shown here. +This could be because of a different random initialization of the model or a different split of the data. 
+This difference can be avoided by setting `random_state` and the random seed in the same way as we discussed +in [When to use random seeds?](#when-to-use-random-seeds). +::: + This plot can be used to identify whether the training is well configured or whether there are problems that need to be addressed. @@ -758,12 +767,12 @@ This can be done by using the `save` method of the model. It takes a string as a parameter which is the path of a directory where the model is stored. ```python -model.save('my_first_model') +model.save('my_first_model.keras') ``` This saved model can be loaded again by using the `load_model` method as follows: ```python -pretrained_model = keras.models.load_model('my_first_model') +pretrained_model = keras.models.load_model('my_first_model.keras') ``` This loaded model can be used as before to predict. diff --git a/episodes/3-monitor-the-model.Rmd b/episodes/3-monitor-the-model.Rmd index 30a15cfa..84caf32b 100644 --- a/episodes/3-monitor-the-model.Rmd +++ b/episodes/3-monitor-the-model.Rmd @@ -61,6 +61,12 @@ If you have not downloaded the data yet, you can also load it directly from Zeno ```python data = pd.read_csv("https://zenodo.org/record/5071376/files/weather_prediction_dataset_light.csv?download=1") ``` + +#### SSL certificate error + +If you get the following error message: `certificate verify failed: unable to get local issuer certificate`, +you can download [the data from here manually](https://zenodo.org/record/5071376/files/weather_prediction_dataset_light.csv?download=1) +into a local folder and load the data using the code below. ::: ```python @@ -504,8 +510,8 @@ plot_predictions(y_baseline_prediction, y_test, title='Baseline predictions on t It is difficult to interpret from this plot whether our model is doing better than the baseline. 
We can also have a look at the RMSE: ```python -from sklearn.metrics import mean_squared_error -rmse_baseline = mean_squared_error(y_test, y_baseline_prediction, squared=False) +from sklearn.metrics import root_mean_squared_error +rmse_baseline = root_mean_squared_error(y_test, y_baseline_prediction) print('Baseline:', rmse_baseline) print('Neural network: ', test_metrics['root_mean_squared_error']) ``` @@ -957,7 +963,7 @@ Which will show an interface that looks something like this: Now that we have a somewhat acceptable model, let us not forget to save it for future users to benefit from our explorative efforts! ```python -model.save('my_tuned_weather_model') +model.save('my_tuned_weather_model.keras') ``` ## Outlook diff --git a/episodes/4-advanced-layer-types.Rmd b/episodes/4-advanced-layer-types.Rmd index a60eca2b..7954ea81 100644 --- a/episodes/4-advanced-layer-types.Rmd +++ b/episodes/4-advanced-layer-types.Rmd @@ -134,7 +134,7 @@ train_labels.shape ``` ```output -(878, 1) +(878,) ``` So we have, for each image, a single value denoting the label. To find out what the possible values of these labels are: @@ -1062,7 +1062,7 @@ Next to grid search and random search there are many different hyperparameter tu Let's save our model ```python -model.save('cnn_model') +model.save('cnn_model.keras') ``` ## Conclusion and next steps diff --git a/episodes/5-transfer-learning.Rmd b/episodes/5-transfer-learning.Rmd index e82547b8..65844524 100644 --- a/episodes/5-transfer-learning.Rmd +++ b/episodes/5-transfer-learning.Rmd @@ -49,6 +49,7 @@ val_images = val_images / 255.0 Let's define our model input layer using the shape of our training images: ```python # input tensor +from tensorflow import keras inputs = keras.Input(train_images.shape[1:]) ``` @@ -57,6 +58,7 @@ trained on images of 160 x 160 pixels. To deal with this, we add an upscale laye that resizes the images to 160 x 160 pixels during training and prediction. 
```python # upscale layer +import tensorflow as tf method = tf.image.ResizeMethod.BILINEAR upscale = keras.layers.Lambda( lambda x: tf.image.resize_with_pad(x, 160, 160, method=method))(inputs) @@ -78,6 +80,23 @@ base_model = keras.applications.DenseNet121(include_top=False, input_shape=(160,160,3), ) ``` + +::: callout +## SSL: certificate verify failed error +If you get the following error message: `certificate verify failed: unable to get local issuer certificate`, +you can download [the weights of the model manually](https://storage.googleapis.com/tensorflow/keras-applications/densenet/densenet121_weights_tf_dim_ordering_tf_kernels_notop.h5) +and then load in the weights from the downloaded file: + +```python +base_model = keras.applications.DenseNet121( + include_top=False, + pooling='max', + weights='densenet121_weights_tf_dim_ordering_tf_kernels_notop.h5', # this should refer to the weights file you downloaded + input_tensor=upscale, + input_shape=(160,160,3), +) +``` +::: By setting `include_top` to `False` we exclude the fully connected layer at the top of the network. This layer was used to predict the Imagenet classes, but will be of no use for our Dollar Street dataset. diff --git a/episodes/fig/01_AI_ML_DL_differences.jpg b/episodes/fig/01_AI_ML_DL_differences.jpg deleted file mode 100644 index 8d517d5d..00000000 Binary files a/episodes/fig/01_AI_ML_DL_differences.jpg and /dev/null differ diff --git a/episodes/fig/01_AI_ML_DL_differences.png b/episodes/fig/01_AI_ML_DL_differences.png new file mode 100644 index 00000000..7e27ee71 Binary files /dev/null and b/episodes/fig/01_AI_ML_DL_differences.png differ diff --git a/learners/setup.md b/learners/setup.md index dc3af282..d985bd1b 100644 --- a/learners/setup.md +++ b/learners/setup.md @@ -116,17 +116,12 @@ Jupyter Lab is compatible with Firefox, Chrome, Safari and Chromium-based browse Note that Internet Explorer and Edge are *not* supported. 
See the [Jupyter Lab documentation](https://jupyterlab.readthedocs.io/en/latest/getting_started/accessibility.html#compatibility-with-browsers-and-assistive-technology) for an up-to-date list of supported browsers. -To start Jupyter Lab, open a terminal (Mac/Linux) or Command Prompt (Windows) and type the command: - -```shell -jupyter lab -``` - -To start the Python interpreter without Jupyter Lab, open a terminal (Mac/Linux) or Command Prompt (Windows) +To start Jupyter Lab, open a terminal (Mac/Linux) or Command Prompt (Windows), +make sure that you have activated the virtual environment you created for this course, and type the command: ```shell -python +jupyter lab ``` ## Check your setup