diff --git a/jupyter-book/_toc.yml b/jupyter-book/_toc.yml index 5aea17ca2..80bb88aa3 100644 --- a/jupyter-book/_toc.yml +++ b/jupyter-book/_toc.yml @@ -99,11 +99,9 @@ parts: - file: linear_models/linear_models_quiz_m4_02 - file: linear_models/linear_models_non_linear_index sections: + - file: python_scripts/linear_regression_non_linear_link - file: python_scripts/linear_models_ex_02 - file: python_scripts/linear_models_sol_02 - - file: python_scripts/linear_regression_non_linear_link - - file: python_scripts/linear_models_ex_03 - - file: python_scripts/linear_models_sol_03 - file: python_scripts/logistic_regression_non_linear - file: linear_models/linear_models_quiz_m4_03 - file: linear_models/linear_models_regularization_index @@ -111,8 +109,8 @@ parts: - file: linear_models/regularized_linear_models_slides - file: python_scripts/linear_models_regularization - file: linear_models/linear_models_quiz_m4_04 - - file: python_scripts/linear_models_ex_04 - - file: python_scripts/linear_models_sol_04 + - file: python_scripts/linear_models_ex_03 + - file: python_scripts/linear_models_sol_03 - file: linear_models/linear_models_quiz_m4_05 - file: linear_models/linear_models_wrap_up_quiz - file: linear_models/linear_models_module_take_away diff --git a/notebooks/linear_models_ex_02.ipynb b/notebooks/linear_models_ex_02.ipynb index c9c0aad96..4cf750e81 100644 --- a/notebooks/linear_models_ex_02.ipynb +++ b/notebooks/linear_models_ex_02.ipynb @@ -4,39 +4,19 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# \ud83d\udcdd Exercise M4.02\n", + "# \ud83d\udcdd Exercise M4.03\n", "\n", - "The goal of this exercise is to build an intuition on what will be the\n", - "parameters' values of a linear model when the link between the data and the\n", - "target is non-linear.\n", + "In all previous notebooks, we only used a single feature in `data`. But we\n", + "have already shown that we could add new features to make the model more\n", + "expressive by deriving new features, based on the original feature.\n", "\n", - "First, we will generate such non-linear data.\n", + "The aim of this notebook is to train a linear regression algorithm on a\n", + "dataset with more than a single feature.\n", "\n", - "
\n", - "

Tip

\n", - "

np.random.RandomState allows to create a random number generator which can\n", - "be later used to get deterministic results.

\n", - "
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "\n", - "# Set the seed for reproduction\n", - "rng = np.random.RandomState(0)\n", - "\n", - "# Generate data\n", - "n_sample = 100\n", - "data_max, data_min = 1.4, -1.4\n", - "len_data = data_max - data_min\n", - "data = rng.rand(n_sample) * len_data - len_data / 2\n", - "noise = rng.randn(n_sample) * 0.3\n", - "target = data**3 - 0.5 * data**2 + noise" + "We will load a dataset about house prices in California. The dataset consists\n", + "of 8 features regarding the demography and geography of districts in\n", + "California and the aim is to predict the median house price of each district.\n", + "We will use all 8 features to predict the target, the median house price." ] }, { @@ -45,8 +25,8 @@ "source": [ "
\n", "

Note

\n", - "

To ease the plotting, we will create a Pandas dataframe containing the data\n", - "and target

\n", + "

If you want a deeper overview regarding this dataset, you can refer to the\n", + "Appendix - Datasets description section at the end of this MOOC.

\n", "
" ] }, @@ -56,65 +36,19 @@ "metadata": {}, "outputs": [], "source": [ - "import pandas as pd\n", + "from sklearn.datasets import fetch_california_housing\n", "\n", - "full_data = pd.DataFrame({\"data\": data, \"target\": target})" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import seaborn as sns\n", - "\n", - "_ = sns.scatterplot(\n", - " data=full_data, x=\"data\", y=\"target\", color=\"black\", alpha=0.5\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "lines_to_next_cell": 2 - }, - "source": [ - "We observe that the link between the data `data` and vector `target` is\n", - "non-linear. For instance, `data` could represent the years of experience\n", - "(normalized) and `target` the salary (normalized). Therefore, the problem here\n", - "would be to infer the salary given the years of experience.\n", - "\n", - "Using the function `f` defined below, find both the `weight` and the\n", - "`intercept` that you think will lead to a good linear model. Plot both the\n", - "data and the predictions of this model." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def f(data, weight=0, intercept=0):\n", - " target_predict = weight * data + intercept\n", - " return target_predict" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Write your code here." + "data, target = fetch_california_housing(as_frame=True, return_X_y=True)\n", + "target *= 100 # rescale the target in k$\n", + "data.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Compute the mean squared error for this model" + "Now it is your turn to train a linear regression model on this dataset. First,\n", + "create a linear regression model." ] }, { @@ -130,16 +64,8 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Train a linear regression model on this dataset.\n", - "\n", - "
\n", - "

Warning

\n", - "

In scikit-learn, by convention data (also called X in the scikit-learn\n", - "documentation) should be a 2D matrix of shape (n_samples, n_features).\n", - "If data is a 1D vector, you need to reshape it into a matrix with a\n", - "single column if the vector represents a feature or a single row if the\n", - "vector represents a sample.

\n", - "
" + "Execute a cross-validation with 10 folds and use the mean absolute error (MAE)\n", + "as metric. Be sure to *return* the fitted *estimators*." ] }, { @@ -148,8 +74,6 @@ "metadata": {}, "outputs": [], "source": [ - "from sklearn.linear_model import LinearRegression\n", - "\n", "# Write your code here." ] }, @@ -157,8 +81,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Compute predictions from the linear regression model and plot both the data\n", - "and the predictions." + "Compute the mean and std of the MAE in thousands of dollars (k$)." ] }, { @@ -172,9 +95,15 @@ }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "lines_to_next_cell": 2 + }, "source": [ - "Compute the mean squared error" + "Inspect the fitted model using a box plot to show the distribution of values\n", + "for the coefficients returned from the cross-validation. Hint: use the\n", + "function\n", + "[`df.plot.box()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.plot.box.html)\n", + "to create a box plot." ] }, { diff --git a/notebooks/linear_models_ex_03.ipynb b/notebooks/linear_models_ex_03.ipynb deleted file mode 100644 index 4cf750e81..000000000 --- a/notebooks/linear_models_ex_03.ipynb +++ /dev/null @@ -1,130 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# \ud83d\udcdd Exercise M4.03\n", - "\n", - "In all previous notebooks, we only used a single feature in `data`. But we\n", - "have already shown that we could add new features to make the model more\n", - "expressive by deriving new features, based on the original feature.\n", - "\n", - "The aim of this notebook is to train a linear regression algorithm on a\n", - "dataset with more than a single feature.\n", - "\n", - "We will load a dataset about house prices in California. The dataset consists\n", - "of 8 features regarding the demography and geography of districts in\n", - "California and the aim is to predict the median house price of each district.\n", - "We will use all 8 features to predict the target, the median house price." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "
\n", - "

Note

\n", - "

If you want a deeper overview regarding this dataset, you can refer to the\n", - "Appendix - Datasets description section at the end of this MOOC.

\n", - "
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from sklearn.datasets import fetch_california_housing\n", - "\n", - "data, target = fetch_california_housing(as_frame=True, return_X_y=True)\n", - "target *= 100 # rescale the target in k$\n", - "data.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now it is your turn to train a linear regression model on this dataset. First,\n", - "create a linear regression model." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Write your code here." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Execute a cross-validation with 10 folds and use the mean absolute error (MAE)\n", - "as metric. Be sure to *return* the fitted *estimators*." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Write your code here." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Compute the mean and std of the MAE in thousands of dollars (k$)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Write your code here." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "lines_to_next_cell": 2 - }, - "source": [ - "Inspect the fitted model using a box plot to show the distribution of values\n", - "for the coefficients returned from the cross-validation. Hint: use the\n", - "function\n", - "[`df.plot.box()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.plot.box.html)\n", - "to create a box plot." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Write your code here." - ] - } - ], - "metadata": { - "jupytext": { - "main_language": "python" - }, - "kernelspec": { - "display_name": "Python 3", - "name": "python3" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} \ No newline at end of file diff --git a/notebooks/linear_models_sol_02.ipynb b/notebooks/linear_models_sol_02.ipynb index d56864c4e..634c43171 100644 --- a/notebooks/linear_models_sol_02.ipynb +++ b/notebooks/linear_models_sol_02.ipynb @@ -4,39 +4,19 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# \ud83d\udcc3 Solution for Exercise M4.02\n", + "# \ud83d\udcc3 Solution for Exercise M4.03\n", "\n", - "The goal of this exercise is to build an intuition on what will be the\n", - "parameters' values of a linear model when the link between the data and the\n", - "target is non-linear.\n", + "In all previous notebooks, we only used a single feature in `data`. But we\n", + "have already shown that we could add new features to make the model more\n", + "expressive by deriving new features, based on the original feature.\n", "\n", - "First, we will generate such non-linear data.\n", + "The aim of this notebook is to train a linear regression algorithm on a\n", + "dataset with more than a single feature.\n", "\n", - "
\n", - "

Tip

\n", - "

np.random.RandomState allows to create a random number generator which can\n", - "be later used to get deterministic results.

\n", - "
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "\n", - "# Set the seed for reproduction\n", - "rng = np.random.RandomState(0)\n", - "\n", - "# Generate data\n", - "n_sample = 100\n", - "data_max, data_min = 1.4, -1.4\n", - "len_data = data_max - data_min\n", - "data = rng.rand(n_sample) * len_data - len_data / 2\n", - "noise = rng.randn(n_sample) * 0.3\n", - "target = data**3 - 0.5 * data**2 + noise" + "We will load a dataset about house prices in California. The dataset consists\n", + "of 8 features regarding the demography and geography of districts in\n", + "California and the aim is to predict the median house price of each district.\n", + "We will use all 8 features to predict the target, the median house price." ] }, { @@ -45,8 +25,8 @@ "source": [ "
\n", "

Note

\n", - "

To ease the plotting, we will create a Pandas dataframe containing the data\n", - "and target

\n", + "

If you want a deeper overview regarding this dataset, you can refer to the\n", + "Appendix - Datasets description section at the end of this MOOC.

\n", "
" ] }, @@ -56,49 +36,19 @@ "metadata": {}, "outputs": [], "source": [ - "import pandas as pd\n", - "\n", - "full_data = pd.DataFrame({\"data\": data, \"target\": target})" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import seaborn as sns\n", + "from sklearn.datasets import fetch_california_housing\n", "\n", - "_ = sns.scatterplot(\n", - " data=full_data, x=\"data\", y=\"target\", color=\"black\", alpha=0.5\n", - ")" + "data, target = fetch_california_housing(as_frame=True, return_X_y=True)\n", + "target *= 100 # rescale the target in k$\n", + "data.head()" ] }, { "cell_type": "markdown", - "metadata": { - "lines_to_next_cell": 2 - }, - "source": [ - "We observe that the link between the data `data` and vector `target` is\n", - "non-linear. For instance, `data` could represent the years of experience\n", - "(normalized) and `target` the salary (normalized). Therefore, the problem here\n", - "would be to infer the salary given the years of experience.\n", - "\n", - "Using the function `f` defined below, find both the `weight` and the\n", - "`intercept` that you think will lead to a good linear model. Plot both the\n", - "data and the predictions of this model." - ] - }, - { - "cell_type": "code", - "execution_count": null, "metadata": {}, - "outputs": [], "source": [ - "def f(data, weight=0, intercept=0):\n", - " target_predict = weight * data + intercept\n", - " return target_predict" + "Now it is your turn to train a linear regression model on this dataset. First,\n", + "create a linear regression model." ] }, { @@ -108,30 +58,17 @@ "outputs": [], "source": [ "# solution\n", - "predictions = f(data, weight=1.2, intercept=-0.2)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "tags": [ - "solution" - ] - }, - "outputs": [], - "source": [ - "ax = sns.scatterplot(\n", - " data=full_data, x=\"data\", y=\"target\", color=\"black\", alpha=0.5\n", - ")\n", - "_ = ax.plot(data, predictions)" + "from sklearn.linear_model import LinearRegression\n", + "\n", + "linear_regression = LinearRegression()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Compute the mean squared error for this model" + "Execute a cross-validation with 10 folds and use the mean absolute error (MAE)\n", + "as metric. Be sure to *return* the fitted *estimators*." ] }, { @@ -141,26 +78,24 @@ "outputs": [], "source": [ "# solution\n", - "from sklearn.metrics import mean_squared_error\n", + "from sklearn.model_selection import cross_validate\n", "\n", - "error = mean_squared_error(target, f(data, weight=1.2, intercept=-0.2))\n", - "print(f\"The MSE is {error}\")" + "cv_results = cross_validate(\n", + " linear_regression,\n", + " data,\n", + " target,\n", + " scoring=\"neg_mean_absolute_error\",\n", + " return_estimator=True,\n", + " cv=10,\n", + " n_jobs=2,\n", + ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Train a linear regression model on this dataset.\n", - "\n", - "
\n", - "

Warning

\n", - "

In scikit-learn, by convention data (also called X in the scikit-learn\n", - "documentation) should be a 2D matrix of shape (n_samples, n_features).\n", - "If data is a 1D vector, you need to reshape it into a matrix with a\n", - "single column if the vector represents a feature or a single row if the\n", - "vector represents a sample.

\n", - "
" + "Compute the mean and std of the MAE in thousands of dollars (k$)." ] }, { @@ -169,20 +104,25 @@ "metadata": {}, "outputs": [], "source": [ - "from sklearn.linear_model import LinearRegression\n", - "\n", "# solution\n", - "linear_regression = LinearRegression()\n", - "data_2d = data.reshape(-1, 1)\n", - "linear_regression.fit(data_2d, target)" + "print(\n", + " \"Mean absolute error on testing set: \"\n", + " f\"{-cv_results['test_score'].mean():.3f} k$ \u00b1 \"\n", + " f\"{cv_results['test_score'].std():.3f}\"\n", + ")" ] }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "lines_to_next_cell": 2 + }, "source": [ - "Compute predictions from the linear regression model and plot both the data\n", - "and the predictions." + "Inspect the fitted model using a box plot to show the distribution of values\n", + "for the coefficients returned from the cross-validation. Hint: use the\n", + "function\n", + "[`df.plot.box()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.plot.box.html)\n", + "to create a box plot." ] }, { @@ -192,7 +132,11 @@ "outputs": [], "source": [ "# solution\n", - "predictions = linear_regression.predict(data_2d)" + "import pandas as pd\n", + "\n", + "weights = pd.DataFrame(\n", + " [est.coef_ for est in cv_results[\"estimator\"]], columns=data.columns\n", + ")" ] }, { @@ -205,28 +149,11 @@ }, "outputs": [], "source": [ - "ax = sns.scatterplot(\n", - " data=full_data, x=\"data\", y=\"target\", color=\"black\", alpha=0.5\n", - ")\n", - "_ = ax.plot(data, predictions)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Compute the mean squared error" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# solution\n", - "error = mean_squared_error(target, predictions)\n", - "print(f\"The MSE is {error}\")" + "import matplotlib.pyplot as plt\n", + "\n", + "color = {\"whiskers\": \"black\", \"medians\": \"black\", \"caps\": \"black\"}\n", + "weights.plot.box(color=color, vert=False)\n", + "_ = plt.title(\"Value of linear regression coefficients\")" ] } ], diff --git a/notebooks/linear_models_sol_03.ipynb b/notebooks/linear_models_sol_03.ipynb deleted file mode 100644 index 634c43171..000000000 --- a/notebooks/linear_models_sol_03.ipynb +++ /dev/null @@ -1,171 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# \ud83d\udcc3 Solution for Exercise M4.03\n", - "\n", - "In all previous notebooks, we only used a single feature in `data`. But we\n", - "have already shown that we could add new features to make the model more\n", - "expressive by deriving new features, based on the original feature.\n", - "\n", - "The aim of this notebook is to train a linear regression algorithm on a\n", - "dataset with more than a single feature.\n", - "\n", - "We will load a dataset about house prices in California. The dataset consists\n", - "of 8 features regarding the demography and geography of districts in\n", - "California and the aim is to predict the median house price of each district.\n", - "We will use all 8 features to predict the target, the median house price." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "
\n", - "

Note

\n", - "

If you want a deeper overview regarding this dataset, you can refer to the\n", - "Appendix - Datasets description section at the end of this MOOC.

\n", - "
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from sklearn.datasets import fetch_california_housing\n", - "\n", - "data, target = fetch_california_housing(as_frame=True, return_X_y=True)\n", - "target *= 100 # rescale the target in k$\n", - "data.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now it is your turn to train a linear regression model on this dataset. First,\n", - "create a linear regression model." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# solution\n", - "from sklearn.linear_model import LinearRegression\n", - "\n", - "linear_regression = LinearRegression()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Execute a cross-validation with 10 folds and use the mean absolute error (MAE)\n", - "as metric. Be sure to *return* the fitted *estimators*." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# solution\n", - "from sklearn.model_selection import cross_validate\n", - "\n", - "cv_results = cross_validate(\n", - " linear_regression,\n", - " data,\n", - " target,\n", - " scoring=\"neg_mean_absolute_error\",\n", - " return_estimator=True,\n", - " cv=10,\n", - " n_jobs=2,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Compute the mean and std of the MAE in thousands of dollars (k$)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# solution\n", - "print(\n", - " \"Mean absolute error on testing set: \"\n", - " f\"{-cv_results['test_score'].mean():.3f} k$ \u00b1 \"\n", - " f\"{cv_results['test_score'].std():.3f}\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "lines_to_next_cell": 2 - }, - "source": [ - "Inspect the fitted model using a box plot to show the distribution of values\n", - "for the coefficients returned from the cross-validation. Hint: use the\n", - "function\n", - "[`df.plot.box()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.plot.box.html)\n", - "to create a box plot." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# solution\n", - "import pandas as pd\n", - "\n", - "weights = pd.DataFrame(\n", - " [est.coef_ for est in cv_results[\"estimator\"]], columns=data.columns\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "tags": [ - "solution" - ] - }, - "outputs": [], - "source": [ - "import matplotlib.pyplot as plt\n", - "\n", - "color = {\"whiskers\": \"black\", \"medians\": \"black\", \"caps\": \"black\"}\n", - "weights.plot.box(color=color, vert=False)\n", - "_ = plt.title(\"Value of linear regression coefficients\")" - ] - } - ], - "metadata": { - "jupytext": { - "main_language": "python" - }, - "kernelspec": { - "display_name": "Python 3", - "name": "python3" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} \ No newline at end of file diff --git a/python_scripts/linear_models_ex_02.py b/python_scripts/linear_models_ex_02.py index 640c44046..f58a1f0fe 100644 --- a/python_scripts/linear_models_ex_02.py +++ b/python_scripts/linear_models_ex_02.py @@ -14,100 +14,80 @@ # %% [markdown] # # 📝 Exercise M4.02 # -# The goal of this exercise is to build an intuition on what will be the -# parameters' values of a linear model when the link between the data and the -# target is non-linear. +# In the previous notebook, we showed that we can add new features based on the +# original feature to make the model more expressive, for instance `x ** 2` or `x ** 3`. +# In that case we only used a single feature in `data`. # -# First, we will generate such non-linear data. +# The aim of this notebook is to train a linear regression algorithm on a +# dataset with more than a single feature. In such a "multi-dimensional" feature +# space we can derive new features of the form `x1 * x2`, `x2 * x3`, +# etc. Products of features are usually called "non-linear or +# multiplicative interactions" between features. # -# ```{tip} -# `np.random.RandomState` allows to create a random number generator which can -# be later used to get deterministic results. -# ``` - -# %% -import numpy as np - -# Set the seed for reproduction -rng = np.random.RandomState(0) - -# Generate data -n_sample = 100 -data_max, data_min = 1.4, -1.4 -len_data = data_max - data_min -data = rng.rand(n_sample) * len_data - len_data / 2 -noise = rng.randn(n_sample) * 0.3 -target = data**3 - 0.5 * data**2 + noise +# Feature engineering can be an important step of a model pipeline as long as +# the new features are expected to be predictive. For instance, think of a +# classification model to decide if a patient has risk of developing a heart +# disease. This would depend on the patient's Body Mass Index which is defined +# as `weight / height ** 2`. +# +# We load the dataset penguins dataset. We first use a set of 3 numerical +# features to predict the target, i.e. the body mass of the penguin. # %% [markdown] # ```{note} -# To ease the plotting, we will create a Pandas dataframe containing the data -# and target +# If you want a deeper overview regarding this dataset, you can refer to the +# Appendix - Datasets description section at the end of this MOOC. 
# ``` # %% import pandas as pd -full_data = pd.DataFrame({"data": data, "target": target}) +penguins = pd.read_csv("../datasets/penguins.csv") -# %% -import seaborn as sns +columns = ["Flipper Length (mm)", "Culmen Length (mm)", "Culmen Depth (mm)"] +target_name = "Body Mass (g)" -_ = sns.scatterplot( - data=full_data, x="data", y="target", color="black", alpha=0.5 -) +# Remove lines with missing values for the columns of interest +penguins_non_missing = penguins[columns + [target_name]].dropna() -# %% [markdown] -# We observe that the link between the data `data` and vector `target` is -# non-linear. For instance, `data` could represent the years of experience -# (normalized) and `target` the salary (normalized). Therefore, the problem here -# would be to infer the salary given the years of experience. -# -# Using the function `f` defined below, find both the `weight` and the -# `intercept` that you think will lead to a good linear model. Plot both the -# data and the predictions of this model. - - -# %% -def f(data, weight=0, intercept=0): - target_predict = weight * data + intercept - return target_predict +data = penguins_non_missing[columns] +target = penguins_non_missing[target_name] +data.head() +# %% [markdown] +# Now it is your turn to train a linear regression model on this dataset. First, +# create a linear regression model. # %% # Write your code here. # %% [markdown] -# Compute the mean squared error for this model +# Execute a cross-validation with 10 folds and use the mean absolute error (MAE) +# as metric. # %% # Write your code here. # %% [markdown] -# Train a linear regression model on this dataset. -# -# ```{warning} -# In scikit-learn, by convention `data` (also called `X` in the scikit-learn -# documentation) should be a 2D matrix of shape `(n_samples, n_features)`. -# If `data` is a 1D vector, you need to reshape it into a matrix with a -# single column if the vector represents a feature or a single row if the -# vector represents a sample. -# ``` +# Compute the mean and std of the MAE in grams (g). # %% -from sklearn.linear_model import LinearRegression - # Write your code here. # %% [markdown] -# Compute predictions from the linear regression model and plot both the data -# and the predictions. +# Now create a pipeline using `make_pipeline` consisting of a +# `PolynomialFeatures` and a linear regression. Set `degree=2` and +# `interaction_only=True` to the feature engineering step. Remember not to +# include the bias to avoid redundancies with the linear's regression intercept. +# +# Use the same strategy as before to cross-validate such a pipeline. # %% # Write your code here. # %% [markdown] -# Compute the mean squared error +# Compute the mean and std of the MAE in grams (g) and compare with the results +# without feature engineering. # %% # Write your code here. diff --git a/python_scripts/linear_models_ex_03.py b/python_scripts/linear_models_ex_03.py index 3ab6949a3..9c311e817 100644 --- a/python_scripts/linear_models_ex_03.py +++ b/python_scripts/linear_models_ex_03.py @@ -14,24 +14,14 @@ # %% [markdown] # # 📝 Exercise M4.03 # -# In the previous notebook, we showed that we can add new features based on the -# original feature to make the model more expressive, for instance `x ** 2` or `x ** 3`. -# In that case we only used a single feature in `data`. +# The parameter `penalty` can control the **type** of regularization to use, +# whereas the regularization **strength** is set using the parameter `C`. 
+# Setting`penalty="none"` is equivalent to an infinitely large value of `C`. In +# this exercise, we ask you to train a logistic regression classifier using the +# `penalty="l2"` regularization (which happens to be the default in +# scikit-learn) to find by yourself the effect of the parameter `C`. # -# The aim of this notebook is to train a linear regression algorithm on a -# dataset with more than a single feature. In such a "multi-dimensional" feature -# space we can derive new features of the form `x1 * x2`, `x2 * x3`, -# etc. Products of features are usually called "non-linear or -# multiplicative interactions" between features. -# -# Feature engineering can be an important step of a model pipeline as long as -# the new features are expected to be predictive. For instance, think of a -# classification model to decide if a patient has risk of developing a heart -# disease. This would depend on the patient's Body Mass Index which is defined -# as `weight / height ** 2`. -# -# We load the dataset penguins dataset. We first use a set of 3 numerical -# features to predict the target, i.e. the body mass of the penguin. +# We start by loading the dataset. # %% [markdown] # ```{note} @@ -42,52 +32,51 @@ # %% import pandas as pd -penguins = pd.read_csv("../datasets/penguins.csv") +penguins = pd.read_csv("../datasets/penguins_classification.csv") +# only keep the Adelie and Chinstrap classes +penguins = ( + penguins.set_index("Species").loc[["Adelie", "Chinstrap"]].reset_index() +) -columns = ["Flipper Length (mm)", "Culmen Length (mm)", "Culmen Depth (mm)"] -target_name = "Body Mass (g)" +culmen_columns = ["Culmen Length (mm)", "Culmen Depth (mm)"] +target_column = "Species" -# Remove lines with missing values for the columns of interest -penguins_non_missing = penguins[columns + [target_name]].dropna() +# %% +from sklearn.model_selection import train_test_split -data = penguins_non_missing[columns] -target = penguins_non_missing[target_name] -data.head() +penguins_train, penguins_test = train_test_split(penguins, random_state=0) -# %% [markdown] -# Now it is your turn to train a linear regression model on this dataset. First, -# create a linear regression model. +data_train = penguins_train[culmen_columns] +data_test = penguins_test[culmen_columns] -# %% -# Write your code here. +target_train = penguins_train[target_column] +target_test = penguins_test[target_column] # %% [markdown] -# Execute a cross-validation with 10 folds and use the mean absolute error (MAE) -# as metric. +# First, let's create our predictive model. # %% -# Write your code here. +from sklearn.pipeline import make_pipeline +from sklearn.preprocessing import StandardScaler +from sklearn.linear_model import LogisticRegression -# %% [markdown] -# Compute the mean and std of the MAE in grams (g). - -# %% -# Write your code here. +logistic_regression = make_pipeline( + StandardScaler(), LogisticRegression(penalty="l2") +) # %% [markdown] -# Now create a pipeline using `make_pipeline` consisting of a -# `PolynomialFeatures` and a linear regression. Set `degree=2` and -# `interaction_only=True` to the feature engineering step. Remember not to -# include the bias to avoid redundancies with the linear's regression intercept. -# -# Use the same strategy as before to cross-validate such a pipeline. +# Given the following candidates for the `C` parameter, find out the impact of +# `C` on the classifier decision boundary. 
You can use +# `sklearn.inspection.DecisionBoundaryDisplay.from_estimator` to plot the +# decision function boundary. # %% +Cs = [0.01, 0.1, 1, 10] + # Write your code here. # %% [markdown] -# Compute the mean and std of the MAE in grams (g) and compare with the results -# without feature engineering. +# Look at the impact of the `C` hyperparameter on the magnitude of the weights. # %% # Write your code here. diff --git a/python_scripts/linear_models_ex_04.py b/python_scripts/linear_models_ex_04.py deleted file mode 100644 index ef365713a..000000000 --- a/python_scripts/linear_models_ex_04.py +++ /dev/null @@ -1,82 +0,0 @@ -# --- -# jupyter: -# jupytext: -# text_representation: -# extension: .py -# format_name: percent -# format_version: '1.3' -# jupytext_version: 1.14.5 -# kernelspec: -# display_name: Python 3 -# name: python3 -# --- - -# %% [markdown] -# # 📝 Exercise M4.04 -# -# The parameter `penalty` can control the **type** of regularization to use, -# whereas the regularization **strength** is set using the parameter `C`. -# Setting`penalty="none"` is equivalent to an infinitely large value of `C`. In -# this exercise, we ask you to train a logistic regression classifier using the -# `penalty="l2"` regularization (which happens to be the default in -# scikit-learn) to find by yourself the effect of the parameter `C`. -# -# We will start by loading the dataset. - -# %% [markdown] -# ```{note} -# If you want a deeper overview regarding this dataset, you can refer to the -# Appendix - Datasets description section at the end of this MOOC. -# ``` - -# %% -import pandas as pd - -penguins = pd.read_csv("../datasets/penguins_classification.csv") -# only keep the Adelie and Chinstrap classes -penguins = ( - penguins.set_index("Species").loc[["Adelie", "Chinstrap"]].reset_index() -) - -culmen_columns = ["Culmen Length (mm)", "Culmen Depth (mm)"] -target_column = "Species" - -# %% -from sklearn.model_selection import train_test_split - -penguins_train, penguins_test = train_test_split(penguins, random_state=0) - -data_train = penguins_train[culmen_columns] -data_test = penguins_test[culmen_columns] - -target_train = penguins_train[target_column] -target_test = penguins_test[target_column] - -# %% [markdown] -# First, let's create our predictive model. - -# %% -from sklearn.pipeline import make_pipeline -from sklearn.preprocessing import StandardScaler -from sklearn.linear_model import LogisticRegression - -logistic_regression = make_pipeline( - StandardScaler(), LogisticRegression(penalty="l2") -) - -# %% [markdown] -# Given the following candidates for the `C` parameter, find out the impact of -# `C` on the classifier decision boundary. You can use -# `sklearn.inspection.DecisionBoundaryDisplay.from_estimator` to plot the -# decision function boundary. - -# %% -Cs = [0.01, 0.1, 1, 10] - -# Write your code here. - -# %% [markdown] -# Look at the impact of the `C` hyperparameter on the magnitude of the weights. - -# %% -# Write your code here. diff --git a/python_scripts/linear_models_sol_02.py b/python_scripts/linear_models_sol_02.py index d62a4b983..3abc476da 100644 --- a/python_scripts/linear_models_sol_02.py +++ b/python_scripts/linear_models_sol_02.py @@ -8,123 +8,127 @@ # %% [markdown] # # 📃 Solution for Exercise M4.02 # -# The goal of this exercise is to build an intuition on what will be the -# parameters' values of a linear model when the link between the data and the -# target is non-linear. 
+# In the previous notebook, we showed that we can add new features based on the +# original feature to make the model more expressive, for instance `x ** 2` or `x ** 3`. +# In that case we only used a single feature in `data`. # -# First, we will generate such non-linear data. +# The aim of this notebook is to train a linear regression algorithm on a +# dataset with more than a single feature. In such a "multi-dimensional" feature +# space we can derive new features of the form `x1 * x2`, `x2 * x3`, +# etc. Products of features are usually called "non-linear or +# multiplicative interactions" between features. # -# ```{tip} -# `np.random.RandomState` allows to create a random number generator which can -# be later used to get deterministic results. -# ``` - -# %% -import numpy as np - -# Set the seed for reproduction -rng = np.random.RandomState(0) - -# Generate data -n_sample = 100 -data_max, data_min = 1.4, -1.4 -len_data = data_max - data_min -data = rng.rand(n_sample) * len_data - len_data / 2 -noise = rng.randn(n_sample) * 0.3 -target = data**3 - 0.5 * data**2 + noise +# Feature engineering can be an important step of a model pipeline as long as +# the new features are expected to be predictive. For instance, think of a +# classification model to decide if a patient has risk of developing a heart +# disease. This would depend on the patient's Body Mass Index which is defined +# as `weight / height ** 2`. +# +# We load the dataset penguins dataset. We first use a set of 3 numerical +# features to predict the target, i.e. the body mass of the penguin. # %% [markdown] # ```{note} -# To ease the plotting, we will create a Pandas dataframe containing the data -# and target +# If you want a deeper overview regarding this dataset, you can refer to the +# Appendix - Datasets description section at the end of this MOOC. # ``` # %% import pandas as pd -full_data = pd.DataFrame({"data": data, "target": target}) - -# %% -import seaborn as sns - -_ = sns.scatterplot( - data=full_data, x="data", y="target", color="black", alpha=0.5 -) +penguins = pd.read_csv("../datasets/penguins.csv") -# %% [markdown] -# We observe that the link between the data `data` and vector `target` is -# non-linear. For instance, `data` could represent the years of experience -# (normalized) and `target` the salary (normalized). Therefore, the problem here -# would be to infer the salary given the years of experience. -# -# Using the function `f` defined below, find both the `weight` and the -# `intercept` that you think will lead to a good linear model. Plot both the -# data and the predictions of this model. +columns = ["Flipper Length (mm)", "Culmen Length (mm)", "Culmen Depth (mm)"] +target_name = "Body Mass (g)" +# Remove lines with missing values for the columns of interest +penguins_non_missing = penguins[columns + [target_name]].dropna() -# %% -def f(data, weight=0, intercept=0): - target_predict = weight * data + intercept - return target_predict +data = penguins_non_missing[columns] +target = penguins_non_missing[target_name] +data.head() +# %% [markdown] +# Now it is your turn to train a linear regression model on this dataset. First, +# create a linear regression model. 
# %% # solution -predictions = f(data, weight=1.2, intercept=-0.2) +from sklearn.linear_model import LinearRegression -# %% tags=["solution"] -ax = sns.scatterplot( - data=full_data, x="data", y="target", color="black", alpha=0.5 -) -_ = ax.plot(data, predictions) +linear_regression = LinearRegression() # %% [markdown] -# Compute the mean squared error for this model +# Execute a cross-validation with 10 folds and use the mean absolute error (MAE) +# as metric. # %% # solution -from sklearn.metrics import mean_squared_error - -error = mean_squared_error(target, f(data, weight=1.2, intercept=-0.2)) -print(f"The MSE is {error}") +from sklearn.model_selection import cross_validate + +cv_results = cross_validate( + linear_regression, + data, + target, + cv=10, + scoring="neg_mean_absolute_error", + n_jobs=2, +) # %% [markdown] -# Train a linear regression model on this dataset. -# -# ```{warning} -# In scikit-learn, by convention `data` (also called `X` in the scikit-learn -# documentation) should be a 2D matrix of shape `(n_samples, n_features)`. -# If `data` is a 1D vector, you need to reshape it into a matrix with a -# single column if the vector represents a feature or a single row if the -# vector represents a sample. -# ``` +# Compute the mean and std of the MAE in grams (g). # %% -from sklearn.linear_model import LinearRegression - # solution -linear_regression = LinearRegression() -data_2d = data.reshape(-1, 1) -linear_regression.fit(data_2d, target) +print( + "Mean absolute error on testing set with original features: " + f"{-cv_results['test_score'].mean():.3f} ± " + f"{cv_results['test_score'].std():.3f} g" +) # %% [markdown] -# Compute predictions from the linear regression model and plot both the data -# and the predictions. +# Now create a pipeline using `make_pipeline` consisting of a +# `PolynomialFeatures` and a linear regression. Set `degree=2` and +# `interaction_only=True` to the feature engineering step. Remember not to +# include the bias to avoid redundancies with the linear's regression intercept. +# +# Use the same strategy as before to cross-validate such a pipeline. # %% # solution -predictions = linear_regression.predict(data_2d) +from sklearn.preprocessing import PolynomialFeatures +from sklearn.pipeline import make_pipeline -# %% tags=["solution"] -ax = sns.scatterplot( - data=full_data, x="data", y="target", color="black", alpha=0.5 +poly_features = PolynomialFeatures( + degree=2, include_bias=False, interaction_only=True +) +linear_regression_interactions = make_pipeline( + poly_features, linear_regression +) + +cv_results = cross_validate( + linear_regression_interactions, + data, + target, + cv=10, + scoring="neg_mean_absolute_error", + n_jobs=2, ) -_ = ax.plot(data, predictions) # %% [markdown] -# Compute the mean squared error +# Compute the mean and std of the MAE in grams (g) and compare with the results +# without feature engineering. # %% # solution -error = mean_squared_error(target, predictions) -print(f"The MSE is {error}") +print( + "Mean absolute error on testing set with interactions: " + f"{-cv_results['test_score'].mean():.3f} ± " + f"{cv_results['test_score'].std():.3f} g" +) + +# %% [markdown] tags=["solution"] +# We observe that the mean absolute error is lower and less spread with the +# enriched features. In this case the "interactions" are indeed predictive. In +# the following notebook we will see what happens when the enriched features are +# non-predictive and how to deal with this case. 
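As an illustrative aside (not part of the patch itself): a quick way to see which columns the `PolynomialFeatures` step in `linear_models_sol_02.py` above actually generates is to fit it on its own and list the output feature names. The sketch below assumes the `data` dataframe with the three penguins columns defined earlier in that solution is in scope.

```python
from sklearn.preprocessing import PolynomialFeatures

# Same settings as in the solution: pairwise interactions only, no bias column
poly_features = PolynomialFeatures(
    degree=2, include_bias=False, interaction_only=True
)
poly_features.fit(data)

# With 3 input columns this lists the 3 original features followed by their
# 3 pairwise products, e.g. "Flipper Length (mm) Culmen Length (mm)"
print(poly_features.get_feature_names_out())
```

With three numerical features this yields six columns in total, which is what makes the pipeline with interactions more expressive than the plain linear regression.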
diff --git a/python_scripts/linear_models_sol_03.py b/python_scripts/linear_models_sol_03.py index 0cacfcf0d..d789c8522 100644 --- a/python_scripts/linear_models_sol_03.py +++ b/python_scripts/linear_models_sol_03.py @@ -8,24 +8,14 @@ # %% [markdown] # # 📃 Solution for Exercise M4.03 # -# In the previous notebook, we showed that we can add new features based on the -# original feature to make the model more expressive, for instance `x ** 2` or `x ** 3`. -# In that case we only used a single feature in `data`. +# The parameter `penalty` can control the **type** of regularization to use, +# whereas the regularization **strength** is set using the parameter `C`. +# Setting`penalty="none"` is equivalent to an infinitely large value of `C`. In +# this exercise, we ask you to train a logistic regression classifier using the +# `penalty="l2"` regularization (which happens to be the default in +# scikit-learn) to find by yourself the effect of the parameter `C`. # -# The aim of this notebook is to train a linear regression algorithm on a -# dataset with more than a single feature. In such a "multi-dimensional" feature -# space we can derive new features of the form `x1 * x2`, `x2 * x3`, -# etc. Products of features are usually called "non-linear or -# multiplicative interactions" between features. -# -# Feature engineering can be an important step of a model pipeline as long as -# the new features are expected to be predictive. For instance, think of a -# classification model to decide if a patient has risk of developing a heart -# disease. This would depend on the patient's Body Mass Index which is defined -# as `weight / height ** 2`. -# -# We load the dataset penguins dataset. We first use a set of 3 numerical -# features to predict the target, i.e. the body mass of the penguin. +# We start by loading the dataset. # %% [markdown] # ```{note} @@ -36,99 +26,97 @@ # %% import pandas as pd -penguins = pd.read_csv("../datasets/penguins.csv") - -columns = ["Flipper Length (mm)", "Culmen Length (mm)", "Culmen Depth (mm)"] -target_name = "Body Mass (g)" - -# Remove lines with missing values for the columns of interest -penguins_non_missing = penguins[columns + [target_name]].dropna() - -data = penguins_non_missing[columns] -target = penguins_non_missing[target_name] -data.head() +penguins = pd.read_csv("../datasets/penguins_classification.csv") +# only keep the Adelie and Chinstrap classes +penguins = ( + penguins.set_index("Species").loc[["Adelie", "Chinstrap"]].reset_index() +) -# %% [markdown] -# Now it is your turn to train a linear regression model on this dataset. First, -# create a linear regression model. +culmen_columns = ["Culmen Length (mm)", "Culmen Depth (mm)"] +target_column = "Species" # %% -# solution -from sklearn.linear_model import LinearRegression +from sklearn.model_selection import train_test_split -linear_regression = LinearRegression() +penguins_train, penguins_test = train_test_split(penguins, random_state=0) -# %% [markdown] -# Execute a cross-validation with 10 folds and use the mean absolute error (MAE) -# as metric. +data_train = penguins_train[culmen_columns] +data_test = penguins_test[culmen_columns] -# %% -# solution -from sklearn.model_selection import cross_validate - -cv_results = cross_validate( - linear_regression, - data, - target, - cv=10, - scoring="neg_mean_absolute_error", - n_jobs=2, -) +target_train = penguins_train[target_column] +target_test = penguins_test[target_column] # %% [markdown] -# Compute the mean and std of the MAE in grams (g). 
+# First, let's create our predictive model. # %% -# solution -print( - "Mean absolute error on testing set with original features: " - f"{-cv_results['test_score'].mean():.3f} ± " - f"{cv_results['test_score'].std():.3f} g" +from sklearn.pipeline import make_pipeline +from sklearn.preprocessing import StandardScaler +from sklearn.linear_model import LogisticRegression + +logistic_regression = make_pipeline( + StandardScaler(), LogisticRegression(penalty="l2") ) # %% [markdown] -# Now create a pipeline using `make_pipeline` consisting of a -# `PolynomialFeatures` and a linear regression. Set `degree=2` and -# `interaction_only=True` to the feature engineering step. Remember not to -# include the bias to avoid redundancies with the linear's regression intercept. -# -# Use the same strategy as before to cross-validate such a pipeline. +# Given the following candidates for the `C` parameter, find out the impact of +# `C` on the classifier decision boundary. You can use +# `sklearn.inspection.DecisionBoundaryDisplay.from_estimator` to plot the +# decision function boundary. # %% -# solution -from sklearn.preprocessing import PolynomialFeatures -from sklearn.pipeline import make_pipeline - -poly_features = PolynomialFeatures( - degree=2, include_bias=False, interaction_only=True -) -linear_regression_interactions = make_pipeline( - poly_features, linear_regression -) +Cs = [0.01, 0.1, 1, 10] -cv_results = cross_validate( - linear_regression_interactions, - data, - target, - cv=10, - scoring="neg_mean_absolute_error", - n_jobs=2, -) +# solution +import matplotlib.pyplot as plt +import seaborn as sns +from sklearn.inspection import DecisionBoundaryDisplay + +for C in Cs: + logistic_regression.set_params(logisticregression__C=C) + logistic_regression.fit(data_train, target_train) + accuracy = logistic_regression.score(data_test, target_test) + + DecisionBoundaryDisplay.from_estimator( + logistic_regression, + data_test, + response_method="predict", + cmap="RdBu_r", + alpha=0.5, + ) + sns.scatterplot( + data=penguins_test, + x=culmen_columns[0], + y=culmen_columns[1], + hue=target_column, + palette=["tab:red", "tab:blue"], + ) + plt.legend(bbox_to_anchor=(1.05, 0.8), loc="upper left") + plt.title(f"C: {C} \n Accuracy on the test set: {accuracy:.2f}") # %% [markdown] -# Compute the mean and std of the MAE in grams (g) and compare with the results -# without feature engineering. +# Look at the impact of the `C` hyperparameter on the magnitude of the weights. # %% # solution -print( - "Mean absolute error on testing set with interactions: " - f"{-cv_results['test_score'].mean():.3f} ± " - f"{cv_results['test_score'].std():.3f} g" -) +weights_ridge = [] +for C in Cs: + logistic_regression.set_params(logisticregression__C=C) + logistic_regression.fit(data_train, target_train) + coefs = logistic_regression[-1].coef_[0] + weights_ridge.append(pd.Series(coefs, index=culmen_columns)) + +# %% tags=["solution"] +weights_ridge = pd.concat(weights_ridge, axis=1, keys=[f"C: {C}" for C in Cs]) +weights_ridge.plot.barh() +_ = plt.title("LogisticRegression weights depending of C") # %% [markdown] tags=["solution"] -# We observe that the mean absolute error is lower and less spread with the -# enriched features. In this case the "interactions" are indeed predictive. In -# the following notebook we will see what happens when the enriched features are -# non-predictive and how to deal with this case. +# We see that a small `C` will shrink the weights values toward zero. 
It means +# that a small `C` provides a more regularized model. Thus, `C` is the inverse +# of the `alpha` coefficient in the `Ridge` model. +# +# Besides, with a strong penalty (i.e. small `C` value), the weight of the +# feature "Culmen Depth (mm)" is almost zero. It explains why the decision +# separation in the plot is almost perpendicular to the "Culmen Length (mm)" +# feature. diff --git a/python_scripts/linear_models_sol_04.py b/python_scripts/linear_models_sol_04.py deleted file mode 100644 index 358abce52..000000000 --- a/python_scripts/linear_models_sol_04.py +++ /dev/null @@ -1,122 +0,0 @@ -# --- -# jupyter: -# kernelspec: -# display_name: Python 3 -# name: python3 -# --- - -# %% [markdown] -# # 📃 Solution for Exercise M4.04 -# -# The parameter `penalty` can control the **type** of regularization to use, -# whereas the regularization **strength** is set using the parameter `C`. -# Setting`penalty="none"` is equivalent to an infinitely large value of `C`. In -# this exercise, we ask you to train a logistic regression classifier using the -# `penalty="l2"` regularization (which happens to be the default in -# scikit-learn) to find by yourself the effect of the parameter `C`. -# -# We start by loading the dataset. - -# %% [markdown] -# ```{note} -# If you want a deeper overview regarding this dataset, you can refer to the -# Appendix - Datasets description section at the end of this MOOC. -# ``` - -# %% -import pandas as pd - -penguins = pd.read_csv("../datasets/penguins_classification.csv") -# only keep the Adelie and Chinstrap classes -penguins = ( - penguins.set_index("Species").loc[["Adelie", "Chinstrap"]].reset_index() -) - -culmen_columns = ["Culmen Length (mm)", "Culmen Depth (mm)"] -target_column = "Species" - -# %% -from sklearn.model_selection import train_test_split - -penguins_train, penguins_test = train_test_split(penguins, random_state=0) - -data_train = penguins_train[culmen_columns] -data_test = penguins_test[culmen_columns] - -target_train = penguins_train[target_column] -target_test = penguins_test[target_column] - -# %% [markdown] -# First, let's create our predictive model. - -# %% -from sklearn.pipeline import make_pipeline -from sklearn.preprocessing import StandardScaler -from sklearn.linear_model import LogisticRegression - -logistic_regression = make_pipeline( - StandardScaler(), LogisticRegression(penalty="l2") -) - -# %% [markdown] -# Given the following candidates for the `C` parameter, find out the impact of -# `C` on the classifier decision boundary. You can use -# `sklearn.inspection.DecisionBoundaryDisplay.from_estimator` to plot the -# decision function boundary. - -# %% -Cs = [0.01, 0.1, 1, 10] - -# solution -import matplotlib.pyplot as plt -import seaborn as sns -from sklearn.inspection import DecisionBoundaryDisplay - -for C in Cs: - logistic_regression.set_params(logisticregression__C=C) - logistic_regression.fit(data_train, target_train) - accuracy = logistic_regression.score(data_test, target_test) - - DecisionBoundaryDisplay.from_estimator( - logistic_regression, - data_test, - response_method="predict", - cmap="RdBu_r", - alpha=0.5, - ) - sns.scatterplot( - data=penguins_test, - x=culmen_columns[0], - y=culmen_columns[1], - hue=target_column, - palette=["tab:red", "tab:blue"], - ) - plt.legend(bbox_to_anchor=(1.05, 0.8), loc="upper left") - plt.title(f"C: {C} \n Accuracy on the test set: {accuracy:.2f}") - -# %% [markdown] -# Look at the impact of the `C` hyperparameter on the magnitude of the weights. 
- -# %% -# solution -weights_ridge = [] -for C in Cs: - logistic_regression.set_params(logisticregression__C=C) - logistic_regression.fit(data_train, target_train) - coefs = logistic_regression[-1].coef_[0] - weights_ridge.append(pd.Series(coefs, index=culmen_columns)) - -# %% tags=["solution"] -weights_ridge = pd.concat(weights_ridge, axis=1, keys=[f"C: {C}" for C in Cs]) -weights_ridge.plot.barh() -_ = plt.title("LogisticRegression weights depending of C") - -# %% [markdown] tags=["solution"] -# We see that a small `C` will shrink the weights values toward zero. It means -# that a small `C` provides a more regularized model. Thus, `C` is the inverse -# of the `alpha` coefficient in the `Ridge` model. -# -# Besides, with a strong penalty (i.e. small `C` value), the weight of the -# feature "Culmen Depth (mm)" is almost zero. It explains why the decision -# separation in the plot is almost perpendicular to the "Culmen Length (mm)" -# feature.
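To make the inverse relationship between `C` and the regularization strength concrete, here is a minimal sketch that reuses the `logistic_regression` pipeline and the penguins training split defined in `linear_models_sol_03.py`; it assumes those objects are already in scope and simply reports the coefficient magnitude for each candidate `C`.

```python
import numpy as np

# Smaller C => stronger L2 penalty => coefficients shrink toward zero,
# mirroring what a larger alpha does for Ridge regression.
for C in [0.01, 0.1, 1, 10]:
    logistic_regression.set_params(logisticregression__C=C)
    logistic_regression.fit(data_train, target_train)
    coef_norm = np.linalg.norm(logistic_regression[-1].coef_)
    print(f"C={C:>5}: coefficient norm = {coef_norm:.3f}")
```

The printed norms should grow as `C` increases, matching the bar plot of weights in the solution above.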