✨ Merge pull request #80 from ENSTA-U2IS-AI/dev
✨ Add grouping loss, Monte-Carlo Batch Normalization, OpenImage-O, MUAD & Improve code quality
Showing 110 changed files with 2,766 additions and 1,068 deletions.
@@ -0,0 +1,172 @@
""" | ||
Training a LeNet with Monte Carlo Batch Normalization | ||
===================================================== | ||
In this tutorial, we will train a LeNet classifier on the MNIST dataset using Monte-Carlo Batch Normalization (MCBN), a post-hoc Bayesian approximation method. | ||
Training a LeNet with MCBN using TorchUncertainty models and PyTorch Lightning | ||
------------------------------------------------------------------------------ | ||
In this part, we train a LeNet with batch normalization layers, based on the model and routines already implemented in TU. | ||
1. Loading the utilities | ||
~~~~~~~~~~~~~~~~~~~~~~~~ | ||
First, we have to load the following utilities from TorchUncertainty: | ||
- the cli handler: cli_main and argument parser: init_args | ||
- the datamodule that handles dataloaders: MNISTDataModule, which lies in the torch_uncertainty.datamodule | ||
- the model: LeNet, which lies in torch_uncertainty.models | ||
- the mc-batch-norm wrapper: mc_dropout, which lies in torch_uncertainty.models | ||
- a resnet baseline to get the command line arguments: ResNet, which lies in torch_uncertainty.baselines | ||
- the classification training routine in the torch_uncertainty.training.classification module | ||
- the optimizer wrapper in the torch_uncertainty.optimization_procedures module. | ||
""" | ||
# %%
from torch_uncertainty import cli_main, init_args
from torch_uncertainty.datamodules import MNISTDataModule
from torch_uncertainty.models.lenet import lenet
from torch_uncertainty.post_processing.mc_batch_norm import MCBatchNorm
from torch_uncertainty.baselines.classification import ResNet
from torch_uncertainty.routines.classification import ClassificationSingle
from torch_uncertainty.optimization_procedures import optim_cifar10_resnet18

# %%
# We will also need to import the neural network utilities within ``torch.nn``.
#
# We also import ArgvContext to avoid using the Jupyter arguments as CLI
# arguments, and therefore avoid errors.

import os
from pathlib import Path

from torch import nn
from cli_test_helpers import ArgvContext

# %%
# 2. Creating the necessary variables
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# In the following, we define the root of the datasets and the logs, and we
# fake-parse the arguments needed for using the PyTorch Lightning Trainer.
# We also create the datamodule that handles the MNIST dataset, dataloaders
# and transforms. We create the model using the blueprint from
# torch_uncertainty.models, with BatchNorm layers that will later be wrapped
# into MCBatchNorm.
#
# Note that we specify ``num_estimators`` among the arguments: it is the
# number of stochastic forward passes performed at evaluation time.

root = Path(os.path.abspath(""))

# We mock the arguments for the trainer
with ArgvContext(
    "file.py",
    "--max_epochs",
    "2",
    "--enable_progress_bar",
    "False",
    "--num_estimators",
    "8",
):
    args = init_args(network=ResNet, datamodule=MNISTDataModule)

net_name = "logs/lenet-mnist"

# datamodule
args.root = str(root / "data")
dm = MNISTDataModule(**vars(args))


model = lenet(
    in_channels=dm.num_channels,
    num_classes=dm.num_classes,
    norm=nn.BatchNorm2d,
)

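# %%
# For intuition, the ``norm`` argument controls which normalization layers are
# inserted into the LeNet. A minimal, hypothetical LeNet-style network with
# BatchNorm layers is sketched below for exposition only; it is not
# TorchUncertainty's implementation of ``lenet``.

import torch


class LeNetBNSketch(nn.Module):
    """Illustrative LeNet-style CNN with BatchNorm layers (sketch only)."""

    def __init__(self, in_channels: int = 1, num_classes: int = 10) -> None:
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 6, kernel_size=5, padding=2),
            nn.BatchNorm2d(6),  # the kind of layer injected by ``norm=nn.BatchNorm2d``
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(6, 16, kernel_size=5),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120),
            nn.ReLU(),
            nn.Linear(120, 84),
            nn.ReLU(),
            nn.Linear(84, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))


# Quick shape check on an MNIST-sized batch: (2, 1, 28, 28) -> (2, 10)
assert LeNetBNSketch()(torch.rand(2, 1, 28, 28)).shape == (2, 10)
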
# %%
# 3. The Loss and the Training Routine
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# This is a classification problem, and we use CrossEntropyLoss as the
# (negative-log-)likelihood. We define the training routine using the
# single-model classification routine from torch_uncertainty.routines.classification.
# We provide the number of classes, the model, the loss, and the optimization
# procedure, as well as the default arguments parsed above.

baseline = ClassificationSingle(
    num_classes=dm.num_classes,
    model=model,
    loss=nn.CrossEntropyLoss,
    optimization_procedure=optim_cifar10_resnet18,
    **vars(args),
)

# %%
# 4. Gathering Everything and Training the Model
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

results = cli_main(baseline, dm, root, net_name, args)


# %%
# 5. Wrapping the Model in an MCBatchNorm
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# We can now wrap the model in an MCBatchNorm to add stochasticity to the
# predictions. We specify that the BatchNorm layers are to be converted to
# MCBatchNorm layers, and that we want to use 8 stochastic estimators.
# The amount of stochasticity is controlled by the ``mc_batch_size`` argument:
# the smaller the ``mc_batch_size``, the noisier the batch statistics and the
# more stochastic the predictions. The authors suggest 32 as a good value for
# ``mc_batch_size``, but we use 4 here to highlight the effect of stochasticity
# on the predictions.

baseline.model = MCBatchNorm(
    baseline.model, num_estimators=8, convert=True, mc_batch_size=4
)
baseline.model.fit(dm.train)
baseline.eval()

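# %%
# To build intuition for what the wrapper does, here is a rough, hypothetical
# sketch of the MCBN idea (not TorchUncertainty's implementation): before each
# stochastic forward pass, the BatchNorm statistics are re-estimated from a
# small random training mini-batch, so repeated passes on the same input
# produce different outputs. The function below assumes ``train_set`` yields
# ``(image, label)`` pairs of tensors.

import torch


def mcbn_sketch_predict(network, x, train_set, num_estimators=8, mc_batch_size=4):
    """Rough MCBN sketch: one forward pass per re-estimated set of BN statistics."""
    # Make every BatchNorm layer keep only the statistics of its last batch.
    for module in network.modules():
        if isinstance(module, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            module.momentum = 1.0
    outputs = []
    with torch.no_grad():
        for _ in range(num_estimators):
            # Draw a random training mini-batch and refresh the BN statistics.
            idx = torch.randint(len(train_set), (mc_batch_size,))
            batch = torch.stack([train_set[int(i)][0] for i in idx])
            network.train()
            network(batch)
            # Normalize the test inputs with these freshly sampled statistics.
            network.eval()
            outputs.append(network(x))
    return torch.stack(outputs)  # (num_estimators, batch_size, num_classes)


# Hypothetical usage on a batch of test images ``x``:
# mc_logits = mcbn_sketch_predict(model, x, dm.train, num_estimators=8, mc_batch_size=4)
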
# %%
# 6. Testing the Model
# ~~~~~~~~~~~~~~~~~~~~
# Now that the model is trained, let's test it on MNIST. Don't forget to call
# .eval() to enable Monte Carlo batch normalization at inference.
# In this tutorial, we plot the most uncertain images, i.e. the images for which
# the variance of the predictions is the highest.

import matplotlib.pyplot as plt
import numpy as np
import torch
import torchvision


def imshow(img):
    """Display a grid of images."""
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()


dataiter = iter(dm.val_dataloader())
images, labels = next(dataiter)

# print images
imshow(torchvision.utils.make_grid(images[:4, ...]))
print("Ground truth: ", " ".join(f"{labels[j]}" for j in range(4)))

baseline.eval()
# num_estimators x batch size x num_classes
logits = baseline(images).reshape(8, 128, 10)

probs = torch.nn.functional.softmax(logits, dim=-1)


# Show the predictions of each stochastic estimator for the 4 most uncertain images
for j in sorted(probs.var(0).sum(-1).topk(4).indices):
    values, predicted = torch.max(probs[:, j], 1)
    print(
        f"Predicted digits for the image {j}: ",
        " ".join([str(image_id.item()) for image_id in predicted]),
    )

# %%
# The predictions are mostly erroneous, which is expected since we selected
# the most uncertain images. We also see that there is stochasticity in the
# predictions, as the predictions for the same image differ depending on the
# stochastic estimator used.
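
# %%
# As a short follow-up sketch, the stochastic predictions can be aggregated into
# a single predictive distribution by averaging over the estimators; the entropy
# of this average is a common uncertainty score. This reuses the ``probs`` tensor
# computed above.

# Average the probabilities over the 8 stochastic estimators.
mean_probs = probs.mean(dim=0)  # (batch_size, num_classes)

# Predictive entropy: higher values indicate more uncertain predictions.
entropy = -(mean_probs * torch.log(mean_probs + 1e-12)).sum(dim=-1)

most_uncertain = entropy.topk(4).indices
print("Most uncertain images:", most_uncertain.tolist())
print("Predictive entropies:", [round(float(e), 3) for e in entropy[most_uncertain]])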