Mouhcine/object detection tuto #58

Merged · 3 commits · Oct 14, 2024
2 changes: 2 additions & 0 deletions .pylintrc
@@ -14,9 +14,11 @@ disable=

R0801, # allow similar lines in 2 files
R0915, # allow too many statements
R0917, # allow too many positional arguments

W0105, # allow no effect string statement
W0102, # allow dangerous default value []
W0212, # allow access to protected member
W0511, # allow todos
W0632, # allow unbalanced-tuple-unpacking
W0221, # allow arguments override
11 changes: 6 additions & 5 deletions README.md
@@ -25,7 +25,7 @@
</div>
<br>

***Puncc*** (short for **P**redictive **un**certainty **c**alibration and **c**onformalization) is an open-source Python library. It seamlessly integrates a collection of state-of-the-art conformal prediction algorithms and associated techniques for diverse machine learning tasks, including regression, classification and anomaly detection.
***Puncc*** (short for **P**redictive **un**certainty **c**alibration and **c**onformalization) is an open-source Python library. It seamlessly integrates a collection of state-of-the-art conformal prediction algorithms and associated techniques for diverse machine learning tasks, including regression, classification, object detection and anomaly detection.
***Puncc*** can be used with any predictive model to provide rigorous uncertainty estimations.
Under data exchangeability (or *i.i.d*), the generated prediction sets are guaranteed to cover the true outputs within a user-defined error $\alpha$.
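Formally (our restatement, not a quote from the README), this is the usual marginal coverage guarantee: for a new exchangeable sample $(X_{\text{new}}, Y_{\text{new}})$ and the returned prediction set $C_{\alpha}(X_{\text{new}})$,

$$\mathbb{P}\big(Y_{\text{new}} \in C_{\alpha}(X_{\text{new}})\big) \geq 1 - \alpha .$$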

@@ -81,10 +81,11 @@ We highly recommend following the introductory tutorials to get familiar with th

| Tutorial | Description | Link |
|----------|-------------|------|
| **Introduction Tutorial** | Get started with the basics of *puncc*. | [![Open In Github](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](docs/puncc_intro.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1TC_BM7JaEYtBIq6yuYB5U4cJjeg71Tch) |
| **API Tutorial** | Learn about the API and how to use it effectively. | [![Open In Github](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](docs/api_intro.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1d06qQweM1X1eSrCnixA_MLEZil1vXewj) |
| **Tutorial on CP with PyTorch** | Understand how to apply Conformal Prediction with PyTorch. | [![Open In Github](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](docs/puncc_pytorch.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1tNO6u5rt8Bklfq7n4gv_Qyi1BvV827JA?usp=sharing) |
| **Architecture Overview** | Detailed overview of *puncc*'s architecture. | [![Open In Github](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](docs/puncc_architecture.ipynb) |
| **Introduction Tutorial** | Get started with the basics of *puncc*. | [![Open In Github](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](docs/puncc_intro.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://githubtocolab.com/deel-ai/puncc/blob/main/docs/puncc_intro.ipynb) |
| **API Tutorial** | Learn about *puncc*'s API. | [![Open In Github](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](docs/api_intro.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://githubtocolab.com/deel-ai/puncc/blob/main/docs/api_intro.ipynb) |
| **Tutorial on CP with PyTorch** | Learn how to use *puncc* with PyTorch. | [![Open In Github](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](docs/puncc_pytorch.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://githubtocolab.com/deel-ai/puncc/blob/main/docs/puncc_pytorch.ipynb) |
| **Conformal Object Detection** | Learn to conformalize an object detector. | [![Open In Github](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](docs/puncc_cod.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://githubtocolab.com/deel-ai/puncc/blob/main/docs/puncc_cod.ipynb) |
| **Architecture Overview** | Detailed overview of *puncc*'s architecture. | [![Open In Github](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](docs/puncc_architecture.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://githubtocolab.com/deel-ai/puncc/blob/main/docs/puncc_architecture.ipynb) |

## 🚀 Quickstart

24 changes: 20 additions & 4 deletions deel/puncc/api/conformalization.py
@@ -335,13 +335,29 @@ def predict(

return self._cv_cp_agg.predict(X, alpha, correction_func)

def save(self, path):
"""Serialize current conformal predictor and write it into file.

:param str path: file path.
def save(self, path, save_data=True):
"""Serialize current conformal predictor and write it to a file.

:param str path: File path.

:param bool save_data: If True, save the custom data used to
fit/calibrate the model.

"""
# Remove cached data if needed (case of IdSplitter)
is_cached = False
if not save_data and hasattr(self.splitter, "_split"):
cached = self.splitter._split
is_cached = True
self.splitter._split = None
print("\033[33m\033[1mWarning:\033[0m Custom train/calibration data removed from the"
" conformal predictor. If you want to keep them,"
" please set flag `save_data` to True.")

with open(path, "wb") as output_file:
pickle.dump(self.__dict__, output_file)
if is_cached:
self.splitter._split = cached

@staticmethod
def load(path):
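For context, a minimal usage sketch of the new `save_data` flag is given below. It is not part of the PR: the objects `cp` (a `ConformalPredictor` already fitted on calibration data) and `X_test` are assumed to exist, and the file name is illustrative.

```python
# Hedged sketch of the save/load round trip introduced above (names are illustrative).
from deel.puncc.api.conformalization import ConformalPredictor

# `cp` is assumed to be a ConformalPredictor already fitted/calibrated.
cp.save("cp.pkl", save_data=False)  # drop the cached fit/calibration data before pickling

# Restore the lightweight predictor later and use it for inference only.
cp_light = ConformalPredictor.load("cp.pkl")
prediction = cp_light.predict(X_test, alpha=0.1)  # prediction sets at miscoverage level alpha
```

Dropping the cached data keeps the pickle small; passing `save_data=True` (the default) preserves the splitter's cached train/calibration arrays so the predictor can be re-fitted after loading.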
8 changes: 8 additions & 0 deletions deel/puncc/api/prediction_sets.py
@@ -58,15 +58,19 @@ def raps_set(
:param Iterable Y_pred:
:math:`Y_{\\text{pred}} = (P_{\\text{C}_1}, ..., P_{\\text{C}_n})`
where :math:`P_{\\text{C}_i}` is the logit associated to class i.

:param ndarray scores_quantile: quantile of nonconformity scores computed
on a calibration set for a given :math:`\\alpha`

:param float lambd: positive weight associated to the regularization term
that encourages small set sizes. If :math:`\\lambda = 0`, there is no
regularization and the implementation identifies with **APS**.

:param float k_reg: class rank (ordered by descending probability) starting
from which the regularization is applied. For example, if
:math:`k_{reg} = 3`, then the fourth most likely estimated class has an
extra penalty of size :math:`\\lambda`.

:param bool rand: turn on or off the randomization term that smooths the
discrete probability mass jump when including a new class.

@@ -152,10 +156,12 @@ def raps_set_builder(
:param float lambd: positive weight associated to the regularization term
that encourages small set sizes. If :math:`\\lambda = 0`, there is no
regularization and the implementation identifies with **APS**.

:param float k_reg: class rank (ordered by descending probability) starting
from which the regularization is applied. For example, if
:math:`k_{reg} = 3`, then the fourth most likely estimated class has an
extra penalty of size :math:`\\lambda`.

:param bool rand: turn on or off the randomization term that smooths the
discrete probability mass jump when including a new class.

@@ -165,6 +171,7 @@

:raises ValueError: incorrect value of lambd or k_reg.
:raises TypeError: unsupported data types.

"""
if lambd < 0:
raise ValueError(
@@ -199,6 +206,7 @@ def constant_interval(
\gamma_{\\alpha}]

:param Iterable y_pred: predictions.

:param ndarray scores_quantile: quantile of nonconformity scores computed
on a calibration set for a given :math:`\\alpha`.

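To situate these low-level set builders, a rough end-to-end sketch using the high-level RAPS wrapper is given below. It assumes that `deel.puncc.classification.RAPS` follows the same fit/predict pattern as the other procedures and that the wrapped model must expose class probabilities; the data variables (`X_fit`, `y_fit`, `X_calib`, `y_calib`, `X_test`) and parameter values are illustrative.

```python
# Hedged sketch (assumed high-level interface, not taken from this PR).
from sklearn.ensemble import RandomForestClassifier

from deel.puncc.api.prediction import BasePredictor
from deel.puncc.classification import RAPS


class ProbaPredictor(BasePredictor):
    # RAPS scores are built from class probabilities/logits, so expose
    # predict_proba instead of hard labels (assumption about the contract).
    def predict(self, X, **kwargs):
        return self.model.predict_proba(X, **kwargs)


predictor = ProbaPredictor(RandomForestClassifier(n_estimators=100))

# lambd and k_reg are the regularization knobs documented in the docstrings above.
raps_cp = RAPS(predictor, k_reg=2, lambd=0.1)
raps_cp.fit(X_fit=X_fit, y_fit=y_fit, X_calib=X_calib, y_calib=y_calib)

# set_pred[i] is the conformal label set for X_test[i], targeting error rate alpha.
y_pred, set_pred = raps_cp.predict(X_test, alpha=0.1)
```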
73 changes: 73 additions & 0 deletions deel/puncc/api/utils.py
@@ -31,6 +31,9 @@
from typing import Optional
from typing import Tuple
from typing import Union
from scipy.optimize import linear_sum_assignment

from deel.puncc.metrics import iou

import numpy as np

@@ -479,3 +482,73 @@ def quantile_weighted(
return np.squeeze(
np.transpose(quantile_res, (*range(1, quantile_res.ndim), 0))
)


def hungarian_assignment(predicted_bboxes: np.ndarray, true_bboxes: np.ndarray, min_iou: float = 0.5):
"""
Assign predicted bounding boxes to labeled ones based on maximizing IOU.

This function relies on the Hungarian algorithm (also known as the
Kuhn-Munkres algorithm) to perform the assignment.


:param np.ndarray predicted_bboxes: Array of predicted bounding boxes with
shape (N, 4), where N is the number of predictions.
:param np.ndarray true_bboxes: Array of true bounding boxes with shape
(M, 4), where M is the number of ground-truth boxes.
:param float min_iou: Minimum IoU threshold to consider a prediction as
valid, by default 0.5.

:return: Tuple containing:
- Array of aligned predicted bounding boxes whose IoU with their matched
true box exceeds the minimum threshold.
- Array of the corresponding true bounding boxes.
- Boolean mask over the true boxes marking which matches passed the IoU
threshold.
:rtype: tuple(np.ndarray, np.ndarray, np.ndarray)

.. note::
This function pads the predicted bounding boxes to match the number of
true bounding boxes if necessary. It then calculates the IoU matrix
between true and predicted bounding boxes and performs linear sum
assignment to maximize the total IoU. Finally, it filters out the
bounding boxes that do not meet the minimum IoU threshold.

.. code-block:: python

    >>> predicted_bboxes = np.array([[10, 10, 50, 50], [20, 20, 60, 60]])
    >>> true_bboxes = np.array([[12, 12, 48, 48], [22, 22, 58, 58], [30, 30, 70, 70]])
    >>> hungarian_assignment(predicted_bboxes, true_bboxes, min_iou=0.5)
    (array([[10, 10, 50, 50], [20, 20, 60, 60]]), array([[12, 12, 48, 48], [22, 22, 58, 58]]), array([ True,  True, False]))

"""
# Pad predicted bounding boxes to match the number of labeled ones
def pad_predictions(predictions, labels):
num_preds = predictions.shape[0]
num_labels = labels.shape[0]

if num_preds < num_labels:
padded_predictions = np.zeros_like(labels)
padded_predictions[:num_preds] = predictions
else:
padded_predictions = predictions.copy()

return padded_predictions

# Pad predicted bounding boxes to match the number of true bounding boxes
padded_predictions = pad_predictions(predicted_bboxes, true_bboxes)

# Calculate IoUs between true and predicted bounding boxes
iou_matrix = np.round(iou(true_bboxes, padded_predictions), 2)

# Perform linear sum assignment to maximize the total IoU
_, best_pred_indices = linear_sum_assignment(iou_matrix, maximize=True)

# Align predicted bounding boxes with true ones based on the best assignment
aligned_predictions = padded_predictions[best_pred_indices]

# Keep only those bounding boxes that have IoU greater than the minimum threshold
valid_indices = iou(true_bboxes, aligned_predictions).diagonal() > min_iou

return aligned_predictions[valid_indices], true_bboxes[valid_indices], valid_indices
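A short usage sketch of the helper above (box coordinates are made up for illustration):

```python
# Pair detector outputs with ground-truth boxes before conformal calibration.
import numpy as np

from deel.puncc.api.utils import hungarian_assignment

preds = np.array([[10, 10, 50, 50], [200, 200, 260, 260]])
truths = np.array([[12, 12, 48, 48], [30, 30, 70, 70], [198, 202, 258, 262]])

matched_preds, matched_truths, keep = hungarian_assignment(preds, truths, min_iou=0.5)
# `keep` is a boolean mask over the ground-truth boxes; it can be reused to
# filter any per-box metadata (labels, scores) so that everything stays aligned.
```

The unmatched ground-truth box (`[30, 30, 70, 70]` here) is simply dropped, which is the intended behaviour when building a calibration set of matched prediction/label pairs.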
40 changes: 40 additions & 0 deletions deel/puncc/metrics.py
@@ -173,3 +173,43 @@ def object_detection_mean_area(y_pred: np.ndarray):
"""
x_min, y_min, x_max, y_max = np.hsplit(y_pred, 4)
return np.mean((x_max - x_min) * (y_max - y_min))


# Compute the Intersection over Union (IoU) between two sets of bounding boxes
def iou(bboxes1: np.ndarray, bboxes2: np.ndarray) -> np.ndarray:
"""
Calculates the Intersection over Union (IoU) between two sets of
bounding boxes. The IoU is calculated as the ratio between the area of
intersection and the area of union between two bounding boxes.

:param np.ndarray bboxes1: array of shape (N, 4) representing the
coordinates of N bounding boxes in the format
[x_min, y_min, x_max, y_max].
:param np.ndarray bboxes2: array of shape (M, 4) representing the
coordinates of M bounding boxes in the same format.

:return: array of shape (N, M) where entry (i, j) is the IoU between
bboxes1[i] and bboxes2[j]. Coordinates are treated as inclusive pixel
indices, hence the +1 terms in the implementation.
:rtype: np.ndarray

"""

x1_min, y1_min, x1_max, y1_max = np.split(bboxes1, 4, axis=1)
x2_min, y2_min, x2_max, y2_max = np.split(bboxes2, 4, axis=1)

inter_x_min = np.maximum(x1_min, np.transpose(x2_min))
inter_y_min = np.maximum(y1_min, np.transpose(y2_min))
inter_x_max = np.minimum(x1_max, np.transpose(x2_max))
inter_y_max = np.minimum(y1_max, np.transpose(y2_max))

inter_width = np.maximum(inter_x_max - inter_x_min + 1, 0)
inter_height = np.maximum(inter_y_max - inter_y_min + 1, 0)
inter_area = inter_width * inter_height

box1_area = (x1_max - x1_min + 1) * (y1_max - y1_min + 1)
box2_area = (x2_max - x2_min + 1) * (y2_max - y2_min + 1)

result = inter_area / (box1_area + np.transpose(box2_area) - inter_area)
return result
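A quick sketch of the pairwise IoU helper (the boxes are made up):

```python
# IoU between every pair taken from two sets of [x_min, y_min, x_max, y_max] boxes.
import numpy as np

from deel.puncc.metrics import iou

boxes_a = np.array([[0, 0, 10, 10], [5, 5, 15, 15]])
boxes_b = np.array([[0, 0, 10, 10], [20, 20, 30, 30]])

iou_matrix = iou(boxes_a, boxes_b)
# Shape (2, 2): entry [i, j] is the IoU between boxes_a[i] and boxes_b[j];
# identical boxes give 1.0 and disjoint boxes give 0.0.
print(iou_matrix)
```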
12 changes: 11 additions & 1 deletion docs/api_intro.ipynb
@@ -8,7 +8,7 @@
"source": [
"# 💻 Welcome to *puncc* API tutorial\n",
"\n",
"In this tutorial, we will see an alternative way to define conformal predictors using *puncc*'s API. We will apply such approach on the diabetes regression problem explored in the [**introduction tutorial**](puncc_intro.ipynb)</font> <sub> [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1TC_BM7JaEYtBIq6yuYB5U4cJjeg71Tch) </sub>. \n",
"In this tutorial, we will see an alternative way to define conformal predictors using *puncc*'s API. We will apply such approach on the diabetes regression problem explored in the [**introduction tutorial**](puncc_intro.ipynb)</font> <sub> [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://githubtocolab.com/deel-ai/puncc/blob/main/docs/api_intro.ipynb) </sub>. \n",
"\n",
"By the end of this notebook, you will have an overview of *puncc*'s API and can start building your own conformal predictors !\n",
"\n",
@@ -27,6 +27,16 @@
"- [📘 Documentation](https://deel-ai.github.io/puncc/index.html)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b33274ec",
"metadata": {},
"outputs": [],
"source": [
"!pip install puncc"
]
},
{
"attachments": {},
"cell_type": "markdown",
1 change: 1 addition & 0 deletions docs/assets/instances_val2017.json

Large diffs are not rendered by default.

Binary file added docs/assets/object_detection.png
Binary file modified docs/assets/puncc_architecture.png
Binary file added docs/assets/uc_object_detection.png
48 changes: 30 additions & 18 deletions docs/puncc_architecture.ipynb
@@ -8,19 +8,21 @@
"source": [
"# 💻 Welcome to the presentation of *puncc*'s architecture\n",
"\n",
"*Puncc* enables a turnkey solution and a fully customized approach to conformal prediction. It is as simple as calling the conformal prediction procedures in `deel.puncc.regression` or `deel.puncc.classification`.\n",
"\n",
"The currently implemented conformal regression procedures are the following:\n",
"* `deel.puncc.regression.SplitCP`: Split Conformal Prediction\n",
"* `deel.puncc.regression.LocallyAdaptiveCP`: Locally Adaptive Conformal Prediction\n",
"* `deel.puncc.regression.CQR`: Conformalized Quantile Regression\n",
"* `deel.puncc.regression.CvPlus`: CV + (cross-validation)\n",
"* `deel.puncc.regression.EnbPI`: Ensemble Batch Prediction Intervals method\n",
"* `deel.puncc.regression.aEnbPI`: locally adaptive Ensemble Batch Prediction Intervals method\n",
"\n",
"The currently implemented conformal classification procedures are the following:\n",
"* `deel.puncc.classification.APS`: Adaptive Prediction Sets. \n",
"* `deel.puncc.classification.RAPS`: Regularized Adaptive Prediction Sets. APS is a special case where regularization term is nulled ($\\lambda = 0$).\n",
"*Puncc* enables a turnkey solution and a fully customized approach to conformal prediction. It is as simple as calling the conformal prediction procedures from the associated module:\n",
"\n",
"\n",
"| Procedure Type | Procedure Name | Description (more details in [Theory overview](https://deel-ai.github.io/puncc/theory_overview.html)) |\n",
"|-----------------------------------------|------------------------------------------------------|-------------------------------------------------------|\n",
"| Conformal Regression | `deel.puncc.regression.SplitCP` | Split Conformal Prediction |\n",
"| Conformal Regression | `deel.puncc.regression.LocallyAdaptiveCP` | Locally Adaptive Conformal Prediction |\n",
"| Conformal Regression | `deel.puncc.regression.CQR` | Conformalized Quantile Regression |\n",
"| Conformal Regression | `deel.puncc.regression.CvPlus` | CV + (cross-validation) |\n",
"| Conformal Regression | `deel.puncc.regression.EnbPI` | Ensemble Batch Prediction Intervals method |\n",
"| Conformal Regression | `deel.puncc.regression.aEnbPI` | Locally adaptive Ensemble Batch Prediction Intervals method |\n",
"| Conformal Classification | `deel.puncc.classification.APS` | Adaptive Prediction Sets |\n",
"| Conformal Classification | `deel.puncc.classification.RAPS` | Regularized Adaptive Prediction Sets (APS is a special case where $\\lambda = 0$) |\n",
"| Conformal Anomaly Detection | `deel.puncc.anomaly_detection.SplitCAD` | Split Conformal Anomaly detection (used to control the maximum false positive rate) |\n",
"| Conformal Object Detection | `deel.puncc.object_detection.SplitBoxWise` | Box-wise split conformal object detection |\n",
"\n",
"Each of these procedures conformalize point-based or interval-based models that are wrapped in a predictor and passed as argument to the constructor. Wrapping the models in a predictor (`deel.puncc.api.prediction`) enables to work with several ML/DL libraries and data structures.\n",
"\n",
@@ -46,6 +48,16 @@
"- [📘 Documentation](https://deel-ai.github.io/puncc/index.html)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "db1e1824",
"metadata": {},
"outputs": [],
"source": [
"!pip install puncc"
]
},
{
"attachments": {},
"cell_type": "markdown",
@@ -60,7 +72,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 2,
"id": "2af0185f",
"metadata": {},
"outputs": [],
@@ -249,7 +261,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 4,
"id": "b9cc46fb",
"metadata": {},
"outputs": [],
@@ -296,7 +308,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 5,
"id": "72983b23",
"metadata": {},
"outputs": [],
@@ -321,7 +333,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 6,
"id": "b9866d85",
"metadata": {},
"outputs": [],
@@ -376,7 +388,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "punc-user-env",
"display_name": "puncc-dev-env",
"language": "python",
"name": "python3"
},