📝 [Notebooks] - Install anomalib via pip in the Jupyter Notebooks (#1091)

* Fix metadata path

* Refactor btech notebook to remove clone stuff

* Refactor datamodule notebooks

* Update fastflow model api

* Add dataset symlink to notebooks

* Fix pre-commit
samet-akcay authored May 17, 2023
1 parent 854044d commit 19693ea
Showing 6 changed files with 823 additions and 332 deletions.
4 changes: 3 additions & 1 deletion .github/workflows/pre_merge.yml
@@ -50,7 +50,9 @@ jobs:
- name: Install Tox
run: pip install tox
- name: Link the dataset path to the dataset directory in the repository root.
run: |
ln -s $ANOMALIB_DATASET_PATH ./datasets
ln -s $ANOMALIB_DATASET_PATH ./notebooks/datasets
- name: Coverage
run: tox -e pre-merge-${{ matrix.tox-env }}
- name: Upload coverage report
213 changes: 150 additions & 63 deletions notebooks/100_datamodules/101_btech.ipynb

Large diffs are not rendered by default.

183 changes: 127 additions & 56 deletions notebooks/100_datamodules/102_mvtec.ipynb

Large diffs are not rendered by default.

113 changes: 65 additions & 48 deletions notebooks/100_datamodules/103_folder.ipynb
@@ -5,8 +5,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Use `Folder` for Custom Datasets\n",
"\n",
"# Installing Anomalib\n",
"\n",
"The easiest way to install anomalib is to use pip. You can install it from the command line using the following command:\n"
]
},
{
@@ -15,29 +18,35 @@
"metadata": {},
"outputs": [],
"source": [
"%pip install anomalib"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setting up the Dataset Directory\n",
"\n",
"This cell sets up the path to the dataset directory so that the notebook has access to the datasets.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from pathlib import Path\n",
"\n",
"# NOTE: Provide the path to the dataset root directory.\n",
"# If the dataset is not downloaded, it will be downloaded\n",
"# to this directory.\n",
"dataset_root = Path.cwd().parent / \"datasets\" / \"hazelnut_toy\""
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -49,7 +58,7 @@
"- A dataset with good and bad images as well as mask ground-truths for pixel-wise evaluation.\n",
"- A dataset with good and bad images that is already split into training and testing sets.\n",
"\n",
"To experiment with this setting, we provide a toy dataset that can be downloaded from the following [link](https://github.com/openvinotoolkit/anomalib/blob/main/docs/source/data/hazelnut_toy.zip). For the rest of the tutorial, we assume that the dataset is downloaded and extracted to `../datasets`, located in the `anomalib` directory.\n"
]
},
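{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"The cell below is a minimal sketch of how the toy dataset could be fetched and extracted if it is missing. It reuses the `dataset_root` defined earlier and assumes the blob URL above serves the raw archive when `?raw=true` is appended:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from urllib.request import urlretrieve\n",
"from zipfile import ZipFile\n",
"\n",
"# A sketch: fetch and extract the toy dataset if it is not already present.\n",
"# Assumption: appending `?raw=true` to the blob URL above serves the raw zip archive.\n",
"if not dataset_root.exists():\n",
"    zip_path, _ = urlretrieve(\"https://github.com/openvinotoolkit/anomalib/blob/main/docs/source/data/hazelnut_toy.zip?raw=true\")\n",
"    with ZipFile(zip_path) as zip_file:\n",
"        zip_file.extractall(dataset_root.parent)"
]
},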
{
@@ -64,6 +73,7 @@
"from PIL import Image\n",
"from torchvision.transforms import ToPILImage\n",
"\n",
"from anomalib.data import TaskType\n",
"from anomalib.data.folder import Folder, FolderDataset\n",
"from anomalib.data.utils import InputNormalizationMethod, get_transforms"
]
@@ -87,11 +97,11 @@
"outputs": [],
"source": [
"folder_datamodule = Folder(\n",
" root=dataset_root,\n",
" normal_dir=\"good\",\n",
" abnormal_dir=\"crack\",\n",
" task=TaskType.SEGMENTATION,\n",
" mask_dir=dataset_root / \"mask\" / \"crack\",\n",
" image_size=256,\n",
" normalization=InputNormalizationMethod.NONE, # don't apply normalization, as we want to visualize the images\n",
")\n",
@@ -121,10 +131,11 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"As can be seen above, creating the dataloaders is pretty straightforward; they can be used directly for training/testing/inference. We can visualize samples from the dataloaders as well.\n"
]
},
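{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sketch (assuming the `folder_datamodule` defined above), one way to pull a single batch from the train dataloader looks like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Set up the datamodule and fetch one batch to inspect its keys and image shape.\n",
"folder_datamodule.setup()\n",
"i, data = next(enumerate(folder_datamodule.train_dataloader()))\n",
"print(data.keys(), data[\"image\"].shape)"
]
},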
{
@@ -140,10 +151,11 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"The `Folder` data module offers much more flexibility to cater to all sorts of needs. Please refer to the documentation for more details.\n"
]
},
{
@@ -153,7 +165,7 @@
"source": [
"### Torch Dataset\n",
"\n",
"As in earlier examples, we can also create a standalone PyTorch dataset instance.\n"
]
},
{
@@ -170,7 +182,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"To create a `FolderDataset`, we need to create the albumentations object that applies transforms to the input image.\n"
]
},
{
@@ -193,10 +205,11 @@
]
},
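{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of creating such a transform with the `get_transforms` helper imported above (assuming it accepts an `image_size` argument) could look like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Build an albumentations pipeline that resizes inputs to 256x256.\n",
"transform = get_transforms(image_size=256)\n",
"print(transform)"
]
},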
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Classification Task\n"
]
},
{
@@ -206,21 +219,22 @@
"outputs": [],
"source": [
"folder_dataset_classification_train = FolderDataset(\n",
" normal_dir=dataset_root / \"good\",\n",
" abnormal_dir=dataset_root / \"crack\",\n",
" split=\"train\",\n",
" transform=transform,\n",
" task=TaskType.CLASSIFICATION,\n",
")\n",
"folder_dataset_classification_train.setup()\n",
"folder_dataset_classification_train.samples.head()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's look at the first sample in the dataset.\n"
]
},
{
@@ -234,10 +248,11 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"As can be seen above, when we choose the `classification` task and `train` split, the dataset only returns `image`. This is mainly because training only requires normal images and no labels. Now let's try the `test` split for the `classification` task.\n"
]
},
{
@@ -248,11 +263,11 @@
"source": [
"# Folder Classification Test Set\n",
"folder_dataset_classification_test = FolderDataset(\n",
" normal_dir=dataset_root / \"good\",\n",
" abnormal_dir=dataset_root / \"crack\",\n",
" split=\"test\",\n",
" transform=transform,\n",
" task=TaskType.CLASSIFICATION,\n",
")\n",
"folder_dataset_classification_test.setup()\n",
"folder_dataset_classification_test.samples.head()"
@@ -269,12 +284,13 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Segmentation Task\n",
"\n",
"It is also possible to configure the Folder dataset for the segmentation task, where the dataset object returns image and ground-truth mask.\n"
]
},
{
@@ -285,12 +301,12 @@
"source": [
"# Folder Segmentation Train Set\n",
"folder_dataset_segmentation_train = FolderDataset(\n",
" normal_dir=dataset_root / \"good\",\n",
" abnormal_dir=dataset_root / \"crack\",\n",
" split=\"train\",\n",
" transform=transform,\n",
" mask_dir=dataset_root / \"mask\" / \"crack\",\n",
" task=TaskType.SEGMENTATION,\n",
")\n",
"folder_dataset_segmentation_train.setup() # like the datamodule, the dataset needs to be set up before use\n",
"folder_dataset_segmentation_train.samples.head()"
@@ -304,12 +320,12 @@
"source": [
"# Folder Segmentation Test Set\n",
"folder_dataset_segmentation_test = FolderDataset(\n",
" normal_dir=dataset_root / \"good\",\n",
" abnormal_dir=dataset_root / \"crack\",\n",
" split=\"test\",\n",
" transform=transform,\n",
" mask_dir=dataset_root / \"mask\" / \"crack\",\n",
" task=TaskType.SEGMENTATION,\n",
")\n",
"folder_dataset_segmentation_test.setup() # like the datamodule, the dataset needs to be set up before use\n",
"folder_dataset_segmentation_test.samples.head(10)"
@@ -326,10 +342,11 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's visualize the image and the mask...\n"
]
},
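{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of such a visualization (assuming each dataset sample is a dictionary holding `image` and `mask` tensors) could look like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Convert the tensors of the first test sample back to PIL images for display.\n",
"sample = folder_dataset_segmentation_test[0]\n",
"image = ToPILImage()(sample[\"image\"])\n",
"mask = ToPILImage()(sample[\"mask\"])\n",
"display(image, mask)"
]
},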
{
@@ -361,7 +378,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
},
"orig_nbformat": 4,
"vscode": {
