📝 [Notebooks] - Install anomalib via pip in the Jupyter Notebooks (#1091)

* Fix metadata path

* Refactor btech notebook to remove clone stuff

* Refactor datamodule notebooks

* Update fastflow model api

* Add dataset symlink to notebooks

* Fix pre-commit
samet-akcay authored May 17, 2023
1 parent 854044d commit 19693ea
Showing 6 changed files with 823 additions and 332 deletions.
4 changes: 3 additions & 1 deletion .github/workflows/pre_merge.yml
@@ -50,7 +50,9 @@ jobs:
- name: Install Tox
run: pip install tox
- name: Link the dataset path to the dataset directory in the repository root.
run: |
ln -s $ANOMALIB_DATASET_PATH ./datasets
ln -s $ANOMALIB_DATASET_PATH ./notebooks/datasets
- name: Coverage
run: tox -e pre-merge-${{ matrix.tox-env }}
- name: Upload coverage report
213 changes: 150 additions & 63 deletions notebooks/100_datamodules/101_btech.ipynb

Large diffs are not rendered by default.

183 changes: 127 additions & 56 deletions notebooks/100_datamodules/102_mvtec.ipynb

Large diffs are not rendered by default.

113 changes: 65 additions & 48 deletions notebooks/100_datamodules/103_folder.ipynb
@@ -5,8 +5,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Use `Folder` for Custom Datasets\n",
"\n",
"# Installing Anomalib\n",
"\n",
"The easiest way to install anomalib is to use pip. You can install it from the command line using the following command:\n"
]
},
{
@@ -15,29 +18,35 @@
"metadata": {},
"outputs": [],
"source": [
"%pip install anomalib"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setting up the Dataset Directory\n",
"\n",
"This cell sets up the path to the dataset directory so that the notebook has access to the datasets.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from pathlib import Path\n",
"\n",
"# NOTE: Provide the path to the dataset root directory.\n",
"# If the dataset is not downloaded, it will be downloaded\n",
"# to this directory.\n",
"dataset_root = Path.cwd().parent / \"datasets\" / \"hazelnut_toy\""
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -49,7 +58,7 @@
"- A dataset with good and bad images as well as mask ground-truths for pixel-wise evaluation.\n",
"- A dataset with good and bad images that is already split into training and testing sets.\n",
"\n",
"To experiment with this setting, we provide a toy dataset that can be downloaded from the following [link](https://github.com/openvinotoolkit/anomalib/blob/main/docs/source/data/hazelnut_toy.zip). For the rest of the tutorial, we assume that the dataset is downloaded and extracted to `../datasets`, located in the `anomalib` directory.\n"
]
},
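{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"The cell below is a minimal sketch of how the toy dataset could be fetched and extracted if it is missing. It reuses the `dataset_root` defined earlier and assumes the blob URL above serves the raw archive when `?raw=true` is appended:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from urllib.request import urlretrieve\n",
"from zipfile import ZipFile\n",
"\n",
"# A sketch: fetch and extract the toy dataset if it is not already present.\n",
"# Assumption: appending `?raw=true` to the blob URL above serves the raw zip archive.\n",
"if not dataset_root.exists():\n",
"    zip_path, _ = urlretrieve(\"https://github.com/openvinotoolkit/anomalib/blob/main/docs/source/data/hazelnut_toy.zip?raw=true\")\n",
"    with ZipFile(zip_path) as zip_file:\n",
"        zip_file.extractall(dataset_root.parent)"
]
},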
{
@@ -64,6 +73,7 @@
"from PIL import Image\n",
"from torchvision.transforms import ToPILImage\n",
"\n",
"from anomalib.data import TaskType\n",
"from anomalib.data.folder import Folder, FolderDataset\n",
"from anomalib.data.utils import InputNormalizationMethod, get_transforms"
]
@@ -87,11 +97,11 @@
"outputs": [],
"source": [
"folder_datamodule = Folder(\n",
" root=dataset_root,\n",
" normal_dir=\"good\",\n",
" abnormal_dir=\"crack\",\n",
" task=TaskType.SEGMENTATION,\n",
" mask_dir=dataset_root / \"mask\" / \"crack\",\n",
" image_size=256,\n",
" normalization=InputNormalizationMethod.NONE, # don't apply normalization, as we want to visualize the images\n",
")\n",
@@ -121,10 +131,11 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"As can be seen above, creating the dataloaders is pretty straightforward; they can be used directly for training/testing/inference. We can visualize samples from the dataloaders as well.\n"
]
},
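{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sketch (assuming the `folder_datamodule` defined above), one way to pull a single batch from the train dataloader looks like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Set up the datamodule and fetch one batch to inspect its keys and image shape.\n",
"folder_datamodule.setup()\n",
"i, data = next(enumerate(folder_datamodule.train_dataloader()))\n",
"print(data.keys(), data[\"image\"].shape)"
]
},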
{
@@ -140,10 +151,11 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"The `Folder` data module offers much more flexibility to cater to all sorts of needs. Please refer to the documentation for more details.\n"
]
},
{
@@ -153,7 +165,7 @@
"source": [
"### Torch Dataset\n",
"\n",
"As in earlier examples, we can also create a standalone PyTorch dataset instance.\n"
]
},
{
@@ -170,7 +182,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"To create a `FolderDataset`, we need to create the albumentations object that applies transforms to the input image.\n"
]
},
{
@@ -193,10 +205,11 @@
]
},
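{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of creating such a transform with the `get_transforms` helper imported above (assuming it accepts an `image_size` argument) could look like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Build an albumentations pipeline that resizes inputs to 256x256.\n",
"transform = get_transforms(image_size=256)\n",
"print(transform)"
]
},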
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Classification Task\n"
]
},
{
@@ -206,21 +219,22 @@
"outputs": [],
"source": [
"folder_dataset_classification_train = FolderDataset(\n",
" normal_dir=dataset_root / \"good\",\n",
" abnormal_dir=dataset_root / \"crack\",\n",
" split=\"train\",\n",
" transform=transform,\n",
" task=TaskType.CLASSIFICATION,\n",
")\n",
"folder_dataset_classification_train.setup()\n",
"folder_dataset_classification_train.samples.head()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's look at the first sample in the dataset.\n"
]
},
{
@@ -234,10 +248,11 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"As can be seen above, when we choose the `classification` task and `train` split, the dataset only returns `image`. This is mainly because training only requires normal images and no labels. Now let's try the `test` split for the `classification` task.\n"
]
},
{
@@ -248,11 +263,11 @@
"source": [
"# Folder Classification Test Set\n",
"folder_dataset_classification_test = FolderDataset(\n",
" normal_dir=dataset_root / \"good\",\n",
" abnormal_dir=dataset_root / \"crack\",\n",
" split=\"test\",\n",
" transform=transform,\n",
" task=TaskType.CLASSIFICATION,\n",
")\n",
"folder_dataset_classification_test.setup()\n",
"folder_dataset_classification_test.samples.head()"
@@ -269,12 +284,13 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Segmentation Task\n",
"\n",
"It is also possible to configure the Folder dataset for the segmentation task, where the dataset object returns image and ground-truth mask.\n"
]
},
{
@@ -285,12 +301,12 @@
"source": [
"# Folder Segmentation Train Set\n",
"folder_dataset_segmentation_train = FolderDataset(\n",
" normal_dir=dataset_root / \"good\",\n",
" abnormal_dir=dataset_root / \"crack\",\n",
" split=\"train\",\n",
" transform=transform,\n",
" mask_dir=dataset_root / \"mask\" / \"crack\",\n",
" task=TaskType.SEGMENTATION,\n",
")\n",
"folder_dataset_segmentation_train.setup() # like the datamodule, the dataset needs to be set up before use\n",
"folder_dataset_segmentation_train.samples.head()"
@@ -304,12 +320,12 @@
"source": [
"# Folder Segmentation Test Set\n",
"folder_dataset_segmentation_test = FolderDataset(\n",
" normal_dir=dataset_root / \"good\",\n",
" abnormal_dir=dataset_root / \"crack\",\n",
" split=\"test\",\n",
" transform=transform,\n",
" mask_dir=dataset_root / \"mask\" / \"crack\",\n",
" task=TaskType.SEGMENTATION,\n",
")\n",
"folder_dataset_segmentation_test.setup() # like the datamodule, the dataset needs to be set up before use\n",
"folder_dataset_segmentation_test.samples.head(10)"
@@ -326,10 +342,11 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's visualize the image and the mask...\n"
]
},
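{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of such a visualization (assuming each dataset sample is a dictionary holding `image` and `mask` tensors) could look like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Convert the tensors of the first test sample back to PIL images for display.\n",
"sample = folder_dataset_segmentation_test[0]\n",
"image = ToPILImage()(sample[\"image\"])\n",
"mask = ToPILImage()(sample[\"mask\"])\n",
"display(image, mask)"
]
},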
{
@@ -361,7 +378,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
},
"orig_nbformat": 4,
"vscode": {
