diff --git a/Makefile b/Makefile index b912b19ad3..dee7067210 100644 --- a/Makefile +++ b/Makefile @@ -37,6 +37,9 @@ NOTEBOOKS_TO_RUN += notebooks/what_are_recipes_and_how_to_use.ipynb NOTEBOOKS_TO_RUN += notebooks/transfer_learning_classification.ipynb NOTEBOOKS_TO_RUN += notebooks/how_to_use_knowledge_distillation_for_classification.ipynb NOTEBOOKS_TO_RUN += notebooks/PTQ_and_QAT_for_classification.ipynb +NOTEBOOKS_TO_RUN += notebooks/quickstart_segmentation.ipynb +NOTEBOOKS_TO_RUN += notebooks/segmentation_connect_custom_dataset.ipynb +NOTEBOOKS_TO_RUN += notebooks/transfer_learning_semantic_segmentation.ipynb # If there are additional notebooks that must not be executed, but still should be checked for version match, add them here NOTEBOOKS_TO_CHECK := $(NOTEBOOKS_TO_RUN) diff --git a/README.md b/README.md index 252ee4dc98..7deab15cbe 100644 --- a/README.md +++ b/README.md @@ -211,9 +211,10 @@ model = models.get("model-name", pretrained_weights="pretrained-model-name") ### Semantic Segmentation -* [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://bit.ly/3qKx9m8) [Segmentation Quick Start](https://bit.ly/3qKx9m8) -* [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://bit.ly/3qKwMbe) [Segmentation Transfer Learning](https://bit.ly/3qKwMbe) -* [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://bit.ly/3QQBVJp) [How to Connect Custom Dataset](https://bit.ly/3QQBVJp) +* [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Deci-AI/super-gradients/blob/master/notebooks/quickstart_segmentation.ipynb) [Segmentation Quick Start](https://colab.research.google.com/github/Deci-AI/super-gradients/blob/master/notebooks/quickstart_segmentation.ipynb) +* [![Open In 
Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Deci-AI/super-gradients/blob/master/notebooks/transfer_learning_semantic_segmentation.ipynb) [Segmentation Transfer Learning](https://colab.research.google.com/github/Deci-AI/super-gradients/blob/master/notebooks/transfer_learning_semantic_segmentation.ipynb) +* [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Deci-AI/super-gradients/blob/master/notebooks/segmentation_connect_custom_dataset.ipynb) [How to Connect Custom Dataset](https://colab.research.google.com/github/Deci-AI/super-gradients/blob/master/notebooks/segmentation_connect_custom_dataset.ipynb) + ### Pose Estimation diff --git a/notebooks/quickstart_segmentation.ipynb b/notebooks/quickstart_segmentation.ipynb new file mode 100644 index 0000000000..04386d76ab --- /dev/null +++ b/notebooks/quickstart_segmentation.ipynb @@ -0,0 +1,1342 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "HY_HuQbxn7X0" + }, + "source": [ + "![SG - Horizontal.png]()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "oA_p5zIsoAJQ" + }, + "source": [ + "# SuperGradients Quick Start: Semantic Segmentation\n", + "\n", + "In this tutorial we will train a PPLiteSeg model on the Supervisely semantic segmentation dataset.\n", + "\n", + "The notebook is divided into 7 sections:\n", + "1. Experiment setup\n", + "2. Dataset definition\n", + "3. Architecture definition\n", + "4. Training setup\n", + "5. Training and evaluation\n", + "6. Predict\n", + "7. Convert to ONNX/TensorRT" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "GqH4VGMroWec" + }, + "source": [ + "# Install SG" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Q8uA6AWEhHN6" + }, + "source": [ + "The cell below will install **super_gradients**, which will automatically pull in all of its dependencies." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "id": "-mm-E4xRoNEm" + }, + "outputs": [], + "source": [ + "! pip install -qq super-gradients==3.4.1" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "892xArqDsGsQ" + }, + "source": [ + "# 1. Experiment setup\n", + "We will initialize our **trainer**, which will be in charge of everything: training, evaluation, saving checkpoints, plotting, etc.\n", + "\n", + "The **experiment name** argument is important because all checkpoints, logs and tensorboards are saved in a directory with the same name. This directory will be created as a sub-directory of **ckpt_root_dir** as follows:\n", + "\n", + "```\n", + "ckpt_root_dir\n", + "├─── experiment_name_1\n", + "│ ckpt_best.pth # Model checkpoint on best epoch\n", + "│ ckpt_latest.pth # Model checkpoint on last epoch\n", + "│ average_model.pth # Model checkpoint averaged over epochs\n", + "│ events.out.tfevents.1659878383... # Tensorflow artifacts of a specific run\n", + "│ log_Aug07_11_52_48.txt # Trainer logs of a specific run\n", + "└─── experiment_name_2\n", + " ...\n", + "```\n", + "In this notebook multi-GPU training is set to `OFF`. For distributed training, multi_gpu can be set to\n", + " `MultiGPUMode.DISTRIBUTED_DATA_PARALLEL` or `MultiGPUMode.DATA_PARALLEL`." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "pl0WPz1HisFz" + }, + "source": [ + "Let's define **ckpt_root_dir** inside the Colab; later we can use it to start TensorBoard and monitor the run." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "HAff--HysJmP", + "outputId": "63e96426-a29b-4cdc-9a72-60da27d6aaa7" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "The console stream is logged into /root/sg_logs/console.log\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "[2023-11-13 11:11:11] INFO - crash_tips_setup.py - Crash tips is enabled. You can set your environment variable to CRASH_HANDLER=FALSE to disable it\n", + "[2023-11-13 11:11:11] WARNING - __init__.py - Failed to import pytorch_quantization\n", + "[2023-11-13 11:11:11] INFO - utils.py - NumExpr defaulting to 2 threads.\n", + "[2023-11-13 11:11:23] WARNING - calibrator.py - Failed to import pytorch_quantization\n", + "[2023-11-13 11:11:23] WARNING - export.py - Failed to import pytorch_quantization\n", + "[2023-11-13 11:11:23] WARNING - selective_quantization_utils.py - Failed to import pytorch_quantization\n" + ] + } + ], + "source": [ + "from super_gradients import Trainer\n", + "\n", + "CHECKPOINT_DIR = './notebook_ckpts/'\n", + "trainer = Trainer(experiment_name=\"segmentation_quick_start\", ckpt_root_dir=CHECKPOINT_DIR)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "dwVMY4gMjQSL" + }, + "source": [ + "# 2. Dataset definition\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "fpIWhnR9j2rm" + }, + "source": [ + "\n", + "For the sake of this presentation, we'll use **Supervisely** semantic segmentation dataset." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ZACgRb-qjzDJ" + }, + "source": [ + "SG trainer is fully compatible with PyTorch data loaders, so you can definitely use your own data for the experiment below if you prefer." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "6ulV6Hpao3IN" + }, + "source": [ + "## 2.1 Download data\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "mVwslNv-j-2C" + }, + "source": [ + "Feel free to change the download path by editing SUPERVISELY_DATASET_DOWNLOAD_PATH" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "id": "dfR18Rmbo00y" + }, + "outputs": [], + "source": [ + "import os\n", + "\n", + "SUPERVISELY_DATASET_DOWNLOAD_PATH=os.path.join(os.getcwd(),\"data\")\n", + "\n", + "supervisely_dataset_dir_path = os.path.join(SUPERVISELY_DATASET_DOWNLOAD_PATH, 'supervisely-persons')\n", + "\n", + "if os.path.isdir(supervisely_dataset_dir_path):\n", + " print('supervisely dataset already downloaded...')\n", + "else:\n", + " print('Downloading and extracting supervisely dataset to: ' + SUPERVISELY_DATASET_DOWNLOAD_PATH)\n", + " ! mkdir $SUPERVISELY_DATASET_DOWNLOAD_PATH\n", + " %cd $SUPERVISELY_DATASET_DOWNLOAD_PATH\n", + " ! wget https://deci-pretrained-models.s3.amazonaws.com/supervisely-persons.zip\n", + " ! unzip -qq supervisely-persons.zip" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "id": "V9ZcklupX8Qx" + }, + "source": [ + "## 2.2 Create data loaders\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "3Mk_YixjlEhj" + }, + "source": [ + "The dataloaders are instantiated with the default parameters defined in the [yaml](https://github.com/Deci-AI/super-gradients/blob/master/src/super_gradients/recipes/dataset_params/supervisely_persons_dataset_params.yaml)\n", + "file. Parameters such as batch_size, transforms, root_dir and others can be overridden by passing them as `dataset_params` and\n", + "`dataloader_params`, as implemented below." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "id": "S3BzMRhSX8Qx", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "87b5092d-fe93-4c0a-8b2e-febe215b52bd" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "supervisely dataset already downloaded...\n" + ] + } + ], + "source": [ + "from super_gradients.training import dataloaders\n", + "root_dir = supervisely_dataset_dir_path\n", + "batch_size = 8\n", + "\n", + "train_loader = dataloaders.supervisely_persons_train(dataset_params={\"root_dir\": root_dir}, dataloader_params={\"batch_size\": batch_size})\n", + "valid_loader = dataloaders.supervisely_persons_val(dataset_params={\"root_dir\": root_dir}, dataloader_params={\"batch_size\": batch_size})" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "6dHIwvs46-dk" + }, + "source": [ + "As you can see, we didn't have to pass many parameters into the dataloaders construction. That's because defaults are pre-defined for your convenience, and you might be curious to know what they are. Let's print them and see which resolution and transformations are defined." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "id": "76tzhKxi6aS-", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "3b5c8f34-673c-4f4c-d243-80e82c347f3d" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Dataloader parameters:\n", + "{'batch_size': 8, 'shuffle': True, 'drop_last': True}\n", + "Dataset parameters\n", + "{'root_dir': '/content/data/supervisely-persons', 'list_file': 'train.csv', 'cache_labels': False, 'cache_images': False, 'transforms': [{'SegRandomRescale': {'scales': [0.25, 1.0]}}, {'SegColorJitter': {'brightness': 0.5, 'contrast': 0.5, 'saturation': 0.5}}, {'SegRandomFlip': {'prob': 0.5}}, {'SegPadShortToCropSize': {'crop_size': [320, 480], 'fill_mask': 0}}, {'SegCropImageAndMask': {'crop_size': [320, 480], 'mode': 'random'}}]}\n" + ] + } + ], + "source": [ + "print('Dataloader parameters:')\n", + "print(train_loader.dataloader_params)\n", + "print('Dataset parameters')\n", + "print(train_loader.dataset.dataset_params)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "l5GcDAg_pUGJ" + }, + "source": [ + "# 3. Architecture definition\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "id": "xXPMJQCJzmb4" + }, + "outputs": [], + "source": [ + "from super_gradients.training import models\n", + "from super_gradients.common.object_names import Models\n", + "\n", + "model = models.get(model_name=Models.PP_LITE_T_SEG,\n", + " arch_params={\"use_aux_heads\": False},\n", + " num_classes=1)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "fU8orO7wlwIK" + }, + "source": [ + "SG includes implementations of many different architectures for semantic segmentation tasks that can be found [here](https://github.com/Deci-AI/super-gradients#implemented-model-architectures)." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "-oGSU3V8lqcm" + }, + "source": [ + "Create a PPLiteSeg nn.Module with a single-class segmentation head. For simplicity `use_aux_heads` is set to `False`\n", + "and the extra auxiliary heads aren't used for training.\n", + "\n", + "Other segmentation models, such as DDRNet, STDC and RegSeg, can also be used for this task.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "X-_dBewgr1dG" + }, + "source": [ + "# 4. Training setup\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "H1Rll8Orl-Dy" + }, + "source": [ + "\n", + "Here we define the training recipe. The full list of parameters can be found in [training parameters supported](https://deci-ai.github.io/super-gradients/user_guide.html#training-parameters).\n", + "\n", + "We will use an average of BCE and Dice loss for segmentation, with different learning rates for the replaced segmentation head layer and the rest of the network. This is controlled by the `multiply_head_lr` parameter, which is the multiplication factor applied to the learning rate of the newly replaced layer.\n", + "\n", + "As our `metric_to_watch`, we will monitor `target_IOU`, which is one of the components of the `BinaryIOU` torchmetrics object (the other components are `background_IOU` and `mean_IOU`, which is the mean of the background and target IOUs)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "id": "NShu3zLgr5qD" + }, + "outputs": [], + "source": [ + "from super_gradients.training.metrics.segmentation_metrics import BinaryIOU\n", + "\n", + "train_params = {\"max_epochs\": 15,\n", + " \"lr_mode\": \"cosine\",\n", + " \"initial_lr\": 0.01,\n", + " \"lr_warmup_epochs\": 5,\n", + " \"multiply_head_lr\": 10,\n", + " \"optimizer\": \"SGD\",\n", + " \"loss\": \"BCEDiceLoss\",\n", + " \"ema\": True,\n", + " \"ema_params\":\n", + " {\n", + " \"decay\": 0.9999,\n", + " \"decay_type\": \"exp\",\n", + " \"beta\": 15,\n", + " },\n", + "\n", + " \"zero_weight_decay_on_bias_and_bn\": True,\n", + " \"average_best_models\": True,\n", + " \"metric_to_watch\": \"target_IOU\",\n", + " \"greater_metric_to_watch_is_better\": True,\n", + " \"train_metrics_list\": [BinaryIOU()],\n", + " \"valid_metrics_list\": [BinaryIOU()],\n", + " \"loss_logging_items_names\": [\"loss\"]\n", + " }" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "qTECVyhcs506" + }, + "source": [ + "# 5. Training and evaluation\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "S1K5MU2kmmDb" + }, + "source": [ + "The logs and the checkpoint for the latest epoch will be kept in your experiment folder.\n", + "\n", + "To start training we'll call train(...) 
and provide it with the objects we constructed above: the model, the training parameters and the data loaders.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "id": "u6roEj9ktFTi", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "4a295f63-f0c4-43a7-c6e8-2f7ffd1b5ce2" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stderr", + "text": [ + "[2023-11-13 11:15:07] INFO - sg_trainer.py - Starting a new run with `run_id=RUN_20231113_111507_197271`\n", + "[2023-11-13 11:15:07] INFO - sg_trainer.py - Checkpoints directory: ./notebook_ckpts/segmentation_quick_start/RUN_20231113_111507_197271\n", + "[2023-11-13 11:15:07] INFO - sg_trainer.py - Using EMA with params {'decay': 0.9999, 'decay_type': 'exp', 'beta': 15}\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "The console stream is now moved to ./notebook_ckpts/segmentation_quick_start/RUN_20231113_111507_197271/console_Nov13_11_15_07.txt\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "[2023-11-13 11:15:08] INFO - sg_trainer_utils.py - TRAINING PARAMETERS:\n", + " - Mode: Single GPU\n", + " - Number of GPUs: 1 (1 available on the machine)\n", + " - Full dataset size: 2477 (len(train_set))\n", + " - Batch size per GPU: 8 (batch_size)\n", + " - Batch Accumulate: 1 (batch_accumulate)\n", + " - Total batch size: 8 (num_gpus * batch_size)\n", + " - Effective Batch size: 8 (num_gpus * batch_size * batch_accumulate)\n", + " - Iterations per epoch: 309 (len(train_loader))\n", + " - Gradient updates per epoch: 309 (len(train_loader) / batch_accumulate)\n", + "\n", + "[2023-11-13 11:15:08] INFO - sg_trainer.py - Started training for 15 epochs (0/14)\n", + "\n", + "Train epoch 0: 100%|██████████| 309/309 [02:12<00:00, 2.33it/s, BCEDiceLoss=0.4, background_IOU=0.545, gpu_mem=1.14, mean_IOU=0.609, target_IOU=0.674]\n", + "Validating: 100%|██████████| 65/65 [00:17<00:00, 3.69it/s]\n", + "[2023-11-13 11:17:39] 
INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_quick_start/RUN_20231113_111507_197271/ckpt_best.pth\n", + "[2023-11-13 11:17:39] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.6779429912567139\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 0\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.4001\n", + "│ ├── Target_iou = 0.6736\n", + "│ ├── Background_iou = 0.5448\n", + "│ └── Mean_iou = 0.6092\n", + "└── Validation\n", + " ├── Bcediceloss = 0.4166\n", + " ├── Target_iou = 0.6779\n", + " ├── Background_iou = 0.4039\n", + " └── Mean_iou = 0.5409\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 1: 100%|██████████| 309/309 [02:05<00:00, 2.46it/s, BCEDiceLoss=0.338, background_IOU=0.604, gpu_mem=1.14, mean_IOU=0.661, target_IOU=0.719]\n", + "Validating epoch 1: 100%|██████████| 65/65 [00:17<00:00, 3.69it/s]\n", + "[2023-11-13 11:20:05] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_quick_start/RUN_20231113_111507_197271/ckpt_best.pth\n", + "[2023-11-13 11:20:05] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.7205255031585693\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 1\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.3381\n", + "│ │ ├── Epoch N-1 = 0.4001 (\u001B[32m↘ -0.062\u001B[0m)\n", + "│ │ └── Best until now = 0.4001 (\u001B[32m↘ -0.062\u001B[0m)\n", + "│ ├── Target_iou = 0.7193\n", + "│ │ ├── Epoch N-1 = 0.6736 (\u001B[32m↗ 0.0457\u001B[0m)\n", + "│ │ └── Best until now = 0.6736 (\u001B[32m↗ 0.0457\u001B[0m)\n", + "│ ├── Background_iou = 0.6036\n", + "│ │ ├── Epoch N-1 = 0.5448 (\u001B[32m↗ 0.0587\u001B[0m)\n", 
+ "│ │ └── Best until now = 0.5448 (\u001B[32m↗ 0.0587\u001B[0m)\n", + "│ └── Mean_iou = 0.6614\n", + "│ ├── Epoch N-1 = 0.6092 (\u001B[32m↗ 0.0522\u001B[0m)\n", + "│ └── Best until now = 0.6092 (\u001B[32m↗ 0.0522\u001B[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.3578\n", + " │ ├── Epoch N-1 = 0.4166 (\u001B[32m↘ -0.0588\u001B[0m)\n", + " │ └── Best until now = 0.4166 (\u001B[32m↘ -0.0588\u001B[0m)\n", + " ├── Target_iou = 0.7205\n", + " │ ├── Epoch N-1 = 0.6779 (\u001B[32m↗ 0.0426\u001B[0m)\n", + " │ └── Best until now = 0.6779 (\u001B[32m↗ 0.0426\u001B[0m)\n", + " ├── Background_iou = 0.4497\n", + " │ ├── Epoch N-1 = 0.4039 (\u001B[32m↗ 0.0458\u001B[0m)\n", + " │ └── Best until now = 0.4039 (\u001B[32m↗ 0.0458\u001B[0m)\n", + " └── Mean_iou = 0.5851\n", + " ├── Epoch N-1 = 0.5409 (\u001B[32m↗ 0.0442\u001B[0m)\n", + " └── Best until now = 0.5409 (\u001B[32m↗ 0.0442\u001B[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 2: 100%|██████████| 309/309 [02:00<00:00, 2.55it/s, BCEDiceLoss=0.32, background_IOU=0.634, gpu_mem=1.14, mean_IOU=0.684, target_IOU=0.734]\n", + "Validating epoch 2: 100%|██████████| 65/65 [00:16<00:00, 3.84it/s]\n", + "[2023-11-13 11:22:24] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_quick_start/RUN_20231113_111507_197271/ckpt_best.pth\n", + "[2023-11-13 11:22:24] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.7300039529800415\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 2\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.3199\n", + "│ │ ├── Epoch N-1 = 0.3381 (\u001B[32m↘ -0.0182\u001B[0m)\n", + "│ │ └── Best until now = 0.3381 (\u001B[32m↘ -0.0182\u001B[0m)\n", + "│ ├── Target_iou = 0.734\n", + "│ │ ├── Epoch N-1 = 0.7193 (\u001B[32m↗ 
0.0147\u001B[0m)\n", + "│ │ └── Best until now = 0.7193 (\u001B[32m↗ 0.0147\u001B[0m)\n", + "│ ├── Background_iou = 0.6344\n", + "│ │ ├── Epoch N-1 = 0.6036 (\u001B[32m↗ 0.0308\u001B[0m)\n", + "│ │ └── Best until now = 0.6036 (\u001B[32m↗ 0.0308\u001B[0m)\n", + "│ └── Mean_iou = 0.6842\n", + "│ ├── Epoch N-1 = 0.6614 (\u001B[32m↗ 0.0227\u001B[0m)\n", + "│ └── Best until now = 0.6614 (\u001B[32m↗ 0.0227\u001B[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.357\n", + " │ ├── Epoch N-1 = 0.3578 (\u001B[32m↘ -0.0008\u001B[0m)\n", + " │ └── Best until now = 0.3578 (\u001B[32m↘ -0.0008\u001B[0m)\n", + " ├── Target_iou = 0.73\n", + " │ ├── Epoch N-1 = 0.7205 (\u001B[32m↗ 0.0095\u001B[0m)\n", + " │ └── Best until now = 0.7205 (\u001B[32m↗ 0.0095\u001B[0m)\n", + " ├── Background_iou = 0.4503\n", + " │ ├── Epoch N-1 = 0.4497 (\u001B[32m↗ 0.0006\u001B[0m)\n", + " │ └── Best until now = 0.4497 (\u001B[32m↗ 0.0006\u001B[0m)\n", + " └── Mean_iou = 0.5902\n", + " ├── Epoch N-1 = 0.5851 (\u001B[32m↗ 0.0051\u001B[0m)\n", + " └── Best until now = 0.5851 (\u001B[32m↗ 0.0051\u001B[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 3: 100%|██████████| 309/309 [01:59<00:00, 2.58it/s, BCEDiceLoss=0.302, background_IOU=0.645, gpu_mem=1.14, mean_IOU=0.697, target_IOU=0.75]\n", + "Validating epoch 3: 100%|██████████| 65/65 [00:16<00:00, 3.84it/s]\n", + "[2023-11-13 11:24:43] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_quick_start/RUN_20231113_111507_197271/ckpt_best.pth\n", + "[2023-11-13 11:24:43] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.7432040572166443\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 3\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.3022\n", + "│ │ ├── Epoch N-1 = 0.3199 
(\u001B[32m↘ -0.0177\u001B[0m)\n", + "│ │ └── Best until now = 0.3199 (\u001B[32m↘ -0.0177\u001B[0m)\n", + "│ ├── Target_iou = 0.7501\n", + "│ │ ├── Epoch N-1 = 0.734 (\u001B[32m↗ 0.0161\u001B[0m)\n", + "│ │ └── Best until now = 0.734 (\u001B[32m↗ 0.0161\u001B[0m)\n", + "│ ├── Background_iou = 0.6447\n", + "│ │ ├── Epoch N-1 = 0.6344 (\u001B[32m↗ 0.0103\u001B[0m)\n", + "│ │ └── Best until now = 0.6344 (\u001B[32m↗ 0.0103\u001B[0m)\n", + "│ └── Mean_iou = 0.6974\n", + "│ ├── Epoch N-1 = 0.6842 (\u001B[32m↗ 0.0132\u001B[0m)\n", + "│ └── Best until now = 0.6842 (\u001B[32m↗ 0.0132\u001B[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.3307\n", + " │ ├── Epoch N-1 = 0.357 (\u001B[32m↘ -0.0263\u001B[0m)\n", + " │ └── Best until now = 0.357 (\u001B[32m↘ -0.0263\u001B[0m)\n", + " ├── Target_iou = 0.7432\n", + " │ ├── Epoch N-1 = 0.73 (\u001B[32m↗ 0.0132\u001B[0m)\n", + " │ └── Best until now = 0.73 (\u001B[32m↗ 0.0132\u001B[0m)\n", + " ├── Background_iou = 0.4794\n", + " │ ├── Epoch N-1 = 0.4503 (\u001B[32m↗ 0.0291\u001B[0m)\n", + " │ └── Best until now = 0.4503 (\u001B[32m↗ 0.0291\u001B[0m)\n", + " └── Mean_iou = 0.6113\n", + " ├── Epoch N-1 = 0.5902 (\u001B[32m↗ 0.0212\u001B[0m)\n", + " └── Best until now = 0.5902 (\u001B[32m↗ 0.0212\u001B[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 4: 100%|██████████| 309/309 [02:00<00:00, 2.56it/s, BCEDiceLoss=0.287, background_IOU=0.67, gpu_mem=1.14, mean_IOU=0.715, target_IOU=0.76]\n", + "Validating epoch 4: 100%|██████████| 65/65 [00:17<00:00, 3.79it/s]\n", + "[2023-11-13 11:27:02] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_quick_start/RUN_20231113_111507_197271/ckpt_best.pth\n", + "[2023-11-13 11:27:02] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.7445915341377258\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + 
"===========================================================\n", + "SUMMARY OF EPOCH 4\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.2867\n", + "│ │ ├── Epoch N-1 = 0.3022 (\u001B[32m↘ -0.0155\u001B[0m)\n", + "│ │ └── Best until now = 0.3022 (\u001B[32m↘ -0.0155\u001B[0m)\n", + "│ ├── Target_iou = 0.7604\n", + "│ │ ├── Epoch N-1 = 0.7501 (\u001B[32m↗ 0.0103\u001B[0m)\n", + "│ │ └── Best until now = 0.7501 (\u001B[32m↗ 0.0103\u001B[0m)\n", + "│ ├── Background_iou = 0.6697\n", + "│ │ ├── Epoch N-1 = 0.6447 (\u001B[32m↗ 0.0251\u001B[0m)\n", + "│ │ └── Best until now = 0.6447 (\u001B[32m↗ 0.0251\u001B[0m)\n", + "│ └── Mean_iou = 0.715\n", + "│ ├── Epoch N-1 = 0.6974 (\u001B[32m↗ 0.0177\u001B[0m)\n", + "│ └── Best until now = 0.6974 (\u001B[32m↗ 0.0177\u001B[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.3281\n", + " │ ├── Epoch N-1 = 0.3307 (\u001B[32m↘ -0.0026\u001B[0m)\n", + " │ └── Best until now = 0.3307 (\u001B[32m↘ -0.0026\u001B[0m)\n", + " ├── Target_iou = 0.7446\n", + " │ ├── Epoch N-1 = 0.7432 (\u001B[32m↗ 0.0014\u001B[0m)\n", + " │ └── Best until now = 0.7432 (\u001B[32m↗ 0.0014\u001B[0m)\n", + " ├── Background_iou = 0.4869\n", + " │ ├── Epoch N-1 = 0.4794 (\u001B[32m↗ 0.0074\u001B[0m)\n", + " │ └── Best until now = 0.4794 (\u001B[32m↗ 0.0074\u001B[0m)\n", + " └── Mean_iou = 0.6157\n", + " ├── Epoch N-1 = 0.6113 (\u001B[32m↗ 0.0044\u001B[0m)\n", + " └── Best until now = 0.6113 (\u001B[32m↗ 0.0044\u001B[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 5: 100%|██████████| 309/309 [02:02<00:00, 2.53it/s, BCEDiceLoss=0.287, background_IOU=0.664, gpu_mem=1.14, mean_IOU=0.712, target_IOU=0.761]\n", + "Validating epoch 5: 100%|██████████| 65/65 [00:17<00:00, 3.75it/s]\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 5\n", + "├── 
Train\n", + "│ ├── Bcediceloss = 0.2869\n", + "│ │ ├── Epoch N-1 = 0.2867 (\u001B[31m↗ 1e-04\u001B[0m)\n", + "│ │ └── Best until now = 0.2867 (\u001B[31m↗ 1e-04\u001B[0m)\n", + "│ ├── Target_iou = 0.7606\n", + "│ │ ├── Epoch N-1 = 0.7604 (\u001B[32m↗ 0.0002\u001B[0m)\n", + "│ │ └── Best until now = 0.7604 (\u001B[32m↗ 0.0002\u001B[0m)\n", + "│ ├── Background_iou = 0.6637\n", + "│ │ ├── Epoch N-1 = 0.6697 (\u001B[31m↘ -0.0061\u001B[0m)\n", + "│ │ └── Best until now = 0.6697 (\u001B[31m↘ -0.0061\u001B[0m)\n", + "│ └── Mean_iou = 0.7121\n", + "│ ├── Epoch N-1 = 0.715 (\u001B[31m↘ -0.0029\u001B[0m)\n", + "│ └── Best until now = 0.715 (\u001B[31m↘ -0.0029\u001B[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.3339\n", + " │ ├── Epoch N-1 = 0.3281 (\u001B[31m↗ 0.0059\u001B[0m)\n", + " │ └── Best until now = 0.3281 (\u001B[31m↗ 0.0059\u001B[0m)\n", + " ├── Target_iou = 0.7402\n", + " │ ├── Epoch N-1 = 0.7446 (\u001B[31m↘ -0.0044\u001B[0m)\n", + " │ └── Best until now = 0.7446 (\u001B[31m↘ -0.0044\u001B[0m)\n", + " ├── Background_iou = 0.4593\n", + " │ ├── Epoch N-1 = 0.4869 (\u001B[31m↘ -0.0276\u001B[0m)\n", + " │ └── Best until now = 0.4869 (\u001B[31m↘ -0.0276\u001B[0m)\n", + " └── Mean_iou = 0.5997\n", + " ├── Epoch N-1 = 0.6157 (\u001B[31m↘ -0.016\u001B[0m)\n", + " └── Best until now = 0.6157 (\u001B[31m↘ -0.016\u001B[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 6: 100%|██████████| 309/309 [02:03<00:00, 2.50it/s, BCEDiceLoss=0.269, background_IOU=0.689, gpu_mem=1.14, mean_IOU=0.731, target_IOU=0.772]\n", + "Validating epoch 6: 100%|██████████| 65/65 [00:17<00:00, 3.77it/s]\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 6\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.2686\n", + "│ │ ├── Epoch N-1 = 0.2869 (\u001B[32m↘ 
-0.0183\u001B[0m)\n", + "│ │ └── Best until now = 0.2867 (\u001B[32m↘ -0.0181\u001B[0m)\n", + "│ ├── Target_iou = 0.7721\n", + "│ │ ├── Epoch N-1 = 0.7606 (\u001B[32m↗ 0.0115\u001B[0m)\n", + "│ │ └── Best until now = 0.7606 (\u001B[32m↗ 0.0115\u001B[0m)\n", + "│ ├── Background_iou = 0.6892\n", + "│ │ ├── Epoch N-1 = 0.6637 (\u001B[32m↗ 0.0255\u001B[0m)\n", + "│ │ └── Best until now = 0.6697 (\u001B[32m↗ 0.0194\u001B[0m)\n", + "│ └── Mean_iou = 0.7306\n", + "│ ├── Epoch N-1 = 0.7121 (\u001B[32m↗ 0.0185\u001B[0m)\n", + "│ └── Best until now = 0.715 (\u001B[32m↗ 0.0156\u001B[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.3278\n", + " │ ├── Epoch N-1 = 0.3339 (\u001B[32m↘ -0.0061\u001B[0m)\n", + " │ └── Best until now = 0.3281 (\u001B[32m↘ -0.0003\u001B[0m)\n", + " ├── Target_iou = 0.7431\n", + " │ ├── Epoch N-1 = 0.7402 (\u001B[32m↗ 0.003\u001B[0m)\n", + " │ └── Best until now = 0.7446 (\u001B[31m↘ -0.0015\u001B[0m)\n", + " ├── Background_iou = 0.4733\n", + " │ ├── Epoch N-1 = 0.4593 (\u001B[32m↗ 0.0139\u001B[0m)\n", + " │ └── Best until now = 0.4869 (\u001B[31m↘ -0.0136\u001B[0m)\n", + " └── Mean_iou = 0.6082\n", + " ├── Epoch N-1 = 0.5997 (\u001B[32m↗ 0.0085\u001B[0m)\n", + " └── Best until now = 0.6157 (\u001B[31m↘ -0.0075\u001B[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 7: 100%|██████████| 309/309 [02:01<00:00, 2.54it/s, BCEDiceLoss=0.259, background_IOU=0.701, gpu_mem=1.14, mean_IOU=0.741, target_IOU=0.781]\n", + "Validating epoch 7: 100%|██████████| 65/65 [00:17<00:00, 3.77it/s]\n", + "[2023-11-13 11:34:05] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_quick_start/RUN_20231113_111507_197271/ckpt_best.pth\n", + "[2023-11-13 11:34:05] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.7548585534095764\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + 
"===========================================================\n", + "SUMMARY OF EPOCH 7\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.259\n", + "│ │ ├── Epoch N-1 = 0.2686 (\u001B[32m↘ -0.0096\u001B[0m)\n", + "│ │ └── Best until now = 0.2686 (\u001B[32m↘ -0.0096\u001B[0m)\n", + "│ ├── Target_iou = 0.7808\n", + "│ │ ├── Epoch N-1 = 0.7721 (\u001B[32m↗ 0.0087\u001B[0m)\n", + "│ │ └── Best until now = 0.7721 (\u001B[32m↗ 0.0087\u001B[0m)\n", + "│ ├── Background_iou = 0.7009\n", + "│ │ ├── Epoch N-1 = 0.6892 (\u001B[32m↗ 0.0117\u001B[0m)\n", + "│ │ └── Best until now = 0.6892 (\u001B[32m↗ 0.0117\u001B[0m)\n", + "│ └── Mean_iou = 0.7409\n", + "│ ├── Epoch N-1 = 0.7306 (\u001B[32m↗ 0.0102\u001B[0m)\n", + "│ └── Best until now = 0.7306 (\u001B[32m↗ 0.0102\u001B[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.3129\n", + " │ ├── Epoch N-1 = 0.3278 (\u001B[32m↘ -0.0149\u001B[0m)\n", + " │ └── Best until now = 0.3278 (\u001B[32m↘ -0.0149\u001B[0m)\n", + " ├── Target_iou = 0.7549\n", + " │ ├── Epoch N-1 = 0.7431 (\u001B[32m↗ 0.0117\u001B[0m)\n", + " │ └── Best until now = 0.7446 (\u001B[32m↗ 0.0103\u001B[0m)\n", + " ├── Background_iou = 0.5241\n", + " │ ├── Epoch N-1 = 0.4733 (\u001B[32m↗ 0.0508\u001B[0m)\n", + " │ └── Best until now = 0.4869 (\u001B[32m↗ 0.0372\u001B[0m)\n", + " └── Mean_iou = 0.6395\n", + " ├── Epoch N-1 = 0.6082 (\u001B[32m↗ 0.0313\u001B[0m)\n", + " └── Best until now = 0.6157 (\u001B[32m↗ 0.0238\u001B[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 8: 100%|██████████| 309/309 [02:05<00:00, 2.47it/s, BCEDiceLoss=0.251, background_IOU=0.713, gpu_mem=1.14, mean_IOU=0.749, target_IOU=0.786]\n", + "Validating epoch 8: 100%|██████████| 65/65 [00:17<00:00, 3.77it/s]\n", + "[2023-11-13 11:36:30] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_quick_start/RUN_20231113_111507_197271/ckpt_best.pth\n", + 
"[2023-11-13 11:36:30] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.7585687637329102\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 8\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.251\n", + "│ │ ├── Epoch N-1 = 0.259 (\u001B[32m↘ -0.008\u001B[0m)\n", + "│ │ └── Best until now = 0.259 (\u001B[32m↘ -0.008\u001B[0m)\n", + "│ ├── Target_iou = 0.786\n", + "│ │ ├── Epoch N-1 = 0.7808 (\u001B[32m↗ 0.0052\u001B[0m)\n", + "│ │ └── Best until now = 0.7808 (\u001B[32m↗ 0.0052\u001B[0m)\n", + "│ ├── Background_iou = 0.7125\n", + "│ │ ├── Epoch N-1 = 0.7009 (\u001B[32m↗ 0.0116\u001B[0m)\n", + "│ │ └── Best until now = 0.7009 (\u001B[32m↗ 0.0116\u001B[0m)\n", + "│ └── Mean_iou = 0.7493\n", + "│ ├── Epoch N-1 = 0.7409 (\u001B[32m↗ 0.0084\u001B[0m)\n", + "│ └── Best until now = 0.7409 (\u001B[32m↗ 0.0084\u001B[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.3091\n", + " │ ├── Epoch N-1 = 0.3129 (\u001B[32m↘ -0.0039\u001B[0m)\n", + " │ └── Best until now = 0.3129 (\u001B[32m↘ -0.0039\u001B[0m)\n", + " ├── Target_iou = 0.7586\n", + " │ ├── Epoch N-1 = 0.7549 (\u001B[32m↗ 0.0037\u001B[0m)\n", + " │ └── Best until now = 0.7549 (\u001B[32m↗ 0.0037\u001B[0m)\n", + " ├── Background_iou = 0.5411\n", + " │ ├── Epoch N-1 = 0.5241 (\u001B[32m↗ 0.017\u001B[0m)\n", + " │ └── Best until now = 0.5241 (\u001B[32m↗ 0.017\u001B[0m)\n", + " └── Mean_iou = 0.6498\n", + " ├── Epoch N-1 = 0.6395 (\u001B[32m↗ 0.0103\u001B[0m)\n", + " └── Best until now = 0.6395 (\u001B[32m↗ 0.0103\u001B[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 9: 100%|██████████| 309/309 [02:02<00:00, 2.53it/s, BCEDiceLoss=0.246, background_IOU=0.713, gpu_mem=1.14, mean_IOU=0.752, target_IOU=0.791]\n", + "Validating epoch 9: 100%|██████████| 65/65 [00:17<00:00, 
3.72it/s]\n", + "[2023-11-13 11:38:53] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_quick_start/RUN_20231113_111507_197271/ckpt_best.pth\n", + "[2023-11-13 11:38:53] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.759834885597229\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 9\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.2465\n", + "│ │ ├── Epoch N-1 = 0.251 (\u001B[32m↘ -0.0045\u001B[0m)\n", + "│ │ └── Best until now = 0.251 (\u001B[32m↘ -0.0045\u001B[0m)\n", + "│ ├── Target_iou = 0.7905\n", + "│ │ ├── Epoch N-1 = 0.786 (\u001B[32m↗ 0.0045\u001B[0m)\n", + "│ │ └── Best until now = 0.786 (\u001B[32m↗ 0.0045\u001B[0m)\n", + "│ ├── Background_iou = 0.7133\n", + "│ │ ├── Epoch N-1 = 0.7125 (\u001B[32m↗ 0.0008\u001B[0m)\n", + "│ │ └── Best until now = 0.7125 (\u001B[32m↗ 0.0008\u001B[0m)\n", + "│ └── Mean_iou = 0.7519\n", + "│ ├── Epoch N-1 = 0.7493 (\u001B[32m↗ 0.0026\u001B[0m)\n", + "│ └── Best until now = 0.7493 (\u001B[32m↗ 0.0026\u001B[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.3072\n", + " │ ├── Epoch N-1 = 0.3091 (\u001B[32m↘ -0.0018\u001B[0m)\n", + " │ └── Best until now = 0.3091 (\u001B[32m↘ -0.0018\u001B[0m)\n", + " ├── Target_iou = 0.7598\n", + " │ ├── Epoch N-1 = 0.7586 (\u001B[32m↗ 0.0013\u001B[0m)\n", + " │ └── Best until now = 0.7586 (\u001B[32m↗ 0.0013\u001B[0m)\n", + " ├── Background_iou = 0.5481\n", + " │ ├── Epoch N-1 = 0.5411 (\u001B[32m↗ 0.007\u001B[0m)\n", + " │ └── Best until now = 0.5411 (\u001B[32m↗ 0.007\u001B[0m)\n", + " └── Mean_iou = 0.6539\n", + " ├── Epoch N-1 = 0.6498 (\u001B[32m↗ 0.0041\u001B[0m)\n", + " └── Best until now = 0.6498 (\u001B[32m↗ 0.0041\u001B[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 10: 100%|██████████| 309/309 
[02:03<00:00, 2.50it/s, BCEDiceLoss=0.24, background_IOU=0.723, gpu_mem=1.14, mean_IOU=0.759, target_IOU=0.796]\n", + "Validating epoch 10: 100%|██████████| 65/65 [00:17<00:00, 3.77it/s]\n", + "[2023-11-13 11:41:16] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_quick_start/RUN_20231113_111507_197271/ckpt_best.pth\n", + "[2023-11-13 11:41:16] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.7605207562446594\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 10\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.2399\n", + "│ │ ├── Epoch N-1 = 0.2465 (\u001B[32m↘ -0.0066\u001B[0m)\n", + "│ │ └── Best until now = 0.2465 (\u001B[32m↘ -0.0066\u001B[0m)\n", + "│ ├── Target_iou = 0.7956\n", + "│ │ ├── Epoch N-1 = 0.7905 (\u001B[32m↗ 0.0051\u001B[0m)\n", + "│ │ └── Best until now = 0.7905 (\u001B[32m↗ 0.0051\u001B[0m)\n", + "│ ├── Background_iou = 0.7229\n", + "│ │ ├── Epoch N-1 = 0.7133 (\u001B[32m↗ 0.0096\u001B[0m)\n", + "│ │ └── Best until now = 0.7133 (\u001B[32m↗ 0.0096\u001B[0m)\n", + "│ └── Mean_iou = 0.7593\n", + "│ ├── Epoch N-1 = 0.7519 (\u001B[32m↗ 0.0074\u001B[0m)\n", + "│ └── Best until now = 0.7519 (\u001B[32m↗ 0.0074\u001B[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.3059\n", + " │ ├── Epoch N-1 = 0.3072 (\u001B[32m↘ -0.0014\u001B[0m)\n", + " │ └── Best until now = 0.3072 (\u001B[32m↘ -0.0014\u001B[0m)\n", + " ├── Target_iou = 0.7605\n", + " │ ├── Epoch N-1 = 0.7598 (\u001B[32m↗ 0.0007\u001B[0m)\n", + " │ └── Best until now = 0.7598 (\u001B[32m↗ 0.0007\u001B[0m)\n", + " ├── Background_iou = 0.5517\n", + " │ ├── Epoch N-1 = 0.5481 (\u001B[32m↗ 0.0037\u001B[0m)\n", + " │ └── Best until now = 0.5481 (\u001B[32m↗ 0.0037\u001B[0m)\n", + " └── Mean_iou = 0.6561\n", + " ├── Epoch N-1 = 0.6539 (\u001B[32m↗ 0.0022\u001B[0m)\n", + " └── Best until now = 0.6539 (\u001B[32m↗ 0.0022\u001B[0m)\n", + 
"\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 11: 100%|██████████| 309/309 [02:01<00:00, 2.54it/s, BCEDiceLoss=0.231, background_IOU=0.733, gpu_mem=1.14, mean_IOU=0.767, target_IOU=0.801]\n", + "Validating epoch 11: 100%|██████████| 65/65 [00:17<00:00, 3.76it/s]\n", + "[2023-11-13 11:43:37] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_quick_start/RUN_20231113_111507_197271/ckpt_best.pth\n", + "[2023-11-13 11:43:37] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.7611058950424194\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 11\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.2309\n", + "│ │ ├── Epoch N-1 = 0.2399 (\u001B[32m↘ -0.009\u001B[0m)\n", + "│ │ └── Best until now = 0.2399 (\u001B[32m↘ -0.009\u001B[0m)\n", + "│ ├── Target_iou = 0.8015\n", + "│ │ ├── Epoch N-1 = 0.7956 (\u001B[32m↗ 0.0059\u001B[0m)\n", + "│ │ └── Best until now = 0.7956 (\u001B[32m↗ 0.0059\u001B[0m)\n", + "│ ├── Background_iou = 0.7333\n", + "│ │ ├── Epoch N-1 = 0.7229 (\u001B[32m↗ 0.0104\u001B[0m)\n", + "│ │ └── Best until now = 0.7229 (\u001B[32m↗ 0.0104\u001B[0m)\n", + "│ └── Mean_iou = 0.7674\n", + "│ ├── Epoch N-1 = 0.7593 (\u001B[32m↗ 0.0081\u001B[0m)\n", + "│ └── Best until now = 0.7593 (\u001B[32m↗ 0.0081\u001B[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.3046\n", + " │ ├── Epoch N-1 = 0.3059 (\u001B[32m↘ -0.0012\u001B[0m)\n", + " │ └── Best until now = 0.3059 (\u001B[32m↘ -0.0012\u001B[0m)\n", + " ├── Target_iou = 0.7611\n", + " │ ├── Epoch N-1 = 0.7605 (\u001B[32m↗ 0.0006\u001B[0m)\n", + " │ └── Best until now = 0.7605 (\u001B[32m↗ 0.0006\u001B[0m)\n", + " ├── Background_iou = 0.5546\n", + " │ ├── Epoch N-1 = 0.5517 (\u001B[32m↗ 0.0029\u001B[0m)\n", + " │ └── Best until now = 0.5517 
(\u001B[32m↗ 0.0029\u001B[0m)\n", + " └── Mean_iou = 0.6579\n", + " ├── Epoch N-1 = 0.6561 (\u001B[32m↗ 0.0017\u001B[0m)\n", + " └── Best until now = 0.6561 (\u001B[32m↗ 0.0017\u001B[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 12: 100%|██████████| 309/309 [02:03<00:00, 2.51it/s, BCEDiceLoss=0.224, background_IOU=0.736, gpu_mem=1.14, mean_IOU=0.771, target_IOU=0.807]\n", + "Validating epoch 12: 100%|██████████| 65/65 [00:17<00:00, 3.77it/s]\n", + "[2023-11-13 11:46:00] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_quick_start/RUN_20231113_111507_197271/ckpt_best.pth\n", + "[2023-11-13 11:46:00] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.7616798877716064\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 12\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.2243\n", + "│ │ ├── Epoch N-1 = 0.2309 (\u001B[32m↘ -0.0066\u001B[0m)\n", + "│ │ └── Best until now = 0.2309 (\u001B[32m↘ -0.0066\u001B[0m)\n", + "│ ├── Target_iou = 0.8068\n", + "│ │ ├── Epoch N-1 = 0.8015 (\u001B[32m↗ 0.0053\u001B[0m)\n", + "│ │ └── Best until now = 0.8015 (\u001B[32m↗ 0.0053\u001B[0m)\n", + "│ ├── Background_iou = 0.736\n", + "│ │ ├── Epoch N-1 = 0.7333 (\u001B[32m↗ 0.0027\u001B[0m)\n", + "│ │ └── Best until now = 0.7333 (\u001B[32m↗ 0.0027\u001B[0m)\n", + "│ └── Mean_iou = 0.7714\n", + "│ ├── Epoch N-1 = 0.7674 (\u001B[32m↗ 0.004\u001B[0m)\n", + "│ └── Best until now = 0.7674 (\u001B[32m↗ 0.004\u001B[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.3035\n", + " │ ├── Epoch N-1 = 0.3046 (\u001B[32m↘ -0.0012\u001B[0m)\n", + " │ └── Best until now = 0.3046 (\u001B[32m↘ -0.0012\u001B[0m)\n", + " ├── Target_iou = 0.7617\n", + " │ ├── Epoch N-1 = 0.7611 (\u001B[32m↗ 0.0006\u001B[0m)\n", + " │ └── Best 
until now = 0.7611 (\u001B[32m↗ 0.0006\u001B[0m)\n", + " ├── Background_iou = 0.5569\n", + " │ ├── Epoch N-1 = 0.5546 (\u001B[32m↗ 0.0023\u001B[0m)\n", + " │ └── Best until now = 0.5546 (\u001B[32m↗ 0.0023\u001B[0m)\n", + " └── Mean_iou = 0.6593\n", + " ├── Epoch N-1 = 0.6579 (\u001B[32m↗ 0.0014\u001B[0m)\n", + " └── Best until now = 0.6579 (\u001B[32m↗ 0.0014\u001B[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 13: 100%|██████████| 309/309 [02:01<00:00, 2.55it/s, BCEDiceLoss=0.219, background_IOU=0.745, gpu_mem=1.14, mean_IOU=0.777, target_IOU=0.81]\n", + "Validating epoch 13: 100%|██████████| 65/65 [00:17<00:00, 3.81it/s]\n", + "[2023-11-13 11:48:23] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_quick_start/RUN_20231113_111507_197271/ckpt_best.pth\n", + "[2023-11-13 11:48:23] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.7624021172523499\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 13\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.2194\n", + "│ │ ├── Epoch N-1 = 0.2243 (\u001B[32m↘ -0.0049\u001B[0m)\n", + "│ │ └── Best until now = 0.2243 (\u001B[32m↘ -0.0049\u001B[0m)\n", + "│ ├── Target_iou = 0.8097\n", + "│ │ ├── Epoch N-1 = 0.8068 (\u001B[32m↗ 0.0029\u001B[0m)\n", + "│ │ └── Best until now = 0.8068 (\u001B[32m↗ 0.0029\u001B[0m)\n", + "│ ├── Background_iou = 0.7447\n", + "│ │ ├── Epoch N-1 = 0.736 (\u001B[32m↗ 0.0086\u001B[0m)\n", + "│ │ └── Best until now = 0.736 (\u001B[32m↗ 0.0086\u001B[0m)\n", + "│ └── Mean_iou = 0.7772\n", + "│ ├── Epoch N-1 = 0.7714 (\u001B[32m↗ 0.0058\u001B[0m)\n", + "│ └── Best until now = 0.7714 (\u001B[32m↗ 0.0058\u001B[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.3024\n", + " │ ├── Epoch N-1 = 0.3035 (\u001B[32m↘ 
-0.0011\u001B[0m)\n", + " │ └── Best until now = 0.3035 (\u001B[32m↘ -0.0011\u001B[0m)\n", + " ├── Target_iou = 0.7624\n", + " │ ├── Epoch N-1 = 0.7617 (\u001B[32m↗ 0.0007\u001B[0m)\n", + " │ └── Best until now = 0.7617 (\u001B[32m↗ 0.0007\u001B[0m)\n", + " ├── Background_iou = 0.5596\n", + " │ ├── Epoch N-1 = 0.5569 (\u001B[32m↗ 0.0027\u001B[0m)\n", + " │ └── Best until now = 0.5569 (\u001B[32m↗ 0.0027\u001B[0m)\n", + " └── Mean_iou = 0.661\n", + " ├── Epoch N-1 = 0.6593 (\u001B[32m↗ 0.0017\u001B[0m)\n", + " └── Best until now = 0.6593 (\u001B[32m↗ 0.0017\u001B[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 14: 100%|██████████| 309/309 [02:01<00:00, 2.54it/s, BCEDiceLoss=0.215, background_IOU=0.748, gpu_mem=1.14, mean_IOU=0.781, target_IOU=0.813]\n", + "Validating epoch 14: 100%|██████████| 65/65 [00:17<00:00, 3.78it/s]\n", + "[2023-11-13 11:50:45] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_quick_start/RUN_20231113_111507_197271/ckpt_best.pth\n", + "[2023-11-13 11:50:45] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.763008713722229\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 14\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.2155\n", + "│ │ ├── Epoch N-1 = 0.2194 (\u001B[32m↘ -0.0039\u001B[0m)\n", + "│ │ └── Best until now = 0.2194 (\u001B[32m↘ -0.0039\u001B[0m)\n", + "│ ├── Target_iou = 0.8134\n", + "│ │ ├── Epoch N-1 = 0.8097 (\u001B[32m↗ 0.0037\u001B[0m)\n", + "│ │ └── Best until now = 0.8097 (\u001B[32m↗ 0.0037\u001B[0m)\n", + "│ ├── Background_iou = 0.7484\n", + "│ │ ├── Epoch N-1 = 0.7447 (\u001B[32m↗ 0.0038\u001B[0m)\n", + "│ │ └── Best until now = 0.7447 (\u001B[32m↗ 0.0038\u001B[0m)\n", + "│ └── Mean_iou = 0.7809\n", + "│ ├── Epoch N-1 = 0.7772 (\u001B[32m↗ 
0.0037\u001B[0m)\n", + "│ └── Best until now = 0.7772 (\u001B[32m↗ 0.0037\u001B[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.3015\n", + " │ ├── Epoch N-1 = 0.3024 (\u001B[32m↘ -0.0009\u001B[0m)\n", + " │ └── Best until now = 0.3024 (\u001B[32m↘ -0.0009\u001B[0m)\n", + " ├── Target_iou = 0.763\n", + " │ ├── Epoch N-1 = 0.7624 (\u001B[32m↗ 0.0006\u001B[0m)\n", + " │ └── Best until now = 0.7624 (\u001B[32m↗ 0.0006\u001B[0m)\n", + " ├── Background_iou = 0.5621\n", + " │ ├── Epoch N-1 = 0.5596 (\u001B[32m↗ 0.0025\u001B[0m)\n", + " │ └── Best until now = 0.5596 (\u001B[32m↗ 0.0025\u001B[0m)\n", + " └── Mean_iou = 0.6625\n", + " ├── Epoch N-1 = 0.661 (\u001B[32m↗ 0.0016\u001B[0m)\n", + " └── Best until now = 0.661 (\u001B[32m↗ 0.0016\u001B[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "[2023-11-13 11:50:47] INFO - sg_trainer.py - RUNNING ADDITIONAL TEST ON THE AVERAGED MODEL...\n", + "Validating epoch 15: 98%|█████████▊| 64/65 [00:16<00:00, 3.27it/s]" + ] + } + ], + "source": [ + "trainer.train(model=model, training_params=train_params, train_loader=train_loader, valid_loader=valid_loader)" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": { + "id": "X8BJq1crcbjl", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "661796b8-431a-4c23-ac57-9bdc579a685d" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Best checkpoint target IoU is: 0.763008713722229\n" + ] + } + ], + "source": [ + "print(\"Best checkpoint target IoU is: \" + str(trainer.best_metric))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "3Nybj15cchxd" + }, + "source": [ + "Now you can download your trained weights from this directory:" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": { + "id": "_iHsFgPSciQh" + }, + "outputs": [], + "source": [ + "print(trainer.checkpoints_dir_path)" + 
] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "yuhYeXLA18q5" + }, + "source": [ + "# 6. Predict\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "VjRA1tu1mvXQ" + }, + "source": [ + "When the training is complete you can use the trained model to get predictions on the validation set, your data or some other image. Let's load some image and\n", + "run a model inference to create a binary segmentation mask." + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": { + "id": "Ads7RyGN2JwQ", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 977 + }, + "outputId": "c99ede2d-7fdd-428a-95fe-cac9afbf508b" + }, + "outputs": [ + { + "output_type": "display_data", + "data": { + "text/plain": [ + "" + ], + "image/png": "\n" + }, + "metadata": {} + }, + { + "output_type": "display_data", + "data": { + "text/plain": [ + "" + ], + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAUAAAAHgCAAAAADx5+uYAAAG2UlEQVR4nO3dzXLTQBAA4QnF+79yOCRAnMiSVj37Y033hSoOeOfzrGxSVBFhZmZmZmZmZmZmZmYv0NvsA+z1/vHL0mdc9XDvG7+35FlXPNQW3r9WO/Bq59nX+2ylQ690lnN6Hy1z7mUOEi18EcucfJFjRCtfxCJnX+IQEVf8YonTL3CEiIt8EQucf/oBIgL4xfQJFgAkehExeYbpgJgvImaOMRkwhy9i3iBTAfP4ImLSLBMBk/lizjDzAPP9YsY4v4a/4mdd/Dr9qXtN2sB+g44eaA5g10UZO9IMwO73bORQ4wFHPKYGTjUacNBTftxYYwHHfUgOm2vo15iBXzKGvdRIwKFf0ka92LgrPPw77pjRhm3g+L8jjHnF30NeZQLfqMZs4By/Ia864kExb/0GTDdgAyde3wEv3R9w6uOv/4t3B5z88dH95af9QHVUvQV7A973+8tnnQEX8Ot8hL6AC/j1PkRXwCX8OtcTcBW/rue4/adwRF/BjoCrLGDf+gGu5NfxLCWucE/BboArLWB0PE4vwMX8+h2oyBXuVyfA5Raw25EKbWAfwUKAfeoDuOANjk6nKrWBPQS7AK65gH0qtYE93tkegJUWsNgGdnhvOwAuvYDphyu2gfnlAy69gPmV28Ds9zcdsNgC1tvA7He4HmBy2YDVbnDFDcx9jwsC5pYM+BI3OPWQbiBMQFhJwMw7XBIws1zAl/gMyc0NhNUETLwpNQETSwUs+Ah0A2lFAfPuSiZgxRtcdQPzEhCWCFjyBpfdwLR3Ow+w5gKW3cC0BIQJCEsDLPoIdANpZQGzbkxZwKyyAKs+At1AmoAwAWECwpIAX/AzJOnIbiBMQJiAMAFhAsIEhBUGzPkeUxgwJwFhOYAv+BeRrNxAmIAwAWGVAVOe3JUBUxIQJiBMQJiAMAFhAsJyACf9T+0r5AbCBIQJCBMQJiBMQJiAsMqAKd9eKwOmJCBMQJiAMAFh/ssEmBsIExAmIExAmIAwAWECwlIAC38NdANpAsIyACvfYDeQJiAsAfBlb7D/RnqFOOD
LLmBOpTcw473HgMUXsPYGZrz7FLD6AhbfwIT3HwKWX8DqG8hjgK+/gHgCNxBWHpCuIAJ8/RvMI4D6hVcYrwEAdAEjCOBt/NggXmEoeBnwNgsYbJargHfyQ9N4hWEXAe+1gGSea4B38wMTeYVhlwDvt4DXZ7oCeEe/y1NdALyn39W5fAbC2gHvuoAXJ2sGvK/ftdm8wl+7INgKeOcFjCvjuYGPNQs2At58AS/UBljAr3VEr/D3GgWbAAssYLRO2QJYw69xzgbAKn5tk/oM3KpB8DxgnQWMlmFPA5bya8grvN3pfTkLWG4Bzw58ErCc3+mRvcKwc4AFF/Ds0G7g804JngIsuYBxbu4zgFX9Tk3uFd7t/ZDwBGDdBYw4nt4NhB0D1l7Aw/kPAav7HQl4hY/bFTwCdAFjH8ENPNOO4AGgC/jRcwc38FxPv1HvA7qAh+0C6velJxhe4dNtC+4BuoCPbXrsAOr3vS0Rr3BLG5/FzwFdwK1+qDz9v9X0e9YjmVe4ucfVegboAj7vwebJFdZvv/9sXuFL/f843gZ0AQ/7S7R5hfU71VvE9gbqd673iM0N1K+lnxuoX1M/APVr6zugfo19A9SvtUdA/Zp7ANSvvS9fY+S70j9A+a71CSjf1d6kY/njLJiAMAFhAsIEhAkIExAmIExAmIAwAWECwgSECQgTECYgTECYgDABYQLCBIQJCBMQJiBMQJiAMAFhAsIEhAkIExAmIExAmIAwAWECwgSECQgTECYgTECYgDABYQLCBIQJCBMQJiBMQJiAMAFhAsIEhAkIExAmIExAmIAwAWECwgSECQgTECYgTECYgDABYQLCBIQJCBMQJiBMQJiAMAFhAsIEhAkIExAmIExAmIAwAWECwgSECQgTECYgTECYgDABYQLCBIQJCBMQJiBMQJiAMAFhAsIEhAkIExAmIExAmIAwAWECwgSECQgTECYgTECYgDABYQLCBIQJCBMQJiBMQJiAMAFhAsIEhAkIExAmIExAmIAwAWECwgSECQgTECYgTECYgDABYQLCBIQJCBMQJiBMQJiAMAFhAsIEhAkIExAmIExAmIAwAWECwgSECQgTECYgTECYgDABYQLCBIQJCBMQJiBMQJiAMAFhAsIEhAkIExAmIExAmIAwAWECwgSECQgTECYgTECYgDABYQLCBIQJCBMQJiBMQJiAMAFhAsIEhAkIExAmIExAmIAwAWECwgSECQgTECYgTECYgDABYQLCBIQJCBMQJiBMQJiAMAFhAsIEhAkIExAmIExAmIAwAWECwgSE/QHXqdvBmYEGJwAAAABJRU5ErkJggg==\n" + }, + "metadata": {} + } + ], + "source": [ + "from torchvision.transforms import Compose, ToTensor, Resize, Normalize, ToPILImage\n", + "from PIL import Image\n", + "import torch\n", + "\n", + "pre_proccess = Compose([\n", + " ToTensor(),\n", + " Normalize([.485, .456, .406], [.229, .224, .225])\n", + "])\n", + "\n", + "demo_img_path = os.path.join(root_dir, \"images\", \"ache-adult-depression-expression-41253.png\")\n", + "img = Image.open(demo_img_path)\n", + "# Resize the image and display\n", + "img = Resize(size=(480, 320))(img)\n", + "display(img)\n", + "\n", + "# Run pre-proccess - transforms to tensor and apply normalizations.\n", + "img = 
pre_proccess(img).unsqueeze(0).cuda()\n", + "\n", + "# Run inference\n", + "model = trainer.net\n", + "model = model.eval()\n", + "mask = model(img)\n", + "\n", + "# Run post-processing: apply sigmoid to turn the model output into probabilities,\n", + "# then apply a hard threshold of 0.5 for binary mask prediction.\n", + "mask = torch.sigmoid(mask).gt(0.5).squeeze()\n", + "mask = ToPILImage()(mask.float())\n", + "display(mask)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "-k6ZLKHL1hIM" + }, + "source": [ + "# 7. Convert to ONNX/TensorRT" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "br7n55Szm4Nq" + }, + "source": [ + "Let's export our model to ONNX." + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": { + "id": "q0AGQvEf11PT", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "76b54859-3375-4fc4-c7a7-5b86ed3d80fb" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "ONNX successfully created at: /content/model.onnx\n" + ] + } + ], + "source": [ + "from onnxsim import simplify\n", + "import onnx\n", + "\n", + "onnx_path = os.path.join(os.getcwd(), \"model.onnx\")\n", + "\n", + "input_size = [1, 3, 480, 320]\n", + "model.prep_model_for_conversion(input_size=input_size)\n", + "\n", + "torch.onnx.export(model,\n", + " torch.randn(*input_size).cuda(),\n", + " onnx_path)\n", + "\n", + "# onnx simplifier\n", + "model_sim, check = simplify(onnx_path)\n", + "assert check, \"Simplified ONNX model could not be validated\"\n", + "onnx.save_model(model_sim, onnx_path)\n", + "\n", + "print(\"ONNX successfully created at: \", onnx_path)\n", + "\n" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + }, + "language_info": { + "name": "python" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git a/notebooks/segmentation_connect_custom_dataset.ipynb 
b/notebooks/segmentation_connect_custom_dataset.ipynb new file mode 100644 index 0000000000..8cce51322c --- /dev/null +++ b/notebooks/segmentation_connect_custom_dataset.ipynb @@ -0,0 +1,1010 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "sh6t_y7KzqBH" + }, + "source": [ + "![SG - Horizontal.png]()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "5aISf1B-AGDQ" + }, + "source": [ + "# SuperGradients Semantic Segmentation How to Connect Custom Dataset\n", + "\n", + "In this tutorial we will explore how you can connect your custom Semantic Segmentation dataset to SG.\n", + "\n", + "Since SG trainer is fully compatible with PyTorch data loaders, we will demonstrate how to build one and use it.\n", + "\n", + "The notebook is divided into 5 sections:\n", + "1. Experiment setup\n", + "2. Dataset definition: create a proxy dataset and create a dataloader\n", + "3. Architecture definition: pre-trained PPLiteSeg on Cityscapes \n", + "4. Training setup\n", + "5. Training and Evaluation\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "-1nPOPmc1lGp" + }, + "source": [ + "# Install SG" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "VAssbjJw7Yt1" + }, + "source": [ + "The cell below will install **super_gradients**, which will automatically get all its dependencies. Let's import all the installed libraries to make sure they installed successfully." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "JKce1SM6voVH", + "outputId": "e27e79a3-5b89-4869-bf1b-ea54ef60331f" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m12.0/12.0 MB\u001B[0m \u001B[31m36.2 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m135.8/135.8 kB\u001B[0m \u001B[31m21.6 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m684.5/684.5 kB\u001B[0m \u001B[31m39.1 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[?25h Preparing metadata (setup.py) ... \u001B[?25l\u001B[?25hdone\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m2.9/2.9 MB\u001B[0m \u001B[31m61.9 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m2.8/2.8 MB\u001B[0m \u001B[31m73.0 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m408.6/408.6 kB\u001B[0m \u001B[31m24.0 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m154.5/154.5 kB\u001B[0m \u001B[31m22.2 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m79.5/79.5 kB\u001B[0m \u001B[31m11.7 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m4.5/4.5 MB\u001B[0m \u001B[31m89.5 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m13.5/13.5 
MB\u001B[0m \u001B[31m96.6 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m68.0/68.0 kB\u001B[0m \u001B[31m10.8 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[?25h Installing build dependencies ... \u001B[?25l\u001B[?25hdone\n", + " Getting requirements to build wheel ... \u001B[?25l\u001B[?25hdone\n", + " Preparing metadata (pyproject.toml) ... \u001B[?25l\u001B[?25hdone\n", + " Preparing metadata (setup.py) ... \u001B[?25l\u001B[?25hdone\n", + " Preparing metadata (setup.py) ... \u001B[?25l\u001B[?25hdone\n", + " Preparing metadata (setup.py) ... \u001B[?25l\u001B[?25hdone\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m17.0/17.0 MB\u001B[0m \u001B[31m88.3 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m3.3/3.3 MB\u001B[0m \u001B[31m82.7 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m2.2/2.2 MB\u001B[0m \u001B[31m77.3 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m458.9/458.9 kB\u001B[0m \u001B[31m46.1 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m46.0/46.0 kB\u001B[0m \u001B[31m6.6 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m11.3/11.3 MB\u001B[0m \u001B[31m74.1 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m79.8/79.8 kB\u001B[0m \u001B[31m11.2 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m108.3/108.3 kB\u001B[0m \u001B[31m13.1 MB/s\u001B[0m 
eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[?25h Preparing metadata (setup.py) ... \u001B[?25l\u001B[?25hdone\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m176.0/176.0 kB\u001B[0m \u001B[31m26.9 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m407.7/407.7 kB\u001B[0m \u001B[31m45.2 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m107.7/107.7 kB\u001B[0m \u001B[31m13.3 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m277.4/277.4 kB\u001B[0m \u001B[31m36.5 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m2.8/2.8 MB\u001B[0m \u001B[31m111.7 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m913.9/913.9 kB\u001B[0m \u001B[31m78.4 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[?25h Preparing metadata (setup.py) ... \u001B[?25l\u001B[?25hdone\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m117.0/117.0 kB\u001B[0m \u001B[31m18.0 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[?25h Preparing metadata (setup.py) ... 
\u001B[?25l\u001B[?25hdone\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m575.5/575.5 kB\u001B[0m \u001B[31m50.6 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m121.1/121.1 kB\u001B[0m \u001B[31m16.9 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m86.8/86.8 kB\u001B[0m \u001B[31m13.0 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m120.0/120.0 kB\u001B[0m \u001B[31m16.1 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m120.0/120.0 kB\u001B[0m \u001B[31m17.8 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m120.6/120.6 kB\u001B[0m \u001B[31m16.6 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m83.5/83.5 kB\u001B[0m \u001B[31m13.1 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m83.5/83.5 kB\u001B[0m \u001B[31m13.8 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m84.7/84.7 kB\u001B[0m \u001B[31m13.9 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m99.2/99.2 kB\u001B[0m \u001B[31m14.4 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m99.2/99.2 kB\u001B[0m \u001B[31m15.4 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m99.8/99.8 
kB\u001B[0m \u001B[31m15.9 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m89.4/89.4 kB\u001B[0m \u001B[31m14.0 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m89.4/89.4 kB\u001B[0m \u001B[31m14.1 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m90.6/90.6 kB\u001B[0m \u001B[31m14.0 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m92.6/92.6 kB\u001B[0m \u001B[31m14.1 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m92.6/92.6 kB\u001B[0m \u001B[31m14.7 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m92.6/92.6 kB\u001B[0m \u001B[31m15.1 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m94.0/94.0 kB\u001B[0m \u001B[31m15.2 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m105.0/105.0 kB\u001B[0m \u001B[31m15.5 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m46.2/46.2 kB\u001B[0m \u001B[31m6.8 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m106.8/106.8 kB\u001B[0m \u001B[31m17.7 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m194.6/194.6 kB\u001B[0m \u001B[31m29.4 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[2K 
\u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m58.1/58.1 kB\u001B[0m \u001B[31m9.8 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\n", + "\u001B[?25h Building wheel for pycocotools (pyproject.toml) ... \u001B[?25l\u001B[?25hdone\n", + " Building wheel for termcolor (setup.py) ... \u001B[?25l\u001B[?25hdone\n", + " Building wheel for treelib (setup.py) ... \u001B[?25l\u001B[?25hdone\n", + " Building wheel for coverage (setup.py) ... \u001B[?25l\u001B[?25hdone\n", + " Building wheel for xhtml2pdf (setup.py) ... \u001B[?25l\u001B[?25hdone\n", + " Building wheel for antlr4-python3-runtime (setup.py) ... \u001B[?25l\u001B[?25hdone\n", + " Building wheel for stringcase (setup.py) ... \u001B[?25l\u001B[?25hdone\n", + " Building wheel for svglib (setup.py) ... \u001B[?25l\u001B[?25hdone\n", + "\u001B[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.\n", + "lida 0.0.10 requires fastapi, which is not installed.\n", + "lida 0.0.10 requires kaleido, which is not installed.\n", + "lida 0.0.10 requires python-multipart, which is not installed.\n", + "lida 0.0.10 requires uvicorn, which is not installed.\n", + "tensorflow 2.14.0 requires numpy>=1.23.5, but you have numpy 1.23.0 which is incompatible.\u001B[0m\u001B[31m\n", + "\u001B[0m" + ] + } + ], + "source": [ + "! pip install -qq super-gradients==3.4.1" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "njthhNJR1pJm" + }, + "source": [ + "# 1. Experiment setup" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "YPym4wvpOcOJ" + }, + "source": [ + "We will first initialize our **trainer**, which is in charge of training, evaluation, saving checkpoints, plotting, etc.\n", + "\n", + "The **experiment name** argument is important because all checkpoints, logs and tensorboards will be saved in a directory with the same name. 
This directory will be created as a sub-directory of **ckpt_root_dir** as follows:\n", + "\n", + "```\n", + "ckpt_root_dir\n", + "|─── experiment_name_1\n", + "│ ckpt_best.pth # Model checkpoint on best epoch\n", + "│ ckpt_latest.pth # Model checkpoint on last epoch\n", + "│ average_model.pth # Model checkpoint averaged over epochs\n", + "│ events.out.tfevents.1659878383... # Tensorflow artifacts of a specific run\n", + "│ log_Aug07_11_52_48.txt # Trainer logs of a specific run\n", + "└─── experiment_name_2\n", + " ...\n", + "```\n", + "In this notebook multi-GPU training is set to `OFF`; for distributed training, `multi_gpu` can be set to\n", + " `MultiGPUMode.DISTRIBUTED_DATA_PARALLEL` or `MultiGPUMode.DATA_PARALLEL`.\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "A2PlnTWpimnH" + }, + "source": [ + "Let's define **ckpt_root_dir** inside the Colab; later we can use it to start TensorBoard and monitor the run." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "id": "_v1N3kXs3wo1", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "bf8d1fba-3fa0-41af-97a6-9bc23626e78e" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "The console stream is logged into /root/sg_logs/console.log\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "[2023-11-13 12:01:12] INFO - crash_tips_setup.py - Crash tips is enabled. 
You can set your environment variable to CRASH_HANDLER=FALSE to disable it\n", + "[2023-11-13 12:01:13] WARNING - __init__.py - Failed to import pytorch_quantization\n", + "[2023-11-13 12:01:13] INFO - utils.py - NumExpr defaulting to 2 threads.\n", + "[2023-11-13 12:01:24] WARNING - calibrator.py - Failed to import pytorch_quantization\n", + "[2023-11-13 12:01:24] WARNING - export.py - Failed to import pytorch_quantization\n", + "[2023-11-13 12:01:24] WARNING - selective_quantization_utils.py - Failed to import pytorch_quantization\n", + "[2023-11-13 12:01:24] WARNING - env_sanity_check.py - \u001B[31mFailed to verify installed packages: boto3 required but not found\u001B[0m\n", + "[2023-11-13 12:01:24] WARNING - env_sanity_check.py - \u001B[31mFailed to verify installed packages: deprecated required but not found\u001B[0m\n", + "[2023-11-13 12:01:24] WARNING - env_sanity_check.py - \u001B[31mFailed to verify installed packages: coverage required but not found\u001B[0m\n", + "[2023-11-13 12:01:24] WARNING - env_sanity_check.py - \u001B[31mFailed to verify installed packages: sphinx-rtd-theme required but not found\u001B[0m\n", + "[2023-11-13 12:01:24] WARNING - env_sanity_check.py - \u001B[31mFailed to verify installed packages: torchmetrics required but not found\u001B[0m\n", + "[2023-11-13 12:01:24] WARNING - env_sanity_check.py - \u001B[31mFailed to verify installed packages: hydra-core required but not found\u001B[0m\n", + "[2023-11-13 12:01:24] WARNING - env_sanity_check.py - \u001B[31mFailed to verify installed packages: omegaconf required but not found\u001B[0m\n", + "[2023-11-13 12:01:24] WARNING - env_sanity_check.py - \u001B[31mFailed to verify installed packages: onnxruntime required but not found\u001B[0m\n", + "[2023-11-13 12:01:24] WARNING - env_sanity_check.py - \u001B[31mFailed to verify installed packages: onnx required but not found\u001B[0m\n", + "[2023-11-13 12:01:24] WARNING - env_sanity_check.py - \u001B[31mFailed to verify installed 
packages: einops required but not found\u001B[0m\n", + "[2023-11-13 12:01:24] WARNING - env_sanity_check.py - \u001B[31mFailed to verify installed packages: treelib required but not found\u001B[0m\n", + "[2023-11-13 12:01:24] WARNING - env_sanity_check.py - \u001B[31mFailed to verify installed packages: stringcase required but not found\u001B[0m\n", + "[2023-11-13 12:01:24] WARNING - env_sanity_check.py - \u001B[31mFailed to verify installed packages: rapidfuzz required but not found\u001B[0m\n", + "[2023-11-13 12:01:24] WARNING - env_sanity_check.py - \u001B[31mFailed to verify installed packages: json-tricks required but not found\u001B[0m\n", + "[2023-11-13 12:01:24] WARNING - env_sanity_check.py - \u001B[31mFailed to verify installed packages: onnx-simplifier required but not found\u001B[0m\n", + "[2023-11-13 12:01:24] WARNING - env_sanity_check.py - \u001B[31mFailed to verify installed packages: data-gradients required but not found\u001B[0m\n" + ] + } + ], + "source": [ + "from super_gradients.training import Trainer, MultiGPUMode\n", + "\n", + "\n", + "CHECKPOINT_DIR = './notebook_ckpts/'\n", + "trainer = Trainer(experiment_name='transfer_learning_semantic_segementation_ppLite', ckpt_root_dir=CHECKPOINT_DIR)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "J9ZaMulSvwhr" + }, + "source": [ + "# 2. 
Dataset definition\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "_1TXuJKkKzFJ" + }, + "source": [ + "## 2.1 Generate Proxy Dataset" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Y7us7VHRig7M" + }, + "source": [ + "\n", + "A proxy dataset generation is available merely to demonstrate an end-to-end training pipeline in this notebook.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "id": "wbdVYnIyjgv-" + }, + "outputs": [], + "source": [ + "from PIL import Image\n", + "import os\n", + "import numpy as np\n", + "\n", + "\n", + "# creation of proxy dataset to demonstrate usage\n", + "def generate_proxy_dataset(write_path: str, num_samples: int, num_classes: int, img_size: int = 256):\n", + " # Create training files and text\n", + " os.makedirs(os.path.join(write_path, 'images', 'train'), exist_ok=True)\n", + " os.makedirs(os.path.join(write_path, 'images', 'val'), exist_ok=True)\n", + " os.makedirs(os.path.join(write_path, 'labels', 'train'), exist_ok=True)\n", + " os.makedirs(os.path.join(write_path, 'labels', 'val'), exist_ok=True)\n", + "\n", + " train_fp = open(os.path.join(write_path, 'train.txt'), 'w')\n", + " val_fp = open(os.path.join(write_path, 'val.txt'), 'w')\n", + "\n", + " # Create random samples\n", + " for n in range(num_samples):\n", + " img = np.random.rand(img_size, img_size, 3) * 255\n", + " img = Image.fromarray(img.astype('uint8')).convert('RGB')\n", + "\n", + " lbl = np.random.randint(0, num_classes, size=(img_size, img_size))\n", + " lbl = Image.fromarray(lbl.astype('uint8')).convert('L')\n", + "\n", + " im_string = '%000d.jpg' % n\n", + " lbl_string = '%000d.png' % n\n", + "\n", + " img_train_fn = os.path.join(write_path, 'images', 'train', im_string)\n", + " img_val_fn = img_train_fn.replace(\"train\", \"val\")\n", + " img.save(img_train_fn)\n", + " img.save(img_val_fn)\n", + "\n", + " lbl_train_fn = os.path.join(write_path, 'labels', 'train', lbl_string)\n", + " 
lbl_val_fn = lbl_train_fn.replace(\"train\", \"val\")\n", + " lbl.save(lbl_train_fn)\n", + " lbl.save(lbl_val_fn)\n", + "\n", + " train_fp.write(f\"{img_train_fn} {lbl_train_fn}\\n\")\n", + " val_fp.write(f\"{img_val_fn} {lbl_val_fn}\\n\")\n", + "\n", + " train_fp.close()\n", + " val_fp.close()" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "id": "DXu4yfuZoiv0" + }, + "outputs": [], + "source": [ + "num_classes = 10\n", + "data_dir = os.path.join(os.getcwd(), 'example_data')\n", + "generate_proxy_dataset(data_dir, num_samples=10, num_classes=num_classes)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "MDksFYrIqClt" + }, + "source": [ + "## 2.2 Create Torch Dataset" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "id": "AGziBKSIqaUu" + }, + "outputs": [], + "source": [ + "import torch\n", + "from torch.utils.data import Dataset\n", + "from torchvision import transforms, utils\n", + "\n", + "\n", + "class CustomDataset(Dataset):\n", + " \"\"\"\n", + " A PyTorch Dataset class to be used in a PyTorch DataLoader to create batches.\n", + " \"\"\"\n", + "\n", + " def __init__(self, data_folder, split):\n", + " \"\"\"\n", + " :param data_folder: folder where data files are stored\n", + " :param split: split, one of 'train' or 'val' (case-insensitive)\n", + " \"\"\"\n", + " self.data_folder = data_folder\n", + " self.split = split.lower()\n", + " assert self.split in {'train', 'val'}\n", + "\n", + " # Read data files\n", + " with open(os.path.join(data_folder, self.split + '.txt'), 'r') as f:\n", + " data_lines = f.readlines()\n", + " self.samples_fn = [line.strip().split(\" \") for line in data_lines]\n", + "\n", + " self.transforms = transforms.Compose([transforms.ToTensor()])\n", + "\n", + " def __getitem__(self, i):\n", + " # Read image and label\n", + " image = Image.open(self.samples_fn[i][0]).convert('RGB')\n", + " label = Image.open(self.samples_fn[i][1])\n", + "\n", + " image_tensor = 
self.transforms(image)\n", + " label_tensor = torch.from_numpy(np.array(label)).long()\n", + "\n", + " return image_tensor, label_tensor\n", + "\n", + "\n", + " def __len__(self):\n", + " return len(self.samples_fn)\n" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "id": "2B0hlas_1Rh-" + }, + "outputs": [], + "source": [ + "train_dataset = CustomDataset(data_dir, split=\"train\")\n", + "val_dataset = CustomDataset(data_dir, split=\"val\")\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "eIG5tsiuor9E" + }, + "source": [ + "Let's have a look at the first sample:" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "id": "ZsHqcq1jpN0F", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "5976e582-e08a-4642-bd04-eaf613d04a97" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "torch.Size([3, 256, 256]) torch.Size([256, 256])\n", + "tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])\n" + ] + } + ], + "source": [ + "img, lbl = train_dataset[0]\n", + "print(img.shape, lbl.shape)\n", + "print(torch.unique(lbl))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "aWfFrYLzo9j8" + }, + "source": [ + "## 2.3 Create Torch Dataloader" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": { + "id": "XrWjWfjXnw_r" + }, + "outputs": [], + "source": [ + "from torch.utils.data import Dataset, DataLoader\n", + "\n", + "train_dataloader = DataLoader(train_dataset, batch_size=4, shuffle=True, num_workers=2)\n", + "val_dataloader = DataLoader(val_dataset, batch_size=4, shuffle=False, num_workers=2)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "vB1sGPO8qwZJ" + }, + "source": [ + "Let's have a look at the first batch:\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": { + "id": "O-KuZQ3XBduM", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": 
"7addf002-9c95-4252-fad7-510b2a0ab333" + }, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "torch.Size([4, 3, 256, 256])" + ] + }, + "metadata": {}, + "execution_count": 13 + } + ], + "source": [ + "next(iter(train_dataloader))[0].shape" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "fFfvyMHU32QF" + }, + "source": [ + "\n", + "# 3. Architecture definition" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "EpqgjQjl4awr" + }, + "source": [ + "SG includes implementations of many different architectures for semantic segmentation tasks that can be found [here](https://github.com/Deci-AI/super-gradients#implemented-model-architectures)." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "GNM64JAa4sbF" + }, + "source": [ + "Create a PPLiteSeg nn.Module with a segmentation head for `num_classes` classes (10 in this notebook). For simplicity `use_aux_heads` is set to `False`\n", + "and the extra auxiliary heads aren't used for training.\n", + "\n", + "Other segmentation modules can be used for this task, such as DDRNet, STDC and RegSeg.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": { + "id": "YDK4btf04Gbu", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "11d5aad2-dc94-4231-8e61-c4b968857370" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Downloading: \"https://sghub.deci.ai/models/pp_lite_t_seg75_cityscapes.pth\" to /root/.cache/torch/hub/checkpoints/pp_lite_t_seg75_cityscapes.pth\n", + "100%|██████████| 31.4M/31.4M [00:00<00:00, 223MB/s]\n", + "[2023-11-13 12:03:09] INFO - checkpoint_utils.py - Successfully loaded pretrained weights for architecture pp_lite_t_seg75\n" + ] + } + ], + "source": [ + "from super_gradients.training import models\n", + "from super_gradients.common.object_names import Models\n", + "\n", + "model = models.get(model_name=Models.PP_LITE_T_SEG75,\n", + " 
arch_params={\"use_aux_heads\": False},\n", + " 
num_classes=num_classes,\n", + " pretrained_weights=\"cityscapes\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "40UcYJ3u5JyF" + }, + "source": [ + "That being said, SG allows you to use one of SG implemented architectures or your custom architecture, as long as it inherits torch.nn.Module." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "LYPVR-XM4GsZ" + }, + "source": [ + "# 4. Training setup\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "6K_56lDV8azX" + }, + "source": [ + "\n", + "Here we define the training recipe. The full parameters can be found here [training parameters supported](https://deci-ai.github.io/super-gradients/user_guide.html#training-parameters).\n" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": { + "id": "3eRe0hBz4G1n" + }, + "outputs": [], + "source": [ + "from super_gradients.training.metrics.segmentation_metrics import IoU\n", + "from super_gradients.training.utils.callbacks import BinarySegmentationVisualizationCallback, Phase\n", + "\n", + "\n", + "train_params = {\"max_epochs\": 10,\n", + " \"lr_mode\": \"cosine\",\n", + " \"initial_lr\": 0.005,\n", + " \"optimizer\": \"SGD\",\n", + " \"loss\": \"CrossEntropyLoss\",\n", + " \"average_best_models\": False,\n", + " \"metric_to_watch\": \"IoU\",\n", + " \"greater_metric_to_watch_is_better\": True,\n", + " \"train_metrics_list\": [IoU(num_classes=10)],\n", + " \"valid_metrics_list\": [IoU(num_classes=10)],\n", + " \"loss_logging_items_names\": [\"loss\"],\n", + " \"phase_callbacks\": [BinarySegmentationVisualizationCallback(phase=Phase.VALIDATION_BATCH_END,\n", + " freq=1,\n", + " last_img_idx_in_batch=4)],\n", + "\n", + " }" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "D3tVVUhy4OqP" + }, + "source": [ + "# 5. 
Training and evaluation\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "8tKUuxbe9NlQ" + }, + "source": [ + "The logs and the checkpoint for the latest epoch will be kept in your experiment folder.\n", + "\n", + "To start training we'll call train(...) and provide it with the objects we constructed above: the model, the training parameters and the data loaders." + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": { + "id": "-Ojnc1bk9L3s", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "2c2ac68b-75f7-48c0-db0f-7b579c743013" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stderr", + "text": [ + "[2023-11-13 12:03:20] WARNING - sg_trainer.py - Train dataset size % batch_size != 0 and drop_last=False, this might result in smaller last batch.\n", + "[2023-11-13 12:03:28] INFO - sg_trainer.py - Starting a new run with `run_id=RUN_20231113_120328_854372`\n", + "[2023-11-13 12:03:28] INFO - sg_trainer.py - Checkpoints directory: ./notebook_ckpts/transfer_learning_semantic_segementation_ppLite/RUN_20231113_120328_854372\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "The console stream is now moved to ./notebook_ckpts/transfer_learning_semantic_segementation_ppLite/RUN_20231113_120328_854372/console_Nov13_12_03_28.txt\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "[2023-11-13 12:03:29] INFO - sg_trainer_utils.py - TRAINING PARAMETERS:\n", + " - Mode: Single GPU\n", + " - Number of GPUs: 1 (1 available on the machine)\n", + " - Full dataset size: 10 (len(train_set))\n", + " - Batch size per GPU: 4 (batch_size)\n", + " - Batch Accumulate: 1 (batch_accumulate)\n", + " - Total batch size: 4 (num_gpus * batch_size)\n", + " - Effective Batch size: 4 (num_gpus * batch_size * batch_accumulate)\n", + " - Iterations per epoch: 3 (len(train_loader))\n", + " - Gradient updates per epoch: 3 (len(train_loader) / batch_accumulate)\n", + "\n", + 
"[2023-11-13 12:03:29] INFO - sg_trainer.py - Started training for 10 epochs (0/9)\n", + "\n", + "Train epoch 0: 100%|██████████| 3/3 [00:08<00:00, 2.99s/it, CrossEntropyLoss=3.04, IoU=0.0464, gpu_mem=0.686]\n", + "Validating: 100%|██████████| 3/3 [00:00<00:00, 3.41it/s]\n", + "[2023-11-13 12:03:39] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/transfer_learning_semantic_segementation_ppLite/RUN_20231113_120328_854372/ckpt_best.pth\n", + "[2023-11-13 12:03:39] INFO - sg_trainer.py - Best checkpoint overriden: validation IoU: 0.015268713235855103\n", + "Train epoch 1: 0%| | 0/3 [00:00=1.23.5, but you have numpy 1.23.0 which is incompatible.\u001b[0m\u001b[31m\n", + "\u001b[0m" + ] + } + ], + "source": [ + "! pip install -qq super-gradients==3.4.1" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "892xArqDsGsQ" + }, + "source": [ + "# 1. Experiment setup\n", + "We will initialize our **trainer**, which is in charge of training, evaluation, saving checkpoints, plotting, etc.\n", + "\n", + "The **experiment name** argument is important because all checkpoints, logs and tensorboards will be saved in a directory with the same name. This directory will be created as a sub-directory of **ckpt_root_dir** as follows:\n", + "\n", + "```\n", + "ckpt_root_dir\n", + "|─── experiment_name_1\n", + "│ ckpt_best.pth # Model checkpoint on best epoch\n", + "│ ckpt_latest.pth # Model checkpoint on last epoch\n", + "│ average_model.pth # Model checkpoint averaged over epochs\n", + "│ events.out.tfevents.1659878383... # Tensorflow artifacts of a specific run\n", + "│ log_Aug07_11_52_48.txt # Trainer logs of a specific run\n", + "└─── experiment_name_2\n", + " ...\n", + "```\n", + "In this notebook multi-GPU training is set to `OFF`; for distributed training, `multi_gpu` can be set to\n", + " `MultiGPUMode.DISTRIBUTED_DATA_PARALLEL` or `MultiGPUMode.DATA_PARALLEL`." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "pl0WPz1HisFz" + }, + "source": [ + "Let's define **ckpt_root_dir** inside the Colab, later we can use it to start TensorBoard and monitor the run." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "HAff--HysJmP", + "outputId": "23704251-ac0f-4a11-a104-d320484243de" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stderr", + "text": [ + "[2023-11-13 12:17:11] INFO - crash_tips_setup.py - Crash tips is enabled. You can set your environment variable to CRASH_HANDLER=FALSE to disable it\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "The console stream is logged into /root/sg_logs/console.log\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "[2023-11-13 12:17:11] WARNING - __init__.py - Failed to import pytorch_quantization\n", + "[2023-11-13 12:17:11] INFO - utils.py - NumExpr defaulting to 2 threads.\n", + "[2023-11-13 12:17:26] WARNING - calibrator.py - Failed to import pytorch_quantization\n", + "[2023-11-13 12:17:26] WARNING - export.py - Failed to import pytorch_quantization\n", + "[2023-11-13 12:17:26] WARNING - selective_quantization_utils.py - Failed to import pytorch_quantization\n", + "[2023-11-13 12:17:26] WARNING - env_sanity_check.py - \u001b[31mFailed to verify installed packages: boto3 required but not found\u001b[0m\n", + "[2023-11-13 12:17:26] WARNING - env_sanity_check.py - \u001b[31mFailed to verify installed packages: deprecated required but not found\u001b[0m\n", + "[2023-11-13 12:17:26] WARNING - env_sanity_check.py - \u001b[31mFailed to verify installed packages: coverage required but not found\u001b[0m\n", + "[2023-11-13 12:17:26] WARNING - env_sanity_check.py - \u001b[31mFailed to verify installed packages: sphinx-rtd-theme required but not found\u001b[0m\n", + "[2023-11-13 12:17:26] WARNING - env_sanity_check.py - 
\u001b[31mFailed to verify installed packages: torchmetrics required but not found\u001b[0m\n", + "[2023-11-13 12:17:26] WARNING - env_sanity_check.py - \u001b[31mFailed to verify installed packages: hydra-core required but not found\u001b[0m\n", + "[2023-11-13 12:17:26] WARNING - env_sanity_check.py - \u001b[31mFailed to verify installed packages: omegaconf required but not found\u001b[0m\n", + "[2023-11-13 12:17:26] WARNING - env_sanity_check.py - \u001b[31mFailed to verify installed packages: onnxruntime required but not found\u001b[0m\n", + "[2023-11-13 12:17:26] WARNING - env_sanity_check.py - \u001b[31mFailed to verify installed packages: onnx required but not found\u001b[0m\n", + "[2023-11-13 12:17:26] WARNING - env_sanity_check.py - \u001b[31mFailed to verify installed packages: einops required but not found\u001b[0m\n", + "[2023-11-13 12:17:26] WARNING - env_sanity_check.py - \u001b[31mFailed to verify installed packages: treelib required but not found\u001b[0m\n", + "[2023-11-13 12:17:26] WARNING - env_sanity_check.py - \u001b[31mFailed to verify installed packages: stringcase required but not found\u001b[0m\n", + "[2023-11-13 12:17:26] WARNING - env_sanity_check.py - \u001b[31mFailed to verify installed packages: rapidfuzz required but not found\u001b[0m\n", + "[2023-11-13 12:17:26] WARNING - env_sanity_check.py - \u001b[31mFailed to verify installed packages: json-tricks required but not found\u001b[0m\n", + "[2023-11-13 12:17:26] WARNING - env_sanity_check.py - \u001b[31mFailed to verify installed packages: onnx-simplifier required but not found\u001b[0m\n", + "[2023-11-13 12:17:26] WARNING - env_sanity_check.py - \u001b[31mFailed to verify installed packages: data-gradients required but not found\u001b[0m\n" + ] + } + ], + "source": [ + "from super_gradients import Trainer\n", + "\n", + "CHECKPOINT_DIR = './notebook_ckpts/'\n", + "trainer = Trainer(experiment_name=\"segmentation_transfer_learning\", ckpt_root_dir=CHECKPOINT_DIR)" + ] + }, + { + 
"cell_type": "markdown", + "metadata": { + "id": "dwVMY4gMjQSL" + }, + "source": [ + "# 2. Dataset definition\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "fpIWhnR9j2rm" + }, + "source": [ + "\n", + "For the sake of this presentation, we'll use **Supervisely** semantic segmentation dataset." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ZACgRb-qjzDJ" + }, + "source": [ + "SG trainer is fully compatible with PyTorch data loaders, so you can definitely use your own data for the experiment below if you prefer." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "6ulV6Hpao3IN" + }, + "source": [ + "## 2.1 Download data\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "mVwslNv-j-2C" + }, + "source": [ + "Feel free to change the download path by editing SUPERVISELY_DATASET_DOWNLOAD_PATH" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "id": "dfR18Rmbo00y", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "07e535c6-2091-4179-843c-ce6e7cd591f1" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Downloading and extracting supervisely dataset to: /content/data\n", + "/content/data\n", + "--2023-11-13 12:17:26-- https://deci-pretrained-models.s3.amazonaws.com/supervisely-persons.zip\n", + "Resolving deci-pretrained-models.s3.amazonaws.com (deci-pretrained-models.s3.amazonaws.com)... 3.5.27.182, 3.5.29.153, 52.216.37.233, ...\n", + "Connecting to deci-pretrained-models.s3.amazonaws.com (deci-pretrained-models.s3.amazonaws.com)|3.5.27.182|:443... connected.\n", + "HTTP request sent, awaiting response... 
200 OK\n", + "Length: 3564001012 (3.3G) [application/zip]\n", + "Saving to: ‘supervisely-persons.zip’\n", + "\n", + "supervisely-persons 100%[===================>] 3.32G 38.1MB/s in 88s \n", + "\n", + "2023-11-13 12:18:54 (38.7 MB/s) - ‘supervisely-persons.zip’ saved [3564001012/3564001012]\n", + "\n" + ] + } + ], + "source": [ + "import os\n", + "\n", + "SUPERVISELY_DATASET_DOWNLOAD_PATH=os.path.join(os.getcwd(),\"data\")\n", + "\n", + "supervisely_dataset_dir_path = os.path.join(SUPERVISELY_DATASET_DOWNLOAD_PATH, 'supervisely-persons')\n", + "\n", + "if os.path.isdir(supervisely_dataset_dir_path):\n", + " print('supervisely dataset already downloaded...')\n", + "else:\n", + " print('Downloading and extracting supervisely dataset to: ' + SUPERVISELY_DATASET_DOWNLOAD_PATH)\n", + " ! mkdir $SUPERVISELY_DATASET_DOWNLOAD_PATH\n", + " %cd $SUPERVISELY_DATASET_DOWNLOAD_PATH\n", + " ! wget https://deci-pretrained-models.s3.amazonaws.com/supervisely-persons.zip\n", + " ! unzip --qq supervisely-persons.zip" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "id": "V9ZcklupX8Qx" + }, + "source": [ + "## 2.2 Create data loaders\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "3Mk_YixjlEhj" + }, + "source": [ + "The dataloaders are initialized with the default parameters defined in the [yaml](https://github.com/Deci-AI/super-gradients/blob/master/src/super_gradients/recipes/dataset_params/supervisely_persons_dataset_params.yaml)\n", + "file. Parameters such as `batch_size`, `transforms`, `root_dir` and others can be overridden by passing them as `dataset_params` and\n", + "`dataloader_params`, as implemented below." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "id": "S3BzMRhSX8Qx" + }, + "outputs": [], + "source": [ + "from super_gradients.training import dataloaders\n", + "\n", + "root_dir = supervisely_dataset_dir_path\n", + "batch_size = 8\n", + "\n", + "train_loader = dataloaders.supervisely_persons_train(\n", + " dataset_params={\"root_dir\": root_dir},\n", + " dataloader_params={\"batch_size\": batch_size, \"num_workers\": 2}\n", + ")\n", + "valid_loader = dataloaders.supervisely_persons_val(\n", + " dataset_params={\"root_dir\": root_dir},\n", + " dataloader_params={\"batch_size\": batch_size, \"num_workers\": 2}\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "6dHIwvs46-dk" + }, + "source": [ + "As you can see, we didn't have to pass many parameters into the dataloaders construction. That's because defaults are pre-defined for your convenience, and you might be curious to know what they are. Let's print them and see which resolution and transformations are defined." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": { + "id": "76tzhKxi6aS-" + }, + "outputs": [], + "source": [ + "print('Dataloader parameters:')\n", + "print(train_loader.dataloader_params)\n", + "print('Dataset parameters')\n", + "print(train_loader.dataset.dataset_params)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "I4QEOkKyy93R" + }, + "source": [ + "We can take a look at some images from the dataset." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": { + "id": "xXPMJQCJzmb4", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 937 + }, + "outputId": "c7605343-9fe4-4bb1-b18c-12e0cad6e0a7" + }, + "outputs": [ + { + "output_type": "display_data", + "data": { + "text/plain": [ + "" + ], + "image/png": "\n" + }, + "metadata": {} + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Dataloader parameters:\n", + "{'batch_size': 8, 'num_workers': 2, 'shuffle': True, 'drop_last': True}\n", + "Dataset parameters\n", + "{'root_dir': '/content/data/supervisely-persons', 'list_file': 'train.csv', 'cache_labels': False, 'cache_images': False, 'transforms': [{'SegRandomRescale': {'scales': [0.25, 1.0]}}, {'SegColorJitter': {'brightness': 0.5, 'contrast': 0.5, 'saturation': 0.5}}, {'SegRandomFlip': {'prob': 0.5}}, {'SegPadShortToCropSize': {'crop_size': [320, 480], 'fill_mask': 0}}, {'SegCropImageAndMask': {'crop_size': [320, 480], 'mode': 'random'}}]}\n" + ] + }, + { + "output_type": "display_data", + "data": { + "text/plain": [ + "" + ], + "image/png": "\n" + }, + "metadata": {} + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "/usr/local/lib/python3.10/dist-packages/torchvision/transforms/functional.py:1603: UserWarning: The default value of the antialias parameter of all the resizing transforms (Resize(), RandomResizedCrop(), etc.) will change from None to True in v0.17, in order to be consistent across the PIL and Tensor backends. To suppress this warning, directly pass antialias=True (recommended, future default), antialias=None (current default, which means False for Tensors and True for PIL), or antialias=False (only works on Tensors - PIL will still use antialiasing). 
This also applies if you are using the inference transforms from the models weights: update the call to weights.transforms(antialias=True).\n", + " warnings.warn(\n" + ] + }, + { + "output_type": "display_data", + "data": { + "text/plain": [ + "" + ], + "image/png": "\n" + }, + "metadata": {} + } + ], + "source": [ + "from PIL import Image\n", + "from torchvision.utils import draw_segmentation_masks\n", + "from torchvision.transforms import ToTensor, ToPILImage, Resize\n", + "import numpy as np\n", + "import torch\n", + "\n", + "def plot_seg_data(img_path: str, target_path: str):\n", + " image = (ToTensor()(Image.open(img_path).convert('RGB')) * 255).type(torch.uint8)\n", + " target = torch.from_numpy(np.array(Image.open(target_path))).bool()\n", + " image = draw_segmentation_masks(image, target, colors=\"red\", alpha=0.4)\n", + " image = Resize(size=200)(image)\n", + " display(ToPILImage()(image))\n", + "\n", + "for i in range(4, 7):\n", + " img_path, target_path = train_loader.dataset.samples_targets_tuples_list[i]\n", + " plot_seg_data(img_path, target_path)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "l5GcDAg_pUGJ" + }, + "source": [ + "# 3. Architecture definition\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "fU8orO7wlwIK" + }, + "source": [ + "SG includes implementations of many different architectures for semantic segmentation tasks that can be found [here](https://github.com/Deci-AI/super-gradients#implemented-model-architectures)." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "-oGSU3V8lqcm" + }, + "source": [ + "Create a PPLiteSeg nn.Module, with 1 class segmentation head classifier. 
For simplicity `use_aux_heads` is set to `False`\n", + "and the extra auxiliary heads aren't used for training.\n", + "\n", + "Other segmentation models can be used for this task, such as DDRNet, STDC and RegSeg.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "id": "f6ZTsO0nrdje", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "5f5dffd8-738f-4fa7-e5c4-df8e2bd5622d" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Downloading: \"https://sghub.deci.ai/models/pp_lite_t_seg75_cityscapes.pth\" to /root/.cache/torch/hub/checkpoints/pp_lite_t_seg75_cityscapes.pth\n", + "100%|██████████| 31.4M/31.4M [00:01<00:00, 26.4MB/s]\n", + "[2023-11-13 12:19:38] INFO - checkpoint_utils.py - Successfully loaded pretrained weights for architecture pp_lite_t_seg75\n" + ] + } + ], + "source": [ + "from super_gradients.training import models\n", + "from super_gradients.common.object_names import Models\n", + "\n", + "model = models.get(model_name=Models.PP_LITE_T_SEG75,\n", + " arch_params={\"use_aux_heads\": False},\n", + " num_classes=1,\n", + " pretrained_weights=\"cityscapes\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "X-_dBewgr1dG" + }, + "source": [ + "# 4. Training setup\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "H1Rll8Orl-Dy" + }, + "source": [ + "\n", + "Here we define the training recipe. 
The full list of supported parameters is available under [training parameters supported](https://deci-ai.github.io/super-gradients/user_guide.html#training-parameters).\n", + "\n", + "We will use an average of BCE and Dice loss for segmentation, with different learning rates for the replaced segmentation head layer and for the rest of the network. This is controlled by the `multiply_head_lr` parameter, which is the multiplication factor applied to the learning rate of the newly replaced layer.\n", + "\n", + "As our `metric_to_watch`, we will monitor `target_IOU`, which is one of the components of the `BinaryIOU` torchmetrics object (the other components are `background_IOU` and `mean_IOU`, which is the mean of the background and target IOUs)." + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "id": "NShu3zLgr5qD" + }, + "outputs": [], + "source": [ + "from super_gradients.training.metrics.segmentation_metrics import BinaryIOU\n", + "from super_gradients.training.utils.callbacks import BinarySegmentationVisualizationCallback, Phase\n", + "\n", + "train_params = {\"max_epochs\": 15,\n", + " \"lr_mode\": \"cosine\",\n", + " \"initial_lr\": 0.005,\n", + " \"lr_warmup_epochs\": 5,\n", + " \"multiply_head_lr\": 10,\n", + " \"optimizer\": \"SGD\",\n", + " \"loss\": \"BCEDiceLoss\",\n", + " \"ema\": True,\n", + " \"ema_params\":\n", + " {\n", + " \"decay\": 0.9999,\n", + " \"decay_type\": \"exp\",\n", + " \"beta\": 15,\n", + " },\n", + " \"zero_weight_decay_on_bias_and_bn\": True,\n", + " \"average_best_models\": True,\n", + " \"metric_to_watch\": \"target_IOU\",\n", + " \"greater_metric_to_watch_is_better\": True,\n", + " \"train_metrics_list\": [BinaryIOU()],\n", + " \"valid_metrics_list\": [BinaryIOU()],\n", + " \"loss_logging_items_names\": [\"loss\"],\n", + " \"phase_callbacks\": [BinarySegmentationVisualizationCallback(phase=Phase.VALIDATION_BATCH_END,\n", + " freq=1,\n", + " last_img_idx_in_batch=4)],\n", + " }" + ] + }, + { + "cell_type": "markdown", + 
"metadata": { + "id": "qTECVyhcs506" + }, + "source": [ + "# 5. Training and evaluation\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "S1K5MU2kmmDb" + }, + "source": [ + "The logs and the checkpoint for the latest epoch will be kept in your experiment folder.\n", + "\n", + "To start training we'll call `train(...)` and provide it with the objects we constructed above: the model, the training parameters and the data loaders.\n", + "\n", + "**Note:** While training, don't forget to refresh TensorBoard with the arrow on the top right." + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": { + "id": "u6roEj9ktFTi", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "2526afe4-98f3-466b-bfc1-133b9ca1d047" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stderr", + "text": [ + "[2023-11-13 12:19:47] INFO - sg_trainer.py - Starting a new run with `run_id=RUN_20231113_121947_461500`\n", + "[2023-11-13 12:19:47] INFO - sg_trainer.py - Checkpoints directory: ./notebook_ckpts/segmentation_transfer_learning/RUN_20231113_121947_461500\n", + "[2023-11-13 12:19:47] INFO - sg_trainer.py - Using EMA with params {'decay': 0.9999, 'decay_type': 'exp', 'beta': 15}\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "The console stream is now moved to ./notebook_ckpts/segmentation_transfer_learning/RUN_20231113_121947_461500/console_Nov13_12_19_47.txt\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "[2023-11-13 12:19:49] INFO - sg_trainer_utils.py - TRAINING PARAMETERS:\n", + " - Mode: Single GPU\n", + " - Number of GPUs: 1 (1 available on the machine)\n", + " - Full dataset size: 2477 (len(train_set))\n", + " - Batch size per GPU: 8 (batch_size)\n", + " - Batch Accumulate: 1 (batch_accumulate)\n", + " - Total batch size: 8 (num_gpus * batch_size)\n", + " - Effective Batch size: 8 (num_gpus * batch_size * batch_accumulate)\n", + " - Iterations per epoch: 309 
(len(train_loader))\n", + " - Gradient updates per epoch: 309 (len(train_loader) / batch_accumulate)\n", + "\n", + "[2023-11-13 12:19:49] INFO - sg_trainer.py - Started training for 15 epochs (0/14)\n", + "\n", + "Train epoch 0: 100%|██████████| 309/309 [01:55<00:00, 2.69it/s, BCEDiceLoss=0.224, background_IOU=0.746, gpu_mem=1.14, mean_IOU=0.777, target_IOU=0.807]\n", + "Validating: 100%|██████████| 65/65 [00:16<00:00, 4.02it/s]\n", + "[2023-11-13 12:22:01] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_transfer_learning/RUN_20231113_121947_461500/ckpt_best.pth\n", + "[2023-11-13 12:22:01] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.852721631526947\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 0\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.2242\n", + "│ ├── Target_iou = 0.8072\n", + "│ ├── Background_iou = 0.7463\n", + "│ └── Mean_iou = 0.7768\n", + "└── Validation\n", + " ├── Bcediceloss = 0.1872\n", + " ├── Target_iou = 0.8527\n", + " ├── Background_iou = 0.7289\n", + " └── Mean_iou = 0.7908\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 1: 100%|██████████| 309/309 [01:42<00:00, 3.00it/s, BCEDiceLoss=0.167, background_IOU=0.814, gpu_mem=1.14, mean_IOU=0.836, target_IOU=0.859]\n", + "Validating epoch 1: 100%|██████████| 65/65 [00:16<00:00, 3.91it/s]\n", + "[2023-11-13 12:24:04] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_transfer_learning/RUN_20231113_121947_461500/ckpt_best.pth\n", + "[2023-11-13 12:24:04] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.8766682147979736\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + 
"SUMMARY OF EPOCH 1\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.167\n", + "│ │ ├── Epoch N-1 = 0.2242 (\u001b[32m↘ -0.0572\u001b[0m)\n", + "│ │ └── Best until now = 0.2242 (\u001b[32m↘ -0.0572\u001b[0m)\n", + "│ ├── Target_iou = 0.8588\n", + "│ │ ├── Epoch N-1 = 0.8072 (\u001b[32m↗ 0.0516\u001b[0m)\n", + "│ │ └── Best until now = 0.8072 (\u001b[32m↗ 0.0516\u001b[0m)\n", + "│ ├── Background_iou = 0.8137\n", + "│ │ ├── Epoch N-1 = 0.7463 (\u001b[32m↗ 0.0673\u001b[0m)\n", + "│ │ └── Best until now = 0.7463 (\u001b[32m↗ 0.0673\u001b[0m)\n", + "│ └── Mean_iou = 0.8362\n", + "│ ├── Epoch N-1 = 0.7768 (\u001b[32m↗ 0.0594\u001b[0m)\n", + "│ └── Best until now = 0.7768 (\u001b[32m↗ 0.0594\u001b[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.155\n", + " │ ├── Epoch N-1 = 0.1872 (\u001b[32m↘ -0.0321\u001b[0m)\n", + " │ └── Best until now = 0.1872 (\u001b[32m↘ -0.0321\u001b[0m)\n", + " ├── Target_iou = 0.8767\n", + " │ ├── Epoch N-1 = 0.8527 (\u001b[32m↗ 0.0239\u001b[0m)\n", + " │ └── Best until now = 0.8527 (\u001b[32m↗ 0.0239\u001b[0m)\n", + " ├── Background_iou = 0.7812\n", + " │ ├── Epoch N-1 = 0.7289 (\u001b[32m↗ 0.0522\u001b[0m)\n", + " │ └── Best until now = 0.7289 (\u001b[32m↗ 0.0522\u001b[0m)\n", + " └── Mean_iou = 0.8289\n", + " ├── Epoch N-1 = 0.7908 (\u001b[32m↗ 0.0381\u001b[0m)\n", + " └── Best until now = 0.7908 (\u001b[32m↗ 0.0381\u001b[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 2: 100%|██████████| 309/309 [01:45<00:00, 2.94it/s, BCEDiceLoss=0.146, background_IOU=0.843, gpu_mem=1.14, mean_IOU=0.86, target_IOU=0.877]\n", + "Validating epoch 2: 100%|██████████| 65/65 [00:14<00:00, 4.37it/s]\n", + "[2023-11-13 12:26:05] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_transfer_learning/RUN_20231113_121947_461500/ckpt_best.pth\n", + "[2023-11-13 12:26:05] INFO - sg_trainer.py - Best checkpoint overriden: 
validation target_IOU: 0.8859243988990784\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 2\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.1456\n", + "│ │ ├── Epoch N-1 = 0.167 (\u001b[32m↘ -0.0213\u001b[0m)\n", + "│ │ └── Best until now = 0.167 (\u001b[32m↘ -0.0213\u001b[0m)\n", + "│ ├── Target_iou = 0.8769\n", + "│ │ ├── Epoch N-1 = 0.8588 (\u001b[32m↗ 0.0181\u001b[0m)\n", + "│ │ └── Best until now = 0.8588 (\u001b[32m↗ 0.0181\u001b[0m)\n", + "│ ├── Background_iou = 0.8434\n", + "│ │ ├── Epoch N-1 = 0.8137 (\u001b[32m↗ 0.0297\u001b[0m)\n", + "│ │ └── Best until now = 0.8137 (\u001b[32m↗ 0.0297\u001b[0m)\n", + "│ └── Mean_iou = 0.8601\n", + "│ ├── Epoch N-1 = 0.8362 (\u001b[32m↗ 0.0239\u001b[0m)\n", + "│ └── Best until now = 0.8362 (\u001b[32m↗ 0.0239\u001b[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.145\n", + " │ ├── Epoch N-1 = 0.155 (\u001b[32m↘ -0.01\u001b[0m)\n", + " │ └── Best until now = 0.155 (\u001b[32m↘ -0.01\u001b[0m)\n", + " ├── Target_iou = 0.8859\n", + " │ ├── Epoch N-1 = 0.8767 (\u001b[32m↗ 0.0093\u001b[0m)\n", + " │ └── Best until now = 0.8767 (\u001b[32m↗ 0.0093\u001b[0m)\n", + " ├── Background_iou = 0.7947\n", + " │ ├── Epoch N-1 = 0.7812 (\u001b[32m↗ 0.0135\u001b[0m)\n", + " │ └── Best until now = 0.7812 (\u001b[32m↗ 0.0135\u001b[0m)\n", + " └── Mean_iou = 0.8403\n", + " ├── Epoch N-1 = 0.8289 (\u001b[32m↗ 0.0114\u001b[0m)\n", + " └── Best until now = 0.8289 (\u001b[32m↗ 0.0114\u001b[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 3: 100%|██████████| 309/309 [01:43<00:00, 2.98it/s, BCEDiceLoss=0.132, background_IOU=0.856, gpu_mem=1.14, mean_IOU=0.871, target_IOU=0.885]\n", + "Validating epoch 3: 100%|██████████| 65/65 [00:16<00:00, 3.93it/s]\n", + "[2023-11-13 12:28:07] INFO - base_sg_logger.py - 
Checkpoint saved in ./notebook_ckpts/segmentation_transfer_learning/RUN_20231113_121947_461500/ckpt_best.pth\n", + "[2023-11-13 12:28:07] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.9014593362808228\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 3\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.1318\n", + "│ │ ├── Epoch N-1 = 0.1456 (\u001b[32m↘ -0.0138\u001b[0m)\n", + "│ │ └── Best until now = 0.1456 (\u001b[32m↘ -0.0138\u001b[0m)\n", + "│ ├── Target_iou = 0.8852\n", + "│ │ ├── Epoch N-1 = 0.8769 (\u001b[32m↗ 0.0083\u001b[0m)\n", + "│ │ └── Best until now = 0.8769 (\u001b[32m↗ 0.0083\u001b[0m)\n", + "│ ├── Background_iou = 0.8559\n", + "│ │ ├── Epoch N-1 = 0.8434 (\u001b[32m↗ 0.0125\u001b[0m)\n", + "│ │ └── Best until now = 0.8434 (\u001b[32m↗ 0.0125\u001b[0m)\n", + "│ └── Mean_iou = 0.8706\n", + "│ ├── Epoch N-1 = 0.8601 (\u001b[32m↗ 0.0104\u001b[0m)\n", + "│ └── Best until now = 0.8601 (\u001b[32m↗ 0.0104\u001b[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.1255\n", + " │ ├── Epoch N-1 = 0.145 (\u001b[32m↘ -0.0195\u001b[0m)\n", + " │ └── Best until now = 0.145 (\u001b[32m↘ -0.0195\u001b[0m)\n", + " ├── Target_iou = 0.9015\n", + " │ ├── Epoch N-1 = 0.8859 (\u001b[32m↗ 0.0155\u001b[0m)\n", + " │ └── Best until now = 0.8859 (\u001b[32m↗ 0.0155\u001b[0m)\n", + " ├── Background_iou = 0.8281\n", + " │ ├── Epoch N-1 = 0.7947 (\u001b[32m↗ 0.0334\u001b[0m)\n", + " │ └── Best until now = 0.7947 (\u001b[32m↗ 0.0334\u001b[0m)\n", + " └── Mean_iou = 0.8648\n", + " ├── Epoch N-1 = 0.8403 (\u001b[32m↗ 0.0245\u001b[0m)\n", + " └── Best until now = 0.8403 (\u001b[32m↗ 0.0245\u001b[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 4: 100%|██████████| 309/309 [01:42<00:00, 3.01it/s, BCEDiceLoss=0.122, 
background_IOU=0.866, gpu_mem=1.14, mean_IOU=0.881, target_IOU=0.896]\n", + "Validating epoch 4: 100%|██████████| 65/65 [00:16<00:00, 3.94it/s]\n", + "[2023-11-13 12:30:07] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_transfer_learning/RUN_20231113_121947_461500/ckpt_best.pth\n", + "[2023-11-13 12:30:07] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.9075297117233276\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 4\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.1216\n", + "│ │ ├── Epoch N-1 = 0.1318 (\u001b[32m↘ -0.0102\u001b[0m)\n", + "│ │ └── Best until now = 0.1318 (\u001b[32m↘ -0.0102\u001b[0m)\n", + "│ ├── Target_iou = 0.8963\n", + "│ │ ├── Epoch N-1 = 0.8852 (\u001b[32m↗ 0.0111\u001b[0m)\n", + "│ │ └── Best until now = 0.8852 (\u001b[32m↗ 0.0111\u001b[0m)\n", + "│ ├── Background_iou = 0.8663\n", + "│ │ ├── Epoch N-1 = 0.8559 (\u001b[32m↗ 0.0104\u001b[0m)\n", + "│ │ └── Best until now = 0.8559 (\u001b[32m↗ 0.0104\u001b[0m)\n", + "│ └── Mean_iou = 0.8813\n", + "│ ├── Epoch N-1 = 0.8706 (\u001b[32m↗ 0.0107\u001b[0m)\n", + "│ └── Best until now = 0.8706 (\u001b[32m↗ 0.0107\u001b[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.1177\n", + " │ ├── Epoch N-1 = 0.1255 (\u001b[32m↘ -0.0078\u001b[0m)\n", + " │ └── Best until now = 0.1255 (\u001b[32m↘ -0.0078\u001b[0m)\n", + " ├── Target_iou = 0.9075\n", + " │ ├── Epoch N-1 = 0.9015 (\u001b[32m↗ 0.0061\u001b[0m)\n", + " │ └── Best until now = 0.9015 (\u001b[32m↗ 0.0061\u001b[0m)\n", + " ├── Background_iou = 0.8385\n", + " │ ├── Epoch N-1 = 0.8281 (\u001b[32m↗ 0.0103\u001b[0m)\n", + " │ └── Best until now = 0.8281 (\u001b[32m↗ 0.0103\u001b[0m)\n", + " └── Mean_iou = 0.873\n", + " ├── Epoch N-1 = 0.8648 (\u001b[32m↗ 0.0082\u001b[0m)\n", + " └── Best until now = 0.8648 (\u001b[32m↗ 0.0082\u001b[0m)\n", + "\n", + 
"===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 5: 100%|██████████| 309/309 [01:44<00:00, 2.94it/s, BCEDiceLoss=0.121, background_IOU=0.865, gpu_mem=1.14, mean_IOU=0.88, target_IOU=0.895]\n", + "Validating epoch 5: 100%|██████████| 65/65 [00:14<00:00, 4.47it/s]\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 5\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.1214\n", + "│ │ ├── Epoch N-1 = 0.1216 (\u001b[32m↘ -0.0002\u001b[0m)\n", + "│ │ └── Best until now = 0.1216 (\u001b[32m↘ -0.0002\u001b[0m)\n", + "│ ├── Target_iou = 0.8954\n", + "│ │ ├── Epoch N-1 = 0.8963 (\u001b[31m↘ -0.0009\u001b[0m)\n", + "│ │ └── Best until now = 0.8963 (\u001b[31m↘ -0.0009\u001b[0m)\n", + "│ ├── Background_iou = 0.8655\n", + "│ │ ├── Epoch N-1 = 0.8663 (\u001b[31m↘ -0.0008\u001b[0m)\n", + "│ │ └── Best until now = 0.8663 (\u001b[31m↘ -0.0008\u001b[0m)\n", + "│ └── Mean_iou = 0.8804\n", + "│ ├── Epoch N-1 = 0.8813 (\u001b[31m↘ -0.0008\u001b[0m)\n", + "│ └── Best until now = 0.8813 (\u001b[31m↘ -0.0008\u001b[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.1194\n", + " │ ├── Epoch N-1 = 0.1177 (\u001b[31m↗ 0.0017\u001b[0m)\n", + " │ └── Best until now = 0.1177 (\u001b[31m↗ 0.0017\u001b[0m)\n", + " ├── Target_iou = 0.9054\n", + " │ ├── Epoch N-1 = 0.9075 (\u001b[31m↘ -0.0021\u001b[0m)\n", + " │ └── Best until now = 0.9075 (\u001b[31m↘ -0.0021\u001b[0m)\n", + " ├── Background_iou = 0.8344\n", + " │ ├── Epoch N-1 = 0.8385 (\u001b[31m↘ -0.004\u001b[0m)\n", + " │ └── Best until now = 0.8385 (\u001b[31m↘ -0.004\u001b[0m)\n", + " └── Mean_iou = 0.8699\n", + " ├── Epoch N-1 = 0.873 (\u001b[31m↘ -0.0031\u001b[0m)\n", + " └── Best until now = 0.873 (\u001b[31m↘ -0.0031\u001b[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": 
"stream", + "name": "stderr", + "text": [ + "Train epoch 6: 100%|██████████| 309/309 [01:45<00:00, 2.93it/s, BCEDiceLoss=0.118, background_IOU=0.87, gpu_mem=1.14, mean_IOU=0.884, target_IOU=0.899]\n", + "Validating epoch 6: 100%|██████████| 65/65 [00:16<00:00, 3.90it/s]\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 6\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.1177\n", + "│ │ ├── Epoch N-1 = 0.1214 (\u001b[32m↘ -0.0037\u001b[0m)\n", + "│ │ └── Best until now = 0.1214 (\u001b[32m↘ -0.0037\u001b[0m)\n", + "│ ├── Target_iou = 0.8986\n", + "│ │ ├── Epoch N-1 = 0.8954 (\u001b[32m↗ 0.0032\u001b[0m)\n", + "│ │ └── Best until now = 0.8963 (\u001b[32m↗ 0.0023\u001b[0m)\n", + "│ ├── Background_iou = 0.8702\n", + "│ │ ├── Epoch N-1 = 0.8655 (\u001b[32m↗ 0.0047\u001b[0m)\n", + "│ │ └── Best until now = 0.8663 (\u001b[32m↗ 0.0039\u001b[0m)\n", + "│ └── Mean_iou = 0.8844\n", + "│ ├── Epoch N-1 = 0.8804 (\u001b[32m↗ 0.004\u001b[0m)\n", + "│ └── Best until now = 0.8813 (\u001b[32m↗ 0.0031\u001b[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.117\n", + " │ ├── Epoch N-1 = 0.1194 (\u001b[32m↘ -0.0024\u001b[0m)\n", + " │ └── Best until now = 0.1177 (\u001b[32m↘ -0.0007\u001b[0m)\n", + " ├── Target_iou = 0.9073\n", + " │ ├── Epoch N-1 = 0.9054 (\u001b[32m↗ 0.0019\u001b[0m)\n", + " │ └── Best until now = 0.9075 (\u001b[31m↘ -0.0003\u001b[0m)\n", + " ├── Background_iou = 0.8375\n", + " │ ├── Epoch N-1 = 0.8344 (\u001b[32m↗ 0.0031\u001b[0m)\n", + " │ └── Best until now = 0.8385 (\u001b[31m↘ -0.001\u001b[0m)\n", + " └── Mean_iou = 0.8724\n", + " ├── Epoch N-1 = 0.8699 (\u001b[32m↗ 0.0025\u001b[0m)\n", + " └── Best until now = 0.873 (\u001b[31m↘ -0.0006\u001b[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 7: 100%|██████████| 309/309 [01:44<00:00, 
2.97it/s, BCEDiceLoss=0.11, background_IOU=0.877, gpu_mem=1.14, mean_IOU=0.891, target_IOU=0.905]\n", + "Validating epoch 7: 100%|██████████| 65/65 [00:15<00:00, 4.10it/s]\n", + "[2023-11-13 12:36:14] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_transfer_learning/RUN_20231113_121947_461500/ckpt_best.pth\n", + "[2023-11-13 12:36:14] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.9087882041931152\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 7\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.1099\n", + "│ │ ├── Epoch N-1 = 0.1177 (\u001b[32m↘ -0.0078\u001b[0m)\n", + "│ │ └── Best until now = 0.1177 (\u001b[32m↘ -0.0078\u001b[0m)\n", + "│ ├── Target_iou = 0.9051\n", + "│ │ ├── Epoch N-1 = 0.8986 (\u001b[32m↗ 0.0064\u001b[0m)\n", + "│ │ └── Best until now = 0.8986 (\u001b[32m↗ 0.0064\u001b[0m)\n", + "│ ├── Background_iou = 0.8775\n", + "│ │ ├── Epoch N-1 = 0.8702 (\u001b[32m↗ 0.0073\u001b[0m)\n", + "│ │ └── Best until now = 0.8702 (\u001b[32m↗ 0.0073\u001b[0m)\n", + "│ └── Mean_iou = 0.8913\n", + "│ ├── Epoch N-1 = 0.8844 (\u001b[32m↗ 0.0069\u001b[0m)\n", + "│ └── Best until now = 0.8844 (\u001b[32m↗ 0.0069\u001b[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.1151\n", + " │ ├── Epoch N-1 = 0.117 (\u001b[32m↘ -0.002\u001b[0m)\n", + " │ └── Best until now = 0.117 (\u001b[32m↘ -0.002\u001b[0m)\n", + " ├── Target_iou = 0.9088\n", + " │ ├── Epoch N-1 = 0.9073 (\u001b[32m↗ 0.0015\u001b[0m)\n", + " │ └── Best until now = 0.9075 (\u001b[32m↗ 0.0013\u001b[0m)\n", + " ├── Background_iou = 0.8402\n", + " │ ├── Epoch N-1 = 0.8375 (\u001b[32m↗ 0.0027\u001b[0m)\n", + " │ └── Best until now = 0.8385 (\u001b[32m↗ 0.0017\u001b[0m)\n", + " └── Mean_iou = 0.8745\n", + " ├── Epoch N-1 = 0.8724 (\u001b[32m↗ 0.0021\u001b[0m)\n", + " └── Best until now = 0.873 (\u001b[32m↗ 0.0015\u001b[0m)\n", + "\n", + 
"===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 8: 100%|██████████| 309/309 [01:44<00:00, 2.97it/s, BCEDiceLoss=0.0972, background_IOU=0.889, gpu_mem=1.14, mean_IOU=0.902, target_IOU=0.915]\n", + "Validating epoch 8: 100%|██████████| 65/65 [00:16<00:00, 3.94it/s]\n", + "[2023-11-13 12:38:18] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_transfer_learning/RUN_20231113_121947_461500/ckpt_best.pth\n", + "[2023-11-13 12:38:18] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.9094919562339783\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 8\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.0972\n", + "│ │ ├── Epoch N-1 = 0.1099 (\u001b[32m↘ -0.0127\u001b[0m)\n", + "│ │ └── Best until now = 0.1099 (\u001b[32m↘ -0.0127\u001b[0m)\n", + "│ ├── Target_iou = 0.9145\n", + "│ │ ├── Epoch N-1 = 0.9051 (\u001b[32m↗ 0.0095\u001b[0m)\n", + "│ │ └── Best until now = 0.9051 (\u001b[32m↗ 0.0095\u001b[0m)\n", + "│ ├── Background_iou = 0.8892\n", + "│ │ ├── Epoch N-1 = 0.8775 (\u001b[32m↗ 0.0117\u001b[0m)\n", + "│ │ └── Best until now = 0.8775 (\u001b[32m↗ 0.0117\u001b[0m)\n", + "│ └── Mean_iou = 0.9019\n", + "│ ├── Epoch N-1 = 0.8913 (\u001b[32m↗ 0.0106\u001b[0m)\n", + "│ └── Best until now = 0.8913 (\u001b[32m↗ 0.0106\u001b[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.1142\n", + " │ ├── Epoch N-1 = 0.1151 (\u001b[32m↘ -0.0008\u001b[0m)\n", + " │ └── Best until now = 0.1151 (\u001b[32m↘ -0.0008\u001b[0m)\n", + " ├── Target_iou = 0.9095\n", + " │ ├── Epoch N-1 = 0.9088 (\u001b[32m↗ 0.0007\u001b[0m)\n", + " │ └── Best until now = 0.9088 (\u001b[32m↗ 0.0007\u001b[0m)\n", + " ├── Background_iou = 0.8414\n", + " │ ├── Epoch N-1 = 0.8402 (\u001b[32m↗ 0.0013\u001b[0m)\n", + " │ └── Best until now = 0.8402 (\u001b[32m↗ 
0.0013\u001b[0m)\n", + " └── Mean_iou = 0.8755\n", + " ├── Epoch N-1 = 0.8745 (\u001b[32m↗ 0.001\u001b[0m)\n", + " └── Best until now = 0.8745 (\u001b[32m↗ 0.001\u001b[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 9: 100%|██████████| 309/309 [01:44<00:00, 2.97it/s, BCEDiceLoss=0.0938, background_IOU=0.896, gpu_mem=1.14, mean_IOU=0.907, target_IOU=0.918]\n", + "Validating epoch 9: 100%|██████████| 65/65 [00:15<00:00, 4.28it/s]\n", + "[2023-11-13 12:40:20] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_transfer_learning/RUN_20231113_121947_461500/ckpt_best.pth\n", + "[2023-11-13 12:40:20] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.9101359248161316\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 9\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.0938\n", + "│ │ ├── Epoch N-1 = 0.0972 (\u001b[32m↘ -0.0034\u001b[0m)\n", + "│ │ └── Best until now = 0.0972 (\u001b[32m↘ -0.0034\u001b[0m)\n", + "│ ├── Target_iou = 0.9184\n", + "│ │ ├── Epoch N-1 = 0.9145 (\u001b[32m↗ 0.0039\u001b[0m)\n", + "│ │ └── Best until now = 0.9145 (\u001b[32m↗ 0.0039\u001b[0m)\n", + "│ ├── Background_iou = 0.8955\n", + "│ │ ├── Epoch N-1 = 0.8892 (\u001b[32m↗ 0.0063\u001b[0m)\n", + "│ │ └── Best until now = 0.8892 (\u001b[32m↗ 0.0063\u001b[0m)\n", + "│ └── Mean_iou = 0.907\n", + "│ ├── Epoch N-1 = 0.9019 (\u001b[32m↗ 0.0051\u001b[0m)\n", + "│ └── Best until now = 0.9019 (\u001b[32m↗ 0.0051\u001b[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.1134\n", + " │ ├── Epoch N-1 = 0.1142 (\u001b[32m↘ -0.0008\u001b[0m)\n", + " │ └── Best until now = 0.1142 (\u001b[32m↘ -0.0008\u001b[0m)\n", + " ├── Target_iou = 0.9101\n", + " │ ├── Epoch N-1 = 0.9095 (\u001b[32m↗ 0.0006\u001b[0m)\n", + " │ └── Best until now = 
0.9095 (\u001b[32m↗ 0.0006\u001b[0m)\n", + " ├── Background_iou = 0.8427\n", + " │ ├── Epoch N-1 = 0.8414 (\u001b[32m↗ 0.0012\u001b[0m)\n", + " │ └── Best until now = 0.8414 (\u001b[32m↗ 0.0012\u001b[0m)\n", + " └── Mean_iou = 0.8764\n", + " ├── Epoch N-1 = 0.8755 (\u001b[32m↗ 0.0009\u001b[0m)\n", + " └── Best until now = 0.8755 (\u001b[32m↗ 0.0009\u001b[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 10: 100%|██████████| 309/309 [01:44<00:00, 2.95it/s, BCEDiceLoss=0.0853, background_IOU=0.902, gpu_mem=1.14, mean_IOU=0.913, target_IOU=0.924]\n", + "Validating epoch 10: 100%|██████████| 65/65 [00:16<00:00, 3.89it/s]\n", + "[2023-11-13 12:42:25] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_transfer_learning/RUN_20231113_121947_461500/ckpt_best.pth\n", + "[2023-11-13 12:42:25] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.9106564521789551\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 10\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.0853\n", + "│ │ ├── Epoch N-1 = 0.0938 (\u001b[32m↘ -0.0085\u001b[0m)\n", + "│ │ └── Best until now = 0.0938 (\u001b[32m↘ -0.0085\u001b[0m)\n", + "│ ├── Target_iou = 0.9241\n", + "│ │ ├── Epoch N-1 = 0.9184 (\u001b[32m↗ 0.0057\u001b[0m)\n", + "│ │ └── Best until now = 0.9184 (\u001b[32m↗ 0.0057\u001b[0m)\n", + "│ ├── Background_iou = 0.9023\n", + "│ │ ├── Epoch N-1 = 0.8955 (\u001b[32m↗ 0.0068\u001b[0m)\n", + "│ │ └── Best until now = 0.8955 (\u001b[32m↗ 0.0068\u001b[0m)\n", + "│ └── Mean_iou = 0.9132\n", + "│ ├── Epoch N-1 = 0.907 (\u001b[32m↗ 0.0062\u001b[0m)\n", + "│ └── Best until now = 0.907 (\u001b[32m↗ 0.0062\u001b[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.1127\n", + " │ ├── Epoch N-1 = 0.1134 (\u001b[32m↘ -0.0007\u001b[0m)\n", + 
" │ └── Best until now = 0.1134 (\u001b[32m↘ -0.0007\u001b[0m)\n", + " ├── Target_iou = 0.9107\n", + " │ ├── Epoch N-1 = 0.9101 (\u001b[32m↗ 0.0005\u001b[0m)\n", + " │ └── Best until now = 0.9101 (\u001b[32m↗ 0.0005\u001b[0m)\n", + " ├── Background_iou = 0.8437\n", + " │ ├── Epoch N-1 = 0.8427 (\u001b[32m↗ 0.001\u001b[0m)\n", + " │ └── Best until now = 0.8427 (\u001b[32m↗ 0.001\u001b[0m)\n", + " └── Mean_iou = 0.8772\n", + " ├── Epoch N-1 = 0.8764 (\u001b[32m↗ 0.0008\u001b[0m)\n", + " └── Best until now = 0.8764 (\u001b[32m↗ 0.0008\u001b[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 11: 100%|██████████| 309/309 [01:49<00:00, 2.82it/s, BCEDiceLoss=0.0808, background_IOU=0.908, gpu_mem=1.14, mean_IOU=0.919, target_IOU=0.93]\n", + "Validating epoch 11: 100%|██████████| 65/65 [00:15<00:00, 4.13it/s]\n", + "[2023-11-13 12:44:34] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_transfer_learning/RUN_20231113_121947_461500/ckpt_best.pth\n", + "[2023-11-13 12:44:34] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.9110515117645264\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 11\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.0808\n", + "│ │ ├── Epoch N-1 = 0.0853 (\u001b[32m↘ -0.0046\u001b[0m)\n", + "│ │ └── Best until now = 0.0853 (\u001b[32m↘ -0.0046\u001b[0m)\n", + "│ ├── Target_iou = 0.9301\n", + "│ │ ├── Epoch N-1 = 0.9241 (\u001b[32m↗ 0.006\u001b[0m)\n", + "│ │ └── Best until now = 0.9241 (\u001b[32m↗ 0.006\u001b[0m)\n", + "│ ├── Background_iou = 0.9075\n", + "│ │ ├── Epoch N-1 = 0.9023 (\u001b[32m↗ 0.0053\u001b[0m)\n", + "│ │ └── Best until now = 0.9023 (\u001b[32m↗ 0.0053\u001b[0m)\n", + "│ └── Mean_iou = 0.9188\n", + "│ ├── Epoch N-1 = 0.9132 (\u001b[32m↗ 0.0056\u001b[0m)\n", + 
"│ └── Best until now = 0.9132 (\u001b[32m↗ 0.0056\u001b[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.1121\n", + " │ ├── Epoch N-1 = 0.1127 (\u001b[32m↘ -0.0006\u001b[0m)\n", + " │ └── Best until now = 0.1127 (\u001b[32m↘ -0.0006\u001b[0m)\n", + " ├── Target_iou = 0.9111\n", + " │ ├── Epoch N-1 = 0.9107 (\u001b[32m↗ 0.0004\u001b[0m)\n", + " │ └── Best until now = 0.9107 (\u001b[32m↗ 0.0004\u001b[0m)\n", + " ├── Background_iou = 0.8445\n", + " │ ├── Epoch N-1 = 0.8437 (\u001b[32m↗ 0.0008\u001b[0m)\n", + " │ └── Best until now = 0.8437 (\u001b[32m↗ 0.0008\u001b[0m)\n", + " └── Mean_iou = 0.8778\n", + " ├── Epoch N-1 = 0.8772 (\u001b[32m↗ 0.0006\u001b[0m)\n", + " └── Best until now = 0.8772 (\u001b[32m↗ 0.0006\u001b[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 12: 100%|██████████| 309/309 [01:44<00:00, 2.97it/s, BCEDiceLoss=0.0779, background_IOU=0.911, gpu_mem=1.14, mean_IOU=0.921, target_IOU=0.932]\n", + "Validating epoch 12: 100%|██████████| 65/65 [00:16<00:00, 3.92it/s]\n", + "[2023-11-13 12:46:40] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_transfer_learning/RUN_20231113_121947_461500/ckpt_best.pth\n", + "[2023-11-13 12:46:40] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.9114375114440918\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 12\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.0779\n", + "│ │ ├── Epoch N-1 = 0.0808 (\u001b[32m↘ -0.0029\u001b[0m)\n", + "│ │ └── Best until now = 0.0808 (\u001b[32m↘ -0.0029\u001b[0m)\n", + "│ ├── Target_iou = 0.9317\n", + "│ │ ├── Epoch N-1 = 0.9301 (\u001b[32m↗ 0.0016\u001b[0m)\n", + "│ │ └── Best until now = 0.9301 (\u001b[32m↗ 0.0016\u001b[0m)\n", + "│ ├── Background_iou = 0.9113\n", + "│ │ ├── Epoch N-1 = 0.9075 
(\u001b[32m↗ 0.0038\u001b[0m)\n", + "│ │ └── Best until now = 0.9075 (\u001b[32m↗ 0.0038\u001b[0m)\n", + "│ └── Mean_iou = 0.9215\n", + "│ ├── Epoch N-1 = 0.9188 (\u001b[32m↗ 0.0027\u001b[0m)\n", + "│ └── Best until now = 0.9188 (\u001b[32m↗ 0.0027\u001b[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.1115\n", + " │ ├── Epoch N-1 = 0.1121 (\u001b[32m↘ -0.0006\u001b[0m)\n", + " │ └── Best until now = 0.1121 (\u001b[32m↘ -0.0006\u001b[0m)\n", + " ├── Target_iou = 0.9114\n", + " │ ├── Epoch N-1 = 0.9111 (\u001b[32m↗ 0.0004\u001b[0m)\n", + " │ └── Best until now = 0.9111 (\u001b[32m↗ 0.0004\u001b[0m)\n", + " ├── Background_iou = 0.8453\n", + " │ ├── Epoch N-1 = 0.8445 (\u001b[32m↗ 0.0008\u001b[0m)\n", + " │ └── Best until now = 0.8445 (\u001b[32m↗ 0.0008\u001b[0m)\n", + " └── Mean_iou = 0.8784\n", + " ├── Epoch N-1 = 0.8778 (\u001b[32m↗ 0.0006\u001b[0m)\n", + " └── Best until now = 0.8778 (\u001b[32m↗ 0.0006\u001b[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 13: 100%|██████████| 309/309 [01:50<00:00, 2.79it/s, BCEDiceLoss=0.0748, background_IOU=0.916, gpu_mem=1.14, mean_IOU=0.926, target_IOU=0.935]\n", + "Validating epoch 13: 100%|██████████| 65/65 [00:16<00:00, 3.97it/s]\n", + "[2023-11-13 12:48:53] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_transfer_learning/RUN_20231113_121947_461500/ckpt_best.pth\n", + "[2023-11-13 12:48:53] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.9118204712867737\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 13\n", + "├── Train\n", + "│ ├── Bcediceloss = 0.0748\n", + "│ │ ├── Epoch N-1 = 0.0779 (\u001b[32m↘ -0.0031\u001b[0m)\n", + "│ │ └── Best until now = 0.0779 (\u001b[32m↘ -0.0031\u001b[0m)\n", + "│ ├── Target_iou = 0.9349\n", + "│ │ 
├── Epoch N-1 = 0.9317 (\u001b[32m↗ 0.0032\u001b[0m)\n", + "│ │ └── Best until now = 0.9317 (\u001b[32m↗ 0.0032\u001b[0m)\n", + "│ ├── Background_iou = 0.9165\n", + "│ │ ├── Epoch N-1 = 0.9113 (\u001b[32m↗ 0.0052\u001b[0m)\n", + "│ │ └── Best until now = 0.9113 (\u001b[32m↗ 0.0052\u001b[0m)\n", + "│ └── Mean_iou = 0.9257\n", + "│ ├── Epoch N-1 = 0.9215 (\u001b[32m↗ 0.0042\u001b[0m)\n", + "│ └── Best until now = 0.9215 (\u001b[32m↗ 0.0042\u001b[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.111\n", + " │ ├── Epoch N-1 = 0.1115 (\u001b[32m↘ -0.0006\u001b[0m)\n", + " │ └── Best until now = 0.1115 (\u001b[32m↘ -0.0006\u001b[0m)\n", + " ├── Target_iou = 0.9118\n", + " │ ├── Epoch N-1 = 0.9114 (\u001b[32m↗ 0.0004\u001b[0m)\n", + " │ └── Best until now = 0.9114 (\u001b[32m↗ 0.0004\u001b[0m)\n", + " ├── Background_iou = 0.8461\n", + " │ ├── Epoch N-1 = 0.8453 (\u001b[32m↗ 0.0008\u001b[0m)\n", + " │ └── Best until now = 0.8453 (\u001b[32m↗ 0.0008\u001b[0m)\n", + " └── Mean_iou = 0.879\n", + " ├── Epoch N-1 = 0.8784 (\u001b[32m↗ 0.0006\u001b[0m)\n", + " └── Best until now = 0.8784 (\u001b[32m↗ 0.0006\u001b[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Train epoch 14: 100%|██████████| 309/309 [01:45<00:00, 2.94it/s, BCEDiceLoss=0.0742, background_IOU=0.915, gpu_mem=1.14, mean_IOU=0.925, target_IOU=0.934]\n", + "Validating epoch 14: 100%|██████████| 65/65 [00:16<00:00, 3.89it/s]\n", + "[2023-11-13 12:51:00] INFO - base_sg_logger.py - Checkpoint saved in ./notebook_ckpts/segmentation_transfer_learning/RUN_20231113_121947_461500/ckpt_best.pth\n", + "[2023-11-13 12:51:00] INFO - sg_trainer.py - Best checkpoint overriden: validation target_IOU: 0.9122186303138733\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "===========================================================\n", + "SUMMARY OF EPOCH 14\n", + "├── Train\n", + "│ ├── 
Bcediceloss = 0.0742\n", + "│ │ ├── Epoch N-1 = 0.0748 (\u001b[32m↘ -0.0006\u001b[0m)\n", + "│ │ └── Best until now = 0.0748 (\u001b[32m↘ -0.0006\u001b[0m)\n", + "│ ├── Target_iou = 0.9343\n", + "│ │ ├── Epoch N-1 = 0.9349 (\u001b[31m↘ -0.0006\u001b[0m)\n", + "│ │ └── Best until now = 0.9349 (\u001b[31m↘ -0.0006\u001b[0m)\n", + "│ ├── Background_iou = 0.9147\n", + "│ │ ├── Epoch N-1 = 0.9165 (\u001b[31m↘ -0.0017\u001b[0m)\n", + "│ │ └── Best until now = 0.9165 (\u001b[31m↘ -0.0017\u001b[0m)\n", + "│ └── Mean_iou = 0.9245\n", + "│ ├── Epoch N-1 = 0.9257 (\u001b[31m↘ -0.0012\u001b[0m)\n", + "│ └── Best until now = 0.9257 (\u001b[31m↘ -0.0012\u001b[0m)\n", + "└── Validation\n", + " ├── Bcediceloss = 0.1104\n", + " │ ├── Epoch N-1 = 0.111 (\u001b[32m↘ -0.0005\u001b[0m)\n", + " │ └── Best until now = 0.111 (\u001b[32m↘ -0.0005\u001b[0m)\n", + " ├── Target_iou = 0.9122\n", + " │ ├── Epoch N-1 = 0.9118 (\u001b[32m↗ 0.0004\u001b[0m)\n", + " │ └── Best until now = 0.9118 (\u001b[32m↗ 0.0004\u001b[0m)\n", + " ├── Background_iou = 0.8469\n", + " │ ├── Epoch N-1 = 0.8461 (\u001b[32m↗ 0.0008\u001b[0m)\n", + " │ └── Best until now = 0.8461 (\u001b[32m↗ 0.0008\u001b[0m)\n", + " └── Mean_iou = 0.8796\n", + " ├── Epoch N-1 = 0.879 (\u001b[32m↗ 0.0006\u001b[0m)\n", + " └── Best until now = 0.879 (\u001b[32m↗ 0.0006\u001b[0m)\n", + "\n", + "===========================================================\n" + ] + }, + { + "output_type": "stream", + "name": "stderr", + "text": [ + "[2023-11-13 12:51:04] INFO - sg_trainer.py - RUNNING ADDITIONAL TEST ON THE AVERAGED MODEL...\n", + "Validating epoch 15: 97%|█████████▋| 63/65 [00:17<00:00, 5.89it/s]" + ] + } + ], + "source": [ + "trainer.train(model=model, training_params=train_params, train_loader=train_loader, valid_loader=valid_loader)" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "id": "X8BJq1crcbjl" + }, + "outputs": [], + "source": [ + "print(\"Best Checkpoint mIoU is: \"+ str(trainer.best_metric))" + ] 
+ }, + { + "cell_type": "markdown", + "metadata": { + "id": "3Nybj15cchxd" + }, + "source": [ + "You can now download your trained weights from the directory printed below:" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "id": "_iHsFgPSciQh" + }, + "outputs": [], + "source": [ + "print(trainer.checkpoints_dir_path)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "yuhYeXLA18q5" + }, + "source": [ + "# 6. Predict\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "VjRA1tu1mvXQ" + }, + "source": [ + "Once training is complete, you can use the trained model to get predictions on the validation set, your own data, or any other image. Let's load an image and\n", + "run model inference to create a binary segmentation mask." + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": { + "id": "Ads7RyGN2JwQ", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 1000 + }, + "outputId": "a9fc3231-0de4-49bb-863c-6c5765381cae" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stderr", + "text": [ + "\rValidating epoch 15: 100%|██████████| 65/65 [00:18<00:00, 6.54it/s]\rValidating epoch 15: 100%|██████████| 65/65 [00:18<00:00, 3.56it/s]\n", + "[2023-11-13 12:51:23] INFO - base_sg_logger.py - [CLEANUP] - Successfully stopped system monitoring process\n" + ] + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Best Checkpoint mIoU is: 0.9122186303138733\n", + "./notebook_ckpts/segmentation_transfer_learning/RUN_20231113_121947_461500\n" + ] + }, + { + "output_type": "display_data", + "data": { + "text/plain": [ + "" + ], + "image/png": "\n" + }, + "metadata": {} + }, + { + "output_type": "display_data", + "data": { + "text/plain": [ + "" + ], + "image/png":
"iVBORw0KGgoAAAANSUhEUgAAAUAAAAHgCAAAAADx5+uYAAAGsElEQVR4nO3d3ZITNxBAYZni/V95c8FSycae8YxOtySrz7nJUltg9ece/xGgNTMzMzMzMzMzMzMzMzMz27nH7AM89XX8rfUOu9yZTvT+ttiJlzrOBb4/LXTqhY5y3a8tdO5lDnKLr7VlTr7IMe77tbbG4Vc4Q2t9fq0tcP7pB2it9fO11maPsAIg4vtu2hzzASP4vpsxzGzAQL7WZozza/gt/ijYL/zXe9/UDcwYd/RAEwGTtmXwRNMA8y62sSPNAcx9qBo60wzA/Ef6gVONBxzyRDlurOEvY8a/0MhtNOAgv3F302DAYYMNu6Ghj4FDL99Bk43cwLEPf4NubdwGTnj2GDHcKMA5T74DphsDOO21S/54Qx4D5732y7/lEYAzXzun3/YAwN3ee/wsH3CyX/bNpwNO37/kAyQ/TU3nay15xtwNXMIv9xSpgGv45Z4jE3AVv9STJAKu45d5lsm/sT6sNME8wJUWsOUdp8oGpgnWAUwSTANc7ApuLelIhTYwR7AUYIZgLcAEwSzABR8Ccyq2gfF3bBJgmQUst4HhlQOMvjZyAOtcwfU2MLoUwKUXMPhwbiAsA3DpBYwuAXB1v9jzeQnDBITFA65+BQef0A2EhQOuv4CxZ3QDYTUBA1ewJmBgAsKiAT/hOSQ0NxBWFDDuQikKGJeAsGDAcs8hbiBNQJiAMAFhAsJiAes9CbuBNAFhAsIEhAkIExAmIExAWFXAsJf8VQHDCgUs+E7ODaQJCBMQJiBMQJiAMAFhAsIEhAkIExAWCVjxrbAbSBMQJiBMQJiAMAFhAsIEhAkIExAmIExAmICwSMCp/177rNxAmIAwAWECwgSECQgTECYgTECYgDABYVUBw963+792wKpuYFiBgCUX0A2kxQHWXEA3kCYgLAyw6BXsBtIEhEUBVr2Cy27gan9nQtkFLLuBYcUA1l1AN5AWAlh4AetuYNSdHgFYeQEjAEv71b2Eo+53Dlh7AQtvYNA9jwGLL2DlDYyJAn7yAoacvfQGRghCwE9ewBZy/NIbGBED/PAFjBgAAX68X0AEcAc/PAMA3MGP1w+4iR8doxtwEz88SC/gNn60TsCd/NgsfYA7+cFpugD38mPz9ADu5ocm6gDcz490H3BLv/6hbgNu6QfGugu4qV//YDcBt/XrHu0e4MZ+vcPd+mOfW/u11vWHYP1IH3YHcPsF7JnwBuD+fj0zegnDrgNWWMCOKd1A2GXAGgt4f043EHYVsMoC3p7UDYRdBKyzgHdndQOfuyV4DbDSArZ74/5OO8VC/f2QJWMPLm3ghy/g4+mLd90Y+Argh/v1dH3kAk8iuX/F/wXAggt4Y+gCG5jbe8CSC3h97LeARf0uD+4lfNg1wXeAey3gvSfkS7O7gSddEXwDuNcC3u7C+LU28PY+vP8J54CbLeDwDxP28Pt6+qLnJx9V4uOs1vqX4evNM/fZt/dYwNbaA01yLnjy3W38aKeCx9/U799OCGu9jEnoENAF/E8nGEeA+v3omOPg6tbvqQMpHwOvdrBTrwFdwBe9Rnm1mPId9ULrxQbqd9gLmmdA/U56xvn/Usr3rsfpD/W70OPwB/Jd7PHqS/Xu9fjxH/V6Y581mm/laALCBIQJCBMQJiBMQJiAMAFhAsIEhAkIExAmIExAmIAwAWECwgSECQgTECYgTECYgDABYQLCBIQJCBMQJiBMQJiAMAFhAsIEhAkIExAmIExAmIAwAWECwgSECQgTECYgTECYgDABYQLCBIQJCBMQJiBMQJiAMAFhAsIEhAkIExAmIExAmIAwAWECwgSECQgTECYgTECYgDABYQLCBIQJCBMQJiBMQJiAMAFhAsIEhAkIExAmIExAmIAwAWECwgSECQgTECYgTECYgDABYQLCBIQJCBMQJiBMQJiAMAFhAsIEhAkIExAmIExAmIAwAWECwgSECQgTECYgTECYgDA
BYQLCBIQJCBMQJiBMQJiAMAFhAsIEhAkIExAmIExAmIAwAWECwgSECQgTECYgTECYgDABYQLCBIQJCBMQJiBMQJiAMAFhAsIEhAkIExAmIExAmIAwAWECwgSECQgTECYgTECYgDABYQLCBIQJCBMQJiBMQJiAMAFhAsIEhAkIExAmIExAmIAwAWECwgSECQgTECYgTECYgDABYQLCBIQJCBMQJiBMQJiAMAFhAsIEhAkIExAmIExAmIAwAWECwgSECQgTECYgTECYgDABYQLCBIQJCBMQJiBMQJiAMAFhAsIEhAkIExAmIExAmIAwAWECwgSECQgTEPYPw+vH150bqhwAAAAASUVORK5CYII=\n" + }, + "metadata": {} + } + ], + "source": [ + "from torchvision.transforms import Compose, ToTensor, Resize, Normalize, ToPILImage\n", + "\n", + "# Initiate a model with best checkpoint.\n", + "model = models.get(model_name=Models.PP_LITE_T_SEG75,\n", + " arch_params={\"use_aux_heads\": False},\n", + " num_classes=1,\n", + " checkpoint_path=os.path.join(trainer.checkpoints_dir_path, \"ckpt_best.pth\")).cuda().eval()\n", + "\n", + "pre_proccess = Compose([\n", + " ToTensor(),\n", + " Normalize([.485, .456, .406], [.229, .224, .225])\n", + "])\n", + "\n", + "demo_img_path = os.path.join(root_dir, \"images\", \"ache-adult-depression-expression-41253.png\")\n", + "\n", + "img = Image.open(demo_img_path)\n", + "# Resize the image and display\n", + "img = Resize(size=(480, 320))(img)\n", + "display(img)\n", + "\n", + "# Run pre-proccess - transforms to tensor and apply normalizations.\n", + "img_inp = pre_proccess(img).unsqueeze(0).cuda()\n", + "\n", + "# Run inference\n", + "mask = model(img_inp)\n", + "\n", + "# Run post-proccess - apply sigmoid to output probabilities, then apply hard\n", + "# threshold of 0.5 for binary mask prediction.\n", + "mask = torch.sigmoid(mask).gt(0.5).squeeze()\n", + "mask = ToPILImage()(mask.float())\n", + "display(mask)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "-k6ZLKHL1hIM" + }, + "source": [ + "# 7. Convert to ONNX/TensorRT" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "br7n55Szm4Nq" + }, + "source": [ + "Let's compile our model to ONNX." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": { + "id": "q0AGQvEf11PT", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "249acd62-694c-460b-adbf-dbdd3d86057e" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "ONNX successfully created at: /content/data/model.onnx\n" + ] + } + ], + "source": [ + "from onnxsim import simplify\n", + "import onnx\n", + "\n", + "onnx_path = os.path.join(os.getcwd(), \"model.onnx\")\n", + "\n", + "input_size = [1, 3, 480, 320]\n", + "model.prep_model_for_conversion(input_size=input_size)\n", + "\n", + "torch.onnx.export(model,\n", + " torch.randn(*input_size).cuda(),\n", + " onnx_path)\n", + "\n", + "# onnx simplifier\n", + "model_sim, check = simplify(onnx_path)\n", + "assert check, \"Simplified ONNX model could not be validated\"\n", + "onnx.save_model(model_sim, onnx_path)\n", + "\n", + "print(\"ONNX successfully created at: \", onnx_path)\n" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + }, + "language_info": { + "name": "python" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +}