From 6fca51bd8f20b1fbfef244d94f891e79ffe0af28 Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Wed, 21 Feb 2024 23:43:23 +0000 Subject: [PATCH 01/31] Added custom segmentation trainer tutorial --- .../custom_segmentation_trainer.ipynb | 433 ++++++++++++++++++ 1 file changed, 433 insertions(+) create mode 100644 docs/tutorials/custom_segmentation_trainer.ipynb diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb new file mode 100644 index 00000000000..2ba123c6caa --- /dev/null +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -0,0 +1,433 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Copyright (c) Microsoft Corporation. All rights reserved.\n", + "\n", + "Licensed under the MIT License." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Custom Trainers\n", + "\n", + "In this tutorial, we demonstrate how to extend a TorchGeo [\"trainer class\"](https://torchgeo.readthedocs.io/en/latest/api/trainers.html). In TorchGeo there exist several trainer classes that are pre-made PyTorch Lightning Modules designed to allow for the easy training of models on semantic segmentation, classification, change detection, etc. tasks using TorchGeo's [prebuilt DataModules](https://torchgeo.readthedocs.io/en/latest/api/datamodules.html). While the trainers aim to provide sensible defaults and customization options for common tasks, they will not be able to cover all situations (e.g. researchers will likely want to implement and use their own architectures, loss functions, optimizers, etc. in the training routine). 
If you run into such a situation, then you can simply extend the trainer class you are interested in, and write custom logic to override the default functionality.\n", + "\n", + "This tutorial shows how to do exactly this to customize a learning rate schedule, logging, and model checkpointing for a semantic segmentation task using the [LandCoverAI](https://landcover.ai.linuxpolska.com/) dataset.\n", + "\n", + "It's recommended to run this notebook on Google Colab if you don't have your own GPU. Click the \"Open in Colab\" button above to get started." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Setup\n", + "\n", + "As always, we install TorchGeo." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%pip install torchgeo" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Imports\n", + "\n", + "Next, we import TorchGeo and any other libraries we need." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "from torchgeo.trainers import SemanticSegmentationTask\n", + "from torchgeo.datamodules import LandCoverAIDataModule\n", + "from torchmetrics import MetricCollection\n", + "from torchmetrics.classification import Accuracy, FBetaScore, Precision, Recall, JaccardIndex\n", + "\n", + "import lightning.pytorch as pl\n", + "from lightning.pytorch.callbacks import ModelCheckpoint\n", + "import torch\n", + "\n", + "from torch.optim.lr_scheduler import CosineAnnealingLR\n", + "from torch.optim import AdamW\n", + "\n", + "# Get rid of the pesky raised by kornia\n", + "# UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. 
See the documentation of grid_sample for details.\n", + "import warnings\n", + "warnings.filterwarnings(\"ignore\", category=UserWarning, module=\"torch.nn.functional\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Custom SemanticSegmentationTask\n", + "\n", + "Now, we create a `CustomSemanticSegmentationTask` class that inherits from `SemanticSegmentationTask` and that overrides a few methods:\n", + "- `__init__`: We add two new parameters `tmax` and `eta_min` to control the learning rate scheduler\n", + "- `configure_optimizers`: We use the `CosineAnnealingLR` learning rate scheduler instead of the default `ReduceLROnPlateau`\n", + "- `configure_metrics`: We add a \"MeanIoU\" metric (what we will use to evaluate the model's performance) and a variety of other classification metrics\n", + "- `configure_callbacks`: We demonstrate how to stack `ModelCheckpoint` callbacks to save the best checkpoint as well as periodic checkpoints\n", + "- `on_train_epoch_start`: We log the learning rate at the start of each epoch so we can easily see how it decays over a training run\n", + "\n", + "Overall, these demonstrate how to customize the training routine to investigate specific research questions (e.g. the effect of the scheduler on test performance)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "class CustomSemanticSegmentationTask(SemanticSegmentationTask):\n", + " def __init__(self, *args, tmax=50, eta_min=1e-6, **kwargs) -> None:\n", + " super().__init__()\n", + "\n", + " def configure_optimizers(self) -> \"lightning.pytorch.utilities.types.OptimizerLRSchedulerConfig\":\n", + " \"\"\"Initialize the optimizer and learning rate scheduler.\n", + "\n", + " Returns:\n", + " Optimizer and learning rate scheduler.\n", + " \"\"\"\n", + " tmax: int = self.hparams.get(\"tmax\", 50)\n", + " eta_min: float = self.hparams.get(\"eta_min\", 1e-6)\n", + " optimizer = AdamW(self.parameters(), lr=self.hparams[\"lr\"])\n", + " scheduler = CosineAnnealingLR(optimizer, T_max=tmax, eta_min=eta_min)\n", + " return {\n", + " \"optimizer\": optimizer,\n", + " \"lr_scheduler\": {\"scheduler\": scheduler, \"monitor\": self.monitor},\n", + " }\n", + "\n", + " def configure_metrics(self) -> None:\n", + " \"\"\"Initialize the performance metrics.\"\"\"\n", + " num_classes: int = self.hparams[\"num_classes\"]\n", + "\n", + " self.train_metrics = MetricCollection(\n", + " {\n", + " \"OverallAccuracy\": Accuracy(\n", + " task=\"multiclass\",\n", + " num_classes=num_classes,\n", + " average=\"micro\",\n", + " ),\n", + " \"OverallPrecision\": Precision(\n", + " task=\"multiclass\",\n", + " num_classes=num_classes,\n", + " average=\"micro\",\n", + " ),\n", + " \"OverallRecall\": Recall(\n", + " task=\"multiclass\",\n", + " num_classes=num_classes,\n", + " average=\"micro\",\n", + " ),\n", + " \"OverallF1Score\": FBetaScore(\n", + " task=\"multiclass\",\n", + " num_classes=num_classes,\n", + " beta=1.0,\n", + " average=\"micro\",\n", + " ),\n", + " \"MeanIoU\": JaccardIndex(\n", + " num_classes=num_classes,\n", + " task=\"multiclass\",\n", + " average=\"macro\",\n", + " )\n", + " },\n", + " prefix=\"train_\",\n", + " )\n", + " self.val_metrics = 
self.train_metrics.clone(prefix=\"val_\")\n", + " self.test_metrics = self.train_metrics.clone(prefix=\"test_\")\n", + "\n", + " def configure_callbacks(self):\n", + " \"\"\"Initialize callbacks for saving the best and latest models.\n", + "\n", + " Returns:\n", + " List of callbacks to apply.\n", + " \"\"\"\n", + " return [\n", + " ModelCheckpoint(every_n_epochs=50, save_top_k=-1),\n", + " ModelCheckpoint(monitor=self.monitor, mode=self.mode, save_top_k=5),\n", + " ]\n", + "\n", + " def on_train_epoch_start(self) -> None:\n", + " \"\"\"Log the learning rate at the start of each training epoch.\"\"\"\n", + " lr = self.optimizers().param_groups[0]['lr']\n", + " self.logger.experiment.add_scalar(\"lr\", lr, self.current_epoch)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Train model\n", + "\n", + "The remainder of the tutorial is straightforward and follows the typical [PyTorch Lightning](https://lightning.ai/) training routine. We instantiate a `DataModule` for the LandCover.AI dataset, instantiate a `CustomSemanticSegmentationTask` with a U-Net and a ResNet-50 backbone, then train the model using a Lightning trainer." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "dm = LandCoverAIDataModule(\n", + " root=\"/home/calebrobinson/ssdshared/torchgeo-datasets/LandCoverAI\",\n", + " batch_size=64,\n", + " num_workers=8\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "task = CustomSemanticSegmentationTask(\n", + " model=\"unet\",\n", + " backbone=\"resnet50\",\n", + " weights=True,\n", + " in_channels=3,\n", + " num_classes=6,\n", + " loss=\"ce\",\n", + " lr=1e-3,\n", + " tmax=50,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "GPU available: True (cuda), used: True\n", + "TPU available: False, using: 0 TPU cores\n", + "IPU available: False, using: 0 IPUs\n", + "HPU available: False, using: 0 HPUs\n" + ] + } + ], + "source": [ + "accelerator = \"gpu\" if torch.cuda.is_available() else \"cpu\"\n", + "\n", + "trainer = pl.Trainer(\n", + " accelerator=accelerator,\n", + " min_epochs=150,\n", + " max_epochs=300,\n", + " log_every_n_steps=50,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "The following callbacks returned in `LightningModule.configure_callbacks` will override existing callbacks passed to Trainer: ModelCheckpoint\n", + "You are using a CUDA device ('NVIDIA A100 80GB PCIe') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. 
For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision\n", + "LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3]\n", + "\n", + " | Name | Type | Params\n", + "---------------------------------------------------\n", + "0 | criterion | CrossEntropyLoss | 0 \n", + "1 | train_metrics | MetricCollection | 0 \n", + "2 | val_metrics | MetricCollection | 0 \n", + "3 | test_metrics | MetricCollection | 0 \n", + "4 | model | Unet | 32.5 M\n", + "---------------------------------------------------\n", + "32.5 M Trainable params\n", + "0 Non-trainable params\n", + "32.5 M Total params\n", + "130.087 Total estimated model params size (MB)\n" + ] + }, + { + "data": { + "application/vnd.jupyter.widget-view+json": { + "model_id": "c2e2b60406f3494fb9f07b209df8efd7", + "version_major": 2, + "version_minor": 0 + }, + "text/plain": [ + "Sanity Checking: | | 0/? [00:00 Date: Wed, 21 Feb 2024 23:50:06 +0000 Subject: [PATCH 02/31] Black formatting --- .../custom_segmentation_trainer.ipynb | 40 +++++++++---------- 1 file changed, 19 insertions(+), 21 deletions(-) diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index 2ba123c6caa..d4f98e5a909 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -58,7 +58,13 @@ "from torchgeo.trainers import SemanticSegmentationTask\n", "from torchgeo.datamodules import LandCoverAIDataModule\n", "from torchmetrics import MetricCollection\n", - "from torchmetrics.classification import Accuracy, FBetaScore, Precision, Recall, JaccardIndex\n", + "from torchmetrics.classification import (\n", + " Accuracy,\n", + " FBetaScore,\n", + " Precision,\n", + " Recall,\n", + " JaccardIndex,\n", + ")\n", "\n", "import lightning.pytorch as pl\n", "from lightning.pytorch.callbacks import ModelCheckpoint\n", @@ -70,6 +76,7 @@ "# Get rid of the pesky raised by 
kornia\n", "# UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details.\n", "import warnings\n", + "\n", "warnings.filterwarnings(\"ignore\", category=UserWarning, module=\"torch.nn.functional\")" ] }, @@ -99,7 +106,9 @@ " def __init__(self, *args, tmax=50, eta_min=1e-6, **kwargs) -> None:\n", " super().__init__()\n", "\n", - " def configure_optimizers(self) -> \"lightning.pytorch.utilities.types.OptimizerLRSchedulerConfig\":\n", + " def configure_optimizers(\n", + " self,\n", + " ) -> \"lightning.pytorch.utilities.types.OptimizerLRSchedulerConfig\":\n", " \"\"\"Initialize the optimizer and learning rate scheduler.\n", "\n", " Returns:\n", @@ -121,19 +130,13 @@ " self.train_metrics = MetricCollection(\n", " {\n", " \"OverallAccuracy\": Accuracy(\n", - " task=\"multiclass\",\n", - " num_classes=num_classes,\n", - " average=\"micro\",\n", + " task=\"multiclass\", num_classes=num_classes, average=\"micro\"\n", " ),\n", " \"OverallPrecision\": Precision(\n", - " task=\"multiclass\",\n", - " num_classes=num_classes,\n", - " average=\"micro\",\n", + " task=\"multiclass\", num_classes=num_classes, average=\"micro\"\n", " ),\n", " \"OverallRecall\": Recall(\n", - " task=\"multiclass\",\n", - " num_classes=num_classes,\n", - " average=\"micro\",\n", + " task=\"multiclass\", num_classes=num_classes, average=\"micro\"\n", " ),\n", " \"OverallF1Score\": FBetaScore(\n", " task=\"multiclass\",\n", @@ -142,10 +145,8 @@ " average=\"micro\",\n", " ),\n", " \"MeanIoU\": JaccardIndex(\n", - " num_classes=num_classes,\n", - " task=\"multiclass\",\n", - " average=\"macro\",\n", - " )\n", + " num_classes=num_classes, task=\"multiclass\", average=\"macro\"\n", + " ),\n", " },\n", " prefix=\"train_\",\n", " )\n", @@ -165,7 +166,7 @@ "\n", " def on_train_epoch_start(self) -> None:\n", " \"\"\"Log the learning rate at the start 
of each training epoch.\"\"\"\n", - " lr = self.optimizers().param_groups[0]['lr']\n", + " lr = self.optimizers().param_groups[0][\"lr\"]\n", " self.logger.experiment.add_scalar(\"lr\", lr, self.current_epoch)" ] }, @@ -187,7 +188,7 @@ "dm = LandCoverAIDataModule(\n", " root=\"/home/calebrobinson/ssdshared/torchgeo-datasets/LandCoverAI\",\n", " batch_size=64,\n", - " num_workers=8\n", + " num_workers=8,\n", ")" ] }, @@ -229,10 +230,7 @@ "accelerator = \"gpu\" if torch.cuda.is_available() else \"cpu\"\n", "\n", "trainer = pl.Trainer(\n", - " accelerator=accelerator,\n", - " min_epochs=150,\n", - " max_epochs=300,\n", - " log_every_n_steps=50,\n", + " accelerator=accelerator, min_epochs=150, max_epochs=300, log_every_n_steps=50\n", ")" ] }, From 604bc1f7a989a3c93152c9a0563375378d3d0bb4 Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Thu, 22 Feb 2024 20:54:42 +0000 Subject: [PATCH 03/31] Updates --- .../custom_segmentation_trainer.ipynb | 262 ++++++++++++------ 1 file changed, 173 insertions(+), 89 deletions(-) diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index d4f98e5a909..0d003701794 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -51,7 +51,7 @@ }, { "cell_type": "code", - "execution_count": 1, + "execution_count": 2, "metadata": {}, "outputs": [], "source": [ @@ -73,7 +73,7 @@ "from torch.optim.lr_scheduler import CosineAnnealingLR\n", "from torch.optim import AdamW\n", "\n", - "# Get rid of the pesky raised by kornia\n", + "# Get rid of the pesky warnings raised by kornia\n", "# UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. 
See the documentation of grid_sample for details.\n", "import warnings\n", "\n", @@ -98,13 +98,15 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "class CustomSemanticSegmentationTask(SemanticSegmentationTask):\n", + "\n", + " # any keywords we add here between *args and **kwargs will be found in self.hparams\n", " def __init__(self, *args, tmax=50, eta_min=1e-6, **kwargs) -> None:\n", - " super().__init__()\n", + " super().__init__(*args, **kwargs) # pass args and kwargs to the parent class\n", "\n", " def configure_optimizers(\n", " self,\n", @@ -114,8 +116,9 @@ " Returns:\n", " Optimizer and learning rate scheduler.\n", " \"\"\"\n", - " tmax: int = self.hparams.get(\"tmax\", 50)\n", - " eta_min: float = self.hparams.get(\"eta_min\", 1e-6)\n", + " tmax: int = self.hparams[\"tmax\"]\n", + " eta_min: float = self.hparams[\"eta_min\"]\n", + "\n", " optimizer = AdamW(self.parameters(), lr=self.hparams[\"lr\"])\n", " scheduler = CosineAnnealingLR(optimizer, T_max=tmax, eta_min=eta_min)\n", " return {\n", @@ -181,20 +184,21 @@ }, { "cell_type": "code", - "execution_count": 3, + "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "dm = LandCoverAIDataModule(\n", - " root=\"/home/calebrobinson/ssdshared/torchgeo-datasets/LandCoverAI\",\n", + " root=\"data/\",\n", " batch_size=64,\n", " num_workers=8,\n", + " download=True,\n", ")" ] }, { "cell_type": "code", - "execution_count": 4, + "execution_count": 5, "metadata": {}, "outputs": [], "source": [ @@ -212,14 +216,55 @@ }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 6, "metadata": {}, "outputs": [ + { + "data": { + "text/plain": [ + "\"backbone\": resnet50\n", + "\"class_weights\": None\n", + "\"eta_min\": 1e-06\n", + "\"freeze_backbone\": False\n", + "\"freeze_decoder\": False\n", + "\"ignore\": weights\n", + "\"ignore_index\": None\n", + "\"in_channels\": 3\n", + "\"loss\": ce\n", + "\"lr\": 0.001\n", + 
"\"model\": unet\n", + "\"num_classes\": 6\n", + "\"num_filters\": 3\n", + "\"patience\": 10\n", + "\"tmax\": 50" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# validate that the task's hyperparameters are as expected\n", + "task.hparams" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "GPU available: True (cuda), used: True\n" + ] + }, { "name": "stderr", "output_type": "stream", "text": [ - "GPU available: True (cuda), used: True\n", "TPU available: False, using: 0 TPU cores\n", "IPU available: False, using: 0 IPUs\n", "HPU available: False, using: 0 HPUs\n" @@ -230,21 +275,100 @@ "accelerator = \"gpu\" if torch.cuda.is_available() else \"cpu\"\n", "\n", "trainer = pl.Trainer(\n", - " accelerator=accelerator, min_epochs=150, max_epochs=300, log_every_n_steps=50\n", + " accelerator=accelerator, devices=[0], min_epochs=150, max_epochs=300, log_every_n_steps=50\n", ")" ] }, { "cell_type": "code", - "execution_count": 6, + "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ - "The following callbacks returned in `LightningModule.configure_callbacks` will override existing callbacks passed to Trainer: ModelCheckpoint\n", - "You are using a CUDA device ('NVIDIA A100 80GB PCIe') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. 
For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision\n", + "The following callbacks returned in `LightningModule.configure_callbacks` will override existing callbacks passed to Trainer: ModelCheckpoint\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Downloading https://landcover.ai.linuxpolska.com/download/landcover.ai.v1.zip to data/landcover.ai.v1.zip\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "100%|██████████| 1538212277/1538212277 [01:25<00:00, 17913845.14it/s]\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Processed M-33-20-D-c-4-2 1/41\n", + "Processed M-33-20-D-d-3-3 2/41\n", + "Processed M-33-32-B-b-4-4 3/41\n", + "Processed M-33-48-A-c-4-4 4/41\n", + "Processed M-33-7-A-d-2-3 5/41\n", + "Processed M-33-7-A-d-3-2 6/41\n", + "Processed M-34-32-B-a-4-3 7/41\n", + "Processed M-34-32-B-b-1-3 8/41\n", + "Processed M-34-5-D-d-4-2 9/41\n", + "Processed M-34-51-C-b-2-1 10/41\n", + "Processed M-34-51-C-d-4-1 11/41\n", + "Processed M-34-55-B-b-4-1 12/41\n", + "Processed M-34-56-A-b-1-4 13/41\n", + "Processed M-34-6-A-d-2-2 14/41\n", + "Processed M-34-65-D-a-4-4 15/41\n", + "Processed M-34-65-D-c-4-2 16/41\n", + "Processed M-34-65-D-d-4-1 17/41\n", + "Processed M-34-68-B-a-1-3 18/41\n", + "Processed M-34-77-B-c-2-3 19/41\n", + "Processed N-33-104-A-c-1-1 20/41\n", + "Processed N-33-119-C-c-3-3 21/41\n", + "Processed N-33-130-A-d-3-3 22/41\n", + "Processed N-33-130-A-d-4-4 23/41\n", + "Processed N-33-139-C-d-2-2 24/41\n", + "Processed N-33-139-C-d-2-4 25/41\n", + "Processed N-33-139-D-c-1-3 26/41\n", + "Processed N-33-60-D-c-4-2 27/41\n", + "Processed N-33-60-D-d-1-2 28/41\n", + "Processed N-33-96-D-d-1-1 29/41\n", + "Processed N-34-106-A-b-3-4 30/41\n", + "Processed N-34-106-A-c-1-3 31/41\n", + "Processed N-34-140-A-b-3-2 32/41\n", + "Processed N-34-140-A-b-4-2 33/41\n", + "Processed 
N-34-140-A-d-3-4 34/41\n", + "Processed N-34-140-A-d-4-2 35/41\n", + "Processed N-34-61-B-a-1-1 36/41\n", + "Processed N-34-66-C-c-4-3 37/41\n", + "Processed N-34-77-A-b-1-4 38/41\n", + "Processed N-34-94-A-b-2-4 39/41\n", + "Processed N-34-97-C-b-1-2 40/41\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "You are using a CUDA device ('NVIDIA A100 80GB PCIe') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Processed N-34-97-D-c-2-4 41/41\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ "LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3]\n", "\n", " | Name | Type | Params\n", @@ -264,7 +388,7 @@ { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "c2e2b60406f3494fb9f07b209df8efd7", + "model_id": "00af14780e004ce69b89e436b7b95606", "version_major": 2, "version_minor": 0 }, @@ -278,7 +402,7 @@ { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "8d87c64a736345658f82e9a7df2c2e68", + "model_id": "d1439daa182f4d27b5194609bd194d56", "version_major": 2, "version_minor": 0 }, @@ -292,35 +416,7 @@ { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "8129de8d64014ad8ad346dfe01fb6c7b", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "Validation: | | 0/? 
[00:00 2\u001b[0m task \u001b[38;5;241m=\u001b[39m \u001b[43mCustomSemanticSegmentationTask\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mload_from_checkpoint\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mlightning_logs/version_3/checkpoints/epoch=0-step=117.ckpt\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\n", + "File \u001b[0;32m~/.conda/envs/geo/lib/python3.10/site-packages/lightning/pytorch/core/module.py:1561\u001b[0m, in \u001b[0;36mLightningModule.load_from_checkpoint\u001b[0;34m(cls, checkpoint_path, map_location, hparams_file, strict, **kwargs)\u001b[0m\n\u001b[1;32m 1480\u001b[0m \u001b[38;5;129m@_restricted_classmethod\u001b[39m\n\u001b[1;32m 1481\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mload_from_checkpoint\u001b[39m(\n\u001b[1;32m 1482\u001b[0m \u001b[38;5;28mcls\u001b[39m,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 1487\u001b[0m \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs: Any,\n\u001b[1;32m 1488\u001b[0m ) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m Self:\n\u001b[1;32m 1489\u001b[0m \u001b[38;5;250m \u001b[39m\u001b[38;5;124mr\u001b[39m\u001b[38;5;124;03m\"\"\"Primary way of loading a model from a checkpoint. 
When Lightning saves a checkpoint it stores the arguments\u001b[39;00m\n\u001b[1;32m 1490\u001b[0m \u001b[38;5;124;03m passed to ``__init__`` in the checkpoint under ``\"hyper_parameters\"``.\u001b[39;00m\n\u001b[1;32m 1491\u001b[0m \n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 1559\u001b[0m \n\u001b[1;32m 1560\u001b[0m \u001b[38;5;124;03m \"\"\"\u001b[39;00m\n\u001b[0;32m-> 1561\u001b[0m loaded \u001b[38;5;241m=\u001b[39m \u001b[43m_load_from_checkpoint\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 1562\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mcls\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;66;43;03m# type: ignore[arg-type]\u001b[39;49;00m\n\u001b[1;32m 1563\u001b[0m \u001b[43m \u001b[49m\u001b[43mcheckpoint_path\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1564\u001b[0m \u001b[43m \u001b[49m\u001b[43mmap_location\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1565\u001b[0m \u001b[43m \u001b[49m\u001b[43mhparams_file\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1566\u001b[0m \u001b[43m \u001b[49m\u001b[43mstrict\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1567\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1568\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 1569\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m cast(Self, loaded)\n", + "File \u001b[0;32m~/.conda/envs/geo/lib/python3.10/site-packages/lightning/pytorch/core/saving.py:89\u001b[0m, in \u001b[0;36m_load_from_checkpoint\u001b[0;34m(cls, checkpoint_path, map_location, hparams_file, strict, **kwargs)\u001b[0m\n\u001b[1;32m 87\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m _load_state(\u001b[38;5;28mcls\u001b[39m, checkpoint, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs)\n\u001b[1;32m 88\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28missubclass\u001b[39m(\u001b[38;5;28mcls\u001b[39m, 
pl\u001b[38;5;241m.\u001b[39mLightningModule):\n\u001b[0;32m---> 89\u001b[0m model \u001b[38;5;241m=\u001b[39m \u001b[43m_load_state\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mcls\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcheckpoint\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mstrict\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mstrict\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 90\u001b[0m state_dict \u001b[38;5;241m=\u001b[39m checkpoint[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mstate_dict\u001b[39m\u001b[38;5;124m\"\u001b[39m]\n\u001b[1;32m 91\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m state_dict:\n", + "File \u001b[0;32m~/.conda/envs/geo/lib/python3.10/site-packages/lightning/pytorch/core/saving.py:156\u001b[0m, in \u001b[0;36m_load_state\u001b[0;34m(cls, checkpoint, strict, **cls_kwargs_new)\u001b[0m\n\u001b[1;32m 152\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m cls_spec\u001b[38;5;241m.\u001b[39mvarkw:\n\u001b[1;32m 153\u001b[0m \u001b[38;5;66;03m# filter kwargs according to class init unless it allows any argument via kwargs\u001b[39;00m\n\u001b[1;32m 154\u001b[0m _cls_kwargs \u001b[38;5;241m=\u001b[39m {k: v \u001b[38;5;28;01mfor\u001b[39;00m k, v \u001b[38;5;129;01min\u001b[39;00m _cls_kwargs\u001b[38;5;241m.\u001b[39mitems() \u001b[38;5;28;01mif\u001b[39;00m k \u001b[38;5;129;01min\u001b[39;00m cls_init_args_name}\n\u001b[0;32m--> 156\u001b[0m obj \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mcls\u001b[39;49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43m_cls_kwargs\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 158\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(obj, 
pl\u001b[38;5;241m.\u001b[39mLightningModule):\n\u001b[1;32m 159\u001b[0m \u001b[38;5;66;03m# give model a chance to load something\u001b[39;00m\n\u001b[1;32m 160\u001b[0m obj\u001b[38;5;241m.\u001b[39mon_load_checkpoint(checkpoint)\n", + "Cell \u001b[0;32mIn[3], line 5\u001b[0m, in \u001b[0;36mCustomSemanticSegmentationTask.__init__\u001b[0;34m(self, tmax, eta_min, *args, **kwargs)\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21m__init__\u001b[39m(\u001b[38;5;28mself\u001b[39m, \u001b[38;5;241m*\u001b[39margs, tmax\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m50\u001b[39m, eta_min\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m1e-6\u001b[39m, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[0;32m----> 5\u001b[0m \u001b[38;5;28;43msuper\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[38;5;21;43m__init__\u001b[39;49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n", + "\u001b[0;31mTypeError\u001b[0m: SemanticSegmentationTask.__init__() got an unexpected keyword argument 'ignore'" + ] + } + ], "source": [ "# If you are starting from a checkpoint, run this cell\n", - "task = CustomSemanticSegmentationTask.load_from_checkpoint(\"path/to/checkpoint.ckpt\")" + "task = CustomSemanticSegmentationTask.load_from_checkpoint(\"lightning_logs/version_3/checkpoints/epoch=0-step=117.ckpt\")" ] }, { @@ -405,6 +482,13 @@ "source": [ "trainer.test(task, dm)" ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] } ], "metadata": { From e67914364a05afdaa77fbae95b8570500d53ae16 Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Thu, 22 Feb 2024 21:01:35 +0000 
Subject: [PATCH 04/31] black formatting --- .../custom_segmentation_trainer.ipynb | 113 +++++++++++++----- 1 file changed, 84 insertions(+), 29 deletions(-) diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index 0d003701794..33bdf14c226 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -98,7 +98,7 @@ }, { "cell_type": "code", - "execution_count": 3, + "execution_count": 10, "metadata": {}, "outputs": [], "source": [ @@ -106,6 +106,7 @@ "\n", " # any keywords we add here between *args and **kwargs will be found in self.hparams\n", " def __init__(self, *args, tmax=50, eta_min=1e-6, **kwargs) -> None:\n", + " del kwargs[\"ignore\"] # this is a hack\n", " super().__init__(*args, **kwargs) # pass args and kwargs to the parent class\n", "\n", " def configure_optimizers(\n", @@ -188,12 +189,7 @@ "metadata": {}, "outputs": [], "source": [ - "dm = LandCoverAIDataModule(\n", - " root=\"data/\",\n", - " batch_size=64,\n", - " num_workers=8,\n", - " download=True,\n", - ")" + "dm = LandCoverAIDataModule(root=\"data/\", batch_size=64, num_workers=8, download=True)" ] }, { @@ -275,7 +271,11 @@ "accelerator = \"gpu\" if torch.cuda.is_available() else \"cpu\"\n", "\n", "trainer = pl.Trainer(\n", - " accelerator=accelerator, devices=[0], min_epochs=150, max_epochs=300, log_every_n_steps=50\n", + " accelerator=accelerator,\n", + " devices=[0],\n", + " min_epochs=150,\n", + " max_epochs=300,\n", + " log_every_n_steps=50,\n", ")" ] }, @@ -450,35 +450,90 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": 11, "metadata": {}, - "outputs": [ - { - "ename": "TypeError", - "evalue": "SemanticSegmentationTask.__init__() got an unexpected keyword argument 'ignore'", - "output_type": "error", - "traceback": [ - "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", - 
"\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", - "Cell \u001b[0;32mIn[9], line 2\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[38;5;66;03m# If you are starting from a checkpoint, run this cell\u001b[39;00m\n\u001b[0;32m----> 2\u001b[0m task \u001b[38;5;241m=\u001b[39m \u001b[43mCustomSemanticSegmentationTask\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mload_from_checkpoint\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mlightning_logs/version_3/checkpoints/epoch=0-step=117.ckpt\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\n", - "File \u001b[0;32m~/.conda/envs/geo/lib/python3.10/site-packages/lightning/pytorch/core/module.py:1561\u001b[0m, in \u001b[0;36mLightningModule.load_from_checkpoint\u001b[0;34m(cls, checkpoint_path, map_location, hparams_file, strict, **kwargs)\u001b[0m\n\u001b[1;32m 1480\u001b[0m \u001b[38;5;129m@_restricted_classmethod\u001b[39m\n\u001b[1;32m 1481\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mload_from_checkpoint\u001b[39m(\n\u001b[1;32m 1482\u001b[0m \u001b[38;5;28mcls\u001b[39m,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 1487\u001b[0m \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs: Any,\n\u001b[1;32m 1488\u001b[0m ) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m Self:\n\u001b[1;32m 1489\u001b[0m \u001b[38;5;250m \u001b[39m\u001b[38;5;124mr\u001b[39m\u001b[38;5;124;03m\"\"\"Primary way of loading a model from a checkpoint. 
When Lightning saves a checkpoint it stores the arguments\u001b[39;00m\n\u001b[1;32m 1490\u001b[0m \u001b[38;5;124;03m passed to ``__init__`` in the checkpoint under ``\"hyper_parameters\"``.\u001b[39;00m\n\u001b[1;32m 1491\u001b[0m \n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 1559\u001b[0m \n\u001b[1;32m 1560\u001b[0m \u001b[38;5;124;03m \"\"\"\u001b[39;00m\n\u001b[0;32m-> 1561\u001b[0m loaded \u001b[38;5;241m=\u001b[39m \u001b[43m_load_from_checkpoint\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 1562\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mcls\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;66;43;03m# type: ignore[arg-type]\u001b[39;49;00m\n\u001b[1;32m 1563\u001b[0m \u001b[43m \u001b[49m\u001b[43mcheckpoint_path\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1564\u001b[0m \u001b[43m \u001b[49m\u001b[43mmap_location\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1565\u001b[0m \u001b[43m \u001b[49m\u001b[43mhparams_file\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1566\u001b[0m \u001b[43m \u001b[49m\u001b[43mstrict\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1567\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1568\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 1569\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m cast(Self, loaded)\n", - "File \u001b[0;32m~/.conda/envs/geo/lib/python3.10/site-packages/lightning/pytorch/core/saving.py:89\u001b[0m, in \u001b[0;36m_load_from_checkpoint\u001b[0;34m(cls, checkpoint_path, map_location, hparams_file, strict, **kwargs)\u001b[0m\n\u001b[1;32m 87\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m _load_state(\u001b[38;5;28mcls\u001b[39m, checkpoint, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs)\n\u001b[1;32m 88\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28missubclass\u001b[39m(\u001b[38;5;28mcls\u001b[39m, 
pl\u001b[38;5;241m.\u001b[39mLightningModule):\n\u001b[0;32m---> 89\u001b[0m model \u001b[38;5;241m=\u001b[39m \u001b[43m_load_state\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mcls\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcheckpoint\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mstrict\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mstrict\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 90\u001b[0m state_dict \u001b[38;5;241m=\u001b[39m checkpoint[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mstate_dict\u001b[39m\u001b[38;5;124m\"\u001b[39m]\n\u001b[1;32m 91\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m state_dict:\n", - "File \u001b[0;32m~/.conda/envs/geo/lib/python3.10/site-packages/lightning/pytorch/core/saving.py:156\u001b[0m, in \u001b[0;36m_load_state\u001b[0;34m(cls, checkpoint, strict, **cls_kwargs_new)\u001b[0m\n\u001b[1;32m 152\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m cls_spec\u001b[38;5;241m.\u001b[39mvarkw:\n\u001b[1;32m 153\u001b[0m \u001b[38;5;66;03m# filter kwargs according to class init unless it allows any argument via kwargs\u001b[39;00m\n\u001b[1;32m 154\u001b[0m _cls_kwargs \u001b[38;5;241m=\u001b[39m {k: v \u001b[38;5;28;01mfor\u001b[39;00m k, v \u001b[38;5;129;01min\u001b[39;00m _cls_kwargs\u001b[38;5;241m.\u001b[39mitems() \u001b[38;5;28;01mif\u001b[39;00m k \u001b[38;5;129;01min\u001b[39;00m cls_init_args_name}\n\u001b[0;32m--> 156\u001b[0m obj \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mcls\u001b[39;49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43m_cls_kwargs\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 158\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(obj, 
pl\u001b[38;5;241m.\u001b[39mLightningModule):\n\u001b[1;32m 159\u001b[0m \u001b[38;5;66;03m# give model a chance to load something\u001b[39;00m\n\u001b[1;32m 160\u001b[0m obj\u001b[38;5;241m.\u001b[39mon_load_checkpoint(checkpoint)\n", - "Cell \u001b[0;32mIn[3], line 5\u001b[0m, in \u001b[0;36mCustomSemanticSegmentationTask.__init__\u001b[0;34m(self, tmax, eta_min, *args, **kwargs)\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21m__init__\u001b[39m(\u001b[38;5;28mself\u001b[39m, \u001b[38;5;241m*\u001b[39margs, tmax\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m50\u001b[39m, eta_min\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m1e-6\u001b[39m, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[0;32m----> 5\u001b[0m \u001b[38;5;28;43msuper\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[38;5;21;43m__init__\u001b[39;49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n", - "\u001b[0;31mTypeError\u001b[0m: SemanticSegmentationTask.__init__() got an unexpected keyword argument 'ignore'" - ] - } - ], + "outputs": [], "source": [ "# If you are starting from a checkpoint, run this cell\n", - "task = CustomSemanticSegmentationTask.load_from_checkpoint(\"lightning_logs/version_3/checkpoints/epoch=0-step=117.ckpt\")" + "task = CustomSemanticSegmentationTask.load_from_checkpoint(\n", + " \"lightning_logs/version_3/checkpoints/epoch=0-step=117.ckpt\"\n", + ")" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 12, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "The following callbacks returned in 
`LightningModule.configure_callbacks` will override existing callbacks passed to Trainer: ModelCheckpoint\n", + "LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3]\n" + ] + }, + { + "data": { + "application/vnd.jupyter.widget-view+json": { + "model_id": "b4567eeff83f488891b6911480ea381f", + "version_major": 2, + "version_minor": 0 + }, + "text/plain": [ + "Testing: | | 0/? [00:00┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n", + "┃ Test metric DataLoader 0 ┃\n", + "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n", + "│ test_MeanIoU 0.3726549744606018 │\n", + "│ test_OverallAccuracy 0.8094286322593689 │\n", + "│ test_OverallF1Score 0.8094285726547241 │\n", + "│ test_OverallPrecision 0.8094286322593689 │\n", + "│ test_OverallRecall 0.8094286322593689 │\n", + "│ test_loss 0.4797952175140381 │\n", + "└───────────────────────────┴───────────────────────────┘\n", + "\n" + ], + "text/plain": [ + "┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n", + "┃\u001b[1m \u001b[0m\u001b[1m Test metric \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1m DataLoader 0 \u001b[0m\u001b[1m \u001b[0m┃\n", + "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n", + "│\u001b[36m \u001b[0m\u001b[36m test_MeanIoU \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.3726549744606018 \u001b[0m\u001b[35m \u001b[0m│\n", + "│\u001b[36m \u001b[0m\u001b[36m test_OverallAccuracy \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.8094286322593689 \u001b[0m\u001b[35m \u001b[0m│\n", + "│\u001b[36m \u001b[0m\u001b[36m test_OverallF1Score \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.8094285726547241 \u001b[0m\u001b[35m \u001b[0m│\n", + "│\u001b[36m \u001b[0m\u001b[36m test_OverallPrecision \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.8094286322593689 \u001b[0m\u001b[35m \u001b[0m│\n", + "│\u001b[36m \u001b[0m\u001b[36m test_OverallRecall \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 
0.8094286322593689 \u001b[0m\u001b[35m \u001b[0m│\n", + "│\u001b[36m \u001b[0m\u001b[36m test_loss \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.4797952175140381 \u001b[0m\u001b[35m \u001b[0m│\n", + "└───────────────────────────┴───────────────────────────┘\n" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/plain": [ + "[{'test_loss': 0.4797952175140381,\n", + " 'test_MeanIoU': 0.3726549744606018,\n", + " 'test_OverallAccuracy': 0.8094286322593689,\n", + " 'test_OverallF1Score': 0.8094285726547241,\n", + " 'test_OverallPrecision': 0.8094286322593689,\n", + " 'test_OverallRecall': 0.8094286322593689}]" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "trainer.test(task, dm)" ] From a9808c5df039608b7512b3ec65bba8333ac65cd6 Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Fri, 23 Feb 2024 16:40:56 +0000 Subject: [PATCH 05/31] Adding a jupytext synced python file --- docs/tutorials/custom_segmentation_trainer.py | 194 ++++++++++++++++++ 1 file changed, 194 insertions(+) create mode 100644 docs/tutorials/custom_segmentation_trainer.py diff --git a/docs/tutorials/custom_segmentation_trainer.py b/docs/tutorials/custom_segmentation_trainer.py new file mode 100644 index 00000000000..a24783bca91 --- /dev/null +++ b/docs/tutorials/custom_segmentation_trainer.py @@ -0,0 +1,194 @@ +# --- +# jupyter: +# jupytext: +# formats: ipynb,py +# text_representation: +# extension: .py +# format_name: light +# format_version: '1.5' +# jupytext_version: 1.16.1 +# kernelspec: +# display_name: geo +# language: python +# name: geo +# --- + +# Copyright (c) Microsoft Corporation. All rights reserved. +# +# Licensed under the MIT License. + +# # Custom Trainers +# +# In this tutorial, we demonstrate how to extend a TorchGeo ["trainer class"](https://torchgeo.readthedocs.io/en/latest/api/trainers.html). 
In TorchGeo there exist several trainer classes that are pre-made PyTorch Lightning Modules designed to allow for the easy training of models on semantic segmentation, classification, change detection, etc. tasks using TorchGeo's [prebuild DataModules](https://torchgeo.readthedocs.io/en/latest/api/datamodules.html). While the trainers aim to provide sensible defaults and customization options for common tasks, they will not be able to cover all situations (e.g. researchers will likely want to implement and use their own architectures, loss functions, optimizers, etc. in the training routine). If you run into such a situation, then you can simply extend the trainer class you are interested in, and write custom logic to override the default functionality. +# +# This tutorial shows how to do exactly this to customize a learning rate schedule, logging, and model checkpointing for a semantic segmentation task using the [LandCoverAI](https://landcover.ai.linuxpolska.com/) dataset. +# +# It's recommended to run this notebook on Google Colab if you don't have your own GPU. Click the "Open in Colab" button above to get started. + +# ## Setup +# +# As always, we install TorchGeo. + +# %pip install torchgeo + +# ## Imports +# +# Next, we import TorchGeo and any other libraries we need. + +# + +from torchgeo.trainers import SemanticSegmentationTask +from torchgeo.datamodules import LandCoverAIDataModule +from torchmetrics import MetricCollection +from torchmetrics.classification import ( + Accuracy, + FBetaScore, + Precision, + Recall, + JaccardIndex, +) + +import lightning.pytorch as pl +from lightning.pytorch.callbacks import ModelCheckpoint +import torch + +from torch.optim.lr_scheduler import CosineAnnealingLR +from torch.optim import AdamW + +# Get rid of the pesky warnings raised by kornia +# UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. 
See the documentation of grid_sample for details. +import warnings + +warnings.filterwarnings("ignore", category=UserWarning, module="torch.nn.functional") + + +# - + +# ## Custom SemanticSegmentationTask +# +# Now, we create a `CustomSemanticSegmentationTask` class that inhierits from `SemanticSegmentationTask` and that overrides a few methods: +# - `__init__`: We add two new parameters `tmax` and `eta_min` to control the learning rate scheduler +# - `configure_optimizers`: We use the `CosineAnnealingLR` learning rate scheduler instead of the default `ReduceLROnPlateau` +# - `configure_metrics`: We add a "MeanIou" metric (what we will use to evaluate the model's performance) and a variety of other classification metrics +# - `configure_callbacks`: We demonstrate how to stack `ModelCheckpoint` callbacks to save the best checkpoint as well as periodic checkpoints +# - `on_train_epoch_start`: We log the learning rate at the start of each epoch so we can easily see how it decays over a training run +# +# Overall these demonstrate how to customize the training routine to investigate specific research questions (e.g. of the scheduler on test performance). + +class CustomSemanticSegmentationTask(SemanticSegmentationTask): + + # any keywords we add here between *args and **kwargs will be found in self.hparams + def __init__(self, *args, tmax=50, eta_min=1e-6, **kwargs) -> None: + del kwargs["ignore"] # this is a hack + super().__init__(*args, **kwargs) # pass args and kwargs to the parent class + + def configure_optimizers( + self, + ) -> "lightning.pytorch.utilities.types.OptimizerLRSchedulerConfig": + """Initialize the optimizer and learning rate scheduler. + + Returns: + Optimizer and learning rate scheduler. 
+ """ + tmax: int = self.hparams["tmax"] + eta_min: float = self.hparams["eta_min"] + + optimizer = AdamW(self.parameters(), lr=self.hparams["lr"]) + scheduler = CosineAnnealingLR(optimizer, T_max=tmax, eta_min=eta_min) + return { + "optimizer": optimizer, + "lr_scheduler": {"scheduler": scheduler, "monitor": self.monitor}, + } + + def configure_metrics(self) -> None: + """Initialize the performance metrics.""" + num_classes: int = self.hparams["num_classes"] + + self.train_metrics = MetricCollection( + { + "OverallAccuracy": Accuracy( + task="multiclass", num_classes=num_classes, average="micro" + ), + "OverallPrecision": Precision( + task="multiclass", num_classes=num_classes, average="micro" + ), + "OverallRecall": Recall( + task="multiclass", num_classes=num_classes, average="micro" + ), + "OverallF1Score": FBetaScore( + task="multiclass", + num_classes=num_classes, + beta=1.0, + average="micro", + ), + "MeanIoU": JaccardIndex( + num_classes=num_classes, task="multiclass", average="macro" + ), + }, + prefix="train_", + ) + self.val_metrics = self.train_metrics.clone(prefix="val_") + self.test_metrics = self.train_metrics.clone(prefix="test_") + + def configure_callbacks(self): + """Initialize callbacks for saving the best and latest models. + + Returns: + List of callbacks to apply. + """ + return [ + ModelCheckpoint(every_n_epochs=50, save_top_k=-1), + ModelCheckpoint(monitor=self.monitor, mode=self.mode, save_top_k=5), + ] + + def on_train_epoch_start(self) -> None: + """Log the learning rate at the start of each training epoch.""" + lr = self.optimizers().param_groups[0]["lr"] + self.logger.experiment.add_scalar("lr", lr, self.current_epoch) + + +# ## Train model +# +# The remainder of the turial is straightforward and follows the typical [PyTorch Lightning](https://lightning.ai/) training routine. 
We instantiate a `DataModule` for the LandCover.AI dataset, instantiate a `CustomSemanticSegmentationTask` with a U-Net and ResNet-50 backbone, then train the model using a Lightning trainer. + +dm = LandCoverAIDataModule(root="data/", batch_size=64, num_workers=8, download=True) + +task = CustomSemanticSegmentationTask( + model="unet", + backbone="resnet50", + weights=True, + in_channels=3, + num_classes=6, + loss="ce", + lr=1e-3, + tmax=50, +) + +# validate that the task's hyperparameters are as expected +task.hparams + +# + +accelerator = "gpu" if torch.cuda.is_available() else "cpu" + +trainer = pl.Trainer( + accelerator=accelerator, + devices=[0], + min_epochs=150, + max_epochs=300, + log_every_n_steps=50, +) +# - + +trainer.fit(task, dm) + +# ## Test model +# +# Finally, we test the model on the test set and visualize the results. + +# If you are starting from a checkpoint, run this cell +task = CustomSemanticSegmentationTask.load_from_checkpoint( + "lightning_logs/version_3/checkpoints/epoch=0-step=117.ckpt" +) + +trainer.test(task, dm) + + From b57d745aa0fa9be3a94d82636136fc84f329914f Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Fri, 23 Feb 2024 16:44:36 +0000 Subject: [PATCH 06/31] Trying jupytext --- .../custom_segmentation_trainer.ipynb | 18 +++++++++--------- docs/tutorials/custom_segmentation_trainer.py | 3 +-- 2 files changed, 10 insertions(+), 11 deletions(-) diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index 33bdf14c226..02c1bb3551c 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -52,7 +52,9 @@ { "cell_type": "code", "execution_count": 2, - "metadata": {}, + "metadata": { + "lines_to_end_of_cell_marker": 2 + }, "outputs": [], "source": [ "from torchgeo.trainers import SemanticSegmentationTask\n", @@ -82,7 +84,9 @@ }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "lines_to_next_cell": 2 + 
}, "source": [ "## Custom SemanticSegmentationTask\n", "\n", @@ -537,16 +541,12 @@ "source": [ "trainer.test(task, dm)" ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] } ], "metadata": { + "jupytext": { + "formats": "ipynb,py" + }, "kernelspec": { "display_name": "geo", "language": "python", diff --git a/docs/tutorials/custom_segmentation_trainer.py b/docs/tutorials/custom_segmentation_trainer.py index a24783bca91..0a5e75d739f 100644 --- a/docs/tutorials/custom_segmentation_trainer.py +++ b/docs/tutorials/custom_segmentation_trainer.py @@ -74,6 +74,7 @@ # # Overall these demonstrate how to customize the training routine to investigate specific research questions (e.g. of the scheduler on test performance). + class CustomSemanticSegmentationTask(SemanticSegmentationTask): # any keywords we add here between *args and **kwargs will be found in self.hparams @@ -190,5 +191,3 @@ def on_train_epoch_start(self) -> None: ) trainer.test(task, dm) - - From f1f4c6374f7a58e3afa6cb876b6d0f6263483157 Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Fri, 23 Feb 2024 16:49:15 +0000 Subject: [PATCH 07/31] Linting --- .../custom_segmentation_trainer.ipynb | 62 +++++++++++++------ docs/tutorials/custom_segmentation_trainer.py | 28 +++++---- 2 files changed, 59 insertions(+), 31 deletions(-) diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index 02c1bb3551c..25c2bdcab3c 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -4,6 +4,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ + "flake8: noqa: E501\n", "Copyright (c) Microsoft Corporation. All rights reserved.\n", "\n", "Licensed under the MIT License." 
@@ -52,32 +53,57 @@ { "cell_type": "code", "execution_count": 2, - "metadata": { - "lines_to_end_of_cell_marker": 2 - }, + "metadata": {}, "outputs": [], "source": [ - "from torchgeo.trainers import SemanticSegmentationTask\n", - "from torchgeo.datamodules import LandCoverAIDataModule\n", + "# Get rid of the pesky warnings raised by kornia\n", + "# UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details.\n", + "import warnings" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0b01bf43", + "metadata": {}, + "outputs": [], + "source": [ + "import lightning\n", + "import lightning.pytorch as pl\n", + "import torch\n", + "from lightning.pytorch.callbacks import ModelCheckpoint\n", + "from torch.optim import AdamW\n", + "from torch.optim.lr_scheduler import CosineAnnealingLR\n", "from torchmetrics import MetricCollection\n", "from torchmetrics.classification import (\n", " Accuracy,\n", " FBetaScore,\n", + " JaccardIndex,\n", " Precision,\n", " Recall,\n", - " JaccardIndex,\n", - ")\n", - "\n", - "import lightning.pytorch as pl\n", - "from lightning.pytorch.callbacks import ModelCheckpoint\n", - "import torch\n", - "\n", - "from torch.optim.lr_scheduler import CosineAnnealingLR\n", - "from torch.optim import AdamW\n", - "\n", - "# Get rid of the pesky warnings raised by kornia\n", - "# UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. 
See the documentation of grid_sample for details.\n", - "import warnings\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3e721f3f", + "metadata": {}, + "outputs": [], + "source": [ + "from torchgeo.datamodules import LandCoverAIDataModule" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7fb5fbbf", + "metadata": { + "lines_to_end_of_cell_marker": 2 + }, + "outputs": [], + "source": [ + "from torchgeo.trainers import SemanticSegmentationTask\n", "\n", "warnings.filterwarnings(\"ignore\", category=UserWarning, module=\"torch.nn.functional\")" ] diff --git a/docs/tutorials/custom_segmentation_trainer.py b/docs/tutorials/custom_segmentation_trainer.py index 0a5e75d739f..1b3d41e65ce 100644 --- a/docs/tutorials/custom_segmentation_trainer.py +++ b/docs/tutorials/custom_segmentation_trainer.py @@ -13,6 +13,7 @@ # name: geo # --- +# flake8: noqa: E501 # Copyright (c) Microsoft Corporation. All rights reserved. # # Licensed under the MIT License. @@ -35,28 +36,29 @@ # # Next, we import TorchGeo and any other libraries we need. -# + -from torchgeo.trainers import SemanticSegmentationTask -from torchgeo.datamodules import LandCoverAIDataModule +# Get rid of the pesky warnings raised by kornia +# UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details. 
+import warnings + +import lightning +import lightning.pytorch as pl +import torch +from lightning.pytorch.callbacks import ModelCheckpoint +from torch.optim import AdamW +from torch.optim.lr_scheduler import CosineAnnealingLR from torchmetrics import MetricCollection from torchmetrics.classification import ( Accuracy, FBetaScore, + JaccardIndex, Precision, Recall, - JaccardIndex, ) -import lightning.pytorch as pl -from lightning.pytorch.callbacks import ModelCheckpoint -import torch - -from torch.optim.lr_scheduler import CosineAnnealingLR -from torch.optim import AdamW +from torchgeo.datamodules import LandCoverAIDataModule -# Get rid of the pesky warnings raised by kornia -# UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details. -import warnings +# + +from torchgeo.trainers import SemanticSegmentationTask warnings.filterwarnings("ignore", category=UserWarning, module="torch.nn.functional") From c43526d583142f12dc1ee802d41366970a89342f Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Fri, 23 Feb 2024 16:53:36 +0000 Subject: [PATCH 08/31] Typo fix and constructor fix --- .../custom_segmentation_trainer.ipynb | 23 ++++++++++--------- docs/tutorials/custom_segmentation_trainer.py | 5 ++-- 2 files changed, 15 insertions(+), 13 deletions(-) diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index 25c2bdcab3c..235b81783a5 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -5,8 +5,8 @@ "metadata": {}, "source": [ "flake8: noqa: E501\n", - "Copyright (c) Microsoft Corporation. All rights reserved.\n", "\n", + "Copyright (c) Microsoft Corporation. All rights reserved.\n", "Licensed under the MIT License." 
] }, @@ -52,7 +52,7 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 1, "metadata": {}, "outputs": [], "source": [ @@ -63,7 +63,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 2, "id": "0b01bf43", "metadata": {}, "outputs": [], @@ -86,7 +86,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 4, "id": "3e721f3f", "metadata": {}, "outputs": [], @@ -96,7 +96,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 5, "id": "7fb5fbbf", "metadata": { "lines_to_end_of_cell_marker": 2 @@ -128,7 +128,7 @@ }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 9, "metadata": {}, "outputs": [], "source": [ @@ -136,7 +136,8 @@ "\n", " # any keywords we add here between *args and **kwargs will be found in self.hparams\n", " def __init__(self, *args, tmax=50, eta_min=1e-6, **kwargs) -> None:\n", - " del kwargs[\"ignore\"] # this is a hack\n", + " if \"ignore\" in kwargs:\n", + " del kwargs[\"ignore\"] # this is a hack\n", " super().__init__(*args, **kwargs) # pass args and kwargs to the parent class\n", "\n", " def configure_optimizers(\n", @@ -215,7 +216,7 @@ }, { "cell_type": "code", - "execution_count": 4, + "execution_count": 7, "metadata": {}, "outputs": [], "source": [ @@ -224,7 +225,7 @@ }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 10, "metadata": {}, "outputs": [], "source": [ @@ -242,7 +243,7 @@ }, { "cell_type": "code", - "execution_count": 6, + "execution_count": 11, "metadata": {}, "outputs": [ { @@ -265,7 +266,7 @@ "\"tmax\": 50" ] }, - "execution_count": 6, + "execution_count": 11, "metadata": {}, "output_type": "execute_result" } diff --git a/docs/tutorials/custom_segmentation_trainer.py b/docs/tutorials/custom_segmentation_trainer.py index 1b3d41e65ce..cad0f15f1cf 100644 --- a/docs/tutorials/custom_segmentation_trainer.py +++ b/docs/tutorials/custom_segmentation_trainer.py @@ -14,8 +14,8 @@ # --- # flake8: noqa: E501 
-# Copyright (c) Microsoft Corporation. All rights reserved. # +# Copyright (c) Microsoft Corporation. All rights reserved. # Licensed under the MIT License. # # Custom Trainers @@ -81,7 +81,8 @@ class CustomSemanticSegmentationTask(SemanticSegmentationTask): # any keywords we add here between *args and **kwargs will be found in self.hparams def __init__(self, *args, tmax=50, eta_min=1e-6, **kwargs) -> None: - del kwargs["ignore"] # this is a hack + if "ignore" in kwargs: + del kwargs["ignore"] # this is a hack super().__init__(*args, **kwargs) # pass args and kwargs to the parent class def configure_optimizers( From 5a20899147d74982ab1737aa793499b7ffe1c8dd Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Fri, 23 Feb 2024 16:57:07 +0000 Subject: [PATCH 09/31] Trainer args --- docs/tutorials/custom_segmentation_trainer.ipynb | 10 +--------- docs/tutorials/custom_segmentation_trainer.py | 12 +----------- 2 files changed, 2 insertions(+), 20 deletions(-) diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index 235b81783a5..9bc287ed01a 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -299,15 +299,7 @@ } ], "source": [ - "accelerator = \"gpu\" if torch.cuda.is_available() else \"cpu\"\n", - "\n", - "trainer = pl.Trainer(\n", - " accelerator=accelerator,\n", - " devices=[0],\n", - " min_epochs=150,\n", - " max_epochs=300,\n", - " log_every_n_steps=50,\n", - ")" + "trainer = pl.Trainer(min_epochs=150, max_epochs=300, log_every_n_steps=50)" ] }, { diff --git a/docs/tutorials/custom_segmentation_trainer.py b/docs/tutorials/custom_segmentation_trainer.py index cad0f15f1cf..4b13edae77d 100644 --- a/docs/tutorials/custom_segmentation_trainer.py +++ b/docs/tutorials/custom_segmentation_trainer.py @@ -170,17 +170,7 @@ def on_train_epoch_start(self) -> None: # validate that the task's hyperparameters are as expected task.hparams -# + -accelerator 
= "gpu" if torch.cuda.is_available() else "cpu" - -trainer = pl.Trainer( - accelerator=accelerator, - devices=[0], - min_epochs=150, - max_epochs=300, - log_every_n_steps=50, -) -# - +trainer = pl.Trainer(min_epochs=150, max_epochs=300, log_every_n_steps=50) trainer.fit(task, dm) From c7090d1ad57b05ce285b527e0fafd96adbf03b2f Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Fri, 23 Feb 2024 17:25:39 +0000 Subject: [PATCH 10/31] Update num epochs --- docs/tutorials/custom_segmentation_trainer.ipynb | 2 +- docs/tutorials/custom_segmentation_trainer.py | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index 9bc287ed01a..9732a50ac8e 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -299,7 +299,7 @@ } ], "source": [ - "trainer = pl.Trainer(min_epochs=150, max_epochs=300, log_every_n_steps=50)" + "trainer = pl.Trainer(min_epochs=150, max_epochs=250, log_every_n_steps=50)" ] }, { diff --git a/docs/tutorials/custom_segmentation_trainer.py b/docs/tutorials/custom_segmentation_trainer.py index 4b13edae77d..48ed6c7b56a 100644 --- a/docs/tutorials/custom_segmentation_trainer.py +++ b/docs/tutorials/custom_segmentation_trainer.py @@ -170,7 +170,7 @@ def on_train_epoch_start(self) -> None: # validate that the task's hyperparameters are as expected task.hparams -trainer = pl.Trainer(min_epochs=150, max_epochs=300, log_every_n_steps=50) +trainer = pl.Trainer(min_epochs=150, max_epochs=250, log_every_n_steps=50) trainer.fit(task, dm) From 9c0cff294f5c9ec7d140be5d55faab23f2b1a99c Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Fri, 23 Feb 2024 20:46:13 +0000 Subject: [PATCH 11/31] Make base trainer class work properly --- .../custom_segmentation_trainer.ipynb | 191 +++++------------- docs/tutorials/custom_segmentation_trainer.py | 18 +- 2 files changed, 63 insertions(+), 146 deletions(-) 
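Note on the `ignore` handling removed in this patch. The traceback deleted in PATCH 04 shows `load_from_checkpoint` rebuilding the task via `cls(**_cls_kwargs)`, and the `task.hparams` output listed `"ignore": weights`. So the stored hyperparameters appear to have carried a key that `SemanticSegmentationTask.__init__()` does not accept. The sketch below reproduces that failure mode in plain Python, without Lightning or TorchGeo. `BaseTask`, `CustomTask`, and `stored_hparams` are hypothetical stand-ins, not the real classes, and the sketch does not show how the base trainer itself was fixed.

```python
# Hypothetical stand-ins; these are NOT the TorchGeo or Lightning classes.
class BaseTask:
    def __init__(self, model: str = "unet", lr: float = 1e-3) -> None:
        # Record the constructor arguments, loosely mimicking self.hparams.
        self.hparams = {"model": model, "lr": lr}


class CustomTask(BaseTask):
    def __init__(self, *args, tmax: int = 50, eta_min: float = 1e-6, **kwargs) -> None:
        # Drop a stale key if present. pop() with a default never raises,
        # unlike an unconditional `del kwargs["ignore"]`.
        kwargs.pop("ignore", None)
        super().__init__(*args, **kwargs)
        self.hparams.update(tmax=tmax, eta_min=eta_min)


# Hyperparameters as they might be stored in a checkpoint (illustrative values).
stored_hparams = {"model": "unet", "lr": 1e-3, "tmax": 50, "ignore": "weights"}

# Passing the stale key straight to the base class fails with a TypeError.
try:
    BaseTask(model="unet", ignore="weights")
    base_error = None
except TypeError as err:
    base_error = str(err)

# The subclass filters the key, so construction works with or without it.
task = CustomTask(**stored_hparams)
fresh = CustomTask(model="unet", lr=1e-3)
```

In this sketch, the unconditional `del kwargs["ignore"]` from PATCH 04 would raise `KeyError` when building `fresh`, which is the case the `if "ignore" in kwargs` guard in PATCH 08 covered.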
diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index 9732a50ac8e..9226ed92eec 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -52,7 +52,7 @@ }, { "cell_type": "code", - "execution_count": 1, + "execution_count": 2, "metadata": {}, "outputs": [], "source": [ @@ -63,14 +63,13 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 3, "id": "0b01bf43", "metadata": {}, "outputs": [], "source": [ "import lightning\n", "import lightning.pytorch as pl\n", - "import torch\n", "from lightning.pytorch.callbacks import ModelCheckpoint\n", "from torch.optim import AdamW\n", "from torch.optim.lr_scheduler import CosineAnnealingLR\n", @@ -128,7 +127,7 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": 6, "metadata": {}, "outputs": [], "source": [ @@ -136,8 +135,6 @@ "\n", " # any keywords we add here between *args and **kwargs will be found in self.hparams\n", " def __init__(self, *args, tmax=50, eta_min=1e-6, **kwargs) -> None:\n", - " if \"ignore\" in kwargs:\n", - " del kwargs[\"ignore\"] # this is a hack\n", " super().__init__(*args, **kwargs) # pass args and kwargs to the parent class\n", "\n", " def configure_optimizers(\n", @@ -195,7 +192,7 @@ " List of callbacks to apply.\n", " \"\"\"\n", " return [\n", - " ModelCheckpoint(every_n_epochs=50, save_top_k=-1),\n", + " ModelCheckpoint(every_n_epochs=50, save_top_k=-1, save_last=True),\n", " ModelCheckpoint(monitor=self.monitor, mode=self.mode, save_top_k=5),\n", " ]\n", "\n", @@ -225,7 +222,7 @@ }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 8, "metadata": {}, "outputs": [], "source": [ @@ -243,7 +240,7 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": 9, "metadata": {}, "outputs": [ { @@ -254,7 +251,6 @@ "\"eta_min\": 1e-06\n", "\"freeze_backbone\": False\n", "\"freeze_decoder\": False\n", - 
"\"ignore\": weights\n", "\"ignore_index\": None\n", "\"in_channels\": 3\n", "\"loss\": ce\n", @@ -266,7 +262,7 @@ "\"tmax\": 50" ] }, - "execution_count": 11, + "execution_count": 9, "metadata": {}, "output_type": "execute_result" } @@ -278,33 +274,41 @@ }, { "cell_type": "code", - "execution_count": 7, + "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ - "GPU available: True (cuda), used: True\n" + "GPU available: True (cuda), used: False\n", + "TPU available: False, using: 0 TPU cores\n", + "IPU available: False, using: 0 IPUs\n", + "HPU available: False, using: 0 HPUs\n", + "/home/calebrobinson/.conda/envs/geo/lib/python3.10/site-packages/lightning/pytorch/trainer/setup.py:187: GPU available but not used. You can set it by doing `Trainer(accelerator='gpu')`.\n", + "`Trainer(limit_train_batches=1)` was configured so 1 batch per epoch will be used.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - "TPU available: False, using: 0 TPU cores\n", - "IPU available: False, using: 0 IPUs\n", - "HPU available: False, using: 0 HPUs\n" + "`Trainer(limit_val_batches=1)` was configured so 1 batch will be used.\n" ] } ], "source": [ - "trainer = pl.Trainer(min_epochs=150, max_epochs=250, log_every_n_steps=50)" + "# The following Trainer config is useful just for testing the code in this notebook.\n", + "trainer = pl.Trainer(\n", + " limit_train_batches=1, limit_val_batches=1, num_sanity_val_steps=0, max_epochs=1\n", + ")\n", + "# You can use the following for actual training runs.\n", + "# trainer = pl.Trainer(min_epochs=150, max_epochs=250, log_every_n_steps=50)" ] }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 11, "metadata": {}, "outputs": [ { @@ -314,85 +318,10 @@ "The following callbacks returned in `LightningModule.configure_callbacks` will override existing callbacks passed to Trainer: ModelCheckpoint\n" ] }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - 
"Downloading https://landcover.ai.linuxpolska.com/download/landcover.ai.v1.zip to data/landcover.ai.v1.zip\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "100%|██████████| 1538212277/1538212277 [01:25<00:00, 17913845.14it/s]\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Processed M-33-20-D-c-4-2 1/41\n", - "Processed M-33-20-D-d-3-3 2/41\n", - "Processed M-33-32-B-b-4-4 3/41\n", - "Processed M-33-48-A-c-4-4 4/41\n", - "Processed M-33-7-A-d-2-3 5/41\n", - "Processed M-33-7-A-d-3-2 6/41\n", - "Processed M-34-32-B-a-4-3 7/41\n", - "Processed M-34-32-B-b-1-3 8/41\n", - "Processed M-34-5-D-d-4-2 9/41\n", - "Processed M-34-51-C-b-2-1 10/41\n", - "Processed M-34-51-C-d-4-1 11/41\n", - "Processed M-34-55-B-b-4-1 12/41\n", - "Processed M-34-56-A-b-1-4 13/41\n", - "Processed M-34-6-A-d-2-2 14/41\n", - "Processed M-34-65-D-a-4-4 15/41\n", - "Processed M-34-65-D-c-4-2 16/41\n", - "Processed M-34-65-D-d-4-1 17/41\n", - "Processed M-34-68-B-a-1-3 18/41\n", - "Processed M-34-77-B-c-2-3 19/41\n", - "Processed N-33-104-A-c-1-1 20/41\n", - "Processed N-33-119-C-c-3-3 21/41\n", - "Processed N-33-130-A-d-3-3 22/41\n", - "Processed N-33-130-A-d-4-4 23/41\n", - "Processed N-33-139-C-d-2-2 24/41\n", - "Processed N-33-139-C-d-2-4 25/41\n", - "Processed N-33-139-D-c-1-3 26/41\n", - "Processed N-33-60-D-c-4-2 27/41\n", - "Processed N-33-60-D-d-1-2 28/41\n", - "Processed N-33-96-D-d-1-1 29/41\n", - "Processed N-34-106-A-b-3-4 30/41\n", - "Processed N-34-106-A-c-1-3 31/41\n", - "Processed N-34-140-A-b-3-2 32/41\n", - "Processed N-34-140-A-b-4-2 33/41\n", - "Processed N-34-140-A-d-3-4 34/41\n", - "Processed N-34-140-A-d-4-2 35/41\n", - "Processed N-34-61-B-a-1-1 36/41\n", - "Processed N-34-66-C-c-4-3 37/41\n", - "Processed N-34-77-A-b-1-4 38/41\n", - "Processed N-34-94-A-b-2-4 39/41\n", - "Processed N-34-97-C-b-1-2 40/41\n" - ] - }, { "name": "stderr", "output_type": "stream", "text": [ - "You are using a CUDA device ('NVIDIA A100 
80GB PCIe') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Processed N-34-97-D-c-2-4 41/41\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3]\n", "\n", " | Name | Type | Params\n", "---------------------------------------------------\n", @@ -405,27 +334,14 @@ "32.5 M Trainable params\n", "0 Non-trainable params\n", "32.5 M Total params\n", - "130.087 Total estimated model params size (MB)\n" + "130.087 Total estimated model params size (MB)\n", + "/home/calebrobinson/.conda/envs/geo/lib/python3.10/site-packages/lightning/pytorch/loops/fit_loop.py:293: The number of training batches (1) is smaller than the logging interval Trainer(log_every_n_steps=50). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "00af14780e004ce69b89e436b7b95606", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "Sanity Checking: | | 0/? 
[00:00┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n", "┃ Test metric DataLoader 0 ┃\n", "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n", - "│ test_MeanIoU 0.3726549744606018 │\n", - "│ test_OverallAccuracy 0.8094286322593689 │\n", - "│ test_OverallF1Score 0.8094285726547241 │\n", - "│ test_OverallPrecision 0.8094286322593689 │\n", - "│ test_OverallRecall 0.8094286322593689 │\n", - "│ test_loss 0.4797952175140381 │\n", + "│ test_MeanIoU 0.012266275472939014 │\n", + "│ test_OverallAccuracy 0.038088466972112656 │\n", + "│ test_OverallF1Score 0.038088466972112656 │\n", + "│ test_OverallPrecision 0.038088466972112656 │\n", + "│ test_OverallRecall 0.038088466972112656 │\n", + "│ test_loss 1.8426358699798584 │\n", "└───────────────────────────┴───────────────────────────┘\n", "\n" ], @@ -529,12 +444,12 @@ "┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n", "┃\u001b[1m \u001b[0m\u001b[1m Test metric \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1m DataLoader 0 \u001b[0m\u001b[1m \u001b[0m┃\n", "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n", - "│\u001b[36m \u001b[0m\u001b[36m test_MeanIoU \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.3726549744606018 \u001b[0m\u001b[35m \u001b[0m│\n", - "│\u001b[36m \u001b[0m\u001b[36m test_OverallAccuracy \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.8094286322593689 \u001b[0m\u001b[35m \u001b[0m│\n", - "│\u001b[36m \u001b[0m\u001b[36m test_OverallF1Score \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.8094285726547241 \u001b[0m\u001b[35m \u001b[0m│\n", - "│\u001b[36m \u001b[0m\u001b[36m test_OverallPrecision \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.8094286322593689 \u001b[0m\u001b[35m \u001b[0m│\n", - "│\u001b[36m \u001b[0m\u001b[36m test_OverallRecall \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.8094286322593689 \u001b[0m\u001b[35m \u001b[0m│\n", - "│\u001b[36m \u001b[0m\u001b[36m 
test_loss \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.4797952175140381 \u001b[0m\u001b[35m \u001b[0m│\n", + "│\u001b[36m \u001b[0m\u001b[36m test_MeanIoU \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.012266275472939014 \u001b[0m\u001b[35m \u001b[0m│\n", + "│\u001b[36m \u001b[0m\u001b[36m test_OverallAccuracy \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.038088466972112656 \u001b[0m\u001b[35m \u001b[0m│\n", + "│\u001b[36m \u001b[0m\u001b[36m test_OverallF1Score \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.038088466972112656 \u001b[0m\u001b[35m \u001b[0m│\n", + "│\u001b[36m \u001b[0m\u001b[36m test_OverallPrecision \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.038088466972112656 \u001b[0m\u001b[35m \u001b[0m│\n", + "│\u001b[36m \u001b[0m\u001b[36m test_OverallRecall \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.038088466972112656 \u001b[0m\u001b[35m \u001b[0m│\n", + "│\u001b[36m \u001b[0m\u001b[36m test_loss \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 1.8426358699798584 \u001b[0m\u001b[35m \u001b[0m│\n", "└───────────────────────────┴───────────────────────────┘\n" ] }, @@ -544,15 +459,15 @@ { "data": { "text/plain": [ - "[{'test_loss': 0.4797952175140381,\n", - " 'test_MeanIoU': 0.3726549744606018,\n", - " 'test_OverallAccuracy': 0.8094286322593689,\n", - " 'test_OverallF1Score': 0.8094285726547241,\n", - " 'test_OverallPrecision': 0.8094286322593689,\n", - " 'test_OverallRecall': 0.8094286322593689}]" + "[{'test_loss': 1.8426358699798584,\n", + " 'test_MeanIoU': 0.012266275472939014,\n", + " 'test_OverallAccuracy': 0.038088466972112656,\n", + " 'test_OverallF1Score': 0.038088466972112656,\n", + " 'test_OverallPrecision': 0.038088466972112656,\n", + " 'test_OverallRecall': 0.038088466972112656}]" ] }, - "execution_count": 12, + "execution_count": 13, "metadata": {}, "output_type": "execute_result" } diff --git 
a/docs/tutorials/custom_segmentation_trainer.py b/docs/tutorials/custom_segmentation_trainer.py index 48ed6c7b56a..050f07a1fb0 100644 --- a/docs/tutorials/custom_segmentation_trainer.py +++ b/docs/tutorials/custom_segmentation_trainer.py @@ -42,7 +42,6 @@ import lightning import lightning.pytorch as pl -import torch from lightning.pytorch.callbacks import ModelCheckpoint from torch.optim import AdamW from torch.optim.lr_scheduler import CosineAnnealingLR @@ -81,8 +80,6 @@ class CustomSemanticSegmentationTask(SemanticSegmentationTask): # any keywords we add here between *args and **kwargs will be found in self.hparams def __init__(self, *args, tmax=50, eta_min=1e-6, **kwargs) -> None: - if "ignore" in kwargs: - del kwargs["ignore"] # this is a hack super().__init__(*args, **kwargs) # pass args and kwargs to the parent class def configure_optimizers( @@ -140,7 +137,7 @@ def configure_callbacks(self): List of callbacks to apply. """ return [ - ModelCheckpoint(every_n_epochs=50, save_top_k=-1), + ModelCheckpoint(every_n_epochs=50, save_top_k=-1, save_last=True), ModelCheckpoint(monitor=self.monitor, mode=self.mode, save_top_k=5), ] @@ -170,17 +167,22 @@ def on_train_epoch_start(self) -> None: # validate that the task's hyperparameters are as expected task.hparams -trainer = pl.Trainer(min_epochs=150, max_epochs=250, log_every_n_steps=50) +# The following Trainer config is useful just for testing the code in this notebook. +trainer = pl.Trainer( + limit_train_batches=1, limit_val_batches=1, num_sanity_val_steps=0, max_epochs=1 +) +# You can use the following for actual training runs. +# trainer = pl.Trainer(min_epochs=150, max_epochs=250, log_every_n_steps=50) trainer.fit(task, dm) # ## Test model # -# Finally, we test the model on the test set and visualize the results. +# Finally, we test the model (optionally loading from a previously saved checkpoint). 
-# If you are starting from a checkpoint, run this cell +# You can load directly from a saved checkpoint with `.load_from_checkpoint(...)` task = CustomSemanticSegmentationTask.load_from_checkpoint( - "lightning_logs/version_3/checkpoints/epoch=0-step=117.ckpt" + "lightning_logs/version_0/checkpoints/epoch=0-step=1.ckpt" ) trainer.test(task, dm) From c4734e540078734420458ab91c08831b5e7c838e Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Fri, 23 Feb 2024 21:01:42 +0000 Subject: [PATCH 12/31] Updating kernelspec name --- docs/tutorials/custom_segmentation_trainer.ipynb | 4 ++-- docs/tutorials/custom_segmentation_trainer.py | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index 9226ed92eec..6c50f6df227 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -482,9 +482,9 @@ "formats": "ipynb,py" }, "kernelspec": { - "display_name": "geo", + "display_name": "Python 3 (ipykernel)", "language": "python", - "name": "geo" + "name": "python3" }, "language_info": { "codemirror_mode": { diff --git a/docs/tutorials/custom_segmentation_trainer.py b/docs/tutorials/custom_segmentation_trainer.py index 050f07a1fb0..5efa90fadf1 100644 --- a/docs/tutorials/custom_segmentation_trainer.py +++ b/docs/tutorials/custom_segmentation_trainer.py @@ -8,9 +8,9 @@ # format_version: '1.5' # jupytext_version: 1.16.1 # kernelspec: -# display_name: geo +# display_name: Python 3 (ipykernel) # language: python -# name: geo +# name: python3 # --- # flake8: noqa: E501 From 8a2abc5e801aff1b6abaf17dad621df507ed14f6 Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Fri, 23 Feb 2024 21:11:34 +0000 Subject: [PATCH 13/31] Make mypy happy I think --- docs/tutorials/custom_segmentation_trainer.ipynb | 14 ++++++++++---- docs/tutorials/custom_segmentation_trainer.py | 14 ++++++++++---- 2 files changed, 20 insertions(+), 8 
deletions(-) diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index 6c50f6df227..a20c679d63e 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -68,6 +68,8 @@ "metadata": {}, "outputs": [], "source": [ + "from typing import Any, Union, Sequence\n", + "from lightning.pytorch.callbacks.callback import Callback\n", "import lightning\n", "import lightning.pytorch as pl\n", "from lightning.pytorch.callbacks import ModelCheckpoint\n", @@ -134,7 +136,7 @@ "class CustomSemanticSegmentationTask(SemanticSegmentationTask):\n", "\n", " # any keywords we add here between *args and **kwargs will be found in self.hparams\n", - " def __init__(self, *args, tmax=50, eta_min=1e-6, **kwargs) -> None:\n", + " def __init__(self, *args: Any, tmax: int=50, eta_min: float=1e-6, **kwargs: Any) -> None:\n", " super().__init__(*args, **kwargs) # pass args and kwargs to the parent class\n", "\n", " def configure_optimizers(\n", @@ -185,7 +187,7 @@ " self.val_metrics = self.train_metrics.clone(prefix=\"val_\")\n", " self.test_metrics = self.train_metrics.clone(prefix=\"test_\")\n", "\n", - " def configure_callbacks(self):\n", + " def configure_callbacks(self) -> Union[Sequence[Callback], Callback]:\n", " \"\"\"Initialize callbacks for saving the best and latest models.\n", "\n", " Returns:\n", @@ -198,8 +200,12 @@ "\n", " def on_train_epoch_start(self) -> None:\n", " \"\"\"Log the learning rate at the start of each training epoch.\"\"\"\n", - " lr = self.optimizers().param_groups[0][\"lr\"]\n", - " self.logger.experiment.add_scalar(\"lr\", lr, self.current_epoch)" + " optimizers = self.optimizers()\n", + " if isinstance(optimizers, list):\n", + " lr = optimizers[0].param_groups[0][\"lr\"]\n", + " else:\n", + " lr = optimizers.param_groups[0][\"lr\"]\n", + " self.logger.experiment.add_scalar(\"lr\", lr, self.current_epoch) # type: ignore" ] }, { diff --git 
a/docs/tutorials/custom_segmentation_trainer.py b/docs/tutorials/custom_segmentation_trainer.py index 5efa90fadf1..95e45ac887e 100644 --- a/docs/tutorials/custom_segmentation_trainer.py +++ b/docs/tutorials/custom_segmentation_trainer.py @@ -40,6 +40,8 @@ # UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details. import warnings +from typing import Any, Union, Sequence +from lightning.pytorch.callbacks.callback import Callback import lightning import lightning.pytorch as pl from lightning.pytorch.callbacks import ModelCheckpoint @@ -79,7 +81,7 @@ class CustomSemanticSegmentationTask(SemanticSegmentationTask): # any keywords we add here between *args and **kwargs will be found in self.hparams - def __init__(self, *args, tmax=50, eta_min=1e-6, **kwargs) -> None: + def __init__(self, *args: Any, tmax: int=50, eta_min: float=1e-6, **kwargs: Any) -> None: super().__init__(*args, **kwargs) # pass args and kwargs to the parent class def configure_optimizers( @@ -130,7 +132,7 @@ def configure_metrics(self) -> None: self.val_metrics = self.train_metrics.clone(prefix="val_") self.test_metrics = self.train_metrics.clone(prefix="test_") - def configure_callbacks(self): + def configure_callbacks(self) -> Union[Sequence[Callback], Callback]: """Initialize callbacks for saving the best and latest models. 
Returns: @@ -143,8 +145,12 @@ def configure_callbacks(self): def on_train_epoch_start(self) -> None: """Log the learning rate at the start of each training epoch.""" - lr = self.optimizers().param_groups[0]["lr"] - self.logger.experiment.add_scalar("lr", lr, self.current_epoch) + optimizers = self.optimizers() + if isinstance(optimizers, list): + lr = optimizers[0].param_groups[0]["lr"] + else: + lr = optimizers.param_groups[0]["lr"] + self.logger.experiment.add_scalar("lr", lr, self.current_epoch) # type: ignore # ## Train model From a6f70458c0b741c8bb66d5fc8a2e19a039da5648 Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Fri, 23 Feb 2024 21:12:50 +0000 Subject: [PATCH 14/31] Add tutorial to docs --- docs/index.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/index.rst b/docs/index.rst index ced959493a8..9e0b7b11bac 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -30,6 +30,7 @@ torchgeo tutorials/getting_started tutorials/custom_raster_dataset + tutorials/custom_segmentation_trainer tutorials/transforms tutorials/indices tutorials/trainers From fe3dd34f1012dc03acaab027ea5ab078d9add888 Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Fri, 23 Feb 2024 21:15:12 +0000 Subject: [PATCH 15/31] Linting --- docs/tutorials/custom_segmentation_trainer.ipynb | 11 +++++++---- docs/tutorials/custom_segmentation_trainer.py | 9 ++++++--- 2 files changed, 13 insertions(+), 7 deletions(-) diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index a20c679d63e..bdcac130ef5 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -58,7 +58,9 @@ "source": [ "# Get rid of the pesky warnings raised by kornia\n", "# UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. 
See the documentation of grid_sample for details.\n", - "import warnings" + "import warnings\n", + "from collections.abc import Sequence\n", + "from typing import Any, Union" ] }, { @@ -68,11 +70,10 @@ "metadata": {}, "outputs": [], "source": [ - "from typing import Any, Union, Sequence\n", - "from lightning.pytorch.callbacks.callback import Callback\n", "import lightning\n", "import lightning.pytorch as pl\n", "from lightning.pytorch.callbacks import ModelCheckpoint\n", + "from lightning.pytorch.callbacks.callback import Callback\n", "from torch.optim import AdamW\n", "from torch.optim.lr_scheduler import CosineAnnealingLR\n", "from torchmetrics import MetricCollection\n", @@ -136,7 +137,9 @@ "class CustomSemanticSegmentationTask(SemanticSegmentationTask):\n", "\n", " # any keywords we add here between *args and **kwargs will be found in self.hparams\n", - " def __init__(self, *args: Any, tmax: int=50, eta_min: float=1e-6, **kwargs: Any) -> None:\n", + " def __init__(\n", + " self, *args: Any, tmax: int = 50, eta_min: float = 1e-6, **kwargs: Any\n", + " ) -> None:\n", " super().__init__(*args, **kwargs) # pass args and kwargs to the parent class\n", "\n", " def configure_optimizers(\n", diff --git a/docs/tutorials/custom_segmentation_trainer.py b/docs/tutorials/custom_segmentation_trainer.py index 95e45ac887e..db77850b795 100644 --- a/docs/tutorials/custom_segmentation_trainer.py +++ b/docs/tutorials/custom_segmentation_trainer.py @@ -39,12 +39,13 @@ # Get rid of the pesky warnings raised by kornia # UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details. 
import warnings +from collections.abc import Sequence +from typing import Any, Union -from typing import Any, Union, Sequence -from lightning.pytorch.callbacks.callback import Callback import lightning import lightning.pytorch as pl from lightning.pytorch.callbacks import ModelCheckpoint +from lightning.pytorch.callbacks.callback import Callback from torch.optim import AdamW from torch.optim.lr_scheduler import CosineAnnealingLR from torchmetrics import MetricCollection @@ -81,7 +82,9 @@ class CustomSemanticSegmentationTask(SemanticSegmentationTask): # any keywords we add here between *args and **kwargs will be found in self.hparams - def __init__(self, *args: Any, tmax: int=50, eta_min: float=1e-6, **kwargs: Any) -> None: + def __init__( + self, *args: Any, tmax: int = 50, eta_min: float = 1e-6, **kwargs: Any + ) -> None: super().__init__(*args, **kwargs) # pass args and kwargs to the parent class def configure_optimizers( From cdd43a99fc552126dc80dd967a3029a75cbda701 Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Fri, 23 Feb 2024 21:43:49 +0000 Subject: [PATCH 16/31] Adding torchgeo datasets install --- docs/tutorials/custom_segmentation_trainer.ipynb | 2 +- docs/tutorials/custom_segmentation_trainer.py | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index bdcac130ef5..5c04d64333a 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -38,7 +38,7 @@ "metadata": {}, "outputs": [], "source": [ - "%pip install torchgeo" + "%pip install torchgeo[datasets]" ] }, { diff --git a/docs/tutorials/custom_segmentation_trainer.py b/docs/tutorials/custom_segmentation_trainer.py index db77850b795..475456ffd83 100644 --- a/docs/tutorials/custom_segmentation_trainer.py +++ b/docs/tutorials/custom_segmentation_trainer.py @@ -30,7 +30,7 @@ # # As always, we install TorchGeo. 
-# %pip install torchgeo +# %pip install torchgeo[datasets] # ## Imports # From 1a59301f6605baff2b03026670677bee2cea198c Mon Sep 17 00:00:00 2001 From: davrob Date: Wed, 18 Sep 2024 17:57:24 +0000 Subject: [PATCH 17/31] Ruff --- .../custom_segmentation_trainer.ipynb | 65 +++++++++---------- docs/tutorials/custom_segmentation_trainer.py | 65 +++++++++---------- 2 files changed, 64 insertions(+), 66 deletions(-) diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index 5c04d64333a..fcb94ac5862 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -60,7 +60,7 @@ "# UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details.\n", "import warnings\n", "from collections.abc import Sequence\n", - "from typing import Any, Union" + "from typing import Any" ] }, { @@ -107,7 +107,7 @@ "source": [ "from torchgeo.trainers import SemanticSegmentationTask\n", "\n", - "warnings.filterwarnings(\"ignore\", category=UserWarning, module=\"torch.nn.functional\")" + "warnings.filterwarnings('ignore', category=UserWarning, module='torch.nn.functional')" ] }, { @@ -135,7 +135,6 @@ "outputs": [], "source": [ "class CustomSemanticSegmentationTask(SemanticSegmentationTask):\n", - "\n", " # any keywords we add here between *args and **kwargs will be found in self.hparams\n", " def __init__(\n", " self, *args: Any, tmax: int = 50, eta_min: float = 1e-6, **kwargs: Any\n", @@ -144,53 +143,53 @@ "\n", " def configure_optimizers(\n", " self,\n", - " ) -> \"lightning.pytorch.utilities.types.OptimizerLRSchedulerConfig\":\n", + " ) -> 'lightning.pytorch.utilities.types.OptimizerLRSchedulerConfig':\n", " \"\"\"Initialize the optimizer and learning rate scheduler.\n", "\n", " Returns:\n", " Optimizer and learning rate 
scheduler.\n", " \"\"\"\n", - " tmax: int = self.hparams[\"tmax\"]\n", - " eta_min: float = self.hparams[\"eta_min\"]\n", + " tmax: int = self.hparams['tmax']\n", + " eta_min: float = self.hparams['eta_min']\n", "\n", - " optimizer = AdamW(self.parameters(), lr=self.hparams[\"lr\"])\n", + " optimizer = AdamW(self.parameters(), lr=self.hparams['lr'])\n", " scheduler = CosineAnnealingLR(optimizer, T_max=tmax, eta_min=eta_min)\n", " return {\n", - " \"optimizer\": optimizer,\n", - " \"lr_scheduler\": {\"scheduler\": scheduler, \"monitor\": self.monitor},\n", + " 'optimizer': optimizer,\n", + " 'lr_scheduler': {'scheduler': scheduler, 'monitor': self.monitor},\n", " }\n", "\n", " def configure_metrics(self) -> None:\n", " \"\"\"Initialize the performance metrics.\"\"\"\n", - " num_classes: int = self.hparams[\"num_classes\"]\n", + " num_classes: int = self.hparams['num_classes']\n", "\n", " self.train_metrics = MetricCollection(\n", " {\n", - " \"OverallAccuracy\": Accuracy(\n", - " task=\"multiclass\", num_classes=num_classes, average=\"micro\"\n", + " 'OverallAccuracy': Accuracy(\n", + " task='multiclass', num_classes=num_classes, average='micro'\n", " ),\n", - " \"OverallPrecision\": Precision(\n", - " task=\"multiclass\", num_classes=num_classes, average=\"micro\"\n", + " 'OverallPrecision': Precision(\n", + " task='multiclass', num_classes=num_classes, average='micro'\n", " ),\n", - " \"OverallRecall\": Recall(\n", - " task=\"multiclass\", num_classes=num_classes, average=\"micro\"\n", + " 'OverallRecall': Recall(\n", + " task='multiclass', num_classes=num_classes, average='micro'\n", " ),\n", - " \"OverallF1Score\": FBetaScore(\n", - " task=\"multiclass\",\n", + " 'OverallF1Score': FBetaScore(\n", + " task='multiclass',\n", " num_classes=num_classes,\n", " beta=1.0,\n", - " average=\"micro\",\n", + " average='micro',\n", " ),\n", - " \"MeanIoU\": JaccardIndex(\n", - " num_classes=num_classes, task=\"multiclass\", average=\"macro\"\n", + " 'MeanIoU': 
JaccardIndex(\n", + " num_classes=num_classes, task='multiclass', average='macro'\n", " ),\n", " },\n", - " prefix=\"train_\",\n", + " prefix='train_',\n", " )\n", - " self.val_metrics = self.train_metrics.clone(prefix=\"val_\")\n", - " self.test_metrics = self.train_metrics.clone(prefix=\"test_\")\n", + " self.val_metrics = self.train_metrics.clone(prefix='val_')\n", + " self.test_metrics = self.train_metrics.clone(prefix='test_')\n", "\n", - " def configure_callbacks(self) -> Union[Sequence[Callback], Callback]:\n", + " def configure_callbacks(self) -> Sequence[Callback] | Callback:\n", " \"\"\"Initialize callbacks for saving the best and latest models.\n", "\n", " Returns:\n", @@ -205,10 +204,10 @@ " \"\"\"Log the learning rate at the start of each training epoch.\"\"\"\n", " optimizers = self.optimizers()\n", " if isinstance(optimizers, list):\n", - " lr = optimizers[0].param_groups[0][\"lr\"]\n", + " lr = optimizers[0].param_groups[0]['lr']\n", " else:\n", - " lr = optimizers.param_groups[0][\"lr\"]\n", - " self.logger.experiment.add_scalar(\"lr\", lr, self.current_epoch) # type: ignore" + " lr = optimizers.param_groups[0]['lr']\n", + " self.logger.experiment.add_scalar('lr', lr, self.current_epoch) # type: ignore" ] }, { @@ -226,7 +225,7 @@ "metadata": {}, "outputs": [], "source": [ - "dm = LandCoverAIDataModule(root=\"data/\", batch_size=64, num_workers=8, download=True)" + "dm = LandCoverAIDataModule(root='data/', batch_size=64, num_workers=8, download=True)" ] }, { @@ -236,12 +235,12 @@ "outputs": [], "source": [ "task = CustomSemanticSegmentationTask(\n", - " model=\"unet\",\n", - " backbone=\"resnet50\",\n", + " model='unet',\n", + " backbone='resnet50',\n", " weights=True,\n", " in_channels=3,\n", " num_classes=6,\n", - " loss=\"ce\",\n", + " loss='ce',\n", " lr=1e-3,\n", " tmax=50,\n", ")" @@ -404,7 +403,7 @@ "source": [ "# You can load directly from a saved checkpoint with `.load_from_checkpoint(...)`\n", "task = 
CustomSemanticSegmentationTask.load_from_checkpoint(\n", - " \"lightning_logs/version_0/checkpoints/epoch=0-step=1.ckpt\"\n", + " 'lightning_logs/version_0/checkpoints/epoch=0-step=1.ckpt'\n", ")" ] }, diff --git a/docs/tutorials/custom_segmentation_trainer.py b/docs/tutorials/custom_segmentation_trainer.py index 475456ffd83..55f0c493a95 100644 --- a/docs/tutorials/custom_segmentation_trainer.py +++ b/docs/tutorials/custom_segmentation_trainer.py @@ -40,7 +40,7 @@ # UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details. import warnings from collections.abc import Sequence -from typing import Any, Union +from typing import Any import lightning import lightning.pytorch as pl @@ -62,7 +62,7 @@ # + from torchgeo.trainers import SemanticSegmentationTask -warnings.filterwarnings("ignore", category=UserWarning, module="torch.nn.functional") +warnings.filterwarnings('ignore', category=UserWarning, module='torch.nn.functional') # - @@ -80,7 +80,6 @@ class CustomSemanticSegmentationTask(SemanticSegmentationTask): - # any keywords we add here between *args and **kwargs will be found in self.hparams def __init__( self, *args: Any, tmax: int = 50, eta_min: float = 1e-6, **kwargs: Any @@ -89,53 +88,53 @@ def __init__( def configure_optimizers( self, - ) -> "lightning.pytorch.utilities.types.OptimizerLRSchedulerConfig": + ) -> 'lightning.pytorch.utilities.types.OptimizerLRSchedulerConfig': """Initialize the optimizer and learning rate scheduler. Returns: Optimizer and learning rate scheduler. 
""" - tmax: int = self.hparams["tmax"] - eta_min: float = self.hparams["eta_min"] + tmax: int = self.hparams['tmax'] + eta_min: float = self.hparams['eta_min'] - optimizer = AdamW(self.parameters(), lr=self.hparams["lr"]) + optimizer = AdamW(self.parameters(), lr=self.hparams['lr']) scheduler = CosineAnnealingLR(optimizer, T_max=tmax, eta_min=eta_min) return { - "optimizer": optimizer, - "lr_scheduler": {"scheduler": scheduler, "monitor": self.monitor}, + 'optimizer': optimizer, + 'lr_scheduler': {'scheduler': scheduler, 'monitor': self.monitor}, } def configure_metrics(self) -> None: """Initialize the performance metrics.""" - num_classes: int = self.hparams["num_classes"] + num_classes: int = self.hparams['num_classes'] self.train_metrics = MetricCollection( { - "OverallAccuracy": Accuracy( - task="multiclass", num_classes=num_classes, average="micro" + 'OverallAccuracy': Accuracy( + task='multiclass', num_classes=num_classes, average='micro' ), - "OverallPrecision": Precision( - task="multiclass", num_classes=num_classes, average="micro" + 'OverallPrecision': Precision( + task='multiclass', num_classes=num_classes, average='micro' ), - "OverallRecall": Recall( - task="multiclass", num_classes=num_classes, average="micro" + 'OverallRecall': Recall( + task='multiclass', num_classes=num_classes, average='micro' ), - "OverallF1Score": FBetaScore( - task="multiclass", + 'OverallF1Score': FBetaScore( + task='multiclass', num_classes=num_classes, beta=1.0, - average="micro", + average='micro', ), - "MeanIoU": JaccardIndex( - num_classes=num_classes, task="multiclass", average="macro" + 'MeanIoU': JaccardIndex( + num_classes=num_classes, task='multiclass', average='macro' ), }, - prefix="train_", + prefix='train_', ) - self.val_metrics = self.train_metrics.clone(prefix="val_") - self.test_metrics = self.train_metrics.clone(prefix="test_") + self.val_metrics = self.train_metrics.clone(prefix='val_') + self.test_metrics = self.train_metrics.clone(prefix='test_') - def 
configure_callbacks(self) -> Union[Sequence[Callback], Callback]: + def configure_callbacks(self) -> Sequence[Callback] | Callback: """Initialize callbacks for saving the best and latest models. Returns: @@ -150,25 +149,25 @@ def on_train_epoch_start(self) -> None: """Log the learning rate at the start of each training epoch.""" optimizers = self.optimizers() if isinstance(optimizers, list): - lr = optimizers[0].param_groups[0]["lr"] + lr = optimizers[0].param_groups[0]['lr'] else: - lr = optimizers.param_groups[0]["lr"] - self.logger.experiment.add_scalar("lr", lr, self.current_epoch) # type: ignore + lr = optimizers.param_groups[0]['lr'] + self.logger.experiment.add_scalar('lr', lr, self.current_epoch) # type: ignore # ## Train model # # The remainder of the turial is straightforward and follows the typical [PyTorch Lightning](https://lightning.ai/) training routine. We instantiate a `DataModule` for the LandCover.AI dataset, instantiate a `CustomSemanticSegmentationTask` with a U-Net and ResNet-50 backbone, then train the model using a Lightning trainer. 
-dm = LandCoverAIDataModule(root="data/", batch_size=64, num_workers=8, download=True) +dm = LandCoverAIDataModule(root='data/', batch_size=64, num_workers=8, download=True) task = CustomSemanticSegmentationTask( - model="unet", - backbone="resnet50", + model='unet', + backbone='resnet50', weights=True, in_channels=3, num_classes=6, - loss="ce", + loss='ce', lr=1e-3, tmax=50, ) @@ -191,7 +190,7 @@ def on_train_epoch_start(self) -> None: # You can load directly from a saved checkpoint with `.load_from_checkpoint(...)` task = CustomSemanticSegmentationTask.load_from_checkpoint( - "lightning_logs/version_0/checkpoints/epoch=0-step=1.ckpt" + 'lightning_logs/version_0/checkpoints/epoch=0-step=1.ckpt' ) trainer.test(task, dm) From 27953c219a08b1f84a299f243f0490dbd52dad02 Mon Sep 17 00:00:00 2001 From: davrob Date: Wed, 18 Sep 2024 19:52:05 +0000 Subject: [PATCH 18/31] Update to notebook --- .../custom_segmentation_trainer.ipynb | 150 ++++++++---------- 1 file changed, 69 insertions(+), 81 deletions(-) diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index fcb94ac5862..f7ecd6b94e3 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -52,7 +52,7 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 1, "metadata": {}, "outputs": [], "source": [ @@ -60,7 +60,10 @@ "# UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. 
See the documentation of grid_sample for details.\n", "import warnings\n", "from collections.abc import Sequence\n", - "from typing import Any" + "from typing import Any\n", + "\n", + "warnings.filterwarnings('ignore', category=UserWarning, module='torch.nn.functional')\n", + "warnings.filterwarnings('ignore', category=FutureWarning)" ] }, { @@ -83,31 +86,10 @@ " JaccardIndex,\n", " Precision,\n", " Recall,\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "id": "3e721f3f", - "metadata": {}, - "outputs": [], - "source": [ - "from torchgeo.datamodules import LandCoverAIDataModule" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "id": "7fb5fbbf", - "metadata": { - "lines_to_end_of_cell_marker": 2 - }, - "outputs": [], - "source": [ - "from torchgeo.trainers import SemanticSegmentationTask\n", + ")\n", "\n", - "warnings.filterwarnings('ignore', category=UserWarning, module='torch.nn.functional')" + "from torchgeo.datamodules import LandCoverAI100DataModule\n", + "from torchgeo.trainers import SemanticSegmentationTask" ] }, { @@ -130,7 +112,7 @@ }, { "cell_type": "code", - "execution_count": 6, + "execution_count": 4, "metadata": {}, "outputs": [], "source": [ @@ -221,16 +203,16 @@ }, { "cell_type": "code", - "execution_count": 7, + "execution_count": 5, "metadata": {}, "outputs": [], "source": [ - "dm = LandCoverAIDataModule(root='data/', batch_size=64, num_workers=8, download=True)" + "dm = LandCoverAI100DataModule(root='data/', batch_size=64, num_workers=8, download=True)" ] }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 6, "metadata": {}, "outputs": [], "source": [ @@ -248,7 +230,7 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": 7, "metadata": {}, "outputs": [ { @@ -270,7 +252,7 @@ "\"tmax\": 50" ] }, - "execution_count": 9, + "execution_count": 7, "metadata": {}, "output_type": "execute_result" } @@ -282,25 +264,18 @@ }, { "cell_type": "code", - "execution_count": 10, + 
"execution_count": 8, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ - "GPU available: True (cuda), used: False\n", + "Trainer will use only 1 of 4 GPUs because it is running inside an interactive / notebook environment. You may try to set `Trainer(devices=4)` but please note that multi-GPU inside interactive / notebook environments is considered experimental and unstable. Your mileage may vary.\n", + "GPU available: True (cuda), used: True\n", "TPU available: False, using: 0 TPU cores\n", - "IPU available: False, using: 0 IPUs\n", "HPU available: False, using: 0 HPUs\n", - "/home/calebrobinson/.conda/envs/geo/lib/python3.10/site-packages/lightning/pytorch/trainer/setup.py:187: GPU available but not used. You can set it by doing `Trainer(accelerator='gpu')`.\n", - "`Trainer(limit_train_batches=1)` was configured so 1 batch per epoch will be used.\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ + "`Trainer(limit_train_batches=1)` was configured so 1 batch per epoch will be used.\n", "`Trainer(limit_val_batches=1)` was configured so 1 batch will be used.\n" ] } @@ -316,40 +291,52 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ - "The following callbacks returned in `LightningModule.configure_callbacks` will override existing callbacks passed to Trainer: ModelCheckpoint\n" + "The following callbacks returned in `LightningModule.configure_callbacks` will override existing callbacks passed to Trainer: ModelCheckpoint\n", + "You are using a CUDA device ('NVIDIA A100 80GB PCIe') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. 
For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Downloading https://cdn-lfs-us-1.huggingface.co/repos/76/99/7699a6c85994316c8a0bbf95d41627e5f1b3ea8501f66f73c0e2f53eb0afec45/bfe5bcf501a54cfd8ebf985346da50be5e8b751d3491812cd0c226b5a3abff41?response-content-disposition=inline%3B+filename*%3DUTF-8%27%27landcoverai100.zip%3B+filename%3D%22landcoverai100.zip%22%3B&response-content-type=application%2Fzip&Expires=1726948262&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTcyNjk0ODI2Mn19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmh1Z2dpbmdmYWNlLmNvL3JlcG9zLzc2Lzk5Lzc2OTlhNmM4NTk5NDMxNmM4YTBiYmY5NWQ0MTYyN2U1ZjFiM2VhODUwMWY2NmY3M2MwZTJmNTNlYjBhZmVjNDUvYmZlNWJjZjUwMWE1NGNmZDhlYmY5ODUzNDZkYTUwYmU1ZThiNzUxZDM0OTE4MTJjZDBjMjI2YjVhM2FiZmY0MT9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoifV19&Signature=Gcz-ypfchbehbW2jBiQ%7ElsHtponr7Cpnoakn%7EcoIJ3O4JEiRgQBmW8GzoF7g9GVzRYHoSloeBlPSjkhB8FTdA5GDmKx16TJddhcFbzEhxAOzpYS9FBHVMf9yh0Ofbhy9w64GonBLDo6Lm97O%7EuGqTc4azZ-KzRTNTEvA%7Ej4%7Epb5aALM6vifb%7EfdIbfBtJC%7ECfB7U4Idu5gdqbnGIbUbnmpcd-g7UUgw3T1HbroqjappCt9HouM6nhAeoop2yBq7sgjxgDSdUA3rBACm8Hqi3rFth6Q4IFe%7Emoxt1mjV%7E0M2I2AHiSHK2k%7EZ%7ENaZgBUSk4S3R8mkWBrIv9JG19azAWg__&Key-Pair-Id=K24J24Z295AEI9 to data/landcoverai100.zip\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ + "100%|██████████| 9278198/9278198 [00:00<00:00, 21183125.27it/s]\n", + "LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3]\n", "\n", - " | Name | Type | Params\n", - "---------------------------------------------------\n", - "0 | criterion | CrossEntropyLoss | 0 \n", - "1 | train_metrics | MetricCollection | 0 \n", - "2 | val_metrics | MetricCollection | 0 \n", - "3 | test_metrics | MetricCollection | 0 \n", - "4 | model | Unet | 32.5 M\n", - 
"---------------------------------------------------\n", + " | Name | Type | Params | Mode \n", + "-----------------------------------------------------------\n", + "0 | model | Unet | 32.5 M | train\n", + "1 | criterion | CrossEntropyLoss | 0 | train\n", + "2 | train_metrics | MetricCollection | 0 | train\n", + "3 | val_metrics | MetricCollection | 0 | train\n", + "4 | test_metrics | MetricCollection | 0 | train\n", + "-----------------------------------------------------------\n", "32.5 M Trainable params\n", "0 Non-trainable params\n", "32.5 M Total params\n", "130.087 Total estimated model params size (MB)\n", - "/home/calebrobinson/.conda/envs/geo/lib/python3.10/site-packages/lightning/pytorch/loops/fit_loop.py:293: The number of training batches (1) is smaller than the logging interval Trainer(log_every_n_steps=50). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.\n" + "242 Modules in train mode\n", + "0 Modules in eval mode\n", + "/opt/conda/envs/torchgeo/lib/python3.12/site-packages/lightning/pytorch/loops/fit_loop.py:298: The number of training batches (1) is smaller than the logging interval Trainer(log_every_n_steps=50). 
Set a lower value for log_every_n_steps if you want to see logs for the training epoch.\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "0834b6894deb4a7a9edc9d90a8d9ed31", + "model_id": "47c8ef6ff2f147a287fcd137058b2509", "version_major": 2, "version_minor": 0 }, @@ -363,7 +350,7 @@ { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "24564c905f3d432eb9ba6171f03c2273", + "model_id": "db9116f3e41245c6897b885aa322dcbc", "version_major": 2, "version_minor": 0 }, @@ -397,7 +384,7 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 10, "metadata": {}, "outputs": [], "source": [ @@ -409,20 +396,21 @@ }, { "cell_type": "code", - "execution_count": 13, + "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ - "The following callbacks returned in `LightningModule.configure_callbacks` will override existing callbacks passed to Trainer: ModelCheckpoint\n" + "The following callbacks returned in `LightningModule.configure_callbacks` will override existing callbacks passed to Trainer: ModelCheckpoint\n", + "LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3]\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "5dc29f3a1d4a449ab1ef94c58c0937ba", + "model_id": "ab3b452d9a8f410fb5b2dfa6c660cbcf", "version_major": 2, "version_minor": 0 }, @@ -439,12 +427,12 @@ "
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n",
        "┃        Test metric               DataLoader 0        ┃\n",
        "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n",
-       "│       test_MeanIoU           0.012266275472939014    │\n",
-       "│   test_OverallAccuracy       0.038088466972112656    │\n",
-       "│    test_OverallF1Score       0.038088466972112656    │\n",
-       "│   test_OverallPrecision      0.038088466972112656    │\n",
-       "│    test_OverallRecall        0.038088466972112656    │\n",
-       "│         test_loss             1.8426358699798584     │\n",
+       "│       test_MeanIoU           0.0038406727835536003   │\n",
+       "│   test_OverallAccuracy       0.0015612284187227488   │\n",
+       "│    test_OverallF1Score       0.0015612284187227488   │\n",
+       "│   test_OverallPrecision      0.0015612284187227488   │\n",
+       "│    test_OverallRecall        0.0015612284187227488   │\n",
+       "│         test_loss              17.85824203491211     │\n",
        "└───────────────────────────┴───────────────────────────┘\n",
        "
\n" ], @@ -452,12 +440,12 @@ "┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n", "┃\u001b[1m \u001b[0m\u001b[1m Test metric \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1m DataLoader 0 \u001b[0m\u001b[1m \u001b[0m┃\n", "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n", - "│\u001b[36m \u001b[0m\u001b[36m test_MeanIoU \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.012266275472939014 \u001b[0m\u001b[35m \u001b[0m│\n", - "│\u001b[36m \u001b[0m\u001b[36m test_OverallAccuracy \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.038088466972112656 \u001b[0m\u001b[35m \u001b[0m│\n", - "│\u001b[36m \u001b[0m\u001b[36m test_OverallF1Score \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.038088466972112656 \u001b[0m\u001b[35m \u001b[0m│\n", - "│\u001b[36m \u001b[0m\u001b[36m test_OverallPrecision \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.038088466972112656 \u001b[0m\u001b[35m \u001b[0m│\n", - "│\u001b[36m \u001b[0m\u001b[36m test_OverallRecall \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.038088466972112656 \u001b[0m\u001b[35m \u001b[0m│\n", - "│\u001b[36m \u001b[0m\u001b[36m test_loss \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 1.8426358699798584 \u001b[0m\u001b[35m \u001b[0m│\n", + "│\u001b[36m \u001b[0m\u001b[36m test_MeanIoU \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.0038406727835536003 \u001b[0m\u001b[35m \u001b[0m│\n", + "│\u001b[36m \u001b[0m\u001b[36m test_OverallAccuracy \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.0015612284187227488 \u001b[0m\u001b[35m \u001b[0m│\n", + "│\u001b[36m \u001b[0m\u001b[36m test_OverallF1Score \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.0015612284187227488 \u001b[0m\u001b[35m \u001b[0m│\n", + "│\u001b[36m \u001b[0m\u001b[36m test_OverallPrecision \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.0015612284187227488 \u001b[0m\u001b[35m 
\u001b[0m│\n", + "│\u001b[36m \u001b[0m\u001b[36m test_OverallRecall \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.0015612284187227488 \u001b[0m\u001b[35m \u001b[0m│\n", + "│\u001b[36m \u001b[0m\u001b[36m test_loss \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 17.85824203491211 \u001b[0m\u001b[35m \u001b[0m│\n", "└───────────────────────────┴───────────────────────────┘\n" ] }, @@ -467,15 +455,15 @@ { "data": { "text/plain": [ - "[{'test_loss': 1.8426358699798584,\n", - " 'test_MeanIoU': 0.012266275472939014,\n", - " 'test_OverallAccuracy': 0.038088466972112656,\n", - " 'test_OverallF1Score': 0.038088466972112656,\n", - " 'test_OverallPrecision': 0.038088466972112656,\n", - " 'test_OverallRecall': 0.038088466972112656}]" + "[{'test_loss': 17.85824203491211,\n", + " 'test_MeanIoU': 0.0038406727835536003,\n", + " 'test_OverallAccuracy': 0.0015612284187227488,\n", + " 'test_OverallF1Score': 0.0015612284187227488,\n", + " 'test_OverallPrecision': 0.0015612284187227488,\n", + " 'test_OverallRecall': 0.0015612284187227488}]" ] }, - "execution_count": 13, + "execution_count": 11, "metadata": {}, "output_type": "execute_result" } @@ -490,7 +478,7 @@ "formats": "ipynb,py" }, "kernelspec": { - "display_name": "Python 3 (ipykernel)", + "display_name": "torchgeo", "language": "python", "name": "python3" }, @@ -504,7 +492,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.13" + "version": "3.12.5" } }, "nbformat": 4, From 5fc368ef88cb344cca901d97448ff41bf4d11e6d Mon Sep 17 00:00:00 2001 From: davrob Date: Wed, 18 Sep 2024 20:00:16 +0000 Subject: [PATCH 19/31] Adding some comments --- docs/tutorials/custom_segmentation_trainer.ipynb | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index f7ecd6b94e3..b7ab002325c 100644 --- 
a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -38,7 +38,7 @@ "metadata": {}, "outputs": [], "source": [ - "%pip install torchgeo[datasets]" + "%pip install torchgeo" ] }, { @@ -198,7 +198,7 @@ "source": [ "## Train model\n", "\n", - "The remainder of the turial is straightforward and follows the typical [PyTorch Lightning](https://lightning.ai/) training routine. We instantiate a `DataModule` for the LandCover.AI dataset, instantiate a `CustomSemanticSegmentationTask` with a U-Net and ResNet-50 backbone, then train the model using a Lightning trainer." + "The remainder of the tutorial is straightforward and follows the typical [PyTorch Lightning](https://lightning.ai/) training routine. We instantiate a `DataModule` for the LandCover.AI 100 dataset (a small version of the LandCover.AI dataset for notebook testing), instantiate a `CustomSemanticSegmentationTask` with a U-Net and ResNet-50 backbone, then train the model using a Lightning trainer."
] }, { @@ -207,7 +207,11 @@ "metadata": {}, "outputs": [], "source": [ - "dm = LandCoverAI100DataModule(root='data/', batch_size=64, num_workers=8, download=True)" + "dm = LandCoverAI100DataModule(root='data/', batch_size=64, num_workers=8, download=True)\n", + "\n", + "# You can use the following for actual training runs\n", + "# from torchgeo.datamodules import LandCoverAIDataModule\n", + "# dm = LandCoverAIDataModule(root='data/', batch_size=64, num_workers=8, download=True)" ] }, { @@ -389,6 +393,8 @@ "outputs": [], "source": [ "# You can load directly from a saved checkpoint with `.load_from_checkpoint(...)`\n", + "# Note that you can also just call `trainer.test(task, dm)` if you've already trained\n", + "# the model in the current notebook session.\n", "task = CustomSemanticSegmentationTask.load_from_checkpoint(\n", " 'lightning_logs/version_0/checkpoints/epoch=0-step=1.ckpt'\n", ")" From 7a6dc060364a2e61fa8225220f6dd694ed3e5e03 Mon Sep 17 00:00:00 2001 From: davrob Date: Wed, 18 Sep 2024 20:01:45 +0000 Subject: [PATCH 20/31] Remove the .py file --- docs/tutorials/custom_segmentation_trainer.py | 196 ------------------ 1 file changed, 196 deletions(-) delete mode 100644 docs/tutorials/custom_segmentation_trainer.py diff --git a/docs/tutorials/custom_segmentation_trainer.py b/docs/tutorials/custom_segmentation_trainer.py deleted file mode 100644 index 55f0c493a95..00000000000 --- a/docs/tutorials/custom_segmentation_trainer.py +++ /dev/null @@ -1,196 +0,0 @@ -# --- -# jupyter: -# jupytext: -# formats: ipynb,py -# text_representation: -# extension: .py -# format_name: light -# format_version: '1.5' -# jupytext_version: 1.16.1 -# kernelspec: -# display_name: Python 3 (ipykernel) -# language: python -# name: python3 -# --- - -# flake8: noqa: E501 -# -# Copyright (c) Microsoft Corporation. All rights reserved. -# Licensed under the MIT License. 
- -# # Custom Trainers -# -# In this tutorial, we demonstrate how to extend a TorchGeo ["trainer class"](https://torchgeo.readthedocs.io/en/latest/api/trainers.html). In TorchGeo there exist several trainer classes that are pre-made PyTorch Lightning Modules designed to allow for the easy training of models on semantic segmentation, classification, change detection, etc. tasks using TorchGeo's [prebuild DataModules](https://torchgeo.readthedocs.io/en/latest/api/datamodules.html). While the trainers aim to provide sensible defaults and customization options for common tasks, they will not be able to cover all situations (e.g. researchers will likely want to implement and use their own architectures, loss functions, optimizers, etc. in the training routine). If you run into such a situation, then you can simply extend the trainer class you are interested in, and write custom logic to override the default functionality. -# -# This tutorial shows how to do exactly this to customize a learning rate schedule, logging, and model checkpointing for a semantic segmentation task using the [LandCoverAI](https://landcover.ai.linuxpolska.com/) dataset. -# -# It's recommended to run this notebook on Google Colab if you don't have your own GPU. Click the "Open in Colab" button above to get started. - -# ## Setup -# -# As always, we install TorchGeo. - -# %pip install torchgeo[datasets] - -# ## Imports -# -# Next, we import TorchGeo and any other libraries we need. - -# Get rid of the pesky warnings raised by kornia -# UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details. 
-import warnings -from collections.abc import Sequence -from typing import Any - -import lightning -import lightning.pytorch as pl -from lightning.pytorch.callbacks import ModelCheckpoint -from lightning.pytorch.callbacks.callback import Callback -from torch.optim import AdamW -from torch.optim.lr_scheduler import CosineAnnealingLR -from torchmetrics import MetricCollection -from torchmetrics.classification import ( - Accuracy, - FBetaScore, - JaccardIndex, - Precision, - Recall, -) - -from torchgeo.datamodules import LandCoverAIDataModule - -# + -from torchgeo.trainers import SemanticSegmentationTask - -warnings.filterwarnings('ignore', category=UserWarning, module='torch.nn.functional') - - -# - - -# ## Custom SemanticSegmentationTask -# -# Now, we create a `CustomSemanticSegmentationTask` class that inhierits from `SemanticSegmentationTask` and that overrides a few methods: -# - `__init__`: We add two new parameters `tmax` and `eta_min` to control the learning rate scheduler -# - `configure_optimizers`: We use the `CosineAnnealingLR` learning rate scheduler instead of the default `ReduceLROnPlateau` -# - `configure_metrics`: We add a "MeanIou" metric (what we will use to evaluate the model's performance) and a variety of other classification metrics -# - `configure_callbacks`: We demonstrate how to stack `ModelCheckpoint` callbacks to save the best checkpoint as well as periodic checkpoints -# - `on_train_epoch_start`: We log the learning rate at the start of each epoch so we can easily see how it decays over a training run -# -# Overall these demonstrate how to customize the training routine to investigate specific research questions (e.g. of the scheduler on test performance). 
- - -class CustomSemanticSegmentationTask(SemanticSegmentationTask): - # any keywords we add here between *args and **kwargs will be found in self.hparams - def __init__( - self, *args: Any, tmax: int = 50, eta_min: float = 1e-6, **kwargs: Any - ) -> None: - super().__init__(*args, **kwargs) # pass args and kwargs to the parent class - - def configure_optimizers( - self, - ) -> 'lightning.pytorch.utilities.types.OptimizerLRSchedulerConfig': - """Initialize the optimizer and learning rate scheduler. - - Returns: - Optimizer and learning rate scheduler. - """ - tmax: int = self.hparams['tmax'] - eta_min: float = self.hparams['eta_min'] - - optimizer = AdamW(self.parameters(), lr=self.hparams['lr']) - scheduler = CosineAnnealingLR(optimizer, T_max=tmax, eta_min=eta_min) - return { - 'optimizer': optimizer, - 'lr_scheduler': {'scheduler': scheduler, 'monitor': self.monitor}, - } - - def configure_metrics(self) -> None: - """Initialize the performance metrics.""" - num_classes: int = self.hparams['num_classes'] - - self.train_metrics = MetricCollection( - { - 'OverallAccuracy': Accuracy( - task='multiclass', num_classes=num_classes, average='micro' - ), - 'OverallPrecision': Precision( - task='multiclass', num_classes=num_classes, average='micro' - ), - 'OverallRecall': Recall( - task='multiclass', num_classes=num_classes, average='micro' - ), - 'OverallF1Score': FBetaScore( - task='multiclass', - num_classes=num_classes, - beta=1.0, - average='micro', - ), - 'MeanIoU': JaccardIndex( - num_classes=num_classes, task='multiclass', average='macro' - ), - }, - prefix='train_', - ) - self.val_metrics = self.train_metrics.clone(prefix='val_') - self.test_metrics = self.train_metrics.clone(prefix='test_') - - def configure_callbacks(self) -> Sequence[Callback] | Callback: - """Initialize callbacks for saving the best and latest models. - - Returns: - List of callbacks to apply. 
- """ - return [ - ModelCheckpoint(every_n_epochs=50, save_top_k=-1, save_last=True), - ModelCheckpoint(monitor=self.monitor, mode=self.mode, save_top_k=5), - ] - - def on_train_epoch_start(self) -> None: - """Log the learning rate at the start of each training epoch.""" - optimizers = self.optimizers() - if isinstance(optimizers, list): - lr = optimizers[0].param_groups[0]['lr'] - else: - lr = optimizers.param_groups[0]['lr'] - self.logger.experiment.add_scalar('lr', lr, self.current_epoch) # type: ignore - - -# ## Train model -# -# The remainder of the turial is straightforward and follows the typical [PyTorch Lightning](https://lightning.ai/) training routine. We instantiate a `DataModule` for the LandCover.AI dataset, instantiate a `CustomSemanticSegmentationTask` with a U-Net and ResNet-50 backbone, then train the model using a Lightning trainer. - -dm = LandCoverAIDataModule(root='data/', batch_size=64, num_workers=8, download=True) - -task = CustomSemanticSegmentationTask( - model='unet', - backbone='resnet50', - weights=True, - in_channels=3, - num_classes=6, - loss='ce', - lr=1e-3, - tmax=50, -) - -# validate that the task's hyperparameters are as expected -task.hparams - -# The following Trainer config is useful just for testing the code in this notebook. -trainer = pl.Trainer( - limit_train_batches=1, limit_val_batches=1, num_sanity_val_steps=0, max_epochs=1 -) -# You can use the following for actual training runs. -# trainer = pl.Trainer(min_epochs=150, max_epochs=250, log_every_n_steps=50) - -trainer.fit(task, dm) - -# ## Test model -# -# Finally, we test the model (optionally loading from a previously saved checkpoint). 
- -# You can load directly from a saved checkpoint with `.load_from_checkpoint(...)` -task = CustomSemanticSegmentationTask.load_from_checkpoint( - 'lightning_logs/version_0/checkpoints/epoch=0-step=1.ckpt' -) - -trainer.test(task, dm) From d7402fd49c68db13999f26e1624c394f680d8fd8 Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Fri, 20 Sep 2024 16:12:28 +0000 Subject: [PATCH 21/31] Download less data --- docs/tutorials/custom_raster_dataset.ipynb | 2 +- docs/tutorials/getting_started.ipynb | 8 ++++---- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/tutorials/custom_raster_dataset.ipynb b/docs/tutorials/custom_raster_dataset.ipynb index 0e2785bb7ff..54b830b806d 100644 --- a/docs/tutorials/custom_raster_dataset.ipynb +++ b/docs/tutorials/custom_raster_dataset.ipynb @@ -251,7 +251,7 @@ "root = os.path.join(tempfile.gettempdir(), 'sentinel')\n", "item_urls = [\n", " 'https://planetarycomputer.microsoft.com/api/stac/v1/collections/sentinel-2-l2a/items/S2B_MSIL2A_20220902T090559_R050_T40XDH_20220902T181115',\n", - " 'https://planetarycomputer.microsoft.com/api/stac/v1/collections/sentinel-2-l2a/items/S2B_MSIL2A_20220718T084609_R107_T40XEJ_20220718T175008',\n", + " #'https://planetarycomputer.microsoft.com/api/stac/v1/collections/sentinel-2-l2a/items/S2B_MSIL2A_20220718T084609_R107_T40XEJ_20220718T175008',\n", "]\n", "\n", "for item_url in item_urls:\n", diff --git a/docs/tutorials/getting_started.ipynb b/docs/tutorials/getting_started.ipynb index 1b1982711b3..a70dfa3d863 100644 --- a/docs/tutorials/getting_started.ipynb +++ b/docs/tutorials/getting_started.ipynb @@ -155,9 +155,9 @@ ")\n", "tiles = [\n", " 'm_3807511_ne_18_060_20181104.tif',\n", - " 'm_3807511_se_18_060_20181104.tif',\n", - " 'm_3807512_nw_18_060_20180815.tif',\n", - " 'm_3807512_sw_18_060_20180815.tif',\n", + " #'m_3807511_se_18_060_20181104.tif',\n", + " #'m_3807512_nw_18_060_20180815.tif',\n", + " #'m_3807512_sw_18_060_20180815.tif',\n", "]\n", "for tile in tiles:\n", " 
download_url(naip_url + tile, naip_root)\n", @@ -314,7 +314,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.8" + "version": "3.10.14" } }, "nbformat": 4, From c4c02ec487a385cc66a2ecb60f80d33c93b791d0 Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Fri, 20 Sep 2024 16:19:10 +0000 Subject: [PATCH 22/31] ruff --- docs/tutorials/custom_raster_dataset.ipynb | 2 +- docs/tutorials/getting_started.ipynb | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/tutorials/custom_raster_dataset.ipynb b/docs/tutorials/custom_raster_dataset.ipynb index 54b830b806d..1f6e4b14ea3 100644 --- a/docs/tutorials/custom_raster_dataset.ipynb +++ b/docs/tutorials/custom_raster_dataset.ipynb @@ -250,7 +250,7 @@ "source": [ "root = os.path.join(tempfile.gettempdir(), 'sentinel')\n", "item_urls = [\n", - " 'https://planetarycomputer.microsoft.com/api/stac/v1/collections/sentinel-2-l2a/items/S2B_MSIL2A_20220902T090559_R050_T40XDH_20220902T181115',\n", + " 'https://planetarycomputer.microsoft.com/api/stac/v1/collections/sentinel-2-l2a/items/S2B_MSIL2A_20220902T090559_R050_T40XDH_20220902T181115'\n", " #'https://planetarycomputer.microsoft.com/api/stac/v1/collections/sentinel-2-l2a/items/S2B_MSIL2A_20220718T084609_R107_T40XEJ_20220718T175008',\n", "]\n", "\n", diff --git a/docs/tutorials/getting_started.ipynb b/docs/tutorials/getting_started.ipynb index a70dfa3d863..b121bbbd8b7 100644 --- a/docs/tutorials/getting_started.ipynb +++ b/docs/tutorials/getting_started.ipynb @@ -154,7 +154,7 @@ " 'https://naipeuwest.blob.core.windows.net/naip/v002/de/2018/de_060cm_2018/38075/'\n", ")\n", "tiles = [\n", - " 'm_3807511_ne_18_060_20181104.tif',\n", + " 'm_3807511_ne_18_060_20181104.tif'\n", " #'m_3807511_se_18_060_20181104.tif',\n", " #'m_3807512_nw_18_060_20180815.tif',\n", " #'m_3807512_sw_18_060_20180815.tif',\n", From 4acc493e4424f2b9d341968e18f86974e4e6c6c1 Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Fri, 20 Sep 
2024 16:40:48 +0000 Subject: [PATCH 23/31] nbstripout --- docs/tutorials/custom_raster_dataset.ipynb | 11 +++-- .../custom_segmentation_trainer.ipynb | 26 +++++----- docs/tutorials/getting_started.ipynb | 43 ++++++++--------- docs/tutorials/indices.ipynb | 3 -- docs/tutorials/pretrained_weights.ipynb | 3 -- docs/tutorials/trainers.ipynb | 47 +++++++++---------- docs/tutorials/transforms.ipynb | 3 -- 7 files changed, 61 insertions(+), 75 deletions(-) diff --git a/docs/tutorials/custom_raster_dataset.ipynb b/docs/tutorials/custom_raster_dataset.ipynb index 1f6e4b14ea3..81c45c28acf 100644 --- a/docs/tutorials/custom_raster_dataset.ipynb +++ b/docs/tutorials/custom_raster_dataset.ipynb @@ -84,6 +84,13 @@ }, { "cell_type": "code", + "custom": { + "metadata": { + "tags": [ + "skip-execution" + ] + } + }, "execution_count": null, "metadata": { "colab": { @@ -540,10 +547,6 @@ } ], "metadata": { - "colab": { - "collapsed_sections": [], - "provenance": [] - }, "execution": { "timeout": 1200 }, diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index b7ab002325c..c8e05f1212f 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -52,7 +52,7 @@ }, { "cell_type": "code", - "execution_count": 1, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -68,8 +68,8 @@ }, { "cell_type": "code", - "execution_count": 3, - "id": "0b01bf43", + "execution_count": null, + "id": "6", "metadata": {}, "outputs": [], "source": [ @@ -112,7 +112,7 @@ }, { "cell_type": "code", - "execution_count": 4, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -203,7 +203,7 @@ }, { "cell_type": "code", - "execution_count": 5, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -216,7 +216,7 @@ }, { "cell_type": "code", - "execution_count": 6, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -234,7 
+234,7 @@ }, { "cell_type": "code", - "execution_count": 7, + "execution_count": null, "metadata": {}, "outputs": [ { @@ -256,7 +256,7 @@ "\"tmax\": 50" ] }, - "execution_count": 7, + "execution_count": null, "metadata": {}, "output_type": "execute_result" } @@ -268,7 +268,7 @@ }, { "cell_type": "code", - "execution_count": 8, + "execution_count": null, "metadata": {}, "outputs": [ { @@ -295,7 +295,7 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": null, "metadata": {}, "outputs": [ { @@ -388,7 +388,7 @@ }, { "cell_type": "code", - "execution_count": 10, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -402,7 +402,7 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": null, "metadata": {}, "outputs": [ { @@ -469,7 +469,7 @@ " 'test_OverallRecall': 0.0015612284187227488}]" ] }, - "execution_count": 11, + "execution_count": null, "metadata": {}, "output_type": "execute_result" } diff --git a/docs/tutorials/getting_started.ipynb b/docs/tutorials/getting_started.ipynb index b121bbbd8b7..eb96602150f 100644 --- a/docs/tutorials/getting_started.ipynb +++ b/docs/tutorials/getting_started.ipynb @@ -2,7 +2,7 @@ "cells": [ { "cell_type": "markdown", - "id": "35303546", + "id": "0", "metadata": {}, "source": [ "Copyright (c) Microsoft Corporation. 
All rights reserved.\n", @@ -12,7 +12,7 @@ }, { "cell_type": "markdown", - "id": "9478ed9a", + "id": "1", "metadata": { "id": "NdrXRgjU7Zih" }, @@ -26,7 +26,7 @@ }, { "cell_type": "markdown", - "id": "34f10e9f", + "id": "2", "metadata": { "id": "lCqHTGRYBZcz" }, @@ -39,16 +39,16 @@ { "cell_type": "code", "execution_count": null, - "id": "019092f0", + "id": "3", "metadata": {}, "outputs": [], "source": [ - "%pip install torchgeo" + "#%pip install torchgeo" ] }, { "cell_type": "markdown", - "id": "4db9f791", + "id": "4", "metadata": { "id": "dV0NLHfGBMWl" }, @@ -61,7 +61,7 @@ { "cell_type": "code", "execution_count": null, - "id": "3d92b0f1", + "id": "5", "metadata": { "id": "entire-albania" }, @@ -79,7 +79,7 @@ }, { "cell_type": "markdown", - "id": "7f26e4b8", + "id": "6", "metadata": { "id": "5rLknZxrBEMz" }, @@ -92,7 +92,7 @@ { "cell_type": "code", "execution_count": null, - "id": "4a39af46", + "id": "7", "metadata": { "colab": { "base_uri": "https://localhost:8080/", @@ -167,7 +167,7 @@ }, { "cell_type": "markdown", - "id": "e25bad40", + "id": "8", "metadata": { "id": "HQVji2B22Qfu" }, @@ -178,7 +178,7 @@ { "cell_type": "code", "execution_count": null, - "id": "689bb2b0", + "id": "9", "metadata": { "colab": { "base_uri": "https://localhost:8080/" @@ -195,7 +195,7 @@ }, { "cell_type": "markdown", - "id": "56f2d78b", + "id": "10", "metadata": { "id": "OWUhlfpD22IX" }, @@ -206,7 +206,7 @@ { "cell_type": "code", "execution_count": null, - "id": "daefbc4d", + "id": "11", "metadata": { "id": "WXxy8F8l2-aC" }, @@ -217,7 +217,7 @@ }, { "cell_type": "markdown", - "id": "ded44652", + "id": "12", "metadata": { "id": "yF_R54Yf3EUd" }, @@ -230,7 +230,7 @@ { "cell_type": "code", "execution_count": null, - "id": "b8a0d99c", + "id": "13", "metadata": { "id": "RLczuU293itT" }, @@ -241,7 +241,7 @@ }, { "cell_type": "markdown", - "id": "5b8c1c52", + "id": "14", "metadata": { "id": "OWa-mmYd8S6K" }, @@ -254,7 +254,7 @@ { "cell_type": "code", "execution_count": null, - "id": 
"96faa142", + "id": "15", "metadata": { "id": "jfx-9ZmU8ZTc" }, @@ -265,7 +265,7 @@ }, { "cell_type": "markdown", - "id": "64ae63f7", + "id": "16", "metadata": { "id": "HZIfqqW58oZe" }, @@ -278,7 +278,7 @@ { "cell_type": "code", "execution_count": null, - "id": "8a2b44f8", + "id": "17", "metadata": { "id": "7sGmNvBy8uIg" }, @@ -291,11 +291,6 @@ } ], "metadata": { - "colab": { - "collapsed_sections": [], - "name": "getting_started.ipynb", - "provenance": [] - }, "execution": { "timeout": 1200 }, diff --git a/docs/tutorials/indices.ipynb b/docs/tutorials/indices.ipynb index 30576609ac8..ceb93a8223d 100644 --- a/docs/tutorials/indices.ipynb +++ b/docs/tutorials/indices.ipynb @@ -353,9 +353,6 @@ } ], "metadata": { - "colab": { - "provenance": [] - }, "execution": { "timeout": 1200 }, diff --git a/docs/tutorials/pretrained_weights.ipynb b/docs/tutorials/pretrained_weights.ipynb index 0c2c4a0fc48..e2b4c7612b0 100644 --- a/docs/tutorials/pretrained_weights.ipynb +++ b/docs/tutorials/pretrained_weights.ipynb @@ -473,9 +473,6 @@ ], "metadata": { "accelerator": "GPU", - "colab": { - "provenance": [] - }, "execution": { "timeout": 1200 }, diff --git a/docs/tutorials/trainers.ipynb b/docs/tutorials/trainers.ipynb index 5de3937a026..37d91a5714a 100644 --- a/docs/tutorials/trainers.ipynb +++ b/docs/tutorials/trainers.ipynb @@ -2,7 +2,7 @@ "cells": [ { "cell_type": "markdown", - "id": "b13c2251", + "id": "0", "metadata": { "id": "b13c2251" }, @@ -14,7 +14,7 @@ }, { "cell_type": "markdown", - "id": "e563313d", + "id": "1", "metadata": { "id": "e563313d" }, @@ -28,7 +28,7 @@ }, { "cell_type": "markdown", - "id": "8c1f4156", + "id": "2", "metadata": { "id": "8c1f4156" }, @@ -41,7 +41,7 @@ { "cell_type": "code", "execution_count": null, - "id": "3f0d31a8", + "id": "3", "metadata": { "id": "3f0d31a8" }, @@ -52,7 +52,7 @@ }, { "cell_type": "markdown", - "id": "c90c94c7", + "id": "4", "metadata": { "id": "c90c94c7" }, @@ -65,7 +65,7 @@ { "cell_type": "code", "execution_count": null, - 
"id": "bd39f485", + "id": "5", "metadata": { "id": "bd39f485" }, @@ -89,7 +89,7 @@ }, { "cell_type": "markdown", - "id": "e6e1d9b6", + "id": "6", "metadata": { "id": "e6e1d9b6" }, @@ -103,7 +103,7 @@ }, { "cell_type": "markdown", - "id": "9f2daa0d", + "id": "7", "metadata": { "id": "9f2daa0d" }, @@ -114,7 +114,7 @@ { "cell_type": "code", "execution_count": null, - "id": "8e100f8b", + "id": "8", "metadata": { "id": "8e100f8b", "nbmake": { @@ -137,7 +137,7 @@ { "cell_type": "code", "execution_count": null, - "id": "0f2a04c7", + "id": "9", "metadata": { "id": "0f2a04c7" }, @@ -151,7 +151,7 @@ }, { "cell_type": "markdown", - "id": "056b7b4c", + "id": "10", "metadata": { "id": "056b7b4c" }, @@ -162,7 +162,7 @@ { "cell_type": "code", "execution_count": null, - "id": "ba5c5442", + "id": "11", "metadata": { "id": "ba5c5442" }, @@ -181,7 +181,7 @@ }, { "cell_type": "markdown", - "id": "d4b67f3e", + "id": "12", "metadata": { "id": "d4b67f3e" }, @@ -194,7 +194,7 @@ { "cell_type": "code", "execution_count": null, - "id": "ffe26e5c", + "id": "13", "metadata": { "id": "ffe26e5c" }, @@ -211,7 +211,7 @@ }, { "cell_type": "markdown", - "id": "06afd8c7", + "id": "14", "metadata": { "id": "06afd8c7" }, @@ -222,7 +222,7 @@ { "cell_type": "code", "execution_count": null, - "id": "225a6d36", + "id": "15", "metadata": { "id": "225a6d36" }, @@ -241,7 +241,7 @@ }, { "cell_type": "markdown", - "id": "44d71e8f", + "id": "16", "metadata": { "id": "44d71e8f" }, @@ -252,7 +252,7 @@ { "cell_type": "code", "execution_count": null, - "id": "00e08790", + "id": "17", "metadata": { "id": "00e08790" }, @@ -263,7 +263,7 @@ }, { "cell_type": "markdown", - "id": "73700fb5", + "id": "18", "metadata": { "id": "73700fb5" }, @@ -274,7 +274,7 @@ { "cell_type": "code", "execution_count": null, - "id": "3e95ee0a", + "id": "19", "metadata": {}, "outputs": [], "source": [ @@ -283,7 +283,7 @@ }, { "cell_type": "markdown", - "id": "04cfc7a8", + "id": "20", "metadata": { "id": "04cfc7a8" }, @@ -294,7 +294,7 @@ { 
"cell_type": "code", "execution_count": null, - "id": "604a3b2f", + "id": "21", "metadata": { "id": "604a3b2f" }, @@ -306,9 +306,6 @@ ], "metadata": { "accelerator": "GPU", - "colab": { - "provenance": [] - }, "execution": { "timeout": 1200 }, diff --git a/docs/tutorials/transforms.ipynb b/docs/tutorials/transforms.ipynb index a7de9f32c69..4aeb12cbb2a 100644 --- a/docs/tutorials/transforms.ipynb +++ b/docs/tutorials/transforms.ipynb @@ -693,9 +693,6 @@ ], "metadata": { "accelerator": "GPU", - "colab": { - "provenance": [] - }, "execution": { "timeout": 1200 }, From 62663ded19a9d7ba18972e134bbe13e49ff16ff6 Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Fri, 20 Sep 2024 17:05:17 +0000 Subject: [PATCH 24/31] See if removing pip installs help --- docs/tutorials/custom_raster_dataset.ipynb | 9 +-------- docs/tutorials/custom_segmentation_trainer.ipynb | 10 +++++++--- docs/tutorials/getting_started.ipynb | 2 +- docs/tutorials/indices.ipynb | 2 +- docs/tutorials/pretrained_weights.ipynb | 2 +- docs/tutorials/trainers.ipynb | 2 +- docs/tutorials/transforms.ipynb | 2 +- 7 files changed, 13 insertions(+), 16 deletions(-) diff --git a/docs/tutorials/custom_raster_dataset.ipynb b/docs/tutorials/custom_raster_dataset.ipynb index 81c45c28acf..e3bbb784506 100644 --- a/docs/tutorials/custom_raster_dataset.ipynb +++ b/docs/tutorials/custom_raster_dataset.ipynb @@ -84,13 +84,6 @@ }, { "cell_type": "code", - "custom": { - "metadata": { - "tags": [ - "skip-execution" - ] - } - }, "execution_count": null, "metadata": { "colab": { @@ -101,7 +94,7 @@ }, "outputs": [], "source": [ - "%pip install torchgeo planetary_computer pystac" + "# %pip install torchgeo planetary_computer pystac" ] }, { diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index c8e05f1212f..d408b71423f 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -38,7 +38,7 @@ "metadata": {}, 
"outputs": [], "source": [ - "%pip install torchgeo" + "# %pip install torchgeo" ] }, { @@ -287,7 +287,11 @@ "source": [ "# The following Trainer config is useful just for testing the code in this notebook.\n", "trainer = pl.Trainer(\n", - " limit_train_batches=1, limit_val_batches=1, num_sanity_val_steps=0, max_epochs=1\n", + " limit_train_batches=1,\n", + " limit_val_batches=1,\n", + " num_sanity_val_steps=0,\n", + " max_epochs=1,\n", + " accelerator='cpu',\n", ")\n", "# You can use the following for actual training runs.\n", "# trainer = pl.Trainer(min_epochs=150, max_epochs=250, log_every_n_steps=50)" @@ -498,7 +502,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.12.5" + "version": "3.10.14" } }, "nbformat": 4, diff --git a/docs/tutorials/getting_started.ipynb b/docs/tutorials/getting_started.ipynb index eb96602150f..a6d18646361 100644 --- a/docs/tutorials/getting_started.ipynb +++ b/docs/tutorials/getting_started.ipynb @@ -43,7 +43,7 @@ "metadata": {}, "outputs": [], "source": [ - "#%pip install torchgeo" + "# %pip install torchgeo" ] }, { diff --git a/docs/tutorials/indices.ipynb b/docs/tutorials/indices.ipynb index ceb93a8223d..05ddcbbc049 100644 --- a/docs/tutorials/indices.ipynb +++ b/docs/tutorials/indices.ipynb @@ -64,7 +64,7 @@ }, "outputs": [], "source": [ - "%pip install torchgeo" + "# %pip install torchgeo" ] }, { diff --git a/docs/tutorials/pretrained_weights.ipynb b/docs/tutorials/pretrained_weights.ipynb index e2b4c7612b0..fb71f63733e 100644 --- a/docs/tutorials/pretrained_weights.ipynb +++ b/docs/tutorials/pretrained_weights.ipynb @@ -47,7 +47,7 @@ }, "outputs": [], "source": [ - "%pip install torchgeo" + "# %pip install torchgeo" ] }, { diff --git a/docs/tutorials/trainers.ipynb b/docs/tutorials/trainers.ipynb index 37d91a5714a..dabafbd33b3 100644 --- a/docs/tutorials/trainers.ipynb +++ b/docs/tutorials/trainers.ipynb @@ -47,7 +47,7 @@ }, "outputs": [], "source": [ - "%pip install torchgeo 
tensorboard" + "# %pip install torchgeo tensorboard" ] }, { diff --git a/docs/tutorials/transforms.ipynb b/docs/tutorials/transforms.ipynb index 4aeb12cbb2a..b9549315f7b 100644 --- a/docs/tutorials/transforms.ipynb +++ b/docs/tutorials/transforms.ipynb @@ -61,7 +61,7 @@ }, "outputs": [], "source": [ - "%pip install torchgeo" + "# %pip install torchgeo" ] }, { From ef438f245eddb992b49324424b457253b17ae611 Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Fri, 20 Sep 2024 17:22:57 +0000 Subject: [PATCH 25/31] What is going on? --- docs/tutorials/custom_segmentation_trainer.ipynb | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index d408b71423f..5cbe1536507 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -75,6 +75,7 @@ "source": [ "import lightning\n", "import lightning.pytorch as pl\n", + "import torch\n", "from lightning.pytorch.callbacks import ModelCheckpoint\n", "from lightning.pytorch.callbacks.callback import Callback\n", "from torch.optim import AdamW\n", @@ -207,7 +208,7 @@ "metadata": {}, "outputs": [], "source": [ - "dm = LandCoverAI100DataModule(root='data/', batch_size=64, num_workers=8, download=True)\n", + "dm = LandCoverAI100DataModule(root='data', batch_size=10, num_workers=2, download=True)\n", "\n", "# You can use the following for actual training runs\n", "# from torchgeo.datamodules import LandCoverAIDataModule\n", @@ -291,7 +292,7 @@ " limit_val_batches=1,\n", " num_sanity_val_steps=0,\n", " max_epochs=1,\n", - " accelerator='cpu',\n", + " accelerator='gpu' if torch.cuda.is_available() else 'cpu',\n", ")\n", "# You can use the following for actual training runs.\n", "# trainer = pl.Trainer(min_epochs=150, max_epochs=250, log_every_n_steps=50)" @@ -399,9 +400,10 @@ "# You can load directly from a saved checkpoint with 
`.load_from_checkpoint(...)`\n", "# Note that you can also just call `trainer.test(task, dm)` if you've already trained\n", "# the model in the current notebook session.\n", - "task = CustomSemanticSegmentationTask.load_from_checkpoint(\n", - " 'lightning_logs/version_0/checkpoints/epoch=0-step=1.ckpt'\n", - ")" + "\n", + "# task = CustomSemanticSegmentationTask.load_from_checkpoint(\n", + "# 'lightning_logs/version_0/checkpoints/epoch=0-step=1.ckpt'\n", + "# )" ] }, { From 2092f43e166a3f49df3126984b6a014328837c23 Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Fri, 20 Sep 2024 17:53:07 +0000 Subject: [PATCH 26/31] Try pip installs again --- docs/tutorials/custom_raster_dataset.ipynb | 2 +- docs/tutorials/custom_segmentation_trainer.ipynb | 9 +++++---- docs/tutorials/getting_started.ipynb | 2 +- docs/tutorials/indices.ipynb | 2 +- docs/tutorials/pretrained_weights.ipynb | 2 +- docs/tutorials/trainers.ipynb | 2 +- docs/tutorials/transforms.ipynb | 2 +- 7 files changed, 11 insertions(+), 10 deletions(-) diff --git a/docs/tutorials/custom_raster_dataset.ipynb b/docs/tutorials/custom_raster_dataset.ipynb index e3bbb784506..ea04196195f 100644 --- a/docs/tutorials/custom_raster_dataset.ipynb +++ b/docs/tutorials/custom_raster_dataset.ipynb @@ -94,7 +94,7 @@ }, "outputs": [], "source": [ - "# %pip install torchgeo planetary_computer pystac" + "%pip install torchgeo planetary_computer pystac" ] }, { diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index 5cbe1536507..75747fd0bbb 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -38,7 +38,7 @@ "metadata": {}, "outputs": [], "source": [ - "# %pip install torchgeo" + "%pip install torchgeo" ] }, { @@ -75,6 +75,7 @@ "source": [ "import lightning\n", "import lightning.pytorch as pl\n", + "import os\n", "import torch\n", "from lightning.pytorch.callbacks import ModelCheckpoint\n", "from 
lightning.pytorch.callbacks.callback import Callback\n", @@ -401,9 +402,9 @@ "# Note that you can also just call `trainer.test(task, dm)` if you've already trained\n", "# the model in the current notebook session.\n", "\n", - "# task = CustomSemanticSegmentationTask.load_from_checkpoint(\n", - "# 'lightning_logs/version_0/checkpoints/epoch=0-step=1.ckpt'\n", - "# )" + "task = CustomSemanticSegmentationTask.load_from_checkpoint(\n", + " os.path.join('lightning_logs', 'version_0', 'checkpoints', 'epoch=0-step=1.ckpt')\n", + ")" ] }, { diff --git a/docs/tutorials/getting_started.ipynb b/docs/tutorials/getting_started.ipynb index a6d18646361..b5edc63d64f 100644 --- a/docs/tutorials/getting_started.ipynb +++ b/docs/tutorials/getting_started.ipynb @@ -43,7 +43,7 @@ "metadata": {}, "outputs": [], "source": [ - "# %pip install torchgeo" + "%pip install torchgeo" ] }, { diff --git a/docs/tutorials/indices.ipynb b/docs/tutorials/indices.ipynb index 05ddcbbc049..ceb93a8223d 100644 --- a/docs/tutorials/indices.ipynb +++ b/docs/tutorials/indices.ipynb @@ -64,7 +64,7 @@ }, "outputs": [], "source": [ - "# %pip install torchgeo" + "%pip install torchgeo" ] }, { diff --git a/docs/tutorials/pretrained_weights.ipynb b/docs/tutorials/pretrained_weights.ipynb index fb71f63733e..e2b4c7612b0 100644 --- a/docs/tutorials/pretrained_weights.ipynb +++ b/docs/tutorials/pretrained_weights.ipynb @@ -47,7 +47,7 @@ }, "outputs": [], "source": [ - "# %pip install torchgeo" + "%pip install torchgeo" ] }, { diff --git a/docs/tutorials/trainers.ipynb b/docs/tutorials/trainers.ipynb index dabafbd33b3..37d91a5714a 100644 --- a/docs/tutorials/trainers.ipynb +++ b/docs/tutorials/trainers.ipynb @@ -47,7 +47,7 @@ }, "outputs": [], "source": [ - "# %pip install torchgeo tensorboard" + "%pip install torchgeo tensorboard" ] }, { diff --git a/docs/tutorials/transforms.ipynb b/docs/tutorials/transforms.ipynb index b9549315f7b..4aeb12cbb2a 100644 --- a/docs/tutorials/transforms.ipynb +++ 
b/docs/tutorials/transforms.ipynb @@ -61,7 +61,7 @@ }, "outputs": [], "source": [ - "# %pip install torchgeo" + "%pip install torchgeo" ] }, { From 037fc015a01465e3445a36febcfa85a57790826b Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Fri, 20 Sep 2024 17:59:50 +0000 Subject: [PATCH 27/31] Ruff --- docs/tutorials/custom_segmentation_trainer.ipynb | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index 75747fd0bbb..fd51a4e45f7 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -73,9 +73,10 @@ "metadata": {}, "outputs": [], "source": [ + "import os\n", + "\n", "import lightning\n", "import lightning.pytorch as pl\n", - "import os\n", "import torch\n", "from lightning.pytorch.callbacks import ModelCheckpoint\n", "from lightning.pytorch.callbacks.callback import Callback\n", From 30c76da575c6d01bedf8f55733214fa0a4fd88fd Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Tue, 1 Oct 2024 13:32:47 -0700 Subject: [PATCH 28/31] Update docs/tutorials/custom_segmentation_trainer.ipynb Co-authored-by: Adam J. Stewart --- docs/tutorials/custom_segmentation_trainer.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index fd51a4e45f7..fd89571c3e7 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -18,7 +18,7 @@ "\n", "In this tutorial, we demonstrate how to extend a TorchGeo [\"trainer class\"](https://torchgeo.readthedocs.io/en/latest/api/trainers.html). In TorchGeo there exist several trainer classes that are pre-made PyTorch Lightning Modules designed to allow for the easy training of models on semantic segmentation, classification, change detection, etc. 
tasks using TorchGeo's [prebuild DataModules](https://torchgeo.readthedocs.io/en/latest/api/datamodules.html). While the trainers aim to provide sensible defaults and customization options for common tasks, they will not be able to cover all situations (e.g. researchers will likely want to implement and use their own architectures, loss functions, optimizers, etc. in the training routine). If you run into such a situation, then you can simply extend the trainer class you are interested in, and write custom logic to override the default functionality.\n", "\n", - "This tutorial shows how to do exactly this to customize a learning rate schedule, logging, and model checkpointing for a semantic segmentation task using the [LandCoverAI](https://landcover.ai.linuxpolska.com/) dataset.\n", + "This tutorial shows how to do exactly this to customize a learning rate schedule, logging, and model checkpointing for a semantic segmentation task using the [LandCover.ai](https://landcover.ai.linuxpolska.com/) dataset.\n", "\n", "It's recommended to run this notebook on Google Colab if you don't have your own GPU. Click the \"Open in Colab\" button above to get started." ] From 0c4aa0d7c982a10bbaf21d93f91a96c3064b3521 Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Tue, 1 Oct 2024 13:33:00 -0700 Subject: [PATCH 29/31] Update docs/tutorials/custom_segmentation_trainer.ipynb Co-authored-by: Adam J. Stewart --- docs/tutorials/custom_segmentation_trainer.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index fd89571c3e7..90ff58c33aa 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -16,7 +16,7 @@ "source": [ "# Custom Trainers\n", "\n", - "In this tutorial, we demonstrate how to extend a TorchGeo [\"trainer class\"](https://torchgeo.readthedocs.io/en/latest/api/trainers.html). 
In TorchGeo there exist several trainer classes that are pre-made PyTorch Lightning Modules designed to allow for the easy training of models on semantic segmentation, classification, change detection, etc. tasks using TorchGeo's [prebuild DataModules](https://torchgeo.readthedocs.io/en/latest/api/datamodules.html). While the trainers aim to provide sensible defaults and customization options for common tasks, they will not be able to cover all situations (e.g. researchers will likely want to implement and use their own architectures, loss functions, optimizers, etc. in the training routine). If you run into such a situation, then you can simply extend the trainer class you are interested in, and write custom logic to override the default functionality.\n", + "In this tutorial, we demonstrate how to extend a TorchGeo [\"trainer class\"](https://torchgeo.readthedocs.io/en/latest/api/trainers.html). In TorchGeo there exist several trainer classes that are pre-made PyTorch Lightning Modules designed to allow for the easy training of models on semantic segmentation, classification, change detection, etc. tasks using TorchGeo's [prebuilt DataModules](https://torchgeo.readthedocs.io/en/latest/api/datamodules.html). While the trainers aim to provide sensible defaults and customization options for common tasks, they will not be able to cover all situations (e.g. researchers will likely want to implement and use their own architectures, loss functions, optimizers, etc. in the training routine). 
If you run into such a situation, then you can simply extend the trainer class you are interested in, and write custom logic to override the default functionality.\n", "\n", "This tutorial shows how to do exactly this to customize a learning rate schedule, logging, and model checkpointing for a semantic segmentation task using the [LandCover.ai](https://landcover.ai.linuxpolska.com/) dataset.\n", "\n", From fed6524e845f465c86f7dcb2b9e73e2a3cb50bee Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Tue, 1 Oct 2024 13:33:43 -0700 Subject: [PATCH 30/31] Update docs/tutorials/custom_segmentation_trainer.ipynb Co-authored-by: Adam J. Stewart --- docs/tutorials/custom_segmentation_trainer.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index 90ff58c33aa..b67a3ad4771 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -106,7 +106,7 @@ "Now, we create a `CustomSemanticSegmentationTask` class that inhierits from `SemanticSegmentationTask` and that overrides a few methods:\n", "- `__init__`: We add two new parameters `tmax` and `eta_min` to control the learning rate scheduler\n", "- `configure_optimizers`: We use the `CosineAnnealingLR` learning rate scheduler instead of the default `ReduceLROnPlateau`\n", - "- `configure_metrics`: We add a \"MeanIou\" metric (what we will use to evaluate the model's performance) and a variety of other classification metrics\n", + "- `configure_metrics`: We add a \"MeanIoU\" metric (what we will use to evaluate the model's performance) and a variety of other classification metrics\n", "- `configure_callbacks`: We demonstrate how to stack `ModelCheckpoint` callbacks to save the best checkpoint as well as periodic checkpoints\n", "- `on_train_epoch_start`: We log the learning rate at the start of each epoch so we can easily see how it decays over a 
training run\n", "\n", From 94718ab3a9deaf95d9458060b8516bcfb827feca Mon Sep 17 00:00:00 2001 From: Caleb Robinson Date: Tue, 1 Oct 2024 13:33:58 -0700 Subject: [PATCH 31/31] Update docs/tutorials/custom_segmentation_trainer.ipynb Co-authored-by: Adam J. Stewart --- docs/tutorials/custom_segmentation_trainer.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/tutorials/custom_segmentation_trainer.ipynb b/docs/tutorials/custom_segmentation_trainer.ipynb index b67a3ad4771..f944ebbeef8 100644 --- a/docs/tutorials/custom_segmentation_trainer.ipynb +++ b/docs/tutorials/custom_segmentation_trainer.ipynb @@ -214,7 +214,7 @@ "\n", "# You can use the following for actual training runs\n", "# from torchgeo.datamodules import LandCoverAIDataModule\n", - "# dm = LandCoverAIDataModule(root='data/', batch_size=64, num_workers=8, download=True)" + "# dm = LandCoverAIDataModule(root='data', batch_size=64, num_workers=8, download=True)" ] }, {
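Note on the scheduler described in the markdown cell touched by PATCH 30/31: that cell says `configure_optimizers` switches to `CosineAnnealingLR` and that `on_train_epoch_start` logs the learning rate so its decay is visible. As a rough illustration of the curve one would expect in those logs, here is a minimal standard-library sketch of the closed-form cosine annealing schedule. It is not code from this series: the function name `cosine_annealing_lr` and the values of `base_lr` and `eta_min` are assumptions for illustration, and only `tmax = 50` echoes a value shown in the notebook output. PyTorch's scheduler computes the rate recursively, so this closed form should match it only when nothing else modifies the learning rate.

```python
import math


def cosine_annealing_lr(base_lr: float, eta_min: float, t_max: int, epoch: int) -> float:
    """Closed-form cosine annealing: learning rate at a given epoch.

    Hypothetical helper for illustration; it mirrors the documented formula
    eta_t = eta_min + 0.5 * (base_lr - eta_min) * (1 + cos(pi * epoch / t_max)).
    """
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max))


# Illustrative values (assumptions); t_max=50 echoes the "tmax": 50 seen in the notebook output.
base_lr, eta_min, t_max = 1e-3, 1e-6, 50

# Learning rate at the start of each epoch, i.e. what per-epoch logging would record.
schedule = [cosine_annealing_lr(base_lr, eta_min, t_max, e) for e in range(t_max + 1)]

for epoch in (0, 25, 50):
    print(f"epoch {epoch:2d}: lr = {schedule[epoch]:.6g}")
```

Under these assumptions the rate starts at `base_lr`, passes the midpoint between `base_lr` and `eta_min` halfway through, and reaches `eta_min` at epoch `t_max`, which is the shape to look for in the logged values.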