DataModules: only instantiate when download requested #974

Merged
merged 2 commits into from
Dec 25, 2022
1 change: 1 addition & 0 deletions tests/conf/bigearthnet_all.yaml
@@ -12,5 +12,6 @@ experiment:
     root: "tests/data/bigearthnet"
     bands: "all"
     num_classes: ${experiment.module.num_classes}
+    download: true
     batch_size: 1
     num_workers: 0
1 change: 1 addition & 0 deletions tests/conf/bigearthnet_s1.yaml
@@ -12,5 +12,6 @@ experiment:
     root: "tests/data/bigearthnet"
     bands: "s1"
     num_classes: ${experiment.module.num_classes}
+    download: true
     batch_size: 1
     num_workers: 0
1 change: 1 addition & 0 deletions tests/conf/bigearthnet_s2.yaml
@@ -12,5 +12,6 @@ experiment:
     root: "tests/data/bigearthnet"
     bands: "s2"
     num_classes: ${experiment.module.num_classes}
+    download: true
     batch_size: 1
     num_workers: 0
1 change: 1 addition & 0 deletions tests/conf/byol.yaml
@@ -10,6 +10,7 @@ experiment:
     learning_rate_schedule_patience: 6
   datamodule:
     root: "tests/data/chesapeake/cvpr"
+    download: true
     train_splits:
       - "de-test"
     val_splits:
1 change: 1 addition & 0 deletions tests/conf/chesapeake_cvpr_5.yaml
@@ -13,6 +13,7 @@ experiment:
     ignore_index: null
   datamodule:
     root: "tests/data/chesapeake/cvpr"
+    download: true
     train_splits:
       - "de-test"
     val_splits:
1 change: 1 addition & 0 deletions tests/conf/chesapeake_cvpr_7.yaml
@@ -13,6 +13,7 @@ experiment:
     weights: imagenet
   datamodule:
     root: "tests/data/chesapeake/cvpr"
+    download: true
     train_splits:
       - "de-test"
     val_splits:
1 change: 1 addition & 0 deletions tests/conf/chesapeake_cvpr_prior.yaml
@@ -13,6 +13,7 @@ experiment:
     weights: imagenet
   datamodule:
     root: "tests/data/chesapeake/cvpr"
+    download: true
     train_splits:
       - "de-test"
     val_splits:
1 change: 1 addition & 0 deletions tests/conf/cowc_counting.yaml
@@ -10,6 +10,7 @@ experiment:
     pretrained: True
   datamodule:
     root: "tests/data/cowc_counting"
+    download: true
     seed: 0
     batch_size: 1
     num_workers: 0
3 changes: 2 additions & 1 deletion tests/conf/cyclone.yaml
@@ -4,12 +4,13 @@ experiment:
     model: "resnet18"
     weights: "random"
     num_outputs: 1
-    in_channels: 3
+    in_channels: 3
     learning_rate: 1e-3
     learning_rate_schedule_patience: 2
     pretrained: False
   datamodule:
     root: "tests/data/cyclone"
+    download: true
     seed: 0
     batch_size: 1
     num_workers: 0
1 change: 1 addition & 0 deletions tests/conf/etci2021.yaml
@@ -12,5 +12,6 @@ experiment:
     ignore_index: 0
   datamodule:
     root: "tests/data/etci2021"
+    download: true
     batch_size: 1
     num_workers: 0
1 change: 1 addition & 0 deletions tests/conf/eurosat.yaml
@@ -10,5 +10,6 @@ experiment:
     num_classes: 2
   datamodule:
     root: "tests/data/eurosat"
+    download: true
     batch_size: 1
     num_workers: 0
1 change: 1 addition & 0 deletions tests/conf/landcoverai.yaml
@@ -14,5 +14,6 @@ experiment:
     ignore_index: null
   datamodule:
     root: "tests/data/landcoverai"
+    download: true
     batch_size: 1
     num_workers: 0
1 change: 1 addition & 0 deletions tests/conf/naipchesapeake.yaml
@@ -14,6 +14,7 @@ experiment:
   datamodule:
     naip_root: "tests/data/naip"
     chesapeake_root: "tests/data/chesapeake/BAYWIDE"
+    chesapeake_download: true
     batch_size: 2
     num_workers: 0
     patch_size: 32
1 change: 1 addition & 0 deletions tests/conf/nasa_marine_debris.yaml
@@ -9,5 +9,6 @@ experiment:
     verbose: false
   datamodule:
     root: "tests/data/nasa_marine_debris"
+    download: true
     batch_size: 1
     num_workers: 0
1 change: 1 addition & 0 deletions tests/conf/oscd_all.yaml
@@ -14,6 +14,7 @@ experiment:
     ignore_index: null
   datamodule:
     root: "tests/data/oscd"
+    download: true
     train_batch_size: 1
     num_workers: 0
     val_split_pct: 0.5
1 change: 1 addition & 0 deletions tests/conf/oscd_rgb.yaml
@@ -14,6 +14,7 @@ experiment:
     ignore_index: null
   datamodule:
     root: "tests/data/oscd"
+    download: true
     train_batch_size: 1
     num_workers: 0
     val_split_pct: 0.5
1 change: 1 addition & 0 deletions tests/conf/resisc45.yaml
@@ -10,5 +10,6 @@ experiment:
     num_classes: 3
   datamodule:
     root: "tests/data/resisc45"
+    download: true
     batch_size: 1
     num_workers: 0
1 change: 1 addition & 0 deletions tests/conf/spacenet1.yaml
@@ -14,6 +14,7 @@ experiment:
     ignore_index: null
   datamodule:
     root: "tests/data/spacenet"
+    download: true
     batch_size: 1
     num_workers: 0
     val_split_pct: 0.33
1 change: 1 addition & 0 deletions tests/conf/ucmerced.yaml
@@ -10,5 +10,6 @@ experiment:
     num_classes: 2
   datamodule:
     root: "tests/data/ucmerced"
+    download: true
     batch_size: 1
     num_workers: 0
6 changes: 5 additions & 1 deletion tests/datamodules/test_loveda.py
@@ -19,7 +19,11 @@ def datamodule(self) -> LoveDADataModule:
         scene = ["rural", "urban"]

         dm = LoveDADataModule(
-            root=root, scene=scene, batch_size=batch_size, num_workers=num_workers
+            root=root,
+            scene=scene,
+            batch_size=batch_size,
+            num_workers=num_workers,
+            download=True,
         )

         dm.prepare_data()

Review comment (Collaborator, Author): This file will be deleted in #966; this change is only temporary.
2 changes: 1 addition & 1 deletion tests/datamodules/test_usavars.py
@@ -20,7 +20,7 @@ def datamodule(self, request: SubRequest) -> USAVarsDataModule:
         num_workers = 0

         dm = USAVarsDataModule(
-            root=root, batch_size=batch_size, num_workers=num_workers
+            root=root, batch_size=batch_size, num_workers=num_workers, download=True
         )
         dm.prepare_data()
         dm.setup()
3 changes: 2 additions & 1 deletion torchgeo/datamodules/bigearthnet.py
@@ -111,7 +111,8 @@ def prepare_data(self) -> None:

         This method is only called once per run.
         """
-        BigEarthNet(split="train", **self.kwargs)
+        if self.kwargs.get("download", False):
+            BigEarthNet(split="train", **self.kwargs)

     def setup(self, stage: Optional[str] = None) -> None:
         """Initialize the main ``Dataset`` objects.
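The guarded `prepare_data` above recurs across the datamodules changed in this PR. A minimal sketch of the pattern follows, using stand-ins rather than TorchGeo itself: `FakeDataset` and `GuardedDataModule` are hypothetical names, and the dataset only counts its constructions so the effect of the `download` flag can be observed.

```python
from typing import Any


class FakeDataset:
    """Hypothetical stand-in for a dataset whose constructor may download data."""

    instances = 0  # counts constructions so the guard's effect is observable

    def __init__(self, root: str = "data", download: bool = False) -> None:
        FakeDataset.instances += 1
        self.root = root
        self.download = download


class GuardedDataModule:
    """Hypothetical datamodule that mirrors the guarded prepare_data pattern."""

    def __init__(self, batch_size: int = 1, **kwargs: Any) -> None:
        self.batch_size = batch_size
        self.kwargs = kwargs  # forwarded unchanged to the dataset

    def prepare_data(self) -> None:
        # Only build the dataset here when a download was requested.
        if self.kwargs.get("download", False):
            FakeDataset(**self.kwargs)

    def setup(self) -> None:
        # The dataset used for training is always built in setup().
        self.dataset = FakeDataset(**self.kwargs)


# Without the flag, prepare_data() constructs nothing.
dm = GuardedDataModule(root="data")
dm.prepare_data()
constructed_without_download = FakeDataset.instances

# With download=True, prepare_data() constructs the dataset once.
dm = GuardedDataModule(root="data", download=True)
dm.prepare_data()
constructed_with_download = FakeDataset.instances
```

Using `.get("download", False)` keeps the guard safe when the caller never passes the key, which is why the test configs in this PR set `download: true` explicitly.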
3 changes: 2 additions & 1 deletion torchgeo/datamodules/chesapeake.py
@@ -226,7 +226,8 @@ def prepare_data(self) -> None:

         This method is called once per node, while :func:`setup` is called once per GPU.
         """
-        ChesapeakeCVPR(splits=self.train_splits, layers=self.layers, **self.kwargs)
+        if self.kwargs.get("download", False):
+            ChesapeakeCVPR(splits=self.train_splits, layers=self.layers, **self.kwargs)

     def setup(self, stage: Optional[str] = None) -> None:
         """Create the train/val/test splits based on the original Dataset objects.
3 changes: 2 additions & 1 deletion torchgeo/datamodules/cowc.py
@@ -59,7 +59,8 @@ def prepare_data(self) -> None:
         This includes optionally downloading the dataset. This is done once per node,
         while :func:`setup` is done once per GPU.
         """
-        COWCCounting(**self.kwargs)
+        if self.kwargs.get("download", False):
+            COWCCounting(**self.kwargs)

     def setup(self, stage: Optional[str] = None) -> None:
         """Create the train/val/test splits based on the original Dataset objects.
3 changes: 2 additions & 1 deletion torchgeo/datamodules/cyclone.py
@@ -71,7 +71,8 @@ def prepare_data(self) -> None:
         This includes optionally downloading the dataset. This is done once per node,
         while :func:`setup` is done once per GPU.
         """
-        TropicalCyclone(split="train", **self.kwargs)
+        if self.kwargs.get("download", False):
+            TropicalCyclone(split="train", **self.kwargs)

     def setup(self, stage: Optional[str] = None) -> None:
         """Create the train/val/test splits based on the original Dataset objects.
3 changes: 2 additions & 1 deletion torchgeo/datamodules/etci2021.py
@@ -79,7 +79,8 @@ def prepare_data(self) -> None:

         This method is only called once per run.
         """
-        ETCI2021(**self.kwargs)
+        if self.kwargs.get("download", False):
+            ETCI2021(**self.kwargs)

     def setup(self, stage: Optional[str] = None) -> None:
         """Initialize the main ``Dataset`` objects.
3 changes: 2 additions & 1 deletion torchgeo/datamodules/eurosat.py
@@ -94,7 +94,8 @@ def prepare_data(self) -> None:

         This method is only called once per run.
         """
-        EuroSAT(**self.kwargs)
+        if self.kwargs.get("download", False):
+            EuroSAT(**self.kwargs)

     def setup(self, stage: Optional[str] = None) -> None:
         """Initialize the main ``Dataset`` objects.
3 changes: 2 additions & 1 deletion torchgeo/datamodules/landcoverai.py
@@ -106,7 +106,8 @@ def prepare_data(self) -> None:

         This method is only called once per run.
         """
-        LandCoverAI(**self.kwargs)
+        if self.kwargs.get("download", False):
+            LandCoverAI(**self.kwargs)

     def setup(self, stage: Optional[str] = None) -> None:
         """Initialize the main ``Dataset`` objects.
3 changes: 2 additions & 1 deletion torchgeo/datamodules/loveda.py
@@ -59,7 +59,8 @@ def prepare_data(self) -> None:

         This method is only called once per run.
         """
-        LoveDA(**self.kwargs)
+        if self.kwargs.get("download", False):
+            LoveDA(**self.kwargs)

     def setup(self, stage: Optional[str] = None) -> None:
         """Initialize the main ``Dataset`` objects.
30 changes: 16 additions & 14 deletions torchgeo/datamodules/naip.py
@@ -31,8 +31,6 @@ class NAIPChesapeakeDataModule(pl.LightningDataModule):

     def __init__(
         self,
-        naip_root: str,
-        chesapeake_root: str,
         batch_size: int = 64,
         num_workers: int = 0,
         patch_size: int = 256,
@@ -41,22 +39,26 @@ def __init__(
         """Initialize a LightningDataModule for NAIP and Chesapeake based DataLoaders.

         Args:
-            naip_root: directory containing NAIP data
-            chesapeake_root: directory containing Chesapeake data
             batch_size: The batch size to use in all created DataLoaders
             num_workers: The number of workers to use in all created DataLoaders
             patch_size: size of patches to sample
             **kwargs: Additional keyword arguments passed to
-                :class:`~torchgeo.datasets.NAIP` and
+                :class:`~torchgeo.datasets.NAIP` (prefix keys with ``naip_``) and
                 :class:`~torchgeo.datasets.Chesapeake13`
+                (prefix keys with ``chesapeake_``)
         """
         super().__init__()
-        self.naip_root = naip_root
-        self.chesapeake_root = chesapeake_root
         self.batch_size = batch_size
         self.num_workers = num_workers
         self.patch_size = patch_size
-        self.kwargs = kwargs
+
+        self.naip_kwargs = {}
+        self.chesapeake_kwargs = {}
+        for key, val in kwargs.items():
+            if key.startswith("naip_"):
+                self.naip_kwargs[key[5:]] = val
+            elif key.startswith("chesapeake_"):
+                self.chesapeake_kwargs[key[11:]] = val

     def preprocess(self, sample: Dict[str, Any]) -> Dict[str, Any]:
         """Transform a single sample from the NAIP Dataset.
@@ -102,7 +104,8 @@ def prepare_data(self) -> None:

         This method is only called once per run.
         """
-        Chesapeake13(self.chesapeake_root, **self.kwargs)
+        if self.chesapeake_kwargs.get("download", False):
+            Chesapeake13(**self.chesapeake_kwargs)

     def setup(self, stage: Optional[str] = None) -> None:
         """Initialize the main ``Dataset`` objects.
@@ -119,14 +122,13 @@ def setup(self, stage: Optional[str] = None) -> None:
         chesapeak_transforms = Compose([self.chesapeake_transform, self.remove_bbox])

         self.chesapeake = Chesapeake13(
-            self.chesapeake_root, transforms=chesapeak_transforms, **self.kwargs
+            transforms=chesapeak_transforms, **self.chesapeake_kwargs
         )
         self.naip = NAIP(
-            self.naip_root,
-            self.chesapeake.crs,
-            self.chesapeake.res,
+            crs=self.chesapeake.crs,
+            res=self.chesapeake.res,
             transforms=naip_transforms,
-            **self.kwargs,
+            **self.naip_kwargs,
         )
         self.dataset = self.chesapeake & self.naip

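The prefix routing in `__init__` can be sketched as a standalone helper. This is an illustration under stated assumptions, not TorchGeo code: `split_prefixed_kwargs` is a hypothetical name, and it uses `len(prefix)` in place of the hard-coded offsets `5` and `11`. As in the diff, keys that match no prefix are ignored.

```python
from typing import Any, Dict, Tuple


def split_prefixed_kwargs(
    kwargs: Dict[str, Any], prefixes: Tuple[str, ...]
) -> Dict[str, Dict[str, Any]]:
    """Group kwargs by prefix and strip the prefix from each key.

    Hypothetical helper: keys that start with none of the prefixes are ignored.
    """
    groups: Dict[str, Dict[str, Any]] = {prefix: {} for prefix in prefixes}
    for key, val in kwargs.items():
        for prefix in prefixes:
            if key.startswith(prefix):
                # Slice by the prefix length instead of a magic number.
                groups[prefix][key[len(prefix):]] = val
                break
    return groups


# Keys mirror tests/conf/naipchesapeake.yaml; "cache" is a made-up unprefixed
# key that is included only to show that such keys are ignored.
groups = split_prefixed_kwargs(
    {
        "naip_root": "tests/data/naip",
        "chesapeake_root": "tests/data/chesapeake/BAYWIDE",
        "chesapeake_download": True,
        "cache": False,
    },
    ("naip_", "chesapeake_"),
)
naip_kwargs = groups["naip_"]
chesapeake_kwargs = groups["chesapeake_"]
```

Because only the Chesapeake group receives a `download` key here, the guard in `prepare_data` reads it from `chesapeake_kwargs` rather than from a shared dictionary.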
3 changes: 2 additions & 1 deletion torchgeo/datamodules/nasa_marine_debris.py
@@ -84,7 +84,8 @@ def prepare_data(self) -> None:

         This method is only called once per run.
         """
-        NASAMarineDebris(**self.kwargs)
+        if self.kwargs.get("download", False):
+            NASAMarineDebris(**self.kwargs)

     def setup(self, stage: Optional[str] = None) -> None:
         """Initialize the main ``Dataset`` objects.
3 changes: 2 additions & 1 deletion torchgeo/datamodules/oscd.py
@@ -117,7 +117,8 @@ def prepare_data(self) -> None:

         This method is only called once per run.
         """
-        OSCD(split="train", **self.kwargs)
+        if self.kwargs.get("download", False):
+            OSCD(split="train", **self.kwargs)

     def setup(self, stage: Optional[str] = None) -> None:
         """Initialize the main ``Dataset`` objects.
3 changes: 2 additions & 1 deletion torchgeo/datamodules/resisc45.py
@@ -107,7 +107,8 @@ def prepare_data(self) -> None:

         This method is only called once per run.
         """
-        RESISC45(**self.kwargs)
+        if self.kwargs.get("download", False):
+            RESISC45(**self.kwargs)

     def setup(self, stage: Optional[str] = None) -> None:
         """Initialize the main ``Dataset`` objects.
7 changes: 0 additions & 7 deletions torchgeo/datamodules/so2sat.py
@@ -103,13 +103,6 @@ def preprocess(self, sample: Dict[str, Any]) -> Dict[str, Any]:

         return sample

-    def prepare_data(self) -> None:
-        """Make sure that the dataset is downloaded.
-
-        This method is only called once per run.
-        """
-        So2Sat(**self.kwargs)
-
     def setup(self, stage: Optional[str] = None) -> None:
         """Initialize the main ``Dataset`` objects.

Review comment (Collaborator, Author) on lines -106 to -111: This dataset cannot be downloaded automatically, so there is no reason to implement prepare_data.
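Removing the override leaves the inherited hook in effect. The sketch below assumes that hook does nothing by default; `BaseDataModule` and `ManualDownloadDataModule` are hypothetical stand-ins, not Lightning or TorchGeo classes, and `data/so2sat` is a placeholder path.

```python
from typing import Any, List


class BaseDataModule:
    """Hypothetical base class whose prepare_data hook does nothing by default."""

    def prepare_data(self) -> None:
        pass


class ManualDownloadDataModule(BaseDataModule):
    """Hypothetical datamodule for data that must be placed on disk by hand."""

    def __init__(self, **kwargs: Any) -> None:
        self.kwargs = kwargs
        self.calls: List[str] = []  # records which hooks did real work

    def setup(self) -> None:
        # Dataset construction would happen only here.
        self.calls.append("setup")


dm = ManualDownloadDataModule(root="data/so2sat")
dm.prepare_data()  # inherited no-op: nothing is constructed or fetched
dm.setup()
```

Under that assumption, a trainer that calls `prepare_data()` before `setup()` still works, and the dataset is first touched in `setup()`.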
3 changes: 2 additions & 1 deletion torchgeo/datamodules/spacenet.py
@@ -123,7 +123,8 @@ def prepare_data(self) -> None:

         This method is only called once per run.
         """
-        SpaceNet1(**self.kwargs)
+        if self.kwargs.get("download", False):
+            SpaceNet1(**self.kwargs)

     def setup(self, stage: Optional[str] = None) -> None:
         """Initialize the main ``Dataset`` objects.
3 changes: 2 additions & 1 deletion torchgeo/datamodules/ucmerced.py
@@ -63,7 +63,8 @@ def prepare_data(self) -> None:

         This method is only called once per run.
         """
-        UCMerced(**self.kwargs)
+        if self.kwargs.get("download", False):
+            UCMerced(**self.kwargs)

     def setup(self, stage: Optional[str] = None) -> None:
         """Initialize the main ``Dataset`` objects.
3 changes: 2 additions & 1 deletion torchgeo/datamodules/usavars.py
@@ -54,7 +54,8 @@ def prepare_data(self) -> None:

         This method is only called once per run.
         """
-        USAVars(**self.kwargs)
+        if self.kwargs.get("download", False):
+            USAVars(**self.kwargs)

     def setup(self, stage: Optional[str] = None) -> None:
         """Initialize the main Dataset objects.