release and fix OpenSTL V0.2.0 (issue chengtan9907#20)
Lupin1998 committed Apr 20, 2023
1 parent 5cc979f commit 109b86a
Showing 39 changed files with 640 additions and 304 deletions.
8 changes: 4 additions & 4 deletions README.md
Original file line number Diff line number Diff line change
@@ -44,7 +44,7 @@ This is the journal version of our previous conference work ([SimVP: Simpler yet

## News and Updates

[2023-04-19] `OpenSTL` v0.2.0 is released.
[2023-04-19] `OpenSTL` v0.2.0 is released. The training loop and dataloaders are fixed.

## Installation

@@ -69,17 +69,17 @@ python setup.py develop
* torch
* timm
* tqdm
* xarray
* xarray==0.19.0
</details>

Please refer to [install.md](docs/en/install.md) for more detailed instructions.

## Getting Started

Please see [get_started.md](docs/en/get_started.md) for the basic usage. Here is an example of single GPU non-dist training SimVP+gSTA on Moving MNIST dataset.
Please see [get_started.md](docs/en/get_started.md) for the basic usage. Here is an example of single GPU non-distributed training SimVP+gSTA on Moving MNIST dataset.
```shell
bash tools/prepare_data/download_mmnist.sh
python tools/train.py -d mmnist --lr 1e-3 -c ./configs/mmnist/simvp/SimVP_gSTA.py --ex_name mmnist_simvp_gsta
python tools/train.py -d mmnist --lr 1e-3 -c configs/mmnist/simvp/SimVP_gSTA.py --ex_name mmnist_simvp_gsta
```

<p align="right">(<a href="#top">back to top</a>)</p>
15 changes: 15 additions & 0 deletions configs/weather/t2m_1_40625/SimVP_ConvMixer.py
@@ -0,0 +1,15 @@
method = 'SimVP'
# model
spatio_kernel_enc = 3
spatio_kernel_dec = 3
model_type = 'convmixer'
hid_S = 32
hid_T = 256
N_T = 8
N_S = 2
# training
lr = 1e-2
batch_size = 16
drop_path = 0.1
sched = 'cosine'
warmup_epoch = 0
15 changes: 15 additions & 0 deletions configs/weather/t2m_1_40625/SimVP_ConvNeXt.py
@@ -0,0 +1,15 @@
method = 'SimVP'
# model
spatio_kernel_enc = 3
spatio_kernel_dec = 3
model_type = 'convnext'
hid_S = 32
hid_T = 256
N_T = 8
N_S = 2
# training
lr = 1e-2
batch_size = 16
drop_path = 0.1
sched = 'cosine'
warmup_epoch = 0
15 changes: 15 additions & 0 deletions configs/weather/t2m_1_40625/SimVP_HorNet.py
@@ -0,0 +1,15 @@
method = 'SimVP'
# model
spatio_kernel_enc = 3
spatio_kernel_dec = 3
model_type = 'hornet'
hid_S = 32
hid_T = 256
N_T = 8
N_S = 2
# training
lr = 1e-3
batch_size = 16
drop_path = 0.1
sched = 'cosine'
warmup_epoch = 0
15 changes: 15 additions & 0 deletions configs/weather/t2m_1_40625/SimVP_IncepU.py
@@ -0,0 +1,15 @@
method = 'SimVP'
# model
spatio_kernel_enc = 3
spatio_kernel_dec = 3
model_type = 'IncepU' # SimVP.V1
hid_S = 32
hid_T = 256
N_T = 8
N_S = 2
# training
lr = 1e-2
batch_size = 16
drop_path = 0.1
sched = 'cosine'
warmup_epoch = 0
15 changes: 15 additions & 0 deletions configs/weather/t2m_1_40625/SimVP_MLPMixer.py
@@ -0,0 +1,15 @@
method = 'SimVP'
# model
spatio_kernel_enc = 3
spatio_kernel_dec = 3
model_type = 'mlp'
hid_S = 32
hid_T = 256
N_T = 8
N_S = 2
# training
lr = 1e-3
batch_size = 16
drop_path = 0.1
sched = 'cosine'
warmup_epoch = 0
15 changes: 15 additions & 0 deletions configs/weather/t2m_1_40625/SimVP_MogaNet.py
@@ -0,0 +1,15 @@
method = 'SimVP'
# model
spatio_kernel_enc = 3
spatio_kernel_dec = 3
model_type = 'moga'
hid_S = 32
hid_T = 256
N_T = 8
N_S = 2
# training
lr = 5e-3
batch_size = 16
drop_path = 0.2
sched = 'cosine'
warmup_epoch = 0
15 changes: 15 additions & 0 deletions configs/weather/t2m_1_40625/SimVP_Poolformer.py
@@ -0,0 +1,15 @@
method = 'SimVP'
# model
spatio_kernel_enc = 3
spatio_kernel_dec = 3
model_type = 'poolformer'
hid_S = 32
hid_T = 256
N_T = 8
N_S = 2
# training
lr = 5e-4
batch_size = 16
drop_path = 0.1
sched = 'cosine'
warmup_epoch = 0
15 changes: 15 additions & 0 deletions configs/weather/t2m_1_40625/SimVP_Swin.py
@@ -0,0 +1,15 @@
method = 'SimVP'
# model
spatio_kernel_enc = 3
spatio_kernel_dec = 3
model_type = 'swin'
hid_S = 32
hid_T = 256
N_T = 8
N_S = 2
# training
lr = 1e-3
batch_size = 16
drop_path = 0.1
sched = 'cosine'
warmup_epoch = 0
15 changes: 15 additions & 0 deletions configs/weather/t2m_1_40625/SimVP_Uniformer.py
@@ -0,0 +1,15 @@
method = 'SimVP'
# model
spatio_kernel_enc = 3
spatio_kernel_dec = 3
model_type = 'uniformer'
hid_S = 32
hid_T = 256
N_T = 8
N_S = 2
# training
lr = 5e-3
batch_size = 16
drop_path = 0.1
sched = 'cosine'
warmup_epoch = 0
15 changes: 15 additions & 0 deletions configs/weather/t2m_1_40625/SimVP_VAN.py
@@ -0,0 +1,15 @@
method = 'SimVP'
# model
spatio_kernel_enc = 3
spatio_kernel_dec = 3
model_type = 'van'
hid_S = 32
hid_T = 256
N_T = 8
N_S = 2
# training
lr = 5e-3
batch_size = 16
drop_path = 0.1
sched = 'cosine'
warmup_epoch = 0
15 changes: 15 additions & 0 deletions configs/weather/t2m_1_40625/SimVP_ViT.py
@@ -0,0 +1,15 @@
method = 'SimVP'
# model
spatio_kernel_enc = 3
spatio_kernel_dec = 3
model_type = 'vit'
hid_S = 32
hid_T = 256
N_T = 8
N_S = 2
# training
lr = 1e-3
batch_size = 16
drop_path = 0.1
sched = 'cosine'
warmup_epoch = 0
14 changes: 14 additions & 0 deletions configs/weather/t2m_1_40625/SimVP_gSTA.py
@@ -0,0 +1,14 @@
method = 'SimVP'
# model
spatio_kernel_enc = 3
spatio_kernel_dec = 3
model_type = 'gSTA'
hid_S = 32
hid_T = 256
N_T = 8
N_S = 2
# training
lr = 5e-3
batch_size = 16
drop_path = 0.1
warmup_epoch = 0
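
The config files added above are flat Python modules made of plain variable assignments. As a minimal sketch of how such a file could be read into a dictionary, the snippet below uses the standard-library `runpy` module. The helper name `load_config` and the temporary demo file are assumptions for illustration only; OpenSTL's own config loader may work differently.

```python
import runpy
import tempfile
from pathlib import Path


def load_config(path):
    """Execute a flat Python config file and return its public variables as a dict.

    Hypothetical helper for illustration; not taken from the OpenSTL code base.
    """
    namespace = runpy.run_path(str(path))
    # run_path also returns entries such as __name__ and __builtins__; drop them.
    return {k: v for k, v in namespace.items() if not k.startswith("_")}


# Demo content shaped like the weather configs in this commit.
example = """
method = 'SimVP'
model_type = 'gSTA'
hid_S = 32
hid_T = 256
lr = 5e-3
batch_size = 16
"""

with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / "SimVP_gSTA.py"
    cfg_path.write_text(example)
    config = load_config(cfg_path)

print(config["model_type"], config["lr"])
```

Because the file is executed as ordinary Python, this approach should only be used with config files you trust.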
25 changes: 24 additions & 1 deletion docs/en/changelog.md
@@ -1,5 +1,28 @@
## Changelog

### v0.2.0 (21/04/2023)

Release version to OpenSTL V0.2.0 as [#20](https://github.com/chengtan9907/OpenSTL/issues/20).

#### Code Refactoring

* Rename the project to `OpenSTL` instead of `SimVPv2` with module name refactoring.
* Refactor the code structure thoroughly to support non-distributed and distributed (DDP) training & testing with `tools/train.py` and `tools/test.py`.

#### New Features

* Update the Weather Bench dataloader with `5.625deg`, `2.8125deg`, and `1.40625deg` settings.

#### Update Documents

* Update documents of video prediction and weather prediction benchmarks. Provide config files for supported mixup methods.
* Update `docs/en` documents for the basic usages and new features of V0.2.0.

#### Fix Bugs

* Fix bugs in training loops and validation loops to save GPU memory.
* ConvLSTM and CrevNet might not use all parameters when calculating losses; use `--find_unused_parameters` for DDP training with these methods.

### v0.1.0 (18/02/2023)

Release version to V0.1.0 with code refactoring.
@@ -15,7 +38,7 @@ Release version to V0.1.0 with code refactoring.
* Update popular Metaformer models as the hidden Translator $h$ in SimVP, supporting [ViT](https://arxiv.org/abs/2010.11929), [Swin-Transformer](https://arxiv.org/abs/2103.14030), [MLP-Mixer](https://arxiv.org/abs/2105.01601), [ConvMixer](https://arxiv.org/abs/2201.09792), [UniFormer](https://arxiv.org/abs/2201.09450), [PoolFormer](https://arxiv.org/abs/2111.11418), [ConvNeXt](https://arxiv.org/abs/2201.03545), [VAN](https://arxiv.org/abs/2202.09741), [HorNet](https://arxiv.org/abs/2207.14284), and [MogaNet](https://arxiv.org/abs/2211.03295).
* Update implementations of dataset and dataloader, supporting [KTH Action](https://ieeexplore.ieee.org/document/1334462), [KittiCaltech Pedestrian](https://dl.acm.org/doi/10.1177/0278364913491297), [Moving MNIST](http://arxiv.org/abs/1502.04681), [TaxiBJ](https://arxiv.org/abs/1610.00081), and [WeatherBench](https://arxiv.org/abs/2002.00469).

### Update Documents
#### Update Documents

* Upload `readthedocs` documents. Summarize video prediction benchmark results on MMNIST in [video_benchmarks.md](https://github.com/chengtan9907/SimVPv2/docs/en/model_zoos/video_benchmarks.md).
* Update benchmark results of video prediction baselines and MetaFormer architectures based on SimVP on MMNIST, TaxiBJ, and WeatherBench datasets.
53 changes: 48 additions & 5 deletions docs/en/get_started.md
@@ -1,38 +1,81 @@
# Getting Started

This page provides basic tutorials about the usage of SimVP. For installation instructions, please see [Install](docs/en/install.md).
This page provides basic tutorials about the usage of OpenSTL with various spatio-temporal predictive learning (STL) tasks. For installation instructions, please see [Install](docs/en/install.md).

## Training and Testing with a Single GPU

You can perform single/multiple GPU training and testing with `tools/train.py` and `tools/test.py`. We provide descriptions of some essential arguments.
You can perform single GPU training and testing with `tools/train.py` and `tools/test.py` in either non-distributed or distributed (DDP) mode. Non-distributed mode is recommended for single GPU training (it is a bit faster than DDP). We provide descriptions of some essential arguments. Other arguments related to datasets, optimizers, and methods can be found in [parser.py](https://github.com/chengtan9907/OpenSTL/tree/master/openstl/utils/parser.py).

```bash
python tools/train.py \
--dataname ${DATASET_NAME} \
--method ${METHOD_NAME} \
--config_file ${CONFIG_FILE} \
--ex_name ${EXP_NAME} \
--resume_from ${CHECKPOINT_FILE} \
--auto_resume \
--batch_size ${BATCH_SIZE} \
--lr ${LEARNING_RATE} \
--dist \
--fp16 \
--seed ${SEED} \
--clip_grad ${VALUE} \
--find_unused_parameters \
--deterministic \
```

**Description of arguments**:
- `--dataname (-d)` : The name of dataset, default to be `mmnist`.
- `--method (-m)` : The name of the video prediction method to train or test, default to be `SimVP`.
- `--config_file (-c)` : The path of a model config file, which will provide detailed settings for a video prediction method.
- `--config_file (-c)` : The path of a model config file, which will provide detailed settings for a STL method.
- `--ex_name` : The name of the experiment under the `res_dir`. Default to be `Debug`.
- `--resume_from ${CHECKPOINT_FILE}`: Resume from a previous checkpoint file. Or you can use `--auto_resume` to resume from `latest.pth` automatically.
- `--auto_resume` : Whether to automatically resume training when the experiment was interrupted.
- `--batch_size (-b)` : Training batch size, default to 16.
- `--lr` : The basic training learning rate, defaults to 0.001.
- `--dist`: Whether to use distributed training (DDP).
- `--fp16`: Whether to use Native AMP for mixed precision training (PyTorch>=1.6.0).
- `--seed ${SEED}`: Set all random seeds to the given number (defaults to 42).
- `--clip_grad ${VALUE}`: Clip gradient norm value (default: None, no clipping).
- `--find_unused_parameters`: Whether to find unused parameters in forward during DDP training.
- `--deterministic`: Switch on "deterministic" mode, which slows down training while the results are reproducible.
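
The arguments described above can be mirrored with a small `argparse` sketch. This is only an illustration built from the flags and defaults listed on this page; the actual definitions live in `openstl/utils/parser.py` and may differ in types, help strings, and additional options. The function name `build_parser` is an assumption.

```python
import argparse


def build_parser():
    """Sketch of a parser covering only the arguments documented above."""
    parser = argparse.ArgumentParser(description="Sketch of the documented training arguments")
    parser.add_argument("--dataname", "-d", default="mmnist", type=str)
    parser.add_argument("--method", "-m", default="SimVP", type=str)
    parser.add_argument("--config_file", "-c", default=None, type=str)
    parser.add_argument("--ex_name", default="Debug", type=str)
    parser.add_argument("--resume_from", default=None, type=str)
    parser.add_argument("--auto_resume", action="store_true")
    parser.add_argument("--batch_size", "-b", default=16, type=int)
    parser.add_argument("--lr", default=1e-3, type=float)
    parser.add_argument("--dist", action="store_true")
    parser.add_argument("--fp16", action="store_true")
    parser.add_argument("--seed", default=42, type=int)
    # None means no gradient clipping, as documented above.
    parser.add_argument("--clip_grad", default=None, type=float)
    parser.add_argument("--find_unused_parameters", action="store_true")
    parser.add_argument("--deterministic", action="store_true")
    return parser


# Parse the same arguments as the single GPU training example on this page.
args = build_parser().parse_args(
    ["-d", "mmnist", "--lr", "1e-3",
     "-c", "configs/mmnist/simvp/SimVP_gSTA.py",
     "--ex_name", "mmnist_simvp_gsta"]
)
print(args.dataname, args.lr, args.dist)
```

Boolean switches such as `--dist` and `--fp16` are modeled as `store_true` flags, so they are off unless passed explicitly.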

An example of single GPU training with SimVP+gSTA on Moving MNIST dataset.
An example of single GPU (non-distributed) training with SimVP+gSTA on Moving MNIST dataset.
```shell
bash tools/prepare_data/download_mmnist.sh
python tools/train.py -d mmnist --lr 1e-3 -c ./configs/mmnist/simvp/SimVP_gSTA.py --ex_name mmnist_simvp_gsta
python tools/train.py -d mmnist --lr 1e-3 -c configs/mmnist/simvp/SimVP_gSTA.py --ex_name mmnist_simvp_gsta
```

An example of single GPU testing with SimVP+gSTA on Moving MNIST dataset.
```shell
python tools/test.py -d mmnist -c configs/mmnist/simvp/SimVP_gSTA.py --ex_name mmnist_simvp_gsta
```

## Training and Testing with Multiple GPUs

For larger STL tasks (e.g., high resolutions), you can also perform multi-GPU training and testing in DDP mode with `tools/dist_train.sh` and `tools/dist_test.sh`. These bash scripts call `tools/train.py` and `tools/test.py` with the necessary arguments.

```shell
bash tools/dist_train.sh ${CONFIG_FILE} ${GPUS} [optional arguments]
```
**Description of arguments**:
- `${CONFIG_FILE}` : The path of a model config file, which will provide detailed settings for an STL method.
- `${GPUS}` : The number of GPUs for DDP training.

Examples of multi-GPU training on the Moving MNIST dataset on a machine with 8 GPUs.
```shell
PORT=29001 CUDA_VISIBLE_DEVICES=0,1 bash tools/dist_train.sh configs/mmnist/simvp/SimVP_gSTA.py 2 -d mmnist --lr 1e-3 --batch_size 8
PORT=29002 CUDA_VISIBLE_DEVICES=2,3 bash tools/dist_train.sh configs/mmnist/PredRNN.py 2 -d mmnist --lr 1e-3 --batch_size 8
PORT=29003 CUDA_VISIBLE_DEVICES=4,5,6,7 bash tools/dist_train.sh configs/mmnist/PredRNNpp.py 4 -d mmnist --lr 1e-3 --batch_size 4
```

An example of multi-GPU testing on the Moving MNIST dataset. The bash script is `bash tools/dist_test.sh ${CONFIG_FILE} ${GPUS} ${CHECKPOINT} [optional arguments]`.
```shell
PORT=29001 CUDA_VISIBLE_DEVICES=0,1 bash tools/dist_test.sh configs/mmnist/simvp/SimVP_gSTA.py 2 work_dirs/mmnist/simvp/SimVP_gSTA -d mmnist
```

**Note**: During DDP training, the number of GPUs `ngpus` must be provided. Checkpoints and logs are saved under `work_dirs/` in the same folder structure as the config file (this is the default when `--ex_name` is not specified). The default learning rate `lr` and batch size `bs` in the config files are for a single GPU. With a different number of GPUs, the total batch size changes in proportion, so scale the learning rate accordingly: `lr = base_lr * ngpus` and `bs = base_bs * ngpus`. Other arguments are passed in the same way as for single GPU training.
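
The scaling rule in the note above can be sketched as a small helper. The function name `scale_for_ddp` is hypothetical and not part of OpenSTL; it only restates `lr = base_lr * ngpus` and `bs = base_bs * ngpus` from the note.

```python
def scale_for_ddp(base_lr, base_bs, ngpus):
    """Return (scaled learning rate, total batch size) for DDP training.

    base_lr and base_bs are the single GPU values from a config file;
    both are multiplied by the number of GPUs, as described in the note.
    """
    if ngpus < 1:
        raise ValueError("ngpus must be a positive integer")
    return base_lr * ngpus, base_bs * ngpus


# Example: single GPU defaults of lr=1e-3 and batch size 16, moved to 4 GPUs.
lr, total_bs = scale_for_ddp(base_lr=1e-3, base_bs=16, ngpus=4)
print(lr, total_bs)
```

Whether linear scaling works well for a given method and dataset is an empirical question, so treat the scaled values as a starting point.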

## Mixed Precision Training

We support Mixed Precision Training implemented by PyTorch AMP. If you want to use Mixed Precision Training, you can add `--fp16` in the arguments.
17 changes: 15 additions & 2 deletions docs/en/install.md
@@ -11,6 +11,16 @@ conda activate OpenSTL
python setup.py develop # or `pip install -e .`
```

<details close>
<summary>Requirements</summary>
* Linux (Windows is not officially supported)
* Python 3.7+
* PyTorch 1.8 or higher
* CUDA 10.1 or higher
* NCCL 2
* GCC 4.9 or higher
</details>

<details close>
<summary>Dependencies</summary>

@@ -23,12 +33,12 @@ python setup.py develop # or `pip install -e .`
* torch
* timm
* tqdm
* xarray
* xarray==0.19.0
</details>

**Note:**

1. Some errors might occur with `hickle` and `xarray` when using KittiCaltech and WeatherBench datasets. As for KittiCaltech, you can solve the issues by installing additional packages according to the output message. As for WeatherBench, you can install the latest version of `xarray` to solve the errors, i.e., `pip install git+https://github.com/pydata/xarray/@v2022.03.0` and then installing required packages according to error messages.
1. Some errors might occur with `hickle` and `xarray` when using KittiCaltech and WeatherBench datasets. As for KittiCaltech, you can solve the issues by installing additional packages according to the output message. As for WeatherBench, you can install the pinned version of `xarray` to solve the errors, i.e., `pip install xarray==0.19.0`, and then install any further required packages according to the error messages.

2. Following the above instructions, OpenSTL is installed in `dev` mode, so any local modifications made to the code take effect without reinstallation. Alternatively, you can install it with `pip install .` to use it as a regular package, in which case you must reinstall it for local modifications to take effect.

@@ -61,4 +71,7 @@ OpenSTL
|── weather
| ├── 2m_temperature
| ├── ...
|── weather_1_40625deg
| ├── 2m_temperature
| ├── ...
```