update docs and mmnist benchmarks
Lupin1998 committed Feb 20, 2023
1 parent b24b99d commit 1f3d9ba
Showing 29 changed files with 330 additions and 22 deletions.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -135,3 +135,4 @@ figs

# temp
configs/kitticaltech/simvp
configs/kth/simvp
9 changes: 9 additions & 0 deletions .readthedocs.yml
@@ -0,0 +1,9 @@
version: 2

formats: []

python:
  version: 3.7
  install:
    - requirements: requirements/docs.txt
    - requirements: requirements/readthedocs.txt
4 changes: 4 additions & 0 deletions .style.yapf
@@ -0,0 +1,4 @@
[style]
BASED_ON_STYLE = pep8
BLANK_LINE_BEFORE_NESTED_CLASS_OR_DEF = true
SPLIT_BEFORE_EXPRESSION_AFTER_OPENING_PAREN = true
13 changes: 9 additions & 4 deletions README.md
@@ -7,6 +7,10 @@
<img src="https://img.shields.io/badge/license-Apache--2.0-%23B7A800" /></a>
</p>

[🛠️Installation](docs/en/install.md) |
[🚀Model Zoo](docs/en/model_zoos/video_benchmarks.md) |
[🆕News](docs/en/changelog.md)

This repository is an open-source project for video prediction benchmarks, which contains the implementation code for paper:

**SimVP: Towards Simple yet Powerful Spatiotemporal Predictive learning**
@@ -125,10 +129,11 @@ We support various video prediction methods and will provide benchmarks on vario
<details open>
<summary>Currently supported datasets</summary>

- [x] [KTH Action](https://ieeexplore.ieee.org/document/1334462) (ICPR'2004) [[download](https://www.csc.kth.se/cvap/actions/)]
- [x] [KittiCaltech Pedestrian](https://dl.acm.org/doi/10.1177/0278364913491297) (IJRR'2013) [[download](https://figshare.com/articles/dataset/KITTI_hkl_files/7985684)]
- [x] [Moving MNIST](http://arxiv.org/abs/1502.04681) (ICML'2015) [[download](http://www.cs.toronto.edu/~nitish/unsupervised_video/)]
- [x] [TaxiBJ](https://arxiv.org/abs/1610.00081) (AAAI'2017) [[download](https://github.com/TolicWang/DeepST/tree/master/data/TaxiBJ)]
- [x] [KTH Action](https://ieeexplore.ieee.org/document/1334462) (ICPR'2004) [[download](https://www.csc.kth.se/cvap/actions/)] [[config](https://github.com/chengtan9907/SimVPv2/configs/kth)]
- [x] [KittiCaltech Pedestrian](https://dl.acm.org/doi/10.1177/0278364913491297) (IJRR'2013) [[download](https://figshare.com/articles/dataset/KITTI_hkl_files/7985684)] [[config](https://github.com/chengtan9907/SimVPv2/configs/kitticaltech)]
- [x] [Moving MNIST](http://arxiv.org/abs/1502.04681) (ICML'2015) [[download](http://www.cs.toronto.edu/~nitish/unsupervised_video/)] [[config](https://github.com/chengtan9907/SimVPv2/configs/mmnist)]
- [x] [TaxiBJ](https://arxiv.org/abs/1610.00081) (AAAI'2017) [[download](https://github.com/TolicWang/DeepST/tree/master/data/TaxiBJ)] [[config](https://github.com/chengtan9907/SimVPv2/configs/taxibj)]
- [x] [WeatherBench](https://arxiv.org/abs/2002.00469) (ArXiv'2020) [[download](https://github.com/pangeo-data/WeatherBench)] [[config](https://github.com/chengtan9907/SimVPv2/configs/weather)]

</details>

11 changes: 11 additions & 0 deletions configs/mmnist/simvp/SimVP_ConvMixer.py
@@ -0,0 +1,11 @@
method = 'SimVP'
spatio_kernel_enc = 3
spatio_kernel_dec = 3
model_type = 'convmixer'
hid_S = 64
hid_T = 512
N_T = 8
N_S = 4
lr = 1e-2
batch_size = 16
drop_path = 0
2 changes: 2 additions & 0 deletions configs/mmnist/simvp/SimVP_ConvNeXt.py
@@ -6,4 +6,6 @@
hid_T = 512
N_T = 8
N_S = 4
lr = 1e-2
batch_size = 16
drop_path = 0
2 changes: 2 additions & 0 deletions configs/mmnist/simvp/SimVP_HorNet.py
@@ -6,4 +6,6 @@
hid_T = 512
N_T = 8
N_S = 4
lr = 1e-3
batch_size = 16
drop_path = 0
11 changes: 11 additions & 0 deletions configs/mmnist/simvp/SimVP_MLPMixer.py
@@ -0,0 +1,11 @@
method = 'SimVP'
spatio_kernel_enc = 3
spatio_kernel_dec = 3
model_type = 'mlp'
hid_S = 64
hid_T = 512
N_T = 8
N_S = 4
lr = 1e-3
batch_size = 16
drop_path = 0
2 changes: 2 additions & 0 deletions configs/mmnist/simvp/SimVP_MogaNet.py
@@ -6,4 +6,6 @@
hid_T = 512
N_T = 8
N_S = 4
lr = 1e-3
batch_size = 16
drop_path = 0
2 changes: 2 additions & 0 deletions configs/mmnist/simvp/SimVP_Poolformer.py
@@ -6,4 +6,6 @@
hid_T = 512
N_T = 8
N_S = 4
lr = 1e-3
batch_size = 16
drop_path = 0
2 changes: 2 additions & 0 deletions configs/mmnist/simvp/SimVP_Swin.py
@@ -6,4 +6,6 @@
hid_T = 512
N_T = 8
N_S = 4
lr = 1e-3
batch_size = 16
drop_path = 0
2 changes: 2 additions & 0 deletions configs/mmnist/simvp/SimVP_Uniformer.py
@@ -6,4 +6,6 @@
hid_T = 512
N_T = 8
N_S = 4
lr = 5e-4
batch_size = 16
drop_path = 0
11 changes: 11 additions & 0 deletions configs/mmnist/simvp/SimVP_VAN.py
@@ -0,0 +1,11 @@
method = 'SimVP'
spatio_kernel_enc = 3
spatio_kernel_dec = 3
model_type = 'van'
hid_S = 64
hid_T = 512
N_T = 8
N_S = 4
lr = 1e-3
batch_size = 16
drop_path = 0
2 changes: 2 additions & 0 deletions configs/mmnist/simvp/SimVP_ViT.py
@@ -6,4 +6,6 @@
hid_T = 512
N_T = 8
N_S = 4
lr = 1e-3
batch_size = 16
drop_path = 0
8 changes: 8 additions & 0 deletions configs/taxibj/SimVP.py
@@ -0,0 +1,8 @@
method = 'SimVP'
spatio_kernel_enc = 3
spatio_kernel_dec = 3
# model_type = None
hid_S = 64
hid_T = 512
N_T = 8
N_S = 4
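
The config files above are plain Python modules made of top-level assignments. As a minimal sketch, and not necessarily how the repository's own tooling reads them, such a file can be loaded into a dictionary with the standard library's `runpy.run_path`. The helper name `load_config` and the temporary file are illustrative assumptions; the config text is copied from `configs/taxibj/SimVP.py` in this commit.

```python
import os
import runpy
import tempfile

# Contents copied from configs/taxibj/SimVP.py in this commit.
CONFIG_TEXT = """\
method = 'SimVP'
spatio_kernel_enc = 3
spatio_kernel_dec = 3
# model_type = None
hid_S = 64
hid_T = 512
N_T = 8
N_S = 4
"""


def load_config(path):
    """Hypothetical helper: execute a config file and keep its public names."""
    namespace = runpy.run_path(path)
    # run_path also returns dunder entries such as __name__; drop them.
    return {k: v for k, v in namespace.items() if not k.startswith("_")}


# Write the config to a temporary file so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    cfg_path = os.path.join(tmp, "SimVP.py")
    with open(cfg_path, "w") as f:
        f.write(CONFIG_TEXT)
    config = load_config(cfg_path)

# model_type is commented out in the file, so .get() falls back to None.
print(config["method"], config["hid_S"], config.get("model_type"))
```

Because the file is executed as ordinary Python, a commented-out key such as `model_type` is simply absent from the resulting dictionary.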
20 changes: 20 additions & 0 deletions docs/en/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
20 changes: 20 additions & 0 deletions docs/en/changelog.md
@@ -0,0 +1,20 @@
## Changelog

### v0.1.0 (18/02/2023)

Release v0.1.0 with code refactoring.

#### Code Refactoring

* Refactor the code structure into `simvp/api`, `simvp/core`, `simvp/datasets`, `simvp/methods`, `simvp/models`, and `simvp/modules`. Non-distributed training and evaluation are supported through the executable Python script `tools/non_dist_train.py`. Refactor config files for SimVP models.
* Fix bugs in `tools/non_dist_train.py`, `simvp/utils`, `environment.yml`, `.gitignore`, etc.

#### New Features

* Support Timm optimizers and schedulers.
* Update popular MetaFormer models as the hidden Translator $h$ in SimVP, supporting [ViT](https://arxiv.org/abs/2010.11929), [Swin-Transformer](https://arxiv.org/abs/2103.14030), [MLP-Mixer](https://arxiv.org/abs/2105.01601), [ConvMixer](https://arxiv.org/abs/2201.09792), [UniFormer](https://arxiv.org/abs/2201.09450), [PoolFormer](https://arxiv.org/abs/2111.11418), [ConvNeXt](https://arxiv.org/abs/2201.03545), [VAN](https://arxiv.org/abs/2202.09741), [HorNet](https://arxiv.org/abs/2207.14284), and [MogaNet](https://arxiv.org/abs/2211.03295).
* Update implementations of dataset and dataloader, supporting [KTH Action](https://ieeexplore.ieee.org/document/1334462), [KittiCaltech Pedestrian](https://dl.acm.org/doi/10.1177/0278364913491297), [Moving MNIST](http://arxiv.org/abs/1502.04681), [TaxiBJ](https://arxiv.org/abs/1610.00081), and [WeatherBench](https://arxiv.org/abs/2002.00469).

#### Update Documents

* Upload `readthedocs` documents. Summarize video prediction benchmark results on MMNIST in [video_benchmarks.md](https://github.com/chengtan9907/SimVPv2/docs/en/model_zoos/video_benchmarks.md).
9 changes: 9 additions & 0 deletions docs/en/get_started.md
@@ -0,0 +1,9 @@
# Getting Started

This page provides basic tutorials on using SimVP. For installation instructions, please see [Install](docs/en/install.md).

An example of training SimVP+gSTA on the Moving MNIST dataset with a single GPU:
```shell
bash tools/prepare_data/download_mmnist.sh
python tools/non_dist_train.py -d mmnist -m SimVP --model_type gsta --lr 1e-3 --ex_name mmnist_simvp_gsta
```
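
When several runs are launched with different settings, the command above can be assembled programmatically. The following is a minimal sketch that only uses the flags shown in the example (`-d`, `-m`, `--model_type`, `--lr`, `--ex_name`); the helper name `build_train_command` is a hypothetical one introduced for illustration, and the sketch prints the command rather than executing it.

```python
import shlex


def build_train_command(dataset, method, model_type, lr, ex_name):
    """Hypothetical helper: assemble arguments for tools/non_dist_train.py."""
    return [
        "python", "tools/non_dist_train.py",
        "-d", dataset,
        "-m", method,
        "--model_type", model_type,
        "--lr", lr,  # kept as a string so it is passed through unchanged
        "--ex_name", ex_name,
    ]


cmd = build_train_command("mmnist", "SimVP", "gsta", "1e-3", "mmnist_simvp_gsta")

# shlex.join (Python 3.8+) renders the list as a shell-safe command line.
print(shlex.join(cmd))
```

Keeping the command as a list makes it straightforward to pass to `subprocess.run` later without any shell quoting concerns.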
38 changes: 38 additions & 0 deletions docs/en/index.rst
@@ -0,0 +1,38 @@
.. SimVP documentation master file, created by
   sphinx-quickstart on Thu June 15 05:11:34 2022.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to SimVP's documentation!
=====================================

.. toctree::
   :maxdepth: 1
   :caption: Getting Started

   install.md
   get_started.md

.. toctree::
   :maxdepth: 1
   :caption: Model Zoos

   model_zoos/video_benchmarks.md

.. toctree::
   :maxdepth: 1
   :caption: Notes

   changelog.md

.. toctree::
   :caption: Switch Language

   switch_language.md


Indices and tables
==================

* :ref:`genindex`
* :ref:`search`
22 changes: 22 additions & 0 deletions docs/en/install.md
@@ -0,0 +1,22 @@
# Installation

This project provides a conda environment file; users can reproduce the environment with the following commands:
```shell
git clone https://github.com/chengtan9907/SimVPv2
cd SimVPv2
conda env create -f environment.yml
conda activate SimVP
python setup.py develop
```

<details close>
<summary>Dependencies</summary>

* argparse
* numpy
* hickle
* scikit-image=0.16.2
* torch
* timm
* tqdm
</details>
90 changes: 90 additions & 0 deletions docs/en/model_zoos/video_benchmarks.md
@@ -0,0 +1,90 @@
# Video Prediction Benchmarks

**We provide benchmark results of video prediction methods on video datasets. More video prediction methods will be supported in the future. Issues and PRs are welcome!**

<details open>
<summary>Currently supported video prediction methods</summary>

- [x] [ConvLSTM](https://arxiv.org/abs/1506.04214) (NIPS'2015)
- [x] [PredRNN](https://dl.acm.org/doi/abs/10.5555/3294771.3294855) (NIPS'2017)
- [x] [PredRNN++](https://arxiv.org/abs/1804.06300) (ICML'2018)
- [x] [E3D-LSTM](https://openreview.net/forum?id=B1lKS2AqtX) (ICLR'2019)
- [x] [MIM](https://arxiv.org/abs/1811.07490) (CVPR'2019)
- [x] [CrevNet](https://openreview.net/forum?id=B1lKS2AqtX) (ICLR'2020)
- [x] [PhyDNet](https://arxiv.org/abs/2003.01460) (CVPR'2020)
- [x] [PredRNN.V2](https://arxiv.org/abs/2103.09504v4) (TPAMI'2022)
- [x] [SimVP](https://arxiv.org/abs/2206.05099) (CVPR'2022)
- [x] [SimVP.V2](https://arxiv.org/abs/2211.12509) (ArXiv'2022)

</details>

<details open>
<summary>Currently supported MetaFormer models for SimVP</summary>

- [x] [ViT](https://arxiv.org/abs/2010.11929) (ICLR'2021)
- [x] [Swin-Transformer](https://arxiv.org/abs/2103.14030) (ICCV'2021)
- [x] [MLP-Mixer](https://arxiv.org/abs/2105.01601) (NIPS'2021)
- [x] [ConvMixer](https://arxiv.org/abs/2201.09792) (Openreview'2021)
- [x] [UniFormer](https://arxiv.org/abs/2201.09450) (ICLR'2022)
- [x] [PoolFormer](https://arxiv.org/abs/2111.11418) (CVPR'2022)
- [x] [ConvNeXt](https://arxiv.org/abs/2201.03545) (CVPR'2022)
- [x] [VAN](https://arxiv.org/abs/2202.09741) (ArXiv'2022)
- [x] [IncepU (SimVP.V1)](https://arxiv.org/abs/2206.05099) (CVPR'2022)
- [x] [gSTA (SimVP.V2)](https://arxiv.org/abs/2211.12509) (ArXiv'2022)
- [x] [HorNet](https://arxiv.org/abs/2207.14284) (NIPS'2022)
- [x] [MogaNet](https://arxiv.org/abs/2211.03295) (ArXiv'2022)

</details>

## Moving MNIST Benchmarks

We provide benchmark results on the popular [Moving MNIST](http://arxiv.org/abs/1502.04681) dataset using the $10\rightarrow 10$ frames prediction setting. Metrics (MSE, MAE, SSIM, PSNR) of the final models are reported over three trials. Parameters (M), FLOPs (G), and inference FPS are also reported for all methods.
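
For readers unfamiliar with the metrics, here is a minimal sketch of the textbook per-pixel definitions of MSE, MAE, and PSNR in pure Python. It is an illustration only: the benchmark's exact reduction (for example, whether errors are summed over the pixels of a frame before averaging) is not specified on this page and may differ, and SSIM is omitted because it needs a windowed implementation. The function names and toy data are assumptions made for this example.

```python
import math


def mse(pred, true):
    """Mean squared error over all paired values."""
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)


def mae(pred, true):
    """Mean absolute error over all paired values."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)


def psnr(pred, true, max_value=1.0):
    """Peak signal-to-noise ratio in dB for signals bounded by max_value."""
    err = mse(pred, true)
    if err == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / err)


# Toy "frames" flattened to lists of pixel intensities in [0, 1].
true_frame = [0.0, 0.5, 1.0, 0.5]
pred_frame = [0.0, 0.5, 0.5, 0.5]

print(mse(pred_frame, true_frame))
print(mae(pred_frame, true_frame))
print(psnr(pred_frame, true_frame))
```

Lower MSE and MAE are better, while higher PSNR (and SSIM) are better, which is how the tables below should be read.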

### **Benchmark of Video Prediction Methods**

For a fair comparison of different methods, we report final results when models are trained to convergence. We provide config files in [configs/mmnist](https://github.com/chengtan9907/SimVPv2/configs/mmnist).

| Method | Params | FLOPs | FPS | MSE | MAE | SSIM | Download |
|--------------|:------:|:------:|:---:|:-----:|:------:|:-----:|:------------:|
| ConvLSTM-S | 15.0M | 56.8G | 113 | 46.26 | 142.18 | 0.878 | model \| log |
| ConvLSTM-L | 33.8M | 127.0G | 50 | 29.88 | 95.05 | 0.925 | model \| log |
| PhyDNet | 3.1M | 15.3G | 182 | 35.68 | 96.70 | 0.917 | model \| log |
| PredRNN | 23.8M | 116.0G | 54 | 25.04 | 76.26 | 0.944 | model \| log |
| PredRNN++ | 38.6M | 171.7G | 38 | 22.45 | 69.70 | 0.950 | model \| log |
| MIM | 38.0M | 179.2G | 37 | 23.66 | 74.37 | 0.946 | model \| log |
| E3D-LSTM | 51.0M | 298.9G | 18 | 36.19 | 78.64 | 0.932 | model \| log |
| CrevNet | 5.0M | 270.7G | 10 | 30.15 | 86.28 | 0.935 | model \| log |
| PredRNN.V2 | 23.9M | 116.6G | 52 | 27.73 | 82.17 | 0.937 | model \| log |
| SimVP+IncepU | 58.0M | 19.4G | 209 | 26.69 | 77.19 | 0.940 | model \| log |
| SimVP+gSTA-S | 46.8M | 16.5G | 282 | 15.05 | 49.80 | 0.967 | model \| log |

### **Benchmark of MetaFormers on SimVP**

Since the hidden Translator in [SimVP](https://arxiv.org/abs/2211.12509) can be replaced by any [MetaFormer](https://arxiv.org/abs/2111.11418) block that performs `token mixing` and `channel mixing`, we benchmark popular MetaFormer architectures on SimVP with training schedules of 200 and 2000 epochs. We provide config files in [configs/mmnist/simvp](https://github.com/chengtan9907/SimVPv2/configs/mmnist/simvp/).

| MetaFormer       |  Setting   | Params | FLOPs | FPS |  MSE  |  MAE  |  SSIM  | PSNR  |   Download   |
|------------------|:----------:|:------:|:-----:|:---:|:-----:|:-----:|:------:|:-----:|:------------:|
| IncepU (SimVPv1) | 200 epoch  | 58.0M  | 19.4G | 209 | 32.15 | 89.05 | 0.9268 | 37.97 | model \| log |
| gSTA (SimVPv2)   | 200 epoch  | 46.8M  | 16.5G | 282 | 26.69 | 77.19 | 0.9402 | 38.3  | model \| log |
| ViT              | 200 epoch  | 46.1M  | 16.9G | 290 | 35.15 | 95.87 | 0.9139 | 37.79 | model \| log |
| Swin Transformer | 200 epoch  | 46.1M  | 16.4G | 294 | 29.70 | 84.05 | 0.9331 | 38.14 | model \| log |
| Uniformer        | 200 epoch  | 44.8M  | 16.5G | 296 | 30.38 | 85.87 | 0.9308 | 38.11 | model \| log |
| MLP-Mixer        | 200 epoch  | 38.2M  | 14.7G | 334 | 29.52 | 83.36 | 0.9338 | 38.19 | model \| log |
| ConvMixer        | 200 epoch  |  3.9M  | 5.5G  | 658 | 32.09 | 88.93 | 0.9259 | 37.97 | model \| log |
| Poolformer       | 200 epoch  | 37.1M  | 14.1G | 341 | 31.79 | 88.48 | 0.9271 | 38.06 | model \| log |
| ConvNeXt         | 200 epoch  | 37.3M  | 14.1G | 344 | 26.94 | 77.23 | 0.9397 | 38.34 | model \| log |
| VAN              | 200 epoch  | 44.5M  | 16.0G | 288 | 26.10 | 76.11 | 0.9417 | 38.39 | model \| log |
| HorNet           | 200 epoch  | 45.7M  | 16.3G | 287 | 29.64 | 83.26 | 0.9331 | 38.16 | model \| log |
| MogaNet          | 200 epoch  | 46.8M  | 16.5G | 255 | 25.57 | 75.19 | 0.9429 | 38.41 | model \| log |
| IncepU (SimVPv1) | 2000 epoch | 58.0M  | 19.4G | 209 |   -   |   -   |   -    |   -   |      -       |
| gSTA (SimVPv2)   | 2000 epoch | 46.8M  | 16.5G | 282 | 15.05 | 49.80 | 0.9670 |   -   | model \| log |
| ViT              | 2000 epoch | 46.1M  | 16.9G | 290 | 19.74 | 61.65 | 0.9539 | 38.96 | model \| log |
| Swin Transformer | 2000 epoch | 46.1M  | 16.4G | 294 | 19.11 | 59.84 | 0.9584 | 39.03 | model \| log |
| Uniformer        | 2000 epoch | 44.8M  | 16.5G | 296 | 18.01 | 57.52 | 0.9609 | 39.11 | model \| log |
| MLP-Mixer        | 2000 epoch | 38.2M  | 14.7G | 334 | 18.85 | 59.86 | 0.9589 | 38.98 | model \| log |
| ConvMixer        | 2000 epoch |  3.9M  | 5.5G  | 658 | 22.30 | 67.37 | 0.9507 | 38.67 | model \| log |
| Poolformer       | 2000 epoch | 37.1M  | 14.1G | 341 | 20.96 | 64.31 | 0.9539 | 38.86 | model \| log |
| ConvNeXt         | 2000 epoch | 37.3M  | 14.1G | 344 | 17.58 | 55.76 | 0.9617 | 39.19 | model \| log |
| VAN              | 2000 epoch | 44.5M  | 16.0G | 288 | 16.21 | 53.57 | 0.9646 | 39.26 | model \| log |
| HorNet           | 2000 epoch | 45.7M  | 16.3G | 287 | 17.40 | 55.70 | 0.9624 | 39.19 | model \| log |
| MogaNet          | 2000 epoch | 46.8M  | 16.5G | 255 | 15.67 | 51.84 | 0.9661 | 39.35 | model \| log |
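
To compare the two training schedules at a glance, the MSE columns of the table above can be ranked and the relative reduction from 200 to 2000 epochs computed. The following is a minimal sketch: the numbers are copied from the table, IncepU is left out because its 2000-epoch results are not reported, and the names `MSE_RESULTS` and `relative_reduction` are illustrative assumptions.

```python
# (MetaFormer, MSE at 200 epochs, MSE at 2000 epochs), copied from the table above.
MSE_RESULTS = [
    ("gSTA (SimVPv2)", 26.69, 15.05),
    ("ViT", 35.15, 19.74),
    ("Swin Transformer", 29.70, 19.11),
    ("Uniformer", 30.38, 18.01),
    ("MLP-Mixer", 29.52, 18.85),
    ("ConvMixer", 32.09, 22.30),
    ("Poolformer", 31.79, 20.96),
    ("ConvNeXt", 26.94, 17.58),
    ("VAN", 26.10, 16.21),
    ("HorNet", 29.64, 17.40),
    ("MogaNet", 25.57, 15.67),
]


def relative_reduction(before, after):
    """Fraction by which `after` is lower than `before`."""
    return (before - after) / before


# Rank by MSE after 2000 epochs (lower is better).
ranking = sorted(MSE_RESULTS, key=lambda row: row[2])

for name, mse_200, mse_2000 in ranking:
    gain = 100.0 * relative_reduction(mse_200, mse_2000)
    print(f"{name:<18} {mse_200:6.2f} -> {mse_2000:6.2f}  ({gain:.1f}% lower)")
```

The same pattern applies to the MAE column; for SSIM and PSNR the sort order should be reversed, since higher values are better.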
1 change: 1 addition & 0 deletions docs/en/switch_language.md
@@ -0,0 +1 @@
## <a href='https://simvp.readthedocs.io/en/latest/'>English</a>