update docs and mmnist benchmarks

chengtan9907 · Feb 20, 2023 · 1f3d9ba
1 parent b24b99d
commit 1f3d9ba
Showing 29 changed files with 330 additions and 22 deletions.
diff --git a/.gitignore b/.gitignore
@@ -135,3 +135,4 @@ figs
 
 # temp
 configs/kitticaltech/simvp
+configs/kth/simvp
diff --git a/.readthedocs.yml b/.readthedocs.yml
@@ -0,0 +1,9 @@
+version: 2
+
+formats: []
+
+python:
+  version: 3.7
+  install:
+    - requirements: requirements/docs.txt
+    - requirements: requirements/readthedocs.txt
diff --git a/.style.yapf b/.style.yapf
@@ -0,0 +1,4 @@
+[style]
+BASED_ON_STYLE = pep8
+BLANK_LINE_BEFORE_NESTED_CLASS_OR_DEF = true
+SPLIT_BEFORE_EXPRESSION_AFTER_OPENING_PAREN = true
diff --git a/README.md b/README.md
@@ -7,6 +7,10 @@
     <img src="https://img.shields.io/badge/license-Apache--2.0-%23B7A800" /></a>
 </p>
 
+[🛠️Installation](docs/en/install.md) |
+[🚀Model Zoo](docs/en/model_zoos/video_benchmarks.md) |
+[🆕News](docs/en/changelog.md)
+
 This repository is an open-source project for video prediction benchmarks, which contains the implementation code for paper:
 
 **SimVP: Towards Simple yet Powerful Spatiotemporal Predictive learning**  
@@ -125,10 +129,11 @@ We support various video prediction methods and will provide benchmarks on vario
     <details open>
     <summary>Currently supported datasets</summary>
 
-    - [x] [KTH Action](https://ieeexplore.ieee.org/document/1334462) (ICPR'2004)  [[download](https://www.csc.kth.se/cvap/actions/)]
-    - [x] [KittiCaltech Pedestrian](https://dl.acm.org/doi/10.1177/0278364913491297) (IJRR'2013) [[download](https://figshare.com/articles/dataset/KITTI_hkl_files/7985684)]
-    - [x] [Moving MNIST](http://arxiv.org/abs/1502.04681) (ICML'2015) [[download](http://www.cs.toronto.edu/~nitish/unsupervised_video/)]
-    - [x] [TaxiBJ](https://arxiv.org/abs/1610.00081) (AAAI'2017) [[download](https://github.com/TolicWang/DeepST/tree/master/data/TaxiBJ)]
+    - [x] [KTH Action](https://ieeexplore.ieee.org/document/1334462) (ICPR'2004)  [[download](https://www.csc.kth.se/cvap/actions/)] [[config](https://github.com/chengtan9907/SimVPv2/configs/kth)]
+    - [x] [KittiCaltech Pedestrian](https://dl.acm.org/doi/10.1177/0278364913491297) (IJRR'2013) [[download](https://figshare.com/articles/dataset/KITTI_hkl_files/7985684)] [[config](https://github.com/chengtan9907/SimVPv2/configs/kitticaltech)]
+    - [x] [Moving MNIST](http://arxiv.org/abs/1502.04681) (ICML'2015) [[download](http://www.cs.toronto.edu/~nitish/unsupervised_video/)] [[config](https://github.com/chengtan9907/SimVPv2/configs/mmnist)]
+    - [x] [TaxiBJ](https://arxiv.org/abs/1610.00081) (AAAI'2017) [[download](https://github.com/TolicWang/DeepST/tree/master/data/TaxiBJ)] [[config](https://github.com/chengtan9907/SimVPv2/configs/taxibj)]
+    - [x] [WeatherBench](https://arxiv.org/abs/2002.00469) (ArXiv'2020) [[download](https://github.com/pangeo-data/WeatherBench)] [[config](https://github.com/chengtan9907/SimVPv2/configs/weather)]
 
     </details>
 

diff --git a/configs/mmnist/simvp/SimVP_ConvMixer.py b/configs/mmnist/simvp/SimVP_ConvMixer.py
@@ -0,0 +1,11 @@
+method = 'SimVP'
+spatio_kernel_enc = 3
+spatio_kernel_dec = 3
+model_type = 'convmixer'
+hid_S = 64
+hid_T = 512
+N_T = 8
+N_S = 4
+lr = 1e-2
+batch_size = 16
+drop_path = 0
diff --git a/configs/mmnist/simvp/SimVP_ConvNeXt.py b/configs/mmnist/simvp/SimVP_ConvNeXt.py
@@ -6,4 +6,6 @@
 hid_T = 512
 N_T = 8
 N_S = 4
+lr = 1e-2
+batch_size = 16
 drop_path = 0
diff --git a/configs/mmnist/simvp/SimVP_HorNet.py b/configs/mmnist/simvp/SimVP_HorNet.py
@@ -6,4 +6,6 @@
 hid_T = 512
 N_T = 8
 N_S = 4
+lr = 1e-3
+batch_size = 16
 drop_path = 0
diff --git a/configs/mmnist/simvp/SimVP_MLPMixer.py b/configs/mmnist/simvp/SimVP_MLPMixer.py
@@ -0,0 +1,11 @@
+method = 'SimVP'
+spatio_kernel_enc = 3
+spatio_kernel_dec = 3
+model_type = 'mlp'
+hid_S = 64
+hid_T = 512
+N_T = 8
+N_S = 4
+lr = 1e-3
+batch_size = 16
+drop_path = 0
diff --git a/configs/mmnist/simvp/SimVP_MogaNet.py b/configs/mmnist/simvp/SimVP_MogaNet.py
@@ -6,4 +6,6 @@
 hid_T = 512
 N_T = 8
 N_S = 4
+lr = 1e-3
+batch_size = 16
 drop_path = 0
diff --git a/configs/mmnist/simvp/SimVP_Poolformer.py b/configs/mmnist/simvp/SimVP_Poolformer.py
@@ -6,4 +6,6 @@
 hid_T = 512
 N_T = 8
 N_S = 4
+lr = 1e-3
+batch_size = 16
 drop_path = 0
diff --git a/configs/mmnist/simvp/SimVP_Swin.py b/configs/mmnist/simvp/SimVP_Swin.py
@@ -6,4 +6,6 @@
 hid_T = 512
 N_T = 8
 N_S = 4
+lr = 1e-3
+batch_size = 16
 drop_path = 0
diff --git a/configs/mmnist/simvp/SimVP_Uniformer.py b/configs/mmnist/simvp/SimVP_Uniformer.py
@@ -6,4 +6,6 @@
 hid_T = 512
 N_T = 8
 N_S = 4
+lr = 5e-4
+batch_size = 16
 drop_path = 0
diff --git a/configs/mmnist/simvp/SimVP_VAN.py b/configs/mmnist/simvp/SimVP_VAN.py
@@ -0,0 +1,11 @@
+method = 'SimVP'
+spatio_kernel_enc = 3
+spatio_kernel_dec = 3
+model_type = 'van'
+hid_S = 64
+hid_T = 512
+N_T = 8
+N_S = 4
+lr = 1e-3
+batch_size = 16
+drop_path = 0
diff --git a/configs/mmnist/simvp/SimVP_ViT.py b/configs/mmnist/simvp/SimVP_ViT.py
@@ -6,4 +6,6 @@
 hid_T = 512
 N_T = 8
 N_S = 4
+lr = 1e-3
+batch_size = 16
 drop_path = 0
diff --git a/configs/taxibj/SimVP.py b/configs/taxibj/SimVP.py
@@ -0,0 +1,8 @@
+method = 'SimVP'
+spatio_kernel_enc = 3
+spatio_kernel_dec = 3
+# model_type = None
+hid_S = 64
+hid_T = 512
+N_T = 8
+N_S = 4
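
The config files added or extended above are flat Python modules made of top-level assignments (`method`, `model_type`, `lr`, `batch_size`, and so on). As a minimal sketch of how such a file could be consumed, the snippet below runs a config file and merges its values over a set of defaults. The helper names `load_config` and `merge_config` and the `epoch` default are hypothetical and are not taken from the repository, whose actual loader may work differently.

```python
import runpy
import tempfile
from pathlib import Path


def load_config(path):
    """Run a flat Python config file and return its top-level names as a dict."""
    namespace = runpy.run_path(str(path))
    # Drop dunder entries such as __name__ and __builtins__.
    return {k: v for k, v in namespace.items() if not k.startswith("__")}


def merge_config(defaults, overrides):
    """Return a copy of defaults updated by overrides, skipping None values."""
    merged = dict(defaults)
    merged.update({k: v for k, v in overrides.items() if v is not None})
    return merged


if __name__ == "__main__":
    # Contents of configs/mmnist/simvp/SimVP_MLPMixer.py as added in this commit.
    text = (
        "method = 'SimVP'\n"
        "spatio_kernel_enc = 3\n"
        "spatio_kernel_dec = 3\n"
        "model_type = 'mlp'\n"
        "hid_S = 64\n"
        "hid_T = 512\n"
        "N_T = 8\n"
        "N_S = 4\n"
        "lr = 1e-3\n"
        "batch_size = 16\n"
        "drop_path = 0\n"
    )
    with tempfile.TemporaryDirectory() as tmp:
        cfg_path = Path(tmp) / "SimVP_MLPMixer.py"
        cfg_path.write_text(text)
        cfg = load_config(cfg_path)
    # Hypothetical defaults; values from the config file take precedence.
    print(merge_config({"lr": 1e-2, "epoch": 200}, cfg))
```

Skipping `None` overrides lets a caller pass unset options without erasing defaults, which fits configs such as `configs/taxibj/SimVP.py` where `model_type` is left commented out.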
diff --git a/docs/en/Makefile b/docs/en/Makefile
@@ -0,0 +1,20 @@
+# Minimal makefile for Sphinx documentation
+#
+
+# You can set these variables from the command line, and also
+# from the environment for the first two.
+SPHINXOPTS    ?=
+SPHINXBUILD   ?= sphinx-build
+SOURCEDIR     = .
+BUILDDIR      = _build
+
+# Put it first so that "make" without argument is like "make help".
+help:
+	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+.PHONY: help Makefile
+
+# Catch-all target: route all unknown targets to Sphinx using the new
+# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
+%: Makefile
+	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
diff --git a/docs/en/changelog.md b/docs/en/changelog.md
@@ -0,0 +1,20 @@
+## Changelog
+
+### v0.1.0 (18/02/2023)
+
+Release v0.1.0 with code refactoring.
+
+#### Code Refactoring
+
+* Refactor the code structure into `simvp/api`, `simvp/core`, `simvp/datasets`, `simvp/methods`, `simvp/models`, and `simvp/modules`. Support non-distributed training and evaluation through the executable Python script `tools/non_dist_train.py`. Refactor config files for SimVP models.
+* Fix bugs in `tools/non_dist_train.py`, `simvp/utils`, `environment.yml`, and `.gitignore`.
+
+#### New Features
+
+* Support Timm optimizers and schedulers.
+* Update popular Metaformer models as the hidden Translator $h$ in SimVP, supporting [ViT](https://arxiv.org/abs/2010.11929), [Swin-Transformer](https://arxiv.org/abs/2103.14030), [MLP-Mixer](https://arxiv.org/abs/2105.01601), [ConvMixer](https://arxiv.org/abs/2201.09792), [UniFormer](https://arxiv.org/abs/2201.09450), [PoolFormer](https://arxiv.org/abs/2111.11418), [ConvNeXt](https://arxiv.org/abs/2201.03545), [VAN](https://arxiv.org/abs/2202.09741), [HorNet](https://arxiv.org/abs/2207.14284), and [MogaNet](https://arxiv.org/abs/2211.03295).
+* Update implementations of dataset and dataloader, supporting [KTH Action](https://ieeexplore.ieee.org/document/1334462), [KittiCaltech Pedestrian](https://dl.acm.org/doi/10.1177/0278364913491297), [Moving MNIST](http://arxiv.org/abs/1502.04681), [TaxiBJ](https://arxiv.org/abs/1610.00081), and [WeatherBench](https://arxiv.org/abs/2002.00469).
+
+#### Update Documents
+
+* Upload `readthedocs` documents. Summarize video prediction benchmark results on MMNIST in [video_benchmarks.md](https://github.com/chengtan9907/SimVPv2/docs/en/model_zoos/video_benchmarks.md).
diff --git a/docs/en/get_started.md b/docs/en/get_started.md
@@ -0,0 +1,9 @@
+# Getting Started
+
+This page provides basic tutorials on using SimVP. For installation instructions, please see [Install](docs/en/install.md).
+
+An example of training SimVP+gSTA on the Moving MNIST dataset with a single GPU:
+```shell
+bash tools/prepare_data/download_mmnist.sh
+python tools/non_dist_train.py -d mmnist -m SimVP --model_type gsta --lr 1e-3 --ex_name mmnist_simvp_gsta
+```
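
The training command above can be generated for other translator types as well. The following is a minimal sketch that rebuilds the same command line from its parts, using only the flags that appear in the example (`-d`, `-m`, `--model_type`, `--lr`, `--ex_name`). The helper `train_command` and the `ex_name` naming pattern are assumptions for illustration; the `model_type` and `lr` pairs in the loop are copied from the config files in this commit.

```python
import shlex


def train_command(model_type, lr, dataset="mmnist", method="SimVP"):
    """Build a tools/non_dist_train.py command line as a single shell string.

    `lr` is passed as a string so that its formatting (e.g. "1e-3") is preserved.
    """
    # Hypothetical naming pattern, modelled on "mmnist_simvp_gsta".
    ex_name = f"{dataset}_{method.lower()}_{model_type}"
    args = [
        "python", "tools/non_dist_train.py",
        "-d", dataset,
        "-m", method,
        "--model_type", model_type,
        "--lr", str(lr),
        "--ex_name", ex_name,
    ]
    return shlex.join(args)


if __name__ == "__main__":
    # (model_type, lr) pairs taken from the configs in this commit.
    for model_type, lr in [("gsta", "1e-3"), ("convmixer", "1e-2"), ("mlp", "1e-3")]:
        print(train_command(model_type, lr))
```

`shlex.join` (Python 3.8+) quotes each argument for the shell, so the result can be pasted into a terminal or written to a script.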
diff --git a/docs/en/index.rst b/docs/en/index.rst
@@ -0,0 +1,38 @@
+.. SimVP documentation master file, created by
+   sphinx-quickstart on Thu June 15 05:11:34 2022.
+   You can adapt this file completely to your liking, but it should at least
+   contain the root `toctree` directive.
+
+Welcome to SimVP's documentation!
+=====================================
+
+.. toctree::
+   :maxdepth: 1
+   :caption: Getting Started
+
+   install.md
+   get_started.md
+
+.. toctree::
+   :maxdepth: 1
+   :caption: Model Zoos
+
+   model_zoos/video_benchmarks.md
+
+.. toctree::
+   :maxdepth: 1
+   :caption: Notes
+
+   changelog.md
+
+.. toctree::
+   :caption: Switch Language
+
+   switch_language.md
+
+
+Indices and tables
+==================
+
+* :ref:`genindex`
+* :ref:`search`
diff --git a/docs/en/install.md b/docs/en/install.md
@@ -0,0 +1,22 @@
+# Installation
+
+This project provides a conda environment file; users can reproduce the environment with the following commands:
+```shell
+git clone https://github.com/chengtan9907/SimVPv2
+cd SimVPv2
+conda env create -f environment.yml
+conda activate SimVP
+python setup.py develop
+```
+
+<details close>
+<summary>Dependencies</summary>
+
+* argparse
+* numpy
+* hickle
+* scikit-image=0.16.2
+* torch
+* timm
+* tqdm
+</details>
diff --git a/docs/en/model_zoos/video_benchmarks.md b/docs/en/model_zoos/video_benchmarks.md
@@ -0,0 +1,90 @@
+# Video Prediction Benchmarks
+
+**We provide benchmark results of video prediction methods on video datasets. More video prediction methods will be supported in the future. Issues and PRs are welcome!**
+
+<details open>
+<summary>Currently supported video prediction methods</summary>
+
+- [x] [ConvLSTM](https://arxiv.org/abs/1506.04214) (NIPS'2015)
+- [x] [PredRNN](https://dl.acm.org/doi/abs/10.5555/3294771.3294855) (NIPS'2017)
+- [x] [PredRNN++](https://arxiv.org/abs/1804.06300) (ICML'2018)
+- [x] [E3D-LSTM](https://openreview.net/forum?id=B1lKS2AqtX) (ICLR'2019)
+- [x] [MIM](https://arxiv.org/abs/1811.07490) (CVPR'2019)
+- [x] [CrevNet](https://openreview.net/forum?id=B1lKS2AqtX) (ICLR'2020)
+- [x] [PhyDNet](https://arxiv.org/abs/2003.01460) (CVPR'2020)
+- [x] [PredRNN.V2](https://arxiv.org/abs/2103.09504v4) (TPAMI'2022)
+- [x] [SimVP](https://arxiv.org/abs/2206.05099) (CVPR'2022)
+- [x] [SimVP.V2](https://arxiv.org/abs/2211.12509) (ArXiv'2022)
+
+</details>
+
+<details open>
+<summary>Currently supported MetaFormer models for SimVP</summary>
+
+- [x] [ViT](https://arxiv.org/abs/2010.11929) (ICLR'2021)
+- [x] [Swin-Transformer](https://arxiv.org/abs/2103.14030) (ICCV'2021)
+- [x] [MLP-Mixer](https://arxiv.org/abs/2105.01601) (NIPS'2021)
+- [x] [ConvMixer](https://arxiv.org/abs/2201.09792) (Openreview'2021)
+- [x] [UniFormer](https://arxiv.org/abs/2201.09450) (ICLR'2022)
+- [x] [PoolFormer](https://arxiv.org/abs/2111.11418) (CVPR'2022)
+- [x] [ConvNeXt](https://arxiv.org/abs/2201.03545) (CVPR'2022)
+- [x] [VAN](https://arxiv.org/abs/2202.09741) (ArXiv'2022)
+- [x] [IncepU (SimVP.V1)](https://arxiv.org/abs/2206.05099) (CVPR'2022)
+- [x] [gSTA (SimVP.V2)](https://arxiv.org/abs/2211.12509) (ArXiv'2022)
+- [x] [HorNet](https://arxiv.org/abs/2207.14284) (NIPS'2022)
+- [x] [MogaNet](https://arxiv.org/abs/2211.03295) (ArXiv'2022)
+
+</details>
+
+## Moving MNIST Benchmarks
+
+We provide benchmark results on the popular [Moving MNIST](http://arxiv.org/abs/1502.04681) dataset using the $10\rightarrow 10$ frame prediction setting. Metrics (MSE, MAE, SSIM, PSNR) of the final models are reported over three trials. Parameters (M), FLOPs (G), and inference FPS (s) are also reported for all methods.
+
+### **Benchmark of Video Prediction Methods**
+
+For a fair comparison of different methods, we report final results when models are trained to convergence. We provide config files in [configs/mmnist](https://github.com/chengtan9907/SimVPv2/configs/mmnist).
+
+| Method       | Params |  FLOPs | FPS |  MSE  |   MAE  |  SSIM |   Download   |
+|--------------|:------:|:------:|:---:|:-----:|:------:|:-----:|:------------:|
+| ConvLSTM-S   |  15.0M |  56.8G | 113 | 46.26 | 142.18 | 0.878 | model \| log |
+| ConvLSTM-L   |  33.8M | 127.0G |  50 | 29.88 |  95.05 | 0.925 | model \| log |
+| PhyDNet      |  3.1M  |  15.3G | 182 | 35.68 |  96.70 | 0.917 | model \| log |
+| PredRNN      |  23.8M | 116.0G |  54 | 25.04 |  76.26 | 0.944 | model \| log |
+| PredRNN++    |  38.6M | 171.7G |  38 | 22.45 |  69.70 | 0.950 | model \| log |
+| MIM          |  38.0M | 179.2G |  37 | 23.66 |  74.37 | 0.946 | model \| log |
+| E3D-LSTM     |  51.0M | 298.9G |  18 | 36.19 |  78.64 | 0.932 | model \| log |
+| CrevNet      |  5.0M  | 270.7G |  10 | 30.15 |  86.28 | 0.935 | model \| log |
+| PredRNN.V2   |  23.9M | 116.6G |  52 | 27.73 |  82.17 | 0.937 | model \| log |
+| SimVP+IncepU |  58.0M |  19.4G | 209 | 26.69 |  77.19 | 0.940 | model \| log |
+| SimVP+gSTA-S |  46.8M |  16.5G | 282 | 15.05 |  49.80 | 0.967 | model \| log |
+
+### **Benchmark of MetaFormers on SimVP**
+
+Since the hidden Translator in [SimVP](https://arxiv.org/abs/2211.12509) can be replaced by any [MetaFormer](https://arxiv.org/abs/2111.11418) block that performs `token mixing` and `channel mixing`, we benchmark popular MetaFormer architectures on SimVP with training schedules of 200 and 2000 epochs. We provide config files in [configs/mmnist/simvp](https://github.com/chengtan9907/SimVPv2/configs/mmnist/simvp/).
+
+| MetaFormer       |   Setting  | Params |  FLOPs |  FPS |  MSE  |  MAE  |  SSIM  |  PSNR |   Download   |
+|------------------|:----------:|:------:|:------:|:----:|:-----:|:-----:|:------:|:-----:|:------------:|
+| IncepU (SimVPv1) |  200 epoch |  58.0M |  19.4G | 209s | 32.15 | 89.05 | 0.9268 | 37.97 | model \| log |
+| gSTA (SimVPv2)   |  200 epoch |  46.8M |  16.5G | 282s | 26.69 | 77.19 | 0.9402 |  38.3 | model \| log |
+| ViT              |  200 epoch |  46.1M |  16.9G | 290s | 35.15 | 95.87 | 0.9139 | 37.79 | model \| log |
+| Swin Transformer |  200 epoch |  46.1M |  16.4G | 294s | 29.70 | 84.05 | 0.9331 | 38.14 | model \| log |
+| Uniformer        |  200 epoch |  44.8M |  16.5G | 296s | 30.38 | 85.87 | 0.9308 | 38.11 | model \| log |
+| MLP-Mixer        |  200 epoch |  38.2M |  14.7G | 334s | 29.52 | 83.36 | 0.9338 | 38.19 | model \| log |
+| ConvMixer        |  200 epoch |  3.9M  |  5.5G  | 658s | 32.09 | 88.93 | 0.9259 | 37.97 | model \| log |
+| Poolformer       |  200 epoch |  37.1M |  14.1G | 341s | 31.79 | 88.48 | 0.9271 | 38.06 | model \| log |
+| ConvNeXt         |  200 epoch |  37.3M |  14.1G | 344s | 26.94 | 77.23 | 0.9397 | 38.34 | model \| log |
+| VAN              |  200 epoch |  44.5M |  16.0G | 288s | 26.10 | 76.11 | 0.9417 | 38.39 | model \| log |
+| HorNet           |  200 epoch |  45.7M |  16.3G | 287s | 29.64 | 83.26 | 0.9331 | 38.16 | model \| log |
+| MogaNet          |  200 epoch |  46.8M |  16.5G | 255s | 25.57 | 75.19 | 0.9429 | 38.41 | model \| log |
+| IncepU (SimVPv1) | 2000 epoch |  58.0M |  19.4G | 209s |   -   |   -   |    -   |   -   |       -      |
+| gSTA (SimVPv2)   | 2000 epoch |  46.8M |  16.5G | 282s | 15.05 | 49.80 | 0.9670 |   -   | model \| log |
+| ViT              | 2000 epoch |  46.1M |  16.9G | 290s | 19.74 | 61.65 | 0.9539 | 38.96 | model \| log |
+| Swin Transformer | 2000 epoch |  46.1M |  16.4G | 294s | 19.11 | 59.84 | 0.9584 | 39.03 | model \| log |
+| Uniformer        | 2000 epoch |  44.8M |  16.5G | 296s | 18.01 | 57.52 | 0.9609 | 39.11 | model \| log |
+| MLP-Mixer        | 2000 epoch |  38.2M |  14.7G | 334s | 18.85 | 59.86 | 0.9589 | 38.98 | model \| log |
+| ConvMixer        | 2000 epoch |  3.9M  |  5.5G  | 658s | 22.30 | 67.37 | 0.9507 | 38.67 | model \| log |
+| Poolformer       | 2000 epoch |  37.1M |  14.1G | 341s | 20.96 | 64.31 | 0.9539 | 38.86 | model \| log |
+| ConvNeXt         | 2000 epoch |  37.3M |  14.1G | 344s | 17.58 | 55.76 | 0.9617 | 39.19 | model \| log |
+| VAN              | 2000 epoch |  44.5M |  16.0G | 288s | 16.21 | 53.57 | 0.9646 | 39.26 | model \| log |
+| HorNet           | 2000 epoch |  45.7M |  16.3G | 287s | 17.40 | 55.70 | 0.9624 | 39.19 | model \| log |
+| MogaNet          | 2000 epoch |  46.8M |  16.5G | 255s | 15.67 | 51.84 | 0.9661 | 39.35 | model \| log |
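
The Moving MNIST tables report MSE and MAE without stating how the errors are reduced. The sketch below assumes one common convention in video prediction code: sum the squared (or absolute) errors over the pixels of each frame, then average over frames. This reduction, the function name `frame_errors`, and the nested-list input layout are assumptions for illustration, not the repository's metric code.

```python
def frame_errors(pred, true):
    """Return (mse, mae) for two videos given as [T][H][W] nested lists.

    Assumed convention: errors are summed over the pixels of each frame
    and then averaged over the T frames.
    """
    if len(pred) != len(true) or not pred:
        raise ValueError("pred and true must have the same, non-zero frame count")
    sq_total = 0.0
    abs_total = 0.0
    for p_frame, t_frame in zip(pred, true):
        for p_row, t_row in zip(p_frame, t_frame):
            for p, t in zip(p_row, t_row):
                diff = p - t
                sq_total += diff * diff
                abs_total += abs(diff)
    n_frames = len(pred)
    return sq_total / n_frames, abs_total / n_frames


if __name__ == "__main__":
    # Two 2x2 frames: the first is off by 0.5 at every pixel, the second is exact.
    pred = [[[0.5, 0.5], [0.5, 0.5]], [[1.0, 0.0], [0.0, 1.0]]]
    true = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
    mse, mae = frame_errors(pred, true)
    print(f"MSE={mse:.2f} MAE={mae:.2f}")
```

Under a per-frame sum, values scale with the frame resolution, so numbers computed this way are only comparable between runs that use the same image size and value range.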
diff --git a/docs/en/switch_language.md b/docs/en/switch_language.md
@@ -0,0 +1 @@
+## <a href='https://simvp.readthedocs.io/en/latest/'>English</a>