This repository contains unofficial code for several temporal action detection (TAD) methods, implemented in open-mmlab style. mmengine, mmcv, mmdetection, and mmaction2 are the main backends.
I am NOT an employee of open-mmlab, nor am I the author of many of the TAD methods implemented here.
- APN (official)
- DITA (official)
- ActionFormer
- TadTR
- BasicTAD
The repository is still under construction and the readme.md needs to be updated.
conda create -n mmengine python=3.8 -y
conda activate mmengine
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
pip install openmim
mim install mmengine mmdet mmaction2
pip install fvcore future tensorboard pytorchvideo timm
Pay attention to the version compatibility of PyTorch, CUDA, and the NVIDIA driver (see link1, link2, link3).
Also pay attention to the installation message of mmcv and check whether it looks like:
Collecting mmcv
Downloading https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/mmcv-2.0.0-cp38-cp38-manylinux1_x86_64.whl
or
Collecting mmcv
Downloading mmcv-2.1.0.tar.gz (471 kB)
The former indicates that a pre-built mmcv exists for the PyTorch and CUDA versions installed in the conda environment. In this case, everything should work fine.
In the second case, mmcv was installed from a .tar.gz file, which means that NO suitable pre-built mmcv was found and mmcv was built from source.
In this case, some tricky errors may appear. For example, a CUDA version mismatch error can be raised if the system-wide CUDA and the conda-wide CUDA versions differ.
You can check the available pre-built mmcv versions on this page.
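As a rough illustration of the check described above, the following Python sketch classifies a pip "Downloading ..." line by its file extension. The function name `classify_mmcv_download` and its return labels are hypothetical, chosen only for this example. It inspects the text of the log line and does not query pip or mmcv.

```python
def classify_mmcv_download(log_line: str) -> str:
    """Classify a pip 'Downloading ...' log line for mmcv.

    Returns 'prebuilt' for a wheel (.whl), 'source' for a source
    archive (.tar.gz), and 'unknown' for anything else.
    """
    parts = log_line.split()
    # pip prints "Downloading <url-or-filename> [(size)]"
    if len(parts) < 2 or parts[0] != "Downloading":
        return "unknown"
    target = parts[1]
    if target.endswith(".whl"):
        return "prebuilt"
    if target.endswith(".tar.gz"):
        return "source"
    return "unknown"


if __name__ == "__main__":
    # The second example message from this README: a source build.
    print(classify_mmcv_download("Downloading mmcv-2.1.0.tar.gz (471 kB)"))
```

A 'source' result is the case where build errors such as a CUDA version mismatch are more likely.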
Add the root directory to the Python path; otherwise you need to add PYTHONPATH=$PWD:$PYTHONPATH
before every command:
cd mmtad
export PYTHONPATH=$PWD:$PYTHONPATH
# $env:PYTHONPATH += ";$pwd" for Windows
Note that this setting is temporary: once you close the terminal, you need to re-run the above command.
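To confirm that the setting took effect in the current terminal, a small Python check can help. This is a minimal sketch: the helper name `on_pythonpath` is hypothetical, and it only compares normalized path strings from the PYTHONPATH variable.

```python
import os


def on_pythonpath(directory: str, pythonpath: str) -> bool:
    """Return True if `directory` is one of the entries in `pythonpath`."""
    target = os.path.abspath(directory)
    # PYTHONPATH entries are separated by ':' on Linux/macOS and ';' on Windows.
    entries = [p for p in pythonpath.split(os.pathsep) if p]
    return any(os.path.abspath(p) == target for p in entries)


if __name__ == "__main__":
    # Run from the repository root after exporting PYTHONPATH.
    print(on_pythonpath(os.getcwd(), os.environ.get("PYTHONPATH", "")))
```

If this prints False in a new terminal, the export command needs to be run again.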
Running commands you need to know (refer to openmim for more details):
Training command:
mim train mmaction $CONFIG --gpus $NUM_GPUS
Test command:
mim test mmaction $CONFIG --gpus $NUM_GPUS --checkpoint $PATH_TO_CHECKPOINT
Notes:
- When $NUM_GPUS > 1, distributed training or testing will be used. You may add --launcher pytorch to use the PyTorch launcher, or --launcher slurm to use the Slurm launcher.
- The final batch_size is $NUM_GPUS * $CFG.train_dataloader.batch_size. You may need to override some options when using a different number of GPUs. For example, if you want to use 8 GPUs, you may add --cfg-options train_dataloader.batch_size=xxx to reduce the per-GPU batch_size by a factor of 8 and keep the final batch size unchanged.
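The batch-size arithmetic in the note above can be sketched in Python. The helper `per_gpu_batch_size` is hypothetical and not part of mim or mmengine; it simply inverts the relation final batch_size = $NUM_GPUS * train_dataloader.batch_size.

```python
def per_gpu_batch_size(final_batch_size: int, num_gpus: int) -> int:
    """Return the per-GPU batch size that keeps the final batch size fixed."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if final_batch_size % num_gpus != 0:
        # The final batch size cannot be split evenly across the GPUs.
        raise ValueError("final_batch_size must be divisible by num_gpus")
    return final_batch_size // num_gpus


if __name__ == "__main__":
    # A final batch size of 2 on 2 GPUs needs batch_size=1 per GPU.
    value = per_gpu_batch_size(final_batch_size=2, num_gpus=2)
    print(f"--cfg-options train_dataloader.batch_size={value}")
```

The returned value is what would be passed in place of xxx in --cfg-options train_dataloader.batch_size=xxx.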
Reproduce APN
APN: Solves TAD with a 2D backbone (ResNet-50). Fast (6000+ FPS) with competitive precision (avg. mAP = 58% on THUMOS14).
(Paper under review) Code coming soon. Stay tuned!
DITA: A DETR-like TAD model with state-of-the-art precision. Streamlined (no NMS, no anchors) and SOTA precision (the first to exceed 70% on THUMOS14).
(Paper under review) Code coming soon. Stay tuned!
Reproduce TadTR
mmaction2 1.2.0 https://github.com/open-mmlab/mmaction2
mmcv 2.1.0 https://github.com/open-mmlab/mmcv
mmdet 3.3.0 https://github.com/open-mmlab/mmdetection
mmengine 0.10.3 https://github.com/open-mmlab/mmengine
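To compare an environment against the versions listed above, a minimal Python sketch such as the following may be useful. The helper `find_mismatches` is hypothetical, and the sketch assumes that the pip distribution names match the package names in the list. It uses an exact string comparison, so a version with a local suffix would be reported as a mismatch.

```python
from importlib import metadata

# Versions listed in this README.
EXPECTED = {
    "mmaction2": "1.2.0",
    "mmcv": "2.1.0",
    "mmdet": "3.3.0",
    "mmengine": "0.10.3",
}


def find_mismatches(expected, get_version=metadata.version):
    """Return {name: (wanted, found)} for packages that differ or are missing."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

No output means every listed package is installed at the listed version.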
Download the pre-extracted features from the official repository and put them in my_data/thumos14/features/thumos_feat_TadTR_64input_8stride_2048. We use our own annotation files.
Train (2 GPUs as an example):
mim train mmaction configs/repo_actionformer_th14.py --gpus 2 --launcher pytorch --cfg-options train_dataloader.batch_size=1
Test (2 GPUs as an example):
mim test mmaction configs/repo_actionformer_th14.py --gpus 2 --launcher pytorch --checkpoint work_dirs/repo_actionformer_th14/latest.pth --cfg-options train_dataloader.batch_size=1
Reproduce ActionFormer
mmaction2 1.2.0 https://github.com/open-mmlab/mmaction2
mmcv 2.1.0 https://github.com/open-mmlab/mmcv
mmdet 3.3.0 https://github.com/open-mmlab/mmdetection
mmengine 0.10.3 https://github.com/open-mmlab/mmengine
Download the pre-extracted features from the official repository and put them in my_data/thumos14/features/thumos_feat_ActionFormer_16input_4stride_2048/i3d_features. We use our own annotation files.
Train (2 GPUs as an example):
mim train mmaction configs/repo_actionformer_th14.py --gpus 2 --launcher pytorch --cfg-options train_dataloader.batch_size=1
We change batch_size to 1 (which is 2 by default) here as two GPUs are used for training. The final batch_size is still 2, following the official training.
Test (2 GPUs as an example):
mim test mmaction configs/repo_actionformer_th14.py --gpus 2 --launcher pytorch --checkpoint work_dirs/repo_actionformer_th14/latest.pth --cfg-options train_dataloader.batch_size=1
Reproduce PlusTAD
mmaction2 1.2.0 https://github.com/open-mmlab/mmaction2
mmcv 2.1.0 https://github.com/open-mmlab/mmcv
mmdet 3.3.0 https://github.com/open-mmlab/mmdetection
mmengine 0.10.3 https://github.com/open-mmlab/mmengine
Download the pre-extracted features from the official repository and put them in my_data/thumos14/features/thumos_feat_ActionFormer_16input_4stride_2048/i3d_features. We use our own annotation files.
Train (2 GPUs as an example):
mim train mmaction configs/repo_actionformer_th14.py --gpus 2 --launcher pytorch --cfg-options train_dataloader.batch_size=1
We change batch_size to 1 (which is 2 by default) here as two GPUs are used for training. The final batch_size is still 2, following the official training.
Test (2 GPUs as an example):
mim test mmaction configs/repo_actionformer_th14.py --gpus 2 --launcher pytorch --checkpoint work_dirs/repo_actionformer_th14/latest.pth --cfg-options train_dataloader.batch_size=1