YouTube | Poster | Enhancement Model | Demo | Chinese Introduction
We aim to increase video resolution and frame rate in a single end-to-end model (end-to-end STVSR). This project is the implementation of Scale-Adaptive Feature Aggregation for Efficient Space-Time Video Super-Resolution. Our SAFA network outperforms recent state-of-the-art methods such as TMNet and VideoINR by over 0.5 dB PSNR on average, while requiring fewer than half the parameters and only one-third of the computational cost.
We have released several dedicated visual-effect models for ordinary users. Some of our insights on multi-scale processing and feature fusion are reflected in RIFE applications; see Practical-RIFE.
Space-Time Super-Resolution:
```shell
git clone [email protected]:megvii-research/WACV2024-SAFA.git
cd WACV2024-SAFA
pip3 install -r requirements.txt
```
Download the pretrained model from Google Drive.
Image Interpolation
```shell
python3 inference_img.py --img demo/i0.png demo/i1.png --exp=3
```
(2^3 = 8X interpolation results)
```shell
python3 inference_img.py --img demo/i0.png demo/i1.png --ratio=0.4
```
(for an arbitrary timestep)
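To illustrate the difference between the two flags, here is a minimal sketch of how they map to intermediate timesteps, following the common RIFE-style convention; the exact behavior of `inference_img.py` may differ.

```python
# Hedged sketch: how --exp and --ratio plausibly map to timesteps.
# The helper names below are illustrative, not from the repo.

def exp_timesteps(exp: int) -> list[float]:
    """--exp=n gives 2**n X interpolation: 2**n - 1 evenly spaced
    intermediate frames between the two input images."""
    n = 2 ** exp
    return [i / n for i in range(1, n)]

def ratio_timestep(ratio: float) -> float:
    """--ratio=t synthesizes a single frame at an arbitrary time
    t in (0, 1) between the two inputs."""
    assert 0.0 < ratio < 1.0
    return ratio

print(exp_timesteps(3))     # 7 intermediate frames: 0.125 ... 0.875
print(ratio_timestep(0.4))  # 0.4
```

So `--exp=3` produces seven in-between frames (8X frame rate), while `--ratio` targets one specific moment.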
We use 16 CPUs and 4 GPUs for training:
```shell
python3 -m torch.distributed.launch --nproc_per_node=4 train.py --world_size=4
```
The training scheme is mainly adopted from RIFE.
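For context, `torch.distributed.launch` spawns one process per GPU and communicates each process's identity through environment variables. Below is a minimal sketch of the per-process setup such a launcher expects; whether `train.py` in this repo reads these variables exactly this way is an assumption.

```python
# Hedged sketch of per-process rank discovery under
# torch.distributed.launch (illustrative; not the repo's actual code).
import os

def parse_rank(environ: dict) -> tuple[int, int]:
    """The launcher sets LOCAL_RANK and WORLD_SIZE for each spawned
    process (historically it also passed a --local_rank argument)."""
    local_rank = int(environ.get("LOCAL_RANK", 0))
    world_size = int(environ.get("WORLD_SIZE", 1))
    return local_rank, world_size

# With --nproc_per_node=4, process 2 would see:
print(parse_rank({"LOCAL_RANK": "2", "WORLD_SIZE": "4"}))  # (2, 4)
```

Each process would then pin itself to GPU `local_rank` and join the process group before training. Note that newer PyTorch versions recommend `torchrun` over `torch.distributed.launch`.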
We also recommend these related papers:
ECCV22 - Real-Time Intermediate Flow Estimation for Video Frame Interpolation
CVPR23 - A Dynamic Multi-Scale Voxel Flow Network for Video Prediction
If you find this project helpful, please feel free to star the repository or cite our paper:
```
@inproceedings{huang2024safa,
  title={Scale-Adaptive Feature Aggregation for Efficient Space-Time Video Super-Resolution},
  author={Huang, Zhewei and Huang, Ailin and Hu, Xiaotao and Hu, Chen and Xu, Jun and Zhou, Shuchang},
  booktitle={Winter Conference on Applications of Computer Vision (WACV)},
  year={2024}
}
```