🥇FAST-VQA/FasterVQA

12 Oct, 2022: Release of pre-print FasterVQA paper: PDF, Abstract.
27 Sep, 2022: Release of FasterVQA models: 4X more efficient, 14X real-time inference on Apple M1 CPU (for FasterVQA-MT, tested on my old Mac).
10 Sep, 2022: Support for Adaptive Multi-scale Inference (AMI): one model for different scales of inputs.

[Figure: Performance of FasterVQA]

[Figure: Performance of FAST-VQA]

An open-source deep end-to-end video quality assessment toolbox, and reproducible code for the ECCV 2022 paper FAST-VQA: Efficient End-to-end Video Quality Assessment with Fragment Sampling and its extension paper Neighbourhood Representative Sampling for Efficient End-to-end Video Quality Assessment.

✨ We are officially announcing FasterVQA (named FAST-VQA-B-3D during development), which extends the proposed fragments to a 3D version and runs 4x faster with performance close to the original FAST-VQA. The official CVF edition of the ECCV paper will also be online soon, as the conference is approaching.

In this release, we have refactored the training and testing code. The refactored code achieves the same performance as the original version and allows modification of (1) the backbone structures, (2) the sampling hyper-parameters, and (3) the loss functions.

Infer for a single MP4 video

python vqa.py -d [YOUR_INPUT_FILE_PATH]

By default, the script runs FasterVQA on a video from KoNViD-1k, which should yield a score of around 0.133.
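To score several videos, the command above can be repeated once per file. The following is a minimal sketch that relies only on the `-d` flag shown above; the helper names `build_vqa_commands` and `run_all` are hypothetical and not part of this repository, and the sketch assumes it is run from the repository root so that `vqa.py` is found.

```python
import subprocess
import sys
from pathlib import Path


def build_vqa_commands(video_dir, script="vqa.py"):
    """Return one `python vqa.py -d <file>` command per .mp4 file in video_dir."""
    videos = sorted(Path(video_dir).glob("*.mp4"))
    # sys.executable reuses the interpreter that runs this helper.
    return [[sys.executable, script, "-d", str(video)] for video in videos]


def run_all(video_dir):
    """Run the commands one after another (hypothetical helper)."""
    for command in build_vqa_commands(video_dir):
        # check=True raises CalledProcessError if vqa.py exits with an error.
        subprocess.run(command, check=True)
```

Calling `run_all("my_videos")`, where `my_videos` is a placeholder folder name, would then invoke `vqa.py` once for every `.mp4` file in that folder, in alphabetical order.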

See our Weights & Biases training logs

We are reproducing several experiments and have made part of our training and testing curves public on Weights & Biases:

https://wandb.ai/timothyhwu/Open_FAST_VQA

Now supports:

FasterVQA-finetuned-to-KonViD-1k
FasterVQA-on-MT-and-MS-scales-with-AMI

🚩 Modularized Parts Designed for Development

Data Preprocessing

See Data Processing for the data-processing source code. In particular, the FusionDataset class and the get_spatial_and_temporal_samples function implement our core transformations.

Spatial Sampling

We support the following spatial sampling approaches:

fragments
resize
arp_resize (resize while keeping the original Aspect Ratio)
crop

We also support the combination of those sampling approaches (multi-branch networks) for more flexibility.
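As an illustration of the `fragments` option, here is a minimal pure-Python sketch of grid-based patch sampling: the frame is divided into a grid, one small patch is taken at a random position inside each cell, and the patches are stitched into a smaller image. It is a simplified sketch rather than the repository's implementation; the function names are hypothetical, and the defaults (a 7x7 grid of 32x32 patches) are assumptions taken from the "7*32" setting listed for the pretrained models.

```python
import random


def sample_fragment_origins(height, width, grid=7, patch=32, rng=None):
    """Pick one patch top-left corner (y, x) inside each cell of a grid x grid layout."""
    rng = rng or random.Random()
    cell_h, cell_w = height // grid, width // grid
    if cell_h < patch or cell_w < patch:
        raise ValueError("frame is too small for this grid/patch setting")
    origins = []
    for row in range(grid):
        for col in range(grid):
            # Random offset that keeps the whole patch inside its cell.
            y = row * cell_h + rng.randint(0, cell_h - patch)
            x = col * cell_w + rng.randint(0, cell_w - patch)
            origins.append((y, x))
    return origins


def stitch_fragments(frame, origins, grid=7, patch=32):
    """Copy the sampled patches from a 2D frame (list of rows) into one square image."""
    size = grid * patch
    out = [[0] * size for _ in range(size)]
    for index, (y, x) in enumerate(origins):
        row, col = divmod(index, grid)
        for dy in range(patch):
            source = frame[y + dy][x:x + patch]
            out[row * patch + dy][col * patch:(col + 1) * patch] = source
    return out


# Toy 540x960 single-channel frame whose pixel values encode their position.
frame = [[r * 1000 + c for c in range(960)] for r in range(540)]
origins = sample_fragment_origins(540, 960, rng=random.Random(0))
fragments = stitch_fragments(frame, origins)  # 224 x 224 output
```

Reusing one set of origins for every frame of a clip would keep the sampled patches aligned over time.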

Temporal Sampling (New)

We also support different temporal sampling approaches:

SampleFrames (samples continuous frames, imported from MMAction2)
FragmentSampleFrames (✨ New, samples fragment-like discontinuous frames)
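The idea of fragment-like discontinuous frame sampling can be sketched as follows: the video is split into equal temporal segments, and a short run of continuous frames is taken from a random position inside each segment. This is a minimal sketch under stated assumptions, not the repository's `FragmentSampleFrames` class; the function name, parameter names, and default values are hypothetical.

```python
import random


def sample_fragment_frames(total_frames, num_fragments=8, frames_per_fragment=4,
                           frame_interval=1, rng=None):
    """Split the video into equal segments and take a short continuous run from each."""
    rng = rng or random.Random()
    segment = total_frames // num_fragments
    # Number of source frames covered by one run of sampled frames.
    span = (frames_per_fragment - 1) * frame_interval + 1
    if segment < span:
        raise ValueError("video is too short for this setting")
    indices = []
    for k in range(num_fragments):
        # Random start that keeps the whole run inside segment k.
        start = k * segment + rng.randint(0, segment - span)
        indices.extend(start + i * frame_interval for i in range(frames_per_fragment))
    return indices


# 8 runs of 4 frames each, drawn from a 240-frame video.
frame_indices = sample_fragment_frames(240, rng=random.Random(0))
```

Frames are continuous within each run but discontinuous between runs, so the sample covers the whole video while preserving short-term motion.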

Network Structures

Network Backbones

Video Swin Transformer (with GRPB, as proposed in FAST-VQA)
Video Swin Transformer (vanilla)
ConvNext-I3D (vanilla)

Network Heads

IP-NLR Head (as proposed in FAST-VQA)

The IP-NLR head can generate local quality maps for videos.

Installation

Dependencies

The original library is built with

python=3.8.8
torch=1.10.2
torchvision=0.11.3

It uses the decord module to read the original videos, so you do not need to transform your original .mp4 input beforehand.

To get all the requirements, please run

pip install -r requirements.txt

Direct Install

You can run

pip install .

or

python setup.py install

to install the full FAST-VQA with its requirements.

Usage

Quick Benchmark

Step 1: Get Pretrained Weights

We provide pretrained weights for several versions:

Name	Pretrain	Spatial Fragments	Temporal Fragments	PLCC@LSVQ_1080p	PLCC@LSVQ_test	PLCC@LIVE_VQC	PLCC@KoNViD	MACs	config	model
FAST-VQA-B (ECCV2022)	Kinetics-400	7*32	1*32(4)	0.814	0.877	0.844	0.855	279G	config	github
FasterVQA (✨ New!)	Kinetics-400	7*32	8*4(1)	0.811	0.874	0.837	0.864	69G	config	github
- zero-shot transfer to MT scale with AMI	Kinetics-400	7*32	4*4(1)	0.791	0.860	0.826	0.849	35G	config	Same as FasterVQA
- zero-shot transfer to MS scale with AMI	Kinetics-400	5*32	8*4(1)	0.798	0.849	0.818	0.854	36G	config	Same as FasterVQA
FAST-VQA-B-From-Scratch (✨ New!)	None	7*32	1*32(4)	0.707	0.791	0.766	0.793	279G	config	github
FAST-VQA-B-3D-From-Scratch (✨ New!)	None	7*32	8*4(1)	0.685	0.760	0.739	0.773	69G	config	github
FAST-VQA-M (ECCV2022)	Kinetics-400	4*32	1*32(4)	0.773	0.854	0.810	0.832	46G	config	github

Step 2: Download Corresponding Datasets

LSVQ: Github
KoNViD-1k: Official Site
LIVE-VQC: Official Site

Step 3: Run the following one-line script!

python new_test.py -o [YOUR_OPTIONS]

Training

Get Pretrained Weights from Recognition

You might need to download the original Swin-T Weights to initialize the model.

Train with large dataset (LSVQ)

To train FAST-VQA-B, please run

python new_train.py -o options/fast/fast-b.yml

To train FAST-VQA-M, please run

python new_train.py -o options/fast/fast-m.yml

To train FasterVQA (FAST-VQA-B-3D), please run

python new_train.py -o options/fast/f3dvqa-b.yml

Finetune on small datasets with provided weights

This training splits the dataset into 10 random train/test splits (with random seed 42) and reports the best result on the test portion of the random split.

python split_train.py -opt [YOUR_OPTION_FILE]

You may see option files in Finetune Config Files.
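The seeded splitting described above can be sketched as follows. This is a minimal illustration, not the code in `split_train.py`: the function name `make_random_splits` is hypothetical, and the 80/20 train/test ratio is an assumption, since the text only states the number of splits and the seed.

```python
import random


def make_random_splits(items, num_splits=10, train_ratio=0.8, seed=42):
    """Return num_splits (train, test) pairs drawn with one seeded random generator."""
    rng = random.Random(seed)  # fixed seed makes the splits reproducible
    splits = []
    for _ in range(num_splits):
        shuffled = list(items)
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_ratio)  # assumed 80/20 ratio
        splits.append((shuffled[:cut], shuffled[cut:]))
    return splits


# Example with 100 placeholder video indices.
splits = make_random_splits(range(100))
```

Because the generator is seeded once, rerunning the function yields the same 10 splits, which keeps finetuning results comparable across runs.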

Results for FAST-VQA-B:

	KoNViD-1k	CVD2014	LIVE-Qualcomm	LIVE-VQC	YouTube-UGC
SROCC	0.891	0.891	0.819	0.849	0.855
PLCC	0.892	0.903	0.851	0.862	0.852

Results for FasterVQA(FAST-VQA-B-3D):

	KoNViD-1k	CVD2014	LIVE-Qualcomm	LIVE-VQC	YouTube-UGC
SROCC	0.895	0.896	0.826	0.843	0.863
PLCC	0.898	0.904	0.843	0.858	0.859

Note that this part only supports FAST-VQA-B and FAST-VQA-B-3D (FasterVQA), but you may build your own option files for other variants.

Supported datasets are KoNViD-1k, LIVE_VQC, CVD2014, LIVE-Qualcomm, YouTube-UGC.
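For reference, the SROCC and PLCC metrics reported above are the Spearman rank-order and Pearson linear correlation coefficients between predicted scores and ground-truth labels. The following pure-Python sketch shows the standard definitions; it is not the repository's evaluation code, the function names are hypothetical, and it applies no rescaling or fitting to the scores before computing PLCC.

```python
import math


def pearson(xs, ys):
    """Pearson linear correlation coefficient (PLCC) of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank-order correlation coefficient (SROCC): Pearson on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))
```

SROCC depends only on the ordering of the predictions, while PLCC also reflects how linearly they track the labels.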

Citation

Please cite the following paper if you publish work that relates to it.

@article{wu2022fastquality,
  title={FAST-VQA: Efficient End-to-end Video Quality Assessment with Fragment Sampling},
  author={Wu, Haoning and Chen, Chaofeng and Hou, Jingwen and Liao, Liang and Wang, Annan and Sun, Wenxiu and Yan, Qiong and Lin, Weisi},
  journal={Proceedings of European Conference of Computer Vision (ECCV)},
  year={2022}
}

Please also cite this code library if you use it.

@misc{end2endvideoqualitytool,
  title = {Open Source Deep End-to-End Video Quality Assessment Toolbox},
  author = {Wu, Haoning},
  year = {2022},
  url = {http://github.com/timothyhtimothy/fast-vqa}
}
