Dataset Preparation

We follow VINDLU to prepare the datasets, but we DO NOT compress the videos and images. Instead, we use the original data and load the annotations directly from the JSON files, because SQLite has communication problems in our environment.

⚠️ If you do not have enough resources, we suggest you follow the preprocessing of VINDLU.

🏷️ We use the same JSON files provided by VINDLU. However, since some videos are missing in large-scale datasets (like CC3M, CC12M and WebVid10M), we filter out those unavailable videos.
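The filtering step could look roughly like the sketch below. It is not the script used for this repo. It assumes each annotation file is a JSON list of dicts, and that each entry stores the media path relative to a data root under a key such as `video` (the key name, the function names and the paths are all hypothetical; adjust them to the actual annotation format).

```python
import json
from pathlib import Path


def filter_missing_media(entries, media_root, key="video"):
    """Keep only entries whose media file exists under media_root.

    Assumption: each entry is a dict and entry[key] is a path
    relative to media_root. Entries without the key are dropped.
    """
    root = Path(media_root)
    return [e for e in entries if key in e and (root / e[key]).is_file()]


def filter_annotation_file(src_json, dst_json, media_root, key="video"):
    """Read an annotation list, drop unavailable items, write the result.

    Returns (number of original entries, number of kept entries).
    """
    with open(src_json, "r", encoding="utf-8") as f:
        entries = json.load(f)

    kept = filter_missing_media(entries, media_root, key)

    with open(dst_json, "w", encoding="utf-8") as f:
        json.dump(kept, f)

    return len(entries), len(kept)
```

For an image dataset such as CC3M, the same function could be called with `key="image"` if that is how the entries name the path. Writing the filtered list to a new file keeps the original VINDLU JSON untouched, so the filter can be rerun after more data has been downloaded.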

Pretraining

Video-Text Retrieval and Video Question Answering