This repo contains a guide for preparing the RealEstate10K dataset to train our camera-controllable diffusion model, CamI2V.
wget https://storage.cloud.google.com/realestate10k-public-files/RealEstate10K.tar.gz
tar -xvzf RealEstate10K.tar.gz -C datasets
mkdir -p datasets/RealEstate10K/pose_files
mv datasets/RealEstate10K/test datasets/RealEstate10K/pose_files/
mv datasets/RealEstate10K/train datasets/RealEstate10K/pose_files/
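The directory shuffle above can also be done from Python, which is convenient when the step has to be re-run safely. This is a minimal sketch, not part of the repo: the helper name `arrange_pose_files` is hypothetical, and it assumes the archive was extracted so that `test` and `train` sit directly under `datasets/RealEstate10K`.

```python
import shutil
from pathlib import Path


def arrange_pose_files(root, splits=("test", "train")):
    """Move <root>/<split> into <root>/pose_files/<split> for each split.

    Hypothetical helper mirroring the mkdir/mv commands above.
    Splits that are missing or already moved are skipped, so it can be re-run.
    """
    root = Path(root)
    pose_dir = root / "pose_files"
    pose_dir.mkdir(parents=True, exist_ok=True)  # equivalent of mkdir -p
    for split in splits:
        src = root / split
        dst = pose_dir / split
        if src.is_dir() and not dst.exists():
            shutil.move(str(src), str(dst))
    return pose_dir


# Example call (path taken from the commands above):
# arrange_pose_files("datasets/RealEstate10K")
```

Unlike the plain mv commands, a second call does nothing if the folders are already in place.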
You may need to run pip install pytubefix before running the download script below. By default, it tries to download the highest available resolution; you can change this behaviour at line 103.
python datasets/utils/generate_dataset.py --split "test"
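To illustrate the "highest resolution if available" behaviour mentioned above, here is a small standalone sketch of picking the best resolution from a list of labels, with an optional cap. It is not the code at line 103 of the script: the function name `pick_resolution` and the `max_height` parameter are hypothetical, and the labels are assumed to look like "720p".

```python
import re


def pick_resolution(available, max_height=None):
    """Return the highest resolution label, optionally capped at max_height.

    `available` is a list of labels such as ["360p", "720p", "1080p"]
    (assumed format). Returns None if no label satisfies the cap.
    """
    by_height = {}
    for label in available:
        match = re.match(r"(\d+)p", label)
        if match:
            by_height[int(match.group(1))] = label
    candidates = [h for h in by_height if max_height is None or h <= max_height]
    return by_height[max(candidates)] if candidates else None


# Default: take the best one; with a cap: take the best one not above it.
best = pick_resolution(["360p", "1080p", "720p"])
capped = pick_resolution(["360p", "1080p", "720p"], max_height=720)
```

Capping the height is one way to trade video quality for download time and disk space.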
python datasets/utils/gather_realestate.py --split "test"
python datasets/utils/get_realestate_clips.py --split "test"
We use caption annotations generated by CameraCtrl. Please download the 2 JSON files and put them under the RealEstate10K folder.
python datasets/utils/preprocess_realestate.py --split "test"
bash datasets/preprocess.sh "test"
The final file structure should look like this:
─┬─ datasets\
 └─┬─ RealEstate10K\
   ├─┬─ pose_files\
   │ └─── test\
   ├─┬─ valid_meta\
   │ └─── test\
   ├─┬─ video_clips\
   │ └─── test\
   ├─┬─ videos\
   │ └─── test\
   ├─── test_captions.json
   ├─── test_video2clip.json
   ├─── test_list_data.pkl
   └─── test_valid_list.txt
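The layout above can be checked with a short script before starting training. This is a sketch under stated assumptions: the helper name `missing_entries` is hypothetical, and it assumes that other splits follow the same naming with the split name as prefix (for example `train_captions.json`), which the tree above only shows for `test`.

```python
from pathlib import Path

# Names taken from the tree above.
SPLIT_DIRS = ("pose_files", "valid_meta", "video_clips", "videos")
SPLIT_FILE_SUFFIXES = ("captions.json", "video2clip.json", "list_data.pkl", "valid_list.txt")


def missing_entries(root, split="test"):
    """Return the expected paths (relative to root) that do not exist yet."""
    root = Path(root)
    expected_dirs = [root / name / split for name in SPLIT_DIRS]
    expected_files = [root / f"{split}_{suffix}" for suffix in SPLIT_FILE_SUFFIXES]
    missing = [p for p in expected_dirs if not p.is_dir()]
    missing += [p for p in expected_files if not p.is_file()]
    return [p.relative_to(root).as_posix() for p in missing]


# Example call (path taken from the tree above):
# print(missing_entries("datasets/RealEstate10K", split="test"))
```

An empty list means every folder and file from the tree is present for that split.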
The preprocessing for the train split is the same as for the test split.