
About the training of CUB200 dataset #41

Open
DZY-irene opened this issue Jan 9, 2024 · 5 comments

Comments

@DZY-irene

Regarding training on CUB200: I followed all the parameter settings of the source code. I first loaded cc_learned.pth and started training; according to the paper, this should be the VQ-Diffusion-F model. After training for 300 epochs I observed that the validation loss had almost stabilized, so I tested the checkpoint from epoch 299. But the test result is very poor, with an FID of 28. I am curious why this is the case.

My test command is:

VQ_Diffusion_model.inference_generate_sample_with_condition(
    data,
    truncation_rate=1.0,
    save_root="pre/ep299_tr1_gs5",
    batch_size=1,
    guidance_scale=5.0,
)
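For anyone comparing settings, the call above can be wrapped in a small grid sweep over the two sampling hyperparameters. This is a minimal sketch: the method name and keyword arguments are taken from the command above, whether that signature matches your checkout is an assumption, and the grid values below are illustrative, not recommendations.

```python
from itertools import product

def sweep_inference(model, data, truncation_rates, guidance_scales, root="pre"):
    """Call the repo's sampling entry point over a grid of
    (truncation_rate, guidance_scale) pairs, saving each run under a
    distinct directory so the FIDs can be compared afterwards."""
    for tr, gs in product(truncation_rates, guidance_scales):
        model.inference_generate_sample_with_condition(
            data,
            truncation_rate=tr,
            save_root=f"{root}/tr{tr}_gs{gs}",  # one folder per setting
            batch_size=1,
            guidance_scale=gs,
        )
```

This would help isolate whether the poor FID is sensitive to the sampler settings (e.g. the fairly high guidance_scale=5.0) rather than to the training itself.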

Here is the tensorboard and the visualization of test result:

[screenshots: tensorboard loss curves; generated test samples]

I also tried training VQ-Diffusion-B, which has no pretrained model, but the results were even worse.

Does anyone encounter the same problem?

@623851394

I ran into this problem too. The images I trained with VQ-Diffusion-B on CUB look very abstract, while the faces I trained on CelebA-HQ are acceptable. I don't know how CUB is supposed to be trained — how did you train to get your results?

@DZY-irene
Author


Mine actually counts as VQ-Diffusion-F: it is the result of roughly 300 epochs of training starting from the pretrained model, but the test FID is about 10 points higher than in the paper. I have also run VQ-Diffusion-B, and the results were even worse. I personally suspect CUB is too small a dataset, so training from scratch gives poor results. But it is puzzling that the result is still this bad even with the pretrained model. I plan to try COCO next to see how much the amount of data matters.

@pangPhD

pangPhD commented May 7, 2024


Where is this file: OUTPUT/pretrained_model/taming_dvae/taming_f8_8192_openimages_last.pth? I have searched for a long time without finding it. Thanks!

@pangPhD

pangPhD commented May 7, 2024


Do you have this file: OUTPUT/pretrained_model/taming_dvae/taming_f8_8192_openimages_last.pth? I couldn't find where it is. Thanks!

@623851394

@pangPhD You need to download it via the pretrain.sh he provides:
if [ -f ithq_vqvae.pth ]; then
    echo "ithq_vqvae.pth exists"
else
    echo "Downloading ithq_vqvae.pth"
    wget https://github.com/tzco/storage/releases/download/vqdiffusion/ithq_vqvae.pth
fi

if [ -f taming_f8_8192_openimages_last.pth ]; then
    echo "taming_f8_8192_openimages_last.pth exists"
else
    echo "Downloading taming_f8_8192_openimages_last.pth"
    wget https://github.com/tzco/storage/releases/download/vqdiffusion/taming_f8_8192_openimages_last.pth
fi

if [ -f vqgan_imagenet_f16_16384.pth ]; then
    echo "vqgan_imagenet_f16_16384.pth exists"
else
    echo "Downloading vqgan_imagenet_f16_16384.pth"
    wget https://github.com/tzco/storage/releases/download/vqdiffusion/vqgan_imagenet_f16_16384.pth
fi

if [ -f ViT-B-32.pt ]; then
    echo "ViT-B-32.pt exists"
else
    echo "Downloading ViT-B-32.pt"
    wget https://github.com/tzco/storage/releases/download/vqdiffusion/ViT-B-32.pt
fi
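The download-if-missing pattern above can also be sketched in Python, which avoids depending on wget. The URL is copied from pretrain.sh; placing the file under OUTPUT/pretrained_model/taming_dvae/ follows the path quoted earlier in this thread and is an assumption about the repo layout.

```python
import os
import urllib.request

# URL -> local target path. URL copied from pretrain.sh above; the
# OUTPUT/... prefix is an assumption based on the path cited in this thread.
ASSETS = {
    "https://github.com/tzco/storage/releases/download/vqdiffusion/"
    "taming_f8_8192_openimages_last.pth":
        "OUTPUT/pretrained_model/taming_dvae/taming_f8_8192_openimages_last.pth",
}

def fetch(url, path):
    """Download url to path unless it already exists; return True if downloaded."""
    if os.path.exists(path):
        print(f"{os.path.basename(path)} exists")
        return False
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    print(f"Downloading {os.path.basename(path)}")
    urllib.request.urlretrieve(url, path)
    return True

# Uncomment to actually download:
# for url, path in ASSETS.items():
#     fetch(url, path)
```

Like the shell script, the helper is idempotent: re-running it skips files that are already present.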
