Training with custom dataset #4

Open
abhigoku10 opened this issue Nov 25, 2023 · 2 comments

Comments

@abhigoku10

@ArcherFMY Thanks a lot for sharing the code base. I have a couple of questions:

  1. Is the training pipeline shared? If not, could you please share it?
  2. Can we train on a custom dataset for a different application? If so, how much data is needed to get decent output?
  3. Can the generated images be more photorealistic? If so, what would need to be done?
    Thanks in advance.
@ArcherFMY
Owner


Hi,

We use the DreamBooth training script in diffusers (https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth.py) to train the model. You can follow the instructions in that repo to train on your custom dataset.

The 'instance_prompt' we used is '<360panorama>', as can be found here (https://github.com/ArcherFMY/SD-T2I-360PanoImage/blob/main/txt2panoimg/text_to_360panorama_image_pipeline.py#L150). The training image resolution can be set to w=1024 and h=512 (just resize). We use one A100 and set 'train_batch_size=8' and 'learning_rate=1e-6'. 20,000 to 30,000 steps should be enough.
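For anyone wanting to reproduce this, the settings above could be turned into a launch command roughly as follows. This is only a sketch: the model path, data directory and output directory are placeholders that do not come from this thread, and launching through `accelerate` is the convention from the diffusers DreamBooth README rather than something confirmed here. As far as I can tell, the stock script's `--resolution` flag takes a single integer and crops to a square, so the flag is left out below; how the 1024x512 size was passed is not stated in the reply.

```python
import shlex


def build_dreambooth_command(model_path, data_dir, output_dir, max_train_steps=20000):
    """Build an argv list for diffusers' train_dreambooth.py.

    The prompt, batch size and learning rate are the values quoted in the
    reply above; every path argument is a placeholder (assumption).
    """
    return [
        "accelerate", "launch", "train_dreambooth.py",
        "--pretrained_model_name_or_path", model_path,  # placeholder base model
        "--instance_data_dir", data_dir,                # placeholder image folder
        "--output_dir", output_dir,                     # placeholder output folder
        "--instance_prompt", "<360panorama>",
        "--train_batch_size", "8",
        "--learning_rate", "1e-6",
        "--max_train_steps", str(max_train_steps),      # reply suggests 20,000 to 30,000
    ]


cmd = build_dreambooth_command(
    "path/to/base-model", "data/panoramas", "out/pano-dreambooth"
)
# Print a copy-pasteable shell command; the prompt gets quoted because of < and >.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The command is only printed, not executed, so it can be checked before starting a long run on the GPU.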

In our experience, for high-quality image generation the quality of the training images matters more than the quantity. For text-to-360panoimage, our training dataset contains about 2,000 images (we use data augmentation, such as gradually stitching the right part of the image onto the left, and end up with 20,000 images). All images are 4K resolution and were carefully selected, removing complex scenes with complex textures.
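One way to read the "stitching the right part to the left part" augmentation is as a horizontal circular shift of the panorama, which keeps the 360° wrap-around intact. The sketch below is an interpretation, not the authors' code: the function names are made up for illustration, the image is a plain nested list of pixels, and the default of 10 evenly spaced shifts is only inferred from the 2,000 to 20,000 figure.

```python
def roll_horizontally(image, shift):
    """Move the rightmost `shift` columns of every row to the left edge."""
    width = len(image[0])
    shift %= width
    split = width - shift
    return [list(row[split:]) + list(row[:split]) for row in image]


def augment_panorama(image, copies=10):
    """Return `copies` views of one panorama, each shifted by an equal step.

    The first view (shift 0) is an unmodified copy of the input.
    """
    width = len(image[0])
    return [roll_horizontally(image, (i * width) // copies) for i in range(copies)]


# Tiny 2x4 "image" so the effect is easy to see.
tiny = [[0, 1, 2, 3],
        [4, 5, 6, 7]]
views = augment_panorama(tiny, copies=4)
print(views[1])  # [[3, 0, 1, 2], [7, 4, 5, 6]]
```

With NumPy arrays of shape (height, width, channels), `np.roll(image, shift, axis=1)` performs the same shift.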

The results generated by our base model are close to photorealistic. The final high-resolution images look a little artificial, mainly because of the GAN-based super-resolution model (RealESRGAN). You can try other super-resolution models, or run the image-to-image-with-controlnet step again using base models with a different style.

@abhigoku10
Author

@ArcherFMY Thanks for the response. I was able to train the repo for the text2panorama application. What is the training process for image2panorama? Could you please share the steps?
