Exciting paper! Thank you for doing this research and publishing it.
Do you want to share some insight on what type of compute is required for training LaVi-Bridge?
Since you've used around 2M text-image pairs to train this, it sounds like you'd need a cluster of GPUs to train this from scratch (please correct me if I'm wrong!). Is finetuning the adapter and LoRAs something that can be performed on a smaller, domain-specific dataset? I would be curious to know what kind of compute that would require.
Thanks!
Thank you for your interest in LaVi-Bridge! As mentioned in our paper, we use 8 A100 GPUs and train on around 1 million text-image pairs for less than 2 days, with a batch size of 256. It is therefore possible to train the model on fewer computational resources by reducing the batch size or by employing strategies such as mixed-precision training or gradient accumulation. We also report how performance evolves as training steps progress in the appendix of the paper, which you can refer to for further information.
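In case it helps anyone attempting this on a single smaller GPU: below is a minimal sketch of the gradient-accumulation plus mixed-precision pattern mentioned above. The tiny linear model, dummy data loader, and micro-batch size are illustrative placeholders, not LaVi-Bridge's actual training code; the point is only how accumulation recovers the effective batch size of 256 from small micro-batches.

```python
import torch
from torch.cuda.amp import GradScaler, autocast

# Placeholder model and data standing in for the real adapter/LoRA training
# loop (assumes a CUDA GPU is available).
model = torch.nn.Linear(768, 768).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = GradScaler()

target_batch_size = 256  # effective batch size from the paper
micro_batch_size = 16    # what fits in memory on a smaller GPU (assumption)
accum_steps = target_batch_size // micro_batch_size

# Dummy loader yielding micro-batches of (input, target) pairs.
loader = [(torch.randn(micro_batch_size, 768).cuda(),
           torch.randn(micro_batch_size, 768).cuda()) for _ in range(64)]

optimizer.zero_grad()
for step, (x, y) in enumerate(loader):
    with autocast():  # half-precision forward pass
        loss = torch.nn.functional.mse_loss(model(x), y)
    # Divide by accum_steps so accumulated gradients average over the
    # full effective batch, then scale to avoid fp16 underflow.
    scaler.scale(loss / accum_steps).backward()
    if (step + 1) % accum_steps == 0:
        scaler.step(optimizer)  # unscales gradients, then steps
        scaler.update()
        optimizer.zero_grad()
```

With this pattern the memory footprint is set by the micro-batch, while the optimizer still sees gradients equivalent to the full batch size of 256, at the cost of proportionally more forward/backward passes per update.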