Exciting paper! Thank you for doing this research and publishing it.
Do you want to share some insight on what type of compute is required for training LaVi-Bridge?
Since you've used around 2M text-image pairs to train this, it sounds like you'd need a cluster of GPUs to train this from scratch (please correct me if I'm wrong!). Is finetuning the adapter and LoRAs something that can be performed on a smaller, domain-specific dataset? I would be curious to know what kind of compute that would require.
Thanks!
Thank you for your interest in LaVi-Bridge! As mentioned in our paper, we use 8 A100 GPUs and train on around 1 million text-image pairs for less than 2 days, with a batch size of 256. It is therefore possible to train the model on fewer computational resources by reducing the batch size or by employing strategies such as mixed-precision training or gradient accumulation. We also report how performance evolves as training steps progress in the appendix of the paper, which you can refer to for further information.
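In case it helps anyone attempting this on a single smaller GPU: below is a minimal sketch of the gradient-accumulation plus mixed-precision pattern mentioned above. The tiny linear model, dummy data loader, and micro-batch size are illustrative placeholders, not LaVi-Bridge's actual training code; the point is only how accumulation recovers the effective batch size of 256 from small micro-batches.

```python
import torch
from torch.cuda.amp import GradScaler, autocast

# Placeholder model and data standing in for the real adapter/LoRA training
# loop (assumes a CUDA GPU is available).
model = torch.nn.Linear(768, 768).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = GradScaler()

target_batch_size = 256  # effective batch size from the paper
micro_batch_size = 16    # what fits in memory on a smaller GPU (assumption)
accum_steps = target_batch_size // micro_batch_size

# Dummy loader yielding micro-batches of (input, target) pairs.
loader = [(torch.randn(micro_batch_size, 768).cuda(),
           torch.randn(micro_batch_size, 768).cuda()) for _ in range(64)]

optimizer.zero_grad()
for step, (x, y) in enumerate(loader):
    with autocast():  # half-precision forward pass
        loss = torch.nn.functional.mse_loss(model(x), y)
    # Divide by accum_steps so accumulated gradients average over the
    # full effective batch, then scale to avoid fp16 underflow.
    scaler.scale(loss / accum_steps).backward()
    if (step + 1) % accum_steps == 0:
        scaler.step(optimizer)  # unscales gradients, then steps
        scaler.update()
        optimizer.zero_grad()
```

With this pattern the memory footprint is set by the micro-batch, while the optimizer still sees gradients equivalent to the full batch size of 256, at the cost of proportionally more forward/backward passes per update.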