Training details #1

Open
lumpidu opened this issue Sep 17, 2024 · 2 comments

Comments

@lumpidu

lumpidu commented Sep 17, 2024

Hi, very interesting paper!

When you publish the training scripts, could you also add some intuition about the training procedure and share your training details, such as the GPUs used, the number of steps, and the memory requirements?

Thanks in advance!

@yl4579
Owner

yl4579 commented Sep 17, 2024

Thanks for your interest in this work! I'm very busy right now writing another paper and preparing for job hunting and graduation, but I have included all the information needed for training in the Model Training section of the paper. I ran the training in Jupyter Notebook again, so the code is pretty messy, but I'll share it once it's cleaned up.

Cleaning the code can take some time, especially for the librilight dataset. The big model took a month to train on my lab's GPUs, although some experiments were conducted on H100 GPUs during my internship, which made training much faster. If anyone is willing to provide computation resources to debug/clean the code on large-scale models, feel free to email me at [email protected]. Please also email me if you want to help debug/clean the code.

@yl4579
Owner

yl4579 commented Sep 18, 2024

I have received many emails in less than a day. Thank you very much! However, it is difficult to coordinate the work with each person individually over email, so I have created a Discord server for that purpose. Please join the Discord server if you are willing to help :)


2 participants