adding whisper large peft+int8 training example #95
Conversation
This looks amazing! 🔥 Looks excellent to me!
sayakpaul commented on 2023-02-16T10:25:22Z (via ReviewNB): Add a small introduction section and club the code cells before
sayakpaul commented on 2023-02-16T10:25:23Z (via ReviewNB): Better to define
Also, for models that include multiple modalities like this one, we usually maintain a standalone
sayakpaul commented on 2023-02-16T10:25:25Z (via ReviewNB): Line #7. class DataCollatorSpeechSeq2SeqWithPadding:
Maybe this could later go into
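(For context, the data collator referenced in this comment follows the padding pattern used in common Whisper fine-tuning recipes. The sketch below is illustrative and may differ in detail from the code in the notebook under review.)

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Union

import torch


@dataclass
class DataCollatorSpeechSeq2SeqWithPadding:
    """Pads log-mel input features and tokenized labels into a uniform batch."""

    processor: Any  # a WhisperProcessor bundling feature extractor + tokenizer

    def __call__(
        self, features: List[Dict[str, Union[List[int], torch.Tensor]]]
    ) -> Dict[str, torch.Tensor]:
        # Pad the audio inputs with the feature extractor.
        input_features = [{"input_features": f["input_features"]} for f in features]
        batch = self.processor.feature_extractor.pad(input_features, return_tensors="pt")

        # Pad the labels with the tokenizer and replace padding with -100
        # so that it is ignored by the loss.
        label_features = [{"input_ids": f["labels"]} for f in features]
        labels_batch = self.processor.tokenizer.pad(label_features, return_tensors="pt")
        labels = labels_batch["input_ids"].masked_fill(
            labels_batch.attention_mask.ne(1), -100
        )

        # If the tokenizer already prepended the BOS token, drop it here;
        # it gets added back during training.
        if (labels[:, 0] == self.processor.tokenizer.bos_token_id).all().cpu().item():
            labels = labels[:, 1:]

        batch["labels"] = labels
        return batch
```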
sayakpaul commented on 2023-02-16T10:25:27Z (via ReviewNB): 🔥
sayakpaul commented on 2023-02-16T10:25:28Z (via ReviewNB): Maybe add a sentence drawing the reader's attention to the fact that we're ONLY training 1% of the total model params.
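(As a rough illustration of where that 1% figure comes from: the base model is loaded in 8-bit and frozen, and LoRA adapters are attached only to the attention projections, so only a small fraction of weights remains trainable. The hyperparameters below are assumptions for the sketch and may not match the notebook exactly.)

```python
from transformers import WhisperForConditionalGeneration
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

# Load Whisper-large-v2 in 8-bit via bitsandbytes; the base weights stay frozen.
model = WhisperForConditionalGeneration.from_pretrained(
    "openai/whisper-large-v2", load_in_8bit=True, device_map="auto"
)
model = prepare_model_for_int8_training(model)

# Attach LoRA adapters only to the attention query/value projections.
config = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
)
model = get_peft_model(model, config)

# Prints trainable vs. total parameter counts; for a setup like this
# the trainable share comes out to roughly 1% of the full model.
model.print_trainable_parameters()
```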
sayakpaul commented on 2023-02-16T10:25:29Z (via ReviewNB): Put Seq2SeqTrainingArguments in
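(For reference, a sketch of how Seq2SeqTrainingArguments and Seq2SeqTrainer are typically wired up in an int8 + PEFT run. The values are illustrative, and names such as common_voice, data_collator, and processor are placeholders rather than the notebook's exact code.)

```python
from transformers import Seq2SeqTrainingArguments, Seq2SeqTrainer

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-large-v2-lora-int8",  # hypothetical output path
    per_device_train_batch_size=8,
    gradient_accumulation_steps=1,
    learning_rate=1e-3,
    warmup_steps=50,
    num_train_epochs=3,
    fp16=True,
    per_device_eval_batch_size=8,
    generation_max_length=128,
    logging_steps=25,
    # Both of these are needed because the PeftModel forward signature
    # differs from the wrapped model's forward.
    remove_unused_columns=False,
    label_names=["labels"],
)

trainer = Seq2SeqTrainer(
    args=training_args,
    model=model,                          # the PEFT-wrapped model from the previous sketch
    train_dataset=common_voice["train"],  # hypothetical preprocessed dataset
    eval_dataset=common_voice["test"],
    data_collator=data_collator,          # the padding collator sketched earlier
    tokenizer=processor.feature_extractor,
)
```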
You did very good work! 🔥
Maybe just format the code with something like jupyter-black so that the code reads more beautifully?
What does this PR do?