Add finetuning scripts #7263
Conversation
This is really fantastic work! I think with a bit of updating, we can make it a universal script that we can put under examples/asr, right next to the eval and transcription scripts.
Another thing: we need to add a link to this in the ASR CTC finetuning tutorial, plus add a section in the docs for finetuning and point to this.
name: "Speech_To_Text_Finetuning" | ||
|
||
# use `init_from_nemo_model` or `init_from_pretrained_model` to initialize the model | ||
# We do not currently support `init_from_ptl_ckpt` to create a single script for all types of models. |
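For context, a minimal sketch of how those two init options might look in the config (the null defaults and the example values are assumptions, not the PR's actual file):

```yaml
# Set exactly one of these; leave the other as null.
init_from_nemo_model: null          # e.g. /path/to/base_model.nemo
init_from_pretrained_model: null    # e.g. "stt_en_conformer_ctc_large"
```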
We actually could support it in a roundabout way: for a ckpt file, load the PTL ckpt with torch.load, read the config embedded in it; that will have the target class path, so we can resolve it and use that class to call load_from_checkpoint.
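A minimal sketch of that roundabout approach (the checkpoint key names and the location of the `target` field are assumptions based on how PTL and NeMo typically embed the config):

```python
import importlib

import torch

# Hypothetical sketch: resolve the model class recorded inside a PTL checkpoint.
ckpt = torch.load("model.ckpt", map_location="cpu")

# PTL stores saved hyperparameters under "hyper_parameters"; NeMo configs
# carry a "target" entry with the fully qualified class path (assumed layout).
cfg = ckpt["hyper_parameters"]["cfg"]
target = cfg["target"]  # e.g. "nemo.collections.asr.models.EncDecCTCModelBPE"

module_name, class_name = target.rsplit(".", 1)
model_cls = getattr(importlib.import_module(module_name), class_name)

# Use the resolved class to restore the model from the same checkpoint.
model = model_cls.load_from_checkpoint("model.ckpt")
```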
I thought about that too, but felt it's not beautiful. Maybe I can put this in ModelPT and use it here? That way these functions can be used elsewhere.
Agreed. Let's see if we can generalize it at the ASR level first. It takes a lot of time to load such ckpts for LLMs.
Force-pushed from 2cf0dc2 to bf35b0b
Looks fantastic! One thing: we need to support char models.
Force-pushed from bf35b0b to 719e7d0
Looks great! Thanks!
* Add new script for finetuning asr models
* Update config for PTL 2.0
* style fix
* update jenkins run
* add doc strings
* improve code to support all decoder types
* add doc strings and support for char models
* typo fix

Signed-off-by: Nithin Rao Koluguri <nithinraok>
Co-authored-by: Nithin Rao Koluguri <nithinraok>
What does this PR do?
Adds a finetuning script along with a config file for finetuning existing NeMo models.
Collection: ASR
Changelog
TODO:
Usage
E.g., without changing the tokenizer:
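A hypothetical invocation (the script path, config name, and pretrained model name are assumptions; the PR's original command was not preserved here):

```bash
python examples/asr/speech_to_text_finetune.py \
    --config-path=conf --config-name=speech_to_text_finetune \
    init_from_pretrained_model="stt_en_conformer_ctc_large" \
    model.train_ds.manifest_filepath=<path to train manifest> \
    model.validation_ds.manifest_filepath=<path to val manifest>
```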
or
finetune with a new tokenizer (the architecture of the base model doesn't change):
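Again a sketch; the tokenizer override keys are assumptions based on typical NeMo config layout:

```bash
python examples/asr/speech_to_text_finetune.py \
    --config-path=conf --config-name=speech_to_text_finetune \
    init_from_nemo_model=<path to base .nemo model> \
    model.tokenizer.update_tokenizer=True \
    model.tokenizer.dir=<path to new tokenizer dir> \
    model.tokenizer.type=bpe \
    model.train_ds.manifest_filepath=<path to train manifest> \
    model.validation_ds.manifest_filepath=<path to val manifest>
```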
Before your PR is "Ready for review"
Pre checks:
PR Type: