Add finetuning scripts #7263
Conversation
This is really fantastic work! I think with a bit of updating, we can make it a universal script that we can put under examples/asr, right next to the eval and transcription scripts.
Another thing: we need to add a link to this in the ASR CTC finetuning tutorial, plus add a section in the docs for finetuning and point to this.
name: "Speech_To_Text_Finetuning" | ||
|
||
# use `init_from_nemo_model` or `init_from_pretrained_model` to initialize the model | ||
# We do not currently support `init_from_ptl_ckpt` to create a single script for all types of models. |
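For context, a minimal sketch of how those two init options might look in the config (the null defaults and the example values are assumptions, not the PR's actual file):

```yaml
# Set exactly one of these; leave the other as null.
init_from_nemo_model: null          # e.g. /path/to/base_model.nemo
init_from_pretrained_model: null    # e.g. "stt_en_conformer_ctc_large"
```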
We actually could support it in a roundabout way: for a ckpt file, load the PTL ckpt with torch.load, read the config embedded in it; that will have the target class path, so we can resolve it and use that class to call load_from_checkpoint.
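A minimal sketch of that roundabout approach (the checkpoint key names and the location of the `target` field are assumptions based on how PTL and NeMo typically embed the config):

```python
import importlib

import torch

# Hypothetical sketch: resolve the model class recorded inside a PTL checkpoint.
ckpt = torch.load("model.ckpt", map_location="cpu")

# PTL stores saved hyperparameters under "hyper_parameters"; NeMo configs
# carry a "target" entry with the fully qualified class path (assumed layout).
cfg = ckpt["hyper_parameters"]["cfg"]
target = cfg["target"]  # e.g. "nemo.collections.asr.models.EncDecCTCModelBPE"

module_name, class_name = target.rsplit(".", 1)
model_cls = getattr(importlib.import_module(module_name), class_name)

# Use the resolved class to restore the model from the same checkpoint.
model = model_cls.load_from_checkpoint("model.ckpt")
```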
I thought about that too, but felt it's not beautiful. Maybe I can put this in ModelPT and use it here? That way these functions can be used elsewhere.
Agreed. Let's see if we can generalize it at the ASR level first. It takes a lot of time to load such ckpts for LLMs.
Force-pushed from 2cf0dc2 to bf35b0b
Looks fantastic! One thing: we need to support char models.
Force-pushed from bf35b0b to 719e7d0
Looks great! Thanks!
* Add new script for finetuning asr models
* Update config for PTL 2.0
* style fix
* update jenkins run
* add doc strings
* improve code to support all decoder types
* add doc strings and support for char models
* typo fix

Signed-off-by: Nithin Rao Koluguri <nithinraok>
Co-authored-by: Nithin Rao Koluguri <nithinraok>
What does this PR do?
Adds a finetuning script along with a config file for finetuning existing NeMo models.
Collection: ASR
Changelog
TODO:
Usage
E.g., without changing the tokenizer:
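A hypothetical invocation (the script path, config name, and pretrained model name are assumptions; the PR's original command was not preserved here):

```bash
python examples/asr/speech_to_text_finetune.py \
    --config-path=conf --config-name=speech_to_text_finetune \
    init_from_pretrained_model="stt_en_conformer_ctc_large" \
    model.train_ds.manifest_filepath=<path to train manifest> \
    model.validation_ds.manifest_filepath=<path to val manifest>
```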
or
finetune with a new tokenizer (the architecture of the base model doesn't change):
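Again a sketch; the tokenizer override keys are assumptions based on typical NeMo config layout:

```bash
python examples/asr/speech_to_text_finetune.py \
    --config-path=conf --config-name=speech_to_text_finetune \
    init_from_nemo_model=<path to base .nemo model> \
    model.tokenizer.update_tokenizer=True \
    model.tokenizer.dir=<path to new tokenizer dir> \
    model.tokenizer.type=bpe \
    model.train_ds.manifest_filepath=<path to train manifest> \
    model.validation_ds.manifest_filepath=<path to val manifest>
```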
Before your PR is "Ready for review"
Pre checks:
PR Type: