Add finetuning scripts #7263

Merged
merged 8 commits into from
Aug 29, 2023

Conversation

nithinraok
Collaborator

@nithinraok nithinraok commented Aug 18, 2023

What does this PR do?

Adds a finetuning script, along with a config file, for finetuning existing NeMo models.

Collection: ASR

Changelog

  • Added speech_to_text_finetune.py script
  • Added speech_to_text_finetune.yaml config
  • Jenkins run
  • Doc strings

TODO:

  • Update tutorial

Usage

E.g., without changing the tokenizer:

 python speech_to_text_finetune.py \
 init_from_pretrained_model=stt_en_fastconformer_hybrid_large_pc \
 model.tokenizer.update_tokenizer=false

Or finetune with a new tokenizer (the architecture of the base model doesn't change):

 python speech_to_text_finetune.py  \
 init_from_pretrained_model=stt_en_fastconformer_hybrid_large_pc \
 model.tokenizer.update_tokenizer=true \
 model.tokenizer.dir=<path_to_tokenizer_dir> \
 model.tokenizer.type=<bpe/char>
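
The two invocations above differ only in their tokenizer overrides. A minimal sketch of that choice, written as a small Python helper that assembles the argument list, is shown below. The helper name `build_finetune_args` and the path `/path/to/tokenizer_dir` are hypothetical and used only for illustration; the script name and override keys are the ones quoted in this PR description.

```python
from typing import List, Optional


def build_finetune_args(
    pretrained_model: str,
    tokenizer_dir: Optional[str] = None,
    tokenizer_type: Optional[str] = None,
) -> List[str]:
    """Assemble the command line for the two usage modes shown above.

    Hypothetical helper: it only builds the argument list, it does not run NeMo.
    """
    args = [
        "python",
        "speech_to_text_finetune.py",
        f"init_from_pretrained_model={pretrained_model}",
    ]
    if tokenizer_dir is None:
        # Mode 1: keep the tokenizer of the pretrained model.
        args.append("model.tokenizer.update_tokenizer=false")
        return args
    # Mode 2: swap in a new tokenizer; the PR description lists bpe and char.
    if tokenizer_type not in ("bpe", "char"):
        raise ValueError("tokenizer_type must be 'bpe' or 'char' when tokenizer_dir is set")
    args += [
        "model.tokenizer.update_tokenizer=true",
        f"model.tokenizer.dir={tokenizer_dir}",
        f"model.tokenizer.type={tokenizer_type}",
    ]
    return args


keep_args = build_finetune_args("stt_en_fastconformer_hybrid_large_pc")
new_tok_args = build_finetune_args(
    "stt_en_fastconformer_hybrid_large_pc",
    tokenizer_dir="/path/to/tokenizer_dir",  # placeholder path, not from the PR
    tokenizer_type="bpe",
)
```

Either list could then be handed to `subprocess.run(args, check=True)` from the directory containing the script.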

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

@nithinraok nithinraok marked this pull request as ready for review August 18, 2023 04:51
@nithinraok nithinraok requested a review from titu1994 August 18, 2023 04:51
Collaborator

@titu1994 titu1994 left a comment

This is really fantastic work! I think with a bit of updating, we can make it a universal script that we can put under examples/asr, right next to the eval and transcription scripts.

Another thing: we need to add a link to this in the ASR CTC finetuning tutorial, plus add a section in the docs for finetuning and point to this.

name: "Speech_To_Text_Finetuning"

# use `init_from_nemo_model` or `init_from_pretrained_model` to initialize the model
# We do not currently support `init_from_ptl_ckpt` to create a single script for all types of models.
Collaborator

We actually could support it in a roundabout way: if given a ckpt file, load the PTL ckpt with torch.load and read the config embedded in it, which will have the target class path. We can then resolve that path and use the class to call load_from_checkpoint.

Collaborator Author

I thought about that too, but felt it's not beautiful. Maybe I can put this in ModelPT and use it here? That way these functions can be used elsewhere.

Collaborator

Agreed. Let's see if we can generalize it at the ASR level first. It takes a lot of time to load such a ckpt for LLMs.
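
The roundabout approach discussed in this thread could be sketched roughly as below. This is an illustration under stated assumptions, not the code merged in this PR: it assumes the loaded checkpoint is a dict whose embedded config sits under `hyper_parameters` -> `cfg` and stores the class path under a `target` key, and the helper names are hypothetical. To stay runnable without NeMo or a real checkpoint, a stand-in dict replaces the result of `torch.load`, and a standard-library class stands in for the model class.

```python
import importlib
from typing import Any, Mapping


def resolve_class(dotted_path: str) -> type:
    """Import 'package.module.ClassName' and return the class object."""
    module_name, _, class_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected a dotted path, got {dotted_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def target_path_from_checkpoint(ckpt: Mapping[str, Any]) -> str:
    """Read the target class path from the config embedded in the checkpoint.

    Assumption: the key layout is hyper_parameters -> cfg -> target.
    """
    return ckpt["hyper_parameters"]["cfg"]["target"]


# Stand-in for the dict that torch.load(ckpt_path, map_location="cpu") would return.
fake_ckpt = {"hyper_parameters": {"cfg": {"target": "collections.OrderedDict"}}}

model_cls = resolve_class(target_path_from_checkpoint(fake_ckpt))
# With a real checkpoint, the next step would be:
#   model = model_cls.load_from_checkpoint(ckpt_path)
```

Note that this reads the whole checkpoint once just to find the class and then again to restore the model, which is the loading cost raised above for large models.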

@nithinraok nithinraok force-pushed the add_finetuning_scripts branch from 2cf0dc2 to bf35b0b Compare August 28, 2023 22:46
Collaborator

@titu1994 titu1994 left a comment

Looks fantastic! One thing: we need to support char models.

Nithin Rao Koluguri added 8 commits August 28, 2023 23:09
Signed-off-by: Nithin Rao Koluguri <nithinraok>
@nithinraok nithinraok force-pushed the add_finetuning_scripts branch from bf35b0b to 719e7d0 Compare August 29, 2023 06:09
Collaborator

@titu1994 titu1994 left a comment

Looks great! Thanks!

@titu1994 titu1994 merged commit 68fea1a into main Aug 29, 2023
@titu1994 titu1994 deleted the add_finetuning_scripts branch August 29, 2023 08:08
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request Jun 25, 2024
* Add new script for finetuning asr models

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* Update config for PTL 2.0

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* style fix

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* update jenkins run

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* add doc strings

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* improve code to support all decoder types

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* add doc strings and support for char models

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* typo fix

Signed-off-by: Nithin Rao Koluguri <nithinraok>

---------

Signed-off-by: Nithin Rao Koluguri <nithinraok>
Co-authored-by: Nithin Rao Koluguri <nithinraok>
3 participants