diff --git a/tools/nemo_forced_aligner/README.md b/tools/nemo_forced_aligner/README.md
index ca6a1c688ca7..8259400b5afb 100644
--- a/tools/nemo_forced_aligner/README.md
+++ b/tools/nemo_forced_aligner/README.md
@@ -4,7 +4,9 @@
 Try it out: HuggingFace Space 🎤 | Tutorial: "How to use NFA?" 🚀 | Blog post: "How does forced alignment work?" 📚

- +

+ +

 NFA is a tool for generating token-, word- and segment-level timestamps of speech in audio using NeMo's CTC-based Automatic Speech Recognition models. You can provide your own reference text, or use ASR-generated transcription. You can use NeMo's ASR Model checkpoints out of the box in [14+ languages](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/asr/results.html#speech-recognition-languages), or train your own model. NFA can be used on long audio files of 1+ hours duration (subject to your hardware and the ASR model used).
@@ -20,8 +22,9 @@ NFA is a tool for generating token-, word- and segment-level timestamps of speec
 output_dir=
 ```
- - +

+ +

 ## Documentation
 More documentation is available [here](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/tools/nemo_forced_aligner.html).
\ No newline at end of file