small bugfix for r1.13.0 (#5310)
* typo fix

Signed-off-by: fayejf <[email protected]>

* update transcribe

Signed-off-by: fayejf <[email protected]>

Signed-off-by: fayejf <[email protected]>
fayejf authored Nov 4, 2022
1 parent 26e3e1d commit 42f6ac9
Showing 3 changed files with 7 additions and 6 deletions.
1 change: 1 addition & 0 deletions examples/asr/transcribe_speech.py
@@ -244,6 +244,7 @@ def autocast():
path2manifest=cfg.dataset_manifest,
batch_size=cfg.batch_size,
num_workers=cfg.num_workers,
+return_hypotheses=return_hypotheses,
)
else:
logging.warning(
8 changes: 4 additions & 4 deletions nemo/collections/asr/parts/utils/transcribe_utils.py
@@ -74,16 +74,16 @@ def transcribe_partial_audio(
lg = logits[idx][: logits_len[idx]]
hypotheses.append(lg.cpu().numpy())
else:
-current_hypotheses, _ = asr_model._wer.decoding.ctc_decoder_predictions_tensor(
-decoder_outputs=greedy_predictions,
-decoder_lengths=logits_len,
-return_hypotheses=return_hypotheses,
+current_hypotheses, all_hyp = asr_model.decoding.ctc_decoder_predictions_tensor(
+logits, decoder_lengths=logits_len, return_hypotheses=return_hypotheses,
)

if return_hypotheses:
# dump log probs per file
for idx in range(logits.shape[0]):
current_hypotheses[idx].y_sequence = logits[idx][: logits_len[idx]]
+if current_hypotheses[idx].alignments is None:
+current_hypotheses[idx].alignments = current_hypotheses[idx].y_sequence

hypotheses += current_hypotheses
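
Note on the hunk above: when `return_hypotheses` is set, each hypothesis gets its trimmed log probabilities as `y_sequence`, and `alignments` falls back to that same sequence only if the decoder left it unset. A minimal sketch of that fallback follows, under stated assumptions: the `Hypothesis` dataclass and the `attach_log_probs` helper are hypothetical stand-ins (not NeMo's classes), and plain nested lists replace the tensors used in the real code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hypothesis:
    # Hypothetical stand-in: models only the two fields the hunk touches.
    text: str
    y_sequence: Optional[List[List[float]]] = None
    alignments: Optional[List[List[float]]] = None


def attach_log_probs(hyps, logits, logits_len):
    """Store per-file log probs and fill in missing alignments.

    `logits` is a batch of [time][vocab] lists (an assumption; the diff
    uses tensors), and `logits_len` holds the valid length of each item.
    """
    for idx in range(len(logits)):
        # Trim padding frames so only the valid time steps are kept.
        hyps[idx].y_sequence = logits[idx][: logits_len[idx]]
        # Fall back to the log probs only when no alignment was produced.
        if hyps[idx].alignments is None:
            hyps[idx].alignments = hyps[idx].y_sequence
    return hyps
```

An alignment that the decoder already produced is left untouched; only the `None` case is overwritten.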

4 changes: 2 additions & 2 deletions tutorials/speaker_tasks/Speaker_Diarization_Training.ipynb
@@ -197,7 +197,7 @@
"\n",
"- Please skip this section and go directly to [Prepare Training data for MSDD](#Prepare-Training-data-for-MSDD) section if you have your own speaker diarization dataset. \n",
"\n",
-"In this tutorial, we use [NeMo Multispeaker Simulator](https://github.com/NVIDIA/NeMo/blob/main/tutorials/tools/Multispeaker_Simulator.ipynb) and the Librispeech corpus to generate a toy training dataset for demonstration purpose. You can replace the simulated dataset with your own datasets if you have proper speaker annotations (RTTM files) for the dataset. If you do not have access to any speaker diarization datasets, you can use NeMo [NeMo Multispeaker Simulator](https://github.com/NVIDIA/NeMo/blob/main/tutorials/tools/Multispeaker_Simulator.ipynb) by generating a good amount of data samples to meet your needs. \n",
+"In this tutorial, we use [NeMo Multispeaker Simulator](https://github.com/NVIDIA/NeMo/blob/main/tutorials/tools/Multispeaker_Simulator.ipynb) and the Librispeech corpus to generate a toy training dataset for demonstration purpose. You can replace the simulated dataset with your own datasets if you have proper speaker annotations (RTTM files) for the dataset. If you do not have access to any speaker diarization datasets, you can use [NeMo Multispeaker Simulator](https://github.com/NVIDIA/NeMo/blob/main/tutorials/tools/Multispeaker_Simulator.ipynb) by generating a good amount of data samples to meet your needs. \n",
"\n",
"For more details regarding data simulator, please follow the descriptions in [NeMo Multispeaker Simulator](https://github.com/NVIDIA/NeMo/blob/main/tutorials/tools/Multispeaker_Simulator.ipynb) and we will not cover configurations and detailed process of data simulation in this tutorial. \n"
]
@@ -599,7 +599,7 @@
"\n",
"Before we generate a manifest file and RTTM files for training MSDD, you have to determine:\n",
"\n",
-"- `window`: the windowl length of the base scale (the shortest scale)\n",
+"- `window`: the window length of the base scale (the shortest scale)\n",
"- `shift`: the hop-length of the base scale (the shortest scale)\n",
"- `step_count`: how many decision steps in one data sample\n",
"\n",