Commit
fix(whisper): fix whisper transcription of non-english audio (#1066)
* Ensures that the transcribed text is output in the original language of the audio.
* Adds more logging to the whisper backend.
* Removes English as the default language and uses automatic language detection instead.
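The language-detection change described above can be sketched roughly as follows. This is an illustration only, not the code from this commit: the changed source files are not shown on this page, so the sketch assumes the backend uses the `faster-whisper` library (consistent with the CTranslate2 conversion step in the Makefile). The helper names `build_transcribe_options` and `transcribe_file` are hypothetical.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def build_transcribe_options(language: Optional[str] = None,
                             task: str = "transcribe") -> dict:
    """Build keyword arguments for a Whisper-style transcribe call.

    Hypothetical helper: the language is only pinned when the caller
    supplies one. There is no hard-coded "en" default, so omitting it
    leaves language detection to the model.
    """
    options = {"task": task}
    if language:
        options["language"] = language
        logger.info("Using caller-supplied language: %s", language)
    else:
        logger.info("No language supplied; relying on automatic detection")
    return options


def transcribe_file(path: str, model_dir: str = ".model",
                    language: Optional[str] = None) -> str:
    """Transcribe an audio file (assumes faster-whisper is installed)."""
    # Imported lazily so the option helper above works without the library.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_dir, device="cpu", compute_type="float32")
    segments, info = model.transcribe(path, **build_transcribe_options(language))
    # Log what the model detected, in the spirit of the added logging.
    logger.info("Detected language %s (probability %.2f)",
                info.language, info.language_probability)
    return "".join(segment.text for segment in segments)
```

With this shape, a call such as `transcribe_file("audio.wav")` would let the model pick the language, while `transcribe_file("audio.wav", language="de")` would pin it explicitly.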
1 parent c4c7e9d, commit 8dd467a
Showing 5 changed files with 97 additions and 19 deletions.
@@ -1,7 +1,17 @@
MODEL_NAME ?= openai/whisper-base

install:
	python -m pip install ../../src/leapfrogai_sdk
	python -m pip install -e ".[dev]"

download-model:
	mkdir -p .model
	ct2-transformers-converter --model $(MODEL_NAME) \
		--output_dir .model \
		--copy_files tokenizer.json special_tokens_map.json preprocessor_config.json normalizer.json tokenizer_config.json vocab.json \
		--quantization float32 \
		--force

dev:
	make install
	python -m leapfrogai_sdk.cli --app-dir=src/ main:Model
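
For readers who want to script the `download-model` step outside of Make, the same conversion command can be assembled in Python. This is a hedged sketch, not part of the commit: the function names `build_converter_command` and `download_model` are hypothetical, and it assumes the `ct2-transformers-converter` CLI (installed with CTranslate2) is on the `PATH`. The flags and file list are copied from the Makefile target above.

```python
import os
import subprocess
from typing import List, Optional

# Files copied next to the converted model, as listed in the Makefile target.
COPY_FILES = [
    "tokenizer.json",
    "special_tokens_map.json",
    "preprocessor_config.json",
    "normalizer.json",
    "tokenizer_config.json",
    "vocab.json",
]


def build_converter_command(model_name: Optional[str] = None,
                            output_dir: str = ".model") -> List[str]:
    """Return the argv list equivalent to the `download-model` recipe."""
    # Mirrors `MODEL_NAME ?= openai/whisper-base`: an explicit argument wins,
    # then the environment variable, then the default.
    name = model_name or os.environ.get("MODEL_NAME", "openai/whisper-base")
    return [
        "ct2-transformers-converter",
        "--model", name,
        "--output_dir", output_dir,
        "--copy_files", *COPY_FILES,
        "--quantization", "float32",
        "--force",
    ]


def download_model(model_name: Optional[str] = None,
                   output_dir: str = ".model") -> None:
    """Create the output directory and run the converter."""
    os.makedirs(output_dir, exist_ok=True)  # same effect as `mkdir -p .model`
    subprocess.run(build_converter_command(model_name, output_dir), check=True)
```

Passing the arguments as a list avoids shell quoting issues, and `check=True` raises an error if the conversion fails, which is roughly how Make would stop on a failed recipe line.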
Binary file not shown.