A simplified application of ASR models for transcribing real-time streams.
Current version: 1.8
OS: Windows
Currently supported models: Faster-Whisper large-v2, Faster-Whisper large-v3
DISCLAIMER: This is a simple application of zero-shot ASR models, and its translations should not be trusted completely due to the inherent inaccuracies of ASR. Take all translations with a grain of salt; they should serve only as a secondary aid to help understand the context of streams, videos, or anime. Please avoid using the translations as evidence that could harm any streamer or content creator, as the model tends to hallucinate and can produce misinformation.
- Python 3.12
- numpy
- PyTorch with CUDA 12 @ https://pytorch.org/get-started/locally/ (Note: currently torchaudio <= 2.3.1, as ctranslate2 does not support cuDNN 9)
- virtual audio cable @ https://vb-audio.com/Cable/
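The virtual audio cable routes the stream's playback audio into a recording device the app can capture from. The app's internals are not reproduced here, but a real-time pipeline of this kind typically accumulates incoming audio frames into fixed-length windows before handing each window to the ASR model. A minimal sketch of such a chunker (function and parameter names are illustrative, not taken from the app):

```python
import numpy as np

def chunk_stream(frames, samplerate=16000, window_s=5.0):
    """Accumulate variable-sized audio frames into fixed-length windows.

    `frames` is any iterable of 1-D float32 arrays, e.g. blocks delivered
    by a sounddevice InputStream callback. Yields arrays of exactly
    `window_s` seconds; a trailing partial window is not emitted.
    """
    window = int(samplerate * window_s)
    buf = np.empty(0, dtype=np.float32)
    for frame in frames:
        buf = np.concatenate([buf, np.asarray(frame, dtype=np.float32)])
        while len(buf) >= window:
            yield buf[:window]   # this window would go to the ASR model
            buf = buf[window:]

# Example: twelve one-second blocks at 16 kHz yield two complete 5 s windows
blocks = [np.zeros(16000, dtype=np.float32) for _ in range(12)]
chunks = list(chunk_stream(blocks))
```

In a live setting the frames would come from the CABLE Output recording device rather than a prebuilt list.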
- Download the repository as a zip and unzip it.
- Pip install the requirements.txt (might take some time due to dependencies):
pip install -r requirements.txt
- Go to an Anaconda Prompt/cmd and type:
python Live_Transcript_v1.8.py
- Alternatively, create a bat file as follows (remember to replace DIR_ANACONDA with the path to your activate.bat, ENV_NAME with the name of your virtual environment, and DIR_CODE with the path to your code):
@echo off
title LiveTranscript by blackpolar
call DIR_ANACONDA
call activate ENV_NAME
python DIR_CODE
call conda deactivate
For users who prefer a command-line interface to the GUI, simply run: python Live_Transcript_v1.8.py --no_gui
Options can also be set via the parameters -o, --config_file, --input_file, and --output_file.
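The exact semantics of each flag are not documented above, so the sketch below only illustrates how such a command line might be parsed with argparse; the app's real defaults and help strings may differ, and pairing `-o` with `--output_file` is an assumption on my part:

```python
import argparse

def build_parser():
    # Illustrative parser for the flags mentioned above; names mirror the
    # README, but defaults and the -o pairing are assumptions.
    p = argparse.ArgumentParser(prog="Live_Transcript_v1.8.py")
    p.add_argument("--no_gui", action="store_true",
                   help="use the command-line interface instead of the GUI")
    p.add_argument("--config_file", help="path to a configuration file")
    p.add_argument("--input_file", help="audio/video file to transcribe")
    p.add_argument("-o", "--output_file", help="where to write the transcript")
    return p

# e.g. parsing:  python Live_Transcript_v1.8.py --no_gui --input_file stream.wav
args = build_parser().parse_args(["--no_gui", "--input_file", "stream.wav"])
```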
- Question: No transcription is showing up.
Answer: Set your audio devices in Windows to CABLE Input and CABLE Output.
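A quick way to check that the virtual cable is visible to Python is to list the audio devices and look for the CABLE entries. With sounddevice installed, `sounddevice.query_devices()` returns the device list; the filtering helper below is kept pure so it also runs on a plain list of device dicts (the sample device names are illustrative):

```python
def find_cable_devices(devices):
    """Return the names of devices that belong to the virtual audio cable."""
    return [d["name"] for d in devices if "CABLE" in d["name"].upper()]

# With sounddevice installed you would call:
#   import sounddevice as sd
#   print(find_cable_devices(sd.query_devices()))
# Illustrative device list for a machine with VB-CABLE installed:
devices = [
    {"name": "Microphone (Realtek Audio)"},
    {"name": "CABLE Output (VB-Audio Virtual Cable)"},
    {"name": "CABLE Input (VB-Audio Virtual Cable)"},
]
cables = find_cable_devices(devices)  # both CABLE entries
```

If the CABLE devices are missing from the list, reinstall the virtual audio cable and restart.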
All copyrights belong to the original authors of faster-whisper, whisper, sounddevice, and PyTorch.