AI Dubs over Subs is an experimental project that takes a video file and dubs it into a target language using AI. The project leverages several advanced AI models for transcription, translation, and text-to-speech synthesis, alongside tools for video and audio manipulation.
- Transcription: The project uses OpenAI Whisper to transcribe the original audio into text.
- Translation: The transcribed text is translated into the target language using the `alirezamsh/small100` translation model (a sketch of the transcription and translation steps follows this list).
- Text-to-Speech (TTS): The translated text is converted into speech using `edge-tts`.
- Audio Manipulation: The existing speech is removed from the video using vocal-remover, and the newly generated dubbed audio is synced back into the video with the help of `ffmpeg`.
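To make the first two steps concrete, here is a minimal sketch of transcription with Whisper followed by per-segment translation with small100. The model IDs match the ones above, but everything else (file names, the segment loop) is illustrative rather than this project's actual code; small100 also requires the custom `SMALL100Tokenizer` shipped in its Hugging Face repository, which is assumed here to be on the import path.

```python
import whisper
from transformers import M2M100ForConditionalGeneration
# tokenization_small100.py ships in the small100 Hugging Face repo; it is
# assumed to have been downloaded alongside the model weights.
from tokenization_small100 import SMALL100Tokenizer

# 1. Transcribe: Whisper returns timestamped segments, which later determine
#    where each dubbed clip is placed.
stt = whisper.load_model("base")
segments = stt.transcribe("input_audio.wav")["segments"]

# 2. Translate each segment; small100 selects the target language on the
#    tokenizer rather than per input sentence.
mt = M2M100ForConditionalGeneration.from_pretrained("alirezamsh/small100")
tokenizer = SMALL100Tokenizer.from_pretrained("alirezamsh/small100")
tokenizer.tgt_lang = "hi"  # target language code, e.g. Hindi

for seg in segments:
    encoded = tokenizer(seg["text"], return_tensors="pt")
    generated = mt.generate(**encoded)
    seg["translation"] = tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```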
This project builds upon the work of several other open-source projects:
- OpenAI Whisper: Used for automatic transcription of audio to text.
- alirezamsh/small100: Used for translating the transcribed text into the target language.
- edge-tts: Used to convert the translated text into synthesized speech.
- m3hrdadfi/hubert-base-persian-speech-gender-recognition: Used to detect the gender of the speaker, helping to choose an appropriate TTS voice (a short sketch follows this list).
- tsurumeso/vocal-remover: Used to remove the existing speech from the video before applying the new dubbed audio.
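Because the gender model's role is easy to miss, here is a hedged sketch of how such a classifier can drive voice selection, using the generic `audio-classification` pipeline from `transformers`. The label strings and the two edge-tts voice names are assumptions, not taken from this project's code; check the model card for the actual labels.

```python
from transformers import pipeline

classifier = pipeline(
    "audio-classification",
    model="m3hrdadfi/hubert-base-persian-speech-gender-recognition",
)

# The pipeline returns labels sorted by score; the "F"/"M" label names are an
# assumption based on the model card and should be verified before use.
top = classifier("speaker_clip.wav")[0]
voice = "hi-IN-SwaraNeural" if top["label"] == "F" else "hi-IN-MadhurNeural"
```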
Ensure you have `ffmpeg` installed on your system, then install the required Python dependencies by running:

```
pip install -r requirements.txt
```
To dub a video, run:

```
python main.py -f file_name.mp4 -l language
```

Where:
- `file_name.mp4` is the path to your input video file.
- `language` is the target language code to dub the video into (e.g., `en`, `es`, `fr`).
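Internally, the final stage turns each translated segment into speech and muxes the rebuilt audio track back into the video. Below is a minimal sketch of that stage, assuming the per-segment clips have already been assembled into a single `dubbed_audio.wav`; the file names and voice are placeholders, while `edge_tts.Communicate(...).save(...)` and the `ffmpeg` flags reflect their standard documented usage.

```python
import asyncio
import subprocess

import edge_tts

async def synthesize(text: str, voice: str, out_path: str) -> None:
    # edge-tts is asynchronous; Communicate streams synthesized speech to a file.
    await edge_tts.Communicate(text, voice).save(out_path)

asyncio.run(synthesize("नमस्ते दुनिया", "hi-IN-SwaraNeural", "segment_0.mp3"))

# Replace the video's audio track with the dubbed one, copying the video
# stream untouched so no re-encode is needed.
subprocess.run([
    "ffmpeg", "-y",
    "-i", "input_video.mp4",
    "-i", "dubbed_audio.wav",
    "-map", "0:v", "-map", "1:a",
    "-c:v", "copy", "-shortest",
    "output_dubbed.mp4",
], check=True)
```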
Here is an example of a YouTube video dubbed from English to Hindi.

English video: Making.a.smart.closet.with.ML_cut.mp4

Hindi dubbed video: Making.a.smart.closet.with.ML_hi_cut.mp4
- Making a smart closet with ML.mp4, a 5 minute 26 second video, took 11 minutes and 10 seconds to dub with Whisper on GPU and all other models on CPU.
- The same video took 3 minutes 59 seconds with all models and most of the ffmpeg commands running on GPU.

The GPU used was a laptop RTX 4060.
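For anyone reproducing those timings, the main knob is which device each model loads on. A minimal sketch, assuming the PyTorch-backed models used here; moving the ffmpeg steps to the GPU would additionally require a hardware codec such as `h264_nvenc`.

```python
import torch
import whisper

# Fall back to CPU automatically when no CUDA device is present.
device = "cuda" if torch.cuda.is_available() else "cpu"

stt = whisper.load_model("base", device=device)
# transformers models move the same way, e.g. mt.to(device).
```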
- Audio Sync: Improve the timing between the dubbed audio and the video to create a more natural sync.
- Translation Models: Replace or enhance the translation model to improve accuracy.
- TTS Quality: Use more advanced TTS models to achieve more realistic and natural-sounding voices, ideally one capable of mimicking the original speaker's voice. The current output often has to be sped up or slowed down considerably, as edge-tts appears to be designed for Read Aloud and reads rather slowly (a sketch of one retiming approach follows this list).
- Multi-Speaker Support: Identify multiple speakers in the video and dub them with different voices.
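On the retiming point above: one common approach is to stretch or compress each dubbed clip with ffmpeg's `atempo` filter so it matches the original segment's duration. This is a hedged sketch with illustrative file names and duration math, not this project's actual code; note that a single `atempo` instance only accepts factors in a limited range (historically 0.5 to 2.0), so extreme ratios need chained filters.

```python
import subprocess

def retime(in_path: str, out_path: str, src_dur: float, dub_dur: float) -> None:
    """Stretch/compress a dubbed clip to fit the original segment's duration."""
    # atempo changes tempo without shifting pitch. Clamp conservatively to the
    # range a single filter instance is guaranteed to accept.
    factor = max(0.5, min(2.0, dub_dur / src_dur))
    subprocess.run([
        "ffmpeg", "-y",
        "-i", in_path,
        "-filter:a", f"atempo={factor}",
        out_path,
    ], check=True)

# Example: speed up a 6.0 s dubbed clip to fit a 4.8 s original segment.
retime("segment_0.mp3", "segment_0_fit.mp3", src_dur=4.8, dub_dur=6.0)
```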