Antreev Singh Brar
Gurbaaz Singh Nandra
Som Tambe
- Clone the repository to
- Local System:
git clone https://github.com/antreev-brar/FlipkartGrid.git
- Google Colab:
!git clone https://github.com/antreev-brar/FlipkartGrid.git
- Install the dependencies
- For Windows:
pip install -r requirements.txt
- For Linux/Mac:
pip3 install -r requirements.txt
- For Google Colab:
!pip install -r requirements.txt
- Run the bash script to execute the whole process:
- Local System:
chmod u+x run.sh
./run.sh input-folder-name
- Google Colab:
NOTE: The repository provides an mp3 folder as the default input folder. To use it, pass mp3 in place of input-folder-name.
!chmod u+x run.sh
!./run.sh input-folder-name
- ffmpeg converts the .mp3 audio files to .wav and stores them in the wav folder.
- DTLN performs noise suppression and stores the processed .wav files in the output folder.
- Speaker-Diarization identifies the main speaker and stores the main speaker's audio in the output folder.
- transcription_api.py processes each file and stores the transcriptions in the transcription folder.
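The first stage above (converting .mp3 to .wav with ffmpeg) could be sketched in Python roughly as follows. This is only an illustration, not the contents of run.sh. The function names plan_conversions, ffmpeg_command and convert_all are hypothetical, the folder names mp3 and wav follow the defaults mentioned above, and the sketch assumes the ffmpeg executable is available on PATH.

```python
import subprocess
from pathlib import Path


def plan_conversions(input_dir, wav_dir):
    """Pair every .mp3 in input_dir with a .wav target path in wav_dir."""
    input_dir, wav_dir = Path(input_dir), Path(wav_dir)
    return [(mp3, wav_dir / (mp3.stem + ".wav"))
            for mp3 in sorted(input_dir.glob("*.mp3"))]


def ffmpeg_command(mp3_path, wav_path):
    """Build the ffmpeg call: -y overwrites existing output, -i names the input."""
    return ["ffmpeg", "-y", "-i", str(mp3_path), str(wav_path)]


def convert_all(input_dir="mp3", wav_dir="wav"):
    """Convert each planned file; raises CalledProcessError if ffmpeg fails."""
    # Hypothetical helper: creates the output folder if it does not exist yet.
    Path(wav_dir).mkdir(parents=True, exist_ok=True)
    for mp3, wav in plan_conversions(input_dir, wav_dir):
        subprocess.run(ffmpeg_command(mp3, wav), check=True)
```

Calling convert_all("mp3", "wav") would then fill the wav folder with one .wav per input file. The later stages (DTLN, Speaker-Diarization, transcription) are not sketched here because their invocation depends on the repository's own scripts.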
- Dependencies:
torch
tensorflow==2.2.0-alpha0
keras==2.3.1
soundfile
wavinfo
pydub
libasound2-dev
portaudio19-dev
libportaudio2
libportaudiocpp0
ffmpeg
pyaudio
numpy
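glob (Python standard library, no installation needed)
os (Python standard library, no installation needed)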
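Before running the pipeline, it can help to check which of the packages listed above are actually available. The sketch below is one possible way to do that. The helper names missing_modules and missing_executables are hypothetical, and the import names in PYTHON_MODULES are assumed to match the package names in the list.

```python
import importlib.util
import shutil

# Import names assumed to match the Python packages listed above.
PYTHON_MODULES = ["torch", "tensorflow", "keras", "soundfile",
                  "wavinfo", "pydub", "pyaudio", "numpy"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def missing_executables(names):
    """Return the executables that are not on PATH."""
    return [n for n in names if shutil.which(n) is None]
```

For example, missing_modules(PYTHON_MODULES) lists the Python packages that still need to be installed, and missing_executables(["ffmpeg"]) returns ["ffmpeg"] when the ffmpeg binary is not on PATH.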
- DTLN (Paper)
- Speaker-Diarization (Paper1) (Paper2)