antreev-brar/Speech-Ledger

Flipkart Grid 2.0 : Software Development Challenge

Solving for Voice Interactions in Indian Houses & Neighborhoods

Contributors


Antreev Singh Brar

Gurbaaz Singh Nandra

Som Tambe

Relevant Links

Usage

  • Clone the repository:
    • Local System:
    git clone https://github.com/antreev-brar/FlipkartGrid.git
    • Google Colab:
    !git clone https://github.com/antreev-brar/FlipkartGrid.git
  • Install the dependencies
    • For Windows:
    pip install -r requirements.txt
    • For Linux/Mac:
    pip3 install -r requirements.txt
    • For Google Colab:
    !pip install -r requirements.txt
  • Run the bash script to execute the whole pipeline:
    • Local System:
    chmod u+x run.sh
    ./run.sh input-folder-name
    • Google Colab:
    !chmod u+x run.sh
    !./run.sh input-folder-name
    NOTE: The repository includes an mp3 folder as the default input folder. To use it, pass mp3 in place of input-folder-name.

Post Execution Process

  • ffmpeg converts the .mp3 audio files to .wav and writes them to the wav folder.
  • DTLN performs noise suppression and stores the processed .wav files in the output folder.
  • Speaker-Diarization identifies the main speaker and stores the main speaker's audio in the output folder.
  • transcription_api.py processes each file and stores the transcriptions in the transcription folder.
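
The first stage above (mp3 to wav conversion) can be sketched in Python as follows. This is a minimal illustration and not the contents of run.sh: the function names `build_ffmpeg_command` and `convert_folder` are hypothetical, and the 16 kHz mono output settings are an assumption made for the example, not something this README specifies. It requires the ffmpeg binary to be on the PATH.

```python
import subprocess
from pathlib import Path
from typing import List


def build_ffmpeg_command(src: Path, dst: Path, sample_rate: int = 16000) -> List[str]:
    """Return an ffmpeg command that converts src to a mono WAV file at dst."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it already exists
        "-i", str(src),           # input audio file
        "-ar", str(sample_rate),  # output sample rate in Hz (assumed value)
        "-ac", "1",               # downmix to a single channel (assumed)
        str(dst),
    ]


def convert_folder(input_dir: str = "mp3", output_dir: str = "wav") -> List[Path]:
    """Convert every .mp3 file in input_dir to a .wav file in output_dir."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(input_dir).glob("*.mp3")):
        dst = out / (src.stem + ".wav")
        # check=True raises CalledProcessError if ffmpeg exits with an error
        subprocess.run(build_ffmpeg_command(src, dst), check=True)
        written.append(dst)
    return written
```

Calling `convert_folder("mp3", "wav")` would mirror the default input folder described in the Usage note. Keeping command construction in its own function makes it possible to inspect the exact ffmpeg invocation without running it.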

Dependencies

torch
tensorflow==2.2.0-alpha0
keras==2.3.1
soundfile
wavinfo
pydub
libasound2-dev
portaudio19-dev 
libportaudio2
libportaudiocpp0
ffmpeg
pyaudio
numpy
glob
os
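
The list above mixes different kinds of dependencies: glob and os are part of the Python standard library and need no installation, while libasound2-dev, portaudio19-dev, libportaudio2 and libportaudiocpp0 are Debian/Ubuntu system packages rather than pip packages. A quick check of the environment before running the pipeline could look like the sketch below. The helper names `missing_modules` and `ffmpeg_available` are hypothetical, and the import names are assumed to match the package names listed above.

```python
import importlib.util
import shutil
from typing import Iterable, List

# Import names assumed to correspond to the Python entries in the list above.
PYTHON_MODULES = [
    "torch", "tensorflow", "keras", "soundfile", "wavinfo",
    "pydub", "pyaudio", "numpy", "glob", "os",
]


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable is found on the PATH."""
    return shutil.which("ffmpeg") is not None
```

For example, `missing_modules(PYTHON_MODULES)` returns the modules that still need to be installed, and `ffmpeg_available()` reports whether the ffmpeg binary can be located. The check does not verify versions, so pinned entries such as keras==2.3.1 still need to be confirmed separately.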

Acknowledgments
