cozec/forced_alignment

Forced Alignment Demo (v1.0.0)

This project demonstrates forced alignment between audio and text using PyTorch and Wav2Vec2. It generates synthetic speech from text and aligns it with the transcript, visualizing the alignment process.

Features

  • Text-to-speech generation using gTTS
  • Forced alignment using Wav2Vec2
  • Multiple visualizations:
    • Frame-wise class probability
    • Alignment path in trellis matrix
    • Word segments with spectrogram
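The trellis and alignment-path steps listed above can be illustrated with a small, framework-free sketch. It is not the repository's code: the function names (`build_trellis`, `backtrack`, `merge_segments`) and the toy emission matrix are assumptions made for illustration, and the recurrence is a simplified Viterbi alignment in which every frame is assigned to exactly one transcript token (the CTC blank class that a real Wav2Vec2 model emits is deliberately left out to keep the example short).

```python
import math

NEG_INF = -math.inf


def build_trellis(emission, tokens):
    """Fill a (frames x tokens) trellis of best cumulative log-probabilities.

    emission[t][c] is the log-probability of class c at frame t (hypothetical
    input; in practice it would come from the acoustic model).
    tokens is the transcript as a list of class indices.
    """
    num_frames, num_tokens = len(emission), len(tokens)
    trellis = [[NEG_INF] * num_tokens for _ in range(num_frames)]
    # The alignment must start on the first token at the first frame.
    trellis[0][0] = emission[0][tokens[0]]
    for t in range(1, num_frames):
        for j in range(num_tokens):
            stay = trellis[t - 1][j]  # remain on token j
            advance = trellis[t - 1][j - 1] if j > 0 else NEG_INF  # move from j-1 to j
            best = max(stay, advance)
            if best > NEG_INF:
                trellis[t][j] = best + emission[t][tokens[j]]
    return trellis


def backtrack(trellis):
    """Recover the token index assigned to each frame, from last frame to first."""
    t, j = len(trellis) - 1, len(trellis[0]) - 1
    if trellis[t][j] == NEG_INF:
        raise ValueError("transcript is longer than the number of frames")
    path = [j]
    while t > 0:
        stay = trellis[t - 1][j]
        advance = trellis[t - 1][j - 1] if j > 0 else NEG_INF
        if advance > stay:
            j -= 1
        path.append(j)
        t -= 1
    path.reverse()
    return path


def merge_segments(path):
    """Collapse repeated token indices into (token_index, start_frame, end_frame)."""
    segments, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            segments.append((path[start], start, t))  # end frame is exclusive
            start = t
    return segments


# Toy example: 5 frames, 3 classes, transcript made of classes 1 and 2.
probs = [
    [0.1, 0.8, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.2, 0.7],
    [0.1, 0.1, 0.8],
    [0.1, 0.1, 0.8],
]
emission = [[math.log(p) for p in frame] for frame in probs]
tokens = [1, 2]

path = backtrack(build_trellis(emission, tokens))
segments = merge_segments(path)
print(path)
print(segments)
```

Multiplying a segment's frame indices by the model's frame duration would give timestamps in seconds; grouping character-level segments at word boundaries is what yields word segments like those in the third visualization.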

Installation

  1. Clone this repository
  2. Install the required dependencies:

Version History

  • v1.0.0: Initial release
    • Basic forced alignment implementation
    • Text-to-speech generation
    • Visualization of alignment path
    • Support for 3-word demo phrase
