Accepted to INTERSPEECH 2024 (arXiv preprint)
Sample code will be available soon.
- Annotated: Manually annotated phoneme boundaries in the corpus
- Proposed: Predicted boundaries using the proposed method
- MFA: Predicted boundaries using Montreal Forced Aligner
- CTC: Predicted boundaries using CTC forced alignment
- OTA: Predicted boundaries using "One TTS alignment to rule them all"