AwesomeKorean_Speech

Speech and Signal Processing (work in progress)

This list is compiled from discussions held while participating in the ASR lab at MODULABS (모두의연구소).

Signal Processing

Automatic Speech Recognition (ASR)

Books
- ratsgo's speech book
Lectures
- Lecture on the Connectionist Temporal Classification (CTC) model: TalkON Seminar (토크ON세미나), Fundamentals of Deep Learning-Based Speech Recognition, Part 4
- Lecture on the wav2vec paper: [wav2vec: Self-Supervised Learning of Discrete Speech Representations](https://www.youtube.com/watch?v=mPtyfqWHs3s)
Papers
- William Chan et al., Listen, Attend and Spell, arXiv:1508.01211
- Dario Amodei et al., Deep Speech 2: End-to-End Speech Recognition in English and Mandarin, arXiv:1512.02595
- Jung-Woo Ha et al., ClovaCall: Korean Goal-Oriented Dialog Speech Corpus for Automatic Speech Recognition of Contact Centers, arXiv:2004.09367
- Soohwan Kim, Seyoung Bae, Cheolhwang Won, Open-source toolkit for end-to-end Korean speech recognition, SIMPAC
- Park, Kyubyong and Mulc, Thomas, CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages, arXiv:1903.11269
  - Page: awesome-speech-recognition-speech-synthesis-papers
- Eugene Kharitonov et al. (2021), Text-Free Prosody-Aware Generative Spoken Language Modeling
  - Page: https://speechbot.github.io/pgslm/
  - Korean translation: 텍스트 없는 자연어 처리 (Textless Natural Language Processing)
  - The discussion of fundamental frequency (F0) is the interesting part: note that F0 carries speaker information as well as prosodic information.
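To make the F0 remark concrete, here is a minimal sketch of estimating fundamental frequency by autocorrelation, using only the Python standard library. It is not the method used in the paper above. The helper names (`sine_frame`, `estimate_f0`), the 8 kHz sample rate, the 60 to 400 Hz search range, and the pure tones standing in for two voices are all assumptions made for illustration. Real speech is not a pure tone and usually needs a more robust pitch tracker.

```python
import math


def sine_frame(freq_hz, sample_rate=8000, n_samples=2048):
    """Synthesize a pure tone that stands in for one voiced speech frame (assumption)."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate) for n in range(n_samples)]


def estimate_f0(frame, sample_rate=8000, f_min=60.0, f_max=400.0):
    """Estimate F0 in Hz by picking the lag with the largest autocorrelation."""
    lag_min = int(sample_rate / f_max)  # shortest pitch period considered
    lag_max = int(sample_rate / f_min)  # longest pitch period considered
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation of the frame with itself shifted by `lag`.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Two hypothetical "speakers" with different pitch: the estimates differ,
# which is the sense in which F0 also reflects who is speaking.
low_voice = estimate_f0(sine_frame(120.0))
high_voice = estimate_f0(sine_frame(220.0))
print(round(low_voice, 1), round(high_voice, 1))
```

Because the lag is an integer number of samples, the estimate is quantized and only approximates the true frequency.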
Blogs
- Building your own training data for deep learning speech recognition
- Analyzing audio files with the Librosa Python library

Data

English

- LJSpeech
- [LibriSpeech](https://www.openslr.org/11/): benchmark results at https://paperswithcode.com/sota/speech-recognition-on-librispeech-test-clean
- Libri-Light: 60k hours of unlabelled speech plus 10 h, 1 h, or 10 min of labelled speech (audio derived from LibriVox audiobooks): https://github.com/facebookresearch/libri-light
- [Curated list of open-source voice and music datasets](https://github.com/jim-schwoebel/voice_datasets)

Korean

- [KsponSpeech](https://aihub.or.kr/aidata/105/download)
- Modu Corpus (모두의 말뭉치): [Everyday Conversation Speech Corpus (일상대화_음성_말뭉치)](https://corpus.korean.go.kr/)
- [Korean Single Speaker Speech Dataset](https://www.kaggle.com/bryanpark/korean-single-speaker-speech-dataset)
- [Open-source conversational speech synthesis for expressing robot emotion and personality](https://github.com/songys/emotiontts_open_db)

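Available for download after submitting an application.
Uses orthographic transcription, transcription symbols (e.g. {laughing} for laughter), and de-identification symbols (e.g. &name& for names).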
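Transcripts annotated this way usually need the symbols stripped before they can serve as ASR training targets. The sketch below shows one possible cleanup, assuming that event tags always look like `{...}` and de-identification tags like `&...&`, as in the two examples above. The function name `clean_transcript`, the regular expressions, and the sample sentence are hypothetical and do not come from any corpus specification.

```python
import re

# Assumed tag shapes, based only on the two examples in the note above.
EVENT_TAG = re.compile(r"\{[^{}]*\}")   # e.g. {laughing}
DEID_TAG = re.compile(r"&[^&\s]+&")     # e.g. &name&


def clean_transcript(text, deid_placeholder=""):
    """Drop event tags, replace de-identification tags, and collapse whitespace."""
    text = EVENT_TAG.sub(" ", text)
    text = DEID_TAG.sub(deid_placeholder, text)
    return " ".join(text.split())


sample = "안녕하세요 {laughing} 저는 &name& 입니다"
print(clean_transcript(sample))
print(clean_transcript(sample, deid_placeholder="<name>"))
```

Whether to delete a de-identified span or map it to a placeholder token depends on the model's vocabulary, so the placeholder is left as a parameter.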

Toolkits

- [Librosa](https://librosa.org/doc/latest/index.html): Python package
- Torchaudio: modeling
- [Kaldi](https://kaldi-asr.org/): written in C++
- [Praat](https://www.fon.hum.uva.nl/praat/)

Korean Implementations

KoSpeech : https://github.com/sooftware/KoSpeech

speech-recognition : https://github.com/cosmoquester/speech-recognition

Automatic-Speech-Recognition-Models : https://github.com/hasangchun/Automatic-Speech-Recognition-Models

DECODE

CTC decode
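
As a minimal sketch of CTC decoding, the greedy (best-path) variant takes the most probable label in each frame, collapses consecutive repeats, and then removes blanks. The function name `ctc_greedy_decode`, the toy vocabulary, and the assumption that the blank label has index 0 are illustrative choices, not taken from any of the repositories listed above.

```python
def ctc_greedy_decode(frame_probs, vocab, blank=0):
    """Greedy CTC decoding: per-frame argmax, collapse repeats, drop blanks."""
    # Most probable label index for each frame.
    best_path = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]

    decoded = []
    prev = None
    for idx in best_path:
        # Emit a label only when it differs from the previous frame and is not blank.
        if idx != prev and idx != blank:
            decoded.append(vocab[idx])
        prev = idx
    return "".join(decoded)


vocab = ["<blank>", "a", "b"]  # index 0 is the blank label (assumption)
frame_probs = [
    [0.10, 0.80, 0.10],  # a
    [0.20, 0.70, 0.10],  # a (repeat, collapsed)
    [0.90, 0.05, 0.05],  # blank separates the two a's
    [0.10, 0.60, 0.30],  # a
    [0.10, 0.20, 0.70],  # b
    [0.10, 0.10, 0.80],  # b (repeat, collapsed)
]
print(ctc_greedy_decode(frame_probs, vocab))
```

The blank frame in the middle is what lets the output contain two consecutive `a` characters. Without it the repeats would collapse into one. Beam search decoding, optionally with a language model, is the usual alternative when greedy output is not accurate enough.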

Korean Speech Synthesis

Reference link: https://pororo-tts.github.io/
