This is the implementation of our Interspeech 2020 paper "Converting anyone's emotion: towards speaker-independent emotional voice conversion".

KunZhou9646/Speaker-independent-emotional-voice-conversion-based-on-conditional-VAW-GAN-and-CWT

Speaker-independent-emotional-voice-conversion-based-on-conditional-VAW-GAN-and-CWT

This is the implementation of the Interspeech 2020 paper "Converting anyone's emotion: towards speaker-independent emotional voice conversion". Please cite our paper if you use this code.

Getting Started

Prerequisites

  • Ubuntu 16.04
  • Python 3.6
    • Tensorflow-gpu 1.5.0
    • PyWorld
    • librosa
    • soundfile
    • numpy 1.14.0
    • sklearn (installed as scikit-learn)
    • glob (part of the Python standard library)
    • sprocket-vc
    • pycwt
    • scipy
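
Before training, it can help to confirm that the packages listed above are importable in the active environment. The following is a minimal sketch, not part of the repository: the helper `missing_modules` is hypothetical, and the import names in `REQUIRED_MODULES` are assumptions (for example, sprocket-vc is assumed to be importable as `sprocket`). It only checks that each module can be located; it does not check versions.

```python
import importlib.util

# Import names assumed for the packages listed above (hypothetical mapping;
# e.g. sprocket-vc is assumed to be importable as "sprocket").
REQUIRED_MODULES = [
    "tensorflow", "pyworld", "librosa", "soundfile", "numpy",
    "sklearn", "glob", "sprocket", "pycwt", "scipy",
]


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be located."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```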

Usage

  1. Prepare your dataset.
Please follow this file structure:

training_dir: ./data/wav/training_set/*/*.wav

evaluation_dir: ./data/wav/evaluation_set/*/*.wav

For example: "./data/wav/training_set/Angry/0001.wav"
  2. Activate your virtual environment.
source activate [your env]
  3. Train VAW-GAN for prosody.
./train_f0.sh
# Remember to change the source and target dir in "architecture-vawgan-vcc2016.json"
  4. Train VAW-GAN for spectrum.
./train_sp.sh
# Remember to change the source and target dir in "architecture-vawgan-vcc2016.json"
  5. Generate the converted emotional speech.
./convert.sh
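
The dataset layout described in the Usage steps can be checked before running the training scripts. The sketch below is an illustration only and is not part of the repository: the helper `count_wavs_per_label` is hypothetical, and it assumes that each subfolder of `training_set` and `evaluation_set` (such as `Angry`) holds the `.wav` files for one emotion.

```python
import glob
import os
from collections import Counter


def count_wavs_per_label(set_dir):
    """Count .wav files in each immediate subfolder of `set_dir`."""
    counts = Counter()
    # Mirrors the documented pattern: <set_dir>/*/*.wav
    for path in glob.glob(os.path.join(set_dir, "*", "*.wav")):
        label = os.path.basename(os.path.dirname(path))
        counts[label] += 1
    return dict(counts)


if __name__ == "__main__":
    for name in ("training_set", "evaluation_set"):
        set_dir = os.path.join(".", "data", "wav", name)
        counts = count_wavs_per_label(set_dir)
        if not counts:
            print(f"{set_dir}: no .wav files found")
        for label, n in sorted(counts.items()):
            print(f"{set_dir}/{label}: {n} files")
```

An empty result for either directory suggests the files are not where the scripts expect them.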

Note: This code is based on the VAW-GAN Voice Conversion implementation: https://github.com/JeremyCCHsu/vae-npvc/tree/vawgan
