RecursiveSynthVC

An expressive voice conversion model that performs cross-speaker style transfer, improved with self-generated synthetic expressive data.

TODOs

Mel-spectrogram-based model for lightweight training and explicit duration control
BigVGAN V2 generator
Large Scale Training for zero-shot voice conversion

Credits/Code resources and references

VITS2 (https://github.com/p0p4k/vits2_pytorch/)
NVIDIA BigVGAN (https://github.com/NVIDIA/BigVGAN)
Speaker Normalized Affine Coupling layer (SNAC) (https://github.com/hcy71o/SNAC)
Features preparation and Cosine Similarity based Speaker GRL (https://github.com/PlayVoice/whisper-vits-svc)
F0 estimation: Torch CREPE (https://github.com/maxrmorrison/torchcrepe)

GPL-3.0 license