Add GPT-SoVITS microservice (#784)
* add gpt-sovits microservice

* add readme

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

* fix eol

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Spycsh and pre-commit-ci[bot] authored Oct 12, 2024
1 parent 5bb4046 commit 6da7db9
Showing 4 changed files with 94 additions and 0 deletions.
4 changes: 4 additions & 0 deletions .github/workflows/docker/compose/tts-compose.yaml
@@ -16,3 +16,7 @@ services:
    build:
      dockerfile: comps/tts/speecht5/dependency/Dockerfile.intel_hpu
    image: ${REGISTRY:-opea}/speecht5-gaudi:${TAG:-latest}
  gpt-sovits:
    build:
      dockerfile: comps/tts/gpt-sovits/Dockerfile
    image: ${REGISTRY:-opea}/gpt-sovits:${TAG:-latest}
32 changes: 32 additions & 0 deletions comps/tts/gpt-sovits/Dockerfile
@@ -0,0 +1,32 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

FROM python:3.10-slim
RUN useradd -m -s /bin/bash user && \
    mkdir -p /home/user && \
    chown -R user /home/user/

# Install system dependencies
RUN apt-get update && \
    apt-get install -y ffmpeg git-lfs git wget vim build-essential && \
    pip install --upgrade pip

# Clone source repo
RUN git clone https://github.com/RVC-Boss/GPT-SoVITS.git
# Download pre-trained models, and prepare env
RUN git clone https://huggingface.co/lj1995/GPT-SoVITS pretrained_models
RUN mv pretrained_models/* GPT-SoVITS/GPT_SoVITS/pretrained_models/ && \
    rm -rf pretrained_models && \
    pip install --no-cache-dir -r GPT-SoVITS/requirements.txt && \
    python -m nltk.downloader averaged_perceptron_tagger_eng cmudict

RUN mv GPT-SoVITS /home/user/

# USER user
# ENV LANG=C.UTF-8

WORKDIR /home/user/GPT-SoVITS

RUN wget "https://github.com/intel/intel-extension-for-transformers/raw/refs/heads/main/intel_extension_for_transformers/neural_chat/assets/audio/welcome_cn.wav"

ENTRYPOINT ["python", "api.py", "--default_refer_path", "./welcome_cn.wav", "--default_refer_text", "欢迎使用", "--default_refer_language", "zh"]
56 changes: 56 additions & 0 deletions comps/tts/gpt-sovits/README.md
@@ -0,0 +1,56 @@
# GPT-SoVITS Microservice

[GPT-SoVITS](https://github.com/RVC-Boss/GPT-SoVITS) supports zero-shot voice cloning and text-to-speech in multiple languages, including English, Japanese, Korean, Cantonese, and Chinese.

This microservice is validated on Xeon/CUDA. HPU support is under development.

## Build the Image

```bash
docker build -t opea/gpt-sovits:latest --build-arg http_proxy=$http_proxy --build-arg https_proxy=$https_proxy -f comps/tts/gpt-sovits/Dockerfile .
```

## Start the Service

```bash
docker run -itd -p 9880:9880 -e http_proxy=$http_proxy -e https_proxy=$https_proxy opea/gpt-sovits:latest
```

## Test

- Chinese only

```bash
curl localhost:9880/ -XPOST -d '{
"text": "先帝创业未半而中道崩殂,今天下三分,益州疲弊,此诚危急存亡之秋也。",
"text_language": "zh"
}' --output out.wav
```

- English only

```bash
curl localhost:9880/ -XPOST -d '{
"text": "Discuss the evolution of text-to-speech (TTS) technology from its early beginnings to the present day. Highlight the advancements in natural language processing that have contributed to more realistic and human-like speech synthesis. Also, explore the various applications of TTS in education, accessibility, and customer service, and predict future trends in this field. Write a comprehensive overview of text-to-speech (TTS) technology.",
"text_language": "en"
}' --output out.wav
```

- Auto detection of languages

```bash
curl localhost:9880/ -XPOST -d '{
"text": "Hi 你好,这里是一个 cross-lingual 的例子。",
"text_language": "auto"
}' --output out.wav
```

- Change reference audio

```bash
curl localhost:9880/change_refer -d '{
"refer_wav_path": "path_to_your_audio.wav",
"prompt_text": "transcription_of_your_audio",
"prompt_language": "language_of_your_audio"
}'
```
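
The same requests can also be sent from Python. The following is a minimal sketch using only the standard library and is not part of this commit: the helper names `build_request`, `synthesize`, and `change_refer` are hypothetical, and `BASE_URL` assumes the port mapping from the `docker run` command above. The endpoints and JSON fields are the ones used in the curl examples.

```python
import json
import urllib.request

# Assumption: the service is reachable on the port published by `docker run -p 9880:9880`.
BASE_URL = "http://localhost:9880"


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for the given path (hypothetical helper)."""
    # ensure_ascii=False keeps Chinese text readable in the request body.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text: str, text_language: str = "auto", out_path: str = "out.wav") -> str:
    """POST text to `/` and write the returned audio bytes to out_path."""
    req = build_request("/", {"text": text, "text_language": text_language})
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as f:
        f.write(resp.read())
    return out_path


def change_refer(refer_wav_path: str, prompt_text: str, prompt_language: str) -> int:
    """POST new reference audio settings to `/change_refer`; return the HTTP status."""
    req = build_request(
        "/change_refer",
        {
            "refer_wav_path": refer_wav_path,
            "prompt_text": prompt_text,
            "prompt_language": prompt_language,
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

With the container running, `synthesize("Hi 你好,这里是一个 cross-lingual 的例子。")` would write the response to `out.wav`. Because the server runs inside the container, `refer_wav_path` is presumably resolved against the container's filesystem rather than the host's.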
2 changes: 2 additions & 0 deletions comps/tts/gpt-sovits/__init__.py
@@ -0,0 +1,2 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
