Installation on Ubuntu 24.04 LTS #554

Open
crispy78 opened this issue Jul 26, 2024 · 2 comments

crispy78 commented Jul 26, 2024

I'm currently installing Piper on Ubuntu 24.04 LTS and I struggle with the documentation (https://github.com/rhasspy/piper?tab=readme-ov-file#running-in-python). Thankfully there is Thorsten-Voice with his video (https://www.youtube.com/watch?v=b_we_jma220) explaining how to install it. Unfortunately he's running Ubuntu 20.x and I'm running 24.x. These are the steps I had to make in order to get to the same point as Thorsten when he's training the model (at 10:43).

sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt install python3.10
sudo apt install python3.10-dev
sudo apt install virtualenv
git clone https://github.com/rhasspy/piper.git ~/piper
cd ~/piper/src/python
virtualenv -p python3.10 ~/piper/src/python/.venv
source .venv/bin/activate
pip3 install --upgrade pip
pip3 install --upgrade wheel setuptools
pip3 install pip==24.0
pip3 install -r requirements.txt
cd ~/piper/src/python
./build_monotonic_align.sh
# Ensure you have espeak-ng installed:
sudo apt-get install espeak-ng
pip3 install -e .
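
Before starting the long-running steps, it may help to confirm that the tools installed above are actually on the PATH. The following is only a minimal sketch: check_tools is a hypothetical helper name, not something that ships with Piper.

```shell
# Hypothetical helper (not part of Piper): report which of the given
# commands cannot be found on the PATH.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" > /dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    # Returns 0 when every tool was found, 1 otherwise.
    return "$missing"
}
```

For the steps above it could be called as: check_tools git python3.10 virtualenv espeak-ng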

mkdir ~/ThorstenVoice-Dataset
wget -O ~/ThorstenVoice-Dataset/thorsten-neutral_v03.tgz "https://zenodo.org/records/5525342/files/thorsten-neutral_v03.tgz?download=1"
cd ~/ThorstenVoice-Dataset
tar zxvf thorsten-neutral_v03.tgz

python3.10 -m piper_train.preprocess --language de --input-dir ~/ThorstenVoice-Dataset/thorsten-de_v03/ --output-dir ~/piper/src/python/out-train --dataset-format ljspeech --single-speaker --sample-rate 22050

You can pause the process by pressing CTRL + Z and then run bg to let it continue in the background. Another option is to append " &" (without quotes) to the command to start it in the background immediately. This can be useful if your SSH session might time out, your laptop is running out of battery, etc. Note that a job backgrounded this way can still be terminated by a hangup signal when the SSH session closes, so for long runs it is safer to start the command under nohup, tmux, or screen. The process takes about an hour on an Ubuntu 24.04 LTS VM (4 CPUs, 8192 MB memory, 100 GB storage) running on Proxmox 8.2.4 on an N100 (Peladn WI-6).
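
Backgrounding with " &" alone does not necessarily protect the job when the SSH session drops. A minimal sketch of starting a command so that it ignores the hangup signal and logs to a file could look like this. run_detached and the log file name are hypothetical and only for illustration; nohup itself is a standard utility.

```shell
# Hypothetical helper: start a command that keeps running after the
# terminal or SSH session closes, writing all output to a log file.
# Usage: run_detached LOGFILE COMMAND [ARGS...]
run_detached() {
    _log_file="$1"
    shift
    # nohup makes the command ignore SIGHUP; stdin comes from /dev/null
    # so the job never waits on the (possibly closed) terminal.
    nohup "$@" > "$_log_file" 2>&1 < /dev/null &
    # Print the PID so the job can be inspected or stopped later.
    echo "$!"
}
```

For example, the preprocessing step could be started as: run_detached ~/preprocess.log python3.10 -m piper_train.preprocess followed by the same arguments as above, and then watched with tail -f ~/preprocess.log.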

Continuing with the instruction of Thorsten:

pip3 install torchmetrics==0.11.4

python3.10 -m piper_train --dataset-dir ~/piper/src/python/out-train --devices 1 --batch-size 8 --validation-split 0.0 --num-test-examples 0 --max_epochs 10000 --resume_from_checkpoint ~/piper/src/python/out-train/epoch=2218-step=838782.ckpt --checkpoint-epochs 1 --precision 32 --quality high
@crispy78

Some thoughts:
Although I really like the idea of the project, I'm really starting to dislike the project.

  • There is some documentation for the piper-command, but for piper_train I couldn't find anything explaining all the arguments; Thorsten uses some that aren't even in the documentation (--quality).

  • I'm getting errors I don't understand (and Google isn't very helpful), like RuntimeError: The size of tensor a (8192) must match tensor b (7424) at non-singleton dimension 2. I see my CPU usage skyrocketing, but no checkpoints are written, and after several hours the training was still unsuccessful.
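
To check whether training has written any checkpoints at all, one option is to search the dataset directory for .ckpt files. This is only a sketch: latest_checkpoint is a hypothetical helper, and it assumes that checkpoints end up somewhere below the directory passed as --dataset-dir (commonly in a lightning_logs subdirectory), which this thread does not confirm.

```shell
# Hypothetical helper: print the most recently modified *.ckpt file
# below the given directory, or nothing if none exists.
# Assumes checkpoint file names contain no whitespace.
latest_checkpoint() {
    find "$1" -type f -name '*.ckpt' -exec ls -t {} + | head -n 1
}
```

For the training command above it could be called as: latest_checkpoint ~/piper/src/python/out-train. Empty output would mean that no checkpoint has been written yet.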

@coffeecodeconverter

I wrote a guide for Ubuntu 22.04 in Windows WSL:
#24 (comment)
