
Windows support #24

Closed
sbsce opened this issue Mar 27, 2023 · 9 comments

Comments

@sbsce commented Mar 27, 2023

It would be nice if this officially supported Windows. So far, there's only a release for "desktop Linux" and "Raspberry Pi 4", not for Windows.

@synesthesiam (Contributor)

I've started working on a Rust version that I hope will make supporting Windows easier. If you have any C++ experience on Windows, any tips would be appreciated.

@sbsce (Author) commented Mar 28, 2023

Well, I need this as a library in my C++ program, so I certainly prefer this being C++ rather than Rust ;)

Windows support should be easy; it should just be a matter of using CMake to generate Visual Studio project files.
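For instance (a sketch, assuming the project's top-level CMakeLists.txt configures cleanly on Windows; not tested against this repo):

cmake -B build -G "Visual Studio 17 2022" -A x64
cmake --build build --config Release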

@synesthesiam (Contributor)

The only two dependencies are espeak-ng and the onnxruntime, so the Visual Studio Project could probably be created by hand pretty easily. It's been so many years that I wouldn't know where to start anymore 😄

@Wetzel402

You could try WSL or Docker.

@synesthesiam (Contributor)

Windows support is still a work in progress

@coffeecodeconverter commented Nov 30, 2024

So I've got this working in WSL on Windows, at least.
What's annoying is that the only thing stopping it from working natively on Windows is an easy install for piper-phonemize 1.1.0.
That's it! Otherwise, all the other packages are available.
(Side note: you can at least run Piper TTS easily on Windows thanks to the releases here: https://github.com/rhasspy/piper/releases/tag/2023.11.14-2. But to train Piper TTS, as far as I'm aware, you currently need WSL if you're on Windows.)

C:\Windows\System32>wsl --version
WSL version: 2.3.26.0
Kernel version: 5.15.167.4-1
WSLg version: 1.0.65
MSRDC version: 1.2.5620
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.26100.1-240331-1435.ge-release
Windows version: 10.0.22621.3374

I have an RTX 4060 Laptop GPU (8 GB VRAM),
64 GB RAM,
and a Ryzen 7 7840HS.

The CPU trains 1 epoch in approximately 51 seconds.
The GPU trains 1 epoch in approximately 26 seconds.

You'll see in other posts that people have issues on RTX 40-series cards, namely:
#295
#606
#518

It can work (one fix was mentioned in #295);
for me it was a different solution.
The issue doesn't just affect 4090s but the whole 40 series,
but fear not, there are ways around it.

I'm still tweaking some things to streamline an install on Windows, as there were many moving parts, so to speak,
but I've got it fully working, so it can be done, and when it works, it's pretty sweet.
I'll re-post a link shortly,
as I've included troubleshooting for all the various warning messages as well, which most tutorials skip over,
and I've tweaked some of the PyTorch Lightning .py files to include timestamps during training to make the output more readable.

@coffeecodeconverter commented Nov 30, 2024

INSTALL WSL

  1. Open CMD as administrator, and type:
wsl --install Ubuntu-22.04
  2. Create a user and password.
  3. Test WSL by running this command:

wsl --version

It should return a WSL version that's 2.0 or higher (WSL 1.0 won't work for this).
Example output:

WSL version: 2.3.26.0        <<<<< Higher than 2.0, we're good
Kernel version: 5.15.167.4-1
WSLg version: 1.0.65
MSRDC version: 1.2.5620
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.26100.1-240331-1435.ge-release
Windows version: 10.0.22621.3374
  4. Run this command to update Ubuntu:
sudo apt update

.
.
.
.

INSTALL PYTHON 3.10

  1. Run these commands to install Python:
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt install python3.10

  2. Run this to install Python virtual environments:
sudo apt install python3.10-venv
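A quick way to confirm the install worked (an extra check, not in the original steps); this should print Python 3.10.x:

python3.10 --version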

.
.
.
.

IF YOU HAVE AN NVIDIA GPU - INSTALL CUDA TOOLKIT
(if you only have a CPU, or a non-Nvidia GPU, skip this step)

  1. Install the CUDA Toolkit within WSL:
sudo apt install cuda-toolkit-11-8
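Note (an addition, per NVIDIA's WSL documentation): the GPU driver itself lives on the Windows side; inside WSL you only need the toolkit. You can confirm WSL sees the GPU with:

nvidia-smi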

.
.
.
.

INSTALL PIPER TTS & PIPER-TRAIN

  1. First, make sure you're in your home directory:
cd ~
  2. Create a new directory, such as "training":
mkdir training
  3. Navigate into the new directory:
cd training
  4. Download Piper TTS (which includes piper_train as well):
git clone https://github.com/rhasspy/piper.git
  5. Create a virtual environment:
python3 -m venv .venv
  6. Activate the virtual environment:
source .venv/bin/activate
  7. Install a specific version of pip:
python3 -m pip install pip==23.3.1
  8. Install a specific version of numpy (1.24.4):
python3 -m pip install numpy==1.24.4
  9. Navigate into the Piper src folder:
cd ~/training/piper/src/python
  10. Back up the existing requirements, and replace them with these:
pip==23.3.1
numpy==1.24.4
torchmetrics==0.11.4
wheel==0.45.1
setuptools==75.6.0
cython>=0.29.0,<1
librosa>=0.9.2,<1
piper-phonemize==1.1.0
onnxruntime>=1.11.0
pytorch-lightning==1.9.5
onnx
  11. Install the requirements:
pip3 install -e .
  12. If you have a 40-series NVIDIA GPU,
    install torch 2.0.0 with CUDA 11.8 specifically,
    because there's a bug with CUDA 11.7;
    see "No training possible on RTX 4090: CUFFT_INTERNAL_ERROR with torch < 2 (WSL2 & native Ubuntu Linux)" #295
    and the NVIDIA 4060 Ti issue #518:
pip install torch==2.0.0+cu118 torchaudio==2.0.0 -f https://download.pytorch.org/whl/torch_stable.html

If you have a non-40-series NVIDIA GPU, you can get away with CUDA 11.7 and any torch version from 1.13 up to 2.0.0:

pip install torch==1.13.1+cu117 torchaudio==1.13.1 -f https://download.pytorch.org/whl/torch_stable.html
... or any version in between ...
pip install torch==2.0.0+cu118 torchaudio==2.0.0 -f https://download.pytorch.org/whl/torch_stable.html

If you have an AMD GPU, I'm not sure. For CPU only, you can install any version of torch and torchaudio between 1.13.1 and 2.0.0:

pip install torch==1.13.1 torchaudio==0.13.1 -f https://download.pytorch.org/whl/torch_stable.html
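Whichever build you installed, a quick sanity check (an extra step, not in the original guide) that torch imports and sees CUDA:

python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"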
  13. Hopefully, your pip list now looks like this:
absl-py                  2.1.0
aiohappyeyeballs         2.4.3
aiohttp                  3.11.7
aiosignal                1.3.1
async-timeout            5.0.1
attrs                    24.2.0
audioread                3.0.1
certifi                  2024.8.30
cffi                     1.17.1
charset-normalizer       3.4.0
cmake                    3.31.1
coloredlogs              15.0.1
Cython                   0.29.37
decorator                5.1.1
filelock                 3.16.1
flatbuffers              24.3.25
frozenlist               1.5.0
fsspec                   2024.10.0
grpcio                   1.68.0
humanfriendly            10.0
idna                     3.10
Jinja2                   3.1.4
joblib                   1.4.2
lazy_loader              0.4
librosa                  0.10.2.post1
lightning-utilities      0.11.9
lit                      18.1.8
llvmlite                 0.43.0
Markdown                 3.7
MarkupSafe               3.0.2
mpmath                   1.3.0
msgpack                  1.1.0
multidict                6.1.0
networkx                 3.4.2
numba                    0.60.0
numpy                    1.24.4
nvidia-cublas-cu11       11.11.3.6
nvidia-cublas-cu12       12.4.5.8
nvidia-cuda-cupti-cu11   11.8.87
nvidia-cuda-cupti-cu12   12.4.127
nvidia-cuda-nvrtc-cu11   11.8.89
nvidia-cuda-nvrtc-cu12   12.4.127
nvidia-cuda-runtime-cu11 11.8.89
nvidia-cuda-runtime-cu12 12.4.127
nvidia-cudnn-cu11        9.1.0.70
nvidia-cudnn-cu12        9.1.0.70
nvidia-cufft-cu11        10.9.0.58
nvidia-cufft-cu12        11.2.1.3
nvidia-curand-cu11       10.3.0.86
nvidia-curand-cu12       10.3.5.147
nvidia-cusolver-cu11     11.4.1.48
nvidia-cusolver-cu12     11.6.1.9
nvidia-cusparse-cu11     11.7.5.86
nvidia-cusparse-cu12     12.3.1.170
nvidia-nccl-cu11         2.21.5
nvidia-nccl-cu12         2.21.5
nvidia-nvjitlink-cu12    12.4.127
nvidia-nvtx-cu11         11.8.86
nvidia-nvtx-cu12         12.4.127
onnx                     1.17.0
onnxruntime              1.20.1
packaging                24.2
pillow                   10.2.0
pip                      23.3.1
piper-phonemize          1.1.0
piper_train              1.0.0 
platformdirs             4.3.6
pooch                    1.8.2
propcache                0.2.0
protobuf                 5.28.3
pycparser                2.22
pyDeprecate              0.3.2
pytorch-lightning        1.9.5
PyYAML                   6.0.2
requests                 2.32.3
scikit-learn             1.5.2
scipy                    1.14.1
setuptools               75.6.0
six                      1.16.0
soundfile                0.12.1
soxr                     0.5.0.post1
sympy                    1.13.1
tensorboard              2.18.0
tensorboard-data-server  0.7.2
threadpoolctl            3.5.0
torch                    2.0.0+cu118
torchaudio               2.0.0+cu118
torchmetrics             0.11.4
tqdm                     4.67.0
triton                   2.0.0
typing_extensions        4.12.2
urllib3                  2.2.3
Werkzeug                 3.1.3
wheel                    0.45.1
yarl                     1.18.0
  14. OPTIONAL (this just confirms CUDA is installed properly):
    I created the custom script below to give an easy output of your detected CUDA versions within your venv.
    It's not needed, but it's useful for troubleshooting and for confirming you've got everything available.
    To use it:
  • copy all of the code below
  • save it to a new file, like "checkCUDA.py"
  • run it while your venv is activated
import torch
import os

def print_cuda_details():
    if torch.cuda.is_available():
        # GPU device details
        device_id = torch.cuda.current_device()
        device_name = torch.cuda.get_device_name(device_id)
        total_memory = torch.cuda.get_device_properties(device_id).total_memory / (1024**3)  # GB
        memory_allocated = torch.cuda.memory_allocated(device_id) / (1024**3)  # GB
        memory_cached = torch.cuda.memory_reserved(device_id) / (1024**3)  # GB
        multiprocessors = torch.cuda.get_device_properties(device_id).multi_processor_count
        
        # CUDA and PyTorch versions
        cuda_version = torch.version.cuda
        pytorch_version = torch.__version__

        # Device properties
        device_properties = torch.cuda.get_device_properties(device_id)
        compute_capability = device_properties.major, device_properties.minor
        
        # Display the details with improved formatting
        print("~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~")
        print(f"")
        print("Checking CUDA information within the python environment...")
        print(f"")

        print("###########################################################")
        print(f"CUDA Available:         TRUE")
        print(f"Using GPU:              {device_name}")
        print("###########################################################")
        print(f"")

        print(f"CUDA Version:           {cuda_version}")
        print(f"PyTorch Version:        {pytorch_version}")
        print(f"Device ID:              {device_id}")
        print(f"Total Memory:           {total_memory:.2f} GB")
        print(f"Memory Allocated:       {memory_allocated:.2f} GB")
        print(f"Memory Cached:          {memory_cached:.2f} GB")
        print(f"Multiprocessors:        {multiprocessors}")
        print(f"Compute Capability:     {compute_capability[0]}.{compute_capability[1]}")
        print(f"")

        # Device Name
        print(f"CUDA Device Name:       {torch.cuda.get_device_name(device_id)}")
        print(f"")

        # CUDA Path (from environment variables)
        cuda_path = os.getenv('CUDA_PATH', 'Not set')
        print(f"CUDA Path:              {cuda_path}")
        
        # CUDA Toolkit Directory (check via environment variable if available)
        cuda_toolkit_dir = os.getenv('CUDA_HOME', 'Not set')
        print(f"CUDA Toolkit Dir:       {cuda_toolkit_dir}")
        
        # Environment Variables
        print(f"\n")
        print("=================================================")
        print("--- Below are OS SYSTEM Environment Variables ---")
        print(" not just whats available within the python VENV")
        print("=================================================")
        print(f"")

        print(f"{'CUDA_HOME:':<25} {os.environ.get('CUDA_HOME', 'Not set')}")
        print(f"")

        print(f"{'CUDA_VISIBLE_DEVICES:':<25} {os.environ.get('CUDA_VISIBLE_DEVICES', 'Not set')}")
        print(f"")

        print(f"{'LD_LIBRARY_PATH:':<25} {os.environ.get('LD_LIBRARY_PATH', 'Not set')}")
        print(f"")

        print(f"{'PATH:':<25} {os.environ.get('PATH', 'Not set')}")
        print(f"")

        print("~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~")
    else:
        print("##############################################################")
        print("CUDA is NOT available! You may need to reinstall Pytorch/Torch")
        print("Falling back to using CPU.")
        print("##############################################################")
        print(f"")

        # Provide details for CPU-based configurations
        print(f"PyTorch Version:        {torch.__version__}")
        print("~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~")

# Call the function to print details
print_cuda_details()

which outputs this

Checking CUDA information within the python environment...

CUDA Available:         TRUE
Using GPU:              NVIDIA GeForce RTX 4060 Laptop GPU

CUDA Version:           11.8
PyTorch Version:        2.0.0+cu118
Device ID:              0
Total Memory:           8.00 GB
Memory Allocated:       0.00 GB
Memory Cached:          0.00 GB
Multiprocessors:        24
Compute Capability:     8.9

ALTERNATIVELY
there are other commands you can run to check the CUDA install without my custom script,
such as:

nvcc --version

and

nvidia-smi
  15. You should still be in the ~/training/piper/src/python folder;
    if not, change to it:
cd ~/training/piper/src/python
  16. Run the build_monotonic_align.sh script:
./build_monotonic_align.sh
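A later comment in this thread reports this build failing with a missing Python.h; if you hit that, install the Python development headers first (package name assuming the Python 3.10 used in this guide):

sudo apt install python3.10-dev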

.
.
.
.

PIPER PRE-TRAINING PRE-REQUISITES

  • You need a metadata.csv file that contains the filename of each .wav clip and a transcription of the audio, separated by a pipe (|).
    For example:

    test_wav001|this is a test transcription
    test_wav002|this is just to demonstrate
    test_wav003|the structure of the metadata.csv

  • And you need the .wav clips to go along with the metadata.csv.

  1. We'll first create a folder for our metadata and .wav clips.
    Navigate to your home directory:
cd ~
  2. Create a dataprep folder:
mkdir dataprep
  3. Copy in your .wav files and metadata so the structure looks like this
    (note the folder should be called "wavs", not "wav"):
|wavs
|   -- test_wav001.wav
|   -- test_wav002.wav
|   -- test_wav003.wav
|metadata.csv

From Windows, this folder will be at:

\\wsl.localhost\Ubuntu\home\**YOURUsername**\dataprep\wavs
\\wsl.localhost\Ubuntu\home\**YOURUsername**\dataprep\metadata.csv
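Before preprocessing, a quick sanity check (an extra step, not in the original guide) that every metadata row has exactly two pipe-separated fields and a matching .wav file; this guards against the "Not enough columns" and skipped-utterance errors covered in the troubleshooting section below:

cd ~/dataprep
awk -F'|' 'NF != 2 {print "line " NR ": expected 2 fields, got " NF}' metadata.csv
cut -d'|' -f1 metadata.csv | while read -r name; do
  [ -f "wavs/$name.wav" ] || echo "missing: wavs/$name.wav"
done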

.
.
.
.

RUNNING PIPER PRE-PROCESSING

  1. Navigate back to the piper src python folder
    if you're not already there:
cd ~/training/piper/src/python
  2. Run this pre-process command
    (you can change the --output-dir if you want):
python3 -m piper_train.preprocess \
--language en \
--input-dir ~/dataprep \
--output-dir ~/train-me \
--dataset-format ljspeech \
--single-speaker \
--sample-rate 22050

Now, the output-dir will contain:

  • cache folder
  • config.json
  • dataset.jsonl
  3. Rather than starting with a blank model from scratch,
    finetune an existing model from Hugging Face:
https://huggingface.co/datasets/rhasspy/piper-checkpoints/tree/main

For example, the English UK models are here:

https://huggingface.co/datasets/rhasspy/piper-checkpoints/tree/main/en/en_GB

Whatever model you pick, you need to download its "checkpoint" file (.ckpt).
They'll be named something like "epoch=3479-step=939600.ckpt",
showing what epoch they've been trained to, which you can carry on from.

  4. Create a new directory for the checkpoint.
    Navigate back to your home directory:
cd ~
  5. Create a checkpoint-files folder:
mkdir checkpoint-files
  6. Copy your chosen checkpoint file into the project:
\\wsl.localhost\Ubuntu\home\**YOURUsername**\checkpoint-files
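Alternatively (an extra option, not in the original steps), you can download the checkpoint directly inside WSL. The exact URL depends on the voice and quality folder you picked on Hugging Face, so treat this as a template with placeholders:

cd ~/checkpoint-files
wget "https://huggingface.co/datasets/rhasspy/piper-checkpoints/resolve/main/en/en_GB/<voice>/<quality>/epoch=3479-step=939600.ckpt"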
  7. Now you're ready to train!
    .
    .
    .
    .

RUNNING PIPER TRAINING

  1. Navigate back to the piper src python folder
    if you're not already there:
cd ~/training/piper/src/python
  2. Run this command to begin training, using the downloaded checkpoint file as a starting point.
    PLEASE NOTE:
    You may need to amend the --dataset-dir path if you picked other locations; it must point to where the dataset.jsonl is located.
    If you're not using a GPU, remove the "--accelerator 'gpu'" and "--gpus 1" lines.
    If you have low RAM, try reducing the batch size from 32, to 16, to 8, to 4, 2, etc.
    Amend the path to YOUR checkpoint file; the below is just an example.
    Set --max_epochs to a number higher than the model's current epoch. You'll see noticeable changes within 50-100 epochs, but 1000-5000 epochs are recommended for a very good-sounding model.
    If you picked a high-quality checkpoint file, amend --quality to high (or low if you picked low); it needs to match the checkpoint model you downloaded.
    Set --checkpoint-epochs 1 if you want more frequent checkpoint saves; you can increase it to 5 or 10 for better I/O usage.
python3 -m piper_train \
--dataset-dir ~/train-me \
--accelerator 'gpu' \
--gpus 1 \
--batch-size 32 \
--validation-split 0.0 \
--num-test-examples 0 \
--max_epochs 700 \
--resume_from_checkpoint "~/checkpoint-files/epoch=686-step=993918.ckpt" \
--checkpoint-epochs 1 \
--precision 32 \
--max-phoneme-ids 400 \
--quality medium
  3. Watch the output, and make sure you DON'T see:
WARNING:vits.dataset:Skipped X utterance(s)

It should read your dataset
and give you a few other warnings, but they shouldn't stop it.

It'll resume from the checkpoint,

hopefully start CUDA successfully,

and eventually, if all is working, it will start running through
the epochs.

DEBUG:fsspec.local: [2024-11-30 17:49:04] open: /train-me/lightning_logs/version_19/checkpoints/epoch=1004-step=996462.ckpt
DEBUG:fsspec.local: [2024-11-30 17:51:48] open: /train-me/lightning_logs/version_19/checkpoints/epoch=1009-step=996502.ckpt
DEBUG:fsspec.local: [2024-11-30 17:54:36] open: /train-me/lightning_logs/version_19/checkpoints/epoch=1014-step=996542.ckpt
DEBUG:fsspec.local: [2024-11-30 17:56:58] open: /train-me/lightning_logs/version_19/checkpoints/epoch=1019-step=996582.ckpt
DEBUG:fsspec.local: [2024-11-30 17:59:31] open: /train-me/lightning_logs/version_19/checkpoints/epoch=1024-step=996622.ckpt
DEBUG:fsspec.local: [2024-11-30 18:02:06] open: /train-me/lightning_logs/version_19/checkpoints/epoch=1029-step=996662.ckpt
DEBUG:fsspec.local: [2024-11-30 18:04:44] open: /train-me/lightning_logs/version_19/checkpoints/epoch=1034-step=996702.ckpt
DEBUG:fsspec.local: [2024-11-30 18:07:09] open: /train-me/lightning_logs/version_19/checkpoints/epoch=1039-step=996742.ckpt
DEBUG:fsspec.local: [2024-11-30 18:09:41] open: /train-me/lightning_logs/version_19/checkpoints/epoch=1044-step=996782.ckpt
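While training runs, you can optionally watch progress in TensorBoard (an extra step, not in the original guide; tensorboard is installed by the requirements above, and the logdir matches the lightning_logs path shown in the output):

tensorboard --logdir ~/train-me/lightning_logs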

.
.
.
.
EXPORTING TRAINED MODEL TO ONNX

  1. If you're not in the piper src python folder,
    change to it:
cd ~/training/piper/src/python
  2. Run this command to export the trained model to .onnx format.
    The only arguments "piper_train.export_onnx" needs are the path to the trained checkpoint file
    and the output path for the .onnx version of it,
    so you may need to amend the paths below; they're just to demonstrate:
python3 -m piper_train.export_onnx "/train-me/lightning_logs/**VersionYourCheckpointIsIn**/checkpoints/epoch=105-step=1050.ckpt" ~/exported-trained-models/GiveTheModelAName.onnx
  3. Once exported,
    go back to the directory containing the dataset.jsonl that it trained on,
    for example:
cd ~/train-me
  4. You'll see a config.json alongside your dataset.jsonl:
cache
lightning_logs
config.json
dataset.jsonl
  5. Copy the config.json only,
    and paste it into the directory where the exported .onnx model is,
    for example, alongside:
~/exported-trained-models/GiveTheModelAName.onnx

so you should now have

~/exported-trained-models/config.json
~/exported-trained-models/GiveTheModelAName.onnx

This is everything you need to run it in Piper TTS!
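For a quick test (an extra step, not in the original guide; flag names per the piper command-line README), you can pipe text through the piper binary:

echo 'testing my new voice' | piper --model ~/exported-trained-models/GiveTheModelAName.onnx --config ~/exported-trained-models/config.json --output_file test.wav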

  6. Just to clean up, though,
    it's a good idea to rename these files with a convention like:
Lang_originalModel_NewModel_epoch_quality

For example, let's say we:

  • started with the English UK Cori model,
  • at epoch 640,
  • trained it up to epoch 1000,
  • renamed it to sarah,
  • at medium quality.

Then sensible filenames would be:

en_GB_cori640_sarah_1000_medium.onnx
en_GB_cori640_sarah_1000_medium.json
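For example, using the paths from the export step above:

cd ~/exported-trained-models
mv GiveTheModelAName.onnx en_GB_cori640_sarah_1000_medium.onnx
mv config.json en_GB_cori640_sarah_1000_medium.json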
  7. These two files (.json and .onnx) together are the new trained model;
    they can be used in Piper TTS like any of the other models you can download.
    .
    .
    .
    .

TROUBLESHOOTING

Error - CUFFT_INTERNAL_ERROR with torch
solutions:
#295
#606

Error - ValueError: n must be at least one
solutions:
#297
#368

Error - The number of training batches (X) is smaller than the logging interval Trainer(log_every_n_steps=X)
solutions:
#664

Error - Skipped Utterances
solutions:
#663

Error - Trainer.fit stopped: No training batches / WARNING:vits.dataset:Skipped X utterance(s)
solutions:
#413

Error - The dataloader, train_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the num_workers argument
solutions:
#662

Error - How to properly utilize Tensor Cores (torch.set_float32_matmul_precision)
solutions:
#660

Error - AssertionError: Not enough columns
solutions:
#246

Problem - Seems to Get Stuck / Take Ages
Possible Solutions:
#118

@coffeecodeconverter

Last but not least,
I'll link to this article:
#665

which shows how you can amend some of the PyTorch files to get timestamps in your training output
and make the whole thing more readable.
It's an entirely optional step, but for me it was well worth the effort.
I can now leave it running long training sessions, periodically check in, and accurately gauge when the next checkpoint will complete, how long I have left, etc.

@thelabcat commented Dec 24, 2024

./build_monotonic_align.sh fails:

/home/wilbur/bin/pipertrain/piper/src/python/piper_train/vits/monotonic_align/core.c:18:10: fatal error: Python.h: No such file or directory
   18 | #include "Python.h"
      |          ^~~~~~~~~~
compilation terminated.
error: command '/usr/bin/gcc' failed with exit code 1

EDIT: Thankfully, an easy fix. I needed the Python 3.11 (or 3.10, per the guide) development libraries. Source for this
