Installation on windows native #59

Open
batchfileframework opened this issue Apr 3, 2024 · 5 comments
Comments

@batchfileframework

Hi,

This issue is about finding an installation solution for native Windows, preferably (if possible at all) without using WSL, Docker, or conda.

Just stock Python and pip, maybe a venv, maybe some PowerShell, but preferably a pure batch install.

In reference to previous attempts

#28
#29
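For reference, the "stock Python, pip, and a venv" route can be sketched with the standard library alone. This is only an illustrative sketch, not the project's installer: the `.venv` directory name and the `requirements.txt` file name are assumptions, and nothing in this thread confirms which dependency file the repo actually ships.

```python
import os
import subprocess
import venv
from pathlib import Path


def venv_python(env_dir):
    """Return the interpreter path inside a venv for the current OS."""
    env_dir = Path(env_dir)
    if os.name == "nt":
        # Windows venvs place executables under Scripts\
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def pip_install_command(env_dir, requirements="requirements.txt"):
    """Build the pip command line; the requirements file name is hypothetical."""
    return [str(venv_python(env_dir)), "-m", "pip", "install", "-r", str(requirements)]


def create_env_and_install(env_dir=".venv", requirements="requirements.txt"):
    """Create the venv (with pip) and install the listed dependencies into it."""
    venv.create(env_dir, with_pip=True)
    subprocess.run(pip_install_command(env_dir, requirements), check=True)
```

Calling `create_env_and_install()` from the repo root would create the environment and run pip inside it. A batch file could then reduce to a single `python` invocation of such a script.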

@Sewlell
Contributor

Sewlell commented Apr 3, 2024

The creator of #29 here

As I see it, it is entirely doable for developers or contributors to make a webui-user.bat or start.bat like the classic A1111 Stable Diffusion, especially since most dependencies can be easily downloaded on Windows without compatibility issues with the OS, Python, or anything else (except espeak-ng, which, as I mentioned in #29, can't be downloaded through the command prompt).

@yumlevi

yumlevi commented Apr 3, 2024

it should be very doable, just need to replace espeak with another timestamp aligner

@lukaszliniewicz

lukaszliniewicz commented Apr 3, 2024

I made an API version that is compatible with Windows (currently only for TTS, not speech modification). See https://github.com/lukaszliniewicz/VoiceCraft_API. If you test it, please let me know if everything works. It is not exactly what you're looking for, and it uses conda (I think it's a very good method, but everyone has their preference). Still, you can use the modified audiocraft files, the USER and espeak solution from api.py and run inferences with Python in a venv. It comes with espeak. I will make an automatic installer for it or at least include it with my audiobook app (https://github.com/lukaszliniewicz/Pandrator).

@yumlevi
Espeak is not a problem. You can install it using the official Windows installer and take the contents of its folder in ProgramFiles, create an espeak directory in the main directory of the repo, paste them and do this (or use my fork, which already has the espeak folder and the files):

import getpass
import os

# Get the current username
username = getpass.getuser()

# Set the USER environment variable to the username
# (Windows normally defines USERNAME, not USER)
os.environ['USER'] = username

# Point phonemizer at the bundled espeak-ng library
os.environ['PHONEMIZER_ESPEAK_LIBRARY'] = './espeak/libespeak-ng.dll'
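A slightly more defensive variant of the same idea, as a minimal sketch: it assumes the espeak files were copied into an `espeak` folder at the repo root as described above, resolves the DLL to an absolute path so it does not depend on the current working directory, and fails early if the file is missing. The function name `configure_espeak` is made up for illustration and is not part of api.py.

```python
import getpass
import os
from pathlib import Path


def configure_espeak(repo_root, dll_name="libespeak-ng.dll"):
    """Set USER and PHONEMIZER_ESPEAK_LIBRARY for a bundled espeak folder.

    Assumes the espeak files live in <repo_root>/espeak.
    Raises FileNotFoundError if the DLL is not there.
    """
    dll_path = Path(repo_root).resolve() / "espeak" / dll_name
    if not dll_path.is_file():
        raise FileNotFoundError(f"espeak library not found at {dll_path}")

    # Windows usually defines USERNAME rather than USER; fill it in only if unset.
    if "USER" not in os.environ:
        os.environ["USER"] = getpass.getuser()

    # An absolute path keeps this working regardless of the working directory.
    os.environ["PHONEMIZER_ESPEAK_LIBRARY"] = str(dll_path)
    return dll_path
```

This would need to run before the phonemizer backend is first used, for example via `configure_espeak(Path(__file__).parent)` near the top of the entry script.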

@lukaszliniewicz

I added VoiceCraft to my audiobook/dubbing generator app: https://github.com/lukaszliniewicz/Pandrator. It has a one-click Windows installer and installs the API (https://github.com/lukaszliniewicz/VoiceCraft_API).

@Lexcess

Lexcess commented Apr 5, 2024

> The creator of #29 here
>
> As I see it, it is entirely doable for developers or contributors to make a webui-user.bat or start.bat like the classic A1111 Stable Diffusion, especially since most dependencies can be easily downloaded on Windows without compatibility issues with the OS, Python, or anything else (except espeak-ng, which, as I mentioned in #29, can't be downloaded through the command prompt).

Vall-E-EX did a great job with a cross-platform Gradio frontend for TTS that just works. It has lots of cool features beyond basic TTS, such as recording audio from a microphone, pasting in transcripts, managing voice presets, and so on. It might be a good inspiration, or even adaptable with attribution.
