[BUG] DeepSpeed build issues on Windows #3342
Comments
Hi @OUYANGSIR - are you running in the Windows Command Prompt or in WSL? Have you followed the Windows install directions here?
I ran it directly from the Command Prompt, and this problem occurred.
Please just use pip install deepspeed. “>=0.9” is not part of the command; it just indicates the recommended version.
That's correct. @OUYANGSIR - are you seeing this with pip install deepspeed as well? (That will give you the latest version, which is >0.9.0 anyway.) If you are still hitting this, please comment or re-open the issue. Otherwise, we will assume this is resolved.
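One likely reason the documented form misbehaves when typed as-is is that an unquoted `>` is treated as output redirection by the Windows Command Prompt and by POSIX shells, so the version specifier may never reach pip. A minimal sketch, assuming you want to keep the version constraint: pass the requirement as a single argument without going through a shell. The helper name `pip_install_command` is hypothetical and only for illustration.

```python
import sys


def pip_install_command(requirement):
    """Build a pip command as an argument list (hypothetical helper).

    Passing a list to subprocess avoids shell parsing, so characters
    such as '>' in "deepspeed>=0.9.0" are handed to pip unchanged.
    """
    return [sys.executable, "-m", "pip", "install", requirement]


if __name__ == "__main__":
    # Only print the command here; to actually install, pass the list to
    # subprocess.run(..., check=True).
    print(pip_install_command("deepspeed>=0.9.0"))
```

Quoting the requirement at the prompt (`pip install "deepspeed>=0.9.0"`) should have the same effect.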
Hi everyone,
(venv) (base) B:\AI\img\kohya_ss4\kohya_ss\venv\Scripts>pip install deepspeed
× python setup.py egg_info did not run successfully.
note: This error originates from a subprocess, and is likely not a problem with pip.
× Encountered error while generating package metadata.
If anyone knows how to explain this issue and how to solve it, let me know.
Hi @LeF0URBE - there is a set of special Windows install directions on the repo. Have you tried those? If not, we know there are some issues with this that we are working on, and Windows support is not necessarily all there. However, we do know that WSL works well on Windows, and if you're able to use it, that's currently recommended, as you can use all features of DeepSpeed there too. Does that help? If not, can you open a new issue?
@loadams I think I can tell you why so many people are having issues with Windows setup.
The main issue is this from the instructions https://github.com/microsoft/DeepSpeed#windows:
If someone on Windows is running CUDA apps with their standard Nvidia drivers and they run "pip show torch", it will quite happily tell them they have both PyTorch AND CUDA! (as below). All your other CUDA apps and Python scripts that use CUDA work fine on Windows, so it must be something wrong with DeepSpeed......... right? Well, it's a bit of both, but mainly the instructions are a bit lacking in depth. If I just run through the instructions, it still fails! You can see in the image below that it's failing on the CUDA_HOME environment variable and therefore failing to run NVCC.
First off, I don't think NVCC is included with the standard Nvidia Windows driver suite that you use for most CUDA apps, so you need to install the Nvidia CUDA Toolkit. That will get you NVCC installed on your system. (EDIT - I have just uninstalled my CUDA Toolkit and gone back to the Nvidia driver, and can confirm you DON'T get NVCC with the standard Nvidia driver on Windows.)
However, there is another issue. By default, the install routine for the Nvidia CUDA Toolkit does not appear to create the CUDA_HOME environment variable either (even after a reboot). To check your environment variables, open a command prompt and type set, which will list them. CUDA_HOME should have the same value as the CUDA_PATH environment variable that the installer DOES create. (Yes, I know CUDA 12.3 is not supported; as I show below, I was just in that Python environment when I took the screenshot.)
As I have installed the CUDA Toolkit 12.3, I can set my CUDA_HOME to be the same as my CUDA_PATH environment variable at the command prompt. The Nvidia installer on Windows only creates CUDA_PATH, but the DeepSpeed install wants the CUDA_HOME environment variable (they are both the same path).
Once you have done that, the install will continue, though I personally still have other issues to look at. The instructions on the front page for Windows should at least cover this (#4729). Thanks
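The checks described in this comment can be sketched in Python. This is a diagnostic sketch only, under the assumption that CUDA_HOME should point at the same directory as CUDA_PATH and that nvcc lives in its `bin` subfolder; `find_cuda_home` is a hypothetical helper, not part of DeepSpeed or PyTorch.

```python
import os
from pathlib import Path


def find_cuda_home(env=None):
    """Return (cuda_home, nvcc_path) from the environment (hypothetical helper).

    Prefers CUDA_HOME and falls back to CUDA_PATH, which is the variable
    the comment above reports the Windows CUDA Toolkit installer creates.
    nvcc_path is None when no nvcc binary is found under <cuda_home>/bin.
    """
    env = os.environ if env is None else env
    cuda_home = env.get("CUDA_HOME") or env.get("CUDA_PATH")
    if not cuda_home:
        return None, None
    bin_dir = Path(cuda_home) / "bin"
    # Windows ships nvcc.exe; other platforms ship plain nvcc.
    for name in ("nvcc.exe", "nvcc"):
        candidate = bin_dir / name
        if candidate.is_file():
            return cuda_home, str(candidate)
    return cuda_home, None


if __name__ == "__main__":
    home, nvcc = find_cuda_home()
    if home is None:
        print("Neither CUDA_HOME nor CUDA_PATH is set; is the CUDA Toolkit installed?")
    elif nvcc is None:
        print(f"No nvcc found under {home}; the driver alone does not provide it.")
    else:
        print(f"CUDA home: {home}")
        print(f"nvcc: {nvcc}")
        if "CUDA_HOME" not in os.environ:
            print("CUDA_HOME is not set; consider setting it to the path above.")
```

Running this before `pip install deepspeed` should show whether the missing piece is the toolkit itself or only the CUDA_HOME variable.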
Hi @erew123 - thanks for your comment on that. I will try to grab a fresh Windows machine and test the steps, and then we can get that PR reviewed/merged. For now, I'll point another user to this comment.
I encountered an error while installing with the command from the documentation.
Using: pip install deepspeed>=0.9.0
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [16 lines of output]
Traceback (most recent call last):
File "", line 2, in
File "", line 34, in
File "C:\Users\FZG\AppData\Local\Temp\pip-install-1mws4aau\deepspeed_2c8d7d0d390249b982dc5bb7cc184ec0\setup.py", line 81, in
cuda_major_ver, cuda_minor_ver = installed_cuda_version()
File "C:\Users\FZG\AppData\Local\Temp\pip-install-1mws4aau\deepspeed_2c8d7d0d390249b982dc5bb7cc184ec0\op_builder\builder.py", line 43, in installed_cuda_version
output = subprocess.check_output([cuda_home + "/bin/nvcc", "-V"], universal_newlines=True)
File "e:\install\python3.9.5\lib\subprocess.py", line 424, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "e:\install\python3.9.5\lib\subprocess.py", line 505, in run
with Popen(*popenargs, **kwargs) as process:
File "e:\install\python3.9.5\lib\subprocess.py", line 951, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "e:\install\python3.9.5\lib\subprocess.py", line 1420, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2]
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
My Environment:
Windows 10
CUDA Version: 12.1
Python: 3.9.5
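The traceback above ends in FileNotFoundError because `subprocess.check_output` is asked to launch an nvcc executable that does not exist at `cuda_home + "/bin/nvcc"`. A minimal sketch that reproduces the same call in isolation and reports the failure instead of crashing; `probe_nvcc` is a hypothetical name, and this is not how DeepSpeed itself handles the error.

```python
import subprocess


def probe_nvcc(cuda_home):
    """Run `<cuda_home>/bin/nvcc -V` and return its output, or None.

    Mirrors the call shown in the traceback. Returns None when the
    executable is missing, which is the FileNotFoundError
    ([WinError 2] on Windows) case from the report above.
    """
    nvcc = cuda_home + "/bin/nvcc"
    try:
        return subprocess.check_output([nvcc, "-V"], universal_newlines=True)
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    # The path below is a placeholder; substitute your own CUDA install directory.
    output = probe_nvcc("C:/path/to/cuda")
    print(output if output is not None else "nvcc not found at that location")
```

If this returns None for the directory your environment reports, the CUDA Toolkit (which provides nvcc) is probably not installed there.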