[BUG] AssertionError: Unable to pre-compile ops without torch installed. #3329

Closed
JerryAllison opened this issue Apr 21, 2023 · 24 comments
Labels
bug Something isn't working build Improvements to the build and testing systems. documentation Improvements or additions to documentation

Comments

@JerryAllison

Issue with installing DeepSpeed, "pip install deepspeed" resulted in the following error:

image

System info (please complete the following information):

  • OS: [Windows 11 22H2]

  • GPU count and types [GTX 1060 6GB]

  • Python 3.10.5
    torch 2.0.0+cu118
    torchaudio 2.0.1+cu118
    torchvision 0.15.1+cu118

Please help me.

@JerryAllison JerryAllison added bug Something isn't working inference labels Apr 21, 2023
@JerryAllison
Author

Additional information:
image
PyTorch imports normally on my machine, but when I run pip install deepspeed, it always reports that it cannot import torch.

@mrwyattii
Contributor

@JerryAllison I suspect you are using pip>=23.1. A recent change in pip makes building in an isolated environment the default behavior, which means the DeepSpeed build will not find torch. You can fix this by running pip install . --no-build-isolation. We will update our docs to reflect this.
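
As a rough illustration of the workaround above, the sketch below first checks that torch is importable in the current environment and then builds the pip command with --no-build-isolation. The helper names torch_is_visible and install_command are hypothetical and are not part of DeepSpeed or pip; only the --no-build-isolation flag comes from the comment above.

```python
import importlib.util
import sys


def torch_is_visible() -> bool:
    """Return True if torch can be found in the current environment."""
    return importlib.util.find_spec("torch") is not None


def install_command(target: str = ".") -> list:
    """Build a pip command that skips the isolated build environment.

    `target` is "." for a source checkout, or a package name such as "deepspeed".
    """
    return [sys.executable, "-m", "pip", "install", target, "--no-build-isolation"]


if __name__ == "__main__":
    if not torch_is_visible():
        print("torch is not importable here; install it before building DeepSpeed")
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(install_command()))
```

Using sys.executable with -m pip keeps the install in the same interpreter that was checked for torch, which is the point of disabling build isolation.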

@mrwyattii mrwyattii self-assigned this Apr 21, 2023
@mrwyattii mrwyattii added documentation Improvements or additions to documentation build Improvements to the build and testing systems. and removed inference labels Apr 21, 2023
@JerryAllison
Author

Thanks, but now other warnings are appearing.

image

@AngelTs

AngelTs commented Apr 22, 2023

pip install . --no-build-isolation

Same error here

@KbKev78

KbKev78 commented Apr 23, 2023

Can confirm. I had the same torch error, then used --no-build-isolation to get past it. The deepspeed install then couldn't find libaio, so I installed that. Now I get the same error as JerryAllison.

For me, pip -V returns "pip 23.1.1"

@mrwyattii
Contributor

@JerryAllison It looks like you are trying to install on Windows? It can be a little tricky to get DeepSpeed installed on Windows (but it is possible). We highly recommend using WSL and installing DeepSpeed in that environment.

However, if you don't want to use WSL: The error you are seeing now is related to libaio not being available for Windows. You must disable pre-compilation of these features with set DS_BUILD_AIO=0.

@mrwyattii
Contributor

@AngelTs and @KbKev78 can you please provide some additional information about your environments? Are you also trying to install on Windows? Thanks

@KbKev78

KbKev78 commented Apr 25, 2023

@JerryAllison It looks like you are trying to install on Windows? It can be a little tricky to get DeepSpeed installed on Windows (but it is possible). We highly recommend using WSL and installing DeepSpeed in that environment.

However, if you don't want to use WSL: The error you are seeing now is related to libaio not being available for Windows. You must disable pre-compilation of these features with set DS_BUILD_AIO=0.

Correct. In my case I am installing on Windows. Where does the "set DS_BUILD_AIO=0" option go? Is it an environment variable?

@KbKev78

KbKev78 commented Apr 25, 2023

I found a resource elsewhere with this syntax: $env:DS_BUILD_OPS = 0, which appeared to do the trick.

This has got me to the next issue: sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.0

So I'm off to try and arrange that.
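
To make the version constraint in that message concrete, here is a minimal sketch of the range check it describes (>= 1.5 and < 2.0). This is not DeepSpeed's actual compatibility check; the function names are hypothetical, and only the bounds and the version string 2.0.0+cu118 come from this thread.

```python
def parse_major_minor(version: str) -> tuple:
    """Extract (major, minor) from a string such as '2.0.0+cu118'."""
    public = version.split("+", 1)[0]  # drop the local build tag, e.g. '+cu118'
    major, minor = public.split(".")[:2]
    return int(major), int(minor)


def sparse_attn_supported(torch_version: str) -> bool:
    """Apply the range from the error message: >= 1.5 and < 2.0."""
    return (1, 5) <= parse_major_minor(torch_version) < (2, 0)


print(sparse_attn_supported("2.0.0+cu118"))  # False: the version reported in this issue
print(sparse_attn_supported("1.13.1"))       # True
```

Comparing integer tuples avoids the string-comparison trap where "1.13" would sort before "1.5".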

@AngelTs

AngelTs commented Apr 25, 2023

@AngelTs and @KbKev78 can you please provide some additional information about your environments? Are you also trying to install on Windows? Thanks

Windows 10 Pro [22H2] [19045.2846], GTX 1060 6GB, pip 23.1, python 3.10.150.0

@alnroot

alnroot commented Apr 25, 2023

I have the same problem. Could this be a bug with the latest versions of the dependencies?

@mrwyattii
Contributor

I found a resource elsewhere with this syntax: $env:DS_BUILD_OPS = 0, which appeared to do the trick.

This has got me to the next issue: sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.0

So I'm off to try and arrange that.

@KbKev78 If you don't need sparse attention for your install, you can also disable that with DS_BUILD_SPARSE_ATTN=0 (similar to what you did with DS_BUILD_AIO=0). You can find the full list of environment variables that change installation behavior here: https://www.deepspeed.ai/tutorials/advanced-install/

@mrwyattii
Contributor

@alnroot and @AngelTs can you please try setting the following environment variables and try installing again?

DS_BUILD_AIO=0
DS_BUILD_SPARSE_ATTN=0
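
For anyone unsure where these variables go: they are ordinary environment variables that must be set in the process that runs pip. The sketch below is one way to do that from Python, assuming a source checkout and the --no-build-isolation flag mentioned earlier in this thread. The helper names build_env and install are hypothetical; only the two DS_BUILD_* names and their values come from the comment above.

```python
import os
import subprocess
import sys

# Flags named in this thread; "0" disables pre-compilation of that op.
DISABLED_OPS = {"DS_BUILD_AIO": "0", "DS_BUILD_SPARSE_ATTN": "0"}


def build_env(base=None):
    """Copy the given (or current) environment and add the DS_BUILD_* overrides."""
    env = dict(os.environ if base is None else base)
    env.update(DISABLED_OPS)
    return env


def install(target="."):
    """Run pip with the overrides applied to the child process only."""
    cmd = [sys.executable, "-m", "pip", "install", target, "--no-build-isolation"]
    return subprocess.run(cmd, env=build_env(), check=False).returncode
```

Calling install() from the DeepSpeed source directory should have the same effect as setting both variables in the shell session (for example with set DS_BUILD_AIO=0 in cmd.exe) and then running pip there, without changing the environment of the parent shell.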

@AngelTs

AngelTs commented Apr 27, 2023

@alnroot and @AngelTs can you please try setting the following environment variables and try installing again?

DS_BUILD_AIO=0
DS_BUILD_SPARSE_ATTN=0

After installing CUDA 11.7.0 (May 2022) instead of the newest CUDA 12.1.1 (April 2023) and running "python setup.py bdist_wheel", I get these errors:

csrc/transformer/inference/csrc/pt_binding.cpp(536): error C2398: Element '1': conversion from 'size_t' to '_Ty' requires a narrowing conversion
with
[
_Ty=int64_t
]
csrc/transformer/inference/csrc/pt_binding.cpp(1809): note: see reference to function template instantiation 'std::vector<at::Tensor,std::allocator<at::Tensor>> ds_softmax_context(at::Tensor &,at::Tensor &,int,bool,bool,int,float,bool,bool,int,bool,unsigned int,unsigned int,at::Tensor &)' being compiled
csrc/transformer/inference/csrc/pt_binding.cpp(537): error C2398: Element '2': conversion from 'size_t' to '_Ty' requires a narrowing conversion
with
[
_Ty=int64_t
]
csrc/transformer/inference/csrc/pt_binding.cpp(545): error C2398: Element '1': conversion from 'size_t' to '_Ty' requires a narrowing conversion
with
[
_Ty=int64_t
]
csrc/transformer/inference/csrc/pt_binding.cpp(546): error C2398: Element '2': conversion from 'size_t' to '_Ty' requires a narrowing conversion
with
[
_Ty=int64_t
]
error: command 'C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX86\x64\cl.exe' failed with exit code 2

C:\DeepSpeed-master>

@AngelTs

AngelTs commented Apr 28, 2023

A not-so-good but working solution for the errors above is to use an already built wheel, deepspeed-0.8.3+6eca037c-cp310-cp310-win_amd64.whl. With it I succeeded in installing DeepSpeed on Windows 10 without WSL, Anaconda, Miniconda, or anything of the kind:
pip install deepspeed-0.8.3+6eca037c-cp310-cp310-win_amd64.whl

@AngelTs

AngelTs commented Apr 29, 2023

Here is a quick tutorial on how to compile on a clean Windows 10 install, without WSL, conda, Docker, or anything of the kind:
1. Copy DeepSpeed-master.zip to the root of the C: drive.
2. In pt_binding.cpp, replace the following four lines:
{hidden_dim * InferenceContext::Instance().GetMaxTokenLenght(),
k * InferenceContext::Instance().GetMaxTokenLenght(),
{hidden_dim * InferenceContext::Instance().GetMaxTokenLenght(),
k * InferenceContext::Instance().GetMaxTokenLenght(),
with:
{static_cast<int64_t>(hidden_dim * InferenceContext::Instance().GetMaxTokenLenght()),
static_cast<int64_t>(k * InferenceContext::Instance().GetMaxTokenLenght()),
{static_cast<int64_t>(hidden_dim * InferenceContext::Instance().GetMaxTokenLenght()),
static_cast<int64_t>(k * InferenceContext::Instance().GetMaxTokenLenght()),
3. Run build_win.bat.
4. Go to the dist directory and install the newly created .whl file. In my case the command is:
pip install deepspeed-0.9.2+unknown-cp310-cp310-win_amd64.whl
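
Because the wheel file name changes with the DeepSpeed and Python versions, step 4 can be sketched in a version-independent way as follows. This is only an illustration: the helper names newest_wheel and wheel_install_command are hypothetical, and the dist directory name is taken from the step above.

```python
from pathlib import Path
import sys


def newest_wheel(dist_dir="dist"):
    """Return the most recently modified .whl file in dist_dir, or None."""
    wheels = sorted(Path(dist_dir).glob("*.whl"), key=lambda p: p.stat().st_mtime)
    return wheels[-1] if wheels else None


def wheel_install_command(wheel):
    """Build the pip command that installs the given wheel file."""
    return [sys.executable, "-m", "pip", "install", str(wheel)]


if __name__ == "__main__":
    wheel = newest_wheel()
    if wheel is None:
        print("no .whl file found in dist; run the build first")
    else:
        # Print the command instead of running it, so the sketch has no side effects.
        print(" ".join(wheel_install_command(wheel)))
```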

@GalaxyHe2023

DeepSpeed does not support Windows; please use WSL. I got the same error, and it was very easy to fix by using WSL. https://docs.microsoft.com/en-us/windows/wsl/install-win10

@CCodeInspect

CCodeInspect commented Nov 23, 2023

DeepSpeed does not support Windows; please use WSL. I got the same error, and it was very easy to fix by using WSL. https://docs.microsoft.com/en-us/windows/wsl/install-win10

Thanks. I have tried to download and install WSL on Windows. I hope WSL will work.

@CCodeInspect

CCodeInspect commented Nov 23, 2023

@JerryAllison It looks like you are trying to install on Windows? It can be a little tricky to get DeepSpeed installed on Windows (but it is possible). We highly recommend using WSL and installing DeepSpeed in that environment.

However, if you don't want to use WSL: The error you are seeing now is related to libaio not being available for Windows. You must disable pre-compilation of these features with set DS_BUILD_AIO=0.

I have already installed WSL. How can I use WSL to install DeepSpeed? Thank you.

@CCodeInspect

@alnroot and @AngelTs can you please try setting the following environment variables and try installing again?

DS_BUILD_AIO=0
DS_BUILD_SPARSE_ATTN=0

Where should I set these two parameters?

@CCodeInspect

Issue with installing DeepSpeed, "pip install deepspeed" resulted in the following error:

image

System info (please complete the following information):

  • OS: [Windows 11 22H2]
  • GPU count and types [GTX 1060 6GB]
  • Python 3.10.5
    torch 2.0.0+cu118
    torchaudio 2.0.1+cu118
    torchvision 0.15.1+cu118

Please help me.

I was able to solve this with the following command: pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

@maxbrunet

maxbrunet commented Mar 18, 2024

I think this might be fixable by declaring torch as a build requirement of DeepSpeed, following PEP 518. Concretely, that means adding a pyproject.toml to this repo with:

[build-system]
requires = [
    "setuptools",
    "torch",
]
build-backend = "setuptools.build_meta"

See also https://setuptools.pypa.io/en/latest/userguide/dependency_management.html#build-system-requirement

I have not tested this yet.

@oldgithubman

pip install . --no-build-isolation
works for me

@jomayeri
Contributor

jomayeri commented Sep 9, 2024

Please check out the latest version of DeepSpeed for Windows compatibility.

@jomayeri jomayeri closed this as completed Sep 9, 2024