Installation fails on pytorch-1.5.1 docker image, "fatal error: type_shim.h: No such file or directory" #280

Closed
vfdev-5 opened this issue Jun 28, 2020 · 5 comments

@vfdev-5
Contributor

vfdev-5 commented Jun 28, 2020

Hi,

I'm trying to install DeepSpeed with PyTorch 1.5.1, and it fails with the error:

/workspace/DeepSpeed/csrc/lamb/fused_lamb_cuda_kernel.cu:58:10: fatal error: type_shim.h: No such file or directory  

This is the same error as in #251.

Steps to reproduce:

docker pull pytorch/pytorch:1.5.1-cuda10.1-cudnn7-devel
docker run --rm -it --runtime=nvidia pytorch/pytorch:1.5.1-cuda10.1-cudnn7-devel /bin/bash

# inside docker container
apt-get update && apt-get install -y git
git clone https://github.com/microsoft/DeepSpeed
cd DeepSpeed
./install.sh -r

EDIT:
I could successfully build DeepSpeed after replacing the relative include_dirs path with an absolute path.

The issue may be related to pytorch/pytorch#37707
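
For readers hitting the same error, here is a minimal sketch of the workaround described above: resolving include directories against the directory that holds setup.py before handing them to the extension definition. The helper name `to_absolute_include_dirs` and the `csrc/includes` path are assumptions made for illustration and are not taken from the DeepSpeed sources.

```python
import os


def to_absolute_include_dirs(include_dirs, base_dir):
    """Resolve include directories against base_dir.

    Entries that are already absolute pass through (normalized only),
    because os.path.join discards base_dir when given an absolute path.
    """
    base_dir = os.path.abspath(base_dir)
    return [os.path.normpath(os.path.join(base_dir, d)) for d in include_dirs]


# Hypothetical use inside setup.py (the 'csrc/includes' path is an assumption):
#   here = os.path.dirname(os.path.abspath(__file__))
#   include_dirs = to_absolute_include_dirs(['csrc/includes'], here)
#   ... then pass include_dirs=include_dirs to the extension definition ...
```

With absolute paths, the header lookup no longer depends on the working directory the compiler is launched from.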

@tjruwase
Contributor

@vfdev-5 Thanks for reporting this problem and a solution.

I will close this issue. Please feel free to reopen it or report a new one as appropriate.

@vfdev-5
Contributor Author

vfdev-5 commented Jul 14, 2020

@tjruwase the solution is to manually replace the relative path with an absolute one. Support for relative paths will be enabled in upcoming versions of PyTorch. Meanwhile, I think it would be rather simple on your side to change the relative path to an absolute one so that users need not patch it. What do you think?

@tjruwase
Contributor

@vfdev-5 Yes, switching from relative paths to absolute paths is a good idea. Is it possible for you to submit a PR?

@vfdev-5
Contributor Author

vfdev-5 commented Jul 15, 2020

@tjruwase yes, I can do that, but I'm not sure about the timeline. Maybe this weekend or next week, if that suits you.

@jeffra
Collaborator

jeffra commented Jul 24, 2020

@vfdev-5 we have also seen ninja cause build issues that produce this type of error. If ninja is installed, the build will attempt to use it. We recently turned off ninja manually to help with issues like this: #298

Closing for now, feel free to re-open if there are further issues.
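
One way to turn off ninja in a setup.py that builds with PyTorch's extension tooling is sketched below. It assumes `BuildExtension.with_options(use_ninja=False)` from `torch.utils.cpp_extension`; the wrapper name `build_ext_without_ninja` is hypothetical, and this is not confirmed to be what #298 actually changed.

```python
def build_ext_without_ninja():
    """Return a build_ext command class with the ninja backend disabled."""
    # Imported lazily so this helper can be defined without PyTorch installed.
    from torch.utils.cpp_extension import BuildExtension

    # with_options returns a BuildExtension subclass that forwards these
    # keyword arguments to the constructor; use_ninja=False falls back to
    # the plain setuptools/distutils build.
    return BuildExtension.with_options(use_ninja=False)


# Hypothetical use in setup.py:
#   setup(..., cmdclass={'build_ext': build_ext_without_ninja()})
```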

jeffra closed this as completed Jul 24, 2020