Cannot install DeepSpeed on Ubuntu 20.04 #425
Comments
Hi @drfinkus, thanks for your detailed report. Is there other output from the build process you could capture and share? Our You might also try disabling the extension building to do a python-only install. I'd be very curious if that works and would also help narrow down the issue. To do that, just prepend
Same error, identified core dump as being caused by an import:
Replicable via:
Version:
Here is the full log, as requested:
Trying with DS_BUILD_CUDA=0 gives the same result:
In addition to the logs above, I can confirm the same issue as @rople380
Python: 3.8.2
Also, I note that @rople380 seems to have Anaconda installed. I have miniconda installed, FWIW. I found these issues which may or may not be relevant here:
Out of curiosity, @rople380, what CPU do you have? AMD Threadripper 1920x here.
@tjruwase thanks for the help, but unfortunately, I get the same error as before. I wonder if @rople380 got it working.
@drfinkus Sorry that this is still a problem. While I see the floating point exception in the log, I am unable to see the cause. Can you please rerun install.sh incrementally to reduce the log size? You can do this by adding
@tjruwase sure, thanks for the help! See below, hope this helps and let me know if I can help in any other way!
@tjruwase did the
@drfinkus, thanks for sharing the log. It confirms that the error is in
@tjruwase absolutely, please see below!
Not very helpful unfortunately, but I hope this helps somehow! Let me know if I can help debug further!
@tjruwase @Sleepychord thanks for the suggestion! But if I read this correctly, this would disable CPUAdam. That would cripple it a lot for my use case. I was specifically looking for the optimized Adam implementation. Is it an AMD issue? Or do you see this on Intel? Is it possible to fix cpufeature? I reported it upstream but did not get much further.
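For anyone trying to narrow down a crash like this, one option is to import the suspect module in a child process and look at the exit status. This is only a sketch under assumptions: `probe_import` is a hypothetical helper name, `python3` is assumed to be on the PATH, and `cpufeature` is used as the example only because the thread points at it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: import one Python module in a child process so that a
# crash (for example a floating point exception) cannot take down the caller.
probe_import() {
    local module="$1"
    local status=0
    python3 -c "import ${module}" >/dev/null 2>&1 || status=$?
    if [ "$status" -eq 0 ]; then
        echo "${module}: import ok"
    elif [ "$status" -gt 128 ]; then
        # The shell reports death by signal N as 128 + N (SIGFPE is 8 on Linux).
        echo "${module}: killed by signal $((status - 128))"
    else
        echo "${module}: import failed with exit status ${status}"
    fi
}

# Example module taken from the discussion above.
probe_import cpufeature
```

A "killed by signal 8" result would be consistent with the floating point exception seen in the install log, while a plain import failure (exit status 1) would only mean the module is not installed in the active environment.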
@tjruwase @Sleepychord @rople380, after further investigation, it seems like I also issued a PR; hopefully it will make its way downstream soon. I am in contact with the author. I will see if this fixes the install issue and close the issue if it does.
@tjruwase btw, as a suggestion, I see that in the master branch, cpu_adam is disabled by default:
It's very easy to miss unless you're specifically looking for it. Perhaps document it somewhere, or maybe turn the default back on as for the rest. Now that
@drfinkus Thanks for your help with this issue. We further improved installation recently and removed the dependency on cpufeature. Can you please check if you still see installation issues? Thanks.
Attempting to install DeepSpeed using the following steps:
python3 -m venv env
source env/bin/activate
./install.sh
gcc: 8.4.0
nvcc: 10.2
g++: 8.4.0
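The versions listed above can be gathered in one go when filing a report like this. A minimal sketch, assuming the tools are looked up on the PATH; `report_version` is a hypothetical helper name, not part of DeepSpeed or its install script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the --version output of a tool, or say that it
# is not on the PATH. Output formats differ between tools, so it is printed
# unmodified under a short header.
report_version() {
    local tool="$1"
    if command -v "$tool" >/dev/null 2>&1; then
        echo "== ${tool} =="
        "$tool" --version 2>&1
    else
        echo "${tool}: not found"
    fi
}

# Tools named in this report, plus the Python interpreter.
for tool in python3 gcc g++ nvcc; do
    report_version "$tool"
done
```

Running this inside the activated virtual environment shows the same interpreter and compilers that `./install.sh` would pick up.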
The install script throws the error message below: