installation issue #23

Closed
ibulu opened this issue Jul 1, 2018 · 8 comments


ibulu commented Jul 1, 2018

I am really excited about trying this. But, every time I try installing, I am getting the following error:
torch.__version__ = 0.5.0a0+03e7953
Found CUDA_HOME = C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2
Traceback (most recent call last):
  File "setup.py", line 105, in <module>
    CUDA_MAJOR = get_cuda_version()
  File "setup.py", line 85, in get_cuda_version
    re.compile('nvcc$').search)
  File "setup.py", line 38, in find
    return list(set(collection))
TypeError: 'NoneType' object is not iterable

Contributor

mcarilli commented Jul 1, 2018

Looks like you're on Windows. I don't have access to a Windows machine with Pytorch right now, but I'll find one tomorrow to test the build. Typically, Linux is what we recommend, so if you have the option to run on a Linux machine, that should work right away.
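One plausible reading of the traceback above (a guess, not a description of the actual setup.py or of the eventual fix): the pattern 'nvcc$' only matches names that end in "nvcc", while the Windows binary is nvcc.exe, so the search finds nothing and a None result reaches list(set(...)). The sketch below reproduces that failure mode in isolation. The helper find_first and the relaxed pattern are hypothetical and are not taken from Apex.

```python
import re

# Path taken from the nvcc error message reported in this thread.
WINDOWS_NVCC = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\bin\nvcc.exe"
# Assumed typical Linux location, included only for contrast.
LINUX_NVCC = "/usr/local/cuda/bin/nvcc"

# The pattern shown in the traceback: anchored at end of string, so ".exe" defeats it.
strict = re.compile(r"nvcc$")
# Hypothetical relaxed pattern that also accepts the Windows executable suffix.
relaxed = re.compile(r"nvcc(\.exe)?$")


def find_first(paths, matcher):
    """Return the first path the matcher accepts, or None if nothing matches."""
    for path in paths:
        if matcher(path):
            return path
    return None


strict_hit = find_first([WINDOWS_NVCC], strict.search)
relaxed_hit = find_first([WINDOWS_NVCC], relaxed.search)

print("strict pattern found: ", strict_hit)
print("relaxed pattern found:", relaxed_hit)

# Passing a None result on to set() raises the same error class as the traceback.
try:
    list(set(strict_hit))
except TypeError as exc:
    print("TypeError:", exc)
```

If this reading is right, either a pattern that tolerates the .exe suffix or an explicit None check before building the set would avoid the crash.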

Author

ibulu commented Jul 1, 2018

Thank you. I really appreciate that. I modified the setup file and got as far as:

c:\program files\nvidia gpu computing toolkit\cuda\v9.2\include\crt/host_config.h(133): fatal error C1189: #error: -- unsupported Microsoft Visual Studio version! Only the versions 2012, 2013, 2015 and 2017 are supported!
error: command 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\bin\nvcc.exe' failed with exit status 2

I have visual studio 2017 community edition installed. I compiled pytorch from source using visual studio 2017 without a problem.

Contributor

mcarilli commented Jul 1, 2018

I admit I'm not an expert on Windows builds. I'll have to ask some other people tomorrow.

In the meantime, if you want an immediate path forward, some of the utilities in Apex are Python-only. For example, FP16_Optimizer and DistributedDataParallel technically don't require building the C backend. One item on my to-do list is creating a Python-only "apex-lite" install option.

The necessary-and-sufficient set of Python files to use FP16_Optimizer is fp16_optimizer.py, fp16_util.py, and loss_scaler.py from the fp16_utils directory.

The necessary-and-sufficient set of Python files to use apex DistributedDataParallel is distributed.py and multiproc.py from the parallel directory.

If you copy those files alongside your training script, you should be able to say e.g. from fp16_optimizer import FP16_Optimizer or from distributed import DistributedDataParallel as DDP and they should work without needing to run the setup.py.
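A minimal sketch of how a training script might prefer an installed Apex and fall back to the copied files described above. The helper import_first_available is hypothetical. The module path apex.fp16_utils is assumed from the directory name mentioned in this comment, and the fallback assumes fp16_optimizer.py sits next to the script.

```python
import importlib


def import_first_available(candidates):
    """Try (module_name, attribute) pairs in order and return the first attribute found.

    Returns None when no candidate module can be imported.
    """
    for module_name, attribute in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module missing (or one of its dependencies); try the next one
        if hasattr(module, attribute):
            return getattr(module, attribute)
    return None


# Prefer the packaged location, then the file copied alongside the script.
FP16_Optimizer = import_first_available([
    ("apex.fp16_utils", "FP16_Optimizer"),  # assumed packaged import path
    ("fp16_optimizer", "FP16_Optimizer"),   # local copy, as suggested in this thread
])

if FP16_Optimizer is None:
    print("FP16_Optimizer not found: install Apex or copy the three files next to this script.")
```

The same helper could be pointed at distributed.py for DistributedDataParallel. The ordering is a design choice: trying the installed package first means the copied files act only as a stopgap.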

Thank you for your interest in Apex and your feedback. It's very helpful to know what issues people are encountering, especially at this early stage.

Author

ibulu commented Jul 1, 2018

Thanks. Really appreciate your suggestion. I will give it a try on a newly received titan V :-)

Contributor

mcarilli commented Jul 3, 2018

@ibulu When you installed pytorch from source, did you run the following lines

set "VS150COMNTOOLS=C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\VC\Auxiliary\Build"
...
call "%VS150COMNTOOLS%\vcvarsall.bat" x64 -vcvars_ver=14.11

before running python setup.py install? Apex needs that same environment to build, so you will also need to run those lines before running Apex's setup.py. If you installed PyTorch within a Conda environment from the command line, you'll also want to install Apex within that same Conda environment.
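A rough pre-flight check for the environment described above, written as a sketch and not as part of Apex. It assumes that vcvarsall.bat leaves cl.exe on PATH and that an activated Conda environment sets CONDA_DEFAULT_ENV. The function name check_build_env and its parameters are hypothetical.

```python
import os
import shutil


def check_build_env(env=None, which=shutil.which, expect_conda=True):
    """Return a list of human-readable problems; an empty list means every check passed."""
    env = os.environ if env is None else env
    problems = []

    # After vcvarsall.bat has run, the MSVC compiler should be discoverable on PATH.
    if which("cl") is None:
        problems.append("cl.exe is not on PATH; call vcvarsall.bat first")

    # The thread sets this variable by hand before calling vcvarsall.bat.
    if not env.get("VS150COMNTOOLS"):
        problems.append("VS150COMNTOOLS is not set")

    # Only relevant when PyTorch was installed inside a Conda environment.
    if expect_conda and not env.get("CONDA_DEFAULT_ENV"):
        problems.append("no active Conda environment detected")

    return problems


if __name__ == "__main__":
    found = check_build_env()
    for problem in found:
        print("warning:", problem)
    if not found:
        print("environment looks ready for python setup.py install")
```

Run it from the same command prompt you intend to build in, since these variables are per-shell. Pass expect_conda=False if PyTorch was not installed through Conda.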

Author

ibulu commented Jul 4, 2018

I had a chance to try to install apex the same way I installed pytorch, but I am still getting the following error:

torch.__version__ = 0.5.0a0+03e7953
Found CUDA_HOME = C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2
Traceback (most recent call last):
  File "setup.py", line 105, in <module>
    CUDA_MAJOR = get_cuda_version()
  File "setup.py", line 85, in get_cuda_version
    re.compile('nvcc$').search)
  File "setup.py", line 38, in find
    return list(set(collection))
TypeError: 'NoneType' object is not iterable

Here are the steps I followed in more detail:
git clone --recursive https://github.com/NVIDIA/apex
cd apex
set "VS150COMNTOOLS=C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Auxiliary\Build"
set CMAKE_GENERATOR=Visual Studio 15 2017 Win64
set DISTUTILS_USE_SDK=1
set MSSdk=1
call "%VS150COMNTOOLS%\vcvarsall.bat" x64 -vcvars_ver=14.11
python setup.py install

mcarilli added a commit that referenced this issue Jul 5, 2018
Contributor

mcarilli commented Jul 5, 2018

@ibulu Hopefully fixed via 247349f. Try installing with current top of tree. It worked on my machine :)

Again, make sure you are running in the right Visual Studio and Anaconda environments. Only the lines

set "VS150COMNTOOLS=C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Auxiliary\Build"
call "%VS150COMNTOOLS%\vcvarsall.bat" x64 -vcvars_ver=14.11

are necessary to set up the Visual Studio environment, but if you installed PyTorch in an Anaconda environment, you need to install Apex in that same environment.

Moving forward, we intend to regard Windows support as experimental, so I still recommend using Linux if you can.

Author

ibulu commented Jul 5, 2018

wonderful :-) I confirm that I was able to install and import the library. I also ran one of the examples. Thank you very much!
