installation issue #23

ibulu · 2018-07-01T16:20:00Z

I am really excited about trying this, but every time I try installing, I get the following error:
torch.version = 0.5.0a0+03e7953
Found CUDA_HOME = C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2
Traceback (most recent call last):
  File "setup.py", line 105, in <module>
    CUDA_MAJOR = get_cuda_version()
  File "setup.py", line 85, in get_cuda_version
    re.compile('nvcc$').search)
  File "setup.py", line 38, in find
    return list(set(collection))
TypeError: 'NoneType' object is not iterable
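
A possible reading of this traceback, offered as an assumption rather than something confirmed in the thread: the search pattern 'nvcc$' only matches a file name that ends in "nvcc", while the compiler reported later in this thread on Windows is nvcc.exe. If the search helper finds nothing and ends up holding None, list(set(None)) raises exactly this TypeError. The sketch below uses only the standard re module; the `portable` pattern and both helper functions are hypothetical and are not the fix that was eventually committed.

```python
import re

# Pattern taken from the traceback: matches names that end in "nvcc".
linux_only = re.compile(r"nvcc$")

# Hypothetical alternative (an assumption, not the project's actual fix):
# also accept the Windows executable name.
portable = re.compile(r"nvcc(\.exe)?$")


def matches(pattern, filename):
    """Return True if the compiled pattern finds a match in filename."""
    return pattern.search(filename) is not None


def fails_like_traceback(collection):
    """Return True if list(set(collection)) raises TypeError, as in the report."""
    try:
        list(set(collection))
    except TypeError:
        return True
    return False


# Show how each pattern treats the Linux and Windows compiler names.
for name in ("nvcc", "nvcc.exe"):
    print(name, matches(linux_only, name), matches(portable, name))
```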

mcarilli · 2018-07-01T16:51:10Z

Looks like you're on Windows. I don't have access to a Windows machine with PyTorch right now, but I'll find one tomorrow to test the build. We generally recommend Linux, so if you have the option to run on a Linux machine, the build should work right away.

ibulu · 2018-07-01T16:52:40Z

Thank you. I really appreciate that. I modified the setup file and got as far as:

c:\program files\nvidia gpu computing toolkit\cuda\v9.2\include\crt/host_config.h(133): fatal error C1189: #error: -- unsupported Microsoft Visual Studio version! Only the versions 2012, 2013, 2015 and 2017 are supported!
error: command 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\bin\nvcc.exe' failed with exit status 2

I have Visual Studio 2017 Community edition installed. I compiled PyTorch from source using Visual Studio 2017 without a problem.

mcarilli · 2018-07-01T17:36:47Z

I admit I'm not an expert on Windows builds. I'll have to ask some other people tomorrow.

In the meantime, if you want an immediate path forward, some of the utilities in Apex are Python-only. For example, FP16_Optimizer and DistributedDataParallel technically don't require building the C backend. One item on my to-do list is creating a Python-only "apex-lite" install option.

The necessary-and-sufficient set of Python files to use FP16_Optimizer is fp16_optimizer.py, fp16_util.py, and loss_scaler.py from the fp16_utils directory.

The necessary-and-sufficient set of Python files to use apex DistributedDataParallel is distributed.py and multiproc.py from the parallel directory.
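
As a small sanity check for the two file lists above, the following sketch reports which of the named files are absent from a directory. The function name `missing_files` and the two constants are hypothetical helpers for illustration; only the file names themselves come from this comment.

```python
from pathlib import Path

# File names quoted from the comment above.
FP16_FILES = ("fp16_optimizer.py", "fp16_util.py", "loss_scaler.py")
DDP_FILES = ("distributed.py", "multiproc.py")


def missing_files(directory, required):
    """Return the names in `required` that are not regular files in `directory`."""
    directory = Path(directory)
    return [name for name in required if not (directory / name).is_file()]


if __name__ == "__main__":
    # Check the directory the training script is launched from.
    here = Path.cwd()
    print("FP16_Optimizer files missing:", missing_files(here, FP16_FILES))
    print("DistributedDataParallel files missing:", missing_files(here, DDP_FILES))
```

An empty list for a group means every file named for that utility is present.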

If you copy those files alongside your training script, statements such as from fp16_optimizer import FP16_Optimizer or from distributed import DistributedDataParallel as DDP should work without needing to run setup.py.
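
A sketch of how a training script could prefer an installed package and fall back to the copied files. The helper `load_first_available` is hypothetical, and the apex.fp16_utils and apex.parallel import paths are assumptions based on the directory names mentioned above, not something this thread confirms.

```python
import importlib


def load_first_available(candidates):
    """Return the first attribute importable from (module_name, attr) pairs, else None."""
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # Module is not installed or not on sys.path; try the next candidate.
            continue
        if hasattr(module, attr):
            return getattr(module, attr)
    return None


if __name__ == "__main__":
    # Assumed package paths first, then the files copied next to this script.
    FP16_Optimizer = load_first_available([
        ("apex.fp16_utils", "FP16_Optimizer"),
        ("fp16_optimizer", "FP16_Optimizer"),
    ])
    DDP = load_first_available([
        ("apex.parallel", "DistributedDataParallel"),
        ("distributed", "DistributedDataParallel"),
    ])
    print("FP16_Optimizer found:", FP16_Optimizer is not None)
    print("DistributedDataParallel found:", DDP is not None)
```

If an unrelated installed package happens to be named distributed, the attribute check skips it instead of returning the wrong object.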

Thank you for your interest in Apex and your feedback. It's very helpful to know what issues people are encountering, especially at this early stage.

ibulu · 2018-07-01T17:44:50Z

Thanks. I really appreciate your suggestion. I will give it a try on a newly received Titan V :-)

mcarilli · 2018-07-03T01:12:28Z

@ibulu When you installed PyTorch from source, did you run the following lines

set "VS150COMNTOOLS=C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\VC\Auxiliary\Build"
...
call "%VS150COMNTOOLS%\vcvarsall.bat" x64 -vcvars_ver=14.11

before python setup.py install? Apex needs that same environment to build, so you will also need to run those lines before running Apex's setup.py. If you installed PyTorch within a Conda environment from the command line, you'll also want to install Apex within that same Conda environment.
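
One way to check that the Visual Studio environment took effect before running setup.py is to confirm that the compilers are reachable on PATH. This sketch uses the standard shutil.which; the function name `missing_build_tools` and the choice of cl and nvcc as the tools to look for are assumptions, not something Apex's setup.py is known to do.

```python
import shutil


def missing_build_tools(tools=("cl", "nvcc")):
    """Return the tool names that cannot be found on the current PATH."""
    # shutil.which returns None when no matching executable is found.
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_build_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("cl and nvcc are both reachable from this shell.")
```

Run it from the same command prompt in which vcvarsall.bat was called, since the environment changes only apply to that session.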

ibulu · 2018-07-04T00:40:02Z

I had a chance to try installing Apex the same way I installed PyTorch, but I am still getting the following error:

torch.version = 0.5.0a0+03e7953
Found CUDA_HOME = C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2
Traceback (most recent call last):
  File "setup.py", line 105, in <module>
    CUDA_MAJOR = get_cuda_version()
  File "setup.py", line 85, in get_cuda_version
    re.compile('nvcc$').search)
  File "setup.py", line 38, in find
    return list(set(collection))
TypeError: 'NoneType' object is not iterable

Here are the steps I followed in more detail:
git clone --recursive https://github.com/NVIDIA/apex
cd apex
set "VS150COMNTOOLS=C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Auxiliary\Build"
set CMAKE_GENERATOR=Visual Studio 15 2017 Win64
set DISTUTILS_USE_SDK=1
set MSSdk=1
call "%VS150COMNTOOLS%\vcvarsall.bat" x64 -vcvars_ver=14.11
python setup.py install

mcarilli · 2018-07-05T16:45:17Z

@ibulu Hopefully fixed via 247349f. Try installing with current top of tree. It worked on my machine :)

Again, make sure you are running in the right Visual Studio and Anaconda environments. Only the lines

set "VS150COMNTOOLS=C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Auxiliary\Build"
call "%VS150COMNTOOLS%\vcvarsall.bat" x64 -vcvars_ver=14.11

are necessary to set up the Visual Studio environment, but if you installed PyTorch in an Anaconda environment, you need to install Apex in that same environment.
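
To confirm which environment a given shell is using, a short script can print the active interpreter and where a package would be loaded from. This is a sketch with a hypothetical helper name, `describe_environment`; it relies only on the standard sys and importlib.util modules and does not import the package itself.

```python
import importlib.util
import sys


def describe_environment(package="torch"):
    """Report the running interpreter and where `package` would be imported from."""
    spec = importlib.util.find_spec(package)
    return {
        "interpreter": sys.executable,
        "prefix": sys.prefix,
        # None means the package is not visible to this interpreter.
        "package_origin": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If package_origin is None, or points outside the prefix of the environment you expect, the shell is probably not in the environment where PyTorch was installed.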

Moving forward, we intend to regard Windows support as experimental, so I still recommend using Linux if you can.

ibulu · 2018-07-05T17:00:10Z

Wonderful :-) I confirm that I was able to install and import the library. I also ran one of the examples. Thank you very much!

mcarilli added a commit that referenced this issue Jul 5, 2018

Update setup.py for #23

247349f

ibulu closed this as completed Jul 5, 2018

Solacex mentioned this issue Jan 13, 2019

RuntimeError: cuda runtime error (74) : misaligned address at /pytorch/aten/src/THC/THCTensorCopy.cu:84 #124

Open

chengmengli06 mentioned this issue Nov 12, 2019

apex hangs on cudaFree #599

Open

matlabninja mentioned this issue May 20, 2020

Use O1 opt_lv leads to RuntimeError: CUDA error: no kernel image is available for execution on the device #842

Closed

lcskrishna pushed a commit to lcskrishna/apex that referenced this issue Jun 23, 2020

Merge pull request NVIDIA#23 from ashishfarmer/launch_bounds_fix

7e09937

Fix launch bounds for cleanup(...) call
