Theano, CUDA, CuDNN on Win10

valtron edited this page Feb 14, 2017 · 6 revisions

Also see philferriere/dlwin.

Works with a standard Python install (I haven't tried Anaconda or MSYS). If you run into trouble pip-installing packages with C extensions, try cgohlke's wheels.

  1. If you don't have it already, pip install Theano.

  2. Download and install CUDA 8.0 into d:/dev/cuda/v8.0.

  3. Download the VS2015 network installer. Do a custom install. In features, you only need "Programming Languages > Visual C++ > Common Tools for Visual C++ 2015" and "Windows and Web Development > Universal Windows App Development Tools > Windows 10 SDK (10.0.10240)".

  4. Download CuDNN 5.1 (cough bugmenot.com cough). Copy its bin, lib, and include folders into d:/dev/cuda/v8.0.

  5. Add to $PATH: c:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin and d:/dev/cuda/v8.0/bin.

  6. Add env var: INCLUDE -> c:/Program Files (x86)/Windows Kits/10/Include/10.0.10240.0/ucrt

  7. Add env var: LIB -> c:/Program Files (x86)/Windows Kits/10/Lib/10.0.10240.0/um/x64;c:/Program Files (x86)/Windows Kits/10/Lib/10.0.10240.0/ucrt/x64

  8. (I think the proper way of doing 6 & 7 is using vcvarsall.bat, but I'm too lazy to try to figure it out.)

  9. In $HOME/.theanorc, I had to add this to work around an error:

     [gcc]
     cxxflags = -D_hypot=hypot
    
  10. To use the new gpuarray backend, I had to add:

    [gcc]
    # Append these flags to the existing cxxflags
    cxxflags = -Id:/dev/libgpuarray/include -Ld:/dev/libgpuarray/lib
    [dnn]
    include_path = d:/dev/cuda/v8.0/include
    library_path = d:/dev/cuda/v8.0/lib/x64
    
  11. If you get a weird nvcc error about mod.cu, clear your $HOME/AppData/Local/Theano folder (that's where Theano keeps its compiled modules).
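
If you want to double-check steps 5 to 7 before blaming Theano, something like the sketch below might help. It only compares the `PATH`, `INCLUDE`, and `LIB` variables against the directories listed above. The names `REQUIRED` and `missing_entries` are made up for this example, and the paths are the ones from my machine, so adjust them to yours.

```python
import os

# Directories that steps 5-7 add to each variable (paths as used in this guide).
REQUIRED = {
    "PATH": [
        "c:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin",
        "d:/dev/cuda/v8.0/bin",
    ],
    "INCLUDE": [
        "c:/Program Files (x86)/Windows Kits/10/Include/10.0.10240.0/ucrt",
    ],
    "LIB": [
        "c:/Program Files (x86)/Windows Kits/10/Lib/10.0.10240.0/um/x64",
        "c:/Program Files (x86)/Windows Kits/10/Lib/10.0.10240.0/ucrt/x64",
    ],
}


def _norm(path):
    # Windows paths are case-insensitive and accept either slash direction.
    return path.strip().strip('"').replace("\\", "/").rstrip("/").lower()


def missing_entries(env, required=REQUIRED, sep=";"):
    """Return {variable: [required dirs not listed in that variable]}.

    `sep` is hard-coded to the Windows list separator on purpose, because
    the drive-letter colon would confuse a ':'-separated split.
    """
    missing = {}
    for var, wanted in required.items():
        present = {_norm(p) for p in env.get(var, "").split(sep) if p.strip()}
        absent = [w for w in wanted if _norm(w) not in present]
        if absent:
            missing[var] = absent
    return missing


if __name__ == "__main__":
    # Prints nothing when every required directory is present.
    for var, dirs in missing_entries(os.environ).items():
        for d in dirs:
            print(f"{var} is missing: {d}")
```

This only checks that the entries are listed, not that the directories exist or contain the right files.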

Here is the whole of my .theanorc:

    [global]
    device = cuda
    optimizer_including = cudnn
    floatX = float32

    [gcc]
    cxxflags = -D_hypot=hypot -Id:/dev/libgpuarray/include -Ld:/dev/libgpuarray/lib

    [dnn]
    include_path = d:/dev/cuda/v8.0/include
    library_path = d:/dev/cuda/v8.0/lib/x64
    conv.algo_bwd_filter = deterministic
    conv.algo_bwd_data = deterministic

    [lib]
    cnmem = 0.7

    [nvcc]
    fastmath = True

    [blas]
    # Only used for device = cpu
    ldflags = -Ld:/dev/openblas -lopenblas
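
A typo in one of these paths tends to show up as a confusing compiler error much later, so here is a rough sketch that pulls every directory out of a `.theanorc` and reports whether it exists. It uses Python's standard `configparser`, which I'm assuming reads the file the same way Theano does for a simple file like mine. The `referenced_dirs` helper and the `SAMPLE` text are made up for this example, and it assumes the paths contain no spaces.

```python
import configparser
import os

# A trimmed copy of the config above, used only to demonstrate the helper.
SAMPLE = """\
[global]
device = cuda
floatX = float32

[gcc]
cxxflags = -D_hypot=hypot -Id:/dev/libgpuarray/include -Ld:/dev/libgpuarray/lib

[dnn]
include_path = d:/dev/cuda/v8.0/include
library_path = d:/dev/cuda/v8.0/lib/x64

[blas]
ldflags = -Ld:/dev/openblas -lopenblas
"""


def referenced_dirs(text):
    """Return the directories that a .theanorc-style text points at."""
    # Only '=' is a delimiter, so the ':' in drive letters is left alone.
    cfg = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    cfg.read_string(text)
    dirs = []
    # -I<dir> and -L<dir> flags handed to the compiler and linker.
    for section, option in (("gcc", "cxxflags"), ("blas", "ldflags")):
        flags = cfg.get(section, option, fallback="")
        for flag in flags.split():  # assumes no spaces inside paths
            if flag[:2] in ("-I", "-L") and len(flag) > 2:
                dirs.append(flag[2:])
    # Plain path options used to locate cuDNN.
    for option in ("include_path", "library_path"):
        value = cfg.get("dnn", option, fallback="")
        if value:
            dirs.append(value)
    return dirs


if __name__ == "__main__":
    rc = os.path.expanduser("~/.theanorc")
    if os.path.isfile(rc):
        with open(rc) as fh:
            for d in referenced_dirs(fh.read()):
                print("ok     " if os.path.isdir(d) else "MISSING", d)
```

Calling `referenced_dirs(SAMPLE)` gives the libgpuarray, OpenBLAS, and CUDA directories from the config above. The script at the bottom does the same for your real `~/.theanorc`, if there is one.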