enable building for CUDA #1

Merged: 13 commits into conda-forge:master from the gpu branch on Jun 6, 2020
Conversation

h-vetinari (Member)

Checklist

  • Used a fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

Follow-up from conda-forge/staged-recipes#11337.

@conda-forge-linter

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@h-vetinari (Member, Author)

@conda-forge-admin, please rerender

@h-vetinari (Member, Author)

@mdouze @beauby
Do you package builds for CUDA 9.2? Here it fails with:

/usr/local/cuda/bin/nvcc -I /usr/local/cuda/targets/x86_64-linux/include/ -Xcompiler -fPIC -Xcudafe --diag_suppress=unrecognized_attribute -gencode=arch=compute_35,code=compute_35 -gencode=arch=compute_52,code=compute_52 -gencode=arch=compute_60,code=compute_60 -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_75,code=compute_75 -lineinfo -ccbin /home/conda/feedstock_root/build_artifacts/faiss-split_1589216379077/_build_env/bin/x86_64-conda_cos6-linux-gnu-c++ -std=c++11 -DFAISS_USE_FLOAT16 -I. -g -O3 -c gpu/GpuDistance.cu -o gpu/GpuDistance.o
nvcc fatal   : Unsupported gpu architecture 'compute_75'

It seems we could take out the compilation for that architecture, or is it already unsupported on your side?

@h-vetinari (Member, Author) commented May 11, 2020

The cuda >=10 builds fail with:

> /usr/local/cuda/bin/nvcc -I /usr/local/cuda/targets/x86_64-linux/include/ -Xcompiler -fPIC -Xcudafe --diag_suppress=unrecognized_attribute -gencode=arch=compute_35,code=compute_35 -gencode=arch=compute_52,code=compute_52 -gencode=arch=compute_60,code=compute_60 -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_75,code=compute_75 -lineinfo -ccbin /home/conda/feedstock_root/build_artifacts/faiss-split_1589216379886/_build_env/bin/x86_64-conda_cos6-linux-gnu-c++ -std=c++11 -DFAISS_USE_FLOAT16 -I. -g -O3 -c gpu/impl/PQScanMultiPassNoPrecomputed.cu -o gpu/impl/PQScanMultiPassNoPrecomputed.o

gpu/impl/PQScanMultiPassNoPrecomputed.cu(27): error: function "faiss::gpu::isSupportedNoPrecomputedSubDimSize" has already been defined
gpu/impl/PQScanMultiPassNoPrecomputed.cu(51): error: class template "faiss::gpu::LoadCodeDistances" has already been defined
gpu/impl/PQScanMultiPassNoPrecomputed.cu(123): error: function template "faiss::gpu::pqScanNoPrecomputedMultiPass" has already been defined
3 errors detected in the compilation of "/tmp/tmpxft_00000ed3_00000000-11_PQScanMultiPassNoPrecomputed.compute_75.cpp1.ii".

@mdouze @beauby @jakirkham, I have no experience with building GPU packages, so comments/pointers are welcome!

Since the makefiles are quite vanilla, I'd be tempted to say this is an error in the upstream code, but then, you've managed to build it for the pytorch channel...

@h-vetinari (Member, Author)

@dantegd: [...] are there plans to package also the C++ API in the conda package/binary at some point? That would be super useful.

Hey @dantegd, not sure if you unsubscribed from the frenzy of commits in conda-forge/staged-recipes#11337, but if yes, the good news is that the C++ API is now in fact being packaged separately. Right now, it's only the CPU-part of the C++-API, but once this PR is successfully merged, you should be able to just use libfaiss =*=*gpu in your dependencies (you can replace the first asterisk with the desired faiss-version).
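To make the build-string pin concrete, here is a hedged sketch of what such a dependency could look like in a consumer's environment file. The version 1.6.3 is only an example taken from the build logs above; check the actual build strings published on conda-forge before relying on them:

```yaml
# environment.yml sketch (illustrative; exact build strings may differ)
name: uses-libfaiss-gpu
channels:
  - conda-forge
dependencies:
  - libfaiss=1.6.3=*gpu    # pin a specific version, gpu variant
  # - libfaiss=*=*gpu      # or: any version, gpu variant
```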

@mdouze commented May 11, 2020

The cuda >=10 builds fail with:

gpu/impl/PQScanMultiPassNoPrecomputed.cu(27): error: function "faiss::gpu::isSupportedNoPrecomputedSubDimSize" has already been defined
gpu/impl/PQScanMultiPassNoPrecomputed.cu(51): error: class template "faiss::gpu::LoadCodeDistances" has already been defined
gpu/impl/PQScanMultiPassNoPrecomputed.cu(123): error: function template "faiss::gpu::pqScanNoPrecomputedMultiPass" has already been defined
3 errors detected in the compilation of "/tmp/tmpxft_00000ed3_00000000-11_PQScanMultiPassNoPrecomputed.compute_75.cpp1.ii".

@mdouze @beauby @jakirkham, I have no experience with building GPU packages, so comments/pointers are welcome!

Since the makefiles are quite vanilla, I'd be tempted to say this is an error in the upstream code, but then, you've managed to build it for the pytorch channel...

@wickedfoo, any idea what this could be?

@h-vetinari (Member, Author) commented May 11, 2020

I mentioned in the other PR that one remaining difference is that we're not passing $CUDA_ARCH, and hence this seems to be building for several of them...

@h-vetinari: Also note that upstream uses --with-cuda as well as --with-cuda-arch, but I'm not sure what to pass as $CUDA_ARCH.

However, it doesn't read to me like the error is related to that (because several modules got correctly compiled by nvcc before), but in any case, here's the full call that failed:

> /usr/local/cuda/bin/nvcc -I /usr/local/cuda/targets/x86_64-linux/include/ -Xcompiler -fPIC -Xcudafe --diag_suppress=unrecognized_attribute -gencode=arch=compute_35,code=compute_35 -gencode=arch=compute_52,code=compute_52 -gencode=arch=compute_60,code=compute_60 -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_75,code=compute_75 -lineinfo -ccbin [...snip...]/_build_env/bin/x86_64-conda_cos6-linux-gnu-c++ -std=c++11 -DFAISS_USE_FLOAT16 -I. -g -O3 -c gpu/impl/PQScanMultiPassNoPrecomputed.cu -o gpu/impl/PQScanMultiPassNoPrecomputed.o

@wickedfoo

@h-vetinari @mdouze The compilation error looks like you have a stray faiss/gpu/impl/PQScanMultiPassNoPrecomputed.cu file which has been deleted from the repository as of this commit:

facebookresearch/faiss@64dd988#diff-948369eef79ac94c4b1a946328ee607f

@h-vetinari (Member, Author)

@wickedfoo
Thanks for the quick tip!

Seems we'll need to restrict $CUDA_ARCH in some way (cf. my last comment):

+ python conda/faiss-gpu/run_test.py
Faiss assertion 'err__ == cudaSuccess' failed in int faiss::gpu::getNumDevices() at gpu/utils/DeviceUtils.cu:36; details: CUDA error 35 CUDA driver version is insufficient for CUDA runtime version
/home/conda/feedstock_root/build_artifacts/faiss-split_1589265142331/test_tmp/run_test.sh: line 8:  9487 Aborted                 (core dumped) python conda/faiss-gpu/run_test.py
Tests failed for faiss-1.6.3-py37h4315b5e_1_gpu.tar.bz2 - moving package to /home/conda/feedstock_root/build_artifacts/broken

By default, it's building for:

compute_35
compute_52
compute_60
compute_61
compute_70
compute_75
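A sketch of how the arch list could be restricted per toolkit version (a hypothetical helper, not this recipe's code; the version cut-offs follow NVIDIA's docs that compute_70 needs CUDA >= 9.0 and compute_75 needs CUDA >= 10.0, the `;`-separated output format is only illustrative, and GNU `sort -V` is assumed):

```shell
# Hypothetical helper: emit only the compute capabilities that a given
# CUDA toolkit's nvcc actually knows about, so we never pass an
# unsupported -gencode (cf. the compute_75 failure on CUDA 9.2 above).
cuda_arch_for() {
    ver="$1"
    archs="35;52;60;61"
    # CUDA 9.0+ supports Volta (compute_70)
    if [ "$(printf '%s\n' "$ver" 9.0 | sort -V | head -n1)" = "9.0" ]; then
        archs="$archs;70"
    fi
    # CUDA 10.0+ supports Turing (compute_75)
    if [ "$(printf '%s\n' "$ver" 10.0 | sort -V | head -n1)" = "10.0" ]; then
        archs="$archs;75"
    fi
    echo "$archs"
}

cuda_arch_for 9.2    # prints 35;52;60;61;70  (no compute_75, so no nvcc fatal)
cuda_arch_for 10.1   # prints 35;52;60;61;70;75
```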

Not sure how this is handled elsewhere in conda-forge @conda-forge/core (side note: how about forming a team @conda-forge/help-gpu ...?)

@wickedfoo

@h-vetinari You might also want these commits, which fix CUDA 8 and CUDA 10 warnings, since it appears you are hitting this warning in your CUDA 10 build:

2020-05-12T07:08:21.5654836Z gpu/utils/DeviceUtils.cu:114:19: warning: 'cudaPointerAttributes::memoryType' is deprecated [-Wdeprecated-declarations]
2020-05-12T07:08:21.5655909Z    } else if (att.memoryType == cudaMemoryTypeHost) {

CUDA 8 fixes:
facebookresearch/faiss@e05f773

CUDA 10 fixes:
facebookresearch/faiss@ba061ff

@h-vetinari (Member, Author) commented May 12, 2020

@wickedfoo
Thanks a bunch!
The compiler warnings weren't high enough on my list of priorities to tackle so far, but I've included the patches you mentioned. If you look at the runs for linux-cpu as well as OSX, you'll see several more of them (many more in the case of OSX).

@h-vetinari h-vetinari force-pushed the gpu branch 2 times, most recently from d568885 to 5e8fb54 Compare May 12, 2020 18:54
@jakirkham (Member)

@mdouze @beauby @jakirkham, I have no experience with building GPU packages, so comments/pointers are welcome!

Thanks for the ping! Am taking some time off this week. Will take a look next week though 🙂

@h-vetinari (Member, Author)

@jakirkham: Thanks for the ping! Am taking some time off this week. Will take a look next week though 🙂

No worries, thanks for responding nevertheless!

@h-vetinari h-vetinari force-pushed the gpu branch 2 times, most recently from 6ea0585 to 9a2f309 Compare May 13, 2020 07:42
@h-vetinari (Member, Author) commented May 13, 2020

Pinging the collaborators on the linux-anvil-cuda docker image... @mbargull @isuruf @scopatz @bdice

It seems that my hunt for the right CUDA_ARCH was a red herring, as I always get the error

Faiss assertion 'err__ == cudaSuccess' failed in int faiss::gpu::getNumDevices() at gpu/utils/DeviceUtils.cu:36; details: CUDA error 35 CUDA driver version is insufficient for CUDA runtime version

This made me wonder: are there NVIDIA drivers in the image at all (it doesn't seem so from the dockerfiles for linux-anvil-cuda or the upstream nvidia/cuda:${CUDA_VER}-devel-centos6), and if not, how should I install them?

Looking around for "cuda driver" in conda-forge, it seems that pyarrow has a similar problem?
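As an illustration of how a test script could avoid the hard abort on driver-less CI machines, here is a minimal, hypothetical guard (not part of this PR) that probes for a usable driver via `libcuda` before any faiss GPU call. It assumes a Linux system where the NVIDIA driver, if present, installs `libcuda.so.1`:

```python
import ctypes

def cuda_driver_available() -> bool:
    """True only if the NVIDIA *driver* (libcuda) is present and initialises.

    The CUDA *toolkit* inside the build image is not enough: without a
    driver, cuInit fails much like faiss's "CUDA error 35" above.
    """
    try:
        # libcuda.so.1 is shipped by the driver, not the toolkit
        libcuda = ctypes.CDLL("libcuda.so.1")
    except OSError:
        return False
    return libcuda.cuInit(0) == 0  # 0 == CUDA_SUCCESS

if __name__ == "__main__":
    if cuda_driver_available():
        print("CUDA driver found; GPU tests can run")
    else:
        print("no usable CUDA driver; skipping GPU tests")
```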

(three review threads on recipe/build-lib.sh, resolved)
@teju85 left a comment

The proposed changes in the build script, especially the supported GPU archs, LGTM (please also add sm_50 to the list, as discussed above). Explicitly tagging @cjnolet @dantegd in case they can bring another perspective here.

@h-vetinari (Member, Author)

Gentle ping @jakirkham :)
I think this should be in good order, but since it's my first GPU build, I'd really like to confirm that everything works as it should, and I can't, for lack of a GPU in the CI...

@jakirkham (Member)

Yep, thanks for the reminder. It's on my list :)

@h-vetinari (Member, Author)

Another gentle ping @jakirkham (I'm sure that list of yours is mighty full, but new week, new luck 😉)

@jakirkham (Member) left a comment

Thanks @h-vetinari both for all of the work and the reminder! Apologies for letting this slip.

Largely things seem good here. I made a few small comments, but I think all of the reviewers' comments and your work have put this in a pretty good place already. 🙂

Please let me know if you have any questions or specific things you'd like me to look at and will happily look at those as well.

# docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#major-components__table-cuda-toolkit-driver-versions
# docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#gpu-feature-list

# the following are all the x86-relevant gpu arches; for building aarch64-packages, add: 53, 62, 72
@jakirkham (Member):
Should we go ahead and add an if for this case in preparation?

@h-vetinari (Member, Author):
I propose to cross this bridge when we come to it. ;-)
My comment is the preparation 😋

@@ -8,7 +28,7 @@ fi
# Build vanilla version (no avx)
./configure --prefix=${PREFIX} --exec-prefix=${PREFIX} \
--with-blas=-lblas --with-lapack=-llapack \
${CUDA_CONFIG_ARG}
${CUDA_CONFIG_ARG} --with-cuda-arch="${CUDA_ARCH}" || exit 1
@jakirkham (Member):
Would it make sense to add this to the CUDA_CONFIG_ARG above?

@h-vetinari (Member, Author):
I tried this first, of course, but couldn't get the quotes passed through correctly, and then the compilation would fail.
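For what it's worth, one common workaround for exactly this quoting problem is a bash array (a hypothetical sketch, not the recipe's actual code; the `CUDA_ARCH` value is made up): each array element expands as exactly one word, so a space-containing value survives intact:

```shell
#!/bin/bash
# Hypothetical sketch: collect configure flags in an array so that values
# containing spaces stay a single argument when expanded.
CUDA_ARCH="35 52 60 61 70"
CONFIGURE_ARGS=(--with-blas=-lblas --with-lapack=-llapack)
CONFIGURE_ARGS+=(--with-cuda=/usr/local/cuda "--with-cuda-arch=${CUDA_ARCH}")

# "${CONFIGURE_ARGS[@]}" expands to one word per element, quoting preserved:
printf '<%s>\n' "${CONFIGURE_ARGS[@]}"
# ./configure "${CONFIGURE_ARGS[@]}"   # the real call would go here
```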

Comment on lines 227 to +230
For best performance, the maintainers of the package
[recommend](https://github.com/conda-forge/staged-recipes/pull/11337#issuecomment-623718460)
using the MKL implementation of blas/lapack. You can ensure that this is installed
by adding "libblas =*=mkl" to your dependencies.
by adding "libblas =*=*mkl" to your dependencies.
@jakirkham (Member):
Should we add this to a post-link script for greater visibility?

@h-vetinari (Member, Author):
OK, will look into this.
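For reference, a minimal sketch of what such a post-link script could look like (hypothetical, not this PR's code): conda sets `$PREFIX` when it runs a package's post-link.sh and displays whatever the script appends to `$PREFIX/.messages.txt` after the transaction. Here PREFIX points at a scratch directory so the sketch is runnable standalone:

```shell
# post-link.sh sketch: surface the MKL recommendation at install time.
# In a real package conda exports PREFIX; we use a scratch dir for the demo.
PREFIX=$(mktemp -d)

cat >> "${PREFIX}/.messages.txt" <<'EOF'
For best faiss performance, the maintainers recommend the MKL implementation
of blas/lapack; add "libblas=*=*mkl" to your dependencies.
EOF

cat "${PREFIX}/.messages.txt"
```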

@h-vetinari (Member, Author) left a comment
Thanks for taking a look @jakirkham!

Would you also have a GPU locally, to check whether the GPU-specific tests run through?


@cjnolet commented Jun 3, 2020

@h-vetinari I've followed your instructions here, but for some reason it's not picking up the CUDA compiler version when I run the build-locally.py script. Any idea as to what I might be doing wrong?

I've also updated the conda_build_config.yml changing None to 10.2.

valid configs are {'linux_cuda_compiler_versionNone', 'osx_'}
config not selected, please choose from the following:

1. linux_cuda_compiler_versionNone
2. osx_

I should point out that I'm on CentOS 7.

@h-vetinari (Member, Author)

@cjnolet
Thanks a bunch for giving this a shot! Not sure what other tweaks are necessary to build a GPU package locally. I remember seeing some instructions at https://github.com/AnacondaRecipes/aggregate/tree/master/pytorch-feedstock/recipe, but not sure they apply generally (probably should use the condaforge/linux-anvil-cuda:10.2 image instead, for example). Any tips @jakirkham @isuruf?

@jakirkham (Member)

@cjnolet, do you have shyaml installed?

@h-vetinari (Member, Author)

Merging this; let's see how it fares in real life. The post-link script can be added in another PR.

@h-vetinari h-vetinari changed the title retry building for GPU enable building for CUDA Jun 6, 2020
@h-vetinari h-vetinari merged commit b2e44fc into conda-forge:master Jun 6, 2020
@h-vetinari h-vetinari mentioned this pull request Jun 7, 2020
@h-vetinari (Member, Author)

With the disk space issues figured out now, GPU builds for faiss have been live for ~2 days. Has anyone had time to give them a twirl so far?

@h-vetinari h-vetinari deleted the gpu branch June 15, 2020 21:20
@jakirkham (Member)

With the disk space issues figured out now, GPU builds for faiss have been live for ~2 days. Has anyone had time to give them a twirl so far?

@cjnolet @teju85 @dantegd ? 🙂

@jakirkham (Member)

With the disk space issues figured out now, GPU builds for faiss have been live for ~2 days. Has anyone had time to give them a twirl so far?

We tried. Unfortunately it looks like we are missing some packages for CUDA 10.1 and 10.2. Raised as issue ( #9 ).

@h-vetinari h-vetinari mentioned this pull request Jun 24, 2020
@h-vetinari (Member, Author)

@jakirkham: We tried. Unfortunately it looks like we are missing some packages for CUDA 10.1 and 10.2. Raised as issue ( #9 ).

Now that #9 is closed, would you mind re-trying? 🙂

@jakirkham (Member)

Yep, certainly. Thanks for the help here @h-vetinari 😄

9 participants