CUDA 11 (libfaiss) conda package triggers JIT compilation on Turing GPUs #36

Closed
1 task done
dantegd opened this issue Mar 11, 2021 · 5 comments · Fixed by #37

dantegd commented Mar 11, 2021

Issue: Installing the current CUDA 11 conda package (libfaiss in particular) on machines with Turing GPUs (tested on RTX 8000 and 2070S) triggers JIT compilation on the first call that uses GPU resources, causing a delay of minutes. It works fine on Ampere GPUs (tested on 3080) and on CUDA 10.2 with Turing. The packages are:

faiss                     1.7.0           py38cuda110h60a57df_4_cuda    conda-forge
faiss-proc                1.0.0                      cuda    conda-forge
libfaiss                  1.7.0           cuda110h8045045_4_cuda    conda-forge
libfaiss-avx2             1.7.0           cuda110h1234567_4_cuda    conda-forge
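
For context, a first-call delay like this usually means the shipped binary has no precompiled machine code (SASS) that the GPU can run directly, so the driver compiles the embedded PTX on first use. The sketch below encodes the compatibility rule from the CUDA documentation: SASS built for compute capability X.y runs on devices X.z with z >= y, and PTX built for X.y can be JIT-compiled for any device at or above X.y. The architecture lists (`SASS`, `PTX`) are hypothetical, chosen only to reproduce the symptom; they are not the actual build flags of the conda package. The helper names are illustrative as well.

```python
from typing import Iterable, Tuple

CC = Tuple[int, int]  # compute capability as (major, minor)


def runs_native(device: CC, sass_archs: Iterable[CC]) -> bool:
    """True if some precompiled SASS target can run on the device as-is."""
    return any(
        major == device[0] and minor <= device[1]
        for major, minor in sass_archs
    )


def needs_jit(device: CC, sass_archs: Iterable[CC], ptx_archs: Iterable[CC]) -> bool:
    """True if the driver has to JIT-compile PTX for this device."""
    if runs_native(device, list(sass_archs)):
        return False
    # PTX is forward compatible: any PTX target <= device can be JIT-compiled.
    if any(ptx <= device for ptx in ptx_archs):
        return True
    raise RuntimeError(f"no usable SASS or PTX for compute capability {device}")


# Hypothetical build (NOT the real package flags): SASS for 6.0 and 8.0, PTX for 6.0.
SASS = [(6, 0), (8, 0)]
PTX = [(6, 0)]

TURING = (7, 5)       # e.g. RTX 2070S, RTX 8000
AMPERE_3080 = (8, 6)  # RTX 3080

print(needs_jit(TURING, SASS, PTX))       # True: no 7.x SASS, falls back to PTX
print(needs_jit(AMPERE_3080, SASS, PTX))  # False: 8.0 SASS runs on 8.6
```

Under these assumed lists, adding a 7.x SASS target with minor version at most 5 (for example `(7, 5)`) makes `needs_jit(TURING, ...)` return `False`.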

Reproduced with the following code:

Python 3.8.8 | packaged by conda-forge | (default, Feb 20 2021, 16:22:27) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> 
>>> d = 64                           # dimension
>>> nb = 100000                      # database size
>>> nq = 10000                       # nb of queries
>>> np.random.seed(1234)             # make reproducible
>>> xb = np.random.random((nb, d)).astype('float32')
>>> xb[:, 0] += np.arange(nb) / 1000.
>>> xq = np.random.random((nq, d)).astype('float32')
>>> xq[:, 0] += np.arange(nq) / 1000.
>>> 
>>> import faiss
>>> res = faiss.StandardGpuResources() 
>>> index_flat = faiss.IndexFlatL2(d)
>>> gpu_index_flat = faiss.index_cpu_to_gpu(res, 0, index_flat)
# (here I am stuck waiting for minutes...) 
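
To put a number on the stall, one option is to time the same GPU call several times in one process and compare the first call with the later ones. This is a minimal sketch, not part of the original report: `time_calls`, `warmup_overhead`, and `measure_faiss_gpu_transfer` are hypothetical helper names, and the faiss function is only defined here, not run, since it needs the GPU build of faiss and a CUDA device.

```python
import time
from typing import Callable, List


def time_calls(fn: Callable[[], object], repeats: int = 3) -> List[float]:
    """Call fn `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def warmup_overhead(durations: List[float]) -> float:
    """First-call time minus the fastest later call (0.0 if only one call)."""
    if len(durations) < 2:
        return 0.0
    return durations[0] - min(durations[1:])


def measure_faiss_gpu_transfer(d: int = 64, repeats: int = 3) -> List[float]:
    """Time index_cpu_to_gpu, the call that stalls in the report above.

    Assumes the GPU build of faiss and at least one CUDA device (device 0).
    """
    import faiss  # imported lazily so the helpers above work without faiss

    res = faiss.StandardGpuResources()

    def to_gpu():
        return faiss.index_cpu_to_gpu(res, 0, faiss.IndexFlatL2(d))

    return time_calls(to_gpu, repeats)
```

On an affected machine, `warmup_overhead(measure_faiss_gpu_transfer())` would be expected to be on the order of minutes, since only the first call in the process should pay the compilation cost.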

We first saw this issue in cuML (which uses FAISS): rapidsai/cuml#3602


Environment (conda list):
$ conda list
# packages in environment at /home/galahad/miniconda3/envs/ns0311-110:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                      1_llvm    conda-forge
abseil-cpp                20200923.3           h9c3ff4c_0    conda-forge
aiobotocore               1.2.1              pyhd3eb1b0_0
aiohttp                   3.7.4            py38h497a2fe_0    conda-forge
aioitertools              0.7.1              pyhd8ed1ab_0    conda-forge
alabaster                 0.7.12                     py_0    conda-forge
apipkg                    1.5                        py_0    conda-forge
appdirs                   1.4.4              pyh9f0ad1d_0    conda-forge
argon2-cffi               20.1.0           py38h497a2fe_2    conda-forge
arrow-cpp                 1.0.1           py38hcc56f6c_34_cuda    conda-forge
arrow-cpp-proc            3.0.0                      cuda    conda-forge
asn1crypto                1.4.0              pyh9f0ad1d_0    conda-forge
asvdb                     0.4.1               gd6cd8f2_36    rapidsai
async-timeout             3.0.1                   py_1000    conda-forge
async_generator           1.10                       py_0    conda-forge
atk-1.0                   2.36.0               h3371d22_4    conda-forge
attrs                     20.3.0             pyhd3deb0d_0    conda-forge
autoconf                  2.69            pl5320h36c2ea0_10    conda-forge
automake                  1.16.2          pl5320ha770c72_3    conda-forge
aws-c-cal                 0.4.5                h76129ab_8    conda-forge
aws-c-common              0.5.2                h7f98852_0    conda-forge
aws-c-event-stream        0.2.7                h6bac3ce_1    conda-forge
aws-c-io                  0.9.1                ha5b09cb_1    conda-forge
aws-checksums             0.1.11               h99e32c3_3    conda-forge
aws-sam-translator        1.34.0             pyh44b312d_0    conda-forge
aws-sdk-cpp               1.8.151              hceb1b1e_1    conda-forge
aws-xray-sdk              2.6.0              pyhd8ed1ab_0    conda-forge
babel                     2.9.0              pyhd3deb0d_0    conda-forge
backcall                  0.2.0              pyh9f0ad1d_0    conda-forge
backports                 1.0                        py_2    conda-forge
backports.functools_lru_cache 1.6.1                      py_0    conda-forge
beautifulsoup4            4.9.3              pyhb0f4dca_0    conda-forge
benchmark                 1.5.1                he1b5a44_2    conda-forge
black                     19.10b0                  py38_0    conda-forge
blas                      2.108                  openblas    conda-forge
blas-devel                3.9.0                8_openblas    conda-forge
bleach                    3.3.0              pyh44b312d_0    conda-forge
blinker                   1.4                        py_1    conda-forge
blosc                     1.21.0               h9c3ff4c_0    conda-forge
bokeh                     2.2.3            py38h578d9bd_0    conda-forge
boost                     1.72.0           py38h1e42940_1    conda-forge
boost-cpp                 1.72.0               h9d3c048_4    conda-forge
boto3                     1.17.25            pyhd8ed1ab_0    conda-forge
botocore                  1.20.25            pyhd8ed1ab_0    conda-forge
brotli                    1.0.9                h9c3ff4c_4    conda-forge
brotlipy                  0.7.0           py38h497a2fe_1001    conda-forge
brunsli                   0.1                  he1b5a44_0    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
c-ares                    1.17.1               h7f98852_1    conda-forge
ca-certificates           2021.1.19            h06a4308_1
cachetools                4.2.1              pyhd8ed1ab_0    conda-forge
cairo                     1.16.0            h7979940_1007    conda-forge
certifi                   2020.12.5        py38h578d9bd_1    conda-forge
cffi                      1.14.5           py38ha65f79e_0    conda-forge
cfitsio                   3.470                hb418390_7    conda-forge
cfn-lint                  0.47.0           py38h578d9bd_0    conda-forge
chardet                   4.0.0            py38h578d9bd_1    conda-forge
charls                    2.2.0                h9c3ff4c_0    conda-forge
clang                     8.0.1                hc9558a2_2    conda-forge
clang-tools               8.0.1                hc9558a2_2    conda-forge
clangxx                   8.0.1                         2    conda-forge
click                     7.1.2              pyh9f0ad1d_0    conda-forge
click-plugins             1.1.1                      py_0    conda-forge
cligj                     0.7.1              pyhd8ed1ab_0    conda-forge
cloudpickle               1.6.0                      py_0    conda-forge
cmake                     3.18.5               h1f3970d_0    rapidsai-nightly
cmake-format              0.6.11             pyh9f0ad1d_0    conda-forge
cmake_setuptools          0.1.3                      py_0    rapidsai
cmarkgfm                  0.5.2            py38h497a2fe_0    conda-forge
colorama                  0.4.4              pyh9f0ad1d_0    conda-forge
colorcet                  2.0.6              pyhd8ed1ab_0    conda-forge
commonmark                0.9.1                      py_0    conda-forge
conda                     4.8.3            py38h32f6830_2    conda-forge
conda-build               3.20.3           py38h32f6830_0    conda-forge
conda-package-handling    1.7.2            py38h8df0ef7_0    conda-forge
conda-verify              3.1.1           py38h578d9bd_1003    conda-forge
cookies                   2.2.1                      py_0    conda-forge
coverage                  5.5              py38h497a2fe_0    conda-forge
cryptography              3.4.6            py38ha5dfef3_0    conda-forge
cudatoolkit               11.0.221             h6bb024c_0    nvidia
cudf                      0.19.0a210311   cuda_11.0_py38_g3355e6039c_200    rapidsai-nightly
cudnn                     8.1.0.77             h90431f1_0    conda-forge
cupy                      8.5.0            py38h2393d70_1    conda-forge
curl                      7.75.0               h979ede3_0    conda-forge
cutensor                  1.2.2.5              h96e36e3_3    conda-forge
cycler                    0.10.0                     py_2    conda-forge
cyrus-sasl                2.1.27               h3274739_1    conda-forge
cython                    0.29.22          py38h709712a_0    conda-forge
cytoolz                   0.11.0           py38h497a2fe_3    conda-forge
dask                      2021.3.0+15.g5ee329fa          pypi_0    pypi
dask-cuda                 0.19.0a210311           py38_40    rapidsai-nightly
dask-cudf                 0.19.0a210311   py38_g3355e6039c_200    rapidsai-nightly
dask-glm                  0.2.0                      py_1    conda-forge
dask-labextension         4.0.1              pyhd8ed1ab_0    conda-forge
dask-ml                   1.8.0              pyhd8ed1ab_0    conda-forge
datashader                0.11.1             pyh9f0ad1d_0    conda-forge
datashape                 0.5.4                      py_1    conda-forge
dbus                      1.13.18              hb2f20db_0
decorator                 4.4.2                      py_0    conda-forge
defusedxml                0.7.1              pyhd8ed1ab_0    conda-forge
distributed               2021.3.0+6.ge734fc3d          pypi_0    pypi
dlpack                    0.3                  he1b5a44_1    conda-forge
docker-py                 4.4.4            py38h578d9bd_0    conda-forge
docker-pycreds            0.4.0                      py_0    conda-forge
docutils                  0.16             py38h578d9bd_3    conda-forge
double-conversion         3.1.5                he1b5a44_2    conda-forge
doxygen                   1.8.20               had0d8f1_0    conda-forge
ecdsa                     0.16.1             pyhd8ed1ab_0    conda-forge
entrypoints               0.3             py38h32f6830_1002    conda-forge
execnet                   1.8.0              pyh44b312d_0    conda-forge
expat                     2.2.10               h9c3ff4c_0    conda-forge
fa2                       0.3.5            py38h1e0a361_0    conda-forge
faiss                     1.7.0           py38cuda110h60a57df_4_cuda    conda-forge
faiss-proc                1.0.0                      cuda    conda-forge
fastavro                  1.3.2            py38h497a2fe_0    conda-forge
fastrlock                 0.5              py38h709712a_2    conda-forge
feather-format            0.4.1              pyh9f0ad1d_0    conda-forge
filelock                  3.0.12             pyh9f0ad1d_0    conda-forge
filterpy                  1.4.5                      py_1    conda-forge
fiona                     1.8.18           py38h58f84aa_1    conda-forge
flake8                    3.8.4                      py_0    conda-forge
flask                     1.1.2              pyh9f0ad1d_0    conda-forge
flask_cors                3.0.10             pyhd3deb0d_0    conda-forge
flatbuffers               1.10.0            hf484d3e_1002    conda-forge
font-ttf-dejavu-sans-mono 2.37                 hab24e00_0    conda-forge
font-ttf-inconsolata      2.001                hab24e00_0    conda-forge
font-ttf-source-code-pro  2.030                hab24e00_0    conda-forge
font-ttf-ubuntu           0.83                 hab24e00_0    conda-forge
fontconfig                2.13.1            hba837de_1004    conda-forge
fonts-conda-ecosystem     1                             0    conda-forge
fonts-conda-forge         1                             0    conda-forge
freetype                  2.10.4               h0708190_1    conda-forge
freexl                    1.0.6                h27cfd23_0
fribidi                   1.0.10               h516909a_0    conda-forge
fsspec                    0.8.7              pyhd8ed1ab_0    conda-forge
future                    0.18.2           py38h578d9bd_3    conda-forge
gcsfs                     0.7.2              pyhd8ed1ab_0    conda-forge
gdal                      3.2.1            py38hc0b2d6b_2    conda-forge
gdk-pixbuf                2.42.2               h0c95a7a_2    conda-forge
geopandas                 0.8.1                      py_0    conda-forge
geos                      3.8.1                he1b5a44_0    conda-forge
geotiff                   1.6.0                h5d11630_3    conda-forge
gettext                   0.19.8.1          h0b5b191_1005    conda-forge
gflags                    2.2.2             he1b5a44_1004    conda-forge
giflib                    5.2.1                h516909a_2    conda-forge
git                       2.30.2          pl5320h6697202_0    conda-forge
glib                      2.66.7               h9c3ff4c_1    conda-forge
glib-tools                2.66.7               h9c3ff4c_1    conda-forge
glob2                     0.7                        py_0    conda-forge
glog                      0.4.0                h49b9bf7_3    conda-forge
gmock                     1.10.0               h4bd325d_7    conda-forge
gmp                       6.2.1                h58526e2_0    conda-forge
google-auth               1.27.1             pyhd3eb1b0_0
google-auth-oauthlib      0.4.3              pyhd3eb1b0_0
graphite2                 1.3.14               h23475e2_0
graphviz                  2.46.1               h93c640b_4    conda-forge
grpc-cpp                  1.36.2               h7919d58_0    conda-forge
gtest                     1.10.0               h4bd325d_7    conda-forge
gtk2                      2.24.33              hab0c2f8_0    conda-forge
gts                       0.7.6                h64030ff_2    conda-forge
harfbuzz                  2.7.4                h5cf4720_0    conda-forge
hdf4                      4.2.13            h10796ff_1004    conda-forge
hdf5                      1.10.6          nompi_h6a2412b_1114    conda-forge
heapdict                  1.0.1                      py_0    conda-forge
holoviews                 1.14.2             pyhd8ed1ab_0    conda-forge
httpretty                 1.0.5              pyhd8ed1ab_0    conda-forge
hypothesis                6.8.0              pyhd8ed1ab_0    conda-forge
icu                       68.1                 h58526e2_0    conda-forge
idna                      2.10               pyh9f0ad1d_0    conda-forge
imagecodecs               2021.1.28        py38h5b4e65a_0    conda-forge
imageio                   2.9.0                      py_0    conda-forge
imagesize                 1.2.0                      py_0    conda-forge
importlib-metadata        3.7.2            py38h578d9bd_0    conda-forge
importlib_metadata        3.7.2                hd8ed1ab_0    conda-forge
inflection                0.5.1              pyh9f0ad1d_0    conda-forge
iniconfig                 1.1.1              pyh9f0ad1d_0    conda-forge
ipykernel                 5.5.0            py38h81c977d_1    conda-forge
ipython                   7.15.0           py38h32f6830_0    conda-forge
ipython_genutils          0.2.0                      py_1    conda-forge
ipywidgets                7.6.3              pyhd3deb0d_0    conda-forge
isort                     5.0.9            py38h32f6830_0    conda-forge
itsdangerous              1.1.0                      py_0    conda-forge
jedi                      0.17.2           py38h578d9bd_1    conda-forge
jeepney                   0.6.0              pyhd8ed1ab_0    conda-forge
jinja2                    2.11.3             pyh44b312d_0    conda-forge
jmespath                  0.10.0             pyh9f0ad1d_0    conda-forge
joblib                    1.0.1              pyhd8ed1ab_0    conda-forge
jpeg                      9d                   h516909a_0    conda-forge
json-c                    0.13.1            hbfbb72e_1002    conda-forge
json5                     0.9.5              pyh9f0ad1d_0    conda-forge
jsondiff                  1.1.2                      py_0    conda-forge
jsonpatch                 1.31               pyhd3eb1b0_0
jsonpickle                2.0.0              pyhd3eb1b0_0
jsonpointer               2.0                        py_0    conda-forge
jsonschema                3.2.0            py38h32f6830_1    conda-forge
junit-xml                 1.9                pyh9f0ad1d_0    conda-forge
jupyter-server-proxy      1.6.0              pyhd8ed1ab_0    conda-forge
jupyter_client            6.1.11             pyhd8ed1ab_1    conda-forge
jupyter_core              4.7.1            py38h578d9bd_0    conda-forge
jupyter_sphinx            0.3.1            py38h578d9bd_1    conda-forge
jupyterlab                2.1.5                      py_0    conda-forge
jupyterlab_pygments       0.1.2              pyh9f0ad1d_0    conda-forge
jupyterlab_server         1.2.0                      py_0    conda-forge
jupyterlab_widgets        1.0.0              pyhd8ed1ab_1    conda-forge
jxrlib                    1.1                  h516909a_2    conda-forge
kealib                    1.4.14               hcc255d8_2    conda-forge
keyring                   22.3.0           py38h578d9bd_0    conda-forge
kiwisolver                1.3.1            py38h1fd1430_1    conda-forge
krb5                      1.17.2               h926e7f8_0    conda-forge
lapack                    3.9.0                    netlib    conda-forge
lcms2                     2.12                 hddcbb42_0    conda-forge
ld_impl_linux-64          2.35.1               hea4e1c9_2    conda-forge
lerc                      2.2.1                h9c3ff4c_0    conda-forge
libaec                    1.0.4                he1b5a44_1    conda-forge
libarchive                3.5.1                h3f442fb_1    conda-forge
libblas                   3.9.0                8_openblas    conda-forge
libcblas                  3.9.0                8_openblas    conda-forge
libcudf                   0.19.0a210311   cuda11.0_g3355e6039c_200    rapidsai-nightly
libcumlprims              0.19.0a210210   cuda11.0_g269fe04_0    rapidsai-nightly
libcurl                   7.75.0               hc4aaa36_0    conda-forge
libcypher-parser          0.6.2                         1    rapidsai
libdap4                   3.20.6               hd7c4107_1    conda-forge
libdeflate                1.7                  h7f98852_5    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 h516909a_1    conda-forge
libevent                  2.1.10               hcdb4288_3    conda-forge
libfaiss                  1.7.0           cuda110h8045045_4_cuda    conda-forge
libfaiss-avx2             1.7.0           cuda110h1234567_4_cuda    conda-forge
libffi                    3.3                  h58526e2_2    conda-forge
libgcc-ng                 9.3.0               h2828fa1_18    conda-forge
libgcrypt                 1.9.2                h7f98852_0    conda-forge
libgd                     2.3.0                h47910db_1    conda-forge
libgdal                   3.2.1                h96b6e7a_2    conda-forge
libgfortran-ng            9.3.0               hff62375_18    conda-forge
libgfortran5              9.3.0               hff62375_18    conda-forge
libglib                   2.66.7               h3e27bee_1    conda-forge
libgpg-error              1.41                 h9c3ff4c_0    conda-forge
libgsasl                  1.8.0                         2    conda-forge
libhwloc                  2.3.0                h5e5b7d1_1    conda-forge
libiconv                  1.16                 h516909a_0    conda-forge
libkml                    1.3.0             hd79254b_1012    conda-forge
liblapack                 3.9.0                8_openblas    conda-forge
liblapacke                3.9.0                8_openblas    conda-forge
liblief                   0.10.1               he1b5a44_2    conda-forge
libllvm10                 10.0.1               he513fc3_3    conda-forge
libllvm8                  8.0.1                hc9558a2_0    conda-forge
libnetcdf                 4.7.4           nompi_h56d31a8_107    conda-forge
libnghttp2                1.43.0               h812cca2_0    conda-forge
libntlm                   1.5                  h7b6447c_0
libopenblas               0.3.12          pthreads_h4812303_1    conda-forge
libpng                    1.6.37               hed695b0_2    conda-forge
libpq                     12.3                 hfd2b0eb_3    conda-forge
libprotobuf               3.15.5               h780b84a_0    conda-forge
librdkafka                1.5.3                hc49e61c_1    conda-forge
librmm                    0.19.0a210311   cuda11.0_g481dac4_38    rapidsai-nightly
librsvg                   2.50.3               hfa39831_1    conda-forge
librttopo                 1.1.0                hb271727_4    conda-forge
libsodium                 1.0.18               h516909a_1    conda-forge
libspatialindex           1.9.3                he1b5a44_3    conda-forge
libspatialite             5.0.1                h6ec7341_0    conda-forge
libssh2                   1.9.0                ha56f1ee_6    conda-forge
libstdcxx-ng              9.3.0               h6de172a_18    conda-forge
libthrift                 0.14.1               he6d91bd_1    conda-forge
libtiff                   4.2.0                hdc55705_0    conda-forge
libtool                   2.4.6             h58526e2_1007    conda-forge
libutf8proc               2.6.1                h7f98852_0    conda-forge
libuuid                   2.32.1            h14c3975_1000    conda-forge
libuv                     1.41.0               h7f98852_0    conda-forge
libwebp                   1.2.0                h3452ae3_0    conda-forge
libwebp-base              1.2.0                h7f98852_0    conda-forge
libxcb                    1.14                 h7b6447c_0
libxml2                   2.9.10               h72842e0_3    conda-forge
libxslt                   1.1.33               h15afd5d_2    conda-forge
libzopfli                 1.0.3                he1b5a44_0    conda-forge
lightgbm                  3.1.1            py38h709712a_0    conda-forge
llvm-openmp               11.0.1               h4bd325d_0    conda-forge
llvmlite                  0.35.0           py38h4630a5e_1    conda-forge
locket                    0.2.1            py38h06a4308_1
lxml                      4.6.2            py38hf1fe3a4_1    conda-forge
lz4-c                     1.9.3                h9c3ff4c_0    conda-forge
lzo                       2.10              h516909a_1000    conda-forge
m4                        1.4.18            h516909a_1001    conda-forge
make                      4.3                  hd18ef5c_1    conda-forge
markdown                  3.3.4              pyhd8ed1ab_0    conda-forge
markupsafe                1.1.1            py38h497a2fe_3    conda-forge
matplotlib-base           3.3.4            py38h0efea84_0    conda-forge
mccabe                    0.6.1                      py_1    conda-forge
mimesis                   4.0.0              pyh9f0ad1d_0    conda-forge
mistune                   0.8.4           py38h497a2fe_1003    conda-forge
mock                      4.0.3            py38h578d9bd_1    conda-forge
more-itertools            8.7.0              pyhd8ed1ab_0    conda-forge
moto                      2.0.1              pyhd3eb1b0_0
msgpack-python            1.0.2            py38h1fd1430_1    conda-forge
multidict                 5.1.0            py38h497a2fe_1    conda-forge
multipledispatch          0.6.0                      py_0    conda-forge
munch                     2.5.0                      py_0    conda-forge
mypy                      0.782                      py_0    conda-forge
mypy_extensions           0.4.3            py38h578d9bd_3    conda-forge
nbclient                  0.5.3              pyhd8ed1ab_0    conda-forge
nbconvert                 6.0.7            py38h578d9bd_3    conda-forge
nbformat                  5.1.2              pyhd8ed1ab_1    conda-forge
nbsphinx                  0.8.1              pyh44b312d_0    conda-forge
nccl                      2.8.4.1              h96e36e3_3    conda-forge
ncurses                   6.2                  h58526e2_4    conda-forge
nest-asyncio              1.5.1              pyhd3eb1b0_0
networkx                  2.5                        py_0    conda-forge
ninja                     1.10.2               h4bd325d_0    conda-forge
nltk                      3.5                        py_0
nodejs                    14.15.4              h92b4a50_1    conda-forge
notebook                  6.2.0            py38h578d9bd_0    conda-forge
numba                     0.52.0           py38h51da96c_0    conda-forge
numpy                     1.20.1           py38h18fd61f_0    conda-forge
numpydoc                  1.1.0                      py_1    conda-forge
nvtx                      0.2.3            py38h497a2fe_0    conda-forge
oauthlib                  3.1.0                      py_0
olefile                   0.46               pyh9f0ad1d_1    conda-forge
openblas                  0.3.12          pthreads_h04b7a96_1    conda-forge
openjpeg                  2.4.0                hf7af979_0    conda-forge
openssl                   1.1.1j               h7f98852_0    conda-forge
orc                       1.6.7                heec2584_1    conda-forge
packaging                 20.9               pyh44b312d_0    conda-forge
pandas                    1.2.3            py38h51da96c_0    conda-forge
pandoc                    1.19.2.1             hea2e7c5_1
pandocfilters             1.4.3            py38h06a4308_1
panel                     0.10.3             pyhd8ed1ab_0    conda-forge
pango                     1.42.4               h80147aa_5    conda-forge
param                     1.10.1             pyhd3deb0d_0    conda-forge
parquet-cpp               1.5.1                         1    conda-forge
parso                     0.7.1              pyh9f0ad1d_0    conda-forge
partd                     1.1.0                      py_0    conda-forge
patchelf                  0.12                 h2531618_1
pathspec                  0.8.1              pyhd3deb0d_0    conda-forge
patsy                     0.5.1                      py_0    conda-forge
pcre                      8.44                 he1b5a44_0    conda-forge
perl                      5.32.0               h36c2ea0_0    conda-forge
pexpect                   4.8.0            py38h32f6830_1    conda-forge
pickleshare               0.7.5           py38h32f6830_1002    conda-forge
pillow                    8.1.2            py38ha0e1e83_0    conda-forge
pip                       21.0.1             pyhd8ed1ab_0    conda-forge
pixman                    0.40.0               h36c2ea0_0    conda-forge
pkg-config                0.29.2            h516909a_1008    conda-forge
pkginfo                   1.7.0              pyhd8ed1ab_0    conda-forge
pluggy                    0.13.1           py38h578d9bd_4    conda-forge
pooch                     1.3.0              pyhd8ed1ab_0    conda-forge
poppler                   0.89.0               h2de54a5_5    conda-forge
poppler-data              0.4.10                        0    conda-forge
postgresql                12.3                 h6303168_3    conda-forge
proj                      7.1.1                h966b41f_3    conda-forge
prometheus_client         0.9.0              pyhd3deb0d_0    conda-forge
prompt-toolkit            3.0.16             pyha770c72_0    conda-forge
prompt_toolkit            3.0.16               hd8ed1ab_0    conda-forge
protobuf                  3.15.5           py38h709712a_0    conda-forge
psutil                    5.8.0            py38h497a2fe_1    conda-forge
ptyprocess                0.7.0              pyhd3deb0d_0    conda-forge
py                        1.10.0             pyhd3deb0d_0    conda-forge
py-cpuinfo                7.0.0              pyh9f0ad1d_0    conda-forge
py-lief                   0.10.1           py38h348cfbe_2    conda-forge
pyarrow                   1.0.1           py38h7b0d817_34_cuda    conda-forge
pyasn1                    0.4.8                      py_0    conda-forge
pyasn1-modules            0.2.8                      py_0
pycodestyle               2.6.0              pyh9f0ad1d_0    conda-forge
pycosat                   0.6.3           py38h497a2fe_1006    conda-forge
pycparser                 2.20               pyh9f0ad1d_2    conda-forge
pycryptodome              3.10.1           py38h80e8405_0    conda-forge
pyct                      0.4.8                    py38_0
pydeck                    0.5.0              pyh9f0ad1d_0    conda-forge
pyee                      7.0.4              pyh9f0ad1d_0    conda-forge
pyflakes                  2.2.0              pyh9f0ad1d_0    conda-forge
pygal                     2.4.0                      py_0    conda-forge
pygments                  2.8.1              pyhd8ed1ab_0    conda-forge
pyjwt                     2.0.1              pyhd8ed1ab_0    conda-forge
pynndescent               0.5.2              pyh44b312d_0    conda-forge
pynvml                    8.0.4                      py_1    conda-forge
pyopenssl                 20.0.1             pyhd8ed1ab_0    conda-forge
pyorc                     0.4.0            py38hec75c54_1    conda-forge
pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
pyppeteer                 0.2.2                      py_1    conda-forge
pyproj                    2.6.1.post1      py38h56787f0_3    conda-forge
pyrsistent                0.17.3           py38h497a2fe_2    conda-forge
pysocks                   1.7.1            py38h578d9bd_3    conda-forge
pytest                    6.2.2            py38h578d9bd_0    conda-forge
pytest-asyncio            0.12.0           py38h32f6830_2    conda-forge
pytest-benchmark          3.2.3              pyh9f0ad1d_0    conda-forge
pytest-cov                2.11.1             pyh44b312d_0    conda-forge
pytest-forked             1.3.0              pyhd3deb0d_0    conda-forge
pytest-timeout            1.4.2              pyh9f0ad1d_0    conda-forge
pytest-xdist              2.2.1              pyhd8ed1ab_0    conda-forge
python                    3.8.8           hffdb5ce_0_cpython    conda-forge
python-confluent-kafka    1.5.0            py38h1e0a361_0    conda-forge
python-dateutil           2.8.1                      py_0    conda-forge
python-jose               3.2.0                      py_0
python-libarchive-c       2.9              py38h924ce5b_2    conda-forge
python-louvain            0.15               pyhd3deb0d_0    conda-forge
python_abi                3.8                      1_cp38    conda-forge
pytz                      2021.1             pyhd8ed1ab_0    conda-forge
pyviz_comms               2.0.1              pyhd3deb0d_0    conda-forge
pywavelets                1.1.1            py38hab2c0dc_3    conda-forge
pyyaml                    5.4.1            py38h497a2fe_0    conda-forge
pyzmq                     22.0.3           py38h2035c66_1    conda-forge
rapidjson                 1.1.0             hf484d3e_1002    conda-forge
rapids-build-env          0.19.0a210311   cuda11.0_py38_gcd2b094_241    rapidsai-nightly
rapids-doc-env            0.19.0a210311   py38_gcd2b094_241    rapidsai-nightly
rapids-notebook-env       0.19.0a210311   cuda11.0_py38_gcd2b094_241    rapidsai-nightly
rapids-pytest-benchmark   0.0.13                     py_0    rapidsai
re2                       2020.11.01           h58526e2_0    conda-forge
readline                  8.1                  h27cfd23_0
readme_renderer           27.0               pyh9f0ad1d_0    conda-forge
recommonmark              0.7.1              pyhd8ed1ab_0    conda-forge
regex                     2020.11.13       py38h497a2fe_1    conda-forge
requests                  2.25.1             pyhd3deb0d_0    conda-forge
requests-oauthlib         1.3.0              pyh9f0ad1d_0    conda-forge
requests-toolbelt         0.9.1                      py_0    conda-forge
responses                 0.12.1             pyhd3deb0d_0    conda-forge
rfc3986                   1.4.0              pyh9f0ad1d_0    conda-forge
rhash                     1.4.1                h7f98852_0    conda-forge
ripgrep                   12.1.1               h516909a_1    conda-forge
rmm                       0.19.0a210311   cuda_11.0_py38_g481dac4_38    rapidsai-nightly
rsa                       4.7.2              pyh44b312d_0    conda-forge
rtree                     0.9.7            py38h02d302b_1    conda-forge
ruamel_yaml               0.15.87          py38h7b6447c_1
s2n                       1.0.0                h9b69904_0    conda-forge
s3fs                      0.5.2              pyhd8ed1ab_0    conda-forge
s3transfer                0.3.4              pyhd8ed1ab_0    conda-forge
scikit-image              0.18.1           py38h51da96c_0    conda-forge
scikit-learn              0.23.1           py38h3a94b23_0    conda-forge
scipy                     1.5.3            py38hb2138dd_0    conda-forge
seaborn                   0.11.1               ha770c72_0    conda-forge
seaborn-base              0.11.1             pyhd8ed1ab_1    conda-forge
secretstorage             3.3.1            py38h578d9bd_0    conda-forge
send2trash                1.5.0                      py_0    conda-forge
setuptools                52.0.0           py38h06a4308_0
shapely                   1.7.1            py38hc7361b7_1    conda-forge
shellcheck                0.7.1                         0    conda-forge
simpervisor               0.4                pyhd8ed1ab_0    conda-forge
six                       1.15.0             pyh9f0ad1d_0    conda-forge
snappy                    1.1.8                he1b5a44_3    conda-forge
snowballstemmer           2.1.0              pyhd8ed1ab_0    conda-forge
sortedcontainers          2.3.0              pyhd8ed1ab_0    conda-forge
soupsieve                 2.2                pyhd3eb1b0_0
spdlog                    1.7.0                hc9558a2_2    conda-forge
sphinx                    3.5.2              pyhd8ed1ab_0    conda-forge
sphinx-copybutton         0.3.1              pyhd8ed1ab_0    conda-forge
sphinx-markdown-tables    0.0.15             pyhd3deb0d_0    conda-forge
sphinx_rtd_theme          0.5.1              pyhd3deb0d_0    conda-forge
sphinxcontrib-applehelp   1.0.2                      py_0    conda-forge
sphinxcontrib-devhelp     1.0.2                      py_0    conda-forge
sphinxcontrib-htmlhelp    1.0.3                      py_0    conda-forge
sphinxcontrib-jsmath      1.0.1                      py_0    conda-forge
sphinxcontrib-qthelp      1.0.3                      py_0    conda-forge
sphinxcontrib-serializinghtml 1.1.4                      py_0    conda-forge
sphinxcontrib-websupport  1.2.4              pyh9f0ad1d_0    conda-forge
sqlite                    3.34.0               h74cdb3f_0    conda-forge
sshpubkeys                3.1.0                      py_0    conda-forge
statsmodels               0.12.2           py38h5c078b8_0    conda-forge
streamz                   0.6.2              pyh44b312d_0    conda-forge
tbb                       2020.3               hfd86e86_0
tblib                     1.7.0                      py_0
terminado                 0.9.2            py38h578d9bd_0    conda-forge
testpath                  0.4.4                      py_0    conda-forge
threadpoolctl             2.1.0              pyh5ca1d4c_0    conda-forge
tifffile                  2021.3.5           pyhd8ed1ab_0    conda-forge
tiledb                    2.2.4                hb9a9e87_2    conda-forge
tk                        8.6.10               hed695b0_1    conda-forge
toml                      0.10.2             pyhd8ed1ab_0    conda-forge
toolz                     0.11.1                     py_0    conda-forge
tornado                   6.1              py38h497a2fe_1    conda-forge
tqdm                      4.59.0             pyhd8ed1ab_0    conda-forge
traitlets                 5.0.5                      py_0    conda-forge
treelite                  1.0.0            py38hd08a91b_0    conda-forge
treelite-runtime          1.0.0                    pypi_0    pypi
twine                     3.3.0            py38h578d9bd_1    conda-forge
typed-ast                 1.4.2            py38h497a2fe_0    conda-forge
typing-extensions         3.7.4.3                       0    conda-forge
typing_extensions         3.7.4.3                    py_0    conda-forge
tzcode                    2021a                h7f98852_1    conda-forge
ucx                       1.9.0+gcd9efd3       cuda11.0_0    rapidsai
ucx-proc                  1.0.0                       gpu    rapidsai
ucx-py                    0.19.0a210311   py38_gcd9efd3_17    rapidsai-nightly
umap-learn                0.5.1            py38h578d9bd_0    conda-forge
urllib3                   1.26.3             pyhd8ed1ab_0    conda-forge
wcwidth                   0.2.5              pyh9f0ad1d_2    conda-forge
webencodings              0.5.1                      py_1    conda-forge
websocket-client          0.58.0           py38h06a4308_4
websockets                8.1              py38h497a2fe_3    conda-forge
werkzeug                  1.0.1              pyh9f0ad1d_0    conda-forge
wheel                     0.36.2             pyhd3deb0d_0    conda-forge
widgetsnbextension        3.5.1            py38h578d9bd_4    conda-forge
wrapt                     1.12.1           py38h497a2fe_3    conda-forge
xarray                    0.17.0             pyhd8ed1ab_0    conda-forge
xerces-c                  3.2.3                h9d8b166_2    conda-forge
xmltodict                 0.12.0                     py_0    conda-forge
xorg-kbproto              1.0.7             h14c3975_1002    conda-forge
xorg-libice               1.0.10               h516909a_0    conda-forge
xorg-libsm                1.2.3             hd9c2040_1000    conda-forge
xorg-libx11               1.7.0                h36c2ea0_0    conda-forge
xorg-libxext              1.3.4                h7f98852_1    conda-forge
xorg-libxrender           0.9.10            h7f98852_1003    conda-forge
xorg-renderproto          0.11.1            h14c3975_1002    conda-forge
xorg-xextproto            7.3.0             h14c3975_1002    conda-forge
xorg-xproto               7.0.31            h14c3975_1007    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
yaml                      0.2.5                h516909a_0    conda-forge
yarl                      1.6.3            py38h497a2fe_1    conda-forge
zeromq                    4.3.4                h9c3ff4c_0    conda-forge
zfp                       0.5.5                he1b5a44_4    conda-forge
zict                      2.0.0                      py_0    conda-forge
zipp                      3.4.1              pyhd8ed1ab_0    conda-forge
zlib                      1.2.11            h516909a_1010    conda-forge
zstd                      1.4.9                ha95c52a_0    conda-forge

Details about conda and system ( conda info ):
$ conda info
     active environment : ns0311-110
    active env location : /home/galahad/miniconda3/envs/ns0311-110
            shell level : 2
       user config file : /home/galahad/.condarc
 populated config files :
          conda version : 4.9.2
    conda-build version : not installed
         python version : 3.8.5.final.0
       virtual packages : __cuda=11.2=0
                          __glibc=2.31=0
                          __unix=0=0
                          __archspec=1=x86_64
       base environment : /home/galahad/miniconda3  (writable)
           channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /home/galahad/miniconda3/pkgs
                          /home/galahad/.conda/pkgs
       envs directories : /home/galahad/miniconda3/envs
                          /home/galahad/.conda/envs
               platform : linux-64
             user-agent : conda/4.9.2 requests/2.24.0 CPython/3.8.5 Linux/5.8.0-44-generic ubuntu/20.04.2 glibc/2.31
                UID:GID : 1000:1000
             netrc file : None
           offline mode : False

cc @viclafargue @hcho3 @jakirkham

@jakirkham
Member

I'm guessing you have already looked at nvidia-smi or used other profiling tools to see what is going on. It would be interesting to include that info if you have it. Maybe it sheds light on where things are getting stuck.

@h-vetinari
Member

Hey, sorry to hear this isn't working well. I'll happily admit that the GPU build options around real vs. virtual, JIT, PTX, etc. are a bit over my head. I've often received help on this from the NVIDIA folks (e.g. @kkraus14 & @teju85 helping out in #1). With the upstream build system's switch to CMake, I've done the best I can based on the CMake documentation, but it's possible that I'm doing this wrong. The pertinent parts are in https://github.com/conda-forge/faiss-split-feedstock/blob/master/recipe/build-lib.sh.

My goal was building for maximum compatibility, and at the time, PTX JIT compilation was recommended to me. If this should be removed or amended somehow, I'll happily accept PRs (or guidance on what to do).

@teju85

teju85 commented Mar 12, 2021

Looking at this line of code in the faiss feedstock, for CUDA 11.0 it does compile for sm_75 architectures.

@dantegd can you please try to disassemble the libfaiss binary and check if it does have sm_75 kernels compiled?

@dantegd
Contributor Author

dantegd commented Mar 12, 2021

@h-vetinari thanks for the response! Indeed, @teju85's advice is the best recommendation, and it is what faiss did explicitly in version 1.6.3 with the prior build system, so faiss 1.6.3 works smoothly on all its intended archs. I think the issue is a minor mix-up in the use of CMake's fairly recent CMAKE_CUDA_ARCHITECTURES feature to accomplish the same thing (its semantics are not obvious at first if one is used to passing gencode flags to nvcc directly; I still get them wrong the first time, every time). As of right now, the libfaiss conda package for CUDA 11.2, for example, is generated with:

-DCMAKE_CUDA_ARCHITECTURES=52-virtual;60-virtual;61-virtual;70-virtual;75-virtual;80-virtual;86-virtual;86-real

which causes it to include device code only for compute capability 86 (i.e. RTX 3070/3080/3090) and only PTX for everything below it. That is what triggers JIT compilation when, say, a Turing (75) or Pascal (60/61) GPU calls it. We can also inspect the library (as @teju85 recommended):

(ns0311-110) ➜  lib cuobjdump libfaiss.so -lelf
ELF file    1: libfaiss.1.sm_80.cubin
ELF file    2: libfaiss.2.sm_80.cubin
ELF file    3: libfaiss.3.sm_80.cubin
ELF file    4: libfaiss.4.sm_80.cubin
...
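
As a rough sketch of the rule described above, the following Python snippet classifies what a given GPU would do with a CMAKE_CUDA_ARCHITECTURES value. It assumes the documented CMake semantics (`NN-real` embeds device code only, `NN-virtual` embeds PTX only, bare `NN` embeds both) and the usual CUDA compatibility rules (device code runs within the same major version at an equal or higher minor; PTX can be JIT compiled for an equal or newer compute capability). The names `parse_cuda_architectures` and `load_path` are hypothetical helpers, not part of CMake, faiss, or the feedstock.

```python
from typing import NamedTuple, Set


class ArchPlan(NamedTuple):
    sass: Set[int]  # compute capabilities with embedded device code (e.g. 86)
    ptx: Set[int]   # compute capabilities with embedded PTX


def parse_cuda_architectures(value: str) -> ArchPlan:
    """Split a CMAKE_CUDA_ARCHITECTURES string into device-code and PTX sets."""
    sass, ptx = set(), set()
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        number, _, kind = entry.partition("-")
        if kind not in ("", "real", "virtual"):
            raise ValueError(f"unexpected suffix in {entry!r}")
        cc = int(number)
        if kind in ("", "real"):      # bare NN or NN-real: device code
            sass.add(cc)
        if kind in ("", "virtual"):   # bare NN or NN-virtual: PTX
            ptx.add(cc)
    return ArchPlan(sass, ptx)


def load_path(plan: ArchPlan, device_cc: int) -> str:
    """Return 'native', 'jit', or 'unsupported' for a device like 75 or 86."""
    # Device code is usable within the same major version, at equal or lower minor.
    if any(cc // 10 == device_cc // 10 and cc <= device_cc for cc in plan.sass):
        return "native"
    # Otherwise the driver needs PTX built for an equal or older capability.
    if any(cc <= device_cc for cc in plan.ptx):
        return "jit"
    return "unsupported"


# The value quoted above for the CUDA 11.2 libfaiss package.
FAISS_112 = ("52-virtual;60-virtual;61-virtual;70-virtual;"
             "75-virtual;80-virtual;86-virtual;86-real")

if __name__ == "__main__":
    plan = parse_cuda_architectures(FAISS_112)
    for cc in (60, 75, 86):
        print(cc, load_path(plan, cc))
```

With the value above, only 86 takes the native path; 60 and 75 both fall through to PTX, which matches the delay seen on Turing.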

Now most RAPIDS libraries are in the process of migrating to CMAKE_CUDA_ARCHITECTURES (as opposed to manually injecting the flags in our older, version-based CMake scripts). cuDF has already migrated, and what we did there was to use:

-DCMAKE_CUDA_ARCHITECTURES=60-real;70-real;75-real;80

This gives what is (if I'm not mistaken) our intended result: device code for the supported archs (so that supported GPUs can just use cuDF without needing a long JIT compilation step), plus PTX for 80, so that a future GPU with 90+ would be able to JIT compile and still use cuDF. Inspecting the library, we can see:

(ns0311-110) ➜  lib cuobjdump libcudf.so -lelf
ELF file    1: libcudf.1.sm_60.cubin
ELF file    2: libcudf.2.sm_70.cubin
ELF file    3: libcudf.3.sm_75.cubin
ELF file    4: libcudf.4.sm_80.cubin
ELF file    5: libcudf.5.sm_60.cubin
ELF file    6: libcudf.6.sm_70.cubin
ELF file    7: libcudf.7.sm_75.cubin
ELF file    8: libcudf.8.sm_80.cubin
...

Which is similar to how faiss 1.6.3 was built:

lib cuobjdump libfaiss.so -lelf
ELF file    1: GpuIndex.sm_35.cubin
ELF file    2: GpuIndex.sm_50.cubin
ELF file    3: GpuIndex.sm_52.cubin
ELF file    4: GpuIndex.sm_60.cubin
ELF file    5: GpuIndex.sm_61.cubin
ELF file    6: GpuIndex.sm_70.cubin
ELF file    7: GpuIndex.sm_75.cubin
ELF file    8: GpuIndex.sm_80.cubin
ELF file    9: GpuIndexBinaryFlat.sm_35.cubin
ELF file   10: GpuIndexBinaryFlat.sm_50.cubin
ELF file   11: GpuIndexBinaryFlat.sm_52.cubin
ELF file   12: GpuIndexBinaryFlat.sm_60.cubin
ELF file   13: GpuIndexBinaryFlat.sm_61.cubin
ELF file   14: GpuIndexBinaryFlat.sm_70.cubin
ELF file   15: GpuIndexBinaryFlat.sm_75.cubin
ELF file   16: GpuIndexBinaryFlat.sm_80.cubin
...

The difference in the .cubin names comes from how CMake forms the targets as opposed to faiss's older build system, if I'm not mistaken. The important part is that binaries are present for all supported archs, so JIT compilation is not an issue for 1.6.3.
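
The check done by eye on the listings above could be automated along these lines. This is a minimal sketch that only parses text in the `ELF file N: name.sm_XX.cubin` shape shown in this thread; `embedded_archs` is a hypothetical helper name, and the sample strings are abbreviated copies of the output quoted above rather than fresh runs of cuobjdump.

```python
import re
from collections import Counter

# Matches lines such as "ELF file    1: libfaiss.1.sm_80.cubin".
_CUBIN_LINE = re.compile(r"^ELF file\s+\d+:\s+\S+\.sm_(\d+)\.cubin$")


def embedded_archs(listing: str) -> Counter:
    """Count embedded cubins per sm_XX architecture in a cuobjdump -lelf listing."""
    counts = Counter()
    for line in listing.splitlines():
        match = _CUBIN_LINE.match(line.strip())
        if match:  # anything else (blank lines, "...") is ignored
            counts[int(match.group(1))] += 1
    return counts


# Abbreviated samples copied from the listings in this thread.
FAISS_170 = """\
ELF file    1: libfaiss.1.sm_80.cubin
ELF file    2: libfaiss.2.sm_80.cubin
ELF file    3: libfaiss.3.sm_80.cubin
ELF file    4: libfaiss.4.sm_80.cubin
...
"""

FAISS_163 = """\
ELF file    1: GpuIndex.sm_35.cubin
ELF file    2: GpuIndex.sm_50.cubin
ELF file    3: GpuIndex.sm_52.cubin
ELF file    4: GpuIndex.sm_60.cubin
ELF file    5: GpuIndex.sm_61.cubin
ELF file    6: GpuIndex.sm_70.cubin
ELF file    7: GpuIndex.sm_75.cubin
ELF file    8: GpuIndex.sm_80.cubin
...
"""

if __name__ == "__main__":
    for name, listing in (("1.7.0", FAISS_170), ("1.6.3", FAISS_163)):
        archs = embedded_archs(listing)
        print(name, sorted(archs), "sm_75 present:", 75 in archs)
```

In practice the listing would come from running `cuobjdump libfaiss.so -lelf` on the installed library; the sketch leaves that step out so it can run without the CUDA toolkit.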

So that was a very verbose way of describing the solution proposed in #37.

@h-vetinari
Member

Thanks for the analysis!
