[BUG] Cuml nearest neighbors returns wrong distances #4624

Closed
siegrikw opened this issue Mar 9, 2022 · 4 comments
Labels
bug Something isn't working

siegrikw commented Mar 9, 2022

Describe the bug
When the input has exactly 256 rows (no fewer, no more) and n_neighbors is greater than 1 and at most 64, the returned distances and indices are invalid:

n_rows = 256
1 < n_neighbors <= 64

Steps/Code to reproduce bug

import cupy as cp
from cuml.neighbors import NearestNeighbors
from sklearn.neighbors import NearestNeighbors as NearestNeighborsCPU

cp.random.seed(seed=42)
X = cp.random.rand(256,3)
n_neighbors = 32

cuml_model = NearestNeighbors(n_neighbors=n_neighbors, algorithm="brute")
cuml_model.fit(X)
cuml_distances, cuml_indices = cuml_model.kneighbors(X, two_pass_precision=True)

sklearn_model = NearestNeighborsCPU(n_neighbors=n_neighbors, algorithm="brute")
sklearn_model.fit(X.get())
sklearn_distances, sklearn_indices = sklearn_model.kneighbors(X.get())

# Distance results
print(f"CUML Distances :: \n{cuml_distances}\n")
print(f"Sklearn Distances :: \n{sklearn_distances}\n")

# Index results
print(f"CUML Indices :: \n{cuml_indices}\n")
print(f"Sklearn Indices :: \n{sklearn_indices}\n")

Expected behavior
The first distance for every point should be 0 (i.e. each point should be its own nearest neighbor), as returned by sklearn.neighbors.NearestNeighbors, and the indices should not all be 0.
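
The expectation above can be turned into an automatic check. The following is a minimal sketch in plain Python, not part of the original report: `brute_force_knn` and `self_neighbor_violations` are hypothetical helper names, and the tolerance is an assumption (GPU results in float32 may need a looser value). Outputs from cuML or scikit-learn could be passed to the check after converting them to nested lists.

```python
import math
import random


def brute_force_knn(points, k):
    """Exact k-nearest neighbors by exhaustive search (Euclidean distance)."""
    distances, indices = [], []
    for query in points:
        # Distance from this query to every point, paired with that point's index.
        scored = sorted((math.dist(query, p), j) for j, p in enumerate(points))
        distances.append([d for d, _ in scored[:k]])
        indices.append([j for _, j in scored[:k]])
    return distances, indices


def self_neighbor_violations(distances, indices, tol=1e-6):
    """Return the row numbers whose first neighbor is not the query point itself.

    Assumes the queries are the fitted points and that there are no duplicates.
    """
    bad = []
    for i, (d_row, i_row) in enumerate(zip(distances, indices)):
        if abs(d_row[0]) > tol or i_row[0] != i:
            bad.append(i)
    return bad


# Same shape as the reproducer: 256 rows, 3 columns, 32 neighbors.
random.seed(42)
X = [[random.random() for _ in range(3)] for _ in range(256)]
ref_distances, ref_indices = brute_force_knn(X, k=32)
print(self_neighbor_violations(ref_distances, ref_indices))  # an empty list means every point is its own nearest neighbor
```

A non-empty list from this check on the cuML output would correspond to the symptom described in this report.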

Environment details:

  • Environment location: Bare-metal
  • Linux Distro/Architecture: Ubuntu 20.04 amd64
  • GPU Model/Driver: 3090 and driver 495.46
  • CUDA: 11.5
  • Method of cuDF & cuML install: conda
    • Output of conda list:
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                      1_llvm    conda-forge
abseil-cpp                20210324.2           h9c3ff4c_0    conda-forge
arrow-cpp                 5.0.0           py38hd2b13db_8_cuda    conda-forge
arrow-cpp-proc            3.0.0                      cuda    conda-forge
attrs                     21.4.0             pyhd8ed1ab_0    conda-forge
aws-c-cal                 0.5.11               h95a6274_0    conda-forge
aws-c-common              0.6.2                h7f98852_0    conda-forge
aws-c-event-stream        0.2.7               h3541f99_13    conda-forge
aws-c-io                  0.10.5               hfb6a706_0    conda-forge
aws-checksums             0.1.11               ha31a3da_7    conda-forge
aws-sdk-cpp               1.8.186              hb4091e7_3    conda-forge
backcall                  0.2.0              pyhd3eb1b0_0  
bokeh                     2.4.2            py38h578d9bd_0    conda-forge
brotli-python             1.0.9            py38h709712a_6    conda-forge
brotlipy                  0.7.0           py38h497a2fe_1003    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
c-ares                    1.18.1               h7f98852_0    conda-forge
ca-certificates           2021.10.8            ha878542_0    conda-forge
cachetools                5.0.0              pyhd8ed1ab_0    conda-forge
certifi                   2021.10.8        py38h578d9bd_1    conda-forge
cffi                      1.15.0           py38hd667e15_1  
charset-normalizer        2.0.12             pyhd8ed1ab_0    conda-forge
click                     8.0.4            py38h578d9bd_0    conda-forge
cloudpickle               2.0.0              pyhd8ed1ab_0    conda-forge
colorama                  0.4.4              pyh9f0ad1d_0    conda-forge
cryptography              36.0.1           py38h3e25421_0    conda-forge
cuda-python               11.6.0           py38h3fd9d12_0    nvidia
cudatoolkit               11.5.1               hcf5317a_9    nvidia
cudf                      22.02.00        cuda_11_py38_g774d859fef_0    rapidsai
cugraph                   22.02.00        cuda11_py38_gf391a5c3_0    rapidsai
cuml                      22.02.00        cuda11_py38_g679e4fa21_0    rapidsai
cupy                      10.2.0           py38h405e1b6_0    conda-forge
cycler                    0.10.0                   py38_0    anaconda
cytoolz                   0.11.2           py38h497a2fe_1    conda-forge
dash                      2.2.0              pyhd8ed1ab_0    conda-forge
dash-bootstrap-components 1.0.3              pyhd8ed1ab_0    conda-forge
dash-daq                  0.5.0              pyh9f0ad1d_1    conda-forge
dask                      2022.1.0           pyhd8ed1ab_0    conda-forge
dask-core                 2022.1.0           pyhd8ed1ab_0    conda-forge
dask-cuda                 22.02.00                 py38_0    rapidsai
dask-cudf                 22.02.00        cuda_11_py38_g774d859fef_0    rapidsai
dbus                      1.13.18              hb2f20db_0    anaconda
debugpy                   1.5.1            py38h295c915_0  
decorator                 5.1.1              pyhd3eb1b0_0  
distributed               2022.1.0         py38h578d9bd_0    conda-forge
dlpack                    0.5                  h9c3ff4c_0    conda-forge
entrypoints               0.3                      py38_0  
expat                     2.2.10               he6710b0_2    anaconda
faiss-proc                1.0.0                      cuda    rapidsai
fastavro                  1.4.10           py38h0a891b7_0    conda-forge
fastrlock                 0.8              py38h709712a_1    conda-forge
flask                     2.0.3              pyhd8ed1ab_0    conda-forge
flask-compress            1.11               pyhd8ed1ab_0    conda-forge
fontconfig                2.13.0               h9420a91_0    anaconda
freetype                  2.10.4               h0708190_1    conda-forge
fsspec                    2022.2.0           pyhd8ed1ab_0    conda-forge
gflags                    2.2.2             he1b5a44_1004    conda-forge
glib                      2.56.2               hd408876_0    anaconda
glog                      0.5.0                h48cff8f_0    conda-forge
grpc-cpp                  1.40.0               h05f19cf_1    conda-forge
gst-plugins-base          1.14.0               hbbd80ab_1    anaconda
gstreamer                 1.14.0               hb453b48_1    anaconda
heapdict                  1.0.1                      py_0    conda-forge
icu                       58.2                 he6710b0_3    anaconda
idna                      3.3                pyhd8ed1ab_0    conda-forge
importlib-metadata        4.11.2           py38h578d9bd_0    conda-forge
importlib_resources       5.4.0              pyhd8ed1ab_0    conda-forge
ipykernel                 6.4.1            py38h06a4308_1  
ipython                   7.31.1           py38h06a4308_0  
ipython_genutils          0.2.0                      py_1    conda-forge
itsdangerous              2.1.0              pyhd8ed1ab_0    conda-forge
jbig                      2.1               h7f98852_2003    conda-forge
jedi                      0.18.1           py38h06a4308_1  
jinja2                    3.0.3              pyhd8ed1ab_0    conda-forge
joblib                    1.1.0              pyhd8ed1ab_0    conda-forge
jpeg                      9e                   h7f98852_0    conda-forge
jsonschema                4.4.0              pyhd8ed1ab_0    conda-forge
jupyter_client            7.1.2              pyhd3eb1b0_0  
jupyter_core              4.9.2            py38h578d9bd_0    conda-forge
keyutils                  1.6.1                h166bdaf_0    conda-forge
kiwisolver                1.2.0            py38hfd86e86_0    anaconda
krb5                      1.19.2               h3790be6_4    conda-forge
lcms2                     2.12                 hddcbb42_0    conda-forge
ld_impl_linux-64          2.35.1               h7274673_9  
lerc                      2.2.1                h9c3ff4c_0    conda-forge
libblas                   3.9.0           13_linux64_openblas    conda-forge
libbrotlicommon           1.0.9                h7f98852_6    conda-forge
libbrotlidec              1.0.9                h7f98852_6    conda-forge
libbrotlienc              1.0.9                h7f98852_6    conda-forge
libcblas                  3.9.0           13_linux64_openblas    conda-forge
libcudf                   22.02.00        cuda11_g774d859fef_0    rapidsai
libcugraph                22.02.00        cuda11_gf391a5c3_0    rapidsai
libcuml                   22.02.00        cuda11_g679e4fa21_0    rapidsai
libcumlprims              22.02.00        cuda11_g2207299_0    nvidia
libcurl                   7.79.1               h2574ce0_1    conda-forge
libcusolver               11.3.3.112           hc34d849_0    nvidia
libdeflate                1.7                  h7f98852_5    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 h516909a_1    conda-forge
libevent                  2.1.10               h9b69904_4    conda-forge
libfaiss                  1.7.0           cuda112h5bea7ad_8_cuda    conda-forge
libffi                    3.3                  he6710b0_2  
libgcc-ng                 11.2.0              h1d223b6_13    conda-forge
libgfortran-ng            11.2.0              h69a702a_13    conda-forge
libgfortran5              11.2.0              h5c6108e_13    conda-forge
libhwloc                  2.3.0                h5e5b7d1_1    conda-forge
libiconv                  1.16                 h516909a_0    conda-forge
liblapack                 3.9.0           13_linux64_openblas    conda-forge
libnghttp2                1.43.0               h812cca2_1    conda-forge
libopenblas               0.3.18          pthreads_h8fe5266_0    conda-forge
libpng                    1.6.37               h21135ba_2    conda-forge
libprotobuf               3.18.0               h780b84a_1    conda-forge
librmm                    22.02.00        cuda11_ge3e3215_0    rapidsai
libsodium                 1.0.18               h7b6447c_0  
libssh2                   1.10.0               ha56f1ee_2    conda-forge
libstdcxx-ng              11.2.0              he4da1e4_13    conda-forge
libthrift                 0.15.0               he6d91bd_0    conda-forge
libtiff                   4.3.0                hf544144_1    conda-forge
libutf8proc               2.7.0                h7f98852_0    conda-forge
libuuid                   1.0.3                h1bed415_2    anaconda
libwebp-base              1.2.2                h7f98852_1    conda-forge
libxcb                    1.14                 h7b6447c_0    anaconda
libxml2                   2.9.10               hb55368b_3    anaconda
llvm-openmp               12.0.1               h4bd325d_1    conda-forge
llvmlite                  0.38.0           py38he1b5a44_0    numba
locket                    0.2.0                      py_2    conda-forge
lz4-c                     1.9.3                h9c3ff4c_1    conda-forge
markupsafe                2.1.0            py38h0a891b7_1    conda-forge
matplotlib                3.3.1                         0    anaconda
matplotlib-base           3.3.1            py38h817c723_0    anaconda
matplotlib-inline         0.1.2              pyhd3eb1b0_2  
msgpack-python            1.0.3            py38h1fd1430_0    conda-forge
nbformat                  5.1.3              pyhd8ed1ab_0    conda-forge
nccl                      2.11.4.1             h5c60f85_2    conda-forge
ncurses                   6.3                  h7f8727e_2  
nest-asyncio              1.5.1              pyhd3eb1b0_0  
networkx                  2.5                        py_0    anaconda
numba                     0.55.1          np1.11py3.8hc13618b_g76720bf88_0    numba
numpy                     1.21.5           py38h87f13fb_0    conda-forge
nvtx                      0.2.3            py38h497a2fe_1    conda-forge
olefile                   0.46               pyh9f0ad1d_1    conda-forge
openjpeg                  2.4.0                hb52868f_1    conda-forge
openssl                   1.1.1l               h7f98852_0    conda-forge
orc                       1.7.0                h68e2c4e_0    conda-forge
packaging                 21.3               pyhd8ed1ab_0    conda-forge
pandas                    1.3.5            py38h43a58ef_0    conda-forge
parquet-cpp               1.5.1                         2    conda-forge
parso                     0.8.3              pyhd3eb1b0_0  
partd                     1.2.0              pyhd8ed1ab_0    conda-forge
pcre                      8.44                 he6710b0_0    anaconda
pexpect                   4.8.0              pyhd3eb1b0_3  
pickleshare               0.7.5           pyhd3eb1b0_1003  
pillow                    8.3.2            py38h8e6f84c_0    conda-forge
pip                       21.2.4           py38h06a4308_0  
plotly                    5.6.0                      py_0    plotly
prompt-toolkit            3.0.20             pyhd3eb1b0_0  
protobuf                  3.18.0           py38h709712a_0    conda-forge
psutil                    5.9.0            py38h497a2fe_0    conda-forge
ptxcompiler               0.2.0            py38h98f4b32_0    rapidsai
ptyprocess                0.7.0              pyhd3eb1b0_2  
pyarrow                   5.0.0           py38hed47224_8_cuda    conda-forge
pycparser                 2.21               pyhd8ed1ab_0    conda-forge
pygments                  2.11.2             pyhd3eb1b0_0  
pynndescent               0.5.6              pyh6c4a22f_0    conda-forge
pynvml                    11.4.1             pyhd8ed1ab_0    conda-forge
pyopenssl                 22.0.0             pyhd8ed1ab_0    conda-forge
pyparsing                 3.0.7              pyhd8ed1ab_0    conda-forge
pyqt                      5.9.2            py38h05f1152_4    anaconda
pyrsistent                0.18.1           py38h497a2fe_0    conda-forge
pysocks                   1.7.1            py38h578d9bd_4    conda-forge
python                    3.8.10          h49503c6_1_cpython    conda-forge
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python_abi                3.8                      2_cp38    conda-forge
pytz                      2021.3             pyhd8ed1ab_0    conda-forge
pyyaml                    6.0              py38h497a2fe_3    conda-forge
pyzmq                     22.3.0           py38h295c915_2  
qt                        5.9.7                h5867ecd_1    anaconda
re2                       2021.09.01           h9c3ff4c_0    conda-forge
readline                  8.1.2                h7f8727e_1  
requests                  2.27.1             pyhd8ed1ab_0    conda-forge
rmm                       22.02.00        cuda11_py38_ge3e3215_0_has_cma    rapidsai
s2n                       1.0.10               h9b69904_0    conda-forge
scikit-learn              0.23.2           py38h0573a6f_0    anaconda
scipy                     1.8.0            py38h56a6a73_1    conda-forge
seaborn                   0.11.0                     py_0    anaconda
setuptools                58.0.4           py38h06a4308_0  
sip                       4.19.24          py38he6710b0_0    anaconda
six                       1.16.0             pyh6c4a22f_0    conda-forge
snappy                    1.1.8                he1b5a44_3    conda-forge
sortedcontainers          2.4.0              pyhd8ed1ab_0    conda-forge
spdlog                    1.8.5                h4bd325d_1    conda-forge
sqlite                    3.37.2               hc218d9a_0  
tbb                       2021.5.0             h4bd325d_0    conda-forge
tblib                     1.7.0              pyhd8ed1ab_0    conda-forge
tenacity                  8.0.1            py38h06a4308_0  
threadpoolctl             2.1.0              pyh5ca1d4c_0    anaconda
tk                        8.6.11               h1ccaba5_0  
toolz                     0.11.2             pyhd8ed1ab_0    conda-forge
tornado                   6.1              py38h497a2fe_2    conda-forge
tqdm                      4.63.0             pyhd8ed1ab_0    conda-forge
traitlets                 5.1.1              pyhd8ed1ab_0    conda-forge
treelite                  2.2.1            py38hdd725b4_2    conda-forge
treelite-runtime          2.2.1                    pypi_0    pypi
typing_extensions         4.1.1              pyha770c72_0    conda-forge
ucx                       1.12.0+gd367332      cuda11.2_0    rapidsai
ucx-proc                  1.0.0                       gpu    rapidsai
ucx-py                    0.24.0          py38_gd367332_0    rapidsai
umap-learn                0.5.2            py38h578d9bd_1    conda-forge
urllib3                   1.26.8             pyhd8ed1ab_1    conda-forge
wcwidth                   0.2.5              pyhd3eb1b0_0  
werkzeug                  2.0.3              pyhd8ed1ab_1    conda-forge
wheel                     0.37.1             pyhd3eb1b0_0  
xz                        5.2.5                h7b6447c_0  
yaml                      0.2.5                h7f98852_2    conda-forge
zeromq                    4.3.4                h2531618_0  
zict                      2.1.0              pyhd8ed1ab_0    conda-forge
zipp                      3.7.0              pyhd8ed1ab_1    conda-forge
zlib                      1.2.11               h7f8727e_4  
zstd                      1.5.0                ha95c52a_0    conda-forge

siegrikw added the "? - Needs Triage" and "bug" labels on Mar 9, 2022
@siegrikw
Author

The bug does not appear in RAPIDS 21.06. After reverting to 21.06, the code above produces the intended behavior.

@divyegala
Member

@teju85 @mdoijade could I ask one of you to check this out?

@zbjornson
Contributor

This might be due to rapidsai/raft#568. The fused kernel is enabled when k <= 64.

divyegala removed the "? - Needs Triage" label on Mar 28, 2022
rapids-bot bot pushed a commit to rapidsai/raft that referenced this issue Mar 31, 2022
…604)

This PR fixes #568 and rapidsai/cuml#4624:
-- Fix an issue in fusedL2knn that occurs when the number of rows is a multiple of 256.
-- Make the index value size_t to avoid int overflow. Overflow is not the cause of these issues, but it could occur for larger input sizes.
-- Add some additional test cases to the fusedL2knn test.

Authors:
  - Mahesh Doijade (https://github.com/mdoijade)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #604
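
The commit message refers to row counts that are multiples of 256 and to added test cases. As an illustration only, here is a minimal plain-Python sketch of a boundary sweep around such sizes. It is not the RAFT test: `reference_knn`, `sweep`, and `simulated_buggy_knn` are hypothetical names, and `simulated_buggy_knn` is a stand-in that merely imitates the reported symptom so the harness has something to detect. To exercise a real implementation, `knn_fn` would be replaced by a wrapper around that library's k-nearest-neighbors call.

```python
import itertools
import math
import random


def reference_knn(points, k):
    """Exact k-nearest neighbors by exhaustive search (Euclidean distance)."""
    distances, indices = [], []
    for query in points:
        scored = sorted((math.dist(query, p), j) for j, p in enumerate(points))
        distances.append([d for d, _ in scored[:k]])
        indices.append([j for _, j in scored[:k]])
    return distances, indices


def simulated_buggy_knn(points, k):
    """HYPOTHETICAL stand-in that imitates the reported symptom; not cuML or RAFT code."""
    if len(points) % 256 == 0 and 1 < k <= 64:
        return [[0.0] * k for _ in points], [[0] * k for _ in points]
    return reference_knn(points, k)


def sweep(knn_fn, row_counts, ks, n_dims=3, seed=42, tol=1e-4):
    """Return the (n_rows, k) pairs where knn_fn's distances differ from the reference."""
    failures = []
    for n_rows, k in itertools.product(row_counts, ks):
        rng = random.Random(seed)
        X = [[rng.random() for _ in range(n_dims)] for _ in range(n_rows)]
        got, _ = knn_fn(X, k)
        want, _ = reference_knn(X, k)
        # Compare distances only: indices may legitimately differ when distances tie.
        ok = all(
            math.isclose(g, w, abs_tol=tol)
            for g_row, w_row in zip(got, want)
            for g, w in zip(g_row, w_row)
        )
        if not ok:
            failures.append((n_rows, k))
    return failures


# Sizes just below, at, and just above 256 rows; k on both sides of 64.
failures = sweep(simulated_buggy_knn, row_counts=[255, 256, 257], ks=[2, 64, 65])
print(failures)  # [(256, 2), (256, 64)]
```

Testing one size on each side of the boundary makes it clear whether a failure is tied to the exact multiple of 256 or to a wider range of inputs.
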
@divyegala
Member

@siegrikw We have fixed this on our side. Closing this issue for now, but feel free to reopen it if the bug persists.
