[BUG] Confusion matrix should convert dtype as necessary #3567

Closed
beckernick opened this issue Mar 1, 2021 · 3 comments · Fixed by #3754
Labels: bug (Something isn't working), Cython / Python (Cython or Python issue), good first issue (Good for newcomers)

@beckernick
Member

confusion_matrix should automatically convert dtypes as appropriate in order to avoid failing, like other metric functions.

from sklearn.metrics import confusion_matrix
import numpy as np
import cuml

y = np.array([0.0, 1.0, 0.0])
y_pred = np.array([0.0, 1.0, 1.0])
print(confusion_matrix(y, y_pred))
cuml.metrics.confusion_matrix(y, y_pred)
[[1 1]
 [0 1]]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-12-7e819a19dcd9> in <module>
      6 y_pred = np.array([0.0, 1.0, 1.0])
      7 print(confusion_matrix(y, y_pred))
----> 8 cuml.metrics.confusion_matrix(y, y_pred)

/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/cuml/internals/api_decorators.py in inner_with_getters(*args, **kwargs)
    464 
    465                 # Call the function
--> 466                 ret_val = func(*args, **kwargs)
    467 
    468             return cm.process_return(ret_val)

/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/cuml/metrics/confusion_matrix.py in confusion_matrix(y_true, y_pred, labels, sample_weight, normalize)
     60     """
     61     y_true, n_rows, n_cols, dtype = \
---> 62         input_to_cuml_array(y_true, check_dtype=[cp.int32, cp.int64])
     63 
     64     y_pred, _, _, _ = \

/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/cuml/internals/api_decorators.py in inner(*args, **kwargs)
    359         def inner(*args, **kwargs):
    360             with self._recreate_cm(func, args):
--> 361                 return func(*args, **kwargs)
    362 
    363         return inner

/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/cuml/common/input_utils.py in input_to_cuml_array(X, order, deepcopy, check_dtype, convert_to_dtype, check_cols, check_rows, fail_on_order, force_contiguous)
    360             del X_m
    361             raise TypeError("Expected input to be of type in " +
--> 362                             str(check_dtype) + " but got " + str(type_str))
    363 
    364     # Checks based on parameters

TypeError: Expected input to be of type in [dtype('int32'), dtype('int64')] but got float64
conda list | grep "rapids\|scikit-learn"
# packages in environment at /raid/nicholasb/miniconda3/envs/rapids-gpubdb-20210301:
cudf                      0.19.0a210301   cuda_10.2_py37_gf79a841f92_147    rapidsai-nightly
cuml                      0.19.0a210301   cuda10.2_py37_g8fa2b9067_88    rapidsai-nightly
dask-cuda                 0.19.0a210301           py37_34    rapidsai-nightly
dask-cudf                 0.19.0a210301   py37_gf79a841f92_147    rapidsai-nightly
libcudf                   0.19.0a210301   cuda10.2_gf79a841f92_147    rapidsai-nightly
libcuml                   0.19.0a210301   cuda10.2_g8fa2b9067_88    rapidsai-nightly
libcumlprims              0.19.0a210210   cuda10.2_g269fe04_0    rapidsai-nightly
librmm                    0.19.0a210301   cuda10.2_g38a350f_32    rapidsai-nightly
rmm                       0.19.0a210301   cuda_10.2_py37_g38a350f_32    rapidsai-nightly
scikit-learn              0.24.1           py37h69acf81_0    conda-forge
ucx                       1.9.0+gcd9efd3       cuda10.2_0    rapidsai-nightly
ucx-proc                  1.0.0                       gpu    rapidsai-nightly
ucx-py                    0.19.0a210301   py37_gcd9efd3_16    rapidsai-nightly
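
Until the conversion happens inside cuML, one possible workaround is to cast the label arrays to an integer dtype before the call. The following is only a minimal sketch: `to_integer_labels` is a hypothetical helper, not part of cuML or scikit-learn, and it assumes the labels are integer-valued floats as in the report above.

```python
import numpy as np


def to_integer_labels(y, dtype=np.int32):
    """Hypothetical helper: cast labels to an integer dtype only if lossless."""
    y = np.asarray(y)
    if np.issubdtype(y.dtype, np.integer):
        # Already integral; just normalize the width.
        return y.astype(dtype, copy=False)
    casted = y.astype(dtype)
    # Refuse to silently truncate labels such as 0.5.
    if not np.array_equal(casted, y):
        raise ValueError("labels are not integer-valued; refusing to cast")
    return casted


y = to_integer_labels(np.array([0.0, 1.0, 0.0]))
y_pred = to_integer_labels(np.array([0.0, 1.0, 1.0]))

# With cuML installed, these arrays should satisfy the int32/int64 check
# shown in the traceback:
# cuml.metrics.confusion_matrix(y, y_pred)
```

The traceback shows that `input_to_cuml_array` also accepts a `convert_to_dtype` argument, so a library-side fix could presumably do the equivalent cast there. That is an assumption about the fix, not a description of it.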
beckernick added the bug, Cython / Python, and good first issue labels on Mar 1, 2021
@esnvidia

Adding to this bug:

Code:

from cuml.metrics import confusion_matrix

print(confusion_matrix(y_true=labels.type(dtype=torch.int32).to(device),
                       y_pred=pred.argmax(1).type(dtype=torch.int32).to(device)))
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-23-b32704b34c7b> in <module>
     20 
     21     print(confusion_matrix(y_true=labels.type(dtype=torch.int32).to(device),
---> 22                            y_pred=pred.argmax(1).type(dtype=torch.int32).to(device)))


/opt/conda/envs/rapids/lib/python3.7/site-packages/cuml/internals/api_decorators.py in inner_with_getters(*args, **kwargs)
    464 
    465                 # Call the function
--> 466                 ret_val = func(*args, **kwargs)
    467 
    468             return cm.process_return(ret_val)

/opt/conda/envs/rapids/lib/python3.7/site-packages/cuml/metrics/confusion_matrix.py in confusion_matrix(y_true, y_pred, labels, sample_weight, normalize)
     67 
     68     if labels is None:
---> 69         labels = sorted_unique_labels(y_true, y_pred)
     70         n_labels = len(labels)
     71     else:

/opt/conda/envs/rapids/lib/python3.7/site-packages/cuml/metrics/utils.py in sorted_unique_labels(*ys)
     22     labels."""
     23     ys = (cp.unique(y) for y in ys)
---> 24     labels = cp.unique(cp.concatenate(ys))
     25     return labels

/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/manipulation/join.py in concatenate(tup, axis, out)
     55         tup = [m.ravel() for m in tup]
     56         axis = 0
---> 57     return core.concatenate_method(tup, axis, out)
     58 
     59 

cupy/core/_routines_manipulation.pyx in cupy.core._routines_manipulation.concatenate_method()

cupy/core/_routines_manipulation.pyx in cupy.core._routines_manipulation.concatenate_method()

/opt/conda/envs/rapids/lib/python3.7/site-packages/cuml/metrics/utils.py in <genexpr>(.0)
     21     """Extract an ordered array of unique labels from one or more arrays of
     22     labels."""
---> 23     ys = (cp.unique(y) for y in ys)
     24     labels = cp.unique(cp.concatenate(ys))
     25     return labels

/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/manipulation/add_remove.py in unique(ar, return_index, return_inverse, return_counts, axis)
    110         aux = ar[perm]
    111     else:
--> 112         ar.sort()
    113         aux = ar
    114     mask = cupy.empty(aux.shape, dtype=cupy.bool_)

cupy/core/core.pyx in cupy.core.core.ndarray.sort()

cupy/core/core.pyx in cupy.core.core.ndarray.sort()

cupy/core/_routines_sorting.pyx in cupy.core._routines_sorting._ndarray_sort()

cupy/cuda/thrust.pyx in cupy.cuda.thrust.sort()

RuntimeError: radix_sort: failed on 1st step: cudaErrorInvalidDevice: invalid device ordinal

Code that does work: moving the torch tensors to the host with .cpu() and using scikit-learn's confusion_matrix instead:

from sklearn.metrics import confusion_matrix

print(confusion_matrix(y_true=labels.type(dtype=torch.int32).cpu(),
                       y_pred=pred.argmax(1).type(dtype=torch.int32).cpu()))

[[1      2]
 [  3      4]]

@divyegala
Member

@JohnZed @dantegd should we split this issue into two parts, or should I try to solve it in one PR? The second bug added here seems to have something to do with torch, so it would involve more testing.

@JohnZed
Contributor

JohnZed commented Apr 7, 2021

Yes, agreed, we should split it into two separate issues.

rapids-bot bot pushed a commit that referenced this issue Apr 16, 2021
vimarsh6739 pushed a commit to vimarsh6739/cuml that referenced this issue Oct 9, 2023