
scipy.stats.mode does not accept non-numeric values #135

Closed
xinyu-dev opened this issue Dec 12, 2023 · 1 comment
@xinyu-dev

In Tutorial_Reference_Mapping.ipynb, the following lines raise an error:

for k in idx_list:
    if faiss_imported:
        idx = labels[k]
    else:
        idx, sim = get_similar_vectors(test_emebd[k][np.newaxis, ...], ref_cell_embeddings, k)
    pred = mode(ref_embed_adata.obs[cell_type_key][idx], axis=0) # I made this change. scipy.stats.mode no longer accepts non-numeric values
    preds.append(pred[0][0])

TypeError Traceback (most recent call last)
/tmp/ipykernel_21913/2423575529.py in ?()
17 if faiss_imported:
18 idx = labels[k]
19 else:
20 idx, sim = get_similar_vectors(test_emebd[k][np.newaxis, ...], ref_cell_embeddings, k)
---> 21 pred = mode(ref_embed_adata.obs[cell_type_key][idx], axis=0) # I made this change. scipy.stats.mode no longer accepts non-numeric values
22 preds.append(pred[0][0])
23 # preds.append(stat_mode(np.array(ref_embed_adata.obs[cell_type_key][idx])))
24

/usr/local/lib/python3.10/dist-packages/scipy/stats/_axis_nan_policy.py in ?(failed resolving arguments)
519 # behavior of those would break backward compatibility.
520
521 if sentinel:
522 samples = _remove_sentinel(samples, paired, sentinel)
--> 523 res = hypotest_fun_out(*samples, **kwds)
524 res = result_to_tuple(res)
525 res = _add_reduced_axes(res, reduced_axes, keepdims)
526 return tuple_to_result(*res)

/usr/local/lib/python3.10/dist-packages/scipy/stats/_stats_py.py in ?(a, axis, nan_policy, keepdims)
507 message = ("Argument a is not recognized as numeric. "
508 "Support for input that cannot be coerced to a numeric "
509 "array was deprecated in SciPy 1.9.0 and removed in SciPy "
510 "1.11.0. Please consider np.unique.")
--> 511 raise TypeError(message)
512
513 if a.size == 0:
514 NaN = _get_nan(a)

TypeError: Argument a is not recognized as numeric. Support for input that cannot be coerced to a numeric array was deprecated in SciPy 1.9.0 and removed in SciPy 1.11.0. Please consider np.unique.
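
For reference, the error message points to np.unique as a replacement. Below is a minimal sketch of that approach, not code from the notebook: the helper name `label_mode` and the sample labels are hypothetical, and ties resolve to whichever label sorts first because np.unique returns sorted values.

```python
import numpy as np


def label_mode(labels):
    """Return the most frequent label in a 1-D collection of strings.

    Hypothetical helper for illustration; on ties, the label that sorts
    first wins, because np.unique returns its values in sorted order.
    """
    values, counts = np.unique(np.asarray(labels), return_counts=True)
    return values[np.argmax(counts)]


# Hypothetical neighbor labels standing in for
# ref_embed_adata.obs[cell_type_key][idx] from the notebook.
neighbor_labels = ["T cell", "B cell", "T cell", "NK cell", "T cell"]
print(label_mode(neighbor_labels))  # prints: T cell
```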

My current workaround:

from statistics import mode as stat_mode

for k in idx_list:
    if faiss_imported:
        idx = labels[k]
    else:
        idx, sim = get_similar_vectors(test_emebd[k][np.newaxis, ...], ref_cell_embeddings, k)
    # pred = mode(ref_embed_adata.obs[cell_type_key][idx], axis=0) # I made this change. scipy.stats.mode no longer accepts non-numeric values
    # preds.append(pred[0][0])
    preds.append(stat_mode(np.array(ref_embed_adata.obs[cell_type_key][idx])))

Just want to confirm whether this workaround is appropriate. Thanks!

@subercui
Member

subercui commented Dec 12, 2023

Thank you @xinyu-dev! Yes, this looks related to the SciPy version change. The workaround looks good to me. I am also going to push a fix right now. It will use the pandas built-in value counts, which is supposedly faster.
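
A minimal sketch of what a value-counts-based majority vote could look like, assuming the neighbor labels can be wrapped in a pandas Series. This is an illustration only, not the committed fix: the helper name `majority_label` and the sample data are hypothetical, and the order among tied labels is not relied on here.

```python
import pandas as pd


def majority_label(labels):
    """Return the most frequent label using pandas value_counts.

    Hypothetical helper for illustration. value_counts sorts by count in
    descending order by default, so the first index entry is a most
    frequent label.
    """
    counts = pd.Series(labels).value_counts()
    return counts.index[0]


# Hypothetical neighbor labels for a single query cell.
neighbor_labels = ["T cell", "B cell", "T cell", "NK cell", "T cell"]
preds = [majority_label(neighbor_labels)]
print(preds)  # prints: ['T cell']
```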

subercui added a commit that referenced this issue Dec 12, 2023
🔧 update value counts for predictions, fix #135