cuSOLVER error encountered when running rsc.pp.pca(adata, n_comps=50) #6182
Thanks for the issue @hyjforesight, indeed it seems to be coming from […]. Sorry if I missed it, but any chance you could provide some additional info about the dataset you are processing?
Hello @dantegd
import numpy as np
import pandas as pd
import scanpy as sc
import scanpy.external as sce
import scipy
sc.settings.verbosity = 3
sc.logging.print_header()
sc.set_figure_params(dpi=100, dpi_save=600)
import matplotlib.pyplot as pl
from matplotlib import rcParams
import os
import cupy as cp
import rapids_singlecell as rsc
import warnings
warnings.filterwarnings("ignore")
# Use RMM managed memory for CuPy allocations (pool allocator disabled)
import rmm
from rmm.allocators.cupy import rmm_cupy_allocator
rmm.reinitialize(
    managed_memory=True,
    pool_allocator=False,
)
cp.cuda.set_allocator(rmm_cupy_allocator)
adata = sc.read('GC_all.h5ad')
rsc.get.anndata_to_GPU(adata)
rsc.pp.flag_gene_family(adata, gene_family_name="mt", gene_family_prefix="MT-")
rsc.pp.flag_gene_family(adata, gene_family_name="rpl", gene_family_prefix="RPL")
rsc.pp.flag_gene_family(adata, gene_family_name="rps", gene_family_prefix="RPS")
rsc.pp.calculate_qc_metrics(adata, qc_vars=['mt','rpl','rps'], log1p=False)
sc.pl.violin(adata, keys=['n_genes_by_counts', 'total_counts', 'pct_counts_mt','pct_counts_rpl','pct_counts_rps'], jitter=0.4, multi_panel=True)
sc.pl.scatter(adata, x='total_counts', y='pct_counts_mt')
sc.pl.scatter(adata, x='total_counts', y='pct_counts_rpl')
sc.pl.scatter(adata, x='total_counts', y='pct_counts_rps')
sc.pl.scatter(adata, x='total_counts', y='n_genes_by_counts')
adata = adata[adata.obs.n_genes_by_counts < 8000, :]
adata = adata[adata.obs.pct_counts_mt < 50, :]
adata = adata[adata.obs.pct_counts_rpl < 50, :]
adata = adata[adata.obs.pct_counts_rps < 50, :]
rsc.pp.filter_cells(adata, qc_var='n_genes_by_counts', min_count=100)
rsc.pp.filter_genes(adata, qc_var='n_cells_by_counts', min_count=25)
adata.layers["counts"] = adata.X.copy()
rsc.pp.normalize_total(adata, target_sum=1e4)
rsc.pp.log1p(adata)
rsc.pp.highly_variable_genes(adata, n_top_genes=5000)
sc.pl.highly_variable_genes(adata)
print(sum(adata.var.highly_variable))
adata.raw=adata
rsc.pp.regress_out(adata, keys=['total_counts', 'pct_counts_mt','pct_counts_rpl','pct_counts_rps'])
rsc.pp.scale(adata, max_value=10)
adata
AnnData object with n_obs × n_vars = 934583 × 5000
obs: 'batch', 'type', 'more_type', 'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'total_counts_rpl', 'pct_counts_rpl', 'total_counts_rps', 'pct_counts_rps'
var: 'mt', 'rpl', 'rps', 'n_cells_by_counts', 'total_counts', 'mean_counts', 'pct_dropout_by_counts', 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'mean', 'std'
uns: 'log1p', 'hvg'
layers: 'counts'
rsc.pp.pca(adata, n_comps=50)
Then, the error comes out.
This seems similar to the error seen in issue #5555. However, that should have been solved with cuml 24.10. If it does turn out to be that error, it should be solved with CUDA toolkit 12.4.1.003, so a more recent CUDA version such as 12.5 or higher could be a solution.
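To compare the system CUDA toolkit version against the CUDA libraries that pip actually installed into the environment, one could list the relevant wheels. This is a minimal sketch using only the Python standard library; the helper name `cuda_wheel_versions` and the prefix list are illustrative assumptions and are not part of cuML or rapids-singlecell.

```python
from importlib import metadata


def cuda_wheel_versions(dists=None):
    """Return {name: version} for installed NVIDIA / RAPIDS-related wheels.

    `dists` is an optional iterable of (name, version) pairs; when omitted,
    the packages installed in the current environment are inspected.
    """
    if dists is None:
        dists = ((d.metadata["Name"], d.version) for d in metadata.distributions())
    # Hypothetical prefix list chosen for this thread; adjust as needed.
    prefixes = ("nvidia-", "cuml", "cupy", "pylibraft", "rmm")
    return {
        name: version
        for name, version in dists
        if name and name.lower().startswith(prefixes)
    }


if __name__ == "__main__":
    for name, version in sorted(cuda_wheel_versions().items()):
        print(f"{name:32s} {version}")
```

Run inside the affected environment, this prints entries such as `nvidia-cusolver-cu12` and `cuml-cu12`, which can then be checked against the toolkit version mentioned above.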
Hello @lowener
rsc.pp.pca(adata, n_comps=50)
sc.pl.pca_variance_ratio(adata, log=True, n_pcs=100)
RuntimeError Traceback (most recent call last)
Cell In[23], line 1
----> 1 rsc.pp.pca(adata, n_comps=50)
2 sc.pl.pca_variance_ratio(adata, log=True, n_pcs=100)
3 adata
File /environment/miniconda3/lib/python3.11/site-packages/rapids_singlecell/preprocessing/_pca.py:174, in pca(***failed resolving arguments***)
167 else:
168 pca_func = PCA(
169 n_components=n_comps,
170 svd_solver=svd_solver,
171 random_state=random_state,
172 output_type="numpy",
173 )
--> 174 X_pca = pca_func.fit_transform(X)
176 elif not zero_center:
177 pca_func = TruncatedSVD(
178 n_components=n_comps,
179 random_state=random_state,
180 algorithm=svd_solver,
181 output_type="numpy",
182 )
File /environment/miniconda3/lib/python3.11/site-packages/cuml/internals/api_decorators.py:188, in _make_decorator_function.<locals>.decorator_function.<locals>.decorator_closure.<locals>.wrapper(*args, **kwargs)
185 set_api_output_dtype(output_dtype)
187 if process_return:
--> 188 ret = func(*args, **kwargs)
189 else:
190 return func(*args, **kwargs)
File /environment/miniconda3/lib/python3.11/site-packages/cuml/internals/api_decorators.py:393, in enable_device_interop.<locals>.dispatch(self, *args, **kwargs)
391 if hasattr(self, "dispatch_func"):
392 func_name = gpu_func.__name__
--> 393 return self.dispatch_func(func_name, gpu_func, *args, **kwargs)
394 else:
395 return gpu_func(self, *args, **kwargs)
File /environment/miniconda3/lib/python3.11/site-packages/cuml/internals/api_decorators.py:190, in _make_decorator_function.<locals>.decorator_function.<locals>.decorator_closure.<locals>.wrapper(*args, **kwargs)
188 ret = func(*args, **kwargs)
189 else:
--> 190 return func(*args, **kwargs)
192 return cm.process_return(ret)
File base.pyx:687, in cuml.internals.base.UniversalBase.dispatch_func()
File pca.pyx:510, in cuml.decomposition.pca.PCA.fit_transform()
File /environment/miniconda3/lib/python3.11/site-packages/cuml/internals/api_decorators.py:188, in _make_decorator_function.<locals>.decorator_function.<locals>.decorator_closure.<locals>.wrapper(*args, **kwargs)
185 set_api_output_dtype(output_dtype)
187 if process_return:
--> 188 ret = func(*args, **kwargs)
189 else:
190 return func(*args, **kwargs)
File /environment/miniconda3/lib/python3.11/site-packages/cuml/internals/api_decorators.py:393, in enable_device_interop.<locals>.dispatch(self, *args, **kwargs)
391 if hasattr(self, "dispatch_func"):
392 func_name = gpu_func.__name__
--> 393 return self.dispatch_func(func_name, gpu_func, *args, **kwargs)
394 else:
395 return gpu_func(self, *args, **kwargs)
File /environment/miniconda3/lib/python3.11/site-packages/cuml/internals/api_decorators.py:190, in _make_decorator_function.<locals>.decorator_function.<locals>.decorator_closure.<locals>.wrapper(*args, **kwargs)
188 ret = func(*args, **kwargs)
189 else:
--> 190 return func(*args, **kwargs)
192 return cm.process_return(ret)
File base.pyx:687, in cuml.internals.base.UniversalBase.dispatch_func()
File pca.pyx:481, in cuml.decomposition.pca.PCA.fit()
RuntimeError: cuSOLVER error encountered at: file=/__w/cuml/cuml/python/cuml/build/cp311-cp311-linux_x86_64/_deps/raft-src/cpp/include/raft/linalg/detail/eig.cuh line=136: call='cusolverDnxsyevd(cusolverH, dn_params, CUSOLVER_EIG_MODE_VECTOR, CUBLAS_FILL_MODE_UPPER, static_cast<int64_t>(n_rows), eig_vectors, static_cast<int64_t>(n_cols), eig_vals, d_work.data(), workspaceDevice, h_work.data(), workspaceHost, d_dev_info.data(), stream_new)', Reason=7:CUSOLVER_STATUS_INTERNAL_ERROR
Obtained 63 stack frames
#1 in /environment/miniconda3/lib/python3.11/site-packages/cuml/internals/../libcuml++.so: raft::cusolver_error::cusolver_error(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) +0x5a [0x7f2f513c258a]
#2 in /environment/miniconda3/lib/python3.11/site-packages/cuml/internals/../libcuml++.so: void raft::linalg::detail::eigDC<double>(raft::resources const&, double const*, unsigned long, unsigned long, double*, double*, CUstream_st*) +0x1259 [0x7f2f51b373b9]
#3 in /environment/miniconda3/lib/python3.11/site-packages/cuml/internals/../libcuml++.so: void ML::truncCompExpVars<double, ML::solver>(raft::handle_t const&, double*, double*, double*, double*, ML::paramsTSVDTemplate<ML::solver> const&, CUstream_st*) +0x739 [0x7f2f51f3e529]
#4 in /environment/miniconda3/lib/python3.11/site-packages/cuml/internals/../libcuml++.so(+0x11c6a7e) [0x7f2f51f2da7e]
#5 in /environment/miniconda3/lib/python3.11/site-packages/cuml/decomposition/pca.cpython-311-x86_64-linux-gnu.so(+0x430fc) [0x7f2f468b40fc]
#6 in /environment/miniconda3/lib/python3.11/site-packages/cuml/internals/base.cpython-311-x86_64-linux-gnu.so(+0x1009e) [0x7f2f4786b09e]
#7 in /environment/miniconda3/lib/python3.11/site-packages/cuml/internals/base.cpython-311-x86_64-linux-gnu.so(+0x1c396) [0x7f2f47877396]
#8 in /environment/miniconda3/bin/python: _PyEval_EvalFrameDefault +0x4869 [0x515419]
#9 in /environment/miniconda3/bin/python() [0x557098]
#10 in /environment/miniconda3/bin/python: _PyEval_EvalFrameDefault +0x4869 [0x515419]
#11 in /environment/miniconda3/bin/python: _PyFunction_Vectorcall +0x173 [0x538903]
#12 in /environment/miniconda3/bin/python: _PyEval_EvalFrameDefault +0x4869 [0x515419]
#13 in /environment/miniconda3/bin/python: _PyFunction_Vectorcall +0x173 [0x538903]
#14 in /environment/miniconda3/lib/python3.11/site-packages/cuml/decomposition/pca.cpython-311-x86_64-linux-gnu.so(+0x40925) [0x7f2f468b1925]
#15 in /environment/miniconda3/lib/python3.11/site-packages/cuml/internals/base.cpython-311-x86_64-linux-gnu.so(+0x1009e) [0x7f2f4786b09e]
#16 in /environment/miniconda3/lib/python3.11/site-packages/cuml/internals/base.cpython-311-x86_64-linux-gnu.so(+0x1c396) [0x7f2f47877396]
#17 in /environment/miniconda3/bin/python: _PyEval_EvalFrameDefault +0x4869 [0x515419]
#18 in /environment/miniconda3/bin/python() [0x557098]
#19 in /environment/miniconda3/bin/python: _PyEval_EvalFrameDefault +0x4869 [0x515419]
#20 in /environment/miniconda3/bin/python: _PyFunction_Vectorcall +0x173 [0x538903]
#21 in /environment/miniconda3/bin/python: _PyEval_EvalFrameDefault +0x4869 [0x515419]
#22 in /environment/miniconda3/bin/python() [0x5cb78a]
#23 in /environment/miniconda3/bin/python: PyEval_EvalCode +0x9f [0x5cae5f]
#24 in /environment/miniconda3/bin/python() [0x5e45e3]
#25 in /environment/miniconda3/bin/python: _PyEval_EvalFrameDefault +0x3738 [0x5142e8]
#26 in /environment/miniconda3/bin/python() [0x5e001a]
#27 in /environment/miniconda3/bin/python: _PyEval_EvalFrameDefault +0x321f [0x513dcf]
#28 in /environment/miniconda3/bin/python() [0x5e001a]
#29 in /environment/miniconda3/bin/python: _PyEval_EvalFrameDefault +0x321f [0x513dcf]
#30 in /environment/miniconda3/bin/python() [0x5e001a]
#31 in /environment/miniconda3/bin/python() [0x5e2656]
#32 in /environment/miniconda3/bin/python: _PyEval_EvalFrameDefault +0x38ba [0x51446a]
#33 in /environment/miniconda3/bin/python() [0x55799f]
#34 in /environment/miniconda3/bin/python() [0x55718e]
#35 in /environment/miniconda3/bin/python: PyObject_Call +0x12c [0x54288c]
#36 in /environment/miniconda3/bin/python: _PyEval_EvalFrameDefault +0x4869 [0x515419]
#37 in /environment/miniconda3/bin/python() [0x5e001a]
#38 in /environment/miniconda3/bin/python: _PyEval_EvalFrameDefault +0x321f [0x513dcf]
#39 in /environment/miniconda3/bin/python() [0x5e001a]
#40 in /environment/miniconda3/bin/python: _PyEval_EvalFrameDefault +0x321f [0x513dcf]
#41 in /environment/miniconda3/bin/python() [0x5e001a]
#42 in /environment/miniconda3/bin/python: _PyEval_EvalFrameDefault +0x321f [0x513dcf]
#43 in /environment/miniconda3/bin/python() [0x5e001a]
#44 in /environment/miniconda3/bin/python: _PyEval_EvalFrameDefault +0x321f [0x513dcf]
#45 in /environment/miniconda3/bin/python() [0x5e001a]
#46 in /environment/miniconda3/bin/python: _PyEval_EvalFrameDefault +0x321f [0x513dcf]
#47 in /environment/miniconda3/bin/python() [0x5e001a]
#48 in /environment/miniconda3/lib/python3.11/lib-dynload/_asyncio.cpython-311-x86_64-linux-gnu.so(+0x79fb) [0x7f32558ed9fb]
#49 in /environment/miniconda3/bin/python() [0x52657b]
#50 in /environment/miniconda3/bin/python() [0x4c6caf]
#51 in /environment/miniconda3/bin/python() [0x4cbd10]
#52 in /environment/miniconda3/bin/python() [0x51e3d7]
#53 in /environment/miniconda3/bin/python: _PyEval_EvalFrameDefault +0x928f [0x519e3f]
#54 in /environment/miniconda3/bin/python() [0x5cb78a]
#55 in /environment/miniconda3/bin/python: PyEval_EvalCode +0x9f [0x5cae5f]
#56 in /environment/miniconda3/bin/python() [0x5e45e3]
#57 in /environment/miniconda3/bin/python() [0x51e3d7]
#58 in /environment/miniconda3/bin/python: PyObject_Vectorcall +0x31 [0x51e2c1]
#59 in /environment/miniconda3/bin/python: _PyEval_EvalFrameDefault +0x6a6 [0x511256]
#60 in /environment/miniconda3/bin/python: _PyFunction_Vectorcall +0x173 [0x538903]
#61 in /environment/miniconda3/bin/python() [0x5f6c2f]
#62 in /environment/miniconda3/bin/python: Py_RunMain +0x14a [0x5f663a]
#63 in /environment/miniconda3/bin/python: Py_BytesMain +0x39 [0x5bb5c9]
CUDA is 12.6
Package Version
---------------------------- --------------
absl-py 2.1.0
aiohttp 3.7.4
anaconda-anon-usage 0.4.4
anndata 0.11.1
anyio 3.7.1
archspec 0.2.3
argon2-cffi 23.1.0
argon2-cffi-bindings 21.2.0
array_api_compat 1.9.1
arrow 1.3.0
asttokens 2.4.1
astunparse 1.6.3
async-lru 2.0.4
async-timeout 3.0.1
attrs 23.2.0
Babel 2.14.0
beautifulsoup4 4.12.3
bleach 6.1.0
boltons 23.0.0
Brotli 1.0.9
cachetools 5.5.0
certifi 2024.2.2
cffi 1.16.0
chardet 3.0.4
charset-normalizer 2.0.4
click 8.1.7
cloudpickle 3.1.0
comm 0.2.2
conda 24.3.0
conda-content-trust 0.2.0
conda-libmamba-solver 24.1.0
conda-package-handling 2.2.0
conda_package_streaming 0.9.0
contourpy 1.2.1
cryptography 42.0.5
cuda-python 12.6.2.post1
cudf-cu12 24.10.1
cugraph-cu12 24.10.0
cuml-cu12 24.10.0
cupy-cuda12x 13.3.0
cuvs-cu12 24.10.0
cycler 0.12.1
dask 2024.9.0
dask-cuda 24.10.0
dask-cudf-cu12 24.10.1
dask-expr 1.1.14
debugpy 1.8.1
decorator 5.1.1
defusedxml 0.7.1
distributed 2024.9.0
distributed-ucxx-cu12 0.40.0
distro 1.8.0
ecdsa 0.19.0
executing 2.0.1
fastjsonschema 2.19.1
fastrlock 0.8.3
filelock 3.13.4
flatbuffers 24.3.25
fonttools 4.53.0
fqdn 1.5.1
fsspec 2024.3.1
gast 0.5.4
google-pasta 0.2.0
grpcio 1.62.2
h11 0.14.0
h5py 3.11.0
httpcore 1.0.5
httpx 0.27.0
idna 3.4
igraph 0.11.8
imageio 2.36.1
importlib_metadata 8.5.0
ipykernel 6.29.4
ipython 8.23.0
ipython-genutils 0.2.0
isoduration 20.11.0
jedi 0.19.1
Jinja2 3.1.3
joblib 1.4.2
json5 0.9.25
jsonpatch 1.33
jsonpointer 2.1
jsonschema 4.21.1
jsonschema-specifications 2023.12.1
jupyter_client 8.6.1
jupyter_core 5.7.2
jupyter-events 0.10.0
jupyter-lsp 2.2.5
jupyter_server 2.14.0
jupyter_server_terminals 0.5.3
jupyterlab 4.2.0
jupyterlab_pygments 0.3.0
jupyterlab_server 2.27.1
keras 3.3.2
kiwisolver 1.4.5
lazy_loader 0.4
legacy-api-wrap 1.4.1
leidenalg 0.10.2
libclang 18.1.1
libcudf-cu12 24.10.1
libmambapy 1.5.8
libucx-cu12 1.17.0.post1
libucxx-cu12 0.40.0
llvmlite 0.43.0
locket 1.0.0
Markdown 3.6
markdown-it-py 3.0.0
MarkupSafe 2.1.5
matplotlib 3.9.0
matplotlib-inline 0.1.7
mdurl 0.1.2
menuinst 2.0.2
mistune 3.0.2
ml-dtypes 0.3.2
mpmath 1.3.0
msgpack 1.1.0
multidict 6.0.5
namex 0.0.8
natsort 8.4.0
nbclassic 0.2.8
nbclient 0.10.0
nbconvert 7.16.3
nbformat 5.10.4
nest-asyncio 1.6.0
networkx 3.3
notebook 6.4.13
notebook_shim 0.2.4
numba 0.60.0
numpy 1.26.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.19.3
nvidia-nvjitlink-cu12 12.4.127
nvidia-nvtx-cu12 12.1.105
nvtx 0.2.10
opt-einsum 3.3.0
optree 0.11.0
overrides 7.7.0
packaging 23.2
pandas 2.2.2
pandocfilters 1.5.1
parso 0.8.4
partd 1.4.2
patsy 1.0.1
pexpect 4.9.0
pillow 10.3.0
pip 23.3.1
platformdirs 3.10.0
pluggy 1.0.0
prometheus_client 0.20.0
prompt-toolkit 3.0.43
protobuf 4.25.3
psutil 5.9.8
ptyprocess 0.7.0
pure-eval 0.2.2
pyarrow 17.0.0
pycosat 0.6.6
pycparser 2.21
Pygments 2.17.2
pylibcudf-cu12 24.10.1
pylibcugraph-cu12 24.10.0
pylibraft-cu12 24.10.0
pynndescent 0.5.13
pynvjitlink-cu12 0.4.0
pynvml 11.4.1
pyparsing 3.1.2
PySocks 1.7.1
python-dateutil 2.9.0.post0
python-json-logger 2.0.7
pytz 2024.1
PyYAML 6.0.1
pyzmq 26.0.2
raft-dask-cu12 24.10.0
rapids-dask-dependency 24.10.0
rapids_singlecell 0.10.11
referencing 0.35.0
requests 2.31.0
rfc3339-validator 0.1.4
rfc3986-validator 0.1.1
rich 13.7.1
rmm-cu12 24.10.0
rpds-py 0.18.0
ruamel.yaml 0.17.21
scanpy 1.10.4
scikit-image 0.25.0
scikit-learn 1.6.0
scikit-misc 0.5.1
scipy 1.14.1
seaborn 0.13.2
Send2Trash 1.8.3
session-info 1.0.0
setuptools 68.2.2
six 1.16.0
sniffio 1.3.1
sortedcontainers 2.4.0
soupsieve 2.5
sshpubkeys 3.3.1
stack-data 0.6.3
statsmodels 0.14.4
stdlib-list 0.11.0
sympy 1.12
tblib 3.0.0
tensorboard 2.16.2
tensorboard-data-server 0.7.2
tensorflow 2.16.1
tensorflow-io-gcs-filesystem 0.36.0
termcolor 2.4.0
terminado 0.18.1
texttable 1.7.0
threadpoolctl 3.5.0
tifffile 2024.12.12
tinycss2 1.3.0
toolz 1.0.0
torch 2.2.2
torchaudio 2.2.2
torchvision 0.17.2
tornado 6.4
tqdm 4.65.0
traitlets 5.14.3
treelite 4.3.0
triton 2.2.0
truststore 0.8.0
types-python-dateutil 2.9.0.20240316
typing_extensions 4.11.0
tzdata 2024.1
ucx-py-cu12 0.40.0
ucxx-cu12 0.40.0
umap-learn 0.5.7
uri-template 1.3.0
urllib3 2.1.0
wcwidth 0.2.13
webcolors 1.13
webencodings 0.5.1
websocket-client 1.8.0
Werkzeug 3.0.2
wheel 0.41.2
wrapt 1.16.0
yarl 1.9.4
zict 3.0.0
zipp 3.21.0
zstandard 0.19.0
BTW
if isinstance(adata.X, cp.ndarray):
    print("Checking for NaN or Inf values in adata.X...")
    print(cp.any(cp.isnan(adata.X)))
    print(cp.any(cp.isinf(adata.X)))
else:
    print("Checking for NaN or Inf values in adata.X...")
    print(np.any(np.isnan(adata.X.toarray())))
    print(np.any(np.isinf(adata.X.toarray())))
Checking for NaN or Inf values in adata.X...
False
False
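The frames `ML::truncCompExpVars` and `raft::linalg::detail::eigDC<double>` in the traceback above, together with the failing `cusolverDnxsyevd` call, suggest that the error is raised during a symmetric eigendecomposition of the covariance matrix rather than while reading the cells × genes matrix itself. The following CPU-only NumPy sketch illustrates that general technique (PCA via eigendecomposition of the covariance matrix). It is an illustration under that assumption, not cuML's actual implementation, and the function name `pca_via_covariance` is hypothetical.

```python
import numpy as np


def pca_via_covariance(X, n_comps):
    """PCA by eigendecomposition of the (n_vars x n_vars) covariance matrix."""
    X = np.asarray(X, dtype=np.float64)
    Xc = X - X.mean(axis=0)                    # zero-center each feature
    cov = (Xc.T @ Xc) / (X.shape[0] - 1)       # symmetric covariance matrix
    eig_vals, eig_vecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eig_vals)[::-1][:n_comps]
    components = eig_vecs[:, order].T          # shape (n_comps, n_vars)
    explained_variance = eig_vals[order]
    scores = Xc @ components.T                 # shape (n_obs, n_comps)
    return scores, components, explained_variance


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))
    scores, components, variance = pca_via_covariance(X, n_comps=3)
    print(scores.shape, components.shape, variance.shape)
```

With 5000 highly variable genes, the matrix handed to the eigensolver in this formulation would be 5000 × 5000, independent of the number of cells.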
Describe the bug
Hello Rapids,
Thank you for developing this amazing pipeline.
I met `cuSOLVER error encountered` when running `rapids-singlecell`. Please see below or scverse/rapids_singlecell#307. The developer of `rapids-singlecell` told me it was an issue with `cuml`. Could you please help me with this issue?
Thank you!
Best,
YJ
Steps/Code to reproduce bug
Expected behavior
Run smoothly.
Environment details (please complete the following information):