highly_variable_genes - issue #391
Comments
It looks like your adata object is corrupted. You should be able to type
`adata.X` to get the matrix.
How are you generating the adata object?
…On Thu, Dec 6, 2018 at 5:56 PM ltosti ***@***.***> wrote:
Hi there,
When running sc.pp.highly_variable_genes(adata.X) I get the following
error:
AttributeError: X not found
I then ran sc.pp.highly_variable_genes(adata) and got the following:
ValueError: Bin edges must be unique: array([nan, inf, inf, inf, inf, inf,
inf, inf, inf, inf, inf, inf, inf,inf, inf, inf, inf, inf, inf, inf, inf]).
You can drop duplicate edges by setting the duplicates kwarg
The older sc.pp.filter_genes_dispersion(adata.X) works fine.
Do you know how to fix this?
Thank you!
*Info*: scanpy==1.3.4 anndata==0.6.13 numpy==1.15.3 scipy==1.1.0
pandas==0.23.4 scikit-learn==0.20.0 statsmodels==0.9.0 python-igraph==0.7.1
louvain==0.6.1
|
When I run adata.X I get That looks fine? |
Hi,
Note that highly_variable_genes expects logarithmized data. |
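One possible reason the log transform matters is sketched below. This assumes that the default flavor undoes the log with `expm1` before computing per-gene means, which is an assumption about scanpy internals and is not confirmed in this thread. Under that assumption, raw counts overflow to `inf`, and `inf` means would then produce non-unique bin edges. The count values are made up for illustration.

```python
import numpy as np

# Hypothetical expression values for three genes in one cell.
raw_counts = np.array([0.0, 5.0, 1200.0])   # not log-transformed
logged = np.log1p(raw_counts)               # what the function expects

# Undoing a log that was never applied overflows for large counts.
with np.errstate(over="ignore"):
    undone_raw = np.expm1(raw_counts)

# Undoing a log that was applied recovers the original counts.
undone_logged = np.expm1(logged)

has_inf_raw = bool(np.isinf(undone_raw).any())
has_inf_logged = bool(np.isinf(undone_logged).any())
print("inf after expm1 on raw counts:", has_inf_raw)
print("inf after expm1 on logged data:", has_inf_logged)
```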
Hm, very hard to say anything without looking at the dataset. Any negative values in the dataset?
Does it show true? |
When I run this on the single sample I get When I run this on the merged (batch-removed) sample I get:
|
I mean on non-normalized dataset, which is a sparse matrix. |
On non-normalized dataset I get |
The initial problem is due to the fact that the new 'highly_variable_genes' function does not take numpy arrays anymore: https://github.com/theislab/scanpy/blob/master/scanpy/preprocessing/highly_variable_genes.py It's also mentioned in the docs, but we should, of course, have thrown a clear error message. Now it does: a578ced To return the annotation, one can set |
I am experiencing a similar issue with a dataset I am using. This runs fine:
But this:
Throws the following error:
I am assuming there is something wrong with the dataset (it's a publicly available one that I needed to convert from a Seurat object), but I can't figure out what. I have checked whether adata.X or adata.raw.X contain any Inf values, but they do not. Both adata.X and adata.raw.X are sparse matrices. Any ideas would be greatly appreciated. |
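The checks mentioned in this thread (Inf values, negative values) can be run in one pass. This is a minimal sketch, not a scanpy function: `check_values` is a hypothetical helper name, and it accepts either a dense array or a scipy sparse matrix such as `adata.X`.

```python
import numpy as np
from scipy import sparse

def check_values(X):
    """Flag NaN, Inf, and negative entries in a dense or sparse matrix."""
    if sparse.issparse(X):
        # Only stored entries need checking; implicit zeros are finite.
        values = sparse.csr_matrix(X).data
    else:
        values = np.asarray(X).ravel()
    return {
        "is_sparse": bool(sparse.issparse(X)),
        "has_nan": bool(np.isnan(values).any()),
        "has_inf": bool(np.isinf(values).any()),
        "has_negative": bool((values < 0).any()),
    }

# Toy matrix standing in for adata.X (cells x genes).
toy = np.array([[0.0, 1.0], [np.inf, 2.0]])
print(check_values(toy))
```

A negative value would suggest the matrix holds scaled or batch-corrected data rather than counts or log-counts.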
Hi! |
I am experiencing the same problem, and it also comes from a Seurat object that I converted to anndata with SeuratDisk. |
I am also getting the error Update: I just noticed that my adata.X contains a numpy array instead of a sparse matrix. Perhaps that's the issue? Will try updating to a sparse matrix and will report back |
FIXED: Updating adata.X to a scipy csr sparse matrix using I still get |
Thanks for your update @rpeys, I will try to convert to scipy csr sparse matrix :) |
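For anyone trying the same conversion, here is a minimal sketch of turning a dense array into a scipy CSR matrix. The exact call used in the comment above is not shown in this thread, so this is not necessarily what was run. `to_csr` is a hypothetical helper name.

```python
import numpy as np
from scipy import sparse

def to_csr(X):
    """Return X as a scipy CSR matrix, converting dense input if needed."""
    if sparse.issparse(X):
        return X.tocsr()
    return sparse.csr_matrix(np.asarray(X))

# Toy dense matrix standing in for adata.X (cells x genes).
dense = np.array([[0.0, 2.0, 0.0], [1.0, 0.0, 0.0]])
X_csr = to_csr(dense)
print(type(X_csr).__name__, X_csr.shape, X_csr.nnz)

# With an AnnData object one would then assign it back, e.g.:
# adata.X = to_csr(adata.X)
```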
I have an AnnData object whose .X matrix has been transformed by size factor division, +1 and log. Subsequent Edit: However! While I could not get |
For me this was solved by filtering out genes that were not expressed in any cell! |
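The filtering step described above can be sketched on a plain matrix as follows. `drop_unexpressed_genes` is a hypothetical helper written for illustration; on an AnnData object, `sc.pp.filter_genes(adata, min_cells=1)` should have the same effect.

```python
import numpy as np
from scipy import sparse

def drop_unexpressed_genes(X):
    """Remove gene columns that are zero in every cell.

    Returns the filtered matrix and the boolean mask of kept genes.
    """
    if sparse.issparse(X):
        Xc = sparse.csc_matrix(X, copy=True)
        Xc.eliminate_zeros()                  # drop explicitly stored zeros
        cells_per_gene = np.diff(Xc.indptr)   # nonzero entries per column
        keep = cells_per_gene > 0
        return X.tocsr()[:, keep], keep
    X = np.asarray(X)
    keep = np.count_nonzero(X, axis=0) > 0
    return X[:, keep], keep

# Toy matrix (cells x genes); the middle gene is never expressed.
toy = np.array([[1.0, 0.0, 3.0],
                [0.0, 0.0, 2.0],
                [4.0, 0.0, 0.0]])
filtered, kept = drop_unexpressed_genes(toy)
print(filtered.shape, kept)
```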
@LisaSikkema, could you please open a new issue for that? It'd be helpful if you could include a reproducible example as well. |
This one works! thanks!! |
Hello, Massonix, was the problem resolved? |
hi Rebecca, I have been trying to process scRNA (converted seurat to h5ad format) in python (processing like QC, normalisation, scaling, high variables, clustering etc) and have been getting stuck at the highly variable genes. Can you please help me out with it? |
Hi there,
While running `sc.pp.highly_variable_genes(adata.X)` I got the following error:
AttributeError: X not found
I then ran `sc.pp.highly_variable_genes(adata)` and got the following:
ValueError: Bin edges must be unique: array([nan, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf,inf, inf, inf, inf, inf, inf, inf, inf]). You can drop duplicate edges by setting the duplicates kwarg
The older `sc.pp.filter_genes_dispersion(adata.X)` works fine.
Do you know how to fix this?
Thank you!
Info: scanpy==1.3.4 anndata==0.6.13 numpy==1.15.3 scipy==1.1.0 pandas==0.23.4 scikit-learn==0.20.0 statsmodels==0.9.0 python-igraph==0.7.1 louvain==0.6.1