SingleCellExperiment/Seurat should contain dgCMatrix instead of dgRMatrix #202

Open
pakiessling opened this issue Nov 21, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@pakiessling

Similar to the discussion in theislab/zellkonverter#34

A lot of tools like https://github.com/immunogenomics/presto/ will try to call functions from the Matrix package that error when encountering a dgRMatrix.

@Artur-man

Looking at my code, I hardly ever use dgRMatrix; most of my converted matrices are dgCMatrix.

@lazappi
Collaborator

lazappi commented Nov 22, 2024

Can you share an example of your use case? I agree that dgRMatrix should mostly be avoided, but there is a difference here: AnnData is transposed (cells x genes) compared to SingleCellExperiment or Seurat (genes x cells), so which format is more "natural" differs between them.

@pakiessling
Author

I use anndataR to integrate R methods into my mostly Python-based workflow.

I routinely read data from on-disk .h5ad files into Seurat / SingleCellExperiment and then apply tools such as presto.

These tools (I suspect all tools calling common {Matrix} functions) then crash with an error message about the matrix having the wrong format.

I would expect a Seurat / SingleCellExperiment object generated by anndataR to be compatible out of the box with functions that take Seurat / SingleCellExperiment objects as input.

@lazappi
Collaborator

lazappi commented Nov 25, 2024

Can you please give a code example? If you are using {anndataR} functions to convert to Seurat/SingleCellExperiment, then I agree that they should return a dgCMatrix.

@pakiessling
Author

Yes, here is what happened:

library(anndataR)
library(presto)

sce = read_h5ad("adata.h5ad", to = "SingleCellExperiment")
wilcoxauc(sce, 'cell_type')
# > Error message about incorrect matrix format.

I think that this should "just work".
I am less sure how to handle things when staying inside an AnnData object, but isn't the main draw of anndataR being able to use R packages on AnnData objects? Won't a lot of them always fail on dgRMatrix input?

@lazappi
Collaborator

lazappi commented Nov 26, 2024

Thanks for the extra detail. I will update the issue title to make it a bit clearer. @rcannood @LouiseDck this is pretty important for anyone who wants to read a .h5ad into a SingleCellExperiment/Seurat. As part of the conversion, when matrices are transposed, they also need to be converted from dgRMatrix to dgCMatrix.
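The transpose-and-convert step described above can be sketched with {Matrix} alone. This is only an illustration under stated assumptions, not the {anndataR} implementation: the toy matrix, its dimnames, and the variable names `x_cells_genes` and `x_genes_cells` are hypothetical stand-ins for a matrix read from an .h5ad file.

```r
library(Matrix)

# Hypothetical toy counts in AnnData orientation (cells x genes), stored
# row-compressed to stand in for a sparse matrix read from an .h5ad file.
x_cells_genes <- sparseMatrix(
  i = c(1, 1, 2, 3),
  j = c(1, 3, 2, 4),
  x = c(5, 1, 2, 7),
  dims = c(3, 4),
  dimnames = list(paste0("cell", 1:3), paste0("gene", 1:4)),
  repr = "R"
)

# Transpose to genes x cells, then coerce to column-compressed storage.
x_genes_cells <- as(t(x_cells_genes), "CsparseMatrix")

# Check the storage classes, the swapped dimensions, and the values.
stopifnot(
  is(x_cells_genes, "dgRMatrix"),
  is(x_genes_cells, "dgCMatrix"),
  identical(dim(x_genes_cells), c(4L, 3L)),
  all(as.matrix(x_genes_cells) == t(as.matrix(x_cells_genes)))
)
```

Until the conversion does this itself, the same coercion could presumably be applied by hand to an already converted object, for example `assay(sce, "counts") <- as(assay(sce, "counts"), "CsparseMatrix")`, assuming the object has an assay named "counts".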

@lazappi lazappi added the bug Something isn't working label Nov 26, 2024
@lazappi lazappi changed the title Consider reading Sparse .h5ad to dgCMatrix instead of dgRMatrix SingleCellExperiment/Seurat should contain dgCMatrix instead of dgRMatrix Nov 26, 2024
@LouiseDck
Collaborator

I have been following the discussion here, and I read the issue on {zellkonverter}. I am convinced that the conversion should produce dgCMatrices.

I am not sure whether they should be dgCMatrices in the AnnData representation: if you want to interact with the matrices in the AnnData object, you might want to run more specific methods where a row-major format makes more sense. On the other hand, if you want to run those specific methods, you are probably more aware of the differences and can easily convert the matrix yourself.

Do you have any thoughts, @rcannood?


4 participants