Handle reading empty H5AD slots #86

Closed
lazappi opened this issue May 22, 2023 · 6 comments
Labels
bug Something isn't working
Milestone

Comments

@lazappi
Collaborator

lazappi commented May 22, 2023

Reading currently fails if a slot is not set in the .h5ad file

@lazappi added the bug label on May 22, 2023
@rcannood
Collaborator

rcannood commented Jul 5, 2023

Can you post a minimal reproducible example? ;)

@LouiseDck
Collaborator

e.g. when reading in a file with only a .X:

Error in private$.validate_layers(value) : 
  'layers' must must be a named list
In addition: Warning messages:
1: In private$.validate_obsvar_dataframe(value, "obs") :
  'obs' should not have any rownames, removing them from the data frame.
2: In private$.validate_obsvar_dataframe(value, "var") :
  'var' should not have any rownames, removing them from the data frame.
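
The error message suggests that the validator received something other than a named list when the file had no layers. As an illustration only, here is a minimal Python sketch of a validator that treats a missing or empty value as "no layers" instead of failing. The function name `normalize_layers` is hypothetical and is not part of anndataR or anndata.

```python
from collections.abc import Mapping


def normalize_layers(value):
    """Return a dict of layers, treating a missing or empty value as 'no layers'.

    Hypothetical helper for illustration; not taken from anndataR.
    """
    # A slot that was never written may arrive as None.
    if value is None:
        return {}
    # A mapping is the "named" case: check the names, then copy it.
    if isinstance(value, Mapping):
        bad = [k for k in value if not isinstance(k, str) or k == ""]
        if bad:
            raise ValueError(f"'layers' names must be non-empty strings, got {bad!r}")
        return dict(value)
    # An empty unnamed sequence carries no layers, so accept it as empty.
    if isinstance(value, (list, tuple)) and len(value) == 0:
        return {}
    # Anything else is unnamed data that cannot be matched to layer names.
    raise TypeError("'layers' must be a named collection (name -> matrix)")
```

With this shape, a file that only contains `.X` yields an empty set of layers, while genuinely malformed input still raises.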

@rcannood added this to the 1.0.0 milestone on Sep 20, 2023
@kollo97

kollo97 commented Nov 5, 2024

Hi! First of all, thanks for this great tool.
I saw that this bug fix is part of your milestone for version 1.0.0, and I just wanted to say that I'd also appreciate a solution in the long run!
Here is a minimal reproducible example.
In Python:

# create test anndata without var and obs
import anndata as ad
from scipy.sparse import csr_matrix
import numpy as np

counts = csr_matrix(np.random.poisson(1, size=(100, 2000)), dtype=np.float32)
adata = ad.AnnData(counts)
print(adata.var.head())
adata.write_h5ad("no_var_no_obs.h5ad")


# create test anndata with var and obs columns (quick fix: copy the indices into columns)
counts = csr_matrix(np.random.poisson(1, size=(100, 2000)), dtype=np.float32)
adata = ad.AnnData(counts)
adata.var["gene"] = adata.var.index
adata.obs["cell"] = adata.obs.index
print(adata.var.head())
adata.write_h5ad("var_obs.h5ad")

Then in R:

library(anndataR)
library(Seurat)
se1 <- read_h5ad("no_var_no_obs.h5ad", to = "Seurat")  # fails
# Error in if (ncol(obj$obs) > 0) { : argument is of length zero
# In addition: Warning messages:
# 1: In value[[3L]](cond) : Error reading element 'var' of type 'dataframe':
# HDF5-API Errors:
#     error #000: H5A.c in H5Aread(): line 1043: can't synchronously read data
#         class: HDF5
#         major: Attribute
#         minor: Read failed

#     error #001: H5A.c in H5A__read_api_common(): line 1003: buf parameter can't be NULL
#         class: HDF5
#         major: Invalid arguments to routine
#         minor: Bad value

# 2: In value[[3L]](cond) : Error reading element 'obs' of type 'dataframe':
# HDF5-API Errors:
#     error #000: H5A.c in H5Aread(): line 1043: can't synchronously read data
#         class: HDF5
#         major: Attribute
#         minor: Read failed

#     error #001: H5A.c in H5A__read_api_common(): line 1003: buf parameter can't be NULL
#         class: HDF5
#         major: Invalid arguments to routine
#         minor: Bad value

# 3: In value[[3L]](cond) : Error reading element 'var' of type 'dataframe':
# HDF5-API Errors:
#     error #000: H5A.c in H5Aread(): line 1043: can't synchronously read data
#         class: HDF5
#         major: Attribute
#         minor: Read failed

#     error #001: H5A.c in H5A__read_api_common(): line 1003: buf parameter can't be NULL
#         class: HDF5
#         major: Invalid arguments to routine
#         minor: Bad value

# 4: In value[[3L]](cond) : Error reading element 'obs' of type 'dataframe':
# HDF5-API Errors:
#     error #000: H5A.c in H5Aread(): line 1043: can't synchronously read data
#         class: HDF5
#         major: Attribute
#         minor: Read failed

#     error #001: H5A.c in H5A__read_api_common(): line 1003: buf parameter can't be NULL
#         class: HDF5
#         major: Invalid arguments to routine
#         minor: Bad value

se2 <- read_h5ad("var_obs.h5ad", to = "Seurat")  # works
se3 <- read_h5ad("no_var_no_obs.h5ad")  # works but throws warning messages 
# Warning messages:
# 1: In value[[3L]](cond) : Error reading element 'obs' of type 'dataframe':
# HDF5-API Errors:
#     error #000: H5A.c in H5Aread(): line 1043: can't synchronously read data
#         class: HDF5
#         major: Attribute
#         minor: Read failed

#     error #001: H5A.c in H5A__read_api_common(): line 1003: buf parameter can't be NULL
#         class: HDF5
#         major: Invalid arguments to routine
#         minor: Bad value

# 2: In value[[3L]](cond) : Error reading element 'var' of type 'dataframe':
# HDF5-API Errors:
#     error #000: H5A.c in H5Aread(): line 1043: can't synchronously read data
#         class: HDF5
#         major: Attribute
#         minor: Read failed

#     error #001: H5A.c in H5A__read_api_common(): line 1003: buf parameter can't be NULL
#         class: HDF5
#         major: Invalid arguments to routine
#         minor: Bad value

se3 <- se3$to_Seurat()  # works

As mentioned in the Python code, a quick fix is to add the var and obs indices as columns.

@rcannood
Collaborator

rcannood commented Nov 5, 2024

@lazappi @LouiseDck The original issue still persists:

# Python
import anndata as ad
ad.AnnData().write_h5ad("nothing.h5ad")

# R
library(anndataR)
read_h5ad("nothing.h5ad")

# Error in `[[.H5File`(file, name) : 
#  An object with name X does not exist in this group

I'll try to resolve this issue asap, since it will likely be relatively easy to solve.
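
The error above comes from looking up `X` in a file that does not contain it. As a rough illustration of the general pattern (not the actual anndataR fix), a reader can check whether each slot exists before reading it and fall back to an empty default otherwise. The names `SLOT_DEFAULTS`, `read_slot` and `read_all_slots` are hypothetical, the slot list is deliberately partial, and a plain dict stands in for the open file.

```python
# Hypothetical defaults for a few AnnData slots; factories avoid sharing
# one mutable default between calls.
SLOT_DEFAULTS = {
    "X": lambda: None,
    "layers": dict,
    "obsm": dict,
    "varm": dict,
    "uns": dict,
}


def read_slot(group, name, default_factory):
    """Return group[name] if the slot exists, else a fresh default."""
    # Membership is checked first, so a missing slot never triggers a lookup error.
    if name in group:
        return group[name]
    return default_factory()


def read_all_slots(group):
    """Read every known slot from a mapping-like group, tolerating absent ones."""
    return {
        name: read_slot(group, name, factory)
        for name, factory in SLOT_DEFAULTS.items()
    }


# A file with nothing in it, represented here by an empty dict.
empty = read_all_slots({})
# A file that only has X.
only_x = read_all_slots({"X": [[1.0, 0.0], [0.0, 2.0]]})
```

The design choice is to decide the default per slot in one place, so the rest of the reader never has to distinguish "absent" from "present but empty".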

@kollo97 Thanks for reporting this issue. However, it seems to be unrelated to the original issue. Which versions of R and of the dependencies do you have installed? I believe you are likely using an older version of hdf5r, since the code you provided works on my end:

library(anndataR)
se1 <- read_h5ad("no_var_no_obs.h5ad", to = "Seurat")  # works
# Loading required namespace: SeuratObject
se2 <- read_h5ad("var_obs.h5ad", to = "Seurat")  # works
se3 <- read_h5ad("no_var_no_obs.h5ad") # works
se3 <- se3$to_Seurat()  # works

If the problem still persists, please create a separate issue.

@kollo97

kollo97 commented Nov 12, 2024

Hi @rcannood, thanks so much for looking into this so quickly!
Sorry to have bothered you, the problem was on my end.
I tried the code in a fresh environment and it does indeed work without throwing any HDF5-API warnings. It turns out I only get these warnings when working in a tmux session, so I must have some misconfiguration there.
Thanks again 🙌

@rcannood
Collaborator

@kollo97 Fantastic 👍

@lazappi @LouiseDck In the meantime, the original issue has also been resolved:

# Python
import anndata as ad
ad.AnnData().write_h5ad("nothing.h5ad")

# R
library(anndataR)
read_h5ad("nothing.h5ad")

# AnnData object with n_obs × n_vars = 0 × 0
