imgData slot restrictions #120

Open
lgeistlinger opened this issue Oct 26, 2022 · 5 comments

Comments

@lgeistlinger

lgeistlinger commented Oct 26, 2022

Hi, I was wondering why the imgData slot of a SpatialExperiment is restricted to the columns

‘sample_id’, ‘image_id’, ‘data’, ‘scaleFactor’

I can imagine many scenarios where one might want to annotate the images with additional metadata, such as the number of frames, the number of channels, the staining used, etc. (I would also say that scaleFactor is not necessarily something one would always be interested in annotating.)

> imgData(spe)
DataFrame with 2 rows and 4 columns
    sample_id    image_id   data scaleFactor
  <character> <character> <list>   <numeric>
1       ileum        dapi   ####          NA
2       ileum    membrane   ####          NA
> imgData(spe)$nr.frames <- 7
Error in `imgData<-`(`*tmp*`, value = new("DFrame", rownames = NULL, nrows = 2L,  : 
  'imgData' field in 'int_metadata' should have columns: ‘sample_id’, ‘image_id’, ‘data’, ‘scaleFactor’
@drighelli
Owner

Hi Ludwig @lgeistlinger,

When we designed this class, we only had 10x Visium data in mind, which didn't need any additional columns.

Of course, there are now plenty of other technologies and datasets with which to test and extend our class.

Can you provide a dataset to play with, so we can better understand which columns would be useful to add?

@lgeistlinger
Author

lgeistlinger commented Nov 8, 2022

Hi @drighelli - here are a couple of example datasets from technologies other than 10x Visium:

spatial transcript profiling

spatial protein profiling

But why restrict to specific columns at all, rather than allowing arbitrary metadata columns for the images (as with the colData, rowData, and metadata of a SummarizedExperiment)? Alternatively, decide on a set of core annotation columns that always need to be there, but allow arbitrary additional metadata columns on top of that (as with a GRanges, which needs to have a chromosome, start and end position, and strand, but then allows arbitrary metadata columns on top). I think that would make for a flexible and extensible design, as opposed to locking the slot to an exclusive set of image annotations, since we don't know what a user might want to annotate in the future.
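To illustrate the "core columns plus extras" idea, here is a minimal sketch of such a check. It is not the actual SpatialExperiment validity code: `check_img_data` is a hypothetical helper, a plain `data.frame` stands in for the imgData DataFrame, and the strings in `data` are placeholders for the image objects.

```r
# Core columns taken from the error message in the first comment.
core_cols <- c("sample_id", "image_id", "data", "scaleFactor")

# Hypothetical check: fail only if a core column is missing,
# and tolerate any additional columns.
check_img_data <- function(df) {
  missing_cols <- setdiff(core_cols, colnames(df))
  if (length(missing_cols) > 0) {
    stop("'imgData' is missing required columns: ",
         paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Plain data.frame standing in for the imgData shown above.
img <- data.frame(
  sample_id   = c("ileum", "ileum"),
  image_id    = c("dapi", "membrane"),
  scaleFactor = c(NA_real_, NA_real_),
  stringsAsFactors = FALSE
)
img$data <- list("image-1", "image-2")  # placeholders for the image objects

img$nr.frames <- 7        # the extra column attempted in the first comment
ok <- check_img_data(img) # passes: all core columns are still present
```

With a check like this, dropping a core column is still an error, while adding `nr.frames` is not.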

@drighelli
Owner

I'm sorry @lgeistlinger, maybe I'm not getting what you mean...

The imgData is designed to store images at the moment, so we designed it for storing images and metadata associated with them.

Of course, I get the idea of making the DataFrame more flexible, but what kind of data, other than images, are you thinking of storing in the imgData?

If you're thinking about seqFISH and MERFISH processed raw data, you already have the BumpyMatrix accessible through the molecules accessor to store that kind of data, and of course you can use rowRanges instead of rowData to store GRanges-like information.

The only thing that comes to mind is another column named imgMetadata (or something similar) where you can store another DataFrame/list with additional information, to keep it flexible.
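As a rough sketch of what such a column could look like (this is only the proposed layout, not something the current class accepts): a plain `data.frame` stands in for the imgData DataFrame, `imgMetadata` is the column name floated above, and the annotation values are illustrative.

```r
# Plain data.frame standing in for imgData; the image column is omitted here.
img <- data.frame(
  sample_id   = c("ileum", "ileum"),
  image_id    = c("dapi", "membrane"),
  scaleFactor = c(NA_real_, NA_real_),
  stringsAsFactors = FALSE
)

# One list element per image; each element is a free-form named list,
# so different images can carry different annotations.
img$imgMetadata <- list(
  list(nr.frames = 7, staining = "DAPI"),
  list(nr.frames = 7, staining = "cell membrane marker")
)

# Look up one annotation for the "dapi" image.
staining_dapi <- img$imgMetadata[[which(img$image_id == "dapi")]]$staining
```

The set of top-level columns stays fixed this way, at the cost of one extra level of indexing to reach each annotation.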

@lgeistlinger
Author

lgeistlinger commented Nov 14, 2022

> The imgData is designed to store images at the moment, so we designed it for storing images and metadata associated with them.

Right, but as my example above shows, the annotation of image metadata is restricted to sample_id, image_id, and scaleFactor. For my applications of interest, I could not annotate the number of frames/channels of an image, the type of image (e.g. mask or raw image), the type of mask (nuclei or cell mask), the marker used for staining (e.g. DAPI, PolyA, or a cell membrane marker), the segmentation algorithm used, etc. This could be solved by allowing an arbitrary number of additional columns in the imgData DataFrame (if it is indeed represented as a DataFrame internally and that is not just an artifact of the show method).
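To make the use case concrete, here is a rough sketch of how such flat annotation columns could be used to pick out images. A plain `data.frame` stands in for imgData, which currently rejects extra columns. The column names `image_type`, `mask_type`, and `marker`, as well as the third row, are made up for illustration.

```r
img <- data.frame(
  sample_id  = c("ileum", "ileum", "ileum"),
  image_id   = c("dapi", "membrane", "nuclei_mask"),  # third row is made up
  image_type = c("raw", "raw", "mask"),
  mask_type  = c(NA, NA, "nuclei"),
  marker     = c("DAPI", "cell membrane marker", NA),
  stringsAsFactors = FALSE
)

# Select all raw images.
raw_images <- img[img$image_type == "raw", ]

# Select the nuclei mask; which() drops the NA comparisons.
nuclei_mask <- img[which(img$mask_type == "nuclei"), ]
```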

@drighelli
Owner

okok, thanks for the clarification! :)
