[Documentation]: FAQ or Best Practice for representing co-registered ROI IDs #1736

Open
3 tasks done
CodyCBakerPhD opened this issue Jul 24, 2023 · 3 comments
Labels
priority: low alternative solution already working and/or relevant to only specific user(s) topic: docs issues related to documentation

@CodyCBakerPhD
Collaborator

What would you like changed or added to the documentation and why?

Opening here for now, but feel free to transfer wherever is best

For co-registration of cell IDs identified from the same subject across multiple ophys sessions, the recommendation is to use a custom column to indicate the global ID across sessions, and to keep the usual ROI table IDs specific to each individual session.

Do you have any interest in helping write or edit the documentation?

Yes.

@oruebel
Contributor

oruebel commented Jul 25, 2023

@CodyCBakerPhD in PyNWB, adding a note to the ophys tutorial may make sense: https://pynwb.readthedocs.io/en/stable/tutorials/domain/ophys.html#sphx-glr-tutorials-domain-ophys-py

Adding it to the best practices in the nwbinspector docs seems useful (even if there is no check function for this in the inspector), since this is currently the main place we point to for best practices https://nwbinspector.readthedocs.io/en/dev/best_practices/best_practices_index.html

> recommendation is to use a custom column to indicate the global ID across sessions and keep the typical ROI table IDs as being the ones specific to each individual session

Could you clarify why this is the recommended strategy? I think the strategy you describe is fine; I'm just trying to understand the reasoning for it better. Is it mainly because the global IDs are determined as a separate processing step after the ROI extraction on the individual files and so we should not overwrite the original IDs but store the global IDs separately; or is there another reason?

@oruebel oruebel added topic: docs issues related to documentation priority: low alternative solution already working and/or relevant to only specific user(s) labels Jul 25, 2023
@oruebel oruebel added this to the Next Release milestone Jul 25, 2023
@CodyCBakerPhD
Collaborator Author

> Is it mainly because the global IDs are determined as a separate processing step after the ROI extraction on the individual files and so we should not overwrite the original IDs but store the global IDs separately; or is there another reason?

That's exactly the reason. You'd acquire the data and segment at least one session first (and hopefully even save that to NWB at that stage); then there are two cases:

(a) if that session had multiple planes, perhaps you want to segment each plane separately, and if each plane is contiguous enough in space you may identify 'global' unit IDs across all planes

or

(b) acquire and segment more sessions from the same animal and region, then co-register the same 'global' ID over time

See https://github.com/RichieHakim/ROICaT#readme as a tool that is being built (or works already?)

It might even be capable of reading NWB files via the ROIExtractors integration, in which case we might want to add it to the overview. @RichieHakim would know

@RichieHakim

> See https://github.com/RichieHakim/ROICaT#readme as a tool that is being built (or works already?)

It works great.

> It might even be capable of reading NWB files via the ROIExtractors integration, in which case we might want to add it to the overview. @RichieHakim would know

Yes, it is capable of reading NWB files as input.

> Could you clarify why this is the recommended strategy?

This information is useful because it allows for analyzing the same neurons over sessions or planes, which is becoming increasingly common.

> Could you clarify why this is the recommended strategy?

The approach described by @CodyCBakerPhD makes sense to me and is what I use. The output of my ROI tracking pipeline is generally a list of integer arrays, where each array corresponds to a session, each element of the array corresponds to an ROI (at the same index position as other representations of the ROI, like fluorescence traces and masks), and the value of each element is the 'unique ROI ID number'. In this way, ROIs with the same ID number from different imaging sessions / planes can be defined as deriving from the same source neuron / ROI. Generally, ROIs that are 'unassigned' to a unique source / cluster are given an ID number of -1. Because the -1 sentinel requires a signed integer type, the datatype is generally int64.
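
As a rough illustration of that layout (not ROICaT's actual code or output), the sketch below uses plain Python lists in place of int64 arrays. The ID values and the helper name `build_lookup` are made up for the example. It also shows how the per-session arrays convert into a look-up table keyed by global ID.

```python
from collections import defaultdict

UNASSIGNED = -1  # sentinel for ROIs not assigned to any cluster

# One list per session: position = session-local ROI index, value = global ID.
labels_by_session = [
    [0, 1, -1, 2],  # session 0
    [1, 0, 3],      # session 1
    [2, -1, 1, 0],  # session 2
]


def build_lookup(labels_by_session):
    """Map each global ID to its (session_index, roi_index) occurrences."""
    lookup = defaultdict(list)
    for session_index, labels in enumerate(labels_by_session):
        for roi_index, global_id in enumerate(labels):
            if global_id != UNASSIGNED:
                lookup[global_id].append((session_index, roi_index))
    return dict(lookup)


lookup = build_lookup(labels_by_session)

# Global IDs that were found in every session.
tracked_in_all = sorted(
    global_id
    for global_id, hits in lookup.items()
    if len({session for session, _ in hits}) == len(labels_by_session)
)
```

Here `lookup[1]` lists where global ID 1 sits in each session, so the matching fluorescence traces can be pulled by index, and `tracked_in_all` keeps only the cells present in all three sessions.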

This particular data representation strikes an appropriate balance between sparsity/efficiency and clarity/ease-of-use compared to other representations. There are many other representations that provide more clarity or more sparsity (look-up tables, sparse arrays, see here for a related function).
