Refactoring spot finding #1450
Comments
I think you're on to something here. :-) I have some clarifying questions.
is "an image" here an arbitrary series of 2d or 3d planes? For the ISS example, should
be A hiccup in
Could you flesh out the contract more explicitly? What indices would be passed in? (x, y, z)? (x, y, z, r)? How would these be matched to the images that would be measured? Also, it's worth noting that the
Could you explain why this is needed, and perhaps how decoding would work in the remainder of the spot finding examples?
My opinions (and therefore @shanaxel42 should chime in with her opinion).
It should be a 5D tensor, with axes that we group by.
If we group a bunch of spots with pixel coordinates that are not identical, would we need the coordinates of each spot in each r·c? Would the average location not suffice? MeasureSpots could have an implementation that tolerates some fuzz around where it's doing the measurement.
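To make the "fuzz" idea concrete, here is a minimal sketch, assuming a measurement that takes the maximum inside a small window around one shared location. `measure_with_fuzz` and its `fuzz` parameter are hypothetical names for illustration, not part of starfish.

```python
import numpy as np

def measure_with_fuzz(plane, y, x, fuzz=1):
    """Return the max intensity within `fuzz` pixels of (y, x) in a 2D plane."""
    # Clip the window to the image bounds so edge spots do not wrap around.
    y0, y1 = max(y - fuzz, 0), min(y + fuzz + 1, plane.shape[0])
    x0, x1 = max(x - fuzz, 0), min(x + fuzz + 1, plane.shape[1])
    return float(plane[y0:y1, x0:x1].max())

# The same spot drifts by one pixel between two rounds.
round_0 = np.zeros((5, 5))
round_0[2, 2] = 1.0
round_1 = np.zeros((5, 5))
round_1[2, 3] = 0.8

# Measure both rounds at one shared location with a one-pixel tolerance.
trace = [measure_with_fuzz(p, y=2, x=2, fuzz=1) for p in (round_0, round_1)]

# With no tolerance, the drifted round reads as background.
strict = measure_with_fuzz(round_1, y=2, x=2, fuzz=0)
```

The window recovers the drifted signal here, but in a crowded field the same window could pick up a neighbouring spot, so the tolerance would need to stay small relative to spot spacing.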
I think this needs an "axes to measure across". Sequential assays should be doing this across C or Z (not mutually exclusive).
Are you saying these do more than one thing?
Spot finders really only work on 2d and 3d images, though -- grouping over
I think we would, because...
This would decrease precision of the outcome by averaging over regions with signal and no signal, or in cases with crowded signal, produce the wrong result.
👍
I was trying to articulate that I'm not confident that simply reporting (z, y, x, r, c, radius-or-equivalent) is always adequate to determine the spot intensity. I suspect we can do this, but as I understand the proposal, we'll need to dig into blob_log and trackpy.locate to understand what they're doing to measure intensity and factor that out.
Yes SpotLocate will always accept a 5d tensor and will work the same as the
trackpy currently creates SpotAttribute with
Depends on the SpotLocate method used, as above. But always a SpotAttributes dataframe with at least x,y positions.
Since the output of LocateSpots will always be a SpotAttributes dataframe, only a simple step is left in the Pixel scenario to turn that into a DecodedIntensityTable: Decode.Lookup() would just do that last bit of mapping the key values of each spot to its corresponding gene value. For the rest of the spot finding examples the last output is a regular IntensityTable, so the last step is just normal decoding to go from IntensityTable -> DecodedIntensityTable, the same as things work now.
yuuup my bad.
I strongly suspect that a table with a feature_id column and a spot_id column might work well as a data structure that accommodates all our decoding methodologies. It may complicate how users configure the spot detection methods, but might make it possible to have one data structure for everything.
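One way to picture such a table is a long-format dataframe with one row per (feature, spot) membership. This is only a sketch: apart from feature_id and spot_id, the column names (`r`, `c`, `intensity`) and the values are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical long-format table: one row per (feature, spot) membership.
spots = pd.DataFrame(
    {
        "feature_id": [0, 0, 1, 1],
        "spot_id": [10, 11, 11, 12],  # spot 11 belongs to two candidate features
        "r": [0, 1, 1, 0],
        "c": [0, 1, 1, 2],
        "intensity": [0.9, 0.7, 0.7, 0.8],
    }
)

# Pivot each feature into a trace over (r, c); missing entries become 0.
traces = spots.pivot_table(
    index="feature_id", columns=["r", "c"], values="intensity", fill_value=0.0
)

# Spots that were reused by more than one feature.
membership = spots.groupby("spot_id")["feature_id"].nunique()
shared_spots = membership[membership > 1].index.tolist()
```

A non-multiplexed assay would simply have one spot per feature, so the same structure covers both cases.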
Closing in favor of the new direction detailed here: #1500
Design Proposal
The approach that seqFISH takes to spot finding breaks some of our current processing models. The existing spot finding ecosystem was built up fairly independently for each assay, which makes reuse difficult. This proposal refactors our current spot finders into a few small packages that are hopefully modular enough to support seqFISH and any other future spot finding approach.
starfish.spots would include the following 3 packages:
To support each step, the following three data models are required:
- SpotAttributes (a dataframe representing spots and their attributes x, y, z, radius, plus optional info: passes threshold, target, intensity)
- IntensityTable (aka spot traces: a representation of spots, their attributes, and their corresponding traces. For multiplexed methods this trace is a vector; for non-multiplexed methods the trace is a single value.)
- DecodedIntensityTable (aka decoded spot traces: the same representation as IntensityTable, with each spot's decoded target value added)
The inputs and outputs of these packages would be:
imagestack & spot finding algorithm -> LocateSpots -> SpotAttributes
SpotAttributes & imagestack -> MeasureSpots -> IntensityTable
IntensityTable & Codebook -> DecodeSpots -> DecodedIntensityTable
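The three stages above can be sketched end to end on a toy stack. Everything in this sketch is an assumption made for illustration: the function names, the (r, c, y, x) stack layout with z dropped for brevity, the threshold-based locator, and the dictionary codebook are stand-ins, not the proposed starfish API.

```python
import numpy as np

def locate_spots(stack, threshold):
    """LocateSpots stand-in: threshold the max projection over (r, c)."""
    projection = stack.max(axis=(0, 1))
    ys, xs = np.nonzero(projection > threshold)
    return [
        {"spot_id": i, "y": int(y), "x": int(x)}
        for i, (y, x) in enumerate(zip(ys, xs))
    ]

def measure_spots(spot_attributes, stack):
    """MeasureSpots stand-in: read each spot's intensity in every (r, c) plane."""
    return np.array([stack[:, :, s["y"], s["x"]] for s in spot_attributes])

def decode_spots(intensity_table, codebook):
    """DecodeSpots stand-in: brightest channel per round, looked up in the codebook."""
    calls = intensity_table.argmax(axis=2)  # shape (spots, rounds)
    return [codebook.get(tuple(int(c) for c in call)) for call in calls]

# Toy stack with 2 rounds, 2 channels, and a 4x4 field of view.
stack = np.zeros((2, 2, 4, 4))
stack[0, 0, 1, 1] = stack[1, 1, 1, 1] = 1.0  # spot A: channel 0, then channel 1
stack[0, 1, 2, 3] = stack[1, 0, 2, 3] = 1.0  # spot B: channel 1, then channel 0

codebook = {(0, 1): "GENE_A", (1, 0): "GENE_B"}

spot_attributes = locate_spots(stack, threshold=0.5)
intensity_table = measure_spots(spot_attributes, stack)
targets = decode_spots(intensity_table, codebook)
```

Each stage only depends on the previous stage's output, which is the property that would let an assay swap in a different locator or decoder without touching the rest.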
This introduces the concept of a DecodedIntensityTable which would just be a subclass of IntensityTable that includes a 'targets' dimension. This is a minor change that I think will reduce a lot of confusion around the outputs of each step.
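As a rough illustration of the subclass relationship, the sketch below uses plain dataclasses. The field names and the `count_targets` helper are hypothetical and do not reflect how the real IntensityTable is implemented.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List

@dataclass
class IntensityTable:
    # One trace per spot; a single-element trace for non-multiplexed assays.
    traces: List[List[float]]

@dataclass
class DecodedIntensityTable(IntensityTable):
    # The only addition: one decoded target per spot.
    targets: List[str]

    def __post_init__(self):
        if len(self.targets) != len(self.traces):
            raise ValueError("need exactly one target per spot")

def count_targets(table: DecodedIntensityTable) -> Counter:
    """Downstream steps can state in their signature that decoding must have happened."""
    return Counter(table.targets)

decoded = DecodedIntensityTable(
    traces=[[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]],
    targets=["GENE_A", "GENE_B", "GENE_A"],
)
counts = count_targets(decoded)
```

Because the decoded table is still an IntensityTable, anything that consumes traces keeps working, while the type makes it explicit which step produced the data.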
In order to incorporate methods that use local spot finding, like seqFISH, we also need to add the concept of a "spot id" to the SpotAttributes dataframe. This will allow us to use the same spot in multiple possible groupings and track it.
So what does this mean for the current spot finders?
ex. osmFISH
osmFISH currently uses a LocalMaxPeakFinder with no reference image. Its new workflow would be:
ex. ISS
ISS currently uses a blob detector on a blobs image. Its new workflow would be:
ex. BaristaSeq
BaristaSeq currently uses PixelSpotDecoding. Pixel spot decoding will remain its own module that does both locating and decoding.
ex. StarMap
StarMap currently uses LocalSearchBlobDetector. Its new workflow would be:
ex. smFISH
smFISH currently uses TrackpyLocalMaxPeakFinder with no reference image. Its new workflow would be: