Intensity Table Concat Processing #1118
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master    #1118      +/-   ##
==========================================
+ Coverage   88.95%   89.08%   +0.13%
==========================================
  Files         127      128       +1
  Lines        4824     4891      +67
==========================================
+ Hits         4291     4357      +66
- Misses        533      534       +1
Continue to review full report at Codecov.
review checkpoint
concatenated = IntensityTable.concatanate_intensity_tables(
    [it1, it2], overlap_strategy=OverlapStrategy.TAKE_MAX)

# The overlap section hits half of the spots from each intensity table, 5 from it1
wait what? if it hits 5 of the spots from it1, then shouldn't we get a total of 25 spots?
both the sel and remove_area_of_xarray methods are inclusive... so one boundary spot ends up in both the comparison count and the concatenation... maybe this is wrong?
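To make the inclusivity point concrete, here is a minimal sketch in plain xarray (no starfish code; the names and coordinates are illustrative) showing how bounds that are inclusive on both the selection side and the removal side count a boundary spot twice:

```python
import numpy as np
import xarray as xr

# Ten spots at x = 0..9; treat x in [0, 4] as the overlap region.
spots = xr.DataArray(np.arange(10), dims=["x"], coords={"x": np.arange(10.0)})

# xarray's label-based .sel with a slice is inclusive of BOTH endpoints,
# so the spot at x == 4 is selected here...
in_overlap = spots.sel(x=slice(0, 4))

# ...and is also kept by a selection that treats 4 as the start of the
# remaining region, so the boundary spot is counted on both sides.
remaining = spots.sel(x=slice(4, 9))

print(in_overlap.sizes["x"], remaining.sizes["x"])  # 5 and 6 -> 11 spots from 10
```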
I would dump the table to make sure it is consistent with your understanding, though I suspect you are correct. :)
but please see comments.
""" | ||
all_overlaps: List[Tuple[int, int]] = list() | ||
for idx1, idx2 in itertools.combinations(range(len(xarrays)), 2): |
so this is, as you pointed out, an n^2 operation, but each comparison also requires scanning the table to find the min/max for the coordinates. can you get some numbers from @ambrosejcarr for realistic FOV and spot counts, build a set of tables that reflects that, and see what the perf is like? we can likely shrink the cost significantly by precomputing a min/max for each xarray and reusing that. it would also help inform whether we need to do the nlogn approach.
Finally, it might be worth trying to actually merge a large set of intensity tables, even if synthetic, to see if there are any performance implications to any of the xarray ops used during the merge.
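A rough sketch of that precompute idea (the bounding-box helpers and the "xc" / "yc" coordinate names are assumptions for illustration, not the PR's API): extract each table's min/max once, then do the pairwise comparison against the cached boxes instead of rescanning coordinates.

```python
import itertools
from typing import List, Tuple

import xarray as xr


def bounding_box(xarr: xr.DataArray) -> Tuple[float, float, float, float]:
    # one coordinate scan per table, done up front
    return (
        float(xarr.xc.min()), float(xarr.xc.max()),
        float(xarr.yc.min()), float(xarr.yc.max()),
    )


def boxes_overlap(b1, b2) -> bool:
    x1_min, x1_max, y1_min, y1_max = b1
    x2_min, x2_max, y2_min, y2_max = b2
    return x1_min <= x2_max and x2_min <= x1_max and y1_min <= y2_max and y2_min <= y1_max


def find_overlaps(xarrays: List[xr.DataArray]) -> List[Tuple[int, int]]:
    boxes = [bounding_box(x) for x in xarrays]           # O(n) scans
    return [
        (i, j)
        for i, j in itertools.combinations(range(len(xarrays)), 2)
        if boxes_overlap(boxes[i], boxes[j])              # O(n^2) cheap comparisons
    ]
```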
overlap_method = OVERLAP_STRATEGY_MAP[overlap_strategy]
idx1, idx2 = indices
# modify IntensityTables based on overlap strategy
it1, it2 = overlap_method(its[idx1], its[idx2])
I think this strategy might break down where you have three intensity tables that overlap in one area. It might be easier to illustrate this if I'm not making sense.
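A toy illustration of the concern (purely hypothetical; counts stand in for spots in the single shared overlap region, and take_max here is a stand-in rather than the PR's implementation): when three tables overlap in one area, later pairwise comparisons run against tables that earlier pairs have already trimmed.

```python
from itertools import combinations

# Spot counts that each table contributes to the single shared overlap region.
overlap_counts = {"A": 5, "B": 4, "C": 3}


def take_max(c1: int, c2: int) -> tuple:
    # keep the overlap spots of the larger table, drop the smaller table's
    return (c1, 0) if c1 >= c2 else (0, c2)


for n1, n2 in combinations(overlap_counts, 2):
    print(f"comparing {n1}={overlap_counts[n1]} with {n2}={overlap_counts[n2]}")
    overlap_counts[n1], overlap_counts[n2] = take_max(
        overlap_counts[n1], overlap_counts[n2])

# The final (B, C) comparison sees 0 vs 0, not the original 4 vs 3, because
# both tables were already trimmed by their comparisons against A.
```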
This PR introduces the idea of processing a list of IntensityTables with an overlap strategy before concatenating them. It also explicitly adds the TAKE_MAX strategy described by @berl, in which we compare overlapping intensity tables and remove spots from the one with fewer spots in the overlap.
The PR also includes unit tests for overlapping_util methods.
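As a rough sketch of that TAKE_MAX idea (the "xc" / "yc" coordinate names and the explicit bounds are illustrative assumptions, not the exact function added in this PR): count each table's spots inside the overlap, then drop the overlap spots from whichever table has fewer.

```python
from typing import Tuple

import xarray as xr


def spots_in_area(it: xr.DataArray, x_min, x_max, y_min, y_max) -> xr.DataArray:
    # boolean mask over the features dimension: True where a spot falls inside the overlap
    return ((it.xc >= x_min) & (it.xc <= x_max) &
            (it.yc >= y_min) & (it.yc <= y_max))


def take_max(it1: xr.DataArray, it2: xr.DataArray,
             x_min, x_max, y_min, y_max) -> Tuple[xr.DataArray, xr.DataArray]:
    """Keep the overlap's spots in whichever table has more of them; drop them from the other."""
    mask1 = spots_in_area(it1, x_min, x_max, y_min, y_max)
    mask2 = spots_in_area(it2, x_min, x_max, y_min, y_max)
    if int(mask1.sum()) >= int(mask2.sum()):
        return it1, it2.where(~mask2, drop=True)
    return it1.where(~mask1, drop=True), it2
```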
NOTE:
In an effort to make the code easier to understand, I went with an O(n^2) approach to finding overlaps within a list of IntensityTables (just comparing each one to every other one). But there is an O(n log n) approach that involves sorting the list by x/y coordinates first. If we think we'll need to optimize this process for large lists, I can refactor.
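For reference, one sort-and-sweep sketch of that O(n log n) idea (operating on precomputed bounding-box tuples rather than IntensityTables; worst case is still quadratic when everything overlaps in x, but typical FOV grids stay close to n log n):

```python
from typing import List, Tuple


def find_overlaps(boxes: List[Tuple[float, float, float, float]]) -> List[Tuple[int, int]]:
    """Boxes are (x_min, x_max, y_min, y_max); returns index pairs of overlapping boxes."""
    order = sorted(range(len(boxes)), key=lambda i: boxes[i][0])  # sort by x_min
    active: List[int] = []
    overlaps: List[Tuple[int, int]] = []
    for i in order:
        x_min, x_max, y_min, y_max = boxes[i]
        # retire boxes whose x extent ends before this one starts
        active = [j for j in active if boxes[j][1] >= x_min]
        for j in active:
            # x ranges already overlap; check the y ranges
            if boxes[j][2] <= y_max and y_min <= boxes[j][3]:
                overlaps.append((min(i, j), max(i, j)))
        active.append(i)
    return overlaps
```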