rework metrics #125

Merged
merged 77 commits into from
Apr 28, 2021
Changes from 72 commits
ef212c6
start metrics template
ismael-mendoza Apr 7, 2021
9129b70
Reorganized metrics
Apr 13, 2021
9e1f1f2
typo
ismael-mendoza Apr 13, 2021
74bfe8d
update docstring
ismael-mendoza Apr 13, 2021
11d044d
docstring module
ismael-mendoza Apr 13, 2021
6c0e191
shouldbe py37
ismael-mendoza Apr 13, 2021
121b70b
Modified matches
Apr 14, 2021
79508a6
Merge branch 'rework-metrics' of github.com:LSSTDESC/BlendingToolKit …
Apr 14, 2021
41fe669
Added new metrics and adjusted matches (still WIP)
Apr 14, 2021
c977786
Fixed matches and added F1 score
Apr 14, 2021
f09c037
Small fixes and added IoU for segmentation
Apr 14, 2021
4305dea
Added generator instead of wrap
Apr 15, 2021
ad4f3b4
new way of doing detections
ismael-mendoza Apr 15, 2021
197d40d
fix docstring
ismael-mendoza Apr 15, 2021
68f740c
more explicit
ismael-mendoza Apr 15, 2021
19fa6aa
Fixed issues, added statistics per galaxy and basic plotting function
Apr 15, 2021
8e7f275
Merge branch 'rework-metrics' of github.com:LSSTDESC/BlendingToolKit …
Apr 15, 2021
2e46e97
Removed the old matching function
Apr 15, 2021
15ad82d
Added detected boolean column and distance, and new function for plot…
Apr 15, 2021
52d54d5
start metrics template
ismael-mendoza Apr 7, 2021
d88af72
Reorganized metrics
Apr 13, 2021
34fded1
Modified matches
Apr 14, 2021
ea6b1ad
typo
ismael-mendoza Apr 13, 2021
3a3f24d
update docstring
ismael-mendoza Apr 13, 2021
57db289
docstring module
ismael-mendoza Apr 13, 2021
06e2650
Added new metrics and adjusted matches (still WIP)
Apr 14, 2021
a09c758
Fixed matches and added F1 score
Apr 14, 2021
8e9002e
Small fixes and added IoU for segmentation
Apr 14, 2021
ec551fe
Added generator instead of wrap
Apr 15, 2021
72817c3
Fixed issues, added statistics per galaxy and basic plotting function
Apr 15, 2021
4f9aef5
new way of doing detections
ismael-mendoza Apr 15, 2021
14afd16
fix docstring
ismael-mendoza Apr 15, 2021
555f7bf
more explicit
ismael-mendoza Apr 15, 2021
626de51
Removed the old matching function
Apr 15, 2021
036cd5d
Added detected boolean column and distance, and new function for plot…
Apr 15, 2021
459940d
Added new plot features and distance to closest galaxy column
Apr 16, 2021
f15f1fc
added some docstrings
ismael-mendoza Apr 16, 2021
603004a
no need to name j variable if unused
ismael-mendoza Apr 16, 2021
1934ecd
Handling case with single measure_function
Apr 16, 2021
340515d
Merge branch 'rework-metrics' of github.com:LSSTDESC/BlendingToolKit …
Apr 16, 2021
a060b95
Tentative structure for measuring ellipticities and the like
Apr 19, 2021
b33dfd4
Added blendedness
Apr 19, 2021
aa86b63
Added target measurement metric
Apr 20, 2021
ef87c3e
refactoring and readability
ismael-mendoza Apr 20, 2021
5d1a3ee
improve docstring
ismael-mendoza Apr 20, 2021
c4df1d0
added more docstrings and remove unused arguments
ismael-mendoza Apr 20, 2021
8486b1c
added more docstrings, starting new function
ismael-mendoza Apr 20, 2021
02b2f52
add blend_counter to keep track of blends between batches; first draf…
ismael-mendoza Apr 20, 2021
d202dec
Merge branch 'rework-metrics' of github.com:LSSTDESC/BlendingToolKit …
Apr 20, 2021
f94045f
Made the detection metrics global instead of per blend
Apr 21, 2021
06a78fc
Cleaned the target_meas a bit
Apr 21, 2021
49a372b
Fixed dim_order everywhere
Apr 21, 2021
fc1ab80
Cleaned target_meas some more and fixed blendedness
Apr 21, 2021
99b0c2d
Added docstring for comput_metrics
Apr 21, 2021
2f7ad19
docstring additions
ismael-mendoza Apr 21, 2021
c0894d6
periods
ismael-mendoza Apr 21, 2021
9fdf803
change N->M
ismael-mendoza Apr 22, 2021
f304603
Update btk/plot_utils.py
ismael-mendoza Apr 22, 2021
b4f5c94
Updated tutorial (WIP), and made a few corrections
Apr 22, 2021
804e1d0
Merge branch 'rework-metrics' of github.com:LSSTDESC/BlendingToolKit …
Apr 22, 2021
e6d4a2d
Modified plot functions to take an ax as argument
Apr 22, 2021
aae9c5d
Separated functions
thuiop Apr 23, 2021
0a26ddc
Added docstrings
thuiop Apr 23, 2021
8cb63d7
Added more docstrings and removed unnecessary arguments
thuiop Apr 23, 2021
06749bc
Updated tests and added back efficiency matrix
thuiop Apr 26, 2021
3ef608f
Added execution test for metrics generator
thuiop Apr 26, 2021
4baa52d
Added plots to the metrics tests
thuiop Apr 26, 2021
353fffd
Merge branch 'main' into rework-metrics
thuiop Apr 27, 2021
89eb357
Convert to channels_last after rebase
thuiop Apr 27, 2021
7f4f1ee
Improved doc
thuiop Apr 27, 2021
c1338cf
Added documentation on all the metrics
thuiop Apr 27, 2021
105fa74
Added plot_utils to doc and updated flowchart
thuiop Apr 27, 2021
0553b99
Updated tutorial and added check for columns
thuiop Apr 28, 2021
44391da
Corrected IoU threshold
thuiop Apr 28, 2021
c40d03c
Changed default value for ellipticity and added metrics to precommit …
thuiop Apr 28, 2021
08a3b28
avoid special cases in returning measure_results and metrics_results
ismael-mendoza Apr 28, 2021
3608fec
avoid long multi-line commands
ismael-mendoza Apr 28, 2021
18 changes: 13 additions & 5 deletions btk/draw_blends.py
@@ -192,7 +192,7 @@ def __init__(
self.add_noise = add_noise
self.verbose = verbose

- self.dim_order = (1, 2, 0) if channels_last else (0, 1, 2)
+ self.channels_last = channels_last

def __iter__(self):
"""Returns iterable which is the object itself."""
@@ -243,9 +243,17 @@ def __next__(self):
batch_blend_cat[s.name] = []
batch_obs_cond[s.name] = []
image_shape = (
- len(s.filters),
- pix_stamp_size,
- pix_stamp_size,
+ (
+ len(s.filters),
+ pix_stamp_size,
+ pix_stamp_size,
+ )
+ if not self.channels_last
+ else (
+ pix_stamp_size,
+ pix_stamp_size,
+ len(s.filters),
+ )
ismael-mendoza marked this conversation as resolved.
)
blend_images[s.name] = np.zeros((self.batch_size, *image_shape))
isolated_images[s.name] = np.zeros((self.batch_size, self.max_number, *image_shape))
@@ -353,7 +361,7 @@ def render_mini_batch(self, blend_list, psf, wcs, survey, extra_data=None):
iso_image_multi[:, b, :, :] = single_band_output[1]

# transpose if requested.
- dim_order = np.array(self.dim_order)
+ dim_order = np.array((0, 1, 2) if not self.channels_last else (1, 2, 0))
blend_image_multi = blend_image_multi.transpose(dim_order)
iso_image_multi = iso_image_multi.transpose(0, *(dim_order + 1))
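
For readers following the layout change, here is a minimal sketch of the two conventions this diff switches between. The helper names `stamp_shape` and `to_layout` are hypothetical and are not part of BTK; only the tuple ordering and the `dim_order + 1` trick are taken from the lines above.

```python
import numpy as np


def stamp_shape(n_bands, stamp_size, channels_last=False):
    """Return the per-image shape for the requested layout (hypothetical helper)."""
    if channels_last:
        return (stamp_size, stamp_size, n_bands)
    return (n_bands, stamp_size, stamp_size)


def to_layout(blend, isolated, channels_last=False):
    """Transpose channels-first arrays into the requested layout.

    blend has shape (n_bands, H, W); isolated has shape (max_number, n_bands, H, W).
    """
    dim_order = np.array((1, 2, 0) if channels_last else (0, 1, 2))
    blend_out = blend.transpose(dim_order)
    # Shift every index by one so the leading object axis stays in place.
    iso_out = isolated.transpose(0, *(dim_order + 1))
    return blend_out, iso_out


blend = np.zeros((6, 24, 24))
isolated = np.zeros((3, 6, 24, 24))
b, iso = to_layout(blend, isolated, channels_last=True)
print(b.shape, iso.shape)  # (24, 24, 6) (3, 24, 24, 6)
```

Storing the boolean `channels_last` rather than a precomputed `dim_order` tuple lets downstream code (such as the measure generator) ask the question directly instead of decoding a permutation.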

104 changes: 64 additions & 40 deletions btk/measure.py
@@ -1,35 +1,40 @@
"""File containing measurement infrastructure for the BlendingToolKit.

Contains examples of functions that can be used to apply a measurement algorithm to the blends
- simulated by BTK. Every measurement function should take as an input a `batch` returned from a
- DrawBlendsGenerator object (see its `__next__` method) and an index corresponding to which image
- in the batch to measure.
+ simulated by BTK. Every measurement function should take as an input a `batch` returned from a
+ DrawBlendsGenerator object (see its `__next__` method) and an index corresponding to which image
+ in the batch to measure.

It should return a dictionary containing a subset of the following keys/values (note the key
`catalog` is mandatory):
- - catalog (astropy.table.Table): An astropy table containing measurement information. The
- `len` of the table should be `n_objects`. If your
- DrawBlendsGenerator uses a single survey, the following
- column names are required:
- - x_peak: horizontal centroid position in pixels.
- - y_peak: vertical centroid position in pixels.
- For multiple surveys (multi-resolution), we instead require:
- - ra: object centroid right ascension in arcseconds,
- following the convention from the `wcs` object included in
- the input batch.
- - dec: vertical centroid position in arcseconds,
- following the convention from the `wcs` object included in
- the input batch.
- - deblended_image (np.ndarray): Array of deblended isolated images with shape:
- `(n_objects, n_bands, stamp_size, stamp_size)` or
- `(n_objects, stamp_size, stamp_size, n_bands)` depending on
- convention. The order of this array should correspond to the
- order in the returned `catalog`. Where `n_objects` is the
- number of detected objects
- - segmentation (np.ndarray): Array of booleans with shape `(n_objects,stamp_size,stamp_size)`
- The pixels set to True in the i-th channel correspond to the i-th
- object. The order should correspond to the order in the returned
- `catalog`.
+
+ * catalog (astropy.table.Table): An astropy table containing measurement information. The
+ `len` of the table should be `n_objects`. If your
+ DrawBlendsGenerator uses a single survey, the following
Comment on lines +11 to +13
Collaborator Author

thanks for organizing the docstring so now it's clearer

+ column names are required:
+
+ * x_peak: horizontal centroid position in pixels.
+ * y_peak: vertical centroid position in pixels.
+
+ For multiple surveys (multi-resolution), we instead require:
+
+ * ra: object centroid right ascension in arcseconds,
+ following the convention from the `wcs` object included in
+ the input batch.
+ * dec: vertical centroid position in arcseconds,
+ following the convention from the `wcs` object included in
+ the input batch.
+
+ * deblended_image (np.ndarray): Array of deblended isolated images with shape:
+ `(n_objects, n_bands, stamp_size, stamp_size)` or
+ `(n_objects, stamp_size, stamp_size, n_bands)` depending on
+ convention. The order of this array should correspond to the
+ order in the returned `catalog`. Where `n_objects` is the
+ number of detected objects by the algorithm.
+ * segmentation (np.ndarray): Array of booleans with shape `(n_objects,stamp_size,stamp_size)`
+ The pixels set to True in the i-th channel correspond to the i-th
+ object. The order should correspond to the order in the returned
+ `catalog`.

Omitted keys in the returned dictionary are automatically assigned a `None` value (except for
`catalog` which is a mandatory entry).
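
To make the contract in this docstring concrete, the following is a rough sketch of a toy measurement function and of the "omitted keys become `None`" step. Everything here is illustrative: `brightest_pixel_measure` and `fill_missing` are hypothetical names, a plain dict of arrays stands in for the required `astropy.table.Table` to keep the sketch dependency-light, and the key is spelled `deblended_images` as in the code of this PR (the docstring writes `deblended_image`).

```python
import numpy as np

# Keys a measurement function may return, as spelled in the code of this PR.
MEASURE_KEYS = ("catalog", "deblended_images", "segmentation")


def brightest_pixel_measure(batch, idx):
    """Toy measurement: report the brightest coadd pixel as the only detection."""
    image = batch["blend_images"][idx]  # assumed channels-first: (n_bands, H, W)
    coadd = image.mean(axis=0)
    y, x = np.unravel_index(np.argmax(coadd), coadd.shape)
    # Stand-in for an astropy Table with the single-survey columns.
    catalog = {"x_peak": np.array([x]), "y_peak": np.array([y])}
    segmentation = np.zeros((1, *coadd.shape), dtype=bool)
    segmentation[0, y, x] = True
    # `deblended_images` is deliberately omitted.
    return {"catalog": catalog, "segmentation": segmentation}


def fill_missing(out):
    """Assign None to omitted optional keys; `catalog` stays mandatory."""
    if "catalog" not in out:
        raise ValueError("The measurement output must contain a 'catalog' entry.")
    return {k: out.get(k, None) for k in MEASURE_KEYS}


batch = {"blend_images": np.zeros((2, 3, 8, 8))}
batch["blend_images"][0, :, 5, 2] = 1.0  # bright pixel at row 5, column 2
result = fill_missing(brightest_pixel_measure(batch, 0))
```

Here `result["deblended_images"]` is `None`, while `result["catalog"]` reports `x_peak = 2` and `y_peak = 5`, following the convention that `x` is the horizontal (column) position.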
@@ -88,8 +93,8 @@ def sep_measure(batch, idx):
raise NotImplementedError("This function does not support the multi-resolution feature.")

image = batch["blend_images"][idx]
- stamp_size = image.shape[-2] # true for channels last or channels first.
- coadd = np.mean(image, axis=0)
+ stamp_size = image.shape[-2] # true for both 'NCHW' or 'NHWC' formats.
+ coadd = np.mean(image, axis=np.argmin(image.shape)) # Smallest dimension is the channels
thuiop marked this conversation as resolved.
bkg = sep.Background(coadd)
# Here the 1.5 value corresponds to a 1.5 sigma threshold for detection against noise.
catalog, segmentation = sep.extract(coadd, 1.5, err=bkg.globalrms, segmentation_map=True)
@@ -99,7 +104,11 @@ def sep_measure(batch, idx):
for i in range(n_objects):
seg_i = segmentation == i + 1
segmentation_exp[i] = seg_i
- deblended_images[i] = image * seg_i[np.newaxis, ...]
+ seg_i_reshaped = np.zeros((np.min(image.shape), stamp_size, stamp_size))
ismael-mendoza marked this conversation as resolved.
+ for j in range(np.min(image.shape)):
+ seg_i_reshaped[j] = seg_i
+ seg_i_reshaped = np.moveaxis(seg_i_reshaped, 0, np.argmin(image.shape))
thuiop marked this conversation as resolved.
+ deblended_images[i] = image * seg_i_reshaped

t = astropy.table.Table()
t["x_peak"] = catalog["x"]
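
As a side note on the masking step in this hunk, the same per-band multiplication can be sketched with broadcasting instead of an explicit loop. This is an alternative formulation and not the code merged here; `channel_axis` and `apply_mask` are hypothetical names. It shares the diff's assumption that the smallest dimension is the band axis, which would break if the number of bands were not smaller than the stamp size.

```python
import numpy as np


def channel_axis(image):
    """Heuristic used in the diff: the smallest dimension holds the bands."""
    return int(np.argmin(image.shape))


def apply_mask(image, seg):
    """Multiply every band of `image` by the 2D boolean mask `seg`."""
    axis = channel_axis(image)
    # Insert a length-1 axis where the bands live; broadcasting repeats the mask.
    return image * np.expand_dims(seg, axis=axis)


seg = np.zeros((8, 8), dtype=bool)
seg[2:4, 2:4] = True  # a 2x2 object footprint

chw = np.ones((3, 8, 8))  # channels first
hwc = np.ones((8, 8, 3))  # channels last

masked_chw = apply_mask(chw, seg)
masked_hwc = apply_mask(hwc, seg)
coadd = np.mean(hwc, axis=channel_axis(hwc))  # (8, 8) in either layout
```

Both masked arrays keep the shape of their input, and only the 4 footprint pixels survive in each of the 3 bands.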
@@ -156,7 +165,7 @@ def __init__(
self.cpus = cpus

self.batch_size = self.draw_blend_generator.batch_size
- self.dim_order = self.draw_blend_generator.dim_order
+ self.channels_last = self.draw_blend_generator.channels_last

self.verbose = verbose

@@ -203,11 +212,13 @@ def run_batch(self, batch, index):
f"The output '{key}' of at least one of your measurement"
f"functions is not a numpy array."
)
- if not out[key].shape[-2:] == batch["blend_images"].shape[-2:]:
- raise ValueError(
- f"The shapes of the blended images in your {key} don't"
- f"match for at least one your measurement functions."
- )
+ if key == "deblended_images":
+ if not out[key].shape[-3:] == batch["blend_images"].shape[-3:]:
+ raise ValueError(
+ f"The shapes of the blended images in your {key} don't "
+ f"match for at least one your measurement functions."
+ f"{out[key].shape[-3:]} vs {batch['blend_images'].shape[-3:]}"
+ )

out = {k: out.get(k, None) for k in self.measure_params}
output.append(out)
@@ -217,19 +228,32 @@ def __next__(self):
"""Return measurement results on a single batch from the draw_blend_generator.

Returns:
- draw_blend_generator output from `__next__` method.
- measurement output: List of length `batch_size`, where each element is a list of
- `len(measure_functions)` corresponding to the measurements made by
- each function on each element of the batch.
+ draw_blend_generator output from its `__next__` method.
+ measurement_results (dict): Dictionary with keys being the name of each
+ `measure_function` passed in. Each value is a dictionary containing keys
+ `catalog`, `deblended_images`, and `segmentation` storing the values returned by
+ the corresponding measure_function` for one batch.
"""
blend_output = next(self.draw_blend_generator)
input_args = ((blend_output, i) for i in range(self.batch_size))
- measure_results = multiprocess(
+ measure_output = multiprocess(
self.run_batch,
input_args,
cpus=self.cpus,
verbose=self.verbose,
)
if self.verbose:
print("Measurement performed on batch")
+ measure_results = {}
+ for i, f in enumerate(self.measure_functions):
+ measure_dic = {}
+ for key in ["catalog", "deblended_images", "segmentation"]:
+ if measure_output[0][i][key] is not None:
+ measure_dic[key] = [
+ measure_output[j][i][key] for j in range(len(measure_output))
+ ]
+ measure_results[f.__name__] = measure_dic
+ if len(self.measure_functions) == 1:
+ measure_results = measure_results[self.measure_functions[0].__name__]
Collaborator Author

I see that you have two separate cases depending on how many measurement functions you are using (1 vs >1). This makes it so that we have to split the logic in the metrics code below (and possibly later in the unit tests). I would suggest getting rid of this additional 'if' statement for now.

Later we can discuss how to make accessing the information less cumbersome for the user/us, but I think splitting the logic now just makes it confusing, at least for me. (I had to spend some time figuring out the if statement in __next__ of metrics_generator.)

Collaborator

I don't know, it doesn't feel too cumbersome to me; but why not.

Collaborator Author

I will commit this change soon.


return blend_output, measure_results
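
A minimal sketch of the regrouping discussed in the thread above, written without the single-function special case that the review suggests dropping. The function name `regroup` and the placeholder strings are hypothetical; the nesting of `measure_output` (one list per batch element, one dict per measure function) is assumed from the loop in the diff.

```python
MEASURE_KEYS = ("catalog", "deblended_images", "segmentation")


def regroup(measure_output, measure_functions, keys=MEASURE_KEYS):
    """Turn per-image results into a dict keyed by measure function name.

    measure_output[j][i] is the output dict of function i on batch element j.
    """
    results = {}
    for i, f in enumerate(measure_functions):
        per_function = {}
        for key in keys:
            # As in the diff, a key is kept only if the first element provides it.
            if measure_output[0][i][key] is not None:
                per_function[key] = [out[i][key] for out in measure_output]
        results[f.__name__] = per_function
    return results


def sep_like(batch, idx):  # placeholder measure function, never called here
    raise NotImplementedError


def catalog_only(batch, idx):  # placeholder measure function, never called here
    raise NotImplementedError


# Two batch elements, two measure functions; strings stand in for real outputs.
measure_output = [
    [
        {"catalog": "c00", "deblended_images": None, "segmentation": "s00"},
        {"catalog": "c01", "deblended_images": None, "segmentation": None},
    ],
    [
        {"catalog": "c10", "deblended_images": None, "segmentation": "s10"},
        {"catalog": "c11", "deblended_images": None, "segmentation": None},
    ],
]
measure_results = regroup(measure_output, [sep_like, catalog_only])
```

With this shape the caller always indexes by function name first, for example `measure_results["sep_like"]["catalog"]`, regardless of how many measure functions were passed.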