Apply Mask Changes: Multichannel, Allow Depth, Simplify Fill Value #1230

Merged
merged 13 commits into from
Apr 8, 2024

Conversation

ctuguinay
Collaborator

This PR is meant to address #1204 and #1224.

@ctuguinay ctuguinay changed the title [Draft] Apply Mask Multichannel, Allow Depth, Simplify Fill Value [Draft] Apply Mask Changes: Multichannel, Allow Depth, Simplify Fill Value Nov 22, 2023
@codecov-commenter

codecov-commenter commented Nov 22, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.97%. Comparing base (7679b96) to head (0b67ea6).
Report is 55 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1230      +/-   ##
==========================================
+ Coverage   83.29%   92.97%   +9.67%     
==========================================
  Files          64        3      -61     
  Lines        5675      185    -5490     
==========================================
- Hits         4727      172    -4555     
+ Misses        948       13     -935     
Flag Coverage Δ
unittests 92.97% <100.00%> (+9.67%) ⬆️


@ctuguinay ctuguinay self-assigned this Nov 22, 2023
@ctuguinay ctuguinay added enhancement This makes echopype better processing functions labels Nov 22, 2023
… test that currently needs to be refactored
@ctuguinay ctuguinay changed the base branch from dev to main March 1, 2024 17:35
@leewujung leewujung added this to the v0.8.4 milestone Mar 14, 2024
@ctuguinay
Collaborator Author

ctuguinay commented Mar 15, 2024

Original Checklist

Remove np.ndarray from allowed type for fill_value

  • Check for the channel dimension for all elements in mask to ensure that they have the same shape in ping_time and range_sample (and depth, see Add depth to allowed dimension in apply_mask #1204) -- this check should already be in place, but the code could use some checking and potentially simplification
  • If fill_value is of type xr.DataArray, it must have the same dimension as source_ds
  • Make sure to broadcast all input masks together before combining them (element-wise multiplication)
  • Add an input argument keep_unmasked_channel: bool = True
    • when True: channels that are not in mask will be left as is, i.e. no masking will be applied to the data in chC
    • when False: channels that are not in mask will be masked, i.e. all data in chC will be masked with values in fill_value.
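
The broadcast-then-multiply step in the checklist can be sketched roughly as follows. This is a minimal illustration using plain NumPy arrays rather than the actual `apply_mask` code; the array shapes, variable names, and the axis order `(channel, ping_time, range_sample)` are assumptions made for the example. With xarray the broadcasting would be done by dimension name instead of by position.

```python
import numpy as np

# Hypothetical toy data: 2 channels, 3 pings, 4 range samples.
source = np.arange(24, dtype=float).reshape(2, 3, 4)

# A mask without a channel axis, shape (ping_time, range_sample).
mask_2d = np.ones((3, 4), dtype=int)
mask_2d[:, 3] = 0  # mask out the last sample of every ping

# A mask with a channel axis, shape (channel, ping_time, range_sample).
mask_3d = np.ones((2, 3, 4), dtype=int)
mask_3d[1, 0, :] = 0  # mask out the first ping of the second channel

# Broadcast all masks to a common shape before combining them.
b2d, b3d = np.broadcast_arrays(mask_2d, mask_3d)

# Element-wise multiplication of 0/1 masks acts as a logical AND.
final_mask = b2d * b3d

# Keep data where the combined mask is 1, use fill_value elsewhere.
fill_value = np.nan
masked = np.where(final_mask == 1, source, fill_value)
```

A cell survives only if every mask keeps it, which is why the masks need a common shape before they are multiplied.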

Further Changes

  • Modified logic to no longer default to the final mask being 1 channel for all cases. Instead, it checks whether the channel dimension exists in both the final mask and the source da. If it exists in both, the final mask is applied per channel (the behavior depends on the value of keep_unmasked_channel).
  • Modified integration tests to include more permutations of mask channel inputs

@ctuguinay ctuguinay marked this pull request as ready for review March 15, 2024 23:31
@ctuguinay ctuguinay changed the title [Draft] Apply Mask Changes: Multichannel, Allow Depth, Simplify Fill Value Apply Mask Changes: Multichannel, Allow Depth, Simplify Fill Value Mar 15, 2024
@ctuguinay ctuguinay requested a review from leewujung March 15, 2024 23:32
Can be a single input or list that corresponds to a DataArray or a path.
Each entry in the list must have dimensions ``('ping_time', 'range_sample')``.
Multi-channel masks are not currently supported.
Can be a individual input or list that corresponds to a DataArray or a path.
Member

small fix

Suggested change
Can be a individual input or list that corresponds to a DataArray or a path.
Can be a individual input or a list that corresponds to a DataArray or a path.

Member

@leewujung leewujung left a comment

Thanks for the changes @ctuguinay! The core changes for expanding this function to allow data variables with the depth coordinate (in addition to range_sample) and for simplifying the fill value specification look good.

I have a more overarching comment about what we expect in the input arguments. Right now the code is set up to be super flexible and catch as many cases as possible if there are mismatches between source_ds and mask_list. I think that we can reduce the code complexity significantly if we impose some expectations on the input arguments and make meeting them the responsibility of the users. This is also because I think the current code does not actually catch all the things that could go wrong in the mismatches, so it may be better not to accommodate some scenarios while missing others.

My suggestions are the following:

  • If the mask(s) in mask_list have the channel dimension, we require that source_ds and all elements in mask_list have the same channel coordinates and the shapes are the same (across channel, ping_time, and range_sample or depth).
  • If the mask(s) in mask_list do not have the channel dimension, we require that source_ds and all elements in mask_list have the same shape (across ping_time and range_sample or depth).
  • The above requires users to slice their masks and/or Sv dataset at the input.
  • This will remove the need for the keep_unmasked_channel argument. I think allowing this variable would potentially make the data processing pipeline very messy, since the shape of the dataset could change somewhat unpredictably depending on this argument keep_unmasked_channel. If people only want to apply masks to a subset of the channels, they could put 1s for all data in the channels that are not masked.
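
The last suggestion (putting 1s in the channels that should not be masked) could look roughly like the sketch below. It uses plain NumPy arrays with an assumed axis order of `(channel, ping_time, range_sample)`; the data and variable names are hypothetical and this is not the echopype implementation.

```python
import numpy as np

# Hypothetical source: 3 channels, 2 pings, 4 samples.
source = np.arange(24, dtype=float).reshape(3, 2, 4)

# Start from all ones so every channel is kept by default.
mask = np.ones(source.shape, dtype=int)

# Only the middle channel (index 1) is actually masked:
# drop its last two samples in every ping.
mask[1, :, 2:] = 0

# The mask has the same shape as the source, so the output shape is
# the same no matter which channels are masked.
masked = np.where(mask == 1, source, np.nan)
```

Because the mask always covers every channel, the shape of the result does not depend on an extra argument such as `keep_unmasked_channel`.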

Let me know what you think.

@ctuguinay
Collaborator Author

Thanks for the review @leewujung!

Yeah, I agree with you on there being too many cases that this code tries to accommodate. I also agree with you on all suggestions you posed.

Just to get the cases in order, once we get to the post-broadcast post-logical-reduce stage (once we get to here https://github.com/ctuguinay/echopype/blob/e52e05f33a82d1700d24f5c58d072fe15d319136/echopype/mask/api.py#L335), I should pass only the following:

  • No channel in both source and mask, but they have matching ping time and depth
  • Source and mask both have matching channel, ping time, and depth
  • Source has channel and mask doesn't, but they have matching ping time and depth
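
One way to picture the three accepted cases is a small check over dimension sizes. This is only a sketch: `classify_mask_case` is a hypothetical helper, not a function in echopype, and it takes plain mappings of dimension name to length (similar to what `DataArray.sizes` returns) instead of real datasets. It compares sizes only; checking that the channel coordinate values also match, as the review asks, is left out.

```python
from typing import Mapping


def classify_mask_case(source_sizes: Mapping[str, int],
                       mask_sizes: Mapping[str, int]) -> str:
    """Return which accepted case applies, or raise ValueError."""
    # The vertical dimension is either range_sample or depth.
    vertical = [d for d in ("range_sample", "depth") if d in source_sizes]
    if len(vertical) != 1:
        raise ValueError("source needs exactly one of range_sample or depth")

    # ping_time and the vertical dimension must always match.
    for dim in ("ping_time", vertical[0]):
        if dim not in source_sizes or mask_sizes.get(dim) != source_sizes[dim]:
            raise ValueError(f"source and mask do not match along {dim!r}")

    source_has_channel = "channel" in source_sizes
    mask_has_channel = "channel" in mask_sizes

    if mask_has_channel and not source_has_channel:
        raise ValueError("mask has a channel dimension but source does not")
    if mask_has_channel:
        if mask_sizes["channel"] != source_sizes["channel"]:
            raise ValueError("source and mask do not match along 'channel'")
        return "both have channel"
    if source_has_channel:
        return "source has channel, mask does not"
    return "no channel in either"


# Hypothetical sizes for a 3-channel Sv variable and a 2D mask.
sv_sizes = {"channel": 3, "ping_time": 5, "depth": 10}
print(classify_mask_case(sv_sizes, {"ping_time": 5, "depth": 10}))
```

Anything outside the three cases raises an error, so the masking step that follows only has to handle inputs that already line up.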

@ctuguinay ctuguinay requested a review from leewujung April 1, 2024 22:45
@ctuguinay
Collaborator Author

@leewujung This PR should be ready to review again

leewujung
leewujung previously approved these changes Apr 8, 2024
Member

@leewujung leewujung left a comment

The changes look good. I suggested some small edits for the docstring and error messages. It is important to be as clear as possible about these things. I think this is ready to be merged -- thanks for the changes!

@ctuguinay
Collaborator Author

@leewujung I just committed your suggestions, and this should be ready to merge. Thank you for making the docstring more precise! I think I wrote that right after I got tests working so I was at the point of complete familiarity, and I didn't stop to think whether or not someone new could understand it. I will work on that 👍

@leewujung
Member

Thanks @ctuguinay , I'll merge this now! 🎉

@leewujung leewujung merged commit 646759e into OSOceanAcoustics:main Apr 8, 2024
5 checks passed
@leewujung
Member

Oops, I just realized that you targeted main and not dev, and I used squash and merge. I may have to do a hard reset and ask you to open another PR...

@leewujung
Member

Looks like there are no compounding changes from other PRs that have been merged to dev, so this should be ok. We want to simplify our dev workflow anyway (#1275), so this forces us to do it, ha.

3 participants