Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Boxcar extraction with non-finite pixels #245

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

hpparvi
Copy link
Contributor

@hpparvi hpparvi commented Jan 29, 2025

This PR adds a new exclude mask treatment option to BoxcarExtract, allowing the extraction to exclude any non-finite pixels. With the exclude option, the extraction is now carried out as a weighted sum, calculated as the average of the finite in-window pixels multiplied by the number of in-window pixels. The behaviour stays identical to the previous with the other mask treatment options or if no non-finite values are present.

@hpparvi hpparvi self-assigned this Jan 29, 2025
@hpparvi hpparvi added this to the v1.5 milestone Jan 29, 2025
Copy link

codecov bot commented Jan 29, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 85.46%. Comparing base (e0ac82a) to head (bf647cd).
Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #245      +/-   ##
==========================================
+ Coverage   85.38%   85.46%   +0.07%     
==========================================
  Files          13       13              
  Lines        1177     1183       +6     
==========================================
+ Hits         1005     1011       +6     
  Misses        172      172              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@hpparvi hpparvi changed the title Weighted boxcar extraction Boxcar extraction with non-finite pixels Jan 30, 2025
…lly handled in extraction methods. Detailed the default behavior for the `mask_treatment` option in BoxcarExtract. Additionally, fixed some minor typos.
@hpparvi hpparvi requested a review from camipacifici February 10, 2025 20:18
@hpparvi
Copy link
Contributor Author

hpparvi commented Feb 10, 2025

I improved the documentation slightly to describe how the non-finite values can be handled in boxcar extraction.

mask_treatment: Literal['filter', 'omit', 'zero-fill'] = 'filter'
_valid_mask_treatment_methods = ('filter', 'omit', 'zero-fill')
# TODO: should the 'filter' option be changed to 'propagate'
mask_treatment: Literal['filter', 'omit', 'exclude'] = 'exclude'
Copy link
Contributor

@tepickering tepickering Feb 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this choice of words doesn't make it immediately clear what each option does. also, should we keep zero-fill as an option and implement it?

Copy link
Contributor

@tepickering tepickering left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i find that the names of the mask treatment options as implemented are not very clear and in fact somewhat misleading. it's worth having longer names that are more clearly descriptive of what they actually do. and if only two options are functionally supported, there should only be two options available. in my opinion, the options that make sense are:

  • ignore: don't apply mask or mask non-finite values; extract from image as-is.
  • apply: apply mask and mask non-finite values as you've implemented (should be default).
  • apply_mask_only: apply mask, but ignore non-finite values
  • mask_nan_only: ignore mask, but mask non-finite values
  • mask_extraction_bin: grow mask so that if any masked or non-finite values are in the extraction bin, the whole bin gets masked in the output.

apply_mask_only and mask_nan_only are kind of niche cases so maybe we don't worry about those. zero_fill is another option that might make sense in some contexts, but i think masking properly is better.

there's also the larger, more contentious issue that BoxcarExtract is a class that really should be a function. all of its functionality is contained within __call__() and making it a class adds a redundant layer of type declarations. it also requires users to either use two lines of code to perform an extraction or an unintuitive second pair of ()'s.

width = width or self.width
disp_axis = disp_axis or self.disp_axis
cdisp_axis = crossdisp_axis or self.crossdisp_axis
mask_mapping = {'filter': 'filter', 'exclude': 'filter', 'omit': 'omit'}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and then the 3 unclear choices become actually 2...

@hpparvi
Copy link
Contributor Author

hpparvi commented Feb 13, 2025

Thanks for your comments @tepickering. I've aimed to retain the mask treatment option names chosen by @cshanahan1 in her mask treatment PR while making spectrum-extraction-specific addition of the exclude option and the removal of the zero-fill option. That said, I agree that we should have more discussion about these before we release v1.5.

As I understand it, the main goal of @cshanahan1's mask treatment work is to introduce a consistent set of mask and non-finite value treatment options that are easy to document, understand, and remember. Also, since not all options are relevant to every use case, individual classes and functions don't need to support all options but only those that make sense within their context.

The current options are filter, omit, and zero-fill, where filter ignores the mask, omit masks any columns with masked or non-finite pixels (that is, it treats masked values as NaNs and propagates them), and zero-fill is self-explanatory.

I like the apply and ignore options by @tepickering , and would keep zero-fill as it is. Personally, I find propagate to be a good alternative to omit, but I’m not sure if that's just my preference. mask_extraction_bin would also be clear, especially since this option is really only needed in the extraction classes and functions.

I can refactor the code once we're all happy with the option names, but I would do this in a separate PR. Before starting with this, I'd appreciate any input, so let me know if you have any opinions @cshanahan1, @tepickering, @camipacifici, @kecnry, @kbwestfall, and @eteq.

i find that the names of the mask treatment options as implemented are not very clear and in fact somewhat misleading. it's worth having longer names that are more clearly descriptive of what they actually do. and if only two options are functionally supported, there should only be two options available. in my opinion, the options that make sense are:

* `ignore`: don't apply mask or mask non-finite values; extract from image as-is.

* `apply`: apply mask and mask non-finite values as you've implemented (should be default).

* `apply_mask_only`: apply mask, but ignore non-finite values

* `mask_nan_only`: ignore mask, but mask non-finite values

* `mask_extraction_bin`: grow mask so that if any masked or non-finite values are in the extraction bin, the whole bin gets masked in the output.

apply_mask_only and mask_nan_only are kind of niche cases so maybe we don't worry about those. zero_fill is another option that might make sense in some contexts, but i think masking properly is better.

there's also the larger, more contentious issue that BoxcarExtract is a class that really should be a function. all of its functionality is contained within __call__() and making it a class adds a redundant layer of type declarations. it also requires users to either use two lines of code to perform an extraction or an unintuitive second pair of ()'s.

I agree with this. At the moment, boxcar and Horne extraction could be implemented as functions rather than classes. If there are good reasons to keep them as classes (anyone?), we could offer simple wrapper functions (e.g., boxcar_extract and horne_extract) for those who prefer not to use an object-oriented approach. Otherwise, we could convert them into functions, though I'd prefer not to do this before v2 to avoid breaking the API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants