This repository has been archived by the owner on Jan 14, 2025. It is now read-only.

Calc nobs divisions #288

Merged
merged 4 commits into from
Nov 15, 2023


Collaborator

@dougbrn dougbrn commented Nov 14, 2023

Addresses #287. Makes calc_nobs use map_partitions instead of groupby to propagate metadata when divisions are known. This introduces the annoying cost of having to calculate the unique bands for the resulting meta in the by_band=True case. It's slow, but it works for now.
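The idea described above can be illustrated without Dask. The following is a minimal sketch, not the PR's implementation: plain pandas frames stand in for Dask partitions, and the table layout, the `id` index name, and the helper names `count_nobs` and `count_nobs_by_band` are assumptions made for illustration. It shows that when divisions are known (each object id lives in exactly one partition), per-partition counts can simply be concatenated, and why the by-band case needs the full list of bands up front so that every partition produces the same columns.

```python
import pandas as pd

# Hypothetical source table: one row per observation, indexed by object id.
source = pd.DataFrame(
    {"band": ["g", "r", "g", "r", "r"]},
    index=pd.Index([1, 1, 2, 3, 3], name="id"),
)

# Stand-in for partitions with known divisions: the index is sorted and
# every id falls in exactly one partition.
partitions = [source.loc[1:2], source.loc[3:3]]


def count_nobs(part: pd.DataFrame) -> pd.Series:
    """Count observations per object within a single partition."""
    return part.groupby(level="id").size().rename("nobs_total")


def count_nobs_by_band(part: pd.DataFrame, bands: list) -> pd.DataFrame:
    """Count observations per object and band, with a fixed set of columns."""
    counts = part.groupby(["id", "band"]).size().unstack(fill_value=0)
    # A partition may be missing a band entirely, so force the full column set.
    return counts.reindex(columns=bands, fill_value=0)


# Per-partition path: no data has to move between partitions.
per_partition = pd.concat([count_nobs(p) for p in partitions])

# Global groupby path, for comparison.
global_counts = count_nobs(source)
assert per_partition.equals(global_counts)

# The by-band output needs every band known in advance; this is the extra pass.
bands = sorted(source["band"].unique())
by_band = pd.concat([count_nobs_by_band(p, bands) for p in partitions])

print(per_partition.to_dict())
print(by_band.to_dict("index"))
```

In this sketch the second partition contains only "r" observations, so without the precomputed band list its output would lack a "g" column and would not match the first partition's layout.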

Collaborator Author

dougbrn commented Nov 14, 2023

Pinging both @wilsonbb for a general technical review and @nevencaplar to make sure this works for his use case.

@dougbrn dougbrn marked this pull request as ready for review November 14, 2023 22:24

codecov bot commented Nov 14, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (1082730) 94.01% compared to head (55ee6b7) 94.02%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #288      +/-   ##
==========================================
+ Coverage   94.01%   94.02%   +0.01%     
==========================================
  Files          23       23              
  Lines        1203     1206       +3     
==========================================
+ Hits         1131     1134       +3     
  Misses         72       72              


Collaborator

@wilsonbb wilsonbb left a comment


As discussed offline, we should file an issue for adding a multi-partition data fixture to our testing suite. This would hopefully be more robust at catching issues like these.

Ultimately, looks good to me!

band_col = self._band_col

# Get the band metadata
unq_bands = np.unique(self._source[band_col])

Nit: I'm not fully certain about the performance differences, but I feel self._source[band_col].unique() might be preferable and slightly more readable.
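One behavioral difference between the two spellings is worth keeping in mind. The sketch below uses a plain pandas Series as a stand-in; `self._source` in the PR is presumably a Dask collection, where `.unique()` is lazy and would still need to be computed, so this is an illustration of the pandas semantics only and not a statement about the PR's code. `np.unique` returns sorted values, while `Series.unique()` returns values in order of first appearance, which matters if the result determines column order in the meta.

```python
import numpy as np
import pandas as pd

# Hypothetical band column; the values and their order are made up.
band_values = pd.Series(["r", "g", "r", "i", "g"], name="band")

# np.unique sorts its result.
sorted_bands = list(np.unique(band_values))

# Series.unique keeps the order in which values first appear.
ordered_bands = list(band_values.unique())

print(sorted_bands)
print(ordered_bands)
```

If a stable column order is needed regardless of row order, sorting the result of `.unique()` gives the same values as `np.unique` in this example.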

@nevencaplar
Member

I have verified that this solves issue #287.

@dougbrn dougbrn merged commit 1d97c55 into main Nov 15, 2023
9 checks passed
@dougbrn dougbrn deleted the calc_nobs_divisions branch December 11, 2023 19:26

3 participants