-
Notifications
You must be signed in to change notification settings - Fork 284
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable factory references to create new dimensions on load. #6168
Enable factory references to create new dimensions on load. #6168
Conversation
Now updated with prototype loading-control object
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, just a couple comments.
lib/iris/__init__.py
Outdated
def __init__( | ||
self, | ||
support_multiple_references: bool = False, | ||
multiref_triggers_concatenate: bool = False, | ||
use_concatenate: bool = False, | ||
use_merge: bool = True, | ||
cat_before_merge: bool = False, | ||
repeat_until_done: bool = False, | ||
): | ||
"""Container for loading controls.""" | ||
self.support_multiple_references = support_multiple_references | ||
self.multiref_triggers_concatenate = multiref_triggers_concatenate | ||
self.use_concatenate = use_concatenate | ||
self.use_merge = use_merge | ||
self.cat_before_merge = cat_before_merge | ||
self.repeat_until_done = repeat_until_done | ||
|
||
def __repr__(self): | ||
msg = ( | ||
"LoadPolicy(" | ||
f"support_multiple_references={self.support_multiple_references}, " | ||
f"multiref_triggers_concatenate={self.multiref_triggers_concatenate}, " | ||
f"use_concatenate={self.use_concatenate}, " | ||
f"use_merge={self.use_merge}, " | ||
f"cat_before_merge={self.cat_before_merge}, " | ||
f"repeat_until_done={self.repeat_until_done}" | ||
")" | ||
) | ||
return msg | ||
|
||
def copy(self): | ||
return LoadPolicy(**{key: getattr(self, key) for key in self._allkeys}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think data classes give you some/all of this, and you can still add your own methods in addition.
lib/iris/__init__.py
Outdated
"support_multiple_references", | ||
"multiref_triggers_concatenate", | ||
"use_concatenate", | ||
"use_merge", | ||
"cat_before_merge", | ||
"repeat_until_done", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of giving users more controls to play with, but having 6 booleans is very confusing, so I'm keen to look for ways it could be simpler to understand.
- It doesn't look like
support_multiple_references
is necessary use_concatenate
,use_merge
andcat_before_merge
could become some sort of controllable sequence e.g.{"c", "m"}
/{"m"}
/{"m", "c"}
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, as discussed offline, I think we can simplify this with benefits.
See new strategy proposals, below.
lib/iris/__init__.py
Outdated
LOAD_POLICY_LEGACY = LoadPolicy() | ||
LOAD_POLICY_RECOMMENDED = LoadPolicy( | ||
support_multiple_references=True, multiref_triggers_concatenate=True | ||
) | ||
LOAD_POLICY_COMPREHENSIVE = LoadPolicy( | ||
support_multiple_references=True, use_concatenate=True, repeat_until_done=True | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be simpler to follow if these 'presets' were all part of another object. Could they be class attributes e.g. LoadPolicy.recommended
? It would be good if the only global constant were LOAD_POLICY
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, I don't like this much.
Under review, watch this space!
lib/iris/__init__.py
Outdated
"""Return context manager for temporary options. | ||
|
||
Modifies the given parameters within a context, for the active thread. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could do with more detail and examples - I'm struggling to picture use cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. This is not intended to be complete yet -- still just draft ideas
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've had my rough look around.
I think the user experience is almost there. Seems like a good compromise that still holds the user's hand as much as possible (they don't write the code themselves), while still offering more customisation than currently.
The developer experience feels very complicated. I think this is worth some attention, since the developer experience around loading is already too complicated and I have noticed can be a big barrier for less experienced people. Afraid I don't have any immediate suggestions, only that it's worth some time.
Thanks!
I'm not providing additional docstrings yet, but it's clearly needed (if not a userguide section?) |
You're totally right, and I'm currently planning to remove all the pieces here to a single sub-module "iris.io.loading", which I think will help a bit. However, I want to add some proper testcases before I attempt a lift-and-shift, as otherwise some breakages might go unnoticed. |
Current best ideas for a "new control strategy"Following offline discussion wuith @trexfeathers Just 3 control keywords:
Plus...
|
Progress Update 2024-10-14 :For now I'm deferring the user-API improvements detailed above.
|
3942968
to
e135573
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## FEATURE_load_policy #6168 +/- ##
=======================================================
+ Coverage 89.82% 89.86% +0.04%
=======================================================
Files 88 88
Lines 23185 23277 +92
Branches 4314 4333 +19
=======================================================
+ Hits 20825 20917 +92
- Misses 1628 1629 +1
+ Partials 732 731 -1 ☔ View full report in Codecov by Sentry. |
… in load;load_cube/load_cubes.
f7ddb65
to
0f1bae3
Compare
0f1bae3
to
801f9e2
Compare
lib/iris/tests/integration/varying_references/test_roundtrip_time_varying_references.py
Show resolved
Hide resolved
Status 2024-10-23 @1615 :Essentials from previous comments still to resolve :
We think that unresolved comments by @trexfeathers are all either now done, or not important to fix |
52c2fc4
to
03f0d2a
Compare
Hi @stephenworsley thanks for all your patience ! |
⏱️ Performance Benchmark Report: 58decfdPerformance shifts
Full benchmark results
Generated by GHA run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job! I believe this is in a releasable state now. There's a small bump in some of our loading benchmarks, but I think that is to be expected given our expanded functionality. Looking at the scaling within the benchmarks, it looks to be more overhead than an actual concern. The remaining work looks seperable, I don't believe anything here is functionally incomplete.
27359ee
into
SciTools:FEATURE_load_policy
…6194) * Enable factory references to create new dimensions on load. * Skip hanging tests. * Skip more hanging tests. * Adjust misleading comment. * Add policy control and auto-detect. NOTE: for now only load, not load_cubes/load_cube * Add temporary testcode. NB no actual test, just printout. * Replaced _CubeFilterCollection.merged() with combined(); replace uses in load;load_cube/load_cubes. * Fix licence header * Fix to handle empty reference correctly. * Fix tests. * Simplify policy options and tidy api. * More documentation of loading options. * Fix doctest. * Fix repeated combination. * Minor docs improvements. * Initial load functions testing (WIP). More * Integration tests for time-varying reference fields. * Fix test result. * Make grib test optional. * Review changes * Reinstate skipped tests. * Make combine_cubes work with plain lists; Make 'combine_cubes' private API. * Add tests for combine_cubes. * Add tests for LoadPolicy API. * Add special split-netcdf roundtrip testing. * Removed unwanted 'policy' keyword from iris.load . * Make LoadPolicy examples more consistent. * Review changes : documentation improvements. * Doctest fix * fix controlling_merge docs * LOAD_POLICY uses 'default' settings by default. * Various quick fixes to legacy tests. * Added whatsnew. --------- Co-authored-by: Patrick Peglar <[email protected]>
* upstream/main: (194 commits) Restore latest Whats New files. [pre-commit.ci] pre-commit autoupdate (SciTools#6205) correct major minor in whatsnew (SciTools#6202) What's new updates for v3.11.0rc0 . (SciTools#6201) Update CF standard names table. (SciTools#6200) Specify meta in dask.array.map_blocks (SciTools#5989) added in a vertical rule for surface fields (SciTools#5734) Updated environment lockfiles (SciTools#6197) Reduce duplication in cf.py spanning checks. (SciTools#6196) Fix attribute array comparison (SciTools#6181) Enable factory references to create new dimensions on load. (SciTools#6168) (SciTools#6194) Improve handling of masks in concatenate (SciTools#6187) Demo Numpy v2 (SciTools#6035) Bump scitools/workflows from 2024.10.1 to 2024.10.2 (SciTools#6186) Document use of new_axis to control merge (SciTools#6180) Updated environment lockfiles (SciTools#6184) [pre-commit.ci] pre-commit autoupdate (SciTools#6175) Updated environment lockfiles (SciTools#6183) Update to `cell_method` parsing (SciTools#6083) Bump scitools/workflows from 2024.10.0 to 2024.10.1 (SciTools#6179) ...
Relates to #5369 , #6165