refactor(convert): Full refactor for the usage of disk during memory expansion [all tests ci] #1185
This is an initial full refactoring of the swap feature. Changes include the removal of Parsed2Zarr objects and their downstream implementation in favor of writing zarr files directly during data parsing within the Parser object.
Codecov Report
Coverage Diff
              dev    #1185     +/-
Coverage   77.80%   83.13%  +5.32%
Files          66       63      -3
Lines        6002     5710    -292
Hits         4670     4747     +77
Misses       1332      963    -369
... and 1 file with indirect coverage changes
Added an 'expanded_data_shapes' property to the parse base class to compute the expansion shapes. Additionally, moved the determination of 'use_swap' into the rectangularization step. Re-activated the 'auto' keyword for destination_path.
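As a rough illustration of the idea in this commit (not echopype's actual code), the sketch below computes the rectangular shape that ragged pings expand to and then decides whether to write to swap based on the estimated size. The function names, the `max_bytes` threshold, and the 8-byte item size are all assumptions made for this example.

```python
import math


def expanded_shape(pings):
    """Return (n_pings, max_samples) for a list of variable-length pings."""
    n_pings = len(pings)
    max_samples = max((len(p) for p in pings), default=0)
    return (n_pings, max_samples)


def should_use_swap(shape, itemsize=8, max_bytes=1_000_000):
    """Decide whether the padded array is too large to hold in memory.

    max_bytes is a hypothetical threshold; a real implementation would
    likely derive it from the memory available on the machine.
    """
    expanded_bytes = math.prod(shape) * itemsize
    return expanded_bytes > max_bytes


def rectangularize(pings, fill=float("nan")):
    """Pad every ping with `fill` so that all rows have the same length."""
    _, max_samples = expanded_shape(pings)
    return [list(p) + [fill] * (max_samples - len(p)) for p in pings]


pings = [[1.0, 2.0, 3.0], [4.0], [5.0, 6.0]]
shape = expanded_shape(pings)      # longest ping sets the second dimension
use_swap = should_use_swap(shape)  # 3 * 3 * 8 bytes is far below the threshold
padded = rectangularize(pings)
```

Computing the shape first means the swap decision can be made before any padded array is allocated, which appears to be the point of moving the 'use_swap' determination into rectangularization.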
By default, when the destination path is None, the program now uses the built-in tempfile module. This ensures that swap files can be cleaned up by the operating system. Introduced a new global variable in 'io' called 'ECHOPYPE_TEMP_DIR' that specifies the path to the temporary 'echopype' directory within the OS temp directory; it is initialized on echopype import. Additionally, moved the I/O functions related to swap files to the higher-level 'io' module in 'utils'.
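The default described here can be sketched with the standard library alone. This is a minimal sketch, assuming a module-level path built from `tempfile.gettempdir()`; the helper `make_swap_dir` and the `swap_` prefix are hypothetical and are not taken from echopype.

```python
import tempfile
from pathlib import Path

# Hypothetical stand-in for the module-level variable named in the commit
# message: an "echopype" directory inside the OS temp directory.
ECHOPYPE_TEMP_DIR = Path(tempfile.gettempdir()) / "echopype"


def make_swap_dir(destination_path=None):
    """Return a directory in which swap files can be written.

    If destination_path is None, a fresh uniquely named directory is
    created under ECHOPYPE_TEMP_DIR, so the OS can reclaim it along with
    the rest of its temp directory. Otherwise the given path is used.
    """
    if destination_path is not None:
        swap_dir = Path(destination_path)
        swap_dir.mkdir(parents=True, exist_ok=True)
        return swap_dir
    ECHOPYPE_TEMP_DIR.mkdir(parents=True, exist_ok=True)
    return Path(tempfile.mkdtemp(prefix="swap_", dir=ECHOPYPE_TEMP_DIR))


swap_dir = make_swap_dir()
```

Each call with no argument yields a new directory, so concurrent conversions in this sketch would not write into the same swap location.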
Too complex at this moment... tabling this.
Add a check for ping_time dimension shape and ensure that the written zarr array has the same shape so that it doesn't fail at set_beam.
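One way such a check could look is sketched below with plain Python lists. It is only an assumption about the approach: the helper name `match_ping_time` and the choice to pad missing rows with NaN are made up for illustration and may differ from what the PR actually does.

```python
def match_ping_time(rows, n_ping_time, fill=float("nan")):
    """Return rows padded so the first axis has exactly n_ping_time entries.

    Raises ValueError if there are more rows than ping_time values, since
    silently dropping data would hide a parsing problem.
    """
    if len(rows) > n_ping_time:
        raise ValueError(
            f"got {len(rows)} rows but ping_time has length {n_ping_time}"
        )
    width = len(rows[0]) if rows else 0
    missing = n_ping_time - len(rows)
    # Append all-fill rows so the array lines up with the ping_time dimension.
    return [list(r) for r in rows] + [[fill] * width for _ in range(missing)]


data = [[1.0, 2.0], [3.0, 4.0]]
aligned = match_ping_time(data, n_ping_time=3)
```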
Moved common generator utility functions to the testing module so that they can be accessed anywhere to create fixtures. Additionally, set up rectangularize_data tests.
Hey @lsetiawan : THANK YOU! This PR looks great! I see that you've resolved most of my comments yesterday. The remaining ones are pretty small. Also awesome to see the use of mocker in the tests -- it sets a couple great examples to add more tests to cover parser and set_groupers in the future.
My only significant comment is the one related to the reshaping of complex variables. It is a small change, but since this PR is already very large, and touching the parser usually takes extra care, I think it's better to take it in another PR. We have a couple tests that test for the actual parsed values that would be good for catching any errors.
Co-authored-by: Wu-Jung Lee <[email protected]>
@leewujung Thanks for this extensive review! I'm glad that the review went well and that you're satisfied with the changes 😄 Regarding your last comment:
Would you be able to create a new issue for this so that it can be tracked? Thanks!
Nvm, just quickly did this. See #1213
Overview
This PR does a full refactor of the "swap" functionality by largely removing the usage of Parsed2Zarr objects in favor of a simpler, direct use of zarr library features to create arrays.
Issues Resolved
parsed2zarr mechanism #1179
'auto' option to kwarg offload_to_zarr in open_raw #782