Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Function to rechunk single variables or batch variables from existing MDIO (#368)

* Improve MDIO API with accessor modes and rechunk functionality

  The MDIO API has been enhanced with support for additional file operation modes ('w' for rechunking) and a new rechunking feature. The 'copy_mdio' function now accepts strongly typed arguments, and 'rechunk' functions have been added to efficiently resize chunks for large datasets, with progress tracking and error handling improvements.

* Add usage examples to rechunk functions

  Expanded the docstrings in `convenience.py` to include examples illustrating how to use `rechunk` and `rechunk_batch` functions for clarity and ease of use for developers.

* Add convenience functions section to docs

  The new section "Convenience Functions" has been added to the reference documentation. It specifically includes documentation for the `mdio.api.convenience` module, excluding `create_rechunk_plan` and `write_rechunked_values`.

* Add optional compressor parameter to rechunk functions

  The rechunk operations in the MDIO API now accept an optional compressor parameter, allowing users to specify a custom data compression codec. The default compressor, Blosc('zstd'), is set if none is provided, ensuring backward compatibility.

* Add rechunk function TODO comment

  Inserted a TODO comment in `rechunk` function for writing tests, referencing the relevant issue.

* Refactor buffer size and improve documentation

  Removed an extraneous newline and introduced a constant MAX_BUFFER to handle buffer size for chunking. Updated the create_rechunk_plan function's docstring to include the buffer size details, making it clearer how the buffer size can be adjusted by altering the MAX_BUFFER variable. This change enhances code readability and maintainability.

* Add rechunking optimization demo

  Added a new demo `rechunking.ipynb` to demonstrate how to optimize access patterns using rechunking and lossy compression. The notebook includes detailed steps and code snippets to create optimized, compressed copies for different access patterns, enhancing read performance.

* Refactor notebook: reset execution counts and tidy metadata

  Streamlined the Jupyter notebook by resetting execution counts and cleaning up metadata fields. This provides a fresh state for the execution environment and a more structured document for other developers to follow.

* Update rechunking notebook with minor tweaks

  Corrected the reference from 'notebook' to 'page', added a parenthetical clarification to a section heading, and updated the performance benchmark outputs. These changes improve document clarity and provide the latest performance metrics.
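The commit message says that rechunking optimizes access patterns. The sketch below illustrates why chunk shape matters for read performance. It deliberately does not call the MDIO API: the function name `chunks_touched`, the volume shape, and both chunk shapes are hypothetical values chosen for illustration, and the data is assumed to be uncompressed 4-byte samples.

```python
from math import prod


def chunks_touched(selection, chunks):
    """Count the chunks overlapped by a rectangular selection.

    selection: one (start, stop) pair per dimension, stop exclusive.
    chunks: chunk size per dimension.
    """
    counts = []
    for (start, stop), size in zip(selection, chunks):
        first = start // size          # index of the first chunk hit
        last = (stop - 1) // size      # index of the last chunk hit
        counts.append(last - first + 1)
    return prod(counts)


# Hypothetical 512 x 512 x 512 volume; read a single slice along axis 0.
single_slice = [(100, 101), (0, 512), (0, 512)]

balanced = (64, 64, 64)      # assumed general-purpose chunking
fast_axis0 = (4, 512, 512)   # assumed chunking tuned for axis-0 slices

ITEMSIZE = 4  # bytes per sample, assuming float32

for name, chunks in [("balanced", balanced), ("fast_axis0", fast_axis0)]:
    n = chunks_touched(single_slice, chunks)
    bytes_read = n * prod(chunks) * ITEMSIZE
    print(f"{name}: {n} chunks, {bytes_read / 2**20:.0f} MiB read")
```

Under these assumptions the balanced layout reads 64 chunks (64 MiB before compression) to serve one slice, while the layout tuned for that axis reads a single 4 MiB chunk. Keeping an extra copy of the data with a different chunk shape trades storage for this kind of read speed-up.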
Showing 5 changed files with 742 additions and 20 deletions.