Commit
Non-kerchunk backend for HDF5/netcdf4 files. (#87)
* Generate chunk manifest backed variable from HDF5 dataset.
* Transfer dataset attrs to variable.
* Get virtual variables dict from HDF5 file.
* Update virtual_vars_from_hdf to use fsspec and drop_variables arg.
* mypy fix to use ChunkKey and empty dimensions list.
* Extract attributes from hdf5 root group.
* Use hdf reader for netcdf4 files.
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* Fix ruff complaints.
* First steps for handling HDF5 filters.
* Initial step for hdf5plugin supported codecs.
* Small commit to check compression support in CI environment.
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* Fix mypy complaints for hdf_filters.
* Local pre-commit fix for hdf_filters.
* Use fsspec reader_options introduced in #37.
* Fix incorrect zarr_v3 if block position from merge commit ef0d7a8.
* Fix early return from hdf _extract_attrs.
* Test that _extract_attrs correctly handles multiple attributes.
* Initial attempt at scale and offset via numcodecs.
* Tests for cfcodec_from_dataset.
* Temporarily relax integration tests to assert_allclose.
* Add blosc_lz4 fixture parameterization to confirm libnetcdf environment.
* Check for compatibility with netcdf4 engine.
* Use separate fixtures for h5netcdf and netcdf4 compression styles.
* Print libhdf5 and libnetcdf4 versions to confirm compiled environment.
* Skip netcdf4 style compression tests when libhdf5 < 1.14.
* Include imagecodecs.numcodecs to support HDF5 lzf filters.
* Remove test that verifies call to read_kerchunk_references_from_file.
* Add additional codec support structures for imagecodecs and numcodecs.
* Add codec config test for Zstd.
* Include initial cf decoding tests.
* Revert typo for scale_factor retrieval.
* Update reader to use new numpy manifest representation.
* Temporarily skip test until blosc netcdf4 issue is solved.
* Fix Pydantic 2 migration warnings.
* Include hdf5plugin and imagecodecs-numcodecs in mamba test environment.
* Mamba attempt with imagecodecs rather than imagecodecs-numcodecs.
* Mamba attempt with latest imagecodecs release.
* Use correct iter_chunks callback function signature.
* Include pip based imagecodecs-numcodecs until conda-forge availability.
* Handle non-coordinate dims which are serialized to hdf as empty dataset.
* Use reader_options for filetype check and update failing kerchunk call.
* Fix chunkmanifest shaping for chunked datasets.
* Handle scale_factor attribute serialization for compressed files.
* Include chunked roundtrip fixture.
* Standardize xarray integration tests for hdf filters.
* Update reader selection logic for new filetype determination.
* Use decode_times for integration test.
* Standardize fixture names for hdf5 vs netcdf4 file types.
* Handle array add_offset property for compressed data.
* Include h5py shuffle filter.
* Make ScaleAndOffset codec last in filters list.
* Apply ScaleAndOffset codec to _FillValue since its value is now downstream.
* Coerce scale and add_offset values to native float for JSON serialization.
* Temporarily xfail integration tests for main.
* Remove pydantic dependency as per pull/210.
* Update test for new kerchunk reader module location.
* Fix branch typing errors.
* Re-include automatic file type determination.
* Handle various hdf flavors of _FillValue storage.
* Include loadable variables in drop variables list.
* Mock readers.hdf.virtual_vars_from_hdf to verify option passing.
* Convert numpy _FillValue to native Python for serialization support.
* Support groups with HDF5 reader.
* Handle empty variables with a shape.
* Import top-level version of xarray classes.
* Add option to explicitly specify use of an experimental hdf backend.
* Include imagecodecs and hdf5plugin in all CI environments.
* Add test_hdf_integration tests to be skipped for non-kerchunk env.
* Include imagecodecs in dependencies.
* Diagnose imagecodecs-numcodecs installation failures in CI.
* Ignore mypy complaints for VirtualBackend.
* Remove checksum assert which varies across different zstd versions.
* Temporarily xfail integration tests with coordinate inconsistency.
* Remove backend arg for non-hdf network file tests.
* Fix mypy comment moved by ruff formatting.
* Make HDF reader dependencies optional.
* Handle optional imagecodecs and hdf5plugin dependency imports for tests.
* Prevent conflicts with explicit filetype and backend args.
* Correctly convert root coordinate attributes to a list.
* Clarify that method extracts attrs from any specified group.
* Restructure hdf reader and codec filters into a module namespace.
* Improve docstrings for hdf and filter modules.
* Explicitly specify HDF5VirtualBackend for test parameter.
* Include issue references for xfailed tests.
* Use soft import strategy for optional dependencies, see xarray/issues/9554.
* Handle mypy for soft imports.
* Attempt at nested optional dependency usage.
* Handle use of soft import sub modules for typing.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
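
For readers unfamiliar with the "soft import" pattern mentioned in the last few entries, the idea can be sketched roughly as below. This is an illustrative sketch only, not the code added by this commit: the helper name `soft_import`, its parameters, and the error message are assumptions made for the example, and it uses only the standard library.

```python
import importlib
from types import ModuleType
from typing import Optional


def soft_import(name: str, purpose: str, strict: bool = True) -> Optional[ModuleType]:
    """Import an optional dependency by name (hypothetical helper).

    If the module is missing, either raise an ImportError that says what
    the module is needed for (strict=True) or return None (strict=False).
    """
    try:
        return importlib.import_module(name)
    except ImportError:
        if strict:
            raise ImportError(
                f"The '{name}' package is required for {purpose}; "
                "install it to use this feature."
            )
        return None


# An installed module is returned as usual.
json_module = soft_import("json", "serializing attributes")

# A missing optional module yields None when strict=False, so callers
# (for example tests) can skip the feature instead of failing at import time.
missing = soft_import("surely_not_an_installed_module_xyz", "demo codecs", strict=False)
```

With a helper of this shape, a module that optionally needs packages such as hdf5plugin or imagecodecs can defer the failure until the feature is actually used, rather than failing when the package is first imported.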