KeyError: '.zarray' with HDF5 data #159
Comments
This file has nested groups, which we don't support yet. See #84. The error is because the current code for parsing kerchunk references essentially assumes that each group is an array (i.e. that there is only one group - the root). It errors when it doesn't find the array metadata. |
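The failure mode described above can be reproduced in plain Python. This is a minimal sketch, not VirtualiZarr's actual parsing code: the reference keys, the file name `file.h5`, and both helper functions are hypothetical, and only the inner `refs` mapping of a kerchunk-style reference set is modelled.

```python
import json

# Hypothetical kerchunk-style references for a file with one nested group.
# Keys are Zarr v2 store paths; chunk entries are [path, offset, length].
refs = {
    ".zgroup": '{"zarr_format": 2}',
    "science/.zgroup": '{"zarr_format": 2}',
    "science/temperature/.zarray": '{"shape": [10], "chunks": [10], "dtype": "<f8"}',
    "science/temperature/0": ["file.h5", 4096, 80],
}


def collect_by_top_level_name(refs):
    """Bucket reference keys by their first path component."""
    buckets = {}
    for key, value in refs.items():
        if key.startswith("."):
            continue  # root-level metadata such as .zgroup or .zattrs
        name, _, rest = key.partition("/")
        buckets.setdefault(name, {})[rest] = value
    return buckets


def parse_as_flat_arrays(refs):
    """Naive parser that assumes every top-level name is an array (illustrative only)."""
    arrays = {}
    for name, entries in collect_by_top_level_name(refs).items():
        # Raises KeyError when `name` is a group rather than an array,
        # because a group has '.zgroup' but no '.zarray' of its own.
        arrays[name] = json.loads(entries[".zarray"])
    return arrays


missing_key = None
try:
    parse_as_flat_arrays(refs)
except KeyError as err:
    missing_key = err.args[0]
    print(f"KeyError: {missing_key!r}")
```

Here `science` is a group, so its array metadata sits one level deeper at `science/temperature/.zarray`, and the lookup at the top level fails.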
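It's surprisingly hard to find HDF5 files out in the wild that don't have groups! If you know of any for testing let me know @TomNicholas, otherwise it seems like the group kwarg open_virtual_dataset(..., group='xyz') would be a good solution. Often the groups are unrelated enough that you really only want to work with one anyways. |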
Are you looking for HDF5 files that are not netCDF4 files? Because we could
just point to the same file that we open as xarray's tutorial dataset?
We do need the group kwarg anyway. I think Ayush's DMR++ PR adds one for
that particular backend.
On Tue, Jun 25, 2024, 6:16 PM Scott Henderson wrote:
It's surprisingly hard to find HDF5 files out in the wild that don't have
groups! If you know of any for testing let me know @TomNicholas,
otherwise it seems like the group kwarg
open_virtual_dataset(..., group='xyz') would be a good solution.
Often the groups are unrelated enough that you really only want to work
with one anyways.
|
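One way to picture what a `group` selection would do, as proposed in this thread, is to restrict the references to a single group before parsing them. The sketch below is an assumption-laden illustration, not the library's implementation: `select_group`, the reference keys, and `file.h5` are all hypothetical.

```python
# Hypothetical kerchunk-style references for a file with one nested group.
refs = {
    ".zgroup": '{"zarr_format": 2}',
    "science/.zgroup": '{"zarr_format": 2}',
    "science/temperature/.zarray": '{"shape": [10], "chunks": [10], "dtype": "<f8"}',
    "science/temperature/0": ["file.h5", 4096, 80],
}


def select_group(refs, group):
    """Keep only the references under `group` and strip the group prefix.

    Hypothetical helper: after this step the chosen group looks like a root,
    so a parser that expects arrays directly under the root can handle it.
    """
    prefix = group.strip("/") + "/"
    return {
        key[len(prefix):]: value
        for key, value in refs.items()
        if key.startswith(prefix)
    }


group_refs = select_group(refs, "science")
print(sorted(group_refs))
```

After the selection, `temperature/.zarray` sits directly under the new root, which is the layout the flat-array assumption expects.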
Correct, I was. From the netCDF Docs: "Some HDF5 features not supported in netCDF-4 format include non-hierarchical group structures, HDF5 reference types, multiple links to a data object, user-defined atomic data types, stored property lists, more permissive rules for data object names, the HDF5 date/time type, and attributes associated with user-defined types." Although, taking a step back, I don't think it's worth seeking out examples of HDF5 that is not netCDF4 :) I'd guess many of the .h5 files out there actually conform to the netCDF4 subset and could've just as easily been named .nc! Feel free to close this one as a duplicate of #84 |
Just wanted to +1 this issue. I ran into it today while also trying to use VirtualiZarr on a netCDF file with multiple groups. |
@forrestfwilliams I'm happy to review a PR! Adding a |
I'm getting a KeyError Traceback for several different HDF5 files.
For example, trying to use this test file from kerchunk:
https://github.com/fsspec/kerchunk/blob/ae692fead51a216691e4db9a67c99194c5ba8e14/kerchunk/tests/test_hdf.py#L307