Virtual datasets from Zarr stores #63
Comments
I think this is a good idea, and one I already had a mental issue for anyway 😅 One neat thing this would allow is testing writing to Zarr stores as manifests by round-tripping.
We could use this, and it would certainly be the quickest way, but perhaps we would be better off just writing the function ourselves? Otherwise we would be starting with a Zarr store, opening it using a dependency (kerchunk and fsspec), getting the opened results back as a kerchunk reference dict, then immediately converting that reference dict to I suspect also that if we do it that way (the "zarr-native" way), we might later find that either we can import and use code from |
Actually wait, I think I misunderstood what you were suggesting, @maxrjones. There are two types of Zarr stores we could read byte ranges from:
|
When you say "read the zarr json manually" do you mean using zarr-python? I'm not sure how just loading the .zarray (V2) or zarr.json (V3) would work because it doesn't tell you which chunks are initialized. |
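To illustrate the point about uninitialized chunks, here is a minimal sketch, assuming a local Zarr V2 directory store with the default "." chunk-key separator. The helper names `expected_chunk_keys` and `list_initialized_chunks` are hypothetical and not part of any library; the `path`/`offset`/`length` entry layout is only an assumption about what a manifest entry might hold. The metadata alone gives the full chunk grid, while only a listing of the store shows which chunks were actually written.

```python
import itertools
import math
import os
import re

# Chunk keys in a Zarr V2 directory store look like "0.0", "1.2", ...
# (assuming the default "." dimension separator).
_CHUNK_KEY = re.compile(r"^\d+(\.\d+)*$")


def expected_chunk_keys(zarray):
    """All chunk keys implied by the 'shape' and 'chunks' fields of a .zarray document."""
    counts = [math.ceil(s / c) for s, c in zip(zarray["shape"], zarray["chunks"])]
    grid = itertools.product(*(range(n) for n in counts))
    return [".".join(str(i) for i in idx) for idx in grid]


def list_initialized_chunks(array_dir):
    """Map each chunk file found on disk to a byte range covering the whole file."""
    manifest = {}
    for name in sorted(os.listdir(array_dir)):
        if not _CHUNK_KEY.match(name):
            continue  # skip .zarray, .zattrs and anything else that is not a chunk
        path = os.path.join(array_dir, name)
        manifest[name] = {"path": path, "offset": 0, "length": os.path.getsize(path)}
    return manifest
```

Comparing `expected_chunk_keys(zarray)` with the keys returned by `list_initialized_chunks(array_dir)` shows which chunks are missing; for a remote store the directory listing would have to be replaced by the store's own listing call.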
Oops, didn't notice your comment before posting my last response.
I was referring to this case, which seems simpler to implement right now. Although (2) is also important. |
Yep! If you want to make a PR for case (1) then go for it!
Yeah, but maybe I'll just add that in as part of #45. |
@maxrjones - can you expand on this? I see these as distinct features. Consolidated metadata rolls the group/array docs up to a single json, whereas the manifests concept covers the key mappings for individual arrays. |
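The distinction can be sketched with plain dictionaries. This is only an illustration: the array name `air`, the file path, and the byte ranges are made up, the helper `arrays_in` is hypothetical, and the manifest layout is an assumption. The consolidated document follows the Zarr V2 `.zmetadata` layout, which holds metadata documents only, while the per-array manifest holds chunk-key mappings.

```python
# Consolidated metadata (Zarr V2 ".zmetadata"): every group/array metadata
# document rolled up into one JSON object. It contains no chunk keys.
consolidated = {
    "zarr_consolidated_format": 1,
    "metadata": {
        ".zgroup": {"zarr_format": 2},
        "air/.zarray": {
            "zarr_format": 2,
            "shape": [2, 4],
            "chunks": [2, 2],
            "dtype": "<f4",
            "compressor": None,
            "fill_value": 0,
            "order": "C",
            "filters": None,
        },
    },
}

# Per-array manifests: chunk key -> byte range in some file.
# The path and the numbers below are hypothetical.
manifests = {
    "air": {
        "0.0": {"path": "file:///data/air.nc", "offset": 100, "length": 16},
        "0.1": {"path": "file:///data/air.nc", "offset": 116, "length": 16},
    },
}


def arrays_in(consolidated_doc):
    """Names of the arrays described by a consolidated metadata document."""
    suffix = "/.zarray"
    keys = consolidated_doc["metadata"]
    return sorted(k[: -len(suffix)] for k in keys if k.endswith(suffix))
```

Here `arrays_in(consolidated)` lists the arrays the metadata knows about, and `manifests["air"]` says where each of that array's chunks lives, which is information the consolidated document does not carry.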
In my mind the datasets containing multiple manifests served the same purpose as consolidated metadata with the added bonus of including key mappings for individual arrays, and so |
Thanks for this package! Just want to add that I'm interested in this. Started some work on kerchunk to make |
@norlandrhagen and I were discussing creating virtual datasets from zarr stores earlier today (placeholder already in _automatically_determine_filetype). @TomNicholas what are your thoughts on trying out Kerchunk's single_zarr for this purpose? I think this could be a helpful step towards virtual concatenation of Zarr stores and allowing manifests to replace consolidated metadata for V3.