[v3] zarr 3 fails to create child groups #2228

Closed
jhamman opened this issue Sep 24, 2024 · 4 comments · Fixed by #2262
Labels
bug Potential issues with the zarr-python library

Comments

@jhamman
Member

jhamman commented Sep 24, 2024

Zarr version

3.0.0.alpha5

Numcodecs version

NA

Python Version

3.11

Operating System

Mac

Installation

pip

Description

The v3 branch is missing some important behavior around the implicit creation of sub-nodes. In 2.x, this behavior existed:

In [1]: import zarr

In [2]: store = {}

In [3]: zarr.create(shape=(10, 10), store=store, path='foo/bar/spam')
Out[3]: <zarr.core.Array '/foo/bar/spam' (10, 10) float64>

In [4]: list(store)
Out[4]: ['.zgroup', 'foo/.zgroup', 'foo/bar/.zgroup', 'foo/bar/spam/.zarray']

In [5]: g = zarr.open_group(store=store)

In [6]: g
Out[6]: <zarr.hierarchy.Group '/'>

Note the implicit creation of the root group and the parent groups foo and foo/bar; foo/bar/spam itself is the array.

Steps to reproduce

In [1]: import zarr

In [2]: store = await zarr.store.MemoryStore.open(mode='w')

In [3]: zarr.create(shape=(10, 10), store=store, path='foo/bar/spam')
Out[3]: <Array memory://4841759424/foo/bar/spam shape=(10, 10) dtype=float64>

In [4]: list(store._store_dict)
Out[4]: ['foo/bar/spam/zarr.json']

Note that zarr failed to create zarr.json objects for any of the parent groups.
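For reference, a minimal sketch of which metadata keys the 2.x behavior would imply for this path. The helper `parent_group_keys` is hypothetical and not part of zarr-python; it only assumes that each group is marked by a metadata document named `zarr.json` (v3) or `.zgroup` (v2) under its path.

```python
def parent_group_keys(path: str, metadata_name: str = "zarr.json") -> list[str]:
    """Return the metadata keys for the root group and every ancestor of `path`.

    Hypothetical helper for illustration; not a zarr-python function.
    """
    parts = [p for p in path.strip("/").split("/") if p]
    keys = [metadata_name]  # the root group
    # Every proper prefix of the path is a parent group; the last part is the array.
    for i in range(1, len(parts)):
        keys.append("/".join(parts[:i]) + "/" + metadata_name)
    return keys


print(parent_group_keys("foo/bar/spam"))
# ['zarr.json', 'foo/zarr.json', 'foo/bar/zarr.json']
```

None of these three keys appear in the store listing above.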

Additional output

No response

@jhamman jhamman added the bug Potential issues with the zarr-python library label Sep 24, 2024
@TomAugspurger
Contributor

I'm starting to look at this, since it's causing some failures in xarray.

We'll need to think through how best to do this. I'd like to avoid having to list the keys already in the store to discover whether or not we need to set them.

I think the best option, if we can implement it, is some kind of setdefault-like operation on stores that sets a value only if it doesn't exist (and doesn't error if it does exist).
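A rough sketch of what such an operation could look like, and how it would be used to fill in parent groups without listing the store. `ToyStore`, `set_if_not_exists`, and `ensure_parent_groups` are made-up names for illustration, not the zarr-python Store API, and the group metadata document is reduced to the two fields needed to identify a v3 group.

```python
import json

# Minimal v3 group metadata document (attributes omitted for brevity).
GROUP_METADATA = json.dumps({"zarr_format": 3, "node_type": "group"}).encode()


class ToyStore:
    """Dict-backed stand-in for a store; hypothetical, not zarr's Store class."""

    def __init__(self) -> None:
        self.data: dict[str, bytes] = {}

    def set_if_not_exists(self, key: str, value: bytes) -> None:
        # Write only when the key is absent. An existing key is left
        # untouched, and that is not treated as an error.
        if key not in self.data:
            self.data[key] = value


def ensure_parent_groups(store: ToyStore, path: str) -> None:
    """Write group metadata for the root and each parent of `path`, if missing."""
    parts = path.strip("/").split("/")
    for i in range(len(parts)):
        prefix = "/".join(parts[:i])
        key = f"{prefix}/zarr.json" if prefix else "zarr.json"
        store.set_if_not_exists(key, GROUP_METADATA)


store = ToyStore()
store.data["foo/zarr.json"] = b"existing"  # pretend this group already exists
ensure_parent_groups(store, "foo/bar/spam")
print(sorted(store.data))
```

The caller never needs to know which groups already exist; existing metadata such as `foo/zarr.json` is preserved.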

@jhamman
Member Author

jhamman commented Sep 26, 2024

@TomAugspurger - @d-v-b also started on this today. You two should coordinate and perhaps compare notes / PR reviews.

@d-v-b
Contributor

d-v-b commented Sep 26, 2024

Tom is basically doing a better version of what I did, so I might just rebase my branch off this :)

edit: I made the comment above when I mistakenly thought this was the "store.with_mode" discussion. As for intermediate groups, from what I can tell Tom and I are approaching this from different angles, so it should be useful to compare and sync our approaches. I will link to my branch shortly once I push my latest commits.

@TomAugspurger
Contributor

I've started on this at https://github.com/zarr-developers/zarr-python/compare/v3...TomAugspurger:zarr-python:fix/intermediates?expand=1, and am slightly worried about adding a setdefault style method to the Store class.

I was able to do atomic writes for MemoryStore and LocalStore without issue I think. Just use .setdefault for MemoryStore and write the bytes with mode=wx for LocalStore (and catch the error if it's raised).
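The two approaches can be sketched roughly as follows. This is not the code from the linked branch: the function names are hypothetical, a plain dict stands in for MemoryStore, and a directory stands in for LocalStore. In Python's built-in `open`, exclusive creation is spelled `"x"` (here `"xb"` for bytes).

```python
import os
import tempfile


def memory_set_if_absent(data: dict[str, bytes], key: str, value: bytes) -> None:
    # dict.setdefault inserts the value only when the key is missing.
    data.setdefault(key, value)


def local_set_if_absent(root: str, key: str, value: bytes) -> None:
    path = os.path.join(root, *key.split("/"))
    os.makedirs(os.path.dirname(path), exist_ok=True)
    try:
        # "x" requests exclusive creation: open() fails if the file exists.
        with open(path, "xb") as f:
            f.write(value)
    except FileExistsError:
        pass  # someone else already wrote this key; keep their value


data: dict[str, bytes] = {}
memory_set_if_absent(data, "foo/zarr.json", b"first")
memory_set_if_absent(data, "foo/zarr.json", b"second")  # no effect

with tempfile.TemporaryDirectory() as root:
    local_set_if_absent(root, "foo/zarr.json", b"first")
    local_set_if_absent(root, "foo/zarr.json", b"second")  # no effect
    with open(os.path.join(root, "foo", "zarr.json"), "rb") as f:
        on_disk = f.read()

print(data["foo/zarr.json"], on_disk)
```

In both cases the first writer wins and later calls leave the stored value alone.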

I didn't immediately find a good way to do that with fsspec. At least on Azure there's an overwrite option to the write_blob that could be used. I'm not sure about other filesystems, and I didn't see it documented in fsspec as a behavior that's supposed to be consistent across implementations.

Checking file.exists() and then doing the set is maybe an option, but that has a race condition between multiple writers.
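To make the race concrete, here is a small simulation with the interleaving written out by hand rather than using real threads. The writers "A" and "B" and the dict-backed store are illustrative assumptions only.

```python
store: dict[str, bytes] = {}
key = "foo/zarr.json"


def exists(k: str) -> bool:
    return k in store


# Both writers run the existence check before either one runs the set.
a_sees_missing = not exists(key)
b_sees_missing = not exists(key)

if a_sees_missing:
    store[key] = b"written by A"
if b_sees_missing:
    store[key] = b"written by B"  # silently overwrites A's value

print(store[key])
```

With check-then-set, the last writer wins even though both believed the key was absent. An operation that combines the check and the write in a single step avoids this.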

3 participants