-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Saving a group errors when serializing NaN
in IcechunkStore.set
#133
Comments
We should never consolidate metadata on icechunk. Is this error being raised on the consolidate step or the |
Here's the traceback from the real example. It's during metadata consolidation (which takes data that shows up in each array's ---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[19], line 1
----> 1 ds.to_zarr(store)
File [~/gh/pydata/xarray/xarray/core/dataset.py:2572](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/pydata/xarray/xarray/core/dataset.py#line=2571), in Dataset.to_zarr(self, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options, zarr_version, write_empty_chunks, chunkmanager_store_kwargs)
2417 """Write dataset contents to a zarr group.
2418
2419 Zarr chunks are determined in the following way:
(...)
2568 The I[/O](http://localhost:8890/O) user guide, with more details and examples.
2569 """
2570 from xarray.backends.api import to_zarr
-> 2572 return to_zarr( # type: ignore[call-overload,misc]
2573 self,
2574 store=store,
2575 chunk_store=chunk_store,
2576 storage_options=storage_options,
2577 mode=mode,
2578 synchronizer=synchronizer,
2579 group=group,
2580 encoding=encoding,
2581 compute=compute,
2582 consolidated=consolidated,
2583 append_dim=append_dim,
2584 region=region,
2585 safe_chunks=safe_chunks,
2586 zarr_version=zarr_version,
2587 write_empty_chunks=write_empty_chunks,
2588 chunkmanager_store_kwargs=chunkmanager_store_kwargs,
2589 )
File [~/gh/pydata/xarray/xarray/backends/api.py:1777](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/pydata/xarray/xarray/backends/api.py#line=1776), in to_zarr(dataset, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options, zarr_version, write_empty_chunks, chunkmanager_store_kwargs)
1775 writer = ArrayWriter()
1776 # TODO: figure out how to properly handle unlimited_dims
-> 1777 dump_to_store(dataset, zstore, writer, encoding=encoding)
1778 writes = writer.sync(
1779 compute=compute, chunkmanager_store_kwargs=chunkmanager_store_kwargs
1780 )
1782 if compute:
File [~/gh/pydata/xarray/xarray/backends/api.py:1467](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/pydata/xarray/xarray/backends/api.py#line=1466), in dump_to_store(dataset, store, writer, encoder, encoding, unlimited_dims)
1464 if encoder:
1465 variables, attrs = encoder(variables, attrs)
-> 1467 store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)
File [~/gh/pydata/xarray/xarray/backends/zarr.py:826](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/pydata/xarray/xarray/backends/zarr.py#line=825), in ZarrStore.store(self, variables, attributes, check_encoding_set, writer, unlimited_dims)
823 if _zarr_v3():
824 # https://github.com/zarr-developers/zarr-python/pull/2113#issuecomment-2386718323
825 kwargs["path"] = self.zarr_group.name.lstrip("/")
--> 826 zarr.consolidate_metadata(self.zarr_group.store, **kwargs)
File [~/gh/zarr-developers/zarr-python/src/zarr/api/synchronous.py:46](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/api/synchronous.py#line=45), in consolidate_metadata(*args, **kwargs)
45 def consolidate_metadata(*args: Any, **kwargs: Any) -> Group:
---> 46 return Group(sync(async_api.consolidate_metadata(*args, **kwargs)))
File [~/gh/zarr-developers/zarr-python/src/zarr/core/sync.py:91](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/core/sync.py#line=90), in sync(coro, loop, timeout)
88 return_result = next(iter(finished)).result()
90 if isinstance(return_result, BaseException):
---> 91 raise return_result
92 else:
93 return return_result
File [~/gh/zarr-developers/zarr-python/src/zarr/core/sync.py:50](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/core/sync.py#line=49), in _runner(coro)
45 """
46 Await a coroutine and return the result of running it. If awaiting the coroutine raises an
47 exception, the exception will be returned.
48 """
49 try:
---> 50 return await coro
51 except Exception as ex:
52 return ex
File [~/gh/zarr-developers/zarr-python/src/zarr/api/asynchronous.py:196](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/api/asynchronous.py#line=195), in consolidate_metadata(store, path, zarr_format)
191 metadata = dataclasses.replace(group.metadata, consolidated_metadata=consolidated_metadata)
192 group = dataclasses.replace(
193 group,
194 metadata=metadata,
195 )
--> 196 await group._save_metadata()
197 return group
File [~/gh/zarr-developers/zarr-python/src/zarr/core/group.py:744](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/core/group.py#line=743), in AsyncGroup._save_metadata(self, ensure_parents)
734 for parent in parents:
735 awaitables.extend(
736 [
737 (parent.store_path [/](http://localhost:8890/) key).set_if_not_exists(value)
(...)
741 ]
742 )
--> 744 await asyncio.gather(*awaitables)
File [~/gh/zarr-developers/zarr-python/src/zarr/abc/store.py:348](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/abc/store.py#line=347), in set_or_delete(byte_setter, value)
346 await byte_setter.delete()
347 else:
--> 348 await byte_setter.set(value)
File [~/gh/zarr-developers/zarr-python/src/zarr/storage/common.py:49](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/storage/common.py#line=48), in StorePath.set(self, value, byte_range)
47 if byte_range is not None:
48 raise NotImplementedError("Store.set does not have partial writes yet")
---> 49 await self.store.set(self.path, value)
File [~/gh/earth-mover/icechunk/icechunk-python/python/icechunk/__init__.py:382](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/earth-mover/icechunk/icechunk-python/python/icechunk/__init__.py#line=381), in IcechunkStore.set(self, key, value)
374 async def set(self, key: str, value: Buffer) -> None:
375 """Store a (key, value) pair.
376
377 Parameters
(...)
380 value : Buffer
381 """
--> 382 return await self._store.set(key, value.to_bytes())
ValueError: store error: bad metadata: `expected value at line 20 column 23` This is easy for me to workaround by setting
Yeah... I assume that it has no benefit in icechunk's case. As currently written, my consolidated metadata PR doesn't support opting out by the store. From the store's point of view it just looks like a bunch of |
🤯 How is this working with Xarray? Is this on a private branch? |
Two :) But if you run this I think it'll get everything you need:
That |
This is great progress. For a bit more background: the reason metadata consolidation doesn't work is that Icechunk is not actually a generic key value store. It only allows you to store Zarr V3 metadata, and in fact, it reverse-engineers the structure of the dataset from the keys that it sees. (In this sense, Icechunk is breaking the abstraction between Zarr store and the rest of Zarr.) This is necessary for Icechunk to work. From my perspective, it's not a major problem right now that consolidated metadata doesn't work, as consolidated metadata is an optional thing. From an ecosystem point of view (e.g. Xarray), we want to cleanly and automatically avoid using consolidated metadata if it's not necessary. Here's one way we could do this.
What do you think? |
Oh... I see that I led this discussion astray with the focus on consolidated metadata. That was the first thing I hit, but I was able to reproduce it with a plain Group: import icechunk
import numpy as np
import zarr
from icechunk import IcechunkStore, StorageConfig
store = await IcechunkStore.create(
storage=StorageConfig.memory("icechunk-demo"),
mode="w",
)
group = zarr.group(store=store, attributes={"key": float("nan")}) which raises with ---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
File [~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/api/asynchronous.py:586](http://localhost:8891/~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/api/asynchronous.py#line=585), in group(store, overwrite, chunk_store, cache_attrs, synchronizer, path, zarr_version, zarr_format, meta_array, attributes, storage_options)
585 try:
--> 586 return await AsyncGroup.open(store=store_path, zarr_format=zarr_format)
587 except (KeyError, FileNotFoundError):
File [~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/core/group.py:513](http://localhost:8891/~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/core/group.py#line=512), in AsyncGroup.open(cls, store, zarr_format, use_consolidated)
512 if zarr_json_bytes is None and zgroup_bytes is None:
--> 513 raise FileNotFoundError(
514 f"could not find zarr.json or .zgroup objects in {store_path}"
515 )
516 # set zarr_format based on which keys were found
FileNotFoundError: could not find zarr.json or .zgroup objects in <icechunk.IcechunkStore object at 0x110f43980>
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
Cell In[9], line 14
8 store = await IcechunkStore.create(
9 storage=StorageConfig.memory("icechunk-demo"),
10 mode="w",
11 )
12 store
---> 14 group = zarr.group(store=store, attributes={"key": float("nan")})
File [~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/_compat.py:43](http://localhost:8891/~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/_compat.py#line=42), in _deprecate_positional_args.<locals>._inner_deprecate_positional_args.<locals>.inner_f(*args, **kwargs)
41 extra_args = len(args) - len(all_args)
42 if extra_args <= 0:
---> 43 return f(*args, **kwargs)
45 # extra_args > 0
46 args_msg = [
47 f"{name}={arg}"
48 for name, arg in zip(kwonly_args[:extra_args], args[-extra_args:], strict=False)
49 ]
File [~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/api/synchronous.py:182](http://localhost:8891/~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/api/synchronous.py#line=181), in group(store, overwrite, chunk_store, cache_attrs, synchronizer, path, zarr_version, zarr_format, meta_array, attributes)
167 @_deprecate_positional_args
168 def group(
169 store: StoreLike | None = None,
(...)
179 attributes: dict[str, JSON] | None = None,
180 ) -> Group:
181 return Group(
--> 182 sync(
183 async_api.group(
184 store=store,
185 overwrite=overwrite,
186 chunk_store=chunk_store,
187 cache_attrs=cache_attrs,
188 synchronizer=synchronizer,
189 path=path,
190 zarr_version=zarr_version,
191 zarr_format=zarr_format,
192 meta_array=meta_array,
193 attributes=attributes,
194 )
195 )
196 )
File [~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/core/sync.py:91](http://localhost:8891/~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/core/sync.py#line=90), in sync(coro, loop, timeout)
88 return_result = next(iter(finished)).result()
90 if isinstance(return_result, BaseException):
---> 91 raise return_result
92 else:
93 return return_result
File [~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/core/sync.py:50](http://localhost:8891/~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/core/sync.py#line=49), in _runner(coro)
45 """
46 Await a coroutine and return the result of running it. If awaiting the coroutine raises an
47 exception, the exception will be returned.
48 """
49 try:
---> 50 return await coro
51 except Exception as ex:
52 return ex
File [~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/api/asynchronous.py:588](http://localhost:8891/~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/api/asynchronous.py#line=587), in group(store, overwrite, chunk_store, cache_attrs, synchronizer, path, zarr_version, zarr_format, meta_array, attributes, storage_options)
586 return await AsyncGroup.open(store=store_path, zarr_format=zarr_format)
587 except (KeyError, FileNotFoundError):
--> 588 return await AsyncGroup.from_store(
589 store=store_path,
590 zarr_format=zarr_format or _default_zarr_version(),
591 exists_ok=overwrite,
592 attributes=attributes,
593 )
File [~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/core/group.py:434](http://localhost:8891/~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/core/group.py#line=433), in AsyncGroup.from_store(cls, store, attributes, exists_ok, zarr_format)
429 attributes = attributes or {}
430 group = cls(
431 metadata=GroupMetadata(attributes=attributes, zarr_format=zarr_format),
432 store_path=store_path,
433 )
--> 434 await group._save_metadata(ensure_parents=True)
435 return group
File [~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/core/group.py:744](http://localhost:8891/~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/core/group.py#line=743), in AsyncGroup._save_metadata(self, ensure_parents)
734 for parent in parents:
735 awaitables.extend(
736 [
737 (parent.store_path [/](http://localhost:8891/) key).set_if_not_exists(value)
(...)
741 ]
742 )
--> 744 await asyncio.gather(*awaitables)
File [~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/abc/store.py:348](http://localhost:8891/~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/abc/store.py#line=347), in set_or_delete(byte_setter, value)
346 await byte_setter.delete()
347 else:
--> 348 await byte_setter.set(value)
File [~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/storage/common.py:49](http://localhost:8891/~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/storage/common.py#line=48), in StorePath.set(self, value, byte_range)
47 if byte_range is not None:
48 raise NotImplementedError("Store.set does not have partial writes yet")
---> 49 await self.store.set(self.path, value)
File [~/gh/earth-mover/icechunk/icechunk-python/python/icechunk/__init__.py:382](http://localhost:8891/~/gh/earth-mover/icechunk/icechunk-python/python/icechunk/__init__.py#line=381), in IcechunkStore.set(self, key, value)
374 async def set(self, key: str, value: Buffer) -> None:
375 """Store a (key, value) pair.
376
377 Parameters
(...)
380 value : Buffer
381 """
--> 382 return await self._store.set(key, value.to_bytes())
ValueError: store error: bad metadata: `expected value at line 3 column 12` I need to dig into icechunk a bit more to have an informed opinion, but one maybe important point is that zarr V3 metadata is stored in the So IMO this is maybe worth fixing on its own, assuming Icechunk wants to support these kinds of extensions to JSON. I saw an issue on serde-json about parsing |
Ok, yes this seems like an actual bug, separate from consolidated metadata. |
@TomAugspurger yes, this is an actual Icechunk problem. That serde-json issue you mention is relevant. We are parsing strict json today, and the way we do it doesn't make it easy to go to JSON5 for the extensions. We have hacked together a fix, exclusively for the |
How critical is this Tom? How frequently does this situation arise in the typical Xarray / Zarr stack? |
It seems like xarray choose import xarray as xr
import numpy as np
import json
ds = xr.Dataset({"a": np.array([])})
store = {}
ds.to_zarr(store)
json.loads(store["a/zarr.json"].to_bytes()) # 'NaN' zarr-python uses import zarr
store = zarr.storage.MemoryStore(mode="w")
zarr.create(shape=(0,), dtype="float64", store=store)
json.loads(store._store_dict["zarr.json"].to_bytes())['fill_value'] # 0.0 So I think this would affect anyone using xarray (not plain Zarr) with floating point data, but only thanks to the consolidated metadata stuff. I suspect that user-defined attributes with I think we can pretty easily add an attribute to the group = zarr.open_group(store) # a store with consolidated metadata at the root
group["a"]["b"]["c"] # getitem doesn't need to do any I/O with consolidated metadata I think the main question to answer is: how expensive is it to "list" all the children of a Group when opening that group? On an object store that can be very expensive since you need to actually list the objects and open each one, which scales with the number of nodes. But maybe in icechunk it's just one or two additional reads? |
in Icechunk that operation is fast and requires no IO |
In general, listing metadata in Icechunk could be extremely fast and require no IO, but the Zarr But that said, it's still super fast because it doesn't need to go to object store for every key, or even do list operations in object store. It just fetches a manifest once, and pulls keys from there. So you can expect something like |
I'm testing out some xarray datasets with icechunk. zarr-developers/zarr-python#2113 adds consolidated metadata, which stores the metadata of each array in the root
Group
. Some of the arrays I'm serializing have afill_value
ofNaN
(not quoted).Here's a reproducer that should work on zarr-python
v3
.which fails with
The bytes being written there are
Strangely(?), this does not affect
Array
. Those can have afill_value
ofNaN
or an attribute withNaN
and it works fine. It's just Group that's affected.The text was updated successfully, but these errors were encountered: