Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Saving a group errors when serializing NaN in IcechunkStore.set #133

Open
TomAugspurger opened this issue Oct 3, 2024 · 13 comments
Open

Comments

@TomAugspurger
Copy link

I'm testing out some xarray datasets with icechunk. zarr-developers/zarr-python#2113 adds consolidated metadata, which stores the metadata of each array in the root Group. Some of the arrays I'm serializing have a fill_value of NaN (not quoted).

Here's a reproducer that should work on zarr-python v3.

import zarr

mem = zarr.storage.MemoryStore(mode="w")
arr = zarr.open_array(store=mem, path="nan", shape=(1,), fill_value=float('nan'))
zarr.consolidate_metadata(mem)

which fails with

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[15], line 10
      4 store = await icechunk.IcechunkStore.create(
      5     storage=icechunk.StorageConfig.memory("icechunk-demo"),
      6     mode="w",
      7 )
      9 g = zarr.open_group(store=store, path="g")
---> 10 g.attrs["key"] = float("nan")

File [~/gh/zarr-developers/zarr-python/src/zarr/core/attributes.py:26](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/core/attributes.py#line=25), in Attributes.__setitem__(self, key, value)
     24 new_attrs = dict(self._obj.metadata.attributes)
     25 new_attrs[key] = value
---> 26 self._obj = self._obj.update_attributes(new_attrs)

File [~/gh/zarr-developers/zarr-python/src/zarr/core/group.py:1388](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/core/group.py#line=1387), in Group.update_attributes(self, new_attributes)
   1387 def update_attributes(self, new_attributes: dict[str, Any]) -> Group:
-> 1388     self._sync(self._async_group.update_attributes(new_attributes))
   1389     return self

File [~/gh/zarr-developers/zarr-python/src/zarr/core/sync.py:135](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/core/sync.py#line=134), in SyncMixin._sync(self, coroutine)
    132 def _sync(self, coroutine: Coroutine[Any, Any, T]) -> T:
    133     # TODO: refactor this to to take *args and **kwargs and pass those to the method
    134     # this should allow us to better type the sync wrapper
--> 135     return sync(
    136         coroutine,
    137         timeout=config.get("async.timeout"),
    138     )

File [~/gh/zarr-developers/zarr-python/src/zarr/core/sync.py:91](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/core/sync.py#line=90), in sync(coro, loop, timeout)
     88 return_result = next(iter(finished)).result()
     90 if isinstance(return_result, BaseException):
---> 91     raise return_result
     92 else:
     93     return return_result

File [~/gh/zarr-developers/zarr-python/src/zarr/core/sync.py:50](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/core/sync.py#line=49), in _runner(coro)
     45 """
     46 Await a coroutine and return the result of running it. If awaiting the coroutine raises an
     47 exception, the exception will be returned.
     48 """
     49 try:
---> 50     return await coro
     51 except Exception as ex:
     52     return ex

File [~/gh/zarr-developers/zarr-python/src/zarr/core/group.py:1047](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/core/group.py#line=1046), in AsyncGroup.update_attributes(self, new_attributes)
   1044 self.metadata.attributes.update(new_attributes)
   1046 # Write new metadata
-> 1047 await self._save_metadata()
   1049 return self

File [~/gh/zarr-developers/zarr-python/src/zarr/core/group.py:744](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/core/group.py#line=743), in AsyncGroup._save_metadata(self, ensure_parents)
    734     for parent in parents:
    735         awaitables.extend(
    736             [
    737                 (parent.store_path [/](http://localhost:8890/) key).set_if_not_exists(value)
   (...)
    741             ]
    742         )
--> 744 await asyncio.gather(*awaitables)

File [~/gh/zarr-developers/zarr-python/src/zarr/abc/store.py:348](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/abc/store.py#line=347), in set_or_delete(byte_setter, value)
    346     await byte_setter.delete()
    347 else:
--> 348     await byte_setter.set(value)

File [~/gh/zarr-developers/zarr-python/src/zarr/storage/common.py:49](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/storage/common.py#line=48), in StorePath.set(self, value, byte_range)
     47 if byte_range is not None:
     48     raise NotImplementedError("Store.set does not have partial writes yet")
---> 49 await self.store.set(self.path, value)

File [~/gh/earth-mover/icechunk/icechunk-python/python/icechunk/__init__.py:382](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/earth-mover/icechunk/icechunk-python/python/icechunk/__init__.py#line=381), in IcechunkStore.set(self, key, value)
    374 async def set(self, key: str, value: Buffer) -> None:
    375     """Store a (key, value) pair.
    376 
    377     Parameters
   (...)
    380     value : Buffer
    381     """
--> 382     return await self._store.set(key, value.to_bytes())

ValueError: store error: bad metadata: `expected value at line 3 column 12`

The bytes being written there are

{
  "attributes": {
    "key": NaN
  },
  "zarr_format": 3,
  "consolidated_metadata": null,
  "node_type": "group"
}

Strangely(?), this does not affect Array. Those can have a fill_value of NaN or an attribute with NaN and it works fine. It's just Group that's affected.

@rabernat
Copy link
Contributor

rabernat commented Oct 3, 2024

We should never consolidate metadata on icechunk. Is this error being raised on the consolidate step or the open_array step?

@TomAugspurger
Copy link
Author

Here's the traceback from the real example. It's during metadata consolidation (which takes data that shows up in each array's zarr.json and puts it in the root Group's zarr.json)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[19], line 1
----> 1 ds.to_zarr(store)

File [~/gh/pydata/xarray/xarray/core/dataset.py:2572](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/pydata/xarray/xarray/core/dataset.py#line=2571), in Dataset.to_zarr(self, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options, zarr_version, write_empty_chunks, chunkmanager_store_kwargs)
   2417 """Write dataset contents to a zarr group.
   2418 
   2419 Zarr chunks are determined in the following way:
   (...)
   2568     The I[/O](http://localhost:8890/O) user guide, with more details and examples.
   2569 """
   2570 from xarray.backends.api import to_zarr
-> 2572 return to_zarr(  # type: ignore[call-overload,misc]
   2573     self,
   2574     store=store,
   2575     chunk_store=chunk_store,
   2576     storage_options=storage_options,
   2577     mode=mode,
   2578     synchronizer=synchronizer,
   2579     group=group,
   2580     encoding=encoding,
   2581     compute=compute,
   2582     consolidated=consolidated,
   2583     append_dim=append_dim,
   2584     region=region,
   2585     safe_chunks=safe_chunks,
   2586     zarr_version=zarr_version,
   2587     write_empty_chunks=write_empty_chunks,
   2588     chunkmanager_store_kwargs=chunkmanager_store_kwargs,
   2589 )

File [~/gh/pydata/xarray/xarray/backends/api.py:1777](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/pydata/xarray/xarray/backends/api.py#line=1776), in to_zarr(dataset, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options, zarr_version, write_empty_chunks, chunkmanager_store_kwargs)
   1775 writer = ArrayWriter()
   1776 # TODO: figure out how to properly handle unlimited_dims
-> 1777 dump_to_store(dataset, zstore, writer, encoding=encoding)
   1778 writes = writer.sync(
   1779     compute=compute, chunkmanager_store_kwargs=chunkmanager_store_kwargs
   1780 )
   1782 if compute:

File [~/gh/pydata/xarray/xarray/backends/api.py:1467](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/pydata/xarray/xarray/backends/api.py#line=1466), in dump_to_store(dataset, store, writer, encoder, encoding, unlimited_dims)
   1464 if encoder:
   1465     variables, attrs = encoder(variables, attrs)
-> 1467 store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)

File [~/gh/pydata/xarray/xarray/backends/zarr.py:826](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/pydata/xarray/xarray/backends/zarr.py#line=825), in ZarrStore.store(self, variables, attributes, check_encoding_set, writer, unlimited_dims)
    823 if _zarr_v3():
    824     # https://github.com/zarr-developers/zarr-python/pull/2113#issuecomment-2386718323
    825     kwargs["path"] = self.zarr_group.name.lstrip("/")
--> 826 zarr.consolidate_metadata(self.zarr_group.store, **kwargs)

File [~/gh/zarr-developers/zarr-python/src/zarr/api/synchronous.py:46](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/api/synchronous.py#line=45), in consolidate_metadata(*args, **kwargs)
     45 def consolidate_metadata(*args: Any, **kwargs: Any) -> Group:
---> 46     return Group(sync(async_api.consolidate_metadata(*args, **kwargs)))

File [~/gh/zarr-developers/zarr-python/src/zarr/core/sync.py:91](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/core/sync.py#line=90), in sync(coro, loop, timeout)
     88 return_result = next(iter(finished)).result()
     90 if isinstance(return_result, BaseException):
---> 91     raise return_result
     92 else:
     93     return return_result

File [~/gh/zarr-developers/zarr-python/src/zarr/core/sync.py:50](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/core/sync.py#line=49), in _runner(coro)
     45 """
     46 Await a coroutine and return the result of running it. If awaiting the coroutine raises an
     47 exception, the exception will be returned.
     48 """
     49 try:
---> 50     return await coro
     51 except Exception as ex:
     52     return ex

File [~/gh/zarr-developers/zarr-python/src/zarr/api/asynchronous.py:196](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/api/asynchronous.py#line=195), in consolidate_metadata(store, path, zarr_format)
    191 metadata = dataclasses.replace(group.metadata, consolidated_metadata=consolidated_metadata)
    192 group = dataclasses.replace(
    193     group,
    194     metadata=metadata,
    195 )
--> 196 await group._save_metadata()
    197 return group

File [~/gh/zarr-developers/zarr-python/src/zarr/core/group.py:744](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/core/group.py#line=743), in AsyncGroup._save_metadata(self, ensure_parents)
    734     for parent in parents:
    735         awaitables.extend(
    736             [
    737                 (parent.store_path [/](http://localhost:8890/) key).set_if_not_exists(value)
   (...)
    741             ]
    742         )
--> 744 await asyncio.gather(*awaitables)

File [~/gh/zarr-developers/zarr-python/src/zarr/abc/store.py:348](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/abc/store.py#line=347), in set_or_delete(byte_setter, value)
    346     await byte_setter.delete()
    347 else:
--> 348     await byte_setter.set(value)

File [~/gh/zarr-developers/zarr-python/src/zarr/storage/common.py:49](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/zarr-developers/zarr-python/src/zarr/storage/common.py#line=48), in StorePath.set(self, value, byte_range)
     47 if byte_range is not None:
     48     raise NotImplementedError("Store.set does not have partial writes yet")
---> 49 await self.store.set(self.path, value)

File [~/gh/earth-mover/icechunk/icechunk-python/python/icechunk/__init__.py:382](http://localhost:8890/lab/tree/icechunk-python/notebooks/~/gh/earth-mover/icechunk/icechunk-python/python/icechunk/__init__.py#line=381), in IcechunkStore.set(self, key, value)
    374 async def set(self, key: str, value: Buffer) -> None:
    375     """Store a (key, value) pair.
    376 
    377     Parameters
   (...)
    380     value : Buffer
    381     """
--> 382     return await self._store.set(key, value.to_bytes())

ValueError: store error: bad metadata: `expected value at line 20 column 23`

This is easy for me to workaround by setting consolidated=False in my ds.to_zarr() call.

We should never consolidate metadata on icechunk.

Yeah... I assume that it has no benefit in icechunk's case. As currently written, my consolidated metadata PR doesn't support opting out by the store. From the store's point of view it just looks like a bunch of getitem calls to bundle up the metadata followed by a final setitem.

@TomAugspurger
Copy link
Author

On the bright side, this works :)

import os
import icechunk
import xarray as xr
import fsspec

account_id = os.environ["R2_ACCOUNT_ID"]
s3_storage = icechunk.StorageConfig.s3_from_env(
    endpoint_url=f'https://{account_id}.r2.cloudflarestorage.com',
    bucket="taugspurger",
    prefix="daymet",
)

store = await icechunk.IcechunkStore.create(storage=s3_storage)

ds = xr.open_dataset(
    fsspec.open("https://thredds.daac.ornl.gov/thredds/fileServer/ornldaac/2130/daymet_v4_prcp_annttl_hi_1980.nc").open(),
    engine="h5netcdf"
)

ds.to_zarr(store, consolidated=False)

await store.commit("Added 1980")

ds2 = xr.open_dataset(store, engine="zarr", consolidated=False)
image

@rabernat
Copy link
Contributor

rabernat commented Oct 3, 2024

🤯 How is this working with Xarray? Is this on a private branch?

@TomAugspurger
Copy link
Author

Two :) But if you run this I think it'll get everything you need:

pip install git+https://github.com/TomAugspurger/zarr-python@xarray-compat git+https://github.com/TomAugspurger/xarray@fix/zarr-v3

That xarray-compat branch combines several PRs to zarr-python (including your vlen string one).

@rabernat
Copy link
Contributor

rabernat commented Oct 4, 2024

This is great progress.

For a bit more background: the reason metadata consolidation doesn't work is that Icechunk is not actually a generic key value store. It only allows you to store Zarr V3 metadata, and in fact, it reverse-engineers the structure of the dataset from the keys that it sees. (In this sense, Icechunk is breaking the abstraction between Zarr store and the rest of Zarr.) This is necessary for Icechunk to work.

From my perspective, it's not a major problem right now that consolidated metadata doesn't work, as consolidated metadata is an optional thing. From an ecosystem point of view (e.g. Xarray), we want to cleanly and automatically avoid using consolidated metadata if it's not necessary.

Here's one way we could do this.

  • Stores add an optional .supports_consolidated_metadata class attribute, which defaults to True but will be False for Icechunk store.
  • Calls to consolidate_metadata or open_consolidate fail with a typed error if attempted on these stores.
  • Downstream libraries (e.g. Xarray) can catch this exception and act appropriately

What do you think?

@TomAugspurger
Copy link
Author

Oh... I see that I led this discussion astray with the focus on consolidated metadata. That was the first thing I hit, but I was able to reproduce it with a plain Group:

import icechunk

import numpy as np
import zarr

from icechunk import IcechunkStore, StorageConfig

store = await IcechunkStore.create(
    storage=StorageConfig.memory("icechunk-demo"),
    mode="w",
)

group = zarr.group(store=store, attributes={"key": float("nan")})

which raises with

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
File [~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/api/asynchronous.py:586](http://localhost:8891/~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/api/asynchronous.py#line=585), in group(store, overwrite, chunk_store, cache_attrs, synchronizer, path, zarr_version, zarr_format, meta_array, attributes, storage_options)
    585 try:
--> 586     return await AsyncGroup.open(store=store_path, zarr_format=zarr_format)
    587 except (KeyError, FileNotFoundError):

File [~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/core/group.py:513](http://localhost:8891/~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/core/group.py#line=512), in AsyncGroup.open(cls, store, zarr_format, use_consolidated)
    512 if zarr_json_bytes is None and zgroup_bytes is None:
--> 513     raise FileNotFoundError(
    514         f"could not find zarr.json or .zgroup objects in {store_path}"
    515     )
    516 # set zarr_format based on which keys were found

FileNotFoundError: could not find zarr.json or .zgroup objects in <icechunk.IcechunkStore object at 0x110f43980>

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
Cell In[9], line 14
      8 store = await IcechunkStore.create(
      9     storage=StorageConfig.memory("icechunk-demo"),
     10     mode="w",
     11 )
     12 store
---> 14 group = zarr.group(store=store, attributes={"key": float("nan")})

File [~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/_compat.py:43](http://localhost:8891/~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/_compat.py#line=42), in _deprecate_positional_args.<locals>._inner_deprecate_positional_args.<locals>.inner_f(*args, **kwargs)
     41 extra_args = len(args) - len(all_args)
     42 if extra_args <= 0:
---> 43     return f(*args, **kwargs)
     45 # extra_args > 0
     46 args_msg = [
     47     f"{name}={arg}"
     48     for name, arg in zip(kwonly_args[:extra_args], args[-extra_args:], strict=False)
     49 ]

File [~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/api/synchronous.py:182](http://localhost:8891/~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/api/synchronous.py#line=181), in group(store, overwrite, chunk_store, cache_attrs, synchronizer, path, zarr_version, zarr_format, meta_array, attributes)
    167 @_deprecate_positional_args
    168 def group(
    169     store: StoreLike | None = None,
   (...)
    179     attributes: dict[str, JSON] | None = None,
    180 ) -> Group:
    181     return Group(
--> 182         sync(
    183             async_api.group(
    184                 store=store,
    185                 overwrite=overwrite,
    186                 chunk_store=chunk_store,
    187                 cache_attrs=cache_attrs,
    188                 synchronizer=synchronizer,
    189                 path=path,
    190                 zarr_version=zarr_version,
    191                 zarr_format=zarr_format,
    192                 meta_array=meta_array,
    193                 attributes=attributes,
    194             )
    195         )
    196     )

File [~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/core/sync.py:91](http://localhost:8891/~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/core/sync.py#line=90), in sync(coro, loop, timeout)
     88 return_result = next(iter(finished)).result()
     90 if isinstance(return_result, BaseException):
---> 91     raise return_result
     92 else:
     93     return return_result

File [~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/core/sync.py:50](http://localhost:8891/~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/core/sync.py#line=49), in _runner(coro)
     45 """
     46 Await a coroutine and return the result of running it. If awaiting the coroutine raises an
     47 exception, the exception will be returned.
     48 """
     49 try:
---> 50     return await coro
     51 except Exception as ex:
     52     return ex

File [~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/api/asynchronous.py:588](http://localhost:8891/~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/api/asynchronous.py#line=587), in group(store, overwrite, chunk_store, cache_attrs, synchronizer, path, zarr_version, zarr_format, meta_array, attributes, storage_options)
    586     return await AsyncGroup.open(store=store_path, zarr_format=zarr_format)
    587 except (KeyError, FileNotFoundError):
--> 588     return await AsyncGroup.from_store(
    589         store=store_path,
    590         zarr_format=zarr_format or _default_zarr_version(),
    591         exists_ok=overwrite,
    592         attributes=attributes,
    593     )

File [~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/core/group.py:434](http://localhost:8891/~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/core/group.py#line=433), in AsyncGroup.from_store(cls, store, attributes, exists_ok, zarr_format)
    429 attributes = attributes or {}
    430 group = cls(
    431     metadata=GroupMetadata(attributes=attributes, zarr_format=zarr_format),
    432     store_path=store_path,
    433 )
--> 434 await group._save_metadata(ensure_parents=True)
    435 return group

File [~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/core/group.py:744](http://localhost:8891/~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/core/group.py#line=743), in AsyncGroup._save_metadata(self, ensure_parents)
    734     for parent in parents:
    735         awaitables.extend(
    736             [
    737                 (parent.store_path [/](http://localhost:8891/) key).set_if_not_exists(value)
   (...)
    741             ]
    742         )
--> 744 await asyncio.gather(*awaitables)

File [~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/abc/store.py:348](http://localhost:8891/~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/abc/store.py#line=347), in set_or_delete(byte_setter, value)
    346     await byte_setter.delete()
    347 else:
--> 348     await byte_setter.set(value)

File [~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/storage/common.py:49](http://localhost:8891/~/gh/earth-mover/icechunk/.direnv/python-3.12/lib/python3.12/site-packages/zarr/storage/common.py#line=48), in StorePath.set(self, value, byte_range)
     47 if byte_range is not None:
     48     raise NotImplementedError("Store.set does not have partial writes yet")
---> 49 await self.store.set(self.path, value)

File [~/gh/earth-mover/icechunk/icechunk-python/python/icechunk/__init__.py:382](http://localhost:8891/~/gh/earth-mover/icechunk/icechunk-python/python/icechunk/__init__.py#line=381), in IcechunkStore.set(self, key, value)
    374 async def set(self, key: str, value: Buffer) -> None:
    375     """Store a (key, value) pair.
    376 
    377     Parameters
   (...)
    380     value : Buffer
    381     """
--> 382     return await self._store.set(key, value.to_bytes())

ValueError: store error: bad metadata: `expected value at line 3 column 12`

I need to dig into icechunk a bit more to have an informed opinion, but one maybe important point is that zarr V3 metadata is stored in the Group metadata object (following the lead of putting .zattributes in the Group, rather than a separate document; that's why we hit this in both Group and consolidated_metadata). So any Store automatically supports Zarr v3 consolidated metadata.

So IMO this is maybe worth fixing on its own, assuming Icechunk wants to support these kinds of extensions to JSON.

I saw an issue on serde-json about parsing NaN and its maintainer not wanting to support that (since it isn't strictly valid JSON), but I can't find it now.

@rabernat
Copy link
Contributor

rabernat commented Oct 4, 2024

Ok, yes this seems like an actual bug, separate from consolidated metadata.

@paraseba
Copy link
Contributor

paraseba commented Oct 4, 2024

@TomAugspurger yes, this is an actual Icechunk problem. That serde-json issue you mention is relevant. We are parsing strict json today, and the way we do it doesn't make it easy to go to JSON5 for the extensions. We have hacked together a fix, exclusively for the fill_value field, where you can pass "NaN", but it's currently not working on user attributes or other fields like codecs, etc. It will takes us some time to get to this issue, but it definitely needs fixing.

@rabernat
Copy link
Contributor

rabernat commented Oct 4, 2024

How critical is this Tom? How frequently does this situation arise in the typical Xarray / Zarr stack?

@TomAugspurger
Copy link
Author

TomAugspurger commented Oct 4, 2024

It seems like xarray choose NaN as the default fill value for float data.

import xarray as xr
import numpy as np
import json

ds = xr.Dataset({"a": np.array([])})
store = {}
ds.to_zarr(store)

json.loads(store["a/zarr.json"].to_bytes())  # 'NaN'

zarr-python uses 0.0 as its default:

import zarr

store = zarr.storage.MemoryStore(mode="w")
zarr.create(shape=(0,), dtype="float64", store=store)
json.loads(store._store_dict["zarr.json"].to_bytes())['fill_value']  # 0.0

So I think this would affect anyone using xarray (not plain Zarr) with floating point data, but only thanks to the consolidated metadata stuff. I suspect that user-defined attributes with NaN, Infinity, etc. happen sometimes, at which point this affects plain Zarr users, but maybe aren't too common.

I think we can pretty easily add an attribute to the Store class as a way for it to indicate whether it "prefers" consolidated metadata, which xarray could use when deciding how to interpret consolidated=None. That said, I want to understand icechunk a bit more to ensure the design is correct. I think that at the user-level, consolidated metadata can still be useful if only to reduce the number of separate I/O requests made to the store. On my branch, this code only needs a single HTTP request with consolidated metadata:

group = zarr.open_group(store)  # a store with consolidated metadata at the root
group["a"]["b"]["c"]  # getitem doesn't need to do any I/O with consolidated metadata

I think the main question to answer is: how expensive is it to "list" all the children of a Group when opening that group? On an object store that can be very expensive since you need to actually list the objects and open each one, which scales with the number of nodes. But maybe in icechunk it's just one or two additional reads?

@paraseba
Copy link
Contributor

paraseba commented Oct 4, 2024

group["a"]["b"]["c"] # getitem doesn't need to do any I/O with consolidated metadata

in Icechunk that operation is fast and requires no IO

@paraseba
Copy link
Contributor

paraseba commented Oct 4, 2024

In general, listing metadata in Icechunk could be extremely fast and require no IO, but the Zarr Store abstraction currently doesn't let us be as fast as we would like, because it doesn't distinguish between metadata and chunk keys. So listing means listing everything.

But that said, it's still super fast because it doesn't need to go to object store for every key, or even do list operations in object store. It just fetches a manifest once, and pulls keys from there. So you can expect something like store.list_prefix("/") to be much faster than on other remote stores.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants