Dataset.to_zarr() with mode='a' does not work with groups #3170

Closed
VincentDehaye opened this issue Jul 30, 2019 · 0 comments · Fixed by #3610
Labels
topic-zarr Related to zarr storage library

MCVE Code Sample

import xarray as xr
import numpy as np
from s3fs import S3FileSystem, S3Map

s3 = S3FileSystem()
bucket_name = 'your-bucket-name'
s3_path = bucket_name + '/some_path.zarr'  # '/' separates the bucket from the key
store = S3Map(s3_path, s3=s3)
for i in range(6):
    if i%2 == 0:
        group = 'Group1'
    else:
        group = 'Group2'
    lead_time = i//2
    var1 = np.random.rand(1)
    ds = xr.Dataset({'var1': (['lead_time'], var1)},
                    coords={'lead_time': [lead_time]})
    ds.to_zarr(store=store, mode='a', append_dim='lead_time', group=group)

Output

This code raises the following error:

Traceback (most recent call last):
  File "/home/vincent/anaconda3/envs/hanover_backend/lib/python3.7/site-packages/xarray/core/dataset.py", line 1019, in _construct_dataarray
    variable = self._variables[name]
KeyError: 'var1'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/vincent/Documents/Greenlytics/SiteForecast/debugging-script.py", line 201, in <module>
    ds.to_zarr(store=store, mode='a', append_dim='lead_time', group=group)
  File "/home/vincent/anaconda3/envs/hanover_backend/lib/python3.7/site-packages/xarray/core/dataset.py", line 1433, in to_zarr
    consolidated=consolidated, append_dim=append_dim)
  File "/home/vincent/anaconda3/envs/hanover_backend/lib/python3.7/site-packages/xarray/backends/api.py", line 1101, in to_zarr
    dump_to_store(dataset, zstore, writer, encoding=encoding)
  File "/home/vincent/anaconda3/envs/hanover_backend/lib/python3.7/site-packages/xarray/backends/api.py", line 929, in dump_to_store
    unlimited_dims=unlimited_dims)
  File "/home/vincent/anaconda3/envs/hanover_backend/lib/python3.7/site-packages/xarray/backends/zarr.py", line 358, in store
    variables_with_encoding[vn].encoding = ds[vn].encoding
  File "/home/vincent/anaconda3/envs/hanover_backend/lib/python3.7/site-packages/xarray/core/dataset.py", line 1103, in __getitem__
    return self._construct_dataarray(key)
  File "/home/vincent/anaconda3/envs/hanover_backend/lib/python3.7/site-packages/xarray/core/dataset.py", line 1022, in _construct_dataarray
    self._variables, name, self._level_coords, self.dims)
  File "/home/vincent/anaconda3/envs/hanover_backend/lib/python3.7/site-packages/xarray/core/dataset.py", line 91, in _get_virtual_variable
    ref_var = variables[ref_name]
KeyError: 'var1'
The KeyError may be raised for a variable name or for a dimension name; which one varies from run to run.

Problem Description

I am trying to use the append mode introduced in PR #2706 with zarr groups. This raises a KeyError, as shown in the traceback above. Is this a bug, or is appending to groups not supported (yet)?
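
To make the write pattern easier to follow, here is a minimal sketch that only traces which `(group, lead_time)` pairs the loop in the MCVE hands to `to_zarr`, in call order. It does not touch xarray, zarr, or S3; the helper name `write_plan` is hypothetical and exists only for this illustration.

```python
from collections import defaultdict


def write_plan(n_iterations=6):
    """Return the (group, lead_time) pairs produced by the MCVE loop, in call order.

    Hypothetical helper: it mirrors the loop's bookkeeping only and performs no I/O.
    """
    plan = []
    for i in range(n_iterations):
        # Even iterations target 'Group1', odd iterations target 'Group2'.
        group = 'Group1' if i % 2 == 0 else 'Group2'
        # Two consecutive iterations share the same lead_time value.
        lead_time = i // 2
        plan.append((group, lead_time))
    return plan


plan = write_plan()

# Collect the lead_time values that each group is asked to append.
per_group = defaultdict(list)
for group, lead_time in plan:
    per_group[group].append(lead_time)

for group, lead_times in per_group.items():
    print(group, lead_times)
```

The calls alternate between the two groups, and each group is asked to append `lead_time` values 0, 1 and 2 in turn. From the second call onward the store already exists while the targeted group may not yet, which is the situation the `mode='a'` plus `group=` combination has to handle. The report does not say which iteration raises the KeyError.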

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.7.1 (default, Dec 14 2018, 19:28:38)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 4.18.0-20-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.2
libnetcdf: 4.6.3

xarray: 0.12.3
pandas: 0.24.2
numpy: 1.16.2
scipy: 1.2.1
netCDF4: 1.5.1.2
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.3.1
cftime: 1.0.3.4
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: 0.9.6.1.post1
iris: None
bottleneck: None
dask: 1.1.5
distributed: None
matplotlib: 3.0.3
cartopy: None
seaborn: None
numbagg: None
setuptools: 40.8.0
pip: 19.0.3
conda: None
pytest: None
IPython: 7.5.0
sphinx: None
