Dataset.chunk() and DataArray.chunk() now set encoding attribute #8069
Thank you for opening this pull request! It may take us a few days to respond here, so thank you for being patient.
fc288f1 to 2ef1207
Through the tests in test_backend.py I encountered an error in the zarr library when attempting to write a Dataset with a dimension of length 0. While the size of a Dask chunk can be 0, the value in the encoding attribute cannot. The zarr code shows that it defaults to a chunk size of 1 when the dimension has length 0. It can be debated whether …
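To make the zero-length case above concrete, here is a minimal sketch in plain Python. The helper name `encoding_chunks_from_dask` is hypothetical and is not part of xarray, Dask, or zarr. It assumes Dask-style chunks (one tuple of block sizes per dimension) and assumes that taking the first block size per dimension is an acceptable stand-in for the encoding value. The clamp to 1 mirrors the zarr default described in the comment.

```python
from typing import Sequence, Tuple


def encoding_chunks_from_dask(
    dask_chunks: Sequence[Sequence[int]],
) -> Tuple[int, ...]:
    """Hypothetical helper: derive one encoding chunk size per dimension.

    `dask_chunks` follows the Dask convention of one tuple of block sizes
    per dimension, e.g. ((2, 2, 1), (0,)).
    """
    result = []
    for sizes in dask_chunks:
        # Assumption: the first block size represents the dimension.
        first = sizes[0] if sizes else 0
        # A Dask block may have size 0, but an encoding chunk may not,
        # so fall back to 1 for zero-length dimensions.
        result.append(max(1, first))
    return tuple(result)


# First dimension has length 5 in blocks of 2; second has length 0.
example = encoding_chunks_from_dask(((2, 2, 1), (0,)))
print(example)
```

Whether such a clamp belongs in `.chunk()` itself or only at write time is a separate design question that this sketch does not settle.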
One test still fails, and it actually challenges the current state of the proposed fix: … From my current understanding, I see three ways forward: …
In all cases, I would suggest adding a line to the docstring of the …
@pydata/xarray I don't know encoding well, but what should we do next on this?
Does someone have a clear vision of how we're evolving the …
Yeah, unfortunately we are all agreed on deleting … Thanks for trying, @Metamess. I'm sorry that your first PR can't make it through. We hope to see you back soon!
whats-new.rst
This PR aims to fix the bug identified in Issue #8062, where calling the `.chunk()` function on a `DataArray` or `Dataset` would change the chunking of the underlying data, but not change the `"chunks"` entry in the `encoding` attribute. This would lead to Datasets that cannot be written to file (zarr) without raising an error. The setting of the `encoding` attribute via the `.chunk()` function is part of the expected behavior based on the docstring of `Dataset.to_zarr`: