[Bug]: ValueError: Cannot generate bounds for a coordinate of length <= 1. Using decode_times optional arg #304

Closed
durack1 opened this issue Aug 10, 2022 · 11 comments · Fixed by #313
Assignees
Labels
type: bug Inconsistencies or issues which will cause an issue or problem for users or implementors.

Comments

@durack1
Collaborator

durack1 commented Aug 10, 2022

What happened?

Opening a dataset with time:units = "months since 1955-01-01 00:00:00" fails with an OutOfBoundsDatetime error from pandas. Adding decode_times=False gets around the pandas issue (#282), but leads to an error in xcdat/bounds.py instead: ValueError: Cannot generate bounds for a coordinate of length <= 1.
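For context on the first failure, here is a minimal standard-library sketch of why a "months since 1955-01-01" offset can land outside the range of a nanosecond-precision timestamp. The offset value 4326 is an assumption, chosen only because it reproduces the 2315-07-01 date shown in the log output; the actual value stored in the file is not given here. The add_months helper is hypothetical and is not xcdat's decoding routine.

```python
from datetime import datetime, timedelta

# Latest instant representable as int64 nanoseconds since 1970-01-01,
# truncated to microseconds because datetime has no nanosecond field.
NS_LIMIT = datetime(1970, 1, 1) + timedelta(microseconds=(2**63 - 1) // 1000)


def add_months(ref: datetime, months: int) -> datetime:
    """Shift ref by whole calendar months (hypothetical helper).

    Assumes the day of month in ref also exists in the target month.
    """
    total = ref.year * 12 + (ref.month - 1) + months
    year, month_index = divmod(total, 12)
    return ref.replace(year=year, month=month_index + 1)


ref_date = datetime(1955, 1, 1)
offset = 4326  # assumed value, picked to match the date in the traceback
decoded = add_months(ref_date, offset)

print(decoded)              # 2315-07-01 00:00:00
print(NS_LIMIT.year)        # 2262
print(decoded <= NS_LIMIT)  # False: too late for nanosecond precision
```

Any decoded date after April 2262 cannot be stored at nanosecond precision, which is consistent with the OutOfBoundsDatetime error in the traceback.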

What did you expect to happen?

xcdat could handle the poorly formed file and open the dataset.

Minimal Complete Verifiable Example

In [2]: from xcdat import open_dataset

In [3]: f = "~/Shared/obs_data/WOD18/190312/woa18_decav_t00_04.nc"

In [4]: wH = open_dataset(f)

Or 

In [5]: wH = open_dataset(f, decode_times=False)

Relevant log output

Traceback (most recent call last):

  File "/tmp/ipykernel_177169/1773024523.py", line 1, in <cell line: 1>
    wH = open_dataset(f)

  File "~/mambaforge/envs/xcdat030spy532/lib/python3.10/site-packages/xcdat/dataset.py", line 89, in open_dataset
    ds = decode_non_cf_time(ds)

  File "~/mambaforge/envs/xcdat030spy532/lib/python3.10/site-packages/xcdat/dataset.py", line 320, in decode_non_cf_time
    data = [ref_date + pd.DateOffset(**{units: offset}) for offset in time.data]

  File "~/mambaforge/envs/xcdat030spy532/lib/python3.10/site-packages/xcdat/dataset.py", line 320, in <listcomp>
    data = [ref_date + pd.DateOffset(**{units: offset}) for offset in time.data]

  File "pandas/_libs/tslibs/offsets.pyx", line 444, in pandas._libs.tslibs.offsets.BaseOffset.__add__

  File "pandas/_libs/tslibs/offsets.pyx", line 450, in pandas._libs.tslibs.offsets.BaseOffset.__add__

  File "pandas/_libs/tslibs/offsets.pyx", line 180, in pandas._libs.tslibs.offsets.apply_wraps.wrapper

  File "pandas/_libs/tslibs/offsets.pyx", line 1092, in pandas._libs.tslibs.offsets.RelativeDeltaOffset._apply

  File "pandas/_libs/tslibs/timestamps.pyx", line 1399, in pandas._libs.tslibs.timestamps.Timestamp.__new__

  File "pandas/_libs/tslibs/conversion.pyx", line 436, in pandas._libs.tslibs.conversion.convert_to_tsobject

  File "pandas/_libs/tslibs/conversion.pyx", line 517, in pandas._libs.tslibs.conversion.convert_datetime_to_tsobject

  File "pandas/_libs/tslibs/np_datetime.pyx", line 120, in pandas._libs.tslibs.np_datetime.check_dts_bounds

OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 2315-07-01 00:00:00


In [5]: wH = open_dataset(f, decode_times=False)
Traceback (most recent call last):

  File "~/mambaforge/envs/xcdat030spy532/lib/python3.10/site-packages/xcdat/bounds.py", line 178, in get_bounds
    bounds_key = coord_var.attrs["bounds"]

KeyError: 'bounds'


During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "~/mambaforge/envs/xcdat030spy532/lib/python3.10/site-packages/xcdat/bounds.py", line 145, in add_missing_bounds
    self.get_bounds(axis)

  File "~/mambaforge/envs/xcdat030spy532/lib/python3.10/site-packages/xcdat/bounds.py", line 180, in get_bounds
    raise KeyError(

KeyError: "The coordinate variable 'time' has no 'bounds' attr. Set the 'bounds' attr to the name of the bounds data variable."


During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "~/mambaforge/envs/xcdat030spy532/lib/python3.10/site-packages/xcdat/bounds.py", line 178, in get_bounds
    bounds_key = coord_var.attrs["bounds"]

KeyError: 'bounds'


During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "~/mambaforge/envs/xcdat030spy532/lib/python3.10/site-packages/xcdat/bounds.py", line 220, in add_bounds
    self.get_bounds(axis)

  File "~/mambaforge/envs/xcdat030spy532/lib/python3.10/site-packages/xcdat/bounds.py", line 180, in get_bounds
    raise KeyError(

KeyError: "The coordinate variable 'time' has no 'bounds' attr. Set the 'bounds' attr to the name of the bounds data variable."


During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "/tmp/ipykernel_177169/170555424.py", line 1, in <cell line: 1>
    wH = open_dataset(f, decode_times=False)

  File "~/mambaforge/envs/xcdat030spy532/lib/python3.10/site-packages/xcdat/dataset.py", line 95, in open_dataset
    ds = _postprocess_dataset(ds, data_var, center_times, add_bounds, lon_orient)

  File "~/mambaforge/envs/xcdat030spy532/lib/python3.10/site-packages/xcdat/dataset.py", line 482, in _postprocess_dataset
    dataset = dataset.bounds.add_missing_bounds()

  File "~/mambaforge/envs/xcdat030spy532/lib/python3.10/site-packages/xcdat/bounds.py", line 147, in add_missing_bounds
    self._dataset = self.add_bounds(axis, width)

  File "~/mambaforge/envs/xcdat030spy532/lib/python3.10/site-packages/xcdat/bounds.py", line 225, in add_bounds
    dataset = self._add_bounds(axis, width)

  File "~/mambaforge/envs/xcdat030spy532/lib/python3.10/site-packages/xcdat/bounds.py", line 271, in _add_bounds
    raise ValueError("Cannot generate bounds for a coordinate of length <= 1.")

ValueError: Cannot generate bounds for a coordinate of length <= 1.

Anything else we need to know?

Adding the additional argument

wH = open_dataset(f, decode_times=False, add_bounds=False)

opens the file fine
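The sequence of attempts above (defaults, then decode_times=False, then decode_times=False with add_bounds=False) could be wrapped in a small fallback helper. This is only a sketch: open_with_fallbacks is a hypothetical name, the opener is passed in as a callable so the snippet runs without xcdat installed, and it assumes each failure surfaces as a ValueError (the bounds error is one per the traceback, and pandas' OutOfBoundsDatetime subclasses ValueError).

```python
def open_with_fallbacks(opener, path):
    """Try progressively more permissive keyword arguments (hypothetical helper).

    Returns the opened object together with the kwargs that worked.
    """
    attempts = [
        {},
        {"decode_times": False},
        {"decode_times": False, "add_bounds": False},
    ]
    last_error = None
    for kwargs in attempts:
        try:
            return opener(path, **kwargs), kwargs
        except ValueError as err:
            # Remember the failure and move on to the next set of options.
            last_error = err
    raise last_error


def fake_opener(path, decode_times=True, add_bounds=True):
    """Stand-in for a real opener; mimics the failures reported in this issue."""
    if decode_times:
        raise ValueError("Out of bounds nanosecond timestamp")
    if add_bounds:
        raise ValueError("Cannot generate bounds for a coordinate of length <= 1.")
    return f"dataset from {path}"


result, used = open_with_fallbacks(fake_opener, "example.nc")
print(used)  # {'decode_times': False, 'add_bounds': False}
```

With xcdat installed, the call would presumably look like open_with_fallbacks(open_dataset, f), though silently dropping time decoding and bounds may not be what every caller wants.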

Environment

xcdat 0.3.0 - conda-forge build

In [12]: xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.10.5 | packaged by conda-forge | (main, Jun 14 2022, 07:04:59) [GCC 10.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1160.62.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.8.1

xarray: 2022.6.0
pandas: 1.4.3
numpy: 1.22.4
scipy: 1.9.0
netCDF4: 1.6.0
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2022.8.0
distributed: 2022.8.0
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: 2022.7.1
cupy: None
pint: None
sparse: 0.13.0
flox: None
numpy_groupies: None
setuptools: 63.4.2
pip: 22.2.2
conda: None
pytest: None
IPython: 7.33.0
sphinx: 5.1.1
~/mambaforge/envs/xcdat031spy532/lib/python3.10/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")

@durack1 durack1 added the type: bug Inconsistencies or issues which will cause an issue or problem for users or implementors. label Aug 10, 2022
@pochedls
Collaborator

@durack1 - can you provide some more details about the file metadata or a http/thredds link/path to the file? I think xcdat 0.3.1 may fix the first issue, but I'm not sure why the second issue is occurring.

@durack1
Collaborator Author

durack1 commented Aug 10, 2022

@pochedls
Collaborator

pochedls commented Aug 10, 2022

The file can be downloaded from here.

I think the time units issue is fixed with this PR. The bounds issue is because this file has one time step and xcdat can't figure out how to generate bounds in that case.

I think a reasonable fix would be to modify this line to log a warning and return the dataset as is.
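A rough sketch of that idea, using plain Python lists rather than xarray objects: when the coordinate has one point or fewer, log a warning and return nothing instead of raising. The make_bounds function, its width parameter, and the bounds formula are illustrative assumptions and are not xcdat's actual implementation.

```python
import logging

logger = logging.getLogger(__name__)


def make_bounds(coords, width=0.5):
    """Return (lower, upper) pairs for each coordinate point, or None.

    Hypothetical stand-in for a bounds generator: instead of raising on a
    coordinate of length <= 1, it warns and lets the caller keep going.
    """
    if len(coords) <= 1:
        logger.warning(
            "Cannot generate bounds for a coordinate of length <= 1; skipping."
        )
        return None

    # Spacing between neighbouring points; edge points reuse the nearest one.
    diffs = [b - a for a, b in zip(coords, coords[1:])]
    lower_steps = [diffs[0]] + diffs
    upper_steps = diffs + [diffs[-1]]
    return [
        (c - width * lo, c + (1 - width) * up)
        for c, lo, up in zip(coords, lower_steps, upper_steps)
    ]


print(make_bounds([0.0, 1.0, 2.0]))  # [(-0.5, 0.5), (0.5, 1.5), (1.5, 2.5)]
print(make_bounds([4326.0]))         # None, with a logged warning
```

Returning None here stands in for "return the dataset as is"; a caller would simply skip attaching bounds for that axis.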

@durack1 - would you be willing to try a PR on this (assuming @tomvothecoder agrees with this solution)? The contributing guide is pretty complete (and would give you the latest xcdat environment since you would like to use some of the bug fixes on 0.3.1).

@durack1
Collaborator Author

durack1 commented Aug 10, 2022

@pochedls sure thing - happy to be a test bunny.

I did pull down and create an xcdat_dev env last night following the instructions from https://xcdat.readthedocs.io/en/stable/contributing.html#local-development and attempted to build the env using bump/0.3.1 but hit issues with xesmf.py - didn't log these as it's not a released/main branch.

As an aside, I was keen to pull down the changes in bump/0.3.1 and give this a whirl, but was stumped. I know this is not merged into main, but in a situation where it was, a nightly conda build would be VERY helpful

As a second aside, garbled files such as this example (the time units are all wrong; the date should be ~1986, not 2359) would be useful as fringe-case files for the test suite.

@tomvothecoder
Collaborator

> I did pull down and create an xcdat_dev env last night following the instructions from xcdat.readthedocs.io/en/stable/contributing.html#local-development and attempted to build the env using bump/0.3.1 but hit issues with xesmf.py - didn't log these as it's not a released/main branch.

This is fixed by #307.

> As an aside, I was keen to pull down the changes in bump/0.3.1 and give this a whirl, but was stumped. I know this is not merged into main, but in a situation where it was, a nightly conda build would be VERY helpful

As mentioned, this is not merged into main yet and is blocked by other PRs. It should be correctly labeled as "Draft PR".

We decided to go for rapid release cycles over nightly conda builds. I will follow up with the reasoning for this decision in #269 soon.

@tomvothecoder tomvothecoder moved this to Todo in v0.3.1 Aug 10, 2022
@tomvothecoder
Collaborator

> @durack1 - would you be willing to try a PR on this (assuming @tomvothecoder agrees with this solution)? The contributing guide is pretty complete (and would give you the latest xcdat environment since you would like to use some of the bug fixes on 0.3.1).

If you are interested @durack1, I can give you write access to this repository so you don't need to work on a separate fork.

@durack1
Collaborator Author

durack1 commented Aug 10, 2022

@tomvothecoder sure, hook me up!

@tomvothecoder
Collaborator

tomvothecoder commented Aug 10, 2022

@durack1 Just added you! Thanks for the help!

@durack1
Collaborator Author

durack1 commented Aug 10, 2022

@pochedls I am running on the latest main 4c51879 in spyder and am ready and waiting to test out a PR when you have one available - from the looks of it, there's no active PR or branch that I can currently find.

Interestingly, the merge of #283 has led to the previous xarray error being replaced straight away by all-xcdat error logs, which suppresses the hint that gave me the decode_times=False tip.

Repository owner moved this from Todo to Done in v0.3.1 Aug 16, 2022
@pochedls
Collaborator

@durack1 - This should be fixed on the main branch.

@durack1
Collaborator Author

durack1 commented Aug 16, 2022

@pochedls @tomvothecoder excellent, thanks gents. I'll pull down the latest main later this week!

pochedls added a commit that referenced this issue Aug 18, 2022
tomvothecoder pushed a commit that referenced this issue Aug 18, 2022