Support operations with pandas Offset objects #4535

Closed
max-sixty opened this issue Oct 24, 2020 · 2 comments · Fixed by #4537

@max-sixty
Collaborator

Is your feature request related to a problem? Please describe.

Currently, xarray objects containing datetimes don't support arithmetic with pandas' offset objects:

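import numpy as np
import pandas as pd
import xarray as xr
from pandas.tseries.frequencies import to_offset

times = pd.date_range("2000-01-01", freq="6H", periods=10)
ds = xr.Dataset(
    {
        "foo": (["time", "x", "y"], np.random.randn(10, 5, 3)),
        "bar": ("time", np.random.randn(10), {"meta": "data"}),
        "time": times,
    }
)
ds.attrs["dsmeta"] = "dsdata"
# Adding a pandas offset to the resampled time coordinate raises a TypeError.
ds.resample(time="24H").mean("time").time + to_offset("8H")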

raises:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-29-f9de46fe6c54> in <module>
----> 1 ds.resample(time="24H").mean("time").time + to_offset("8H")

/usr/local/lib/python3.8/site-packages/xarray/core/dataarray.py in func(self, other)
   2763 
   2764             variable = (
-> 2765                 f(self.variable, other_variable)
   2766                 if not reflexive
   2767                 else f(other_variable, self.variable)

/usr/local/lib/python3.8/site-packages/xarray/core/variable.py in func(self, other)
   2128             with np.errstate(all="ignore"):
   2129                 new_data = (
-> 2130                     f(self_data, other_data)
   2131                     if not reflexive
   2132                     else f(other_data, self_data)

TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and 'pandas._libs.tslibs.offsets.Hour'

This is an issue because pandas resampling has deprecated loffset. From our test suite:

xarray/tests/test_dataset.py::TestDataset::test_resample_loffset
  /Users/maximilian/workspace/xarray/xarray/tests/test_dataset.py:3844: FutureWarning: 'loffset' in .resample() and in Grouper() is deprecated.

  >>> df.resample(freq="3s", loffset="8H")

  becomes:

  >>> from pandas.tseries.frequencies import to_offset
  >>> df = df.resample(freq="3s").mean()
  >>> df.index = df.index.to_timestamp() + to_offset("8H")

    ds.bar.to_series().resample("24H", loffset="-12H").mean()

...and so we'll need to support something like this in order to maintain existing behavior.
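
The replacement suggested by the warning can be sketched in plain pandas as follows. This is a minimal illustration, not code from the xarray test suite: the series values are made up, lowercase "h" aliases are used because newer pandas versions deprecate the uppercase "H" spelling, and the to_timestamp() call from the warning text is left out on the assumption that the index is already a DatetimeIndex.

```python
import numpy as np
import pandas as pd
from pandas.tseries.frequencies import to_offset

# Illustrative data (assumed): ten 6-hourly samples.
times = pd.date_range("2000-01-01", periods=10, freq="6h")
series = pd.Series(np.arange(10.0), index=times)

# Deprecated form: series.resample("24h", loffset="-12h").mean()
# Replacement: resample first, then shift the resulting index by the offset.
resampled = series.resample("24h").mean()
resampled.index = resampled.index + to_offset("-12h")
```

The bin labels end up 12 hours earlier than the default left-edge labels, which is the effect loffset="-12H" was used for.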

Describe the solution you'd like
I'm not completely sure, but probably supporting operations between xarray objects containing datetimes and pandas' offset objects.

@dcherian
Contributor

dcherian commented Oct 24, 2020

For resample, we could add the offset to .indexes["time"], which is a pd.DatetimeIndex, so the addition should work.
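
A small sketch of why the index route should work where the traceback's path does not: a pd.DatetimeIndex knows how to add an offset, while the raw numpy datetime64 array underneath it does not. The variable names are illustrative. The ndarray branch is wrapped in try/except because its failure is inferred from the traceback above and may differ across pandas versions.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

index = pd.date_range("2000-01-01", periods=3, freq="D")
offset = to_offset("8h")

# DatetimeIndex implements addition with offset objects.
shifted = index + offset

# The underlying numpy array does not; this mirrors the failing operation
# in the traceback (numpy.ndarray + Hour).
try:
    index.values + offset
    ndarray_addition_works = True
except TypeError:
    ndarray_addition_works = False
```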

I agree with generally supporting adding pandas offset objects.

EDIT: to clarify, ds["time"] = ds.indexes["time"] + to_offset(loffset) should do what we want
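
Applied to a resampled dataset, the suggestion above might look like the sketch below. The dataset contents and the loffset value are assumptions chosen for illustration, and lowercase "h" aliases are used because newer pandas versions deprecate the uppercase "H" spelling.

```python
import numpy as np
import pandas as pd
import xarray as xr
from pandas.tseries.frequencies import to_offset

# Illustrative dataset (assumed): one variable sampled every 6 hours.
times = pd.date_range("2000-01-01", periods=10, freq="6h")
ds = xr.Dataset({"bar": ("time", np.arange(10.0))}, coords={"time": times})

loffset = "8h"  # hypothetical offset standing in for the old loffset argument

# Resample first, then shift the time coordinate through its pandas index,
# which supports addition with offset objects.
resampled = ds.resample(time="24h").mean()
resampled["time"] = resampled.indexes["time"] + to_offset(loffset)
```

Because the shift happens on the pandas index rather than on the xarray object, this does not depend on xarray itself supporting offset arithmetic.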

@max-sixty
Collaborator Author

Yes, good point, that would be easy and probably sufficient for most use-cases.
