This repository has been archived by the owner on Oct 7, 2024. It is now read-only.

Dataset.map, GroupBy.map, Resample.map (pydata#3459)
* rename dataset.apply to dataset.map, deprecating apply

* use apply in deprecation test

* adjust docs

* add groupby rename, remove deprecation warnings (to pending)

* change internal usages

* formatting

* whatsnew

* docs

* docs

* internal usages

* formatting

* docstring, see also
max-sixty authored Nov 9, 2019
1 parent ffc3275 commit db0f13d
Showing 13 changed files with 186 additions and 76 deletions.
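
The rename keeps each `apply` as a thin wrapper that warns and then delegates to `map`. Here is a minimal, standard-library-only sketch of that pattern. `Collection` is a hypothetical stand-in class, not xarray's `Dataset`; only the warning category and `stacklevel=2` are taken from the diff.

```python
import warnings


class Collection:
    """Hypothetical stand-in for a container of named variables (not xarray's Dataset)."""

    def __init__(self, variables):
        self.variables = dict(variables)

    def map(self, func, args=(), **kwargs):
        """Apply ``func`` to each variable and return a new Collection."""
        return Collection(
            {name: func(value, *args, **kwargs) for name, value in self.variables.items()}
        )

    def apply(self, func, args=(), **kwargs):
        """Backward-compatible alias of ``map`` that emits a pending-deprecation warning."""
        warnings.warn(
            "Collection.apply may be deprecated in the future. "
            "Using Collection.map is encouraged",
            PendingDeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller, not to this wrapper
        )
        return self.map(func, args=args, **kwargs)


c = Collection({"foo": -1.5, "bar": 2})

# The new name runs without a warning.
mapped = c.map(abs)

# The old name still works but warns. Python's default filters ignore
# PendingDeprecationWarning, so it is recorded explicitly here.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    applied = c.apply(abs)

print(mapped.variables, applied.variables, [w.category.__name__ for w in caught])
```

Using `PendingDeprecationWarning` rather than `DeprecationWarning` means existing callers of `apply` normally see nothing unless they opt in to these warnings.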
4 changes: 2 additions & 2 deletions doc/computation.rst
@@ -462,13 +462,13 @@ Datasets support most of the same methods found on data arrays:
abs(ds)
Datasets also support NumPy ufuncs (requires NumPy v1.13 or newer), or
- alternatively you can use :py:meth:`~xarray.Dataset.apply` to apply a function
+ alternatively you can use :py:meth:`~xarray.Dataset.map` to map a function
to each variable in a dataset:

.. ipython:: python
np.sin(ds)
- ds.apply(np.sin)
+ ds.map(np.sin)
Datasets also use looping over variables for *broadcasting* in binary
arithmetic. You can do arithmetic between any ``DataArray`` and a dataset:
15 changes: 8 additions & 7 deletions doc/groupby.rst
@@ -35,10 +35,11 @@ Let's create a simple example dataset:
.. ipython:: python
- ds = xr.Dataset({'foo': (('x', 'y'), np.random.rand(4, 3))},
- coords={'x': [10, 20, 30, 40],
- 'letters': ('x', list('abba'))})
- arr = ds['foo']
+ ds = xr.Dataset(
+ {"foo": (("x", "y"), np.random.rand(4, 3))},
+ coords={"x": [10, 20, 30, 40], "letters": ("x", list("abba"))},
+ )
+ arr = ds["foo"]
ds
If we groupby the name of a variable or coordinate in a dataset (we can also
@@ -93,15 +94,15 @@ Apply
~~~~~

To apply a function to each group, you can use the flexible
- :py:meth:`~xarray.DatasetGroupBy.apply` method. The resulting objects are automatically
+ :py:meth:`~xarray.DatasetGroupBy.map` method. The resulting objects are automatically
concatenated back together along the group axis:

.. ipython:: python
def standardize(x):
return (x - x.mean()) / x.std()
- arr.groupby('letters').apply(standardize)
+ arr.groupby('letters').map(standardize)
GroupBy objects also have a :py:meth:`~xarray.DatasetGroupBy.reduce` method and
methods like :py:meth:`~xarray.DatasetGroupBy.mean` as shortcuts for applying an
@@ -202,7 +203,7 @@ __ http://cfconventions.org/cf-conventions/v1.6.0/cf-conventions.html#_two_dimen
dims=['ny','nx'])
da
da.groupby('lon').sum(...)
- da.groupby('lon').apply(lambda x: x - x.mean(), shortcut=False)
+ da.groupby('lon').map(lambda x: x - x.mean(), shortcut=False)
Because multidimensional groups have the ability to generate a very large
number of bins, coarse-binning via :py:meth:`~xarray.Dataset.groupby_bins`
2 changes: 1 addition & 1 deletion doc/howdoi.rst
@@ -44,7 +44,7 @@ How do I ...
* - convert a possibly irregularly sampled timeseries to a regularly sampled timeseries
- :py:meth:`DataArray.resample`, :py:meth:`Dataset.resample` (see :ref:`resampling` for more)
* - apply a function on all data variables in a Dataset
- - :py:meth:`Dataset.apply`
+ - :py:meth:`Dataset.map`
* - write xarray objects with complex values to a netCDF file
- :py:func:`Dataset.to_netcdf`, :py:func:`DataArray.to_netcdf` specifying ``engine="h5netcdf", invalid_netcdf=True``
* - make xarray objects look like other xarray objects
2 changes: 1 addition & 1 deletion doc/quick-overview.rst
@@ -142,7 +142,7 @@ xarray supports grouped operations using a very similar API to pandas (see :ref:
labels = xr.DataArray(['E', 'F', 'E'], [data.coords['y']], name='labels')
labels
data.groupby(labels).mean('y')
- data.groupby(labels).apply(lambda x: x - x.min())
+ data.groupby(labels).map(lambda x: x - x.min())
Plotting
--------
7 changes: 7 additions & 0 deletions doc/whats-new.rst
@@ -44,6 +44,13 @@ New Features
option for dropping either labels or variables, but using the more specific methods is encouraged.
(:pull:`3475`)
By `Maximilian Roos <https://github.com/max-sixty>`_
+ - :py:meth:`Dataset.map` & :py:meth:`GroupBy.map` & :py:meth:`Resample.map` have been added for
+ mapping / applying a function over each item in the collection, reflecting the widely used
+ and least surprising name for this operation.
+ The existing ``apply`` methods remain for backward compatibility, though using the ``map``
+ methods is encouraged.
+ (:pull:`3459`)
+ By `Maximilian Roos <https://github.com/max-sixty>`_
- :py:meth:`Dataset.transpose` and :py:meth:`DataArray.transpose` now support an ellipsis (`...`)
to represent all 'other' dimensions. For example, to move one dimension to the front,
use `.transpose('x', ...)`. (:pull:`3421`)
11 changes: 8 additions & 3 deletions xarray/core/dataarray.py
@@ -920,7 +920,7 @@ def copy(self, deep: bool = True, data: Any = None) -> "DataArray":
Coordinates:
* x (x) <U1 'a' 'b' 'c'
- See also
+ See Also
--------
pandas.DataFrame.copy
"""
@@ -1717,7 +1717,7 @@ def stack(
codes=[[0, 0, 0, 1, 1, 1], [0, 1, 2, 0, 1, 2]],
names=['x', 'y'])
- See also
+ See Also
--------
DataArray.unstack
"""
@@ -1765,7 +1765,7 @@ def unstack(
>>> arr.identical(roundtripped)
True
- See also
+ See Also
--------
DataArray.stack
"""
@@ -1923,6 +1923,11 @@ def drop(
"""Backward compatible method based on `drop_vars` and `drop_sel`
Using either `drop_vars` or `drop_sel` is encouraged
+
+ See Also
+ --------
+ DataArray.drop_vars
+ DataArray.drop_sel
"""
ds = self._to_temp_dataset().drop(labels, dim, errors=errors)
return self._from_temp_dataset(ds)
34 changes: 30 additions & 4 deletions xarray/core/dataset.py
@@ -3557,6 +3557,11 @@ def drop(self, labels=None, dim=None, *, errors="raise", **labels_kwargs):
"""Backward compatible method based on `drop_vars` and `drop_sel`
Using either `drop_vars` or `drop_sel` is encouraged
+
+ See Also
+ --------
+ Dataset.drop_vars
+ Dataset.drop_sel
"""
if errors not in ["raise", "ignore"]:
raise ValueError('errors must be either "raise" or "ignore"')
@@ -4108,14 +4113,14 @@ def reduce(
variables, coord_names=coord_names, attrs=attrs, indexes=indexes
)

- def apply(
+ def map(
self,
func: Callable,
keep_attrs: bool = None,
args: Iterable[Any] = (),
**kwargs: Any,
) -> "Dataset":
- """Apply a function over the data variables in this dataset.
+ """Apply a function to each variable in this dataset
Parameters
----------
@@ -4135,7 +4140,7 @@ def apply(
Returns
-------
applied : Dataset
- Resulting dataset from applying ``func`` over each data variable.
+ Resulting dataset from applying ``func`` to each data variable.
Examples
--------
@@ -4148,7 +4153,7 @@ def apply(
Data variables:
foo (dim_0, dim_1) float64 -0.3751 -1.951 -1.945 0.2948 0.711 -0.3948
bar (x) int64 -1 2
- >>> ds.apply(np.fabs)
+ >>> ds.map(np.fabs)
<xarray.Dataset>
Dimensions: (dim_0: 2, dim_1: 3, x: 2)
Dimensions without coordinates: dim_0, dim_1, x
@@ -4165,6 +4170,27 @@ def apply(
attrs = self.attrs if keep_attrs else None
return type(self)(variables, attrs=attrs)

+ def apply(
+ self,
+ func: Callable,
+ keep_attrs: bool = None,
+ args: Iterable[Any] = (),
+ **kwargs: Any,
+ ) -> "Dataset":
+ """
+ Backward compatible implementation of ``map``
+
+ See Also
+ --------
+ Dataset.map
+ """
+ warnings.warn(
+ "Dataset.apply may be deprecated in the future. Using Dataset.map is encouraged",
+ PendingDeprecationWarning,
+ stacklevel=2,
+ )
+ return self.map(func, keep_attrs, args, **kwargs)
+
def assign(
self, variables: Mapping[Hashable, Any] = None, **variables_kwargs: Hashable
) -> "Dataset":
49 changes: 40 additions & 9 deletions xarray/core/groupby.py
@@ -608,7 +608,7 @@ def assign_coords(self, coords=None, **coords_kwargs):
Dataset.swap_dims
"""
coords_kwargs = either_dict_or_kwargs(coords, coords_kwargs, "assign_coords")
- return self.apply(lambda ds: ds.assign_coords(**coords_kwargs))
+ return self.map(lambda ds: ds.assign_coords(**coords_kwargs))


def _maybe_reorder(xarray_obj, dim, positions):
@@ -655,8 +655,8 @@ def lookup_order(dimension):
new_order = sorted(stacked.dims, key=lookup_order)
return stacked.transpose(*new_order, transpose_coords=self._restore_coord_dims)

- def apply(self, func, shortcut=False, args=(), **kwargs):
- """Apply a function over each array in the group and concatenate them
+ def map(self, func, shortcut=False, args=(), **kwargs):
+ """Apply a function to each array in the group and concatenate them
together into a new array.
`func` is called like `func(ar, *args, **kwargs)` for each array `ar`
@@ -702,6 +702,21 @@ def apply(self, func, shortcut=False, args=(), **kwargs):
applied = (maybe_wrap_array(arr, func(arr, *args, **kwargs)) for arr in grouped)
return self._combine(applied, shortcut=shortcut)

+ def apply(self, func, shortcut=False, args=(), **kwargs):
+ """
+ Backward compatible implementation of ``map``
+
+ See Also
+ --------
+ DataArrayGroupBy.map
+ """
+ warnings.warn(
+ "GroupBy.apply may be deprecated in the future. Using GroupBy.map is encouraged",
+ PendingDeprecationWarning,
+ stacklevel=2,
+ )
+ return self.map(func, shortcut=shortcut, args=args, **kwargs)
+
def _combine(self, applied, restore_coord_dims=False, shortcut=False):
"""Recombine the applied objects like the original."""
applied_example, applied = peek_at(applied)
@@ -765,7 +780,7 @@ def quantile(self, q, dim=None, interpolation="linear", keep_attrs=None):
if dim is None:
dim = self._group_dim

- out = self.apply(
+ out = self.map(
self._obj.__class__.quantile,
shortcut=False,
q=q,
@@ -820,16 +835,16 @@ def reduce_array(ar):

check_reduce_dims(dim, self.dims)

- return self.apply(reduce_array, shortcut=shortcut)
+ return self.map(reduce_array, shortcut=shortcut)


ops.inject_reduce_methods(DataArrayGroupBy)
ops.inject_binary_ops(DataArrayGroupBy)


class DatasetGroupBy(GroupBy, ImplementsDatasetReduce):
- def apply(self, func, args=(), shortcut=None, **kwargs):
- """Apply a function over each Dataset in the group and concatenate them
+ def map(self, func, args=(), shortcut=None, **kwargs):
+ """Apply a function to each Dataset in the group and concatenate them
together into a new Dataset.
`func` is called like `func(ds, *args, **kwargs)` for each dataset `ds`
@@ -862,6 +877,22 @@ def apply(self, func, args=(), shortcut=None, **kwargs):
applied = (func(ds, *args, **kwargs) for ds in self._iter_grouped())
return self._combine(applied)

+ def apply(self, func, args=(), shortcut=None, **kwargs):
+ """
+ Backward compatible implementation of ``map``
+
+ See Also
+ --------
+ DatasetGroupBy.map
+ """
+
+ warnings.warn(
+ "GroupBy.apply may be deprecated in the future. Using GroupBy.map is encouraged",
+ PendingDeprecationWarning,
+ stacklevel=2,
+ )
+ return self.map(func, shortcut=shortcut, args=args, **kwargs)
+
def _combine(self, applied):
"""Recombine the applied objects like the original."""
applied_example, applied = peek_at(applied)
@@ -914,7 +945,7 @@ def reduce_dataset(ds):

check_reduce_dims(dim, self.dims)

- return self.apply(reduce_dataset)
+ return self.map(reduce_dataset)

def assign(self, **kwargs):
"""Assign data variables by group.
@@ -923,7 +954,7 @@ def assign(self, **kwargs):
--------
Dataset.assign
"""
- return self.apply(lambda ds: ds.assign(**kwargs))
+ return self.map(lambda ds: ds.assign(**kwargs))


ops.inject_reduce_methods(DatasetGroupBy)
43 changes: 39 additions & 4 deletions xarray/core/resample.py
@@ -1,3 +1,5 @@
+ import warnings
+
from . import ops
from .groupby import DataArrayGroupBy, DatasetGroupBy

@@ -173,8 +175,8 @@ def __init__(self, *args, dim=None, resample_dim=None, **kwargs):

super().__init__(*args, **kwargs)

- def apply(self, func, shortcut=False, args=(), **kwargs):
- """Apply a function over each array in the group and concatenate them
+ def map(self, func, shortcut=False, args=(), **kwargs):
+ """Apply a function to each array in the group and concatenate them
together into a new array.
`func` is called like `func(ar, *args, **kwargs)` for each array `ar`
@@ -212,7 +214,9 @@ def apply(self, func, shortcut=False, args=(), **kwargs):
applied : DataArray or DataArray
The result of splitting, applying and combining this array.
"""
- combined = super().apply(func, shortcut=shortcut, args=args, **kwargs)
+ # TODO: the argument order for Resample doesn't match that for its parent,
+ # GroupBy
+ combined = super().map(func, shortcut=shortcut, args=args, **kwargs)

# If the aggregation function didn't drop the original resampling
# dimension, then we need to do so before we can rename the proxy
@@ -225,6 +229,21 @@ def apply(self, func, shortcut=False, args=(), **kwargs):

return combined

+ def apply(self, func, args=(), shortcut=None, **kwargs):
+ """
+ Backward compatible implementation of ``map``
+
+ See Also
+ --------
+ DataArrayResample.map
+ """
+ warnings.warn(
+ "Resample.apply may be deprecated in the future. Using Resample.map is encouraged",
+ PendingDeprecationWarning,
+ stacklevel=2,
+ )
+ return self.map(func=func, shortcut=shortcut, args=args, **kwargs)
+

ops.inject_reduce_methods(DataArrayResample)
ops.inject_binary_ops(DataArrayResample)
@@ -247,7 +266,7 @@ def __init__(self, *args, dim=None, resample_dim=None, **kwargs):

super().__init__(*args, **kwargs)

- def apply(self, func, args=(), shortcut=None, **kwargs):
+ def map(self, func, args=(), shortcut=None, **kwargs):
"""Apply a function over each Dataset in the groups generated for
resampling and concatenate them together into a new Dataset.
@@ -282,6 +301,22 @@ def apply(self, func, args=(), shortcut=None, **kwargs):

return combined.rename({self._resample_dim: self._dim})

+ def apply(self, func, args=(), shortcut=None, **kwargs):
+ """
+ Backward compatible implementation of ``map``
+
+ See Also
+ --------
+ DataSetResample.map
+ """
+
+ warnings.warn(
+ "Resample.apply may be deprecated in the future. Using Resample.map is encouraged",
+ PendingDeprecationWarning,
+ stacklevel=2,
+ )
+ return self.map(func=func, shortcut=shortcut, args=args, **kwargs)
+
def reduce(self, func, dim=None, keep_attrs=None, **kwargs):
"""Reduce the items in this group by applying `func` along the
pre-defined resampling dimension.
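
The signatures in this diff do not agree on the order of `args` and `shortcut`: `DataArrayGroupBy.map` takes `(func, shortcut, args)` while `DatasetGroupBy.map` takes `(func, args, shortcut)`, and the TODO in `DataArrayResample.map` flags the mismatch. The GroupBy and Resample `apply` wrappers forward both arguments by keyword, which presumably sidesteps the problem. A minimal sketch of the hazard follows, using hypothetical plain functions rather than the xarray classes:

```python
def map_array(func, shortcut=False, args=(), **kwargs):
    """Hypothetical stand-in declaring (func, shortcut, args)."""
    return {"shortcut": shortcut, "args": args}


def map_dataset(func, args=(), shortcut=None, **kwargs):
    """Hypothetical stand-in declaring (func, args, shortcut)."""
    return {"shortcut": shortcut, "args": args}


def apply_positional(target, func, args=(), shortcut=None):
    # Fragile: correct only if ``target`` declares args before shortcut.
    return target(func, args, shortcut)


def apply_keyword(target, func, args=(), shortcut=None):
    # Robust: keywords bind by name, so the declared order does not matter.
    return target(func, shortcut=shortcut, args=args)


extra = (1, 2)
wrong = apply_positional(map_array, len, args=extra)
right = apply_keyword(map_array, len, args=extra)
also_right = apply_keyword(map_dataset, len, args=extra)

print(wrong)  # args and shortcut end up swapped
print(right)
print(also_right)
```

Positional forwarding is only safe when wrapper and target declare the same order, as `Dataset.apply` and `Dataset.map` do in this commit.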