
Implement multidimensional initial guess and bounds for curvefit #7821

Merged (22 commits) on May 31, 2023
2 changes: 2 additions & 0 deletions doc/whats-new.rst
@@ -23,6 +23,8 @@ v2023.05.1 (unreleased)
New Features
~~~~~~~~~~~~

- Added support for multidimensional initial guess and bounds in :py:meth:`DataArray.curvefit` (:issue:`7768`, :pull:`7821`).
By `András Gunyhó <https://github.com/mgunyho>`_.

Breaking changes
~~~~~~~~~~~~~~~~
99 changes: 91 additions & 8 deletions xarray/core/dataarray.py
@@ -5475,6 +5475,7 @@ def polyfit(
numpy.polyfit
numpy.polyval
xarray.polyval
DataArray.curvefit
"""
return self._to_temp_dataset().polyfit(
dim, deg, skipna=skipna, rcond=rcond, w=w, full=full, cov=cov
@@ -6129,8 +6130,8 @@ def curvefit(
func: Callable[..., Any],
reduce_dims: Dims = None,
skipna: bool = True,
p0: dict[str, Any] | None = None,
bounds: dict[str, Any] | None = None,
p0: dict[str, float | DataArray] | None = None,
bounds: dict[str, tuple[float | DataArray, float | DataArray]] | None = None,
param_names: Sequence[str] | None = None,
kwargs: dict[str, Any] | None = None,
) -> Dataset:
@@ -6161,12 +6162,16 @@
Whether to skip missing values when fitting. Default is True.
p0 : dict-like or None, optional
Optional dictionary of parameter names to initial guesses passed to the
`curve_fit` `p0` arg. If none or only some parameters are passed, the rest will
be assigned initial values following the default scipy behavior.
bounds : dict-like or None, optional
Optional dictionary of parameter names to bounding values passed to the
`curve_fit` `bounds` arg. If none or only some parameters are passed, the rest
will be unbounded following the default scipy behavior.
`curve_fit` `p0` arg. If the values are DataArrays, they will be appropriately
broadcast to the coordinates of the array. If none or only some parameters are
passed, the rest will be assigned initial values following the default scipy
behavior.
bounds : dict-like, optional
Optional dictionary of parameter names to tuples of bounding values passed to the
`curve_fit` `bounds` arg. If any of the bounds are DataArrays, they will be
appropriately broadcast to the coordinates of the array. If none or only some
parameters are passed, the rest will be unbounded following the default scipy
behavior.
param_names : sequence of Hashable or None, optional
Sequence of names for the fittable parameters of `func`. If not supplied,
this will be automatically determined by arguments of `func`. `param_names`
@@ -6185,6 +6190,84 @@
[var]_curvefit_covariance
The covariance matrix of the coefficient estimates.

Examples
--------
Generate some exponentially decaying data, where the decay constant and amplitude are
different for different values of the coordinate ``x``:

>>> rng = np.random.default_rng(seed=0)
>>> def exp_decay(t, time_constant, amplitude):
...     return np.exp(-t / time_constant) * amplitude
...
>>> t = np.linspace(0, 10, 11)
>>> da = xr.DataArray(
...     np.stack(
...         [
...             exp_decay(t, 1, 0.1),
...             exp_decay(t, 2, 0.2),
...             exp_decay(t, 3, 0.3),
...         ]
...     )
...     + rng.normal(size=(3, t.size)) * 0.01,
...     coords={"x": [0, 1, 2], "time": t},
... )
>>> da
<xarray.DataArray (x: 3, time: 11)>
array([[ 0.1012573 , 0.0354669 , 0.01993775, 0.00602771, -0.00352513,
0.00428975, 0.01328788, 0.009562 , -0.00700381, -0.01264187,
-0.0062282 ],
[ 0.20041326, 0.09805582, 0.07138797, 0.03216692, 0.01974438,
0.01097441, 0.00679441, 0.01015578, 0.01408826, 0.00093645,
0.01501222],
[ 0.29334805, 0.21847449, 0.16305984, 0.11130396, 0.07164415,
0.04744543, 0.03602333, 0.03129354, 0.01074885, 0.01284436,
0.00910995]])
Coordinates:
* x (x) int64 0 1 2
* time (time) float64 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0

Fit the exponential decay function to the data along the ``time`` dimension:

>>> fit_result = da.curvefit("time", exp_decay)
>>> fit_result["curvefit_coefficients"].sel(param="time_constant")
<xarray.DataArray 'curvefit_coefficients' (x: 3)>
array([1.05692036, 1.73549639, 2.94215771])
Coordinates:
* x (x) int64 0 1 2
param <U13 'time_constant'
>>> fit_result["curvefit_coefficients"].sel(param="amplitude")
<xarray.DataArray 'curvefit_coefficients' (x: 3)>
array([0.1005489 , 0.19631423, 0.30003579])
Coordinates:
* x (x) int64 0 1 2
param <U13 'amplitude'

An initial guess can also be given with the ``p0`` arg (although it does not make much
of a difference in this simple example). To have a different guess for different
coordinate points, the guess can be a DataArray. Here we use the same initial guess
for the amplitude but different guesses for the time constant:

>>> fit_result = da.curvefit(
...     "time",
...     exp_decay,
...     p0={
...         "amplitude": 0.2,
...         "time_constant": xr.DataArray([1, 2, 3], coords=[da.x]),
...     },
... )
>>> fit_result["curvefit_coefficients"].sel(param="time_constant")
<xarray.DataArray 'curvefit_coefficients' (x: 3)>
array([1.0569213 , 1.73550052, 2.94215733])
Coordinates:
* x (x) int64 0 1 2
param <U13 'time_constant'
>>> fit_result["curvefit_coefficients"].sel(param="amplitude")
<xarray.DataArray 'curvefit_coefficients' (x: 3)>
array([0.10054889, 0.1963141 , 0.3000358 ])
Coordinates:
* x (x) int64 0 1 2
param <U13 'amplitude'
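
Bounds can be given as DataArrays in the same way. As an editorial sketch (not part of this diff's docstring; the bound values are made up), a call with per-``x`` bounds on the time constant might look like:

>>> fit_result = da.curvefit(
...     "time",
...     exp_decay,
...     bounds={
...         "time_constant": (
...             xr.DataArray([0.5, 1.5, 2.5], coords=[da.x]),
...             xr.DataArray([2.0, 3.0, 4.0], coords=[da.x]),
...         ),
...     },
... )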

See Also
--------
DataArray.polyfit
103 changes: 75 additions & 28 deletions xarray/core/dataset.py
@@ -361,17 +361,24 @@ def _initialize_curvefit_params(params, p0, bounds, func_args):
"""Set initial guess and bounds for curvefit.
Priority: 1) passed args 2) func signature 3) scipy defaults
"""
from xarray.core.computation import where

def _initialize_feasible(lb, ub):
# Mimics functionality of scipy.optimize.minpack._initialize_feasible
lb_finite = np.isfinite(lb)
ub_finite = np.isfinite(ub)
p0 = np.nansum(
[
0.5 * (lb + ub) * int(lb_finite & ub_finite),
(lb + 1) * int(lb_finite & ~ub_finite),
(ub - 1) * int(~lb_finite & ub_finite),
]
p0 = where(
@mgunyho (Contributor Author) commented on May 7, 2023:

An alternative I considered here (see cd685ba) is

p0 = np.nansum(
    [
        0.5 * (lb + ub) * (lb_finite & ub_finite),
        (lb + 1) * (lb_finite & ~ub_finite),
        (ub - 1) * (~lb_finite & ub_finite),
    ],
    axis=0,
)

which is closer to the original, but it gave warnings about the multiplication (the original version also issues warnings if you remove the int() cast). Not sure which is clearer.

lb_finite,
where(
ub_finite,
0.5 * (lb + ub), # both bounds finite
lb + 1, # lower bound finite, upper infinite
),
where(
ub_finite,
ub - 1, # lower bound infinite, upper finite
0, # both bounds infinite
),
)
return p0
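
For reference, an editorial standalone sketch of this feasible-initialization logic, using the public `xr.where` in place of the internal `where` import, covering the four finite/infinite bound combinations:

import numpy as np
import xarray as xr

def initialize_feasible(lb, ub):
    # Midpoint if both bounds are finite; step inside the single finite
    # bound; 0 if both bounds are infinite.
    lb_finite = np.isfinite(lb)
    ub_finite = np.isfinite(ub)
    return xr.where(
        lb_finite,
        xr.where(ub_finite, 0.5 * (lb + ub), lb + 1),
        xr.where(ub_finite, ub - 1, 0),
    )

lb = xr.DataArray([0.0, -np.inf, 1.0, -np.inf], dims="x")
ub = xr.DataArray([2.0, 5.0, np.inf, np.inf], dims="x")
with np.errstate(invalid="ignore"):  # -inf + inf occurs in an unselected branch
    print(initialize_feasible(lb, ub).values)  # [1. 4. 2. 0.]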

@@ -381,9 +388,13 @@ def _initialize_feasible(lb, ub):
if p in func_args and func_args[p].default is not func_args[p].empty:
param_defaults[p] = func_args[p].default
if p in bounds:
bounds_defaults[p] = tuple(bounds[p])
if param_defaults[p] < bounds[p][0] or param_defaults[p] > bounds[p][1]:
param_defaults[p] = _initialize_feasible(bounds[p][0], bounds[p][1])
lb, ub = bounds[p]
bounds_defaults[p] = (lb, ub)
param_defaults[p] = where(
(param_defaults[p] < lb) | (param_defaults[p] > ub),
_initialize_feasible(lb, ub),
param_defaults[p],
)
if p in p0:
param_defaults[p] = p0[p]
return param_defaults, bounds_defaults
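
The clamping of out-of-bounds defaults is now pointwise. A condensed editorial sketch of how a scalar default interacts with DataArray bounds (finite bounds only, again using the public `xr.where`):

import xarray as xr

default = 1.0  # e.g. a default taken from the function signature
lb = xr.DataArray([2.0, 0.0], dims="x")
ub = xr.DataArray([4.0, 2.0], dims="x")
# Replace the default with the bounds midpoint only where it is infeasible
guess = xr.where((default < lb) | (default > ub), 0.5 * (lb + ub), default)
print(guess.values)  # [3. 1.]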
@@ -8611,8 +8622,8 @@ def curvefit(
func: Callable[..., Any],
reduce_dims: Dims = None,
skipna: bool = True,
p0: dict[str, Any] | None = None,
bounds: dict[str, Any] | None = None,
p0: dict[str, float | DataArray] | None = None,
bounds: dict[str, tuple[float | DataArray, float | DataArray]] | None = None,
param_names: Sequence[str] | None = None,
kwargs: dict[str, Any] | None = None,
) -> T_Dataset:
@@ -8643,12 +8654,16 @@
Whether to skip missing values when fitting. Default is True.
p0 : dict-like, optional
Optional dictionary of parameter names to initial guesses passed to the
`curve_fit` `p0` arg. If none or only some parameters are passed, the rest will
be assigned initial values following the default scipy behavior.
`curve_fit` `p0` arg. If the values are DataArrays, they will be appropriately
broadcast to the coordinates of the array. If none or only some parameters are
passed, the rest will be assigned initial values following the default scipy
behavior.
bounds : dict-like, optional
Optional dictionary of parameter names to bounding values passed to the
`curve_fit` `bounds` arg. If none or only some parameters are passed, the rest
will be unbounded following the default scipy behavior.
Optional dictionary of parameter names to tuples of bounding values passed to the
`curve_fit` `bounds` arg. If any of the bounds are DataArrays, they will be
appropriately broadcast to the coordinates of the array. If none or only some
parameters are passed, the rest will be unbounded following the default scipy
behavior.
param_names : sequence of hashable, optional
Sequence of names for the fittable parameters of `func`. If not supplied,
this will be automatically determined by arguments of `func`. `param_names`
@@ -8715,29 +8730,53 @@
"in fitting on scalar data."
)

# Check that initial guess and bounds only contain dimensions that are in preserved_dims
for param, guess in p0.items():
if isinstance(guess, DataArray):
unexpected = set(guess.dims) - set(preserved_dims)
if unexpected:
raise ValueError(
f"Initial guess for '{param}' has unexpected dimensions "
f"{tuple(unexpected)}. It should only have dimensions that are in data "
f"dimensions {preserved_dims}."
)
for param, (lb, ub) in bounds.items():
for label, bound in zip(("Lower", "Upper"), (lb, ub)):
if isinstance(bound, DataArray):
unexpected = set(bound.dims) - set(preserved_dims)
if unexpected:
raise ValueError(
f"{label} bound for '{param}' has unexpected dimensions "
f"{tuple(unexpected)}. It should only have dimensions that are in data "
f"dimensions {preserved_dims}."
)
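
An editorial sketch of this check firing (assumes an xarray build that includes this PR, plus scipy):

import numpy as np
import xarray as xr

def exp_decay(t, time_constant, amplitude):
    return np.exp(-t / time_constant) * amplitude

da = xr.DataArray(
    np.zeros((3, 11)),
    coords={"x": [0, 1, 2], "time": np.linspace(0, 10, 11)},
)
# "y" is not a dimension of `da`, so curvefit should raise before any fitting
bad_guess = xr.DataArray([1.0, 2.0], dims="y")
try:
    da.curvefit("time", exp_decay, p0={"time_constant": bad_guess})
except ValueError as err:
    print(err)  # Initial guess for 'time_constant' has unexpected dimensions ('y',). ...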

# Broadcast all coords with each other
coords_ = broadcast(*coords_)
coords_ = [
coord.broadcast_like(self, exclude=preserved_dims) for coord in coords_
]
n_coords = len(coords_)

params, func_args = _get_func_args(func, param_names)
param_defaults, bounds_defaults = _initialize_curvefit_params(
params, p0, bounds, func_args
)
n_params = len(params)
kwargs.setdefault("p0", [param_defaults[p] for p in params])
kwargs.setdefault(
"bounds",
[
[bounds_defaults[p][0] for p in params],
[bounds_defaults[p][1] for p in params],
],
)

def _wrapper(Y, *coords_, **kwargs):
def _wrapper(Y, *args, **kwargs):
A reviewer (Contributor) commented:

Suggested change
def _wrapper(Y, *args, **kwargs):
def _wrapper(Y, coords: list[?], p0: list[?], lb: list[?], ub: list[?], **kwargs):

The argument handling below is hard to read and looks prone to error; is a more explicit version possible?
Might have to remove the * in apply_ufunc as well.

@mgunyho (Contributor Author) replied:

I agree, but I'm not sure it's possible. The signature is apply_ufunc(func, *args, ...), which only broadcasts each element in *args, so it doesn't work if the element itself is a list, like in the case of the initial guess. If a, b and c are float | DataArray, then we can only broadcast them like f(a, b, c), and not like f([a, b, c]).

Let me know if you think the code could be made easier to follow somehow.

@mgunyho (Contributor Author) replied on May 27, 2023:

Okay, I found a way to do it: if we convert the outer list to an additional dimension for coords_ and for each of the params & bounds, and add that additional dimension to input_core_dims, then we can pass array-valued arguments to _wrapper with apply_ufunc. I'm not sure if it's more readable though.

You can see the implementation here (see here for the diff compared to this version, and here for the diff compared to main; note: you have to click the Files tab, I wasn't able to link straight to the line in the diff).

It still needs some work:

  • I now use Dataset(...).to_array(dim_name) to do the broadcasting + adding a new dimension, but is this a good way to do it? Another way I could think of was something like concat([coord.expand_dims(new_dim=[i]) for i, coord in enumerate(coords_)], "new_dim"). Neither option is very easy to understand, and both probably have quite a bit of overhead.
  • The dimension names "param" and "coord" are now hard-coded, so it will not work if the input array has these dimensions. Do you have a better suggestion? The output already has the hard-coded dimension "param", so maybe it's not so bad.

If you think it's worth pursuing this approach, I can add that commit to this branch and we can discuss further.

The reviewer (Contributor) replied:

I think I prefer the current version in this PR.

I apologize for the noise.

@mgunyho (Contributor Author) replied:

No worries. I think the alternative version could be OK, if there was a more explicit / clearer way of saying "for each element in this list, broadcast them to be the same shape, and then concatenate them along a new temporary axis", other than Dataset(...).to_array(). But yeah neither option is clearly better.

# Wrap curve_fit with raveled coordinates and pointwise NaN handling
x = np.vstack([c.ravel() for c in coords_])
# *args contains:
# - the coordinates
# - initial guess
# - lower bounds
# - upper bounds
coords__ = args[:n_coords]
p0_ = args[n_coords + 0 * n_params : n_coords + 1 * n_params]
lb = args[n_coords + 1 * n_params : n_coords + 2 * n_params]
ub = args[n_coords + 2 * n_params :]

x = np.vstack([c.ravel() for c in coords__])
y = Y.ravel()
if skipna:
mask = np.all([np.any(~np.isnan(x), axis=0), ~np.isnan(y)], axis=0)
Expand All @@ -8748,7 +8787,7 @@ def _wrapper(Y, *coords_, **kwargs):
pcov = np.full([n_params, n_params], np.nan)
return popt, pcov
x = np.squeeze(x)
popt, pcov = curve_fit(func, x, y, **kwargs)
popt, pcov = curve_fit(func, x, y, p0=p0_, bounds=(lb, ub), **kwargs)
return popt, pcov
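
To make the slicing above concrete, an editorial toy partition with placeholder strings, one coordinate and two parameters:

n_coords, n_params = 1, 2
args = ("t", "p0_tc", "p0_amp", "lb_tc", "lb_amp", "ub_tc", "ub_amp")
coords__ = args[:n_coords]                                     # ('t',)
p0_ = args[n_coords + 0 * n_params : n_coords + 1 * n_params]  # ('p0_tc', 'p0_amp')
lb = args[n_coords + 1 * n_params : n_coords + 2 * n_params]   # ('lb_tc', 'lb_amp')
ub = args[n_coords + 2 * n_params :]                           # ('ub_tc', 'ub_amp')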

result = type(self)()
Expand All @@ -8758,13 +8797,21 @@ def _wrapper(Y, *coords_, **kwargs):
else:
name = f"{str(name)}_"

input_core_dims = [reduce_dims_ for _ in range(n_coords + 1)]
@mgunyho (Contributor Author) commented:

I factored this out of the function call because Black was suggesting something like

apply_ufunc(
    ...,
    input_core_dims=[reduce_dims_ for _ in range(n_coords + 1)]
    + [
        [] for _ in range(3 * n_params)
    ],  # core_dims for p0 and bounds
    ...
)

which I found quite ugly.
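
The empty lists appended below give `p0` and the bounds no core dimensions, so `apply_ufunc` hands each one to `_wrapper` as a 0-d value per fit point. A minimal editorial illustration of that mechanism (the function `f` and the `scale` parameter are hypothetical, not from the PR):

import numpy as np
import xarray as xr

def f(y, scale):
    # y arrives as a 1-D block over the core dim; scale as a 0-d per-point value
    return y.sum() * scale

da = xr.DataArray(np.ones((2, 3)), dims=("x", "t"))
scale = xr.DataArray([1.0, 10.0], dims="x")
out = xr.apply_ufunc(
    f,
    da,
    scale,
    input_core_dims=[["t"], []],  # "t" is consumed; scale is broadcast pointwise
    vectorize=True,
)
print(out.values)  # [ 3. 30.]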

input_core_dims.extend(
[[] for _ in range(3 * n_params)]
) # core_dims for p0 and bounds

popt, pcov = apply_ufunc(
_wrapper,
da,
*coords_,
*param_defaults.values(),
*[b[0] for b in bounds_defaults.values()],
*[b[1] for b in bounds_defaults.values()],
vectorize=True,
dask="parallelized",
input_core_dims=[reduce_dims_ for d in range(len(coords_) + 1)],
input_core_dims=input_core_dims,
output_core_dims=[["param"], ["cov_i", "cov_j"]],
dask_gufunc_kwargs={
"output_sizes": {