Supplying multidimensional initial guess to `curvefit` #7768

mgunyho · 2023-04-19T12:37:53Z

Is your feature request related to a problem?

Hi, I'm trying to use DataArray.curvefit to fit a bunch of data. Let's say the data dimensions are (x, experiment_index), and I'm trying to fit m * x + b, where m will be different for each experiment_index. I would like to supply an initial guess p0 to curvefit that depends on experiment_index, but it seems like this is not supported. Here's a minimal example:

import numpy as np
import xarray as xr

x = xr.DataArray(coords=[("x", np.linspace(0, 10, 101))]).x
i = xr.DataArray(coords=[("experiment_index", [1, 2, 3])]).experiment_index

data = 2.0 * i * x + 5

m_guess = 2 * i

data.curvefit(
    "x",
    lambda x, m, b: m * x + b,
    p0={"m": m_guess}  # I would like to provide a guess for 'm' as a function of `experiment_index`
)

Describe the solution you'd like

I would like to be able to provide arrays as the values of p0, so that I can have different initial guesses for different slices of the data.

I suppose this could also be implemented for bounds.

Describe alternatives you've considered

I could wrap curvefit in a for-loop, for example

result = []
for y in data.transpose("experiment_index", ...):
    result.append(y.curvefit(
        "x",
        lambda x, m, b: m * x + b,
        p0={"m": m_guess.sel(experiment_index=y.experiment_index).item()},
    ))
result = xr.concat(result, dim="experiment_index")

But this is quite cumbersome, especially for multidimensional data.

Additional context

The above example gives the error

*** ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.

because curve_fit tries to do np.atleast_1d([m_guess, 1]), but it should be np.atleast_1d([m_guess[0], 1]).

~~The above example gives the error~~

ValueError: operands could not be broadcast together with shapes (3,) (101,)

~~which comes from scipy.curve_fit tying to compute m * x, where m is the DataArray m_guess, but x is a plain Numpy array, basically x.data.~~ - this applies for scipy 1.7.

This toy example of course works with just a scalar guess like p0={"m": 2}, but in my case the function is more complicated and fit might fail if the initial guess is too far off.

The initial guess is inserted into kwargs passed to curve_fit here:

xarray/xarray/core/dataset.py

Line 8659 in c75ac8b

kwargs.setdefault("p0", [param_defaults[p] for p in params])

The text was updated successfully, but these errors were encountered:

welcome · 2023-04-19T12:37:55Z

Thanks for opening your first issue here at xarray! Be sure to follow the issue template!
If you have an idea for a solution, we would really welcome a Pull Request with proposed changes.
See the Contributing Guide for more.
It may take us a while to respond here, but we really value your contribution. Contributors like you help make xarray better.
Thank you!

slevang · 2023-04-20T17:16:39Z

This should be doable. I think we would have to rework the apply_ufunc wrapper to pass p0 and bounds as DataArrays through *args instead of simple dictionaries throughkwargs, so that apply_ufunc can broadcast them and handle dask chunks.

mgunyho · 2023-05-06T13:18:47Z

I implemented this for p0 in #7821. I used your idea of passing p0 as part of *args. It's maybe a tiny bit hacky to put two things in *args and then reconstruct them based on the lengths, but not too bad.

I can also do this for the bounds, just didn't have time to do it yet. How do you think the multidimensional bounds should be passed? As a tuple of arrays, or as an array of tuples, or something else? To me, it would make most sense to pass them as tuples of "things that can be broadcast", so that e.g. the lower bound of can be a scalar 0, but the upper bound could vary.

BeaBoh · 2024-03-25T17:14:29Z

Hello,

Was this also implemented for bounds at the end?

Thanks

mgunyho · 2024-03-25T20:02:13Z

Hi, yes, it was implemented for bounds as well.

mgunyho added the enhancement label Apr 19, 2023

mgunyho mentioned this issue May 6, 2023

Implement multidimensional initial guess and bounds for curvefit #7821

Merged

3 tasks

Illviljan closed this as completed in #7821 May 31, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Supplying multidimensional initial guess to `curvefit` #7768

Supplying multidimensional initial guess to `curvefit` #7768

mgunyho commented Apr 19, 2023 •

edited

Loading

welcome bot commented Apr 19, 2023

slevang commented Apr 20, 2023

mgunyho commented May 6, 2023

BeaBoh commented Mar 25, 2024

mgunyho commented Mar 25, 2024

Supplying multidimensional initial guess to curvefit #7768

Supplying multidimensional initial guess to curvefit #7768

Comments

mgunyho commented Apr 19, 2023 • edited Loading

Is your feature request related to a problem?

Describe the solution you'd like

Describe alternatives you've considered

Additional context

welcome bot commented Apr 19, 2023

slevang commented Apr 20, 2023

mgunyho commented May 6, 2023

BeaBoh commented Mar 25, 2024

mgunyho commented Mar 25, 2024

Supplying multidimensional initial guess to `curvefit` #7768

Supplying multidimensional initial guess to `curvefit` #7768

mgunyho commented Apr 19, 2023 •

edited

Loading