Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Variogram fitting does not work for 'large' coordinates? #143

Closed
krihabu opened this issue Mar 18, 2021 · 6 comments · Fixed by #145
Closed

Variogram fitting does not work for 'large' coordinates? #143

krihabu opened this issue Mar 18, 2021 · 6 comments · Fixed by #145
Assignees
Labels
bug Something isn't working Documentation help wanted Extra attention is needed question Further information is requested
Milestone

Comments

@krihabu
Copy link

krihabu commented Mar 18, 2021

Hello,

I have encountered an issue in fitting a variogram model to an estimated variogram and it seems to be connected to the used coordinates.

Along the tutorial on variogram fitting I tried the following code:

bin_max = 300000

bins = np.linspace(0, bin_max, 10)
bin_center, gamma = gstools.vario_estimate_unstructured((y_coord, x_coord), vals, bins)

fit_model = gstools.Exponential(dim=2)
fit_param, _ = fit_model.fit_variogram(bin_center, gamma)
plt.plot(bin_center, gamma)
gs.covmodel.plot.plot_variogram(fit_model, ax=plt.gca(), x_max=bins[-1])

The arrays x_coord and y_coord contain the coordinates for data points in Germany in the Gauss-Krüger projection (EPSG 31467). Hence, they are in the range 3280000 to 3935000 for the x direction and 5235000 to 6105000 for the y direction.

This is the resulting variogram plot, which does not look right:
gstools_issue1

I tried to locate my error and found that when doing the following conversions beforehand, the result looks very different:

x_coord = x_coord / 10 ** 6
y_coord = y_coord / 10 ** 6
bin_max = bin_max / 10 ** 6

This is the resulting variogram plot:
gstools_issue2

Is this expected behavior?

--
I use Python 3.7.9 and gstools 1.2.1

@MuellerSeb
Copy link
Member

Hey @krihabu ,

this is a problem with scipy's curve_fit. The routine gives up to fast and doesn't search the higher regions for length-scales.

There were some major changes since 1.2.1 in develop and v1.3 will be released soon. There you can provide initial guesses for the variogram fitting:

import numpy as np
import gstools as gs

### SYNTHETIC DATA
x = np.random.RandomState(19970221).rand(1000) * 10000.
y = np.random.RandomState(20011012).rand(1000) * 10000.
model = gs.Exponential(dim=2, var=2, len_scale=2000)
srf = gs.SRF(model, mean=0, seed=19970221)
field = srf((x, y))
### SYNTHETIC DATA

# estimate the variogram of the field (auto binning)
bin_center, gamma = gs.vario_estimate((x, y), field)

# fit the variogram with a stable model. (no nugget fitted)
fit_model = gs.Stable(dim=2)
fit_model.len_scale = 1000  # init guess for len_scale
fit_model.fit_variogram(bin_center, gamma, nugget=False, init_guess="current")

# output
ax = fit_model.plot(x_max=max(bin_center))
ax.scatter(bin_center, gamma)
print(fit_model)

vario_fit

Stable(dim=2, var=1.7, len_scale=1.68e+03, nugget=0.0, anis=[1.0], angles=[0.0], alpha=0.932)

So providing an initial guess for len-scale could be a good tutorial (like taking 1/10 of the domain size as often reported as a good guess in literature).

But you are right, it should work out of the box.

Another idea I was thinking about for some time was to fit lenght scale via its order of magnitude, i.e. log-value. This could help finding the correct size.

Hope this helps for now! Will keep this open to keep track of it.
Cheers, Sebastian

@MuellerSeb MuellerSeb self-assigned this Mar 19, 2021
@MuellerSeb MuellerSeb added bug Something isn't working Documentation help wanted Extra attention is needed question Further information is requested labels Mar 19, 2021
@LSchueler
Copy link
Member

Just in case you didn't see the instructions and you need a solution urgently, it's pretty easy to install the latest and not yet released version of GSTools via pip.

@krihabu
Copy link
Author

krihabu commented Mar 19, 2021

Thank you for the quick response!

I will go with my dirty fix for the moment and will try the init_guess option when the new version is released.

@MuellerSeb
Copy link
Member

Another option could be to change the scale of the model with the new rescale parameter, so that the len_scale is given in kilometer and not meter:

# estimate the variogram of the field
bin_center, gamma = gs.vario_estimate((x, y), field)
# fit the variogram with a rescale factor of 1/1000
fit_model = gs.Exponential(dim=2, rescale=1 / 1000)
fit_model.fit_variogram(bin_center, gamma)
# output
ax = fit_model.plot(x_max=max(bin_center))
ax.scatter(bin_center, gamma)
print(fit_model)

vario_fit_rescale

Exponential(dim=2, var=1.61, len_scale=1.68, nugget=0.0509, anis=[1.0], angles=[0.0])

As you see, len_scale=1.68 could be estimated with the correct order of magnitude.
Note, that the default init guess is always 1.

The rescaled length can be accessed by:

fit_model.len_rescaled
1678.510771993176

@MuellerSeb
Copy link
Member

@krihabu Thanks to your feedback, I came up with a new set of initial guesses for variogram fitting: #145.
len_scale will be set to the mean of given bin centers to obtain the correct order of magnitude. In addition, var and nugget will be set to the mean of given variogram values for the same reason.

Will be part of GSTools v1.3 😉

Love this! Thanks.
Sebastian

@MuellerSeb MuellerSeb added this to the 1.3 milestone Mar 21, 2021
@MuellerSeb MuellerSeb linked a pull request Mar 21, 2021 that will close this issue
@krihabu
Copy link
Author

krihabu commented Mar 22, 2021

Those sound like reasonable presumptions, thanks for integrating this so quickly.
Looking forward to the new release!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Documentation help wanted Extra attention is needed question Further information is requested
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants