Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement the nugget on the diagonal to prevent exact interpolation #30

Closed
kvanlombeek opened this issue Dec 20, 2016 · 12 comments · Fixed by #153
Closed

implement the nugget on the diagonal to prevent exact interpolation #30

kvanlombeek opened this issue Dec 20, 2016 · 12 comments · Fixed by #153

Comments

@kvanlombeek
Copy link
Contributor

As suggested already in the code, implement the option that the variance is taken into account when you krige.

@rth
Copy link
Contributor

rth commented Dec 20, 2016

@kvanlombeek Could you please explain a bit more what you mean by "taken into account", and in which section it is suggested in the code? I couldn't find any reference to it in the code (apart the fact that variance is calculated)... Thanks!

Edit: missed the description in the title again ) Could you explain a bit more "implement the sill on the diagonal to prevent exact interpolation"?

@kvanlombeek kvanlombeek changed the title implement the sill on the diagonal to prevent exact interpolation implement the nugget on the diagonal to prevent exact interpolation Dec 21, 2016
@kvanlombeek
Copy link
Contributor Author

This is what I found in the code, and is exactly what I would like to implement:

    In the future, the code may include an extra 'exact_values' boolean flag that can be
    adjusted to specify whether to treat the measurements as 'exact'. Setting the flag
    to false would indicate that the variogram should not be forced to be zero at zero distance
    (i.e., when evaluated at data points). Instead, the uncertainty in the point will be
    equal to the nugget. This would mean that the diagonal of the kriging matrix would be set to
    the nugget instead of to zero.

Realized I made some mistakes in the description, sorry about that.

@basaks
Copy link
Collaborator

basaks commented Dec 21, 2016

This is actually useful and is analogous to adding measurement noise to the observations.

@bsmurphy
Copy link
Contributor

Yeah, this is one of my long-term goals... shouldn't be that hard to implement. I think it would basically just require (1) adding a new internal variable into the init function(s) and (2) adding one statement to set the diagonal of the kriging matrix to the nugget. Might be another sneaky addition that would need to be made, but I don't think it would require much more than that. @kvanlombeek, would you be willing to take out a pull request for this? Otherwise I can work on it in the coming weeks...

@kvanlombeek
Copy link
Contributor Author

Yes I can give this a shot. I actually implemented this already here, but would like to do it all over again as it was a bit messy.

I personally have never understood why it is not standard, the data I krige (mostly houseprices) is always full of noise.

@basaks
Copy link
Collaborator

basaks commented Dec 22, 2016

I think this should be implemented.

I personally have never understood why it is not standard, the data I krige (mostly houseprices) is always full of noise.

In the Gaussial Processes literature adding (white)noise to the Kernel functions is actually very common as demonstrated in the examples. Otherwise with enough basis functions you can fit your training data exactly (this will compare to zero nugget), and that can introduce overfitting.

@kvanlombeek
Copy link
Contributor Author

I am having difficulties implementing this.

Do @rth @basaks @basaks you know why you fill in the negative of the variogram function, around line 381 in ok.py:

a[:n, :n] = - self.variogram_function(self.variogram_model_parameters, d)

If I replace the diagonal with the absolute values of the diagonal it works, but I can't get why.

@rth
Copy link
Contributor

rth commented Dec 23, 2016

@bsmurphy would know more about this I think...

@bsmurphy
Copy link
Contributor

That is a good question... I can't remember exactly right off the top of my head (and I don't have my reference texts handy at the moment), but I think it has to do with the sign convention that I used in this formulation of the kriging system...
What are you trying to put on the diagonal of the matrix currently? The negative of the nugget, or the actual (positive) nugget? In my comment above I think I mentioned just the diagonal of the kriging matrix; I think you also have to modify the formulation of the RHS of the matrix system (vector b) as well to add the nugget.

@kvanlombeek
Copy link
Contributor Author

Hi all,

I have investigated a bit more, and came to the following:

  • The sign convention is a bit odd and hard to debug. I believe it is cleaner if matrix a would be filled with positive values. That implies changing the signs here and there in the code.

  • I have been looking at _exec_loop, as this is the easiest to read. I believer there is a mistake in the code, the calculation of sigmasq around line 470 is wrong, shouldn't it be sigmasq[j] = np.sum(x[:n, 0] * b[:n, 0]) + b[-1,0] instead of sigmasq[j] = np.sum(x[:, 0] * b[:, 0])

For the rest it looks pretty straight forward. Do you guys know a text book example that we can use to be sure our calculations are correct? Something with maybe only 5 values?

@bsmurphy
Copy link
Contributor

Changing the sign convention could unleash all sorts of unanticipated problems, but with @rth's refactoring idea this would be the time to mess around with it. I'll need to look through my references again to make sure I'm remembering the sign business correctly... Regarding the calculation of the squared residuals, you're probably right -- I noticed that the kriging statistics calculations often produce incorrect (or even NaN) results, so something is definitely wrong with that whole business at some point. Again, I'll need to check the references... The Kitanidis text (the reference I use most often for all the kriging stuff) has an example I think, I'll try to pull it out.

@lewisjared
Copy link

Just wondering if this (being allowed to have uncertainties on the measurements for the kriging procedure) has been implemented yet? Is there a branch which contains the WIP?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants