Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is it possible to use GridSearchCV with the class OrdinaryKriging? #128

Closed
Irene-GM opened this issue Jul 16, 2019 · 11 comments · Fixed by #158
Closed

Is it possible to use GridSearchCV with the class OrdinaryKriging? #128

Irene-GM opened this issue Jul 16, 2019 · 11 comments · Fixed by #158

Comments

@Irene-GM
Copy link

Irene-GM commented Jul 16, 2019

Hi,

I am interested in using pykrige.ok.OrdinaryKriging and pykrige.uk.UniversalKriging in many different set ups with respect to the accepted parameters. Thus, it makes sense to use GridSearchCV to define a big dictionary with all the parameters and let the library do the rest. However, it seems that GridSearchCV is coupled to the RegressionKriging package and not accepting parameters from OK/UK. Thus, if I use the code below specifying some of the OK parameters (e.g. variogram_parameters, anisotropy_scaling), the following error is raised:

ValueError: Invalid parameter anisotropy_angle for estimator Krige(method='ordinary', n_closest_points=10, nlags=6, variogram_model='linear',
      verbose=False, weight=False). Check the list of available parameters with `estimator.get_params().keys()`.

Find below the code used. I wonder whether there is a way of using GridSearchCV also with the OK/UK parameters that I might have overlooked, or else the solution is to generate in a loop the models with the parameters required for this experiment.

Thanks for the great work! :-)

import numpy as np
from pykrige.rk import Krige
from pykrige.compat import GridSearchCV

methods = ["ordinary"]
vario_models = ["linear", "exponential", "gaussian", "spherical", "power"]
weights = [True, False]
ani_scalings = [1]
ani_angles = [0]
verb = [True]
enable_plot = [False]
enable_stat = [False]
coordi_type = ["euclidean"]

param_dict = {  "method":methods,
                "variogram_model":vario_models,
                "weight":weights,
                "anisotropy_scaling":ani_scalings,
                "anisotropy_angle":ani_angles,
                "verbose":verb,
                "enable_plotting":enable_plot,
                "enable_statistics":enable_stat,
                "coordinates_type":coordi_type
                }

estimator = GridSearchCV(Krige(), param_dict, verbose=True)

print("List of params: ")
print(estimator.get_params().keys())

# dummy data
X3 = np.random.randint(0, 400, size=(100, 2)).astype(float)
y = 5 * np.random.rand(100)

# run the gridsearch
estimator.fit(X=X3, y=y)

print(estimator.cv_results_)
@mariosgeo
Copy link

Following cause I have the same issue.

@jtilson
Copy link

jtilson commented Mar 18, 2020

I am stuck by this as well.

@jtilson
Copy link

jtilson commented Mar 19, 2020

For my own information, do issues like this get resolved by some volunteer taking on the task? This turns out to be highly important, for those doing science and using this code.

@MuellerSeb
Copy link
Member

Hey there. I'll have a look at it. Since we are going to restructure PyKrige with version 2, we can add it to the list.

@MuellerSeb MuellerSeb added this to the v2.0 milestone Apr 4, 2020
@MuellerSeb MuellerSeb self-assigned this Apr 4, 2020
@MuellerSeb
Copy link
Member

MuellerSeb commented Apr 5, 2020

For the moment you could rewrite the Krige Estimator used by GridSearchCV, to support additional keywords.
I did this with the following hack in your script:

import numpy as np
from pykrige.compat import GridSearchCV
from pykrige.ok import OrdinaryKriging
from pykrige.uk import UniversalKriging
from pykrige.ok3d import OrdinaryKriging3D
from pykrige.uk3d import UniversalKriging3D
from sklearn.base import RegressorMixin, BaseEstimator


krige_methods = {
    "ordinary": OrdinaryKriging,
    "universal": UniversalKriging,
    "ordinary3d": OrdinaryKriging3D,
    "universal3d": UniversalKriging3D,
}
threed_krige = ("ordinary3d", "universal3d")
# valid additional keywords for each method
krige_methods_kws = {
    "ordinary": [
        "anisotropy_scaling",
        "anisotropy_angle",
        "enable_statistics",
        "coordinates_type",
    ],
    "universal": [
        "anisotropy_scaling",
        "anisotropy_angle",
        "drift_terms",
        "point_drift",
        "external_drift",
        "external_drift_x",
        "external_drift_y",
        "functional_drift",
    ],
    "ordinary3d": [
        "anisotropy_scaling_y",
        "anisotropy_scaling_z",
        "anisotropy_angle_x",
        "anisotropy_angle_y",
        "anisotropy_angle_z",
    ],
    "universal3d": [
        "anisotropy_scaling_y",
        "anisotropy_scaling_z",
        "anisotropy_angle_x",
        "anisotropy_angle_y",
        "anisotropy_angle_z",
        "drift_terms",
        "functional_drift",
    ],
}


def validate_method(method):
    if method not in krige_methods.keys():
        raise ValueError(
            "Kriging method must be one of {}".format(krige_methods.keys())
        )


class Krige(RegressorMixin, BaseEstimator):
    def __init__(
        self,
        method="ordinary",
        variogram_model="linear",
        variogram_parameters=None,
        variogram_function=None,
        nlags=6,
        weight=False,
        n_closest_points=10,
        verbose=False,
        anisotropy_scaling=1.0,
        anisotropy_angle=0.0,
        enable_statistics=False,
        coordinates_type="euclidean",
        anisotropy_scaling_y=1.0,
        anisotropy_scaling_z=1.0,
        anisotropy_angle_x=0.0,
        anisotropy_angle_y=0.0,
        anisotropy_angle_z=0.0,
        drift_terms=None,
        point_drift=None,
        external_drift=None,
        external_drift_x=None,
        external_drift_y=None,
        functional_drift=None,
    ):
        validate_method(method)
        self.variogram_model = variogram_model
        self.variogram_parameters = variogram_parameters
        self.variogram_function = variogram_function
        self.nlags = nlags
        self.weight = weight
        self.verbose = verbose
        self.anisotropy_scaling = anisotropy_scaling
        self.anisotropy_angle = anisotropy_angle
        self.enable_statistics = enable_statistics
        self.coordinates_type = coordinates_type
        self.anisotropy_scaling_y = anisotropy_scaling_y
        self.anisotropy_scaling_z = anisotropy_scaling_z
        self.anisotropy_angle_x = anisotropy_angle_x
        self.anisotropy_angle_y = anisotropy_angle_y
        self.anisotropy_angle_z = anisotropy_angle_z
        self.drift_terms = drift_terms
        self.point_drift = point_drift
        self.external_drift = external_drift
        self.external_drift_x = external_drift_x
        self.external_drift_y = external_drift_y
        self.functional_drift = functional_drift
        self.model = None  # not trained
        self.n_closest_points = n_closest_points
        self.method = method
        self.val_kw = "val" if self.method in threed_krige else "z"

    def fit(self, x, y, *args, **kwargs):
        setup = dict(
            variogram_model=self.variogram_model,
            variogram_parameters=self.variogram_parameters,
            variogram_function=self.variogram_function,
            nlags=self.nlags,
            weight=self.weight,
            verbose=self.verbose,
        )
        add_setup = dict(
            anisotropy_scaling=self.anisotropy_scaling,
            anisotropy_angle=self.anisotropy_angle,
            enable_statistics=self.enable_statistics,
            coordinates_type=self.coordinates_type,
            anisotropy_scaling_y=self.anisotropy_scaling_y,
            anisotropy_scaling_z=self.anisotropy_scaling_z,
            anisotropy_angle_x=self.anisotropy_angle_x,
            anisotropy_angle_y=self.anisotropy_angle_y,
            anisotropy_angle_z=self.anisotropy_angle_z,
            drift_terms=self.drift_terms,
            point_drift=self.point_drift,
            external_drift=self.external_drift,
            external_drift_x=self.external_drift_x,
            external_drift_y=self.external_drift_y,
            functional_drift=self.functional_drift,
        )
        for kw in krige_methods_kws[self.method]:
            setup[kw] = add_setup[kw]
        input_kw = self._dimensionality_check(x)
        input_kw.update(setup)
        input_kw[self.val_kw] = y
        self.model = krige_methods[self.method](**input_kw)

    def _dimensionality_check(self, x, ext=""):
        if self.method in ("ordinary", "universal"):
            if x.shape[1] != 2:
                raise ValueError("2d krige can use only 2d points")
            else:
                return {"x" + ext: x[:, 0], "y" + ext: x[:, 1]}
        if self.method in ("ordinary3d", "universal3d"):
            if x.shape[1] != 3:
                raise ValueError("3d krige can use only 3d points")
            else:
                return {
                    "x" + ext: x[:, 0],
                    "y" + ext: x[:, 1],
                    "z" + ext: x[:, 2],
                }

    def predict(self, x, *args, **kwargs):
        if not self.model:
            raise Exception("Not trained. Train first")
        points = self._dimensionality_check(x, ext="points")
        return self.execute(points, *args, **kwargs)[0]

    def execute(self, points, *args, **kwargs):
        points.update(dict(style="points", backend="loop"))
        if isinstance(self.model, (OrdinaryKriging, OrdinaryKriging3D)):
            points.update(dict(n_closest_points=self.n_closest_points))
        else:
            print("n_closest_points will be ignored for UniversalKriging")
        prediction, variance = self.model.execute(**points)
        return prediction, variance


param_dict = {
    "method": ["ordinary"],
    "anisotropy_scaling": [1],
    "anisotropy_angle": [0],
    "enable_statistics": [False],
    "coordinates_type": ["euclidean"],
    "variogram_model": ["linear", "exponential", "gaussian", "spherical"],
    "weight": [True, False],
    "verbose": [True],
}

estimator = GridSearchCV(Krige(), param_dict, verbose=True)
# dummy data
X3 = np.random.randint(0, 400, size=(100, 2)).astype(float)
y = 5 * np.random.rand(100)
# run the gridsearch
estimator.fit(X=X3, y=y)
print(estimator.cv_results_)

Does this help? @Irene-GM @mariosgeo @jtilson
@rth: is this hack worth implementing in a 1.5.1 version? I think it's really just a hack.

@mariosgeo
Copy link

@MuellerSeb yes it helps. Thanks for updating the script, I think it's worthwhile adding it to version 1.5.1 as well.

@MuellerSeb MuellerSeb modified the milestones: v2.0, v1.5.1 Apr 6, 2020
@jtilson
Copy link

jtilson commented Apr 6, 2020

Thank you for looking into this.

@jtilson
Copy link

jtilson commented Jun 2, 2020 via email

@MuellerSeb
Copy link
Member

HI Sabastian, Thank you (all) for your hard work. Question, the below script doesn't account for an important aspect of mine. That is optimizing the vparams (specifically nugget, sill and range). Would you expect to be able to simply add those keys to the script or will they still require special treatment? Regards, jeff

The variogram_parameters can be accessed in the Class given above, if I am not getting it wrong. They are None by default.
Only problem is, you have to stick to one model, or at least a set of models, that have the same list of model parameters (linear has no length-scale for example).

@MuellerSeb
Copy link
Member

@jtilson : I was a bit wrong. You can also pass a dict containing the variogram parameters by name and this can include a "range" keyword which is ignored in the case of a "linear" model. So you don't have to care about the models and you can provide the parameters as a dictionary (and these could come from a list to use the GridSearch).

@jtilson
Copy link

jtilson commented Aug 3, 2020 via email

@rth rth closed this as completed in #158 Aug 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants