Discrete transform #70

Merged
merged 10 commits into from
Feb 25, 2020
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -15,6 +15,7 @@ All notable changes to **GSTools** will be documented in this file.
- Universal
- External Drift Kriging
- Detrended Kriging
- a new transformation function for discrete fields has been added #70

### Changes
- Python versions 2.7 and 3.4 are no longer supported #40 #43
20 changes: 20 additions & 0 deletions examples/07_transformations/02_discrete.py
@@ -0,0 +1,20 @@
"""
discrete fields
---------------

Here we transform a field to a discrete field with five values.
If we do not give thresholds, the pairwise means of the given
values are taken as thresholds.
"""
import numpy as np
import gstools as gs

# structured field with a size of 100x100 and a grid-size of 1x1
x = y = range(100)
model = gs.Gaussian(dim=2, var=1, len_scale=10)
srf = gs.SRF(model, seed=20170519)
srf.structured([x, y])
# create 5 equidistantly spaced values
discrete_values = np.linspace(np.min(srf.field), np.max(srf.field), 5)
Member

The values are equidistant, yes. But they don't need to be equally distributed in the end. I think this could be misleading.

Member Author

You mean it's misleading for this specific example? Would "create 5 equidistantly spaced values for this example" be better? Or do you prefer an example with non-equidistantly spaced values?
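To make the concern above concrete: equidistant values with midpoint thresholds generally do not give evenly populated classes for a Gaussian field. The following is a minimal sketch, assuming standard-normal samples as a stand-in for `srf.field`; the names `field`, `classes`, and `fractions` are only for illustration and are not part of this PR.

```python
import numpy as np

# Stand-in for srf.field (assumption): uncorrelated standard-normal samples.
# A real SRF is spatially correlated, but the point about class sizes holds.
rng = np.random.default_rng(20170519)
field = rng.standard_normal((100, 100))

# 5 equidistantly spaced values between the field minimum and maximum
values = np.linspace(field.min(), field.max(), 5)
# default thresholds: means of neighbouring values
thresholds = (values[1:] + values[:-1]) / 2

# class index of every cell: 0 below the first threshold, 4 above the last
classes = np.digitize(field, thresholds)
fractions = np.bincount(classes.ravel(), minlength=len(values)) / field.size
print(fractions)  # the middle class holds far more cells than the outer ones
```

The values are evenly spaced, but the share of cells assigned to each value is not, which is the distinction the example comment should probably make clear.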

gs.transform.discrete(srf, discrete_values)
srf.plot()
1 change: 1 addition & 0 deletions examples/07_transformations/README.rst
@@ -12,6 +12,7 @@ common transformations:

.. autosummary::
   binary
   discrete
   boxcox
   zinnharvey
   normal_force_moments
3 changes: 3 additions & 0 deletions gstools/transform/__init__.py
@@ -9,6 +9,7 @@

.. autosummary::
   binary
   discrete
   boxcox
   zinnharvey
   normal_force_moments
@@ -22,6 +23,7 @@

from gstools.transform.field import (
    binary,
    discrete,
    boxcox,
    zinnharvey,
    normal_force_moments,
@@ -33,6 +35,7 @@

__all__ = [
    "binary",
    "discrete",
    "boxcox",
    "zinnharvey",
    "normal_force_moments",
52 changes: 52 additions & 0 deletions gstools/transform/field.py
@@ -8,6 +8,7 @@

.. autosummary::
   binary
   discrete
   boxcox
   zinnharvey
   normal_force_moments
@@ -26,6 +27,7 @@

__all__ = [
    "binary",
    "discrete",
    "boxcox",
    "zinnharvey",
    "normal_force_moments",
@@ -67,6 +69,56 @@ def binary(fld, divide=None, upper=None, lower=None):
fld.field[fld.field <= divide] = lower


def discrete(fld, values, thresholds=None):
    """
    Discrete transformation.

    After this transformation, the field has only `len(values)` discrete
    values.

    Parameters
    ----------
    fld : :any:`Field`
        Spatial Random Field class containing a generated field.
        Field will be transformed inplace.
    values : :any:`np.ndarray`
        The discrete values the field will take
    thresholds : :class:`numpy.ndarray`, optional
        the thresholds, where the value classes are separated
        Default: mean of the neighbouring values
    """
    if fld.field is None:
        print("discrete: no field stored in SRF class.")
    else:
        if thresholds is None:
            # just in case, sort the values
            values = np.sort(values)
Member

What if I want the values to BE unsorted? Why do the given values need to be in relation to the values of the given field?

Member Author

Well, at least for the way I implemented the transformation, the values have to be monotonically increasing. I'm also not sure how to interpret non-monotonically increasing values.
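The default described in the docstring (thresholds at the mean of neighbouring values) only subdivides the values sensibly when they are sorted, which is presumably why the sort is applied. A small sketch with made-up numbers, not taken from the PR:

```python
import numpy as np

# deliberately unsorted input; the default branch sorts it first
values = np.sort(np.array([4.0, 0.0, 1.0]))
# pairwise means of neighbouring values
thresholds = (values[1:] + values[:-1]) / 2
print(values)      # [0. 1. 4.]
print(thresholds)  # [0.5 2.5]

# vectorised form of the condition values[i] <= thresholds[i] < values[i + 1]
# that is required of user-supplied thresholds
ok = np.all((values[:-1] <= thresholds) & (thresholds < values[1:]))
print(ok)  # True
```

Without the sort, the pairwise means of `[4, 0, 1]` would be `[2.0, 0.5]`, which are not increasing and so could not serve as class boundaries.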

            thresholds = (values[1:] + values[:-1]) / 2
Member

I think the natural way would be to use thresholds such that the resulting ratios between the given values are even. Maybe I don't get the aim of this transformation...

Member Author

Yeah, I was also not exactly sure how best to handle the input arguments.
But at least in my use case I'm just interested in the number of "value classes". The non-arithmetic-mean thresholds are just a small generalisation, which was easy to implement.

If the function just took an array of thresholds, what values would you assign to all the field values lying between two thresholds?
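One way to read the suggestion of "even ratios" is to take the thresholds from quantiles of the field itself, so that every class covers roughly the same share of cells. This is only a sketch of that reading, not what the PR implements; `field` is a standard-normal stand-in for `srf.field`, and the class values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
field = rng.standard_normal(10_000)  # stand-in for srf.field (assumption)

values = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])  # arbitrary class values
# interior quantiles (20 %, 40 %, 60 %, 80 %) split the field into 5 equal parts
probs = np.arange(1, len(values)) / len(values)
thresholds = np.quantile(field, probs)

# class index of every cell, then the share of cells per class
classes = np.digitize(field, thresholds)
fractions = np.bincount(classes, minlength=len(values)) / field.size
print(fractions)  # each entry is close to 0.2
```

Thresholds obtained this way depend only on the field, so they need not lie between the given values; here the first threshold is well above `values[1]`.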

        else:
            if len(values) != len(thresholds) + 1:
                raise ValueError(
                    "discrete transformation: len(values) != len(thresholds) + 1"
                )
            values = np.array(values)
            thresholds = np.array(thresholds)
            for i in range(len(thresholds)):
                if not (values[i] <= thresholds[i] < values[i + 1]):
Member

Why this restriction?

Member Author
@LSchueler, Feb 20, 2020

How else would you define an array of thresholds that subdivides an array of values? It's a bit like the nodes and the edges of a graph.

Member

The binary transformation does the following:

  1. get a divide value to select "lower" and "upper" values
  2. replace the lower and upper values with given values

The divide value is unrelated to the given lower and upper values, which will be set in the field. It is only related to the input field.
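Carried over to more than two classes, the same two steps could look like the sketch below: the thresholds select classes from the input field, and the values are then looked up by class index, so they may be unsorted and unrelated to the thresholds. `discretize` is a hypothetical helper written for this illustration, not a GSTools function, and its boundary convention (`thresholds[k-1] <= x < thresholds[k]`, as in `np.digitize`) differs slightly from the `<=` used for the lowest class in this PR.

```python
import numpy as np


def discretize(field, values, thresholds):
    """Hypothetical helper (not part of GSTools): map classes to values."""
    values = np.asarray(values)
    thresholds = np.asarray(thresholds)
    if len(values) != len(thresholds) + 1:
        raise ValueError("len(values) != len(thresholds) + 1")
    if np.any(np.diff(thresholds) <= 0):
        raise ValueError("thresholds must be strictly increasing")
    # step 1: class k collects thresholds[k-1] <= field < thresholds[k]
    classes = np.digitize(field, thresholds)
    # step 2: look the values up by class index; the input is left untouched
    return values[classes]


field = np.array([-1.5, -0.2, 0.05, 0.7, 3.0])
# values are deliberately unsorted and unrelated to the thresholds
result = discretize(field, values=[10, -5, 7], thresholds=[-0.5, 0.5])
print(result.tolist())  # [10, -5, -5, 7, 7]
```

Returning a new array also means the thresholds are always compared against the original field, never against cells that have already been overwritten with a class value.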

                    raise ValueError(
                        "discrete transformation: thresholds must lie between values"
                    )

        # handle edge cases
        fld.field[fld.field <= thresholds[0]] = values[0]
        fld.field[fld.field > thresholds[-1]] = values[-1]

        for i in range(len(values[:-2])):
            fld.field[
                np.logical_and(
                    thresholds[i] <= fld.field, fld.field < thresholds[i + 1]
                )
            ] = values[i + 1]


def boxcox(fld, lmbda=1, shift=0):
"""
Box-Cox transformation.
7 changes: 7 additions & 0 deletions tests/test_srf.py
@@ -250,6 +250,13 @@ def test_transform(self):
        tf.binary(srf)
        srf((self.x_grid, self.y_grid), seed=self.seed, mesh_type="structured")
        tf.boxcox(srf)
        srf((self.x_grid, self.y_grid), seed=self.seed, mesh_type="structured")
        values = np.linspace(np.min(srf.field), np.max(srf.field), 3)
        tf.discrete(srf, values)
        srf((self.x_grid, self.y_grid), seed=self.seed, mesh_type="structured")
        values = [-1, 0, 1]
        thresholds = [-0.9, 0.1]
        tf.discrete(srf, values, thresholds)

    def test_incomprrandmeth(self):
        self.cov_model = Gaussian(dim=2, var=0.5, len_scale=1.0)