Discrete transform #70
Changes from 4 commits
d0c270d
c6f6d4a
c4e0319
f2cbee0
9a88da7
bbb8ec6
00df6a4
4a8c9b8
a3580c7
ddb6e5f
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
""" | ||
discrete fields | ||
------------- | ||
|
||
Here we transform a field to a discrete field with five values. | ||
If we do not give thresholds, the pairwise means of the given | ||
values are taken as thresholds. | ||
""" | ||
import numpy as np | ||
import gstools as gs | ||
|
||
# structured field with a size of 100x100 and a grid-size of 1x1 | ||
x = y = range(100) | ||
model = gs.Gaussian(dim=2, var=1, len_scale=10) | ||
srf = gs.SRF(model, seed=20170519) | ||
srf.structured([x, y]) | ||
# create 5 equidistantly spaced values | ||
discrete_values = np.linspace(np.min(srf.field), np.max(srf.field), 5) | ||
gs.transform.discrete(srf, discrete_values) | ||
srf.plot() |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,6 +12,7 @@ common transformations: | |
|
||
.. autosummary:: | ||
binary | ||
discrete | ||
boxcox | ||
zinnharvey | ||
normal_force_moments | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,6 +8,7 @@ | |
|
||
.. autosummary:: | ||
binary | ||
discrete | ||
boxcox | ||
zinnharvey | ||
normal_force_moments | ||
|
@@ -26,6 +27,7 @@ | |
|
||
__all__ = [ | ||
"binary", | ||
"discrete", | ||
"boxcox", | ||
"zinnharvey", | ||
"normal_force_moments", | ||
|
@@ -67,6 +69,56 @@ def binary(fld, divide=None, upper=None, lower=None): | |
fld.field[fld.field <= divide] = lower | ||
|
||
|
||
def discrete(fld, values, thresholds=None): | ||
""" | ||
Discrete transformation. | ||
|
||
After this transformation, the field has only `len(values)` discrete | ||
values. | ||
|
||
Parameters | ||
---------- | ||
fld : :any:`Field` | ||
Spatial Random Field class containing a generated field. | ||
Field will be transformed inplace. | ||
values : :any:`np.ndarray` | ||
The discrete values the field will take | ||
thresholds : :class:`numpy.ndarray`, optional | ||
the thresholds, where the value classes are separated | ||
Default: mean of the neighbouring values | ||
""" | ||
if fld.field is None: | ||
print("discrete: no field stored in SRF class.") | ||
else: | ||
if thresholds is None: | ||
# just in case, sort the values | ||
values = np.sort(values) | ||
What if I want the values to BE unsorted? Why do the given values need to be in relation to the values of the given field?
Well, at least for the way I implemented the transformation, the values have to be monotonically increasing. I'm also not sure how to interpret values that are not monotonically increasing. |
||
thresholds = (values[1:] + values[:-1]) / 2 | ||
I think the natural way would be to use thresholds chosen so that the resulting ratios between the given values are even. Maybe I don't get the aim of this transformation...
Yeah, I was also not exactly sure how best to handle the input arguments. If the function just took an array of thresholds, what values would you assign to all the field values lying between two thresholds? |
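One way to read the "even ratios" suggestion is to take the thresholds from quantiles of the input field instead of from the midpoints of the values. The following is only a sketch of that idea, not part of this PR: `quantile_thresholds` is a hypothetical helper, and a random normal array stands in for `srf.field`.

```python
import numpy as np


def quantile_thresholds(field, n_classes):
    """Hypothetical helper: thresholds that split `field` into
    `n_classes` classes holding roughly equal shares of the cells."""
    # interior quantile levels, e.g. 0.2, 0.4, 0.6, 0.8 for five classes
    levels = np.linspace(0, 1, n_classes + 1)[1:-1]
    return np.quantile(field, levels)


# stand-in for srf.field (assumption: any real-valued array works here)
rng = np.random.default_rng(20170519)
field = rng.normal(size=(100, 100))

thresholds = quantile_thresholds(field, 5)
```

Thresholds obtained this way depend only on the input field, so they could be passed as the `thresholds` argument only if they also satisfy the check in the diff that each threshold lies between neighbouring values.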
||
else: | ||
if len(values) != len(thresholds) + 1: | ||
raise ValueError( | ||
"discrete transformation: len(values) != len(thresholds) + 1" | ||
) | ||
values = np.array(values) | ||
thresholds = np.array(thresholds) | ||
for i in range(len(thresholds)): | ||
if not (values[i] <= thresholds[i] < values[i + 1]): | ||
Why this restriction?
How else would you define an array of thresholds that subdivides an array of values? It's a bit like the nodes and the edges of a graph.
The divide value is unrelated to the given lower and upper values, which will be set in the field. It is only related to the input field. |
||
raise ValueError( | ||
"discrete transformation: thresholds must lie between values" | ||
) | ||
|
||
# handle edge cases | ||
fld.field[fld.field <= thresholds[0]] = values[0] | ||
fld.field[fld.field > thresholds[-1]] = values[-1] | ||
|
||
for i in range(len(values[:-2])): | ||
fld.field[ | ||
np.logical_and( | ||
thresholds[i] <= fld.field, fld.field < thresholds[i + 1] | ||
) | ||
] = values[i + 1] | ||
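For comparison with the review threads above, here is a minimal sketch of a variant in which the thresholds relate only to the input field and the values are arbitrary. It is not the implementation in this PR: `discrete_sketch` is a hypothetical name, it works on a plain array and returns a new one instead of modifying `fld.field` in place, and it uses `np.digitize` with `right=True`, so a cell equal to a threshold falls into the lower class everywhere (the interior loop in the diff treats that boundary differently).

```python
import numpy as np


def discrete_sketch(field, values, thresholds):
    """Hypothetical stand-in for the transform; returns a new array."""
    values = np.asarray(values)
    thresholds = np.asarray(thresholds)
    if len(values) != len(thresholds) + 1:
        raise ValueError("len(values) != len(thresholds) + 1")
    if np.any(np.diff(thresholds) <= 0):
        raise ValueError("thresholds must be strictly increasing")
    # class k collects cells with thresholds[k - 1] < x <= thresholds[k];
    # class 0 is everything <= thresholds[0], the last class everything above
    classes = np.digitize(field, thresholds, right=True)
    # look up the value of each class; values may be in any order
    return values[classes]


field = np.array([[-2.0, -0.5], [0.5, 2.0]])
out = discrete_sketch(field, values=[10, -3, 7], thresholds=[-1.0, 1.0])
```

Because the class indices are computed from the untransformed field in one pass, the assigned values never take part in a later comparison, so they need not be sorted or lie between the thresholds.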
|
||
|
||
def boxcox(fld, lmbda=1, shift=0): | ||
""" | ||
Box-Cox transformation. | ||
|
The values are equidistant, yes. But they don't need to be equally distributed in the end. I think this could be misleading.
You mean it's misleading for this specific example? Would "create 5 equidistantly spaced values for this example" be better? Or do you prefer an example with non-equidistantly spaced values?
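To make the point about the distribution concrete, the sketch below counts how many cells end up in each class when equidistant values and their midpoint thresholds are applied. It assumes a random normal array as a stand-in for `srf.field` and uses `np.digitize` rather than the transform from this PR.

```python
import numpy as np

# stand-in for srf.field (assumption: normally distributed cell values)
rng = np.random.default_rng(0)
field = rng.normal(size=10_000)

# 5 equidistantly spaced values and the midpoints between neighbours
values = np.linspace(field.min(), field.max(), 5)
thresholds = (values[1:] + values[:-1]) / 2

# class index per cell, then the share of cells in each class
classes = np.digitize(field, thresholds, right=True)
fractions = np.bincount(classes, minlength=5) / field.size
```

For a field like this, the middle class receives far more cells than the two outer classes, even though the values themselves are evenly spaced.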