Update Katz_FD #36
Thank you @PiethonProgram. I'll aim to review the PR later this week. In the meantime, please make sure that the CI tests and lint tests are all passing.
Applied formatting changes; the lint checks should now pass. Changed the expected value in self.assertEqual(np.round(katz_fd(x_k), 3), VALUE) from VALUE = 5.783 to 1.503 to reflect the formula change.
antropy/fractal.py
Outdated
# euclidian distance calculation
euclidean_distance = np.sqrt(1 + np.square(np.diff(x, axis=axis)))

# total and average path lengths
total_path_length = euclidean_distance.sum(axis=axis)
average_path_length = euclidean_distance.mean(axis=axis)

# max distance from first to all
horizontal_diffs = np.arange(1, x.shape[axis])
vertical_diffs = np.take(x, indices=np.arange(1, x.shape[axis]), axis=axis) - np.take(
    x, indices=[0], axis=axis
)

if axis == 1:  # reshape if needed
    horizontal_diffs = horizontal_diffs.reshape(1, -1)
elif axis == 0:
    horizontal_diffs = horizontal_diffs.reshape(-1, 1)

# Euclidean distance and max distance
distances = np.sqrt(np.square(horizontal_diffs) + np.square(vertical_diffs))
max_distance = np.max(distances, axis=axis)

# Katz Fractal Dimension Calculation
full_distance = np.log10(total_path_length / average_path_length)
kfd = np.squeeze(full_distance / (full_distance + np.log10(max_distance / total_path_length)))

# ensure scalar output
Hi @PiethonProgram,
I think this implementation can be simplified, for example by following the new implementation proposed in #34, or by leveraging the NeuroKit2 implementation (which currently gives the same output as Antropy): https://github.com/neuropsychology/NeuroKit/blob/45c9ad90d863ebf4e9d043b975a10d9f8fdeb06b/neurokit2/complexity/fractal_katz.py#L6
Sounds good, I will make the adjustments.
Hello, I have taken a look at the implementations you mentioned. Are you sure it can be simplified? Previous implementations that you mentioned are shorter since they are all single-channel.
If you want to only offer single-channel feature extraction then I can make the changes, but otherwise, unless you want to try and decrease time using Numba, I don't think there is much I can "simplify."
Good point about the support for ND arrays. I have not yet found the time to do a deep dive into this, but can we just take the existing implementation of Antropy (see below) and replace the distance calculation by the Euclidean distance?
dists = np.abs(np.diff(x, axis=axis))
ll = dists.sum(axis=axis)
ln = np.log10(ll / dists.mean(axis=axis))
aux_d = x - np.take(x, indices=[0], axis=axis)
d = np.max(np.abs(aux_d), axis=axis)
kfd = np.squeeze(ln / (ln + np.log10(d / ll)))
or is there more to your implementation that I'm missing?
Thank you
I think the essence of the code is the same, but some additional "bits" are needed when using the Euclidean distance on n-dimensional input, in order to compute the distance from the first point to every other point.
If we were only dealing with 1-dimensional input, then yes, we could simply replace the distance calculation line.
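A minimal sketch of what those extra "bits" might look like, assuming the sample index serves as the horizontal coordinate (unit spacing between samples), as in the diff above. The name katz_fd_euclidean is hypothetical, and moving the time axis to the end with np.moveaxis is a swapped-in alternative to reshaping the horizontal offsets per axis; it is not the code from this PR.

```python
import numpy as np


def katz_fd_euclidean(x, axis=-1):
    """Hypothetical Euclidean-distance Katz FD for N-D input (illustration only)."""
    # Put the time axis last so the index offsets broadcast without reshaping.
    x = np.moveaxis(np.asarray(x, dtype=float), axis, -1)

    # Length of each step, assuming unit spacing between consecutive samples.
    steps = np.sqrt(1.0 + np.square(np.diff(x, axis=-1)))
    total = steps.sum(axis=-1)
    mean = steps.mean(axis=-1)

    # Planar distance from the first point to every later point.
    horizontal = np.arange(1, x.shape[-1])
    vertical = x[..., 1:] - x[..., :1]
    d = np.sqrt(np.square(horizontal) + np.square(vertical)).max(axis=-1)

    ln = np.log10(total / mean)
    return ln / (ln + np.log10(d / total))


# A straight line and a flat line both have dimension 1 under this definition.
X = np.vstack([np.arange(1000), np.zeros(1000)])
kfd_rows = katz_fd_euclidean(X)             # time along the last axis
kfd_cols = katz_fd_euclidean(X.T, axis=0)   # time along the first axis
```

One side effect of this variant is that a constant signal no longer triggers a division by zero, since every step has length at least 1.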
Should I be concerned about the unsuccessful checks? It seems the issue is related to GitHub versions, and not the code itself.
Yeah, don't worry about the CI failures; I need to make some upgrades to the GitHub Actions workflow. Thanks!
Reformatted and "simplified" the code. Note: Black formatting caused line 208 in fractal.py to expand into 7 lines (likely due to nested-parenthesis restrictions). This has no impact on the functions, just aesthetics.
Hey, thanks again for the implementation and the PR.
import numpy as np
import stochastic.processes.noise as sn
rng = np.random.default_rng(seed=42)
X = np.vstack([
    np.arange(1000),
    np.sin(2 * np.pi * 1 * np.arange(1000) / 100),
    sn.FractionalGaussianNoise(hurst=0.1, rng=rng).sample(1000),
    sn.FractionalGaussianNoise(hurst=0.9, rng=rng).sample(1000),
    rng.random(1000),
    rng.random(1000)]
)
katz_new(X)
def katz(x, axis=-1):
    # Define total length of curve
    dists = np.abs(np.diff(x, axis=axis))
    length = np.sum(dists, axis=axis)
    # Average distance between successive points
    a = np.mean(dists, axis=axis)
    # Compute farthest distance between starting point and any other point
    # (the transpose trick assumes the time axis is the last one)
    d = np.max(np.abs(x.T - x[..., 0]).T, axis=axis)
    return np.log10(length / a) / (np.log10(d / a))
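The shorter return expression is algebraically the same as the existing ln / (ln + log10(d / ll)) form: with ln = log10(length / a), the denominator becomes log10(length / a) + log10(d / length) = log10(d / a). A small numerical check of that identity is sketched below; the function names katz_original and katz_simplified are hypothetical, and np.take is used for the first sample in both so that any axis works.

```python
import numpy as np


def katz_original(x, axis=-1):
    # Form used in the existing code: ln / (ln + log10(d / ll)).
    dists = np.abs(np.diff(x, axis=axis))
    ll = dists.sum(axis=axis)
    ln = np.log10(ll / dists.mean(axis=axis))
    d = np.max(np.abs(x - np.take(x, indices=[0], axis=axis)), axis=axis)
    return ln / (ln + np.log10(d / ll))


def katz_simplified(x, axis=-1):
    # Shorter form: log10(length / a) / log10(d / a).
    dists = np.abs(np.diff(x, axis=axis))
    length = dists.sum(axis=axis)
    a = dists.mean(axis=axis)
    d = np.max(np.abs(x - np.take(x, indices=[0], axis=axis)), axis=axis)
    return np.log10(length / a) / np.log10(d / a)


rng = np.random.default_rng(seed=42)
X = rng.random((4, 500))

# The two forms should agree up to floating-point error, whichever axis holds time.
same_last_axis = np.allclose(katz_original(X), katz_simplified(X))
same_first_axis = np.allclose(katz_original(X.T, axis=0), katz_simplified(X.T, axis=0))
```

Both flags should come out True, which is consistent with the later remark that the change is a simplification rather than a new distance definition.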
Hi @raphaelvallat, thanks for the catch.
Hi @raphaelvallat,
Please let me know your thoughts.
Thank you for looking into this! Two very minor comments but otherwise I think we are good to merge.
To clarify for others: this PR is just a simplification of the existing implementation. It does not update the distance calculation to use Euclidean distance, as originally discussed in #34
antropy/fractal.py
Outdated
@@ -181,17 +182,22 @@ def katz_fd(x, axis=-1):
>>> x = np.arange(1000)
>>> print(f"{ant.katz_fd(x):.4f}")
1.0000
euclidean_distance = np.sqrt(1 + np.square(np.diff(x, axis=axis)))
remove extra line
antropy/fractal.py
Outdated
a = np.mean(dists, axis=axis)

# Compute the farthest distance between starting point and any other point
# d = np.max(np.abs(x.T - x[..., 0]).T, axis=axis)
feel free to remove this commented line
Hi, the changes have been made. Please let me know if there is anything else, and thank you for your cooperation as well.
Fixed Katz_FD implementation by utilizing Euclidean distances. Also modified test cases to include tests for new method.