
Refactors and fixes sample entropy (last M values must be ignored) #15

Merged
merged 2 commits into from
Jan 10, 2020

Conversation

CSchoel
Contributor

@CSchoel CSchoel commented Dec 28, 2019

@DominiqueMakowski called my attention to a discrepancy between this implementation of sample entropy and my implementation in nolds in this issue: neuropsychology/NeuroKit#53.

I think there is a small issue in pyEntropy that causes this inconsistency: sample entropy is the conditional probability that two pieces of the input sequence that are similar for M time steps will remain similar for M+1 time steps. When counting similar template-vector pairs of length M, we must therefore ignore the last template vector, since it cannot be followed for another time step. Including it would introduce a bias that underestimates the proportion of template pairs that remain similar for M+1 time steps.

Reference: Richman and Moorman (2000), page H2042
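To make the off-by-one concrete, here is a minimal, self-contained sketch of sample entropy with the corrected counting. This is illustrative code, not pyEntropy's actual implementation; the function name and structure are my own.

```python
import numpy as np

def sampen_sketch(x, m, tol):
    """Sample entropy sketch: -ln(A/B), where B counts similar template
    pairs of length m and A counts similar pairs of length m + 1.
    Similarity means Chebyshev distance below tol, self-matches excluded."""
    x = np.asarray(x, dtype=float)
    n = len(x)

    def similar_pairs(templ_len, n_templates):
        # One row per template vector of the given length.
        t = np.array([x[i:i + templ_len] for i in range(n_templates)])
        count = 0
        for i in range(n_templates - 1):
            # Chebyshev distance from template i to all later templates
            # (only pairs with i < j, so self-matches are excluded).
            dist = np.max(np.abs(t[i + 1:] - t[i]), axis=1)
            count += np.sum(dist < tol)
        return count

    # n - m + 1 templates of length m exist, but the last one is dropped:
    # it cannot be followed for another time step (this is the fix).
    b = similar_pairs(m, n - m)
    # All n - m templates of length m + 1 are used.
    a = similar_pairs(m + 1, n - m)
    return -np.log(a / b)

signal = np.cos(np.linspace(start=0, stop=30, num=100))
print(sampen_sketch(signal, m=2, tol=0.2))
```

With the cosine signal from the comparison below (num=100, dim=2, tol=0.2), this sketch should agree with the fixed implementations.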

Following Dominique's hint, I found similar issues in entro-py (ixjlyons/entro-py#2) and pyeeg (forrestbao/pyeeg#29). With the fix suggested in this pull request, pyEntropy produces the same output as nolds and the R package pracma (which I used as the reference for the nolds implementation), as well as the fixed versions of pyeeg and entro-py:

import nolds
import entropy as e
import pyentrp.entropy as pent
import pyeeg.entropy as peeg
import numpy as np

num = 100
dim = 2
tol = 0.2
signal = np.cos(np.linspace(start=0, stop=30, num=num))
print("entro-py", e.sampen(signal, dim, tol, False))
print(" pyentrp", pent.sample_entropy(signal, dim + 1, tol)[-1])
print("   pyeeg", peeg.samp_entropy(signal, dim, tol))
print("   nolds", nolds.sampen(signal, emb_dim=dim, tolerance=tol))

Output:

entro-py 0.27672305866206137
 pyentrp 0.2767230586620615
   pyeeg 0.27672305866206137
   nolds 0.2767230586620615

Since I found the code in pyEntropy hard to understand, I took the liberty of refactoring it. Once that was done, I could also identify the actual culprit in the original code. So if you would like to incorporate the fix but not my refactored version, you can alternatively pull the branch fix_sampen in my fork.

reason for fix: the last M values in the time series cannot be followed for M more steps
reference: Richman and Moorman (2000), page H2042

reason for refactoring: the code was unnecessarily complicated, with a for loop
disguised as a while loop and an unnecessary separation of the first
iteration;
also switched from assert_array_equal to assert_allclose for correct floating-point comparisons in the tests
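As a quick illustration of why assert_allclose is the right tool here (the two values are the outputs quoted above; the snippet itself is mine):

```python
import numpy as np
from numpy.testing import assert_allclose

# The implementations above agree only up to the last float bits:
a = 0.27672305866206137  # entro-py / pyeeg
b = 0.2767230586620615   # pyentrp / nolds

# These are distinct float64 values, so an exact comparison such as
# numpy.testing.assert_array_equal(a, b) would raise an AssertionError.
assert a != b

# assert_allclose compares within a tolerance (default rtol=1e-7), so it
# treats the two results as equal:
assert_allclose(a, b)
print("difference:", abs(a - b))
```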
@codecov

codecov bot commented Dec 28, 2019

Codecov Report

Merging #15 into master will decrease coverage by 0.46%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master      #15      +/-   ##
==========================================
- Coverage   92.25%   91.79%   -0.47%     
==========================================
  Files           2        2              
  Lines         142      134       -8     
==========================================
- Hits          131      123       -8     
  Misses         11       11
Impacted Files Coverage Δ
pyentrp/entropy.py 88.77% <100%> (-0.85%) ⬇️
tests/test_entropy.py 100% <100%> (ø) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update ae2bf71...1f8792f.

@CSchoel
Contributor Author

CSchoel commented Dec 28, 2019

Update: I also adjusted the tests accordingly (and used assert_allclose instead of assert_array_equal for correct float handling).

@nikdon
Owner

nikdon commented Jan 10, 2020

Hey @CSchoel. Thanks for the PR. I think your refactoring is good to merge, and I don't see any objections either :) I'll merge it shortly.
