
Adapt signal_fixpeaks to deal with larger gaps in data #650

Closed
danibene opened this issue Jun 3, 2022 · 4 comments

Comments

@danibene
Collaborator

danibene commented Jun 3, 2022

Describe the solution you'd like
I noticed that the "neurokit" method of signal_fixpeaks does not work well with larger gaps in the data, sometimes leading to negative indices or getting stuck in a loop: https://github.com/danibene/fixpeaks_with_large_gaps/blob/main/example_signal_with_large_gaps.ipynb

How could we do it?
I think there are a few possible ways to go about dealing with larger gaps (not mutually exclusive):

  1. Adapting the number of NaNs depending on the interval size rather than always inserting 2 NaNs for large intervals
  2. Interpolating the peaks directly rather than the intervals
  3. Providing a warning if there are negative indices & breaking the loop after a certain number of iterations
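Suggestion 1 could be sketched roughly as follows; this is a hypothetical helper for illustration, not existing NeuroKit code:

```python
import numpy as np

def n_nans_for_gap(gap_interval, mean_interval):
    """Hypothetical helper: scale the number of inserted NaNs with the
    gap size, instead of always inserting 2 for any large interval."""
    # A gap spanning ~k mean intervals hides roughly k - 1 missing peaks.
    return max(int(round(gap_interval / mean_interval)) - 1, 1)

# e.g. a 4 s gap when beats are ~1 s apart suggests ~3 missing peaks
print(n_nans_for_gap(4.0, 1.0))  # -> 3
```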

If you're interested in any of the above, I'm happy to try my hand & open a PR!

@DominiqueMakowski
Member

Fixing peaks is a tricky issue, but there's definitely room for improvement!
No. 3 would be a good starting point; I'm not sure what's best between 1 and 2, but please do open a PR so we can check.

It would also be nice to add an example/study explaining and benchmarking these methods.

@danibene
Collaborator Author

danibene commented Jun 6, 2022

> Fixing peaks is a tricky issue

Indeed!

I've opened a PR with the 3 changes I mentioned above. It's a work in progress, but I thought I should get your input before overcomplicating things: #647

The new version is able to fix ECG peaks for data where the old version got stuck in a loop.
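A guard along the lines of suggestion 3 (warn on negative indices and cap the number of iterations) could look roughly like this sketch; the function name, `MAX_ITERATIONS`, and the overall structure are assumptions for illustration, not the actual PR code:

```python
import warnings

MAX_ITERATIONS = 100  # assumed cap; not NeuroKit's actual value

def fix_peaks_iteratively(peaks, fix_once):
    """Repeatedly apply a single fixing pass until it converges,
    warning and stopping if indices go negative or the cap is hit."""
    for _ in range(MAX_ITERATIONS):
        new_peaks = fix_once(peaks)
        if any(p < 0 for p in new_peaks):
            warnings.warn("Peak fixing produced negative indices; stopping.")
            break
        if new_peaks == peaks:  # converged, nothing left to fix
            break
        peaks = new_peaks
    return peaks
```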

I still see a couple of issues though:

  • there is no extrapolation in case there is a NaN at the final index (so when interpolating the peaks directly, the index stays constant rather than increasing)
    → modify the signal_interpolate function to extrapolate?
    → could remove NaNs at the end?
  • sometimes the indices are not negative, but still out of order
    → provide warning for peak indices that are not increasing?
  • if there are many missing peaks, it might be better to exclude them instead of linearly interpolating for certain analyses, e.g. short term HRV
    → have option to keep NaNs rather than interpolate above a certain threshold?
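For the first point above, one option is SciPy's `interp1d`, which supports linear extrapolation via `fill_value="extrapolate"`; a minimal illustration of the difference (not NeuroKit code, toy values assumed):

```python
import numpy as np
from scipy.interpolate import interp1d

# Known peak indices with a NaN at the final position
x = np.array([0, 1, 2, 3])
y = np.array([100.0, 200.0, 300.0, np.nan])
valid = ~np.isnan(y)

# With fill_value="extrapolate", the linear trend continues past the
# last valid sample instead of the index staying constant.
f = interp1d(x[valid], y[valid], kind="linear", fill_value="extrapolate")
print(float(f(3)))  # -> 400.0 (continues the increasing trend)
```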

@danibene
Collaborator Author

Hi @DominiqueMakowski & @JanCBrammer , I've noticed a couple of problematic parts of the code from the previous PR

  1. If the intervals are standardized, the calculation of the number of NaNs to insert based on the mean interval does not make sense:

        interval = signal_period(peaks, sampling_rate=sampling_rate, desired_length=None)
        interval = standardize(interval, robust=robust)
        peaks = _interpolate_missing(
            peaks=peaks,
            interval=interval,
            interval_max=relative_interval_max,
            sampling_rate=sampling_rate,
        )

        interval_without_outliers = interval[np.invert(outliers)]
        mean_interval = np.nanmean(interval_without_outliers)

    E.g.:

        import neurokit2 as nk
        peaks = [250, 1250, 1350, 2250, 3250]
        interval = nk.signal_period(peaks, sampling_rate=1000, desired_length=None)
        interval
        # array([0.75, 1.  , 0.1 , 0.9 , 1.  ])

        nk.standardize(interval)
        # array([ 0.        ,  0.66226618, -1.72189206,  0.39735971,  0.66226618])
  2. The calculation of the number of NaNs to insert always uses the floor rather than rounding:

        # compute number of NaNs to insert based on the mean interval
        n_nan = int(interval[loc] / mean_interval)

This is probably not ideal if interval[loc] / mean_interval is closer to the next integer:

    int(1.99)    # -> 1
    round(1.99)  # -> 2
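Combining the two fixes — computing the NaN count from the raw (non-standardized) intervals and rounding instead of flooring — might look like this sketch; the variable values are made up for illustration:

```python
import numpy as np

# Hypothetical example: raw beat-to-beat intervals in seconds, with a
# large gap at index 2 flagged as an outlier.
raw_interval = np.array([0.75, 1.0, 4.5, 0.9, 1.0])
outliers = np.array([False, False, True, False, False])

# Mean computed from the raw intervals (not standardized ones).
mean_interval = np.nanmean(raw_interval[~outliers])

loc = 2  # index of the large gap
ratio = raw_interval[loc] / mean_interval  # ~4.93 mean intervals wide

n_nan_floor = int(ratio)         # floors to 4
n_nan_round = int(round(ratio))  # rounds to 5, closer to the true gap
print(n_nan_floor, n_nan_round)  # -> 4 5
```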

Should I open a new PR? Sorry I didn't notice these earlier 🙈

@danibene reopened this Jul 19, 2022
@DominiqueMakowski
Member

> Should I open a new PR? Sorry I didn't notice these earlier 🙈

No worries, better late than never :) Yes, please do open a PR.
