Optimize smooth_frames=False #318

bemoody · 2021-08-10T15:12:59Z

If we are reading a record in "non-smooth-frames" mode, we need to extract an array of samples for each signal, which may require copying a non-contiguous sequence of samples into a new array.

Previously, this was done through a complicated Python loop that created an array of all the desired sample indices (for example, if the signal contains 1,000,000 samples, this would generate an array of 1,000,000 integers) and then used that array to index into the raw sample array. This is rather slow.

A better way is to treat the raw sample array as a 2D array, extract a slice, and then reshape that slice into a 1D array. This means the work is done within numpy and is very fast.
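To illustrate the difference between the two approaches, here is a minimal sketch. It is not the actual wfdb code: the function names and the `samps_per_frame` list layout are hypothetical, and it assumes each frame stores the samples of each signal back to back in one flat array.

```python
import numpy as np


def extract_signal_indexed(raw, samps_per_frame, ch):
    """Index-array approach (illustrative only): list every sample index."""
    frame_len = sum(samps_per_frame)
    start = sum(samps_per_frame[:ch])
    n_frames = len(raw) // frame_len
    # One integer per output sample, built frame by frame in Python.
    idx = np.concatenate([
        np.arange(f * frame_len + start,
                  f * frame_len + start + samps_per_frame[ch])
        for f in range(n_frames)
    ])
    return raw[idx]


def extract_signal_sliced(raw, samps_per_frame, ch):
    """Slice-and-reshape approach: view the data as (n_frames, frame_len)."""
    frame_len = sum(samps_per_frame)
    start = sum(samps_per_frame[:ch])
    frames = raw.reshape(-1, frame_len)
    # Take this signal's columns from every frame, then flatten to 1D.
    return frames[:, start:start + samps_per_frame[ch]].reshape(-1)


# Three frames; signals 0, 1, 2 have 1, 2 and 1 samples per frame.
raw = np.arange(12)
spf = [1, 2, 1]

indexed = extract_signal_indexed(raw, spf, 1)
sliced = extract_signal_sliced(raw, spf, 1)
print(sliced)  # [ 1  2  5  6  9 10]
```

Both functions return the same samples; the sliced version just avoids building the per-sample index array.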

This shouldn't have any user-visible effects, apart from better performance, and is independent of pull #313.

Example:

$ git checkout master

$ time python3 -c 'import wfdb;[wfdb.rdrecord("sample-data/wave_4",smooth_frames=True) for _ in range(10)]'

real    0m2.009s
user    0m1.892s
sys     0m0.112s

$ time python3 -c 'import wfdb;[wfdb.rdrecord("sample-data/wave_4",smooth_frames=False) for _ in range(10)]'

real    0m7.895s
user    0m7.591s
sys     0m0.301s

$ git checkout optimize-non-smooth

$ time python3 -c 'import wfdb;[wfdb.rdrecord("sample-data/wave_4",smooth_frames=True) for _ in range(10)]'

real    0m1.952s
user    0m1.877s
sys     0m0.122s

$ time python3 -c 'import wfdb;[wfdb.rdrecord("sample-data/wave_4",smooth_frames=False) for _ in range(10)]'

real    0m0.883s
user    0m0.778s
sys     0m0.169s

When reading signal data with smooth_frames = False, we want to return a one-dimensional array of samples for each signal, which means extracting a range of samples from each frame and concatenating those ranges into a single numpy array. Previously this was done by constructing an array of all the desired sample indices, and using that array as a subscript into the input sig_data array. This is inefficient in both time and memory. Instead, extract the desired samples as a slice of a two-dimensional array, and reshape that slice to obtain a 1D array. (Using reshape(-1) instead of flatten() means that if samps_per_frame is 1, the array does not need to be copied. If the array does need to be copied, the reshape operation is still faster than any Python loop and doesn't require additional memory.)
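The point about reshape(-1) versus flatten() can be checked with a small standalone experiment. This is only a sketch on a toy array (the variable names are made up and it does not touch wfdb itself); it uses np.shares_memory to see whether the result is a view of the input or a copy.

```python
import numpy as np

# Toy data: three frames of four samples each.
raw = np.arange(12)
frames = raw.reshape(-1, 4)

# One sample per frame: the column slice has shape (3, 1), and
# reshape(-1) can return a strided view without copying.
one_reshaped = frames[:, 1:2].reshape(-1)
one_flattened = frames[:, 1:2].flatten()  # flatten() always copies

# Two samples per frame: the result cannot be expressed as a single
# strided view, so reshape(-1) has to copy here.
two_reshaped = frames[:, 1:3].reshape(-1)

print(np.shares_memory(raw, one_reshaped))   # True
print(np.shares_memory(raw, one_flattened))  # False
print(np.shares_memory(raw, two_reshaped))   # False
```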

briangow · 2021-08-10T17:45:15Z

This looks good to me!

briangow merged commit 408ad93 into master Aug 10, 2021

briangow deleted the optimize-non-smooth branch August 10, 2021 17:45

tompollard mentioned this pull request Sep 10, 2021

bump version to 3.4.1 #326

Merged
