Optimize smooth_frames=False #318

Merged
merged 1 commit into from
Aug 10, 2021

@bemoody (Collaborator) commented Aug 10, 2021

If we are reading a record in "non-smooth-frames" mode, we need to extract an array of samples for each signal, which may require copying a non-contiguous sequence of samples into a new array.

Previously, this was done through a complicated Python loop that created an array of all the desired sample indices (i.e., if the signal contains 1,000,000 samples, this would generate an array of 1,000,000 integers) and then used that array to index into the raw sample array. This is slow and also wastes memory on the index array.

A better way is to treat the raw sample array as a 2D array, extract a slice, and then reshape that slice into a 1D array. This means the work is done within numpy and is very fast.
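To illustrate the idea, here is a minimal sketch comparing the two approaches on a toy array. It is not the code from this pull request: the function names, the frame layout (three signals with 1, 2 and 4 samples per frame) and the `raw` array are all made up for the example.

```python
import numpy as np

# Hypothetical layout: 3 signals with 1, 2 and 4 samples per frame.
samps_per_frame = [1, 2, 4]
frame_len = sum(samps_per_frame)  # 7 samples in each frame
n_frames = 5

# Flat raw sample array, one frame after another (stand-in for real data).
raw = np.arange(n_frames * frame_len)


def extract_with_index_array(raw, start, count, frame_len, n_frames):
    """Old style: build every wanted index, then index with that array."""
    idx = np.concatenate([
        np.arange(f * frame_len + start, f * frame_len + start + count)
        for f in range(n_frames)
    ])
    return raw[idx]


def extract_with_slice(raw, start, count, frame_len):
    """New style: view the data as (frames, samples), slice, make it 1D."""
    frames = raw.reshape(-1, frame_len)
    return frames[:, start:start + count].reshape(-1)


signals = []
start = 0
for count in samps_per_frame:
    fast = extract_with_slice(raw, start, count, frame_len)
    slow = extract_with_index_array(raw, start, count, frame_len, n_frames)
    # Both approaches should select exactly the same samples.
    assert np.array_equal(fast, slow)
    signals.append(fast)
    start += count
```

The slice version never materializes an index array, so the only allocation is the output itself, and the per-sample work happens inside numpy rather than in a Python loop.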

This shouldn't have any user-visible effects, apart from better performance, and is independent of pull #313.

Example:

$ git checkout master

$ time python3 -c 'import wfdb;[wfdb.rdrecord("sample-data/wave_4",smooth_frames=True) for _ in range(10)]'

real    0m2.009s
user    0m1.892s
sys     0m0.112s

$ time python3 -c 'import wfdb;[wfdb.rdrecord("sample-data/wave_4",smooth_frames=False) for _ in range(10)]'

real    0m7.895s
user    0m7.591s
sys     0m0.301s

$ git checkout optimize-non-smooth

$ time python3 -c 'import wfdb;[wfdb.rdrecord("sample-data/wave_4",smooth_frames=True) for _ in range(10)]'

real    0m1.952s
user    0m1.877s
sys     0m0.122s

$ time python3 -c 'import wfdb;[wfdb.rdrecord("sample-data/wave_4",smooth_frames=False) for _ in range(10)]'

real    0m0.883s
user    0m0.778s
sys     0m0.169s

When reading signal data with smooth_frames = False, we want to return
a one-dimensional array of samples for each signal, which means
extracting a range of samples from each frame and concatenating those
ranges into a single numpy array.

Previously this was done by constructing an array of all the desired
sample indices, and using that array as a subscript into the input
sig_data array.  This is inefficient in both time and memory.

Instead, extract the desired samples as a slice of a two-dimensional
array, and reshape that slice to obtain a 1D array.

(Using reshape(-1) instead of flatten() means that if samps_per_frame
is 1, the array does not need to be copied.  If the array does need to
be copied, the reshape operation is still faster than any Python loop
and requires no memory beyond the output array itself.)
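
The difference between reshape(-1) and flatten() can be checked with `np.shares_memory`. The following is a small sketch on a made-up array (4 frames of 3 samples each), not code taken from the commit; the variable names are illustrative only.

```python
import numpy as np

raw = np.arange(12)
frames = raw.reshape(-1, 3)  # toy layout: 4 frames, 3 samples per frame

# One sample per frame: the column is a regularly strided run of samples,
# so reshape(-1) can return a view instead of copying.
one = frames[:, 1:2].reshape(-1)
one_is_view = np.shares_memory(one, raw)

# flatten() always returns a copy, even for the same one-column slice.
flat = frames[:, 1:2].flatten()
flat_is_view = np.shares_memory(flat, raw)

# Two samples per frame: the selected samples are not evenly spaced in
# memory, so reshape(-1) has to copy them into a new contiguous array.
two = frames[:, 1:3].reshape(-1)
two_is_view = np.shares_memory(two, raw)

print(one_is_view, flat_is_view, two_is_view)
```

One consequence worth keeping in mind: when the result is a view, writing into it also modifies the underlying raw array.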

@briangow (Contributor)

This looks good to me!

@briangow briangow merged commit 408ad93 into master Aug 10, 2021
@briangow briangow deleted the optimize-non-smooth branch August 10, 2021 17:45
@tompollard tompollard mentioned this pull request Sep 10, 2021