performance: Improve resampler performance #1271

Merged
merged 3 commits into from
Sep 27, 2023

sultanqasim
Contributor

Set resampler_cc output_multiple to 4096. This results in slightly increased audio delay (to ~23ms) and huge CPU load reduction.

All credit to @vladisslav2011, I only cherry-picked and tested this change. I found this one small simple change provided something like a 2.5-3x reduction in DSP CPU usage, and the added delay is barely perceptible if at all. No regressions observed. With this change, I can smoothly listen to radio channels while using the full 61.44 Msps sample rate of my USRP B210. Before this change, even 30 Msps had my CPU struggling and skipping audio chunks.

@argilo
Member

argilo commented Sep 26, 2023

Thanks for the pull request.

I was able to reproduce the large performance impact, but I'd like to explore this a bit before merging the change. I'm curious what the root cause of the performance problem is (many downstream blocks in nbrx and wfmrx processing very small numbers of samples at a time?), whether setting this particular block's output multiple is really the best way to address it, and if so, whether 4096 is an appropriate value to use.

@argilo
Member

argilo commented Sep 27, 2023

> I'm curious what the root cause of the performance problem is (many downstream blocks in nbrx and wfmrx processing very small numbers of samples at a time?)

That does appear to be the problem. When the input rate is 12 Msps, for example, the arbitrary resampler most often produces either 40 samples or 1 sample (!!), so the downstream blocks' work functions are called very frequently with small numbers of samples.
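To illustrate why tiny output chunks are costly, here is a minimal toy model in Python. It is not gqrx or GNU Radio scheduler code: the function name `count_work_calls` and the strictly alternating 40/1 chunk pattern are assumptions made for illustration, and the model simply assumes one downstream call per delivered batch.

```python
from itertools import cycle, islice


def count_work_calls(chunk_sizes, output_multiple=1):
    """Count downstream work() calls in a toy model of chunked delivery.

    Assumption: samples are handed downstream only once at least
    `output_multiple` of them have accumulated, and each hand-over
    costs exactly one downstream call.
    """
    calls = 0
    pending = 0
    for n in chunk_sizes:
        pending += n
        if pending >= output_multiple:
            # Deliver the largest whole multiple available in one call
            # and keep the remainder for the next batch.
            pending %= output_multiple
            calls += 1
    return calls


# Hypothetical stream: 20000 chunks alternating between 40 samples and 1 sample.
chunks = list(islice(cycle([40, 1]), 20_000))
print(count_work_calls(chunks))        # 20000 calls, one per chunk
print(count_work_calls(chunks, 4096))  # 100 calls, one per 4096-sample batch
```

In this sketch the same 410000 samples reach the downstream block in 100 calls instead of 20000, which gives a feel for how much per-call overhead batching can remove.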

> whether 4096 is an appropriate value to use

This does appear to be an appropriate value. Decreasing to 2048 results in somewhat higher CPU utilization, and a larger value would increase audio latency appreciably.

The extra 23 ms of audio latency seems acceptable in exchange for reduced CPU utilization.
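The trade-off between candidate values can be sketched with simple arithmetic. The Python snippet below assumes the 240000 sps resampler output rate quoted for FM modes in this thread; the helper name `tradeoff` and the list of candidate multiples are hypothetical. It only computes how often a full block becomes available and how long one block takes to accumulate, so it is not meant to reproduce the measured CPU or latency figures.

```python
def tradeoff(output_rate_sps, multiples):
    """Return (multiple, batches_per_second, fill_time_ms) for each candidate."""
    rows = []
    for m in multiples:
        batches_per_second = output_rate_sps / m    # downstream wake-ups per second
        fill_time_ms = 1000.0 * m / output_rate_sps  # time to accumulate one batch
        rows.append((m, batches_per_second, fill_time_ms))
    return rows


# Assumed output rate: 240000 sps (the FM-mode figure quoted in the thread).
for m, bps, ms in tradeoff(240_000, [1024, 2048, 4096, 8192, 16384]):
    print(f"{m:6d} samples: {bps:7.1f} batches/s, {ms:6.1f} ms to fill")
```

Halving the multiple doubles the number of batches per second, while doubling it doubles the time needed to fill one batch, which matches the direction of the effects described above.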

vladisslav2011 and others added 3 commits September 26, 2023 20:51
Set resampler_cc output_multiple to 4096. This results in slightly
increased audio delay (to ~23ms) and huge CPU load reduction.
@willcode
Contributor

Interesting. Is the latency ok at lower sample rates? I wonder if there is any general solution where a large decimation is followed by expensive processing.

@argilo
Member

argilo commented Sep 27, 2023

> Is the latency ok at lower sample rates?

I expect so, because the output sample rate of the arbitrary resampler is constant (240000 for FM modes, or 96000 for narrow-band modes).

@argilo
Member

argilo commented Sep 27, 2023

I tested with an RTL-SDR at 240,000 sps and audio latency is fine (provided the RTL-SDR hardware input buffer length is decreased by adding `,buflen=4096` to the device string).

@argilo
Member

argilo commented Sep 27, 2023

I rebased, replaced the magic number 4096 with a constant, and updated the news. I'll merge once CI is happy and I've had a chance to do a bit more testing.

@argilo
Member

argilo commented Sep 27, 2023

> This results in slightly increased audio delay (to ~23ms)

I expect the delay will be higher for narrow-band modes (i.e. everything except WFM), where the output rate of the arbitrary resampler is 96000 sps (4096 samples / 96000 sps ≈ 42.7 ms).
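The per-mode arithmetic can be checked with a short Python sketch. The output multiple and the two resampler output rates are the values quoted in this thread; the names `OUTPUT_MULTIPLE`, `RESAMPLER_OUTPUT_RATES` and `block_duration_ms` are hypothetical. The result is only the duration of one 4096-sample block at each rate, not a measurement of end-to-end audio delay.

```python
OUTPUT_MULTIPLE = 4096  # samples, the value set in this pull request

# Resampler output rates quoted in the thread, in samples per second.
RESAMPLER_OUTPUT_RATES = {"WFM": 240_000, "narrow-band": 96_000}


def block_duration_ms(samples, rate_sps):
    """Duration in milliseconds of `samples` samples at `rate_sps`."""
    return 1000.0 * samples / rate_sps


for mode, rate in RESAMPLER_OUTPUT_RATES.items():
    print(f"{mode}: {block_duration_ms(OUTPUT_MULTIPLE, rate):.1f} ms")
```

At 96000 sps one block lasts about 42.7 ms, and at 240000 sps about 17.1 ms.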

@argilo argilo merged commit c5ddff3 into gqrx-sdr:master Sep 27, 2023
10 checks passed