
Configurable AudioWorklet process block size (higher than 128 samples)? #1503

Closed

josh83abc opened this issue Feb 23, 2018 · 16 comments

@josh83abc

josh83abc commented Feb 23, 2018

Hello everybody!

As far as I can tell, the AudioWorkletProcessor process block size is 128 samples, like any other AudioNode.

I haven't really tested the robustness of the audio stream, but this value seems pretty low to me compared with what you see in desktop music software (the process block size is usually 512 samples or more, and can easily be 1024 or 2048 samples).

I don't know if it is related, but I can hear frequent tiny audio glitches with the AudioWorklet sine demo when I switch from tab to tab: https://googlechromelabs.github.io/web-audio-samples/audio-worklet/basic/hello-audio-worklet.html

Also, in my app audio latency is not the most important aspect, so I would prefer a higher latency that allows a more robust audio stream.

Would it be possible to change the AudioWorkletProcessor process block size in the future? Is there a workaround?

PS: this issue also discusses it: #1466

@hoch
Member

hoch commented Feb 23, 2018

Internal buffering (e.g. a FIFO) in the processor might be the way to resolve the buffer size difference. Even if we changed the spec to accommodate a variable buffer size, the implementation would do the buffering internally anyway, because the other parts of WebAudio use 128 frames.

Having the same render quantum size is key to lower latency, and it was one of the design goals of AudioWorklet. I doubt the WG will change something this fundamental now, but it can be revisited later for V2.
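For illustration, here is a minimal sketch of that FIFO idea (illustrative only, not spec or implementation code; mono, and `blockSize`/`processBlock` are placeholder names). It buffers 128-frame render quanta into a larger block, processes the block in one go, and plays the result back out one quantum at a time, at the cost of `blockSize - 128` frames of latency:

```js
class RebufferingProcessor extends AudioWorkletProcessor {
  constructor() {
    super();
    this.blockSize = 512; // the size the DSP wants; must be a multiple of 128
    this.inBuf = new Float32Array(this.blockSize);
    this.outBuf = new Float32Array(this.blockSize); // starts out as silence
    this.filled = 0;  // frames accumulated in inBuf
    this.readPos = 0; // next frame of outBuf to play
  }

  // Placeholder for the expensive computation that needs blockSize frames.
  processBlock(block) {
    return block;
  }

  process(inputs, outputs) {
    const input = inputs[0][0];
    const output = outputs[0][0];
    if (!input || !output) return true;

    // Accumulate this 128-frame render quantum.
    this.inBuf.set(input, this.filled);
    this.filled += input.length;

    // Once a full block is buffered, process it in one go.
    if (this.filled === this.blockSize) {
      this.outBuf.set(this.processBlock(this.inBuf));
      this.filled = 0;
      this.readPos = 0;
    }

    // Play previously processed audio, blockSize - 128 frames behind the input.
    output.set(this.outBuf.subarray(this.readPos, this.readPos + output.length));
    this.readPos += output.length;
    return true;
  }
}
registerProcessor('rebuffering-processor', RebufferingProcessor);
```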

@hoch
Member

hoch commented Feb 23, 2018

Also the glitch is an implementation issue. If you have a repro case, please file a bug and cc me (hongchan@).

@sletz

sletz commented Feb 23, 2018

@josh83abc
Author

Thanks @sletz for this link!! Lots of very interesting stuff in it. Avoiding audio glitches is the main priority of my music player, so I'm very interested in understanding all the details.

@hoch, thanks a lot for all these technical details. I'm aware that the whole WebAudio graph is processed with a render quantum of 128 samples, and so is the AudioWorkletProcessor. For now I haven't really tested the audio robustness of the WebAudio graph rendering; it is pretty good for sure, but I will try to reach its limits. I will also test on iOS and Android. Tell me if I'm wrong, but I don't think a FIFO on top of the AudioWorkletProcessor will improve the robustness that much.

For the V.next, it would be interesting if the WebAudio render quantum size could be adjusted by the programmer when very low latency isn't needed. For instance, a render quantum of 1024 samples (~23 ms at 44.1 kHz) would be more than enough for my application. Actually, Flash uses this value to process audio; the latency is OK and there are no audio glitches at all, even if Chrome crashes.

@padenot
Member

padenot commented Feb 26, 2018

The internal block size of the Web Audio rendering graph provides a lower bound for the OS buffer size, not an upper bound (and even then, it's not strictly true).

The Web Audio API already has a mechanism to request a higher latency on an AudioContext, using AudioContextLatencyCategory.
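For reference, the hint is passed at construction time, either as a category or as a target latency in seconds, and the AudioContext reports what the UA actually picked:

```js
// Ask for a larger, glitch-resistant output buffer rather than low latency.
const ctx = new AudioContext({ latencyHint: 'playback' }); // or e.g. { latencyHint: 0.1 }
console.log(ctx.baseLatency, ctx.outputLatency); // what the UA actually chose
```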

Please open issues about implementations on the UA's bug tracker.

@josh83abc
Author

josh83abc commented Feb 26, 2018

Thanks a lot @padenot for making me aware of the latencyHint option of the AudioContext!! I totally missed it when I read the spec, but this is exactly what I was talking about: having an option to adjust the audio latency according to the needs of the app.
Amazing that it already exists!! WebAudio is dope :)

@rtoy
Member

rtoy commented Feb 26, 2018

Based on #1503 (comment), I think we can close out this issue. I don't think there's anything that needs to be done for the spec.

@padenot padenot closed this as completed Feb 27, 2018
@fr0m

fr0m commented Mar 23, 2018

With the old ScriptProcessorNode, the buffer size controlled how frequently the audioprocess event was dispatched; latencyHint doesn't affect that.

Since the AudioWorkletProcessor process block size is 128, process is triggered far more often than the audioprocess event of a ScriptProcessorNode with a buffer size of 1024 or higher.

This can cause high CPU usage when an expensive operation runs inside process, which is the situation in my project. So is there another way to control how often process is triggered?

@padenot
Member

padenot commented Mar 23, 2018

This can cause high CPU usage when an expensive operation runs inside process, which is the situation in my project. So is there another way to control how often process is triggered?

No. If it's too expensive when computing 128 frames, why would it be OK when computing 1024 frames? You have exactly the same amount of time per frame to compute the audio.

You probably simply need to optimize your code.

@sletz

sletz commented Mar 23, 2018

Because running the complete audio chain adds a fixed processing cost per buffer. When smaller buffers are used, that fixed cost is paid more often, so it ends up consuming more of the available CPU.
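For illustration: at 48 kHz, 128-frame callbacks fire 48000 / 128 = 375 times per second, while 1024-frame callbacks fire only about 47 times per second. If each callback carries a fixed overhead c (thread wakeup, graph traversal, call dispatch) on top of the per-frame work w, the total cost per second is callbacks × c + 48000 × w: the 128-frame configuration pays the fixed overhead eight times as often for the same amount of per-frame work.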

@fr0m

fr0m commented Mar 26, 2018

Thanks for the reply.

We use AudioWorklet for live audio streaming: encoded audio buffers are sent via WebSocket from the AudioWorkletProcessor. It's expensive because of the frequency of calls, not the per-frame work.

I can cache the buffer and send it out at a particular buffer size (see the sketch below), but it would be much better if the unnecessary process calls were not triggered at all. Will you take this situation into account?
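For the caching approach, here is a minimal sketch of batching via the processor's MessagePort (illustrative; class and parameter names are made up, and the receiving side, which would own the WebSocket, is not shown):

```js
class BatchingSenderProcessor extends AudioWorkletProcessor {
  constructor() {
    super();
    this.batchFrames = 1024; // 8 render quanta: one message instead of eight
    this.batch = new Float32Array(this.batchFrames);
    this.filled = 0;
  }

  process(inputs) {
    const input = inputs[0][0];
    if (!input) return true;
    this.batch.set(input, this.filled);
    this.filled += input.length;
    if (this.filled === this.batchFrames) {
      // Transfer the underlying buffer to avoid a copy, then start a new batch.
      // (A pre-allocated pool would avoid allocating on the audio thread.)
      this.port.postMessage(this.batch, [this.batch.buffer]);
      this.batch = new Float32Array(this.batchFrames);
      this.filled = 0;
    }
    return true;
  }
}
registerProcessor('batching-sender', BatchingSenderProcessor);
```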

@positonic

@sletz that benchmark link for glitches is dead; has it moved somewhere?

@sletz

sletz commented Mar 2, 2019

Which link?

@thedracle

thedracle commented Dec 14, 2022

Was this closed with a resolution of some kind?

I have a situation where I have a trained model that works on buffers that are at a minimum 512 frames, or multiples of it.

The processing time isn't the issue, but the network was trained/optimized for this frame size, and there doesn't seem to be an easy way to reconcile that with the requirement that all processing be done in 128-sample chunks.

Is it acceptable to set the latencyHint to "playback" and to block the callback until I've collected 512 frames, processed them, and then feed my collected frames out one at a time?

@padenot
Member

padenot commented Dec 15, 2022

Was this closed with a resolution of some kind?

This was in fact about latency, and it was closed because of this: #1503 (comment). We didn't update the issue's title; maybe we should have.

I have a situation where I have a trained model that works on buffers that are at a minimum 512 frames, or multiples of it.

The processing time isn't the issue, but the network was trained/optimized for this frame size, and there doesn't seem to be an easy way to reconcile that with the requirement that all processing be done in 128-sample chunks.

Is it acceptable to set the latencyHint to "playback" and to block the callback until I've collected 512 frames, processed them, and then feed my collected frames out one at a time?

There are three things you can do. Two you can do now, and one you'll be able to do later next year. It's not a huge effort to do the first two, but the best solution depends on the situation:

  • First, you can buffer internally: accumulate 384 frames of input over the first 3 callbacks, and then on the 4th callback, when you have 512 frames of input, run your model. This adds 3 blocks of latency; maybe that is acceptable for your use case. It works well if your model runs in less than 128 / sampleRate seconds. You can measure this easily; https://blog.paul.cx/post/profiling-firefox-real-time-media-workloads/ has instructions.
  • Otherwise, if your model takes more than 128 / sampleRate seconds to execute, you can offload the computation to a Web Worker, sending audio from the worklet to the Web Worker through a ring buffer (a sketch follows at the end of this comment). This is explained in https://blog.paul.cx/post/a-wait-free-spsc-ringbuffer-for-the-web/; the repo has two examples: efficiently sending audio to a Web Worker, and back to an AudioWorkletProcessor. If you don't need to play the audio out, this is the solution I'd recommend (you can skip sending the audio back to the AudioWorkletProcessor in this case).
  • Finally, sometime next year, we'll merge and implement a feature to change the block size of an AudioContext, in which case you'll be able to specify 512 and get buffers of 512 frames in your AudioWorkletProcessor. The specification part is done, but it's a big change in the implementations, so we've delayed merging the specification text for clarity.

In any case, you can never "block the callback"; that will cause problems, such as demoting the real-time audio thread from real-time priority to regular priority, causing all sorts of glitches. But if you're not playing the audio out anyway, you can set the latencyHint to "playback", and it might save some power (depending on the OS and implementation).
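For the second option, here is a minimal sketch of the wait-free SPSC handoff (illustrative; it assumes SharedArrayBuffer is available, i.e. the page is cross-origin isolated, it is far simpler than the production-quality ringbuf.js behind the post above, and `runModel` is a hypothetical consumer):

```js
// Shared by the AudioWorkletProcessor (producer) and a Web Worker (consumer).
class FloatRingBuffer {
  constructor(sab, capacity) {
    this.capacity = capacity;
    this.indices = new Int32Array(sab, 0, 2);          // [read, write]
    this.samples = new Float32Array(sab, 8, capacity); // sample storage
  }
  static bytesNeeded(capacity) {
    return 8 + capacity * Float32Array.BYTES_PER_ELEMENT;
  }
  availableToRead() {
    const r = Atomics.load(this.indices, 0);
    const w = Atomics.load(this.indices, 1);
    return (w - r + this.capacity) % this.capacity;
  }
  // Producer side (real-time thread): copy a block in, never wait.
  push(block) {
    if (this.capacity - 1 - this.availableToRead() < block.length) {
      return false; // full: drop rather than block the audio thread
    }
    let w = Atomics.load(this.indices, 1);
    for (let i = 0; i < block.length; i++) {
      this.samples[w] = block[i];
      w = (w + 1) % this.capacity;
    }
    Atomics.store(this.indices, 1, w); // publish after the data is written
    return true;
  }
  // Consumer side (Web Worker): drain up to out.length samples.
  pop(out) {
    const n = Math.min(this.availableToRead(), out.length);
    let r = Atomics.load(this.indices, 0);
    for (let i = 0; i < n; i++) {
      out[i] = this.samples[r];
      r = (r + 1) % this.capacity;
    }
    Atomics.store(this.indices, 0, r);
    return n;
  }
}

// In the worklet (this.ring built from a SharedArrayBuffer posted to it):
//   process(inputs) { const ch = inputs[0][0]; if (ch) this.ring.push(ch); return true; }
// In the worker, over the same SharedArrayBuffer, poll and run the model:
//   const scratch = new Float32Array(512);
//   setInterval(() => {
//     while (ring.availableToRead() >= 512) { ring.pop(scratch); runModel(scratch); }
//   }, 5);
```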

@thedracle

@padenot Awesome, thanks for the very detailed response! It's extremely helpful.

I'm pleased to see the planned resolution. The added latency is unfortunately unavoidable given the constraints of the ML model we are using, and we are willing to accept it.

I will try the first suggestion for now, and I look forward to the new feature for changing the block size.
