Hello,

I am looking for a way to do chunk-based inference instead of streaming inference when working with audio files.

The issue is that each audio file currently gets its own inference run, and therefore its own fresh state (new speaker embeddings), which is unwanted behaviour for my program.

How can I achieve the desired behaviour of running inference on larger chunks of audio (e.g. 20 seconds) while keeping the pipeline state across chunks and files?
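For concreteness, here is a minimal sketch of the pattern I have in mind. `StatefulDiarization` and its `process_chunk` method are hypothetical placeholders, not a real API from this library; the point is only that a single pipeline instance is reused across all chunks and all files, so its state is never reset:

```python
import torchaudio

SAMPLE_RATE = 16_000
CHUNK_SECONDS = 20


def iter_chunks(path, chunk_seconds=CHUNK_SECONDS, sample_rate=SAMPLE_RATE):
    """Yield consecutive fixed-length waveform chunks from one audio file."""
    waveform, sr = torchaudio.load(path)
    if sr != sample_rate:
        waveform = torchaudio.functional.resample(waveform, sr, sample_rate)
    chunk_len = chunk_seconds * sample_rate
    for start in range(0, waveform.size(1), chunk_len):
        yield waveform[:, start:start + chunk_len]


class StatefulDiarization:
    """Hypothetical stand-in for the real pipeline; keeps state across calls."""

    def process_chunk(self, chunk):
        raise NotImplementedError("replace with the actual pipeline call")


# Build the pipeline ONCE so its internal state (e.g. speaker
# embeddings / cluster centroids) persists across calls.
pipeline = StatefulDiarization()

for path in ["call_part1.wav", "call_part2.wav"]:  # example file names
    for chunk in iter_chunks(path):
        # Every chunk goes through the SAME pipeline instance, so speaker
        # identities stay consistent across chunks and across files.
        annotation = pipeline.process_chunk(chunk)
```

The key design point is simply that the pipeline object is constructed once and never recreated between files, so whatever internal state it accumulates carries over from one chunk to the next.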