Partial results on the gpu batch recognizer #1539

starfurylab · 2024-03-14T17:19:20Z

Hi! Great project, especially excited about the gpu support.
But i have a question, is it possible to use something like PartialResult() when working on gpu (rtx2080ti, cuda12.3), as it is done in websocket/asr_server.py?
For example, in a real-time audio stream analysis scenario, which is perfectly handled by asr server running on the cpu, but would like more performance than a cpu can provide.
Best Regards.

nshmyrev · 2024-03-15T09:53:06Z

Hello. It is possible but not implemented

starfurylab · 2024-03-15T12:11:43Z

Thanks! it's good to know that it's possible in principle. Can you give me a hint? Will such implementation affect only vosk-api or kaldi too? And maybe give me a direction to look in? I want to try to implement this feature.

nshmyrev · 2024-03-16T21:55:23Z

Sure, see here:

https://github.com/kaldi-asr/kaldi/blob/master/src/cudadecoder/batched-threaded-nnet3-cuda-online-pipeline.h#L172

https://github.com/alphacep/vosk-api/blob/master/src/batch_recognizer.cc#L120

starfurylab · 2024-03-18T15:26:00Z

Thanks, I'll let you know when I get something

starfurylab · 2024-04-08T15:43:39Z

Hi. Sorry for the delay. I have created a pull-request: #1554

I added partial results, but I don't know how to link it to other languages, so only in c, and added an example

On tests I got a limit of about 510-530 realtime streams from several test files on the rtx2080ti at about 15-20% of the i7-8700

Problems I noticed: it crashes when removing the model when removing the cuda pipeline instance, but I didn't look deeply into kaldi

vosk-api/src/batch_model.cc

Line 128 in 40937b6

delete cuda_pipeline_;

ASSERTION_FAILED ([5.5.1094~1-2b69ae]:~BatchedThreadedNnet3CudaOnlinePipeline():batched-threaded-nnet3-cuda-online-pipeline.cc:60) Assertion failed: (available_channels_.empty() || available_channels_.size() == num_channels_)

[ Stack-Trace: ]
../src/libvosk.so(kaldi::MessageLogger::LogMessage() const+0x7f6) [0x7bf5e0db5076]
../src/libvosk.so(kaldi::KaldiAssertFailure_(char const*, char const*, int, char const*)+0x75) [0x7bf5e0db5ae5]
../src/libvosk.so(kaldi::cuda_decoder::BatchedThreadedNnet3CudaOnlinePipeline::~BatchedThreadedNnet3CudaOnlinePipeline()+0xb1e) [0x7bf5e0951f3e]
../src/libvosk.so(BatchModel::~BatchModel()+0x1d3) [0x7bf5e094af63]
../src/libvosk.so(vosk_batch_model_free+0x12) [0x7bf5e0917842]
./test_vosk_gpu_batch(+0x150b) [0x65475b2dd50b]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7bf5e0229d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7bf5e0229e40]
./test_vosk_gpu_batch(+0x12a5) [0x65475b2dd2a5]

Aborted (core dumped)

nshmyrev transferred this issue from alphacep/vosk-server Mar 16, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Partial results on the gpu batch recognizer #1539

Partial results on the gpu batch recognizer #1539

starfurylab commented Mar 14, 2024

nshmyrev commented Mar 15, 2024

starfurylab commented Mar 15, 2024

nshmyrev commented Mar 16, 2024

starfurylab commented Mar 18, 2024

starfurylab commented Apr 8, 2024 •

edited

Loading

Partial results on the gpu batch recognizer #1539

Partial results on the gpu batch recognizer #1539

Comments

starfurylab commented Mar 14, 2024

nshmyrev commented Mar 15, 2024

starfurylab commented Mar 15, 2024

nshmyrev commented Mar 16, 2024

starfurylab commented Mar 18, 2024

starfurylab commented Apr 8, 2024 • edited Loading

starfurylab commented Apr 8, 2024 •

edited

Loading