Recognizing technical terms #38

cfasana · 2024-10-29T13:14:44Z

First of all, great work!
Considering that technical jargon may be used in real scenarios (especially when working in industrial settings), is there a way to improve recognition of these terms without fine-tuning?

As an example, OpenAI Whisper lets you supply a prompt (openai/whisper#963 (comment)) that is passed to the decoder before the actual audio. Inference time increases a bit, but recognition of the prompted terms improves noticeably.
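For reference, a minimal sketch of that Whisper prompting approach, assuming the open-source `openai-whisper` Python package and its `initial_prompt` argument to `transcribe`. The helper names, the `"base"` model choice, and the 400-character budget are illustrative assumptions, not anything this project provides; Whisper bounds the prompt in tokens, so the character limit is only a rough stand-in.

```python
from typing import Iterable


def build_glossary_prompt(terms: Iterable[str], max_chars: int = 400) -> str:
    """Join domain terms into one prompt string.

    Duplicates (case-insensitive) and empty entries are skipped, and terms
    are added in order until the assumed character budget would be exceeded.
    """
    seen = set()
    kept = []
    length = 0
    for term in terms:
        term = term.strip()
        key = term.lower()
        if not term or key in seen:
            continue
        extra = len(term) + (2 if kept else 0)  # account for the ", " separator
        if length + extra > max_chars:
            break
        seen.add(key)
        kept.append(term)
        length += extra
    return ", ".join(kept)


def transcribe_with_glossary(audio_path: str, terms: Iterable[str]) -> str:
    """Transcribe one file with the glossary passed as Whisper's initial prompt."""
    # Imported lazily so build_glossary_prompt works without Whisper installed.
    import whisper

    model = whisper.load_model("base")  # model size is an arbitrary choice here
    result = model.transcribe(
        audio_path, initial_prompt=build_glossary_prompt(terms)
    )
    return result["text"]
```

Putting the most important terms first matters in this sketch, since anything past the budget is dropped.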

Thanks

keveman · 2024-10-29T18:30:18Z

We have intentionally kept the model simple: no timestamps, no other special tokens, no special prompts, etc. However, industrial use cases are one of our targets, and recognizing technical terms, domain-specific abbreviations, etc. is important in those settings. Prompting the decoder is a generic solution that we plan on pursuing. We will post updates here when we have some.

cfasana · 2024-10-30T07:25:51Z

That's great to hear!
Indeed, the industrial setting is a very interesting field, both for technical terms and for noise.

I will wait for updates. If possible, I would also like to hear more about your plans so that we can exchange ideas.

curiositry · 2024-11-07T00:33:18Z

@keveman My main use case involves recognizing a small, pre-defined set of words and phrases, with a low false positive rate, so I'm glad to hear this is something you're already thinking about.

ocavue · 2024-12-25T12:32:22Z

When transcribing long audio files, we need to split them into chunks under 30 seconds each. The prompt feature helps smooth out these transitions by using the previous chunk's transcription as context. This makes the output more natural, especially at the boundaries between chunks.
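The chunk-to-chunk context passing described above can be sketched roughly as follows. This is only an illustration: `transcribe` is a hypothetical callable taking a chunk and a prompt string (not an API of any particular model), and the 200-character context window is an arbitrary assumption.

```python
from typing import Callable, List, Sequence


def transcribe_chunks(
    chunks: Sequence[object],
    transcribe: Callable[[object, str], str],
    context_chars: int = 200,
) -> str:
    """Transcribe chunks in order, prompting each with the tail of prior text."""
    texts: List[str] = []
    context = ""  # the first chunk has no previous text
    for chunk in chunks:
        text = transcribe(chunk, context).strip()
        texts.append(text)
        # Keep only the most recent characters so the prompt stays bounded.
        context = (context + " " + text).strip()[-context_chars:]
    return " ".join(t for t in texts if t)
```

Trimming by characters can cut a word in half at the start of the context; a real implementation would more likely trim on token or word boundaries.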

Here is an example of transcribing a long audio file without previous text - you'll notice the text doesn't flow smoothly between chunks.
