Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[whisper] Add OpenAI API compatibility #17921

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

dalgwen
Copy link
Contributor

@dalgwen dalgwen commented Dec 17, 2024

Use an external OpenAI compatible API as another option besides the locally embedded whisper.

Whisper is excellent as an embedded service, but it can also be remotely exposed, with services such as OpenAI obviously, but also by some other open source projects wrapping it (backed by whisper.cpp, or faster-whisper), such as faster-whisper-server.

So, this MR adds the possibility to use a remote Whisper by calling an OpenAI compatible API exposing it. I briefly thought about doing a new add-on, but deemed it overkill. It is another way of using Whisper, so I hope it's appropriate to do it here ?

Some refactors I hope @GiviMAD can take a look at :

  • whisper related code (initialization) delayed to the last moment in order to keep API mode / local mode separated as much as possible. So, is it ok to initialize WhisperState inside the new recognizeLocal method, effectively doing it in each iteration of the main while loop ? Or should I keep it as a global parameter, initialized once, outside the while loop ?
  • float32 array conversion delayed to the last moment (short16 is used for remote whisper so I kept it as the main format, as I didn't want to do short16 -> float32 -> short16)
  • inside the newly extracted method recognizeLocal, I use exception and null return as a signal to break the external loop, instead of the break keyword. Branching condition and output should be the same nonetheless.
  • I added language as a configuration parameter, to not depend on the system locale if the user wants so.

Thanks !

@dalgwen dalgwen requested a review from GiviMAD as a code owner December 17, 2024 21:54
@dalgwen dalgwen added enhancement An enhancement or new feature for an existing add-on about to close Issue potentially resolved and no feedback received and removed about to close Issue potentially resolved and no feedback received labels Dec 17, 2024
@GiviMAD
Copy link
Member

GiviMAD commented Dec 17, 2024

Hello @dalgwen, this is great, I was about to look at it because I bought an Nvidia Orin Nano development kit and I want to host whisper there using an API that mimics OpenAI so this also solves my problem.

I'm going to make a quick test tomorrow but I just reviewed the code and everything looks good to me.

Also I'm about to upgrade the whisper-jni library to use the latests whisper.cpp version, but I think I will wait to create a PR until this gets merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement An enhancement or new feature for an existing add-on
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants