Can whisper-tiny speech-to-text translate to English as well as transcribe foreign language? #87
Comments
Hi 👋 I haven't added that yet, but it should be quite simple. Seems like all I need to do is set a prefix token. Something like:
It doesn't look like their pipeline function supports this (unless I'm wrong). Perhaps there's a way to pass tokenizer keyword arguments to the pipeline. I'll check it out. 👍
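For reference, the prefix-token idea can be sketched roughly as follows. Multilingual Whisper checkpoints are steered by a short run of special tokens at the start of decoding: a start token, a language token, a task token, and optionally a no-timestamps token. `buildWhisperPrefix` is a hypothetical helper written only for illustration; it is not part of transformers.js, and the library may assemble these tokens differently internally.

```javascript
// Sketch only: build the special-token prefix for a multilingual Whisper model.
// `buildWhisperPrefix` is a hypothetical name, not a transformers.js API.
function buildWhisperPrefix({ language = "en", task = "transcribe", timestamps = false } = {}) {
  // Accept the task in any letter case ("Translate", "TRANSLATE", ...).
  const normalizedTask = task.toLowerCase();
  if (normalizedTask !== "transcribe" && normalizedTask !== "translate") {
    throw new Error(`Unknown task: ${task}`);
  }

  // Order matters: start token, then language, then task.
  const tokens = ["<|startoftranscript|>", `<|${language}|>`, `<|${normalizedTask}|>`];

  // Without timestamps, the model is told not to emit timestamp tokens.
  if (!timestamps) {
    tokens.push("<|notimestamps|>");
  }
  return tokens;
}

// French audio, English output, no timestamps:
console.log(buildWhisperPrefix({ language: "fr", task: "translate" }).join(""));
// prints "<|startoftranscript|><|fr|><|translate|><|notimestamps|>"
```

Swapping `<|translate|>` for `<|transcribe|>` is the only difference between asking for English output and asking for text in the spoken language, which is why this looked like a small change.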
It already works, but there is one issue left for me right now: #107. Then you can use this code:

    import { pipeline, env } from "@xenova/transformers";

    // Serve the model files from a local server.
    env.localModelPath = "http://127.0.0.1/transformer/models/";
    const pipe = await pipeline("automatic-speech-recognition", "whisper-tiny");

    // Whisper expects 16 kHz audio, so decode at that sample rate.
    const audioCTX = new AudioContext({
        sampleRate: 16000
    });

    // SPEECH2TEXT_AUDIO is assumed to be an <audio> element on the page.
    const arrayBuffer = await (await fetch(SPEECH2TEXT_AUDIO.currentSrc)).arrayBuffer();
    const decoded = await audioCTX.decodeAudioData(arrayBuffer);
    const audio = decoded.getChannelData(0); // first channel only (mono)

    const result = await pipe(audio, {
        return_timestamps: true,
        //chunk_length_s: 30,
        // Called after each chunk; decodes the tokens produced so far.
        chunk_callback: (obj) => {
            const decodedTokens = pipe.tokenizer.decode(obj.tokens);
            console.log("progress tokens:", decodedTokens);
        }
    });
    console.log("result", result);

In the output, the interesting part is basically just the [1] index.

EDIT: I realized this issue is actually about translating, lol.
Opened PR for this 👍 Expect it to be merged soon.
…95) (#133)
* Align `.generate()` return type with python library
* Add multilingual transcription + translation for whisper models (#87, #95)
* Include `return_timestamps` in calculation of `forced_decoder_ids`
* Only return non-null `forced_decoder_ids`
* Allow user to specify task in any case
* Only set `forced_decoder_ids` when non-empty
* Implement `SuppressTokensAtBeginLogitsProcessor`
This was added in v2.2.0 🎉 Check the release notes (https://github.com/xenova/transformers.js/releases/tag/2.2.0) for example code.
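As a rough sketch of how the original request (a transcription plus an English translation of the same audio) might look with this feature: the helper below runs the pipeline twice, once per task. `transcribeAndTranslate` is a hypothetical helper, and the `language` and `task` option names and the `text` field on the result are assumptions based on the release notes, so please check them against the version you have installed.

```javascript
// Sketch only: get both the original-language text and an English translation.
// `transcriber` is expected to be an automatic-speech-recognition pipeline, e.g.
//   const transcriber = await pipeline("automatic-speech-recognition", "whisper-tiny");
// The `language` / `task` options and the `.text` field are assumptions (see above).
async function transcribeAndTranslate(transcriber, audio, language) {
  // First pass: text in the spoken language.
  const original = await transcriber(audio, { language, task: "transcribe" });

  // Second pass over the same audio: English translation.
  const english = await transcriber(audio, { language, task: "translate" });

  return { original: original.text, english: english.text };
}
```

Running the model twice doubles the inference time. Use a multilingual checkpoint such as whisper-tiny here; the English-only whisper-tiny.en variant will not work for this.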
I know there is a separate translation engine (t5-small), but I'm wondering whether speech-to-text with whisper-tiny (not whisper-tiny.en) can return an English translation alongside the foreign-language transcription. I read that Whisper.ai can do this. It seems like it would just be a parameter, but I don't know where to look.