[Bug] Transcription gets stuck at X% with console error #66
Can you try with a smaller model? "base" is the largest among "tiny", "small", and "base", so you may be running out of memory.
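For reference, selecting a smaller checkpoint with the Transformers.js pipeline is a one-line change. A minimal sketch (the import path and model identifier are assumptions and vary by version and setup):

```js
// Minimal sketch: import path and model id are assumptions; adjust to your setup.
import { pipeline } from "@xenova/transformers";

// "tiny" is the smallest Whisper checkpoint, so it needs the least memory.
const transcriber = await pipeline("automatic-speech-recognition", "whisper-tiny");

// Accepts a URL (or a Float32Array of 16 kHz mono samples).
const output = await transcriber("audio.wav");
console.log(output.text);
```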
Tried tiny as well. For English last week it worked well on a 7-minute video, but a 14-minute Hindi video gets stuck. It's a MacBook Air, but I also tried on Windows 10 (32 GB RAM, i7, 256 GB SSD). Is that too low-spec for this purpose?
Admittedly, I haven't tested on very long non-English videos/audios. If possible, could you share the audio file? If not, could you find a YouTube video I could test with? Thanks!
This is the video link, and this is the app I am trying.
Thanks! I'll run some tests 👍 It's also worth mentioning that the creator of ScreenRun raised a related issue the other day: #54. So it might already be fixed in the latest version (i.e., if you can update, that might fix it). If it is running in the browser, you can also try refreshing the cache (since the models were also updated recently).
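If refreshing through the browser UI doesn't help, the cached model files can also be cleared from devtools. A hedged sketch using the standard Cache Storage API (the cache name the library uses is an assumption, so the keys are listed before anything is deleted):

```js
// Sketch only: list the Cache Storage entries and delete any that look like
// the Transformers.js model cache, forcing the models to be re-downloaded.
const names = await caches.keys();
console.log(names); // inspect what is actually there before deleting
for (const name of names) {
  if (name.toLowerCase().includes("transformers")) {
    await caches.delete(name);
  }
}
```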
This should be fixed in the latest release (https://www.npmjs.com/package/@xenova/transformers). I will close for now, and if you have the issue again, feel free to reopen or open a new issue.
I am running into the same issue. whisper_bug.html:

```html
<body></body>
<script>
  function importFile(content) {
    return "data:text/javascript;base64," + btoa(content);
  }
  const imports = {
    "transformers": "./src/transformers.js",
    "fs": importFile("export default {};"),
    "url": importFile("export default {};"),
    "path": importFile("export default {};"),
    "stream/web": importFile("export default {};"),
    "sharp": importFile("export default {};"),
    "onnxruntime-node": importFile("export default {};"),
    "onnxruntime-web": importFile(`
      await import("https://cdnjs.cloudflare.com/ajax/libs/onnxruntime-web/1.14.0/ort.es6.min.js");
      let ONNX = globalThis.ort;
      export default ONNX;
      export {
        ONNX
      };
    `),
  };
  const importmap = document.createElement("script");
  importmap.type = "importmap";
  importmap.textContent = JSON.stringify({imports});
  document.body.appendChild(importmap);
</script>
<script type="module">
  import * as transformers from "./src/transformers.js";
  Object.assign(window, {...transformers});
</script>
<!--
<audio id="SPEECH2TEXT_AUDIO" src="./examples/demo-site/assets/audio/jfk.wav" controls="true"></audio>
<audio id="SPEECH2TEXT_AUDIO" src="./examples/demo-site/assets/audio/minner_fra_krigen_short.wav" controls="true"></audio>
-->
<audio id="SPEECH2TEXT_AUDIO" src="./examples/demo-site/assets/audio/minner_fra_krigen.wav" controls="true"></audio>
```

Run this in f12/devtools:
```text
Error: ort.es6.min.js:6 D:/a/_work/1/s/onnxruntime/core/providers/cpu/tensor/reshape_helper.h:41 onnxruntime::ReshapeHelper::ReshapeHelper(const TensorShape &, TensorShapeVector &, bool) gsl::narrow_cast<int64_t>(input_shape.Size()) == size was false. The input tensor cannot be reshaped to the requested shape. Input shape:{6,0,64}, requested shape:{1,6,64,64}
An error occurred during model execution: "Error: failed to call OrtRun(). error code = 6.".
sessionRun @ models.js:123
await in sessionRun (async)
seq2seq_forward @ models.js:341
forward @ models.js:1837
seq2seqRunBeam @ models.js:411
runBeam @ models.js:1817
generate @ models.js:835
await in generate (async)
generate @ models.js:1769
_call @ pipelines.js:818
await in _call (async)
closure @ core.js:62
(anonym) @ VM74:8
```

I haven't tried ONNX Runtime v1.14.1 yet, so I don't know if this is a bug here or there, but I will experiment more. At first I thought the chunk size was wrong, so I made it multiples of 16000 and 30 * 16000, etc., but it continued to fail anyway.
Could you post the audio file you're testing with? That error message is usually associated with OOM, but I don't see why it would happen in this case. Can you try using whisper-tiny.en?
Yep, I will test a bit more now 🧪 🔬 File attached (zipped up because GitHub doesn't like raw .wav attachments).
That failed as well, so currently I'm trying to build ONNX Runtime with Emscripten to get debug symbols, and maybe I can figure out a minimal reproducible example.
I converted yesterday's HEAD of ONNX Runtime to ES6 and tried to debug this, but I'm running into much the same problem in Chrome. Because I'm also experimenting with WebGPU support and Chrome on Linux doesn't support WebGPU on my system yet, I had to install Firefox Nightly; Firefox Nightly on Linux via WebGPU also doesn't work. The *.wasm binaries I used come from here: microsoft/onnxruntime#15796 (comment). It keeps working through the audio chunks for a while, just to crash around 2 minutes later. So multiple backends crash on the same task. 🤔
This is already closed, even though the bug still exists in HEAD. However, this PR fixes it for me (very happy about that 😅). Thank you very much for all the work 🥇 👍
🔥 Nice! I've done a lot more testing, and it seems to happen mostly (or only) on M1/M2 Macs in Safari. What was your testing environment? Once I understand the problem a bit more, I'll open a bug report on the onnxruntime repo.
All testing was on Linux, and I tried the latest Chrome and Firefox Nightly. With the PR, both work fine; without the PR, both crash with the same error. If the chipset matters: AMD Ryzen 5 3600 6-Core Processor.
Describe the bug
Transcription of a Hindi audio gets stuck at X %, with a console error.

How to reproduce
Transcribe Hindi audio; the transcription gets stuck at X %.

Expected behavior
It should generate subtitles.

Logs/screenshots
Screenshot attached; the app is screenrun.app.