Utilise MLX framework on Apple Silicon #1598
Comments
FWIW, I played around with the Whisper implementation in the mlx-examples repo (using the Python bindings), and while the MLX version definitely uses my M1 GPU, it was about half as fast on the same models as the current CoreML-enabled whisper.cpp build. But there could be various overhead from how they implemented it in Python.
I did some simple benchmarking with a 10-minute audio file, the large model, an M1 Pro, and MLX. Someone else contributed M2 Ultra and M3 Max numbers. See https://owehrens.com/whisper-nvidia-rtx-4090-vs-m1pro-with-mlx/ for the full post. Your mileage may vary.
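For anyone who wants to compare backends on their own machine, here is a minimal timing sketch. It is not the script used for the post above. It assumes you wrap each backend (MLX, whisper.cpp, PyTorch, and so on) in your own function that takes an audio path; `benchmark`, `transcribe`, and `fake_transcribe` are hypothetical names used only for illustration, and the audio duration has to be supplied by you.

```python
import statistics
import time
from typing import Callable


def benchmark(
    transcribe: Callable[[str], object],  # hypothetical wrapper around one backend
    audio_path: str,
    audio_seconds: float,  # duration of the audio file, supplied by the caller
    runs: int = 3,
    warmup: int = 1,
) -> dict:
    """Time repeated calls to `transcribe` and report wall-clock statistics."""
    # Warm-up runs are excluded from timing so that one-time costs
    # (model loading, kernel compilation, caches) do not skew the result.
    for _ in range(warmup):
        transcribe(audio_path)

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        transcribe(audio_path)
        timings.append(time.perf_counter() - start)

    median = statistics.median(timings)
    return {
        "runs": runs,
        "median_s": median,
        "min_s": min(timings),
        # Values above 1.0 mean faster than real time.
        "speed_vs_realtime": audio_seconds / median if median > 0 else float("inf"),
    }


if __name__ == "__main__":
    # Stand-in for a real backend call; replace with your own wrapper.
    def fake_transcribe(path: str) -> str:
        time.sleep(0.05)
        return "transcript"

    print(benchmark(fake_transcribe, "sample.wav", audio_seconds=600.0))
```

Reporting the median over a few runs after a warm-up makes comparisons less sensitive to first-run overhead, which matters when one backend compiles or converts the model on first use.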
This is interesting. Was the 4090 whisper.cpp built with CUDA (it feels like it wasn't)? Were the whisper.cpp Mac builds built with CoreML? So far the best Apple Silicon performance I've seen has been from the CoreML builds of whisper.cpp; I'd be curious whether others also find that the current MLX example code isn't quite as good as whisper.cpp + CoreML. My overall best performance has been on Ubuntu with the CUDA-enabled build of whisper.cpp (with a 3070), better than Python + PyTorch + CUDA (though I should test that again). Here are times on
So, MLX so far doesn't seem as good as the CoreML-enabled whisper.cpp builds on my lowly M1, but that may not be true of other Apple Silicon chips. EDITED to add regular
Your result might not be accurate. I ran a test on my
I retested with
Apple released this framework:
https://t.co/uA2ZbYC13I
It has a C++ core with Python bindings, and a Python example of a Whisper port. Unfortunately, there is no C++ example.
Would be great if it could be integrated here.