I tried running inference (transformer-like usage, because llama.cpp-style usage apparently isn't available for Whisper) and installed intel_extension_for_transformers, but it now fails on
import neural_speed.whisper_cpp as cpp_model
ModuleNotFoundError: No module named 'neural_speed.whisper_cpp'
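As a quick diagnostic (a sketch using only the standard library), you can check whether the whisper_cpp submodule was actually included in the installed package:

```python
# Returns a ModuleSpec if neural_speed.whisper_cpp exists in the installed
# package, or None if the build/wheel did not include it.
import importlib.util

spec = importlib.util.find_spec("neural_speed.whisper_cpp")
print(spec)
```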
I installed neural-speed the way the docs describe, i.e.,
pip install -r requirements.txt
pip install .
and successfully ran phi-1.5 inference the llama.cpp way.
Please advise how to run Whisper inference, and, as with the other models, please add 3-bit inference support for Whisper.
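For comparison, here is a minimal sketch of transformer-style Whisper inference using the plain Hugging Face transformers API (not Neural Speed's accelerated path, which this issue is about); whether Neural Speed hooks into this same entry point is an assumption to check against its docs, and `sample.wav` is a placeholder path:

```python
# Baseline Whisper inference with the standard Hugging Face transformers API.
import torch
from transformers import WhisperForConditionalGeneration, WhisperProcessor
import librosa  # assumption: librosa is available for audio loading

processor = WhisperProcessor.from_pretrained("openai/whisper-tiny")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny")

# Whisper expects 16 kHz mono audio; "sample.wav" is a placeholder.
audio, _ = librosa.load("sample.wav", sr=16000)
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    predicted_ids = model.generate(inputs.input_features)
print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])
```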
Thanks, that example worked, so I'm closing the issue.
One observation: inference drops to a single CPU core after the first few seconds, both with and without the OMP_NUM_THREADS environment variable. Just flagging it; it's not a big problem for me.
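For reference, a common way to pin the OpenMP thread count is to export the variable before the interpreter starts, since some runtimes read it only at load time; `run_whisper.py` and the thread count of 8 are hypothetical here:

```bash
# Set the OpenMP thread count in the environment before launching Python,
# so the native runtime picks it up when it is first loaded.
OMP_NUM_THREADS=8 python run_whisper.py
```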