This repository has been archived by the owner on Aug 30, 2024. It is now read-only.

Documentation for whisper inference #104

Closed
bil-ash opened this issue Jan 31, 2024 · 2 comments
bil-ash commented Jan 31, 2024

I tried running inference (transformers-like usage, since llama.cpp-style usage is apparently not available for Whisper) and installed intel_extension_for_transformers, but it now fails on:

import neural_speed.whisper_cpp as cpp_model
ModuleNotFoundError: No module named 'neural_speed.whisper_cpp'

I installed neural-speed as described in the docs, i.e.,

pip install -r requirements.txt
pip install .

and successfully ran phi-1.5 inference in the llama.cpp style.
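As a quick diagnostic (not from the thread, just a general Python technique), you can check whether the compiled whisper_cpp extension actually made it into the installed package, without triggering the ImportError at import time:

```python
import importlib.util

def has_module(name: str) -> bool:
    """Return True if `name` resolves to an importable module, without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # The parent package itself is not installed.
        return False

# In the failing environment above, this check would come back False:
# has_module("neural_speed.whisper_cpp")
```

A False result here means the extension was simply not built into the wheel, which matches the error above.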

Please explain how to run Whisper inference and, as with other models, add 3-bit inference support for Whisper.

intellinjun (Contributor) commented

You can use this PR and install neural_speed again. We don't currently support 3-bit inference; it is still in development.
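For readers landing here: pulling an unmerged PR into a local checkout generally follows GitHub's `pull/<N>/head` refspec. A sketch only; the comment above refers to a specific PR whose number is not preserved here, so it stays a placeholder:

```shell
# Sketch: fetch the PR's head into a local branch, then reinstall neural_speed.
# <PR_NUMBER> is a placeholder for the PR referenced in the comment above.
git fetch origin pull/<PR_NUMBER>/head:whisper-support
git checkout whisper-support
pip install -r requirements.txt
pip install .
```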


bil-ash commented Feb 4, 2024

Thanks, that example worked, so I'm closing the issue.
However, it starts using only one CPU core after the first few seconds of inference (both with and without the OMP_NUM_THREADS environment variable). Just letting you know; it's not a big problem for me.
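For completeness, OMP_NUM_THREADS is the standard OpenMP control for the worker-thread count and is set in the environment before launching the process. A minimal illustration; the actual inference command is omitted since the example script comes from the PR above:

```shell
# Standard OpenMP environment variable: caps the worker-thread pool size.
export OMP_NUM_THREADS=8
# The inference command would follow here and inherit the setting, e.g.:
#   python <whisper_example_script> --audio sample.wav   (script name per the PR)
echo "OMP_NUM_THREADS=${OMP_NUM_THREADS}"
```

Note that a runtime can still fall back to one core mid-inference regardless of this setting, as reported above, if a particular phase of the pipeline is single-threaded.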

bil-ash closed this as completed Feb 4, 2024