This repository has been archived by the owner on Aug 30, 2024. It is now read-only.

Documentation for whisper inference #104

Closed
bil-ash opened this issue Jan 31, 2024 · 2 comments
bil-ash commented Jan 31, 2024

I tried running inference (transformers-like usage, since llama.cpp-style usage is apparently not available for Whisper) and installed intel_extension_for_transformers, but it now fails on:

import neural_speed.whisper_cpp as cpp_model
ModuleNotFoundError: No module named 'neural_speed.whisper_cpp'

I installed neural-speed as described in the docs, i.e.,

pip install -r requirements.txt
pip install .

and successfully ran phi-1.5 inference in the llama.cpp style.
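As a quick diagnostic (not from the thread, just a general Python technique), you can check whether the compiled whisper_cpp extension actually made it into the installed package, without triggering the ImportError at import time:

```python
import importlib.util

def has_module(name: str) -> bool:
    """Return True if `name` resolves to an importable module, without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # The parent package itself is not installed.
        return False

# In the failing environment above, this check would come back False:
# has_module("neural_speed.whisper_cpp")
```

A False result here means the extension was simply not built into the wheel, which matches the error above.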

Please explain how to run Whisper inference and, as with other models, add 3-bit inference support for Whisper.

intellinjun (Contributor) commented

You can use this PR and install neural_speed again. We don't currently support 3-bit inference; it is still in development.
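For readers landing here: pulling an unmerged PR into a local checkout generally follows GitHub's `pull/<N>/head` refspec. A sketch only; the comment above refers to a specific PR whose number is not preserved here, so it stays a placeholder:

```shell
# Sketch: fetch the PR's head into a local branch, then reinstall neural_speed.
# <PR_NUMBER> is a placeholder for the PR referenced in the comment above.
git fetch origin pull/<PR_NUMBER>/head:whisper-support
git checkout whisper-support
pip install -r requirements.txt
pip install .
```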


bil-ash commented Feb 4, 2024

Thanks, that example worked, so I'm closing the issue.
However, it starts using only one CPU core after the first few seconds of inference (both with and without the OMP_NUM_THREADS environment variable). Just letting you know; it's not a big problem for me.
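For completeness, OMP_NUM_THREADS is the standard OpenMP control for the worker-thread count and is set in the environment before launching the process. A minimal illustration; the actual inference command is omitted since the example script comes from the PR above:

```shell
# Standard OpenMP environment variable: caps the worker-thread pool size.
export OMP_NUM_THREADS=8
# The inference command would follow here and inherit the setting, e.g.:
#   python <whisper_example_script> --audio sample.wav   (script name per the PR)
echo "OMP_NUM_THREADS=${OMP_NUM_THREADS}"
```

Note that a runtime can still fall back to one core mid-inference regardless of this setting, as reported above, if a particular phase of the pipeline is single-threaded.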

bil-ash closed this as completed Feb 4, 2024