[NPU] Slow Token Generation with Latest NPU Driver 32.0.100.3053 on LNL 226V series #12266

climh · 2024-10-25T00:51:32Z

Description

Observed less than 1 token per second generation for models with 7B or more parameters using the 32.0.100.3053 driver with the latest ipex-llm[npu] on an LNL 226V series laptop.

Models tested:

Qwen/Qwen2.5-7B-Instruct
meta-llama/Llama-2-7b-hf
meta-llama/Meta-Llama-3-8B-Instruct

ipex-llm[npu] = 2.2.0b20241022
npu driver = 32.0.100.3053
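
For anyone trying to reproduce the throughput figure, here is a minimal sketch of how tokens per second could be measured. It is not the script used for the numbers above. `tokens_per_second` and `fake_generate` are hypothetical names; `generate` stands for any zero-argument wrapper around the real generation call that returns the full output token ids (prompt plus new tokens). Note that this lumps prompt processing and decoding into one number, so it understates the pure decode rate.

```python
import time
from typing import Callable, Sequence


def tokens_per_second(generate: Callable[[], Sequence[int]], prompt_len: int) -> float:
    """Time one call to `generate` and return newly generated tokens per second.

    `generate` is a hypothetical wrapper around the actual model call; it must
    return the full sequence of token ids (prompt tokens followed by new tokens).
    """
    start = time.perf_counter()
    output_ids = generate()
    elapsed = time.perf_counter() - start

    # Only tokens produced beyond the prompt count as generated output.
    new_tokens = len(output_ids) - prompt_len
    if new_tokens <= 0 or elapsed <= 0:
        raise ValueError("no new tokens generated or timer did not advance")
    return new_tokens / elapsed


if __name__ == "__main__":
    # Stand-in for a real model call: pretends an 8-token prompt
    # yields 4 new tokens after a short delay.
    def fake_generate() -> list:
        time.sleep(0.2)
        return list(range(8 + 4))

    rate = tokens_per_second(fake_generate, prompt_len=8)
    print(f"{rate:.2f} tokens/s")
```

To use it with a real model, replace `fake_generate` with a closure that calls the model's generation function on a fixed prompt, and run it once as a warm-up before taking the measurement.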

hkvision added the user issue label Oct 25, 2024

plusbang assigned plusbang and unassigned plusbang Oct 25, 2024
