[CPU][ARM] KleidiAI integration and KleidiAI MM executor #28830

Merged: 14 commits merged on Feb 17, 2025

@alvoron alvoron commented Feb 5, 2025

Details:

  • kleidiai is added as a git submodule
  • kleidiai is built statically and linked into the CPU plugin library
  • A MatMul kleidiai executor is added
  • Weights transpose is supported in the MatMul kleidiai executor
  • The initial implementation is inherited from #27331 (Add kleidiai as thirdparty)
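
To illustrate what "weights transpose" means for MatMul, here is a minimal pure-Python sketch. It is not the executor's code and does not use the KleidiAI API; the function name `matmul` and the sample values are hypothetical. It only shows the semantics assumed here: when the weights are stored as N x K instead of K x N, they are transposed before the multiply (the behavior OpenVINO's MatMul exposes through its `transpose_b` attribute).

```python
def matmul(a, b, transpose_b=False):
    """Multiply a (M x K) by b, where b is K x N, or N x K if transpose_b is True."""
    if transpose_b:
        # Turn the N x K layout into K x N before multiplying.
        b = [list(col) for col in zip(*b)]
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    # zip(*b) iterates over the columns of b.
    return [[sum(x * w for x, w in zip(row, col)) for col in zip(*b)] for row in a]


# Hypothetical data: 2 x 2 activations, weights stored as N x K = 3 x 2.
activations = [[1.0, 2.0], [3.0, 4.0]]
weights_nk = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

result = matmul(activations, weights_nk, transpose_b=True)
print(result)
```

A real executor would typically avoid a separate transpose pass at inference time, for example by repacking constant weights once up front; whether and how this PR does so is not stated in the description.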

Tickets:

  • ticket-id

@alvoron alvoron added the platform: arm OpenVINO on ARM / ARM64 label Feb 5, 2025
@alvoron alvoron requested review from a team as code owners February 5, 2025 08:30
@github-actions github-actions bot added the category: CPU (OpenVINO CPU plugin), category: build (OpenVINO cmake script / infra), and category: dependency_changes (Pull requests that update a dependency file) labels Feb 5, 2025
@alvoron alvoron force-pushed the alvoron_kleidi_matmul branch 2 times, most recently from fe0902d to ed92197 Compare February 5, 2025 10:12
@alvoron alvoron force-pushed the alvoron_kleidi_matmul branch from ed92197 to 071c016 Compare February 5, 2025 10:28
@alvoron alvoron force-pushed the alvoron_kleidi_matmul branch 3 times, most recently from 919dbbc to e77b37f Compare February 6, 2025 13:00
@alvoron alvoron force-pushed the alvoron_kleidi_matmul branch from e77b37f to 2388725 Compare February 6, 2025 13:13
@alvoron alvoron force-pushed the alvoron_kleidi_matmul branch from 3b86719 to 60bad2c Compare February 7, 2025 12:31
@alvoron alvoron force-pushed the alvoron_kleidi_matmul branch from 60bad2c to e3350e9 Compare February 7, 2025 12:42
@dmitry-gorokhov

@NishantPrabhuFujitsu Alex fixed major perf/memory issues. Will you be able to try it again and measure the perf?

@alvoron alvoron requested a review from a team as a code owner February 7, 2025 13:31
@github-actions github-actions bot added the category: licensing Changes in OpenVINO licenses label Feb 7, 2025
NishantPrabhuFujitsu commented Feb 9, 2025

Here are the benchmark results for TinyLlama-1.1B-Chat-v1.0 after incorporating the fixes.

| Backend | Prompt evaluation throughput (tokens/sec) | Decoding throughput (tokens/sec) |
|---|---|---|
| KleidiAI | 397.98 | 34.45 |
| ACL | 402.79 | 34.42 |

The performance is on par with ACL now. The memory leak is also resolved. Thanks!
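
For a sense of scale, the gap between the two backends can be worked out from the reported figures. A minimal Python sketch; the helper name `relative_diff_percent` is illustrative, and the numbers are the ones reported in this comment:

```python
def relative_diff_percent(value, baseline):
    """Difference of value relative to baseline, in percent."""
    return (value - baseline) / baseline * 100.0


# Throughput in tokens/sec, as reported for TinyLlama-1.1B-Chat-v1.0.
results = {
    "prompt evaluation": {"KleidiAI": 397.98, "ACL": 402.79},
    "decoding": {"KleidiAI": 34.45, "ACL": 34.42},
}

diffs = {
    phase: relative_diff_percent(r["KleidiAI"], r["ACL"])
    for phase, r in results.items()
}

for phase, diff in diffs.items():
    print(f"{phase}: KleidiAI vs ACL {diff:+.2f}%")
```

This gives roughly -1.19% for prompt evaluation and +0.09% for decoding. No run-to-run variance was reported, so these differences should be read as "about the same" rather than as a ranking.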

dmitry-gorokhov commented Feb 10, 2025

@NishantPrabhuFujitsu Thanks a lot!
I think the perf looks good. Since this is fp32 precision, I wouldn't expect any perf benefit over ACL.
Int8/Int4 will make the difference :)

We will finalize the PR and merge it once we get approval.

@alvoron alvoron force-pushed the alvoron_kleidi_matmul branch from 22b8186 to b7e3f21 Compare February 14, 2025 08:01
@alvoron alvoron force-pushed the alvoron_kleidi_matmul branch from 6df0caf to db8959d Compare February 14, 2025 14:46
@alvoron alvoron force-pushed the alvoron_kleidi_matmul branch from db8959d to a9fb2b9 Compare February 14, 2025 14:48
@alvoron alvoron requested a review from a team as a code owner February 15, 2025 17:23
@alvoron alvoron requested review from itikhono and removed request for a team February 15, 2025 17:23
@github-actions github-actions bot added the category: transformations OpenVINO Runtime library - Transformations label Feb 15, 2025
@alvoron alvoron force-pushed the alvoron_kleidi_matmul branch from 4ab895b to 16411be Compare February 15, 2025 18:55
@alvoron alvoron force-pushed the alvoron_kleidi_matmul branch from 16411be to 0c6fc5e Compare February 15, 2025 19:02
@dmitry-gorokhov dmitry-gorokhov added this to the 2025.1 milestone Feb 17, 2025
@dmitry-gorokhov dmitry-gorokhov added this pull request to the merge queue Feb 17, 2025
Merged via the queue into openvinotoolkit:master with commit ff06fa3 Feb 17, 2025
186 checks passed
Labels

  • category: build (OpenVINO cmake script / infra)
  • category: CPU (OpenVINO CPU plugin)
  • category: dependency_changes (Pull requests that update a dependency file)
  • category: licensing (Changes in OpenVINO licenses)
  • category: transformations (OpenVINO Runtime library - Transformations)
  • platform: arm (OpenVINO on ARM / ARM64)
4 participants