qgemm: optimize avxvnni QGEMM inner kernel for M=1
QGEMM benchmarks with M = 1 on a 13th Gen Intel(R) Core(TM) i9-13900K
show a roughly 1.4x speedup on a single thread.

|--------------------------------------------------------------------+--------+---------+----------+----------+---------+---------|
| Benchmark                                                          | Time   | CPU     | Time Old | Time New | CPU Old | CPU New |
|--------------------------------------------------------------------+--------+---------+----------+----------+---------+---------|
| QGEMM/UnsignedAPackB/M:1/N:512/K:512/Batch:1/Threads:1/real_time   | -0.275 | -0.2756 | 4330     | 3137     | 4330    | 3136    |
| QGEMM/UnsignedAPackB/M:1/N:512/K:1024/Batch:1/Threads:1/real_time  | -0.292 | -0.2927 | 9027     | 6385     | 9027    | 6385    |
| QGEMM/UnsignedAPackB/M:1/N:1024/K:1024/Batch:1/Threads:1/real_time | -0.300 | -0.3005 | 17867    | 12499    | 17866   | 12498   |
| OVERALL_GEOMEAN                                                    | -0.289 | -0.2897 |          |          |         |         |
|--------------------------------------------------------------------+--------+---------+----------+----------+---------+---------|
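
The 1.4x figure can be cross-checked from the table. The sketch below is not part of the commit. It assumes the Time and CPU columns are relative changes, (new - old) / old, which is consistent with the Old/New values shown, and it recomputes the per-benchmark change and the geometric-mean speedup from the rounded real-time numbers. The dictionary keys and helper names are made up for illustration.

```python
import math

# (old, new) real-time values copied from the table above.
# Units are whatever the benchmark tool reported; they cancel out in the ratios.
timings = {
    "M:1/N:512/K:512": (4330, 3137),
    "M:1/N:512/K:1024": (9027, 6385),
    "M:1/N:1024/K:1024": (17867, 12499),
}


def relative_change(old, new):
    """Relative change in time; negative means the new kernel is faster."""
    return (new - old) / old


def speedup(old, new):
    """How many times faster the new kernel is than the old one."""
    return old / new


# Geometric mean of the new/old ratios, then inverted to get a speedup.
ratios = [new / old for old, new in timings.values()]
geomean_ratio = math.exp(sum(math.log(r) for r in ratios) / len(ratios))
geomean_speedup = 1.0 / geomean_ratio

for name, (old, new) in timings.items():
    print(f"{name}: change {relative_change(old, new):+.4f}, "
          f"speedup {speedup(old, new):.2f}x")
print(f"geomean change {geomean_ratio - 1:+.4f}, speedup {geomean_speedup:.2f}x")
```

Because the Old/New columns are rounded, the recomputed changes can differ from the table in the last digit.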
r-devulap committed Nov 26, 2024
1 parent 1b35bb0 commit 42a8d2d
Showing 4 changed files with 555 additions and 99 deletions.