Pass to add llvm annotations to avoid inlining #1142

newling · 2025-02-26T18:36:00Z

It doesn't make much sense to outline 64x64x64 matmuls to reduce program memory (PM), only to let LLVM then inline them again and run out of PM! Let's see whether this causes bad regressions.

Included here: put function names in alphabetical order in Passes.*

UPDATE

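This causes a serious slowdown in O3 outlined matmuls: mean time on the benchmark below goes from 1897 us to 2909 us. Nice memory saving though: the largest PM size drops from 8464 bytes to 3680 bytes. I/we should understand how inlining is helping; it seems like the compiler backend is just getting lucky.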

With PR:

matmul_4096_512_512_bf16_f32_O3_npu1_4col_outline_benchmark
--------------------------------------------------------------------------------------------------
Benchmark                                        Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------------------------
BM_matmul/process_time/real_time_mean         2909 us          123 us            5 items_per_second=343.923/s
BM_matmul/process_time/real_time_median       2893 us          120 us            5 items_per_second=345.623/s
BM_matmul/process_time/real_time_stddev       64.2 us         12.6 us            5 items_per_second=7.49528/s
--------------------------------------------------------------------------------------------------
The largest program memory size (read from byte 72 of elf files) is 3680 bytes

Before:

matmul_4096_512_512_bf16_f32_O3_npu1_4col_outline_benchmark
--------------------------------------------------------------------------------------------------
Benchmark                                        Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------------------------
BM_matmul/process_time/real_time_mean         1897 us          108 us            5 items_per_second=527.121/s
BM_matmul/process_time/real_time_median       1896 us          107 us            5 items_per_second=527.447/s
BM_matmul/process_time/real_time_stddev       11.0 us         5.43 us            5 items_per_second=3.0455/s
--------------------------------------------------------------------------------------------------
The largest program memory size (read from byte 72 of elf files) is 8464 bytes
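
To put the two runs side by side, here is a minimal sketch that computes the relative change in mean time and in largest PM size. The numbers are copied from the benchmark output above; the dictionary keys and the `relative_change` helper are illustrative names, not part of the PR.

```python
# Figures copied from the two benchmark runs above (mean real time, largest PM size).
with_pr = {"mean_us": 2909.0, "pm_bytes": 3680}
before = {"mean_us": 1897.0, "pm_bytes": 8464}


def relative_change(new, old):
    """Return (new - old) / old as a fraction; positive means an increase."""
    return (new - old) / old


# Positive value: the PR makes the benchmark slower.
slowdown = relative_change(with_pr["mean_us"], before["mean_us"])
# Negative value: the PR shrinks the largest program memory size.
pm_change = relative_change(with_pr["pm_bytes"], before["pm_bytes"])

print(f"mean time change:  {slowdown:+.1%}")
print(f"largest PM change: {pm_change:+.1%}")
```

On these figures, the mean time rises by roughly 53% while the largest PM size falls by roughly 57%. This covers a single benchmark configuration, so it says nothing about other shapes or optimization levels.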

first commit

33b7c1c

newling requested review from makslevental, nirvedhmeshram, MaheshRavishankar, yzhang93, Abhishek-Varma and jtuyls as code owners February 26, 2025 18:36

newling marked this pull request as draft February 26, 2025 19:36

newling mentioned this pull request Feb 27, 2025

Pass to add llvm annotations to avoid inlining (#2) #1146

Closed
