Add LTO/IPO compile support #764

wanghenshui · 2022-08-02T08:04:55Z

No description provided.

CMakeLists.txt

PragmaTwice · 2022-08-02T12:40:14Z

It seems GCC builds on Linux work well, but Clang builds fail in the linking phase (https://github.com/apache/incubator-kvrocks/runs/7631095274?check_suite_focus=true).

I have not tried it manually, but I guess it may be due to ld, which cannot recognize the *.o files generated by Clang with -flto enabled (in LTO mode they are actually LLVM bitcode, not native object files).

Hence you can try replacing ld with lld when Clang is used. I can help you with that if you are not familiar with this stuff.

wanghenshui · 2022-08-02T12:43:36Z

I'll fix it. wait for

CMakeLists.txt

PragmaTwice · 2022-08-03T08:13:07Z

clang not working. i cannot fix it.

It's fine. I will help you.

I'm going to try your branch on my side and commit to it. So please do not force-push from now on 🤣

git-hulk

LGTM. It doesn't look like it increased compile time much compared to before, but I didn't look at the memory usage.

ShooterIT · 2022-08-03T11:09:56Z

I think we should add some descriptions for these options: why we use them, and their advantages or disadvantages.

PragmaTwice · 2022-08-03T12:30:18Z

Link-time optimization (LTO) is an important optimization technology in modern compilers.

As we already know, C/C++ compilers generate a native object file for every translation unit (TU), which means the compiler can only see information within a single TU, i.e. it knows nothing about other TUs (except declarations). But obviously most compiler optimizations require the information in function definitions, such as constant propagation, reachability analysis, interprocedural dataflow analysis, inlining, loop-invariant analysis, pointer analysis, etc.

So if the compiler cannot retrieve such information, the optimization is simply skipped. Obviously, this is a huge loss. LTO postpones the optimization passes to link time, so (almost) every definition in the program is available to the optimizer. I do not think LTO has any disadvantages other than possibly slowing down compilation.

I do not know why CMake calls it interprocedural optimization, since interprocedural optimization can be done without LTO (though only in a limited form, since it loses the information in definitions from other TUs).

PragmaTwice · 2022-08-04T10:51:41Z

Hi everyone, is there any other thought on this PR? I will merge it if no further discussion 🚀

ShooterIT · 2022-08-05T01:27:02Z

oh, thanks, is there a performance comparison?

PragmaTwice · 2022-08-05T04:44:29Z

You can check this review and this paper.

ShooterIT · 2022-08-05T05:58:03Z

I mean, how does it affect kvrocks?
When we add new features or improvements, we must know their benefits. For example, if this build option makes kvrocks perform better, we should know that and by how much. Otherwise, other contributors may add new optimization options without us knowing which option brings gains or why. We will lose control of this project, and we won't be able to answer users' questions about how to optimize performance with build options.

I am not saying we will merge it only if this option brings an obvious performance improvement, but we must understand its effect clearly. In addition, and not only for this PR: even if changes are valid on another system, we don't need to merge them if they don't bring gains to kvrocks so far, since over time they make the project more and more complex and harder to maintain.

BTW, if you merge this commit, please also describe it clearly in the commit log.

wanghenshui · 2022-08-05T06:46:21Z

Do we have a benchmark test, or some scripts to test performance easily? We could use them to measure the difference before and after LTO.

LTO is a free lunch, just like moving the compiler from C++98 to C++11 or C++17: the newer, the more gains.

PragmaTwice · 2022-08-05T07:08:12Z

Hi @ShooterIT, I agree with your argument that benchmark matters.

I think we can wait for a benchmark before merging it, or set ENABLE_IPO to OFF by default, since LTO is used in practice in lots of database projects like MySQL, ClickHouse and MongoDB.

ShooterIT · 2022-08-05T07:30:01Z

From studying these options, I think they would bring performance gains, but so far we still don't know how much on kvrocks.

ShooterIT · 2022-08-05T07:32:25Z

For testing, we can currently use redis-benchmark:
https://github.com/apache/incubator-kvrocks#2--qps-on-different-payloads

tisonkun

@wanghenshui @PragmaTwice perhaps you can submit a benchmark report before and after this patch via redis-benchmark -p 6666 -q.

tisonkun · 2022-11-08T08:49:17Z

Closed as no consensus. @wanghenshui I suggest you create an issue first and share the performance benchmark so that our maintainers can be sure whether this change is beneficial.

wanghenshui · 2022-11-19T14:07:50Z

before IPO

redis-benchmark -q -p 6666
ERROR: failed to fetch CONFIG from 127.0.0.1:6666
WARNING: Could not fetch server CONFIG
PING_INLINE: 185528.77 requests per second, p50=0.143 msec                    
PING_MBULK: 191204.59 requests per second, p50=0.135 msec                    
SET: 132802.12 requests per second, p50=0.343 msec                    
GET: 205338.81 requests per second, p50=0.127 msec                    
INCR: 124688.28 requests per second, p50=0.359 msec                    
LPUSH: 95693.78 requests per second, p50=0.455 msec                   
RPUSH: 101936.80 requests per second, p50=0.399 msec                   
LPOP: 90579.71 requests per second, p50=0.463 msec                   
RPOP: 84033.61 requests per second, p50=0.495 msec                   
SADD: 132100.39 requests per second, p50=0.303 msec                    
HSET: 133511.34 requests per second, p50=0.311 msec                    
SPOP: 224719.11 requests per second, p50=0.119 msec                    
ZADD: 180180.17 requests per second, p50=0.231 msec                    
ZPOPMIN: 289855.06 requests per second, p50=0.095 msec                    
LPUSH (needed to benchmark LRANGE): 100200.40 requests per second, p50=0.439 msec                    
LRANGE_100 (first 100 elements): 118063.76 requests per second, p50=0.231 msec                    
LRANGE_300 (first 300 elements): 65445.03 requests per second, p50=0.415 msec                   
LRANGE_500 (first 500 elements): 41876.05 requests per second, p50=0.647 msec                   
LRANGE_600 (first 600 elements): 37735.85 requests per second, p50=0.719 msec                   
MSET (10 keys): 21537.80 requests per second, p50=2.223 msec

after IPO

 redis-benchmark -q -p 6666
ERROR: failed to fetch CONFIG from 127.0.0.1:6666
WARNING: Could not fetch server CONFIG
PING_INLINE: 233100.23 requests per second, p50=0.111 msec                    
PING_MBULK: 241545.89 requests per second, p50=0.111 msec                    
SET: 137741.05 requests per second, p50=0.319 msec                    
GET: 204498.98 requests per second, p50=0.127 msec                    
INCR: 119760.48 requests per second, p50=0.375 msec                    
LPUSH: 99108.03 requests per second, p50=0.439 msec                   
RPUSH: 101626.02 requests per second, p50=0.415 msec                    
LPOP: 86655.11 requests per second, p50=0.463 msec                   
RPOP: 97656.24 requests per second, p50=0.415 msec                    
SADD: 149476.83 requests per second, p50=0.311 msec                    
HSET: 163398.70 requests per second, p50=0.255 msec                    
SPOP: 158478.61 requests per second, p50=0.167 msec                    
ZADD: 136425.66 requests per second, p50=0.327 msec                    
ZPOPMIN: 242718.45 requests per second, p50=0.111 msec                    
LPUSH (needed to benchmark LRANGE): 93720.71 requests per second, p50=0.447 msec                   
LRANGE_100 (first 100 elements): 141442.72 requests per second, p50=0.183 msec                    
LRANGE_300 (first 300 elements): 69832.40 requests per second, p50=0.391 msec                   
LRANGE_500 (first 500 elements): 44247.79 requests per second, p50=0.607 msec                   
LRANGE_600 (first 600 elements): 37721.61 requests per second, p50=0.719 msec                   
MSET (10 keys): 23557.13 requests per second, p50=1.911 msec

After the IPO optimization, the redis-benchmark quick results were worse, so don't merge it.
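
To make the two runs above easier to compare, here is a minimal sketch that parses `redis-benchmark -q` output and reports the relative throughput change per test. The helper names `parse` and `percent_change` are hypothetical, and the sample strings are only a subset of the lines posted in this comment. With those figures the sign of the change differs from command to command, and a single run per build is probably too noisy to settle the question, so treat the numbers as indicative only.

```python
import re

# Matches redis-benchmark -q lines such as
# "SET: 132802.12 requests per second, p50=0.343 msec".
LINE_RE = re.compile(r"^\s*(.+?): ([\d.]+) requests per second")


def parse(output):
    """Map each test name to its requests-per-second figure."""
    results = {}
    for line in output.splitlines():
        match = LINE_RE.match(line)
        if match:  # ERROR/WARNING lines do not match and are skipped
            results[match.group(1)] = float(match.group(2))
    return results


def percent_change(before, after):
    """Relative throughput change in percent for tests present in both runs."""
    return {
        name: (after[name] - before[name]) / before[name] * 100.0
        for name in before
        if name in after
    }


# Subset of the figures posted above (not a new measurement).
BEFORE = """\
ERROR: failed to fetch CONFIG from 127.0.0.1:6666
PING_INLINE: 185528.77 requests per second, p50=0.143 msec
PING_MBULK: 191204.59 requests per second, p50=0.135 msec
SET: 132802.12 requests per second, p50=0.343 msec
GET: 205338.81 requests per second, p50=0.127 msec
HSET: 133511.34 requests per second, p50=0.311 msec
SPOP: 224719.11 requests per second, p50=0.119 msec
ZADD: 180180.17 requests per second, p50=0.231 msec
"""

AFTER = """\
ERROR: failed to fetch CONFIG from 127.0.0.1:6666
PING_INLINE: 233100.23 requests per second, p50=0.111 msec
PING_MBULK: 241545.89 requests per second, p50=0.111 msec
SET: 137741.05 requests per second, p50=0.319 msec
GET: 204498.98 requests per second, p50=0.127 msec
HSET: 163398.70 requests per second, p50=0.255 msec
SPOP: 158478.61 requests per second, p50=0.167 msec
ZADD: 136425.66 requests per second, p50=0.327 msec
"""

changes = percent_change(parse(BEFORE), parse(AFTER))

# Print the largest regression first and the largest gain last.
for name, delta in sorted(changes.items(), key=lambda item: item[1]):
    print(f"{name:12s} {delta:+7.2f}%")
```

Repeating each run several times and comparing medians would give a more trustworthy picture than one before/after pair.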

git-hulk · 2022-11-19T14:11:53Z

Thanks for @wanghenshui's input; I'd like to give it a try as well.

Update: I tested ping/ping_bulk, which isn't affected by the data volume, and this PR improves the performance from 54929.96 to 57378.93. It's worth reopening since we have the data to prove this.

PragmaTwice · 2022-11-19T15:37:01Z

The merge conflict is resolved now.

PragmaTwice · 2022-11-20T10:17:31Z

I think we can merge it now since the benchmark looks good.

git-hulk · 2022-11-20T11:48:03Z

Yes

git-hulk · 2022-11-20T12:38:23Z

Thanks all, merging...

git-hulk · 2022-11-20T12:39:22Z

Thanks again to @wanghenshui for the contribution.

git-hulk requested a review from PragmaTwice August 2, 2022 08:06

PragmaTwice reviewed Aug 2, 2022

wanghenshui closed this Aug 2, 2022

wanghenshui reopened this Aug 2, 2022

PragmaTwice reviewed Aug 2, 2022

PragmaTwice reviewed Aug 3, 2022

Add LTO/IPO compile support

325f0b7

PragmaTwice added 4 commits August 3, 2022 16:15

fix lto

6da7d28

Merge branch 'whs_unstable' into HEAD

bbcc1b5

fix

db725f5

Merge branch 'whs_unstable' into HEAD

1c52cc3

PragmaTwice requested review from git-hulk and tisonkun August 3, 2022 10:00

PragmaTwice previously approved these changes Aug 3, 2022

git-hulk previously approved these changes Aug 3, 2022

Merge branch 'unstable' into unstable

a330257

tisonkun dismissed git-hulk’s stale review via a330257 October 23, 2022 03:39

tisonkun dismissed PragmaTwice’s stale review via a330257 October 23, 2022 03:39

tisonkun reviewed Oct 23, 2022

Merge branch 'apache:unstable' into unstable

48fcbcf

tisonkun closed this Nov 8, 2022

git-hulk reopened this Nov 19, 2022

Merge branch 'unstable' into unstable

c75f6b5

git-hulk approved these changes Nov 20, 2022

PragmaTwice approved these changes Nov 20, 2022

git-hulk merged commit bf13335 into apache:unstable Nov 20, 2022

AlexStocks mentioned this pull request Apr 15, 2023

build: support LTO OpenAtomFoundation/pika#1299

Closed

git-hulk mentioned this pull request Jul 6, 2023

Evaluate Profile-Guided Optimization on kvrocks #1551

Open
