Improve compile times of the most expensive source modules #1806

lgritz · 2017-11-19T05:08:50Z

Some source modules took minutes to compile, mostly because they involve
lots of template instantiation for many type combinations.

So, first, some of the functions that were instantiated for a very wide
cross-product of type combinations were changed to the variety that
handles only the common types directly and uses float as an intermediate
format for the outlier cases. That reduces total compile work
substantially.
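
To make the "common types plus float fallback" idea concrete, here is a
minimal sketch. It is not the library's actual dispatch code: the `Pixels`
variant, `from_float`, `to_float`, `add_impl`, `both_hold`, and `add` are
all hypothetical names invented for illustration, and the choice of which
types count as "common" is an assumption. It also simplifies by returning a
float result for outlier inputs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <type_traits>
#include <variant>
#include <vector>

// Hypothetical runtime-typed pixel storage (not the library's real types).
using Pixels = std::variant<std::vector<std::uint8_t>,
                            std::vector<std::uint16_t>,
                            std::vector<std::int16_t>,
                            std::vector<float>>;

// Convert a float to T, clamping to T's range for integer types.
template <typename T>
T from_float(float v) {
    if constexpr (std::is_floating_point_v<T>) {
        return v;
    } else {
        const float lo = static_cast<float>(std::numeric_limits<T>::min());
        const float hi = static_cast<float>(std::numeric_limits<T>::max());
        return static_cast<T>(std::clamp(v, lo, hi));
    }
}

// The "expensive" kernel: each distinct T costs one template instantiation.
template <typename T>
std::vector<T> add_impl(const std::vector<T>& a, const std::vector<T>& b) {
    assert(a.size() == b.size());
    std::vector<T> r(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        r[i] = from_float<T>(static_cast<float>(a[i]) + static_cast<float>(b[i]));
    return r;
}

// Cheap conversion to float: one small instantiation per type (linear in the
// number of types, not a cross-product).
std::vector<float> to_float(const Pixels& p) {
    return std::visit(
        [](const auto& v) { return std::vector<float>(v.begin(), v.end()); }, p);
}

// True if both inputs currently hold std::vector<T>.
template <typename T>
bool both_hold(const Pixels& a, const Pixels& b) {
    return std::holds_alternative<std::vector<T>>(a) &&
           std::holds_alternative<std::vector<T>>(b);
}

// Dispatch: the kernel is instantiated only for the (assumed) common types;
// everything else is routed through float as the intermediate format.
Pixels add(const Pixels& a, const Pixels& b) {
    if (both_hold<float>(a, b))
        return add_impl(std::get<std::vector<float>>(a),
                        std::get<std::vector<float>>(b));
    if (both_hold<std::uint8_t>(a, b))
        return add_impl(std::get<std::vector<std::uint8_t>>(a),
                        std::get<std::vector<std::uint8_t>>(b));
    if (both_hold<std::uint16_t>(a, b))
        return add_impl(std::get<std::vector<std::uint16_t>>(a),
                        std::get<std::vector<std::uint16_t>>(b));
    // Outlier combinations (mixed types, int16, ...): convert, reuse the float kernel.
    return add_impl(to_float(a), to_float(b));
}
```

In this sketch the kernel is instantiated three times instead of once per
pair of input types, at the cost of an extra conversion pass for the
uncommon combinations.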

But also, particularly long-compiling modules can screw up parallel
builds: all the other modules for a library may finish compiling while
the one with the extra-long compile is still going, and nothing else can
proceed until it's done, so only one core is doing work while the others
wait. To combat this, I took the two files with the very longest compile
times (imagebufalgo_pixelmath and imagebufalgo_copy) and split them up
into multiple pieces. That doesn't reduce the total amount of
compilation work, but it does make it easier to parallelize it over
multiple cores.
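
A toy model of why splitting helps, assuming a simplified scheduler that
hands each compile job, in order, to whichever core frees up first. Real
build tools are more sophisticated, and `wall_clock` is a hypothetical
helper written for this illustration, not part of any build system.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Simulate a parallel build: each job goes to the core that becomes free
// first. Returns the wall-clock time at which the last core finishes.
double wall_clock(const std::vector<double>& job_seconds, int cores) {
    assert(cores > 0);
    std::vector<double> busy_until(cores, 0.0);
    for (double t : job_seconds) {
        // Pick the earliest-available core and run the job there.
        auto next = std::min_element(busy_until.begin(), busy_until.end());
        *next += t;
    }
    return *std::max_element(busy_until.begin(), busy_until.end());
}
```

With made-up job times of six 10-second files plus one 120-second file on
4 cores, this model gives 130 seconds of wall-clock time. Splitting the
long file into four 30-second pieces gives 50 seconds, even though the
total work is 180 seconds in both cases.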

The net result can be seen by these stats for fresh full builds (after
clearing ccache) on my 4-core laptop:

     before:       315.85 real       850.53 user        42.60 sys
     after:        200.44 real       725.09 user        36.21 sys

The "real" time is the important one: wall clock time drops by ~35%
when doing a full parallel build on a 4-core machine.
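
For anyone who wants to check the arithmetic, here is a small sketch. The
helper names `reduction` and `utilization` are hypothetical, and treating
user + sys as total CPU work is an assumption about how to read the
timings.

```cpp
#include <cassert>

// Fractional reduction going from `before` to `after` (0.25 means 25% less).
double reduction(double before, double after) {
    assert(before > 0.0);
    return (before - after) / before;
}

// Fraction of available core-time actually spent working, taking
// user + sys as the total CPU time consumed by the build.
double utilization(double real, double user, double sys, int cores) {
    assert(real > 0.0 && cores > 0);
    return (user + sys) / (real * cores);
}
```

Plugging in the numbers above, `reduction(315.85, 200.44)` comes out to
about 0.365, and `utilization` on 4 cores goes from about 0.71 before to
about 0.95 after, which is consistent with the cores sitting idle less.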

Note: the big blocks of changes are just moving code from one file to another.
Those sections didn't have any logic changes.

lgritz · 2017-11-23T21:05:16Z

For those curious, I reran the timings on a beefier Linux machine where I typically build with ninja -j 24, and the results were similar:

before:    3:01 real   23:51 user   0:41 sys
after:     2:02 real   22:48 user   0:41 sys

So it reduces total CPU work only slightly (about 5%), but by balancing the load across the threads better, it cuts the human time spent waiting for the build by 33%!

Watching the graphical load meter as it goes, I saw that there is still a good 30-45 seconds where only 2 or 3 threads are working, so I think I can get even better utilization if I jiggle things a bit more.

Signed-off-by: Brad Smith <[email protected]> Co-authored-by: Rémi Achard <[email protected]> Co-authored-by: Doug Walker <[email protected]>

lgritz merged commit 81ea41c into AcademySoftwareFoundation:master Nov 21, 2017

lgritz deleted the lg-breakup branch November 27, 2017 07:05
