chore: Enable ThinLTO #2548

osiewicz · 2023-06-01T10:49:14Z

Per conversation with Antonio, I've suggested enabling full LTO; right now we use a crate-local ThinLTO, which does not inline function calls across crates.

Configuration	Current main (`788f97e`)	Thin LTO	Full LTO
Size in bytes	158806721	155868753	111115553
% of `main` size	100%	98.14%	69.96%
Size in bytes (no debug info)	129186657	127942929	108281345

SomeoneToIgnore · 2023-06-01T10:50:53Z

I would really love to see compilation time differences (or cargo --timings files, ideally)

SomeoneToIgnore · 2023-06-01T10:54:03Z

Another idea for the size reduction, that might not increase the build time a lot: cut off dependencies' debug symbols

[profile.release.package."*"]
debug = false

since we don't use big executor runtimes, we won't loose important the backtrace info this way?

osiewicz · 2023-06-01T10:56:25Z

Yup, I can provide those (though that'll take a while). Full LTO took about 20 minutes for me (vs 3 minutes for ThinLTO), which is far from ideal; however, we could add a custom dist profile that'd run full LTO and leave release profile as is.

osiewicz · 2023-06-01T10:58:59Z

Tbh I'm more interested in potential perf improvements, though size reduction is visible at glance

SomeoneToIgnore · 2023-06-01T11:03:28Z

AFAIK, we neither have any benchmark runs nor any historical perf data to notice any perf benefits LTO brings, hence I'm discussing the sizes only.
I would not expect LTO to help anyhow notably with the perf, but would be glad to be proven wrong 🙂

maxbrunsfeld · 2023-06-01T18:53:12Z

You could just do an ad-hoc before-and-after comparison of some known expensive operation, like editing a multi-buffer with 1000 cursors, or something like that.

maxbrunsfeld · 2023-06-01T18:54:48Z

A 20-minute LTO time, even if we only did it on CI for our releases, would definitely slow down our release process a bit. Do you have a sense of whether LTO time can be reduced by running on a machine with lots of CPU cores?

We could upgrade the machine we use to build the releases. MacStadium (the service we use for cloud-hosted macs) offers a Mac Studio (which has 20 CPU cores instead of 8).

osiewicz · 2023-06-01T19:41:58Z

I will upload timings that @SomeoneToIgnore requested tomorrow. In private convo we also came up with comparing sizes of various LTO flavours + debug=false (to see how much of the size reduction comes from removing references to functions -> removing debug info).
I don't believe fat LTO benefits that much from more cores. :/ We'd have to first see what kind of gain we could expect.
The intent of this PR is more so to spark a discussion than to get it merged ASAP (setting our CI machines on fire in the meantime).

osiewicz · 2023-06-01T19:56:35Z

Here are the timings.zip
Tl;dr:
Full LTO takes over 20 minutes,
Both NoLTO and ThinLTO take 4m20s.

Commands:
cargo clean followed by cargo build --timings --release

At the very least it may be worth a shot to try ThinLTO at some point in the future, as it aids cross-crate inlining and bunch of other optimisations. At the end of the day, features are more important right now so I guess we don't want to spend too much time on this?

maxbrunsfeld · 2023-06-01T20:20:15Z

~~Thin LTO is the default, right?~~

Nevermind, I guess "thin local LTO" is the default, which is different from "thin LTO". I'm open to just switching to "thin LTO" if the compile-time cost is low.

Change Fat LTO to ThinLTO

osiewicz · 2023-06-06T15:31:46Z

I believe that most of Full LTO's size reduction comes from stripping unused code -> stripping debug info (as the difference is size is not nearly as big when Zed is built with debug=false); another factor that might come into play is that as far as I know, Full LTO greatly aids devirtualization of virtual calls.

Anyways, we can come back to this later. As it stands, I'll merge ThinLTO variant.

Edit: we can also use multitargets to speed up bundle script.

chore: Enable full LTO

799f200

osiewicz added the run-build-dmg label Jun 1, 2023

Update Cargo.toml

797322f

Change Fat LTO to ThinLTO

maxbrunsfeld approved these changes Jun 6, 2023

View reviewed changes

osiewicz merged commit 572c59e into main Jun 6, 2023

osiewicz deleted the enable_lto branch June 6, 2023 19:50

osiewicz changed the title ~~chore: Enable full LTO~~ chore: Enable ThinLTO Jun 6, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

chore: Enable ThinLTO #2548

chore: Enable ThinLTO #2548

osiewicz commented Jun 1, 2023 •

edited

Loading

SomeoneToIgnore commented Jun 1, 2023

SomeoneToIgnore commented Jun 1, 2023

osiewicz commented Jun 1, 2023

osiewicz commented Jun 1, 2023

SomeoneToIgnore commented Jun 1, 2023 •

edited

Loading

maxbrunsfeld commented Jun 1, 2023

maxbrunsfeld commented Jun 1, 2023 •

edited

Loading

osiewicz commented Jun 1, 2023 •

edited

Loading

osiewicz commented Jun 1, 2023 •

edited

Loading

maxbrunsfeld commented Jun 1, 2023 •

edited

Loading

osiewicz commented Jun 6, 2023 •

edited

Loading

chore: Enable ThinLTO #2548

chore: Enable ThinLTO #2548

Conversation

osiewicz commented Jun 1, 2023 • edited Loading

SomeoneToIgnore commented Jun 1, 2023

SomeoneToIgnore commented Jun 1, 2023

osiewicz commented Jun 1, 2023

osiewicz commented Jun 1, 2023

SomeoneToIgnore commented Jun 1, 2023 • edited Loading

maxbrunsfeld commented Jun 1, 2023

maxbrunsfeld commented Jun 1, 2023 • edited Loading

osiewicz commented Jun 1, 2023 • edited Loading

osiewicz commented Jun 1, 2023 • edited Loading

maxbrunsfeld commented Jun 1, 2023 • edited Loading

osiewicz commented Jun 6, 2023 • edited Loading

osiewicz commented Jun 1, 2023 •

edited

Loading

SomeoneToIgnore commented Jun 1, 2023 •

edited

Loading

maxbrunsfeld commented Jun 1, 2023 •

edited

Loading

osiewicz commented Jun 1, 2023 •

edited

Loading

osiewicz commented Jun 1, 2023 •

edited

Loading

maxbrunsfeld commented Jun 1, 2023 •

edited

Loading

osiewicz commented Jun 6, 2023 •

edited

Loading