Greatly decrease the size of rustc_driver.so when debuginfo is enabled #110221
r? @ozkanonur (rustbot has picked a reviewer for you, use r? to override)
Done: https://rust-lang.zulipchat.com/#narrow/stream/317568-t-compiler.2Fwg-debugging/topic/investigating.20debuginfo.20size/near/348854698 (Separately, I want to enable frame pointers unconditionally so that
I think frame pointers aren't enough to get inlined functions which can be pretty important. But I could be wrong about that.
@jyn514 I wasn't suggesting it for all functions, just for a couple trivial
6537a5f to 92375ea
I see that this is using -gz to compress debuginfo -- that implies gzip, right? Maybe we can use zstd or similar for even better wins, though perhaps tooling support is less mature there. I'm also having trouble running this locally. I seem to get this error with debuginfo enabled:
Can you confirm whether I'm just testing the wrong way or something else is wrong? Here is the command that's getting executed:
Yes, that implies gzip.
So,
I can't find a detailed description like this in the clang documentation, but I think it behaves the same: https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-gz I would rather not try to be smarter than the C compiler driver - I think it will be hard to maintain, the compression benefit won't be very high (https://rust-lang.zulipchat.com/#narrow/stream/122651-general/topic/precedent.20for.20linker.20feature.20detection.3F/near/348646500), and libbacktrace doesn't support it (and adding that support will itself increase the size of the generated binary).
Oh that's my bad, sorry, I forgot rust-lang/cargo#11958 didn't make it into 1.70 beta. This is blocked for another 6 weeks in that case.
Ah, if that is done for us, great. I guess maybe it'll still break with e.g. windows msvc toolchain, but we can deal with that then. Sounds good on waiting another cycle here.
@bors r+ I think this is a good step to take -- I don't think it's enough by itself that we can enable debuginfo by default, but still, a great win for everyone building rustc locally.
📌 Commit c5c439188764c4dadc82ae451025535359d8aea6 has been approved by It is now in the queue for this repository.
i think this is a bug in gcc actually? its detection is wrong for mingw :( i can either switch bootstrap from using
- Only add -gz if it's supported
- Don't include extra unnecessary debuginfo when only debuginfo-level=1 is set
- Compress debuginfo sections to reduce the size of debuginfo on disk.

before: 650 MB
line tables only: 335 MB
compressed only: 216 MB
compressed and line tables: 186 MB
no debuginfo at all: 130 MB

I want to investigate why `-C line-tables-only` is still ~tripling the size of the binary, but this seems like a good improvement in the meantime.

I've tested that both valgrind and perf can read the debuginfo:

```
([email protected]) ~/rust [08:31:08]
; valgrind $(rustup which rustc --toolchain rust_stage2) --version
==441671== Memcheck, a memory error detector
==441671== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==441671== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==441671== Command: /home/gh-jyn514/.local/lib/rustup/toolchains/rust_stage2/bin/rustc --version
==441671==
rustc 1.70.0-dev
==441671==
==441671== HEAP SUMMARY:
==441671==     in use at exit: 231,289 bytes in 1,874 blocks
==441671==   total heap usage: 2,538 allocs, 664 frees, 486,368 bytes allocated
==441671==
==441671== LEAK SUMMARY:
==441671==    definitely lost: 70,656 bytes in 1 blocks
==441671==    indirectly lost: 0 bytes in 0 blocks
==441671==      possibly lost: 0 bytes in 0 blocks
==441671==    still reachable: 160,633 bytes in 1,873 blocks
==441671==         suppressed: 0 bytes in 0 blocks
==441671== Rerun with --leak-check=full to see details of leaked memory
==441671==
==441671== For lists of detected and suppressed errors, rerun with: -s
==441671== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
; perf record $(rustup which rustc --toolchain rust_stage2) --version
rustc 1.70.0-dev
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.005 MB perf.data (70 samples) ]
; perf report
Samples: 70 of event 'cycles:u', Event count (approx.): 21356967
Overhead  Command  Shared Object                        Symbol
  51.55%  rustc    ld-linux-aarch64.so.1                [.] _dl_lookup_symbol_x
  18.70%  rustc    ld-linux-aarch64.so.1                [.] _dl_relocate_object
  11.95%  rustc    ld-linux-aarch64.so.1                [.] do_lookup_x
   5.55%  rustc    [unknown]                            [k] 0xffffa9ad41cfcfdc
   2.68%  rustc    libc.so.6                            [.] __GI___strlen_asimd
   2.42%  rustc    librustc_driver-1a385c366c35e81a.so  [.] llvm::StringMapImpl::LookupBucketFor
   2.16%  rustc    librustc_driver-1a385c366c35e81a.so  [.] _GLOBAL__sub_I_X86InstructionSelector.cpp
   1.96%  rustc    libstd-990fe978dab76ef3.so           [.] <alloc::vec::Vec<T,A> as core::clone::Clone>::clone
   1.60%  rustc    librustc_driver-1a385c366c35e81a.so  [.] llvm::cl::opt<bool, false, llvm::cl::parser<bool> >::~opt
   1.22%  rustc    ld-linux-aarch64.so.1                [.] strcmp
   0.13%  rustc    ld-linux-aarch64.so.1                [.] stat64
   0.05%  rustc    ld-linux-aarch64.so.1                [.] __minimal_calloc
   0.02%  rustc    ld-linux-aarch64.so.1                [.] __GI___tunables_init
   0.02%  rustc    ld-linux-aarch64.so.1                [.] _dl_start
   0.00%  rustc    [unknown]                            [k] 0xffffa9ad41cfd844
   0.00%  rustc    ld-linux-aarch64.so.1                [.] _start
```
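The "only add -gz if it's supported" item in this commit message amounts to probing the C compiler before passing the flag. The following is a rough Python sketch of that general idea, not bootstrap's actual implementation: `flag_supported` and `debuginfo_link_args` are hypothetical names, `cc` is an assumed default compiler command, and a compile-only probe like this may not catch problems that only show up at link time.

```python
import os
import subprocess
import tempfile


def flag_supported(compiler, flag):
    """Return True if `compiler` accepts `flag` when compiling a trivial C file.

    Hypothetical helper for illustration; not taken from bootstrap.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.c")
        obj = os.path.join(tmp, "probe.o")
        with open(src, "w") as f:
            f.write("int main(void) { return 0; }\n")
        try:
            result = subprocess.run(
                [compiler, flag, "-c", src, "-o", obj],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except OSError:
            # Compiler missing or not executable: treat the flag as unsupported.
            return False
        return result.returncode == 0


def debuginfo_link_args(compiler="cc"):
    """Request compressed debug sections only when the probe succeeds."""
    return ["-gz"] if flag_supported(compiler, "-gz") else []
```

Falling back to an empty argument list keeps the build working on toolchains that reject the flag, at the cost of larger debuginfo there.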
@bors r=Mark-Simulacrum rollup=iffy
⌛ Testing commit 5eeeed1 with merge 62eb31bcf516293ac0a97bb31d11868594239953...
💔 Test failed - checks-actions
:/
@bors retry
☀️ Test successful - checks-actions
Finished benchmarking commit (42f28f9): comparison URL.
Overall result: ✅ improvements - no action needed
@rustbot label: -perf-regression
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 645.963s -> 646.615s (0.10%)
Huh, looks like this was a big improvement on hello world for some reason. I guess the compressed debuginfo makes it faster to load the executable into memory? I didn't think perf built rustc with debuginfo though ...
We are building the standard library with debuginfo in CI, I think: https://github.com/rust-lang/rust/blob/master/src/ci/run.sh#L95 so this probably makes sense. Let's see if we get any reports of problems -- I'm not sure if we need non-line-tables debuginfo for std in order for debugging of data structures to work (e.g., in gdb). I guess we have some tests for that and they did work, but sometimes our tests run in environments with different debug levels.
…-Simulacrum

bootstrap: Don't override `debuginfo-level = 1` to mean `line-tables-only`

This has real differences in the effective debuginfo: in particular, it omits the module-level information and makes perf less useful (it can't distinguish "self" from "child" time anymore). Allow passing `line-tables-only` directly in config.toml instead.

See https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/debuginfo.20in.20try.20builds/near/365090631 and https://rust-lang.zulipchat.com/#narrow/stream/238009-t-compiler.2Fmeetings/topic/.5Bsteering.5D.202023-06-09/near/364883519 for more discussion.

This effectively reverts the cargo half of rust-lang#110221 to avoid regressing rust-lang#60020 again in 1.72.
before: 650 MB
line tables only: 335 MB
compressed only: 216 MB
compressed and line tables: 186 MB
no debuginfo at all: 130 MB
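To put these measurements in proportion, here is a small Python sketch that expresses each size relative to the no-debuginfo build and to the original 650 MB. The numbers are copied from the list; `relative_sizes` is a hypothetical helper name used only for illustration.

```python
# Sizes in MB, copied from the measurements listed in the PR description.
sizes_mb = {
    "before": 650,
    "line tables only": 335,
    "compressed only": 216,
    "compressed and line tables": 186,
    "no debuginfo at all": 130,
}


def relative_sizes(sizes, baseline_key):
    """Return each size as a multiple of the chosen baseline entry."""
    base = sizes[baseline_key]
    return {name: size / base for name, size in sizes.items()}


vs_none = relative_sizes(sizes_mb, "no debuginfo at all")
vs_before = relative_sizes(sizes_mb, "before")

for name, size in sizes_mb.items():
    print(
        f"{name:28s} {size:4d} MB  "
        f"{vs_none[name]:.2f}x no-debuginfo  "
        f"{vs_before[name]:.0%} of before"
    )
```

By this arithmetic, line tables alone come to roughly 2.6x the no-debuginfo size, while compressed line tables come to roughly 1.4x, or about 29% of the original 650 MB.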
Here's an example backtrace:
with `debuginfo=1` (what we emit currently for `debuginfo-level-rustc = 1`)
with `debuginfo=line-tables-only` (what we'll emit for `debuginfo-level-rustc = 1` after this change)
with `debuginfo=0` (what we ship on nightly)
I want to investigate why `-C line-tables-only` is still ~tripling the size of the binary (update: done #110221 (comment)), but this seems like a good improvement in the meantime.

I've tested that both valgrind and perf can read the debuginfo:
To test this, you can run `x build --stage 0 cargo`, set `build.cargo = "build/host/stage0-tools-bin/cargo"`, and then `x build --stage 2 std`. You should be able to compare the rustc_driver.so outputs to each other:

The difference between stage1 and stage2 is `debuginfo=1` vs `debuginfo=line-tables-only`. Both stages have `-gz` (compressed debuginfo) enabled.

This depends on rust-lang/cargo#11958 (and the exact commit of the cargo submodule will need to change before merging).
Helps with #104968.
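The comparison of the stage1 and stage2 rustc_driver.so outputs described in the testing steps could be scripted along these lines. This is only a sketch: `compare_sizes` and `format_comparison` are hypothetical helpers, the caller has to supply the actual library paths from their own build directory, and "MB" is assumed here to mean 10^6 bytes.

```python
import os


def compare_sizes(first, second):
    """Return the sizes of two files in bytes, plus second minus first."""
    first_bytes = os.path.getsize(first)
    second_bytes = os.path.getsize(second)
    return {
        "first_bytes": first_bytes,
        "second_bytes": second_bytes,
        "delta_bytes": second_bytes - first_bytes,
    }


def format_comparison(first, second):
    """Render the comparison as text, in MB (assumed to be 10^6 bytes)."""
    result = compare_sizes(first, second)
    return (
        f"{first}: {result['first_bytes'] / 1e6:.1f} MB\n"
        f"{second}: {result['second_bytes'] / 1e6:.1f} MB\n"
        f"delta: {result['delta_bytes'] / 1e6:+.1f} MB"
    )
```

Calling `print(format_comparison(stage1_lib, stage2_lib))` with the two library paths would then show how much smaller the `line-tables-only` build is.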