Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

At debuginfo=0, don't inline debuginfo when inlining #123949

Merged

Conversation

scottmcm
Copy link
Member

@scottmcm scottmcm commented Apr 15, 2024

See #123949 (comment) for info.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Apr 15, 2024
@scottmcm scottmcm force-pushed the debug-no-means-no-debug-info-when-inlined branch from 8f9e4a5 to d4344e6 Compare April 15, 2024 01:46
@scottmcm
Copy link
Member Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 15, 2024
@bors
Copy link
Contributor

bors commented Apr 15, 2024

⌛ Trying commit d4344e6 with merge 00b13bc...

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 15, 2024
…o-when-inlined, r=<try>

At debuginfo=0, don't inline debuginfo when inlining

r? ghost
@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Apr 15, 2024

☀️ Try build successful - checks-actions
Build commit: 00b13bc (00b13bc601a456136e1791e26a4de5c1d0f77a65)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (00b13bc): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.1% [0.4%, 1.7%] 4
Regressions ❌
(secondary)
0.2% [0.2%, 0.2%] 1
Improvements ✅
(primary)
-1.0% [-2.4%, -0.1%] 23
Improvements ✅
(secondary)
-0.8% [-1.7%, -0.1%] 20
All ❌✅ (primary) -0.7% [-2.4%, 1.7%] 27

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-1.7% [-1.7%, -1.7%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -1.7% [-1.7%, -1.7%] 1

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.6% [1.2%, 1.9%] 2
Regressions ❌
(secondary)
1.7% [1.6%, 1.9%] 2
Improvements ✅
(primary)
-2.2% [-2.6%, -1.9%] 2
Improvements ✅
(secondary)
-5.5% [-7.7%, -3.4%] 6
All ❌✅ (primary) -0.3% [-2.6%, 1.9%] 4

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.3% [0.3%, 0.3%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-2.5% [-6.8%, -0.0%] 13
Improvements ✅
(secondary)
-4.5% [-10.9%, -0.5%] 4
All ❌✅ (primary) -2.3% [-6.8%, 0.3%] 14

Bootstrap: 675.963s -> 672.737s (-0.48%)
Artifact size: 316.01 MiB -> 315.87 MiB (-0.05%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Apr 15, 2024
@rust-log-analyzer

This comment has been minimized.

@scottmcm scottmcm force-pushed the debug-no-means-no-debug-info-when-inlined branch from 557265d to 83f97a0 Compare April 17, 2024 00:29
@scottmcm
Copy link
Member Author

r? @cjgillot
Sorry for sending yet another PR your way, but this is my revisiting of #110702 (comment) in a way with less total impact but that should address the original main goal.

Notably, it makes a big difference in removing unnecessary things for common things like &usize: PartialEq.

If you look at

pub fn demo(x: &usize, y: &usize) -> bool {
    x == y
}

today, it keeps a whole bunch of unnecessary things in MIR https://rust.godbolt.org/z/Y6xjcn6fb

fn demo(_1: &usize, _2: &usize) -> bool {
    debug x => _1;
    debug y => _2;
    let mut _0: bool;
    let mut _3: &&usize;
    let mut _4: &&usize;
    scope 1 (inlined std::cmp::impls::<impl PartialEq for &usize>::eq) {
        debug self => _3;
        debug other => _4;
        let mut _5: &usize;
        let mut _6: &usize;
        scope 2 (inlined std::cmp::impls::<impl PartialEq for usize>::eq) {
            debug self => _5;
            debug other => _6;
            let mut _7: usize;
            let mut _8: usize;
        }
    }

    bb0: {
        StorageLive(_3);
        _3 = &_1;
        StorageLive(_4);
        _4 = &_2;
        StorageLive(_5);
        StorageLive(_6);
        _5 = _1;
        _6 = _2;
        StorageLive(_7);
        _7 = (*_5);
        StorageLive(_8);
        _8 = (*_6);
        _0 = Eq(move _7, move _8);
        StorageDead(_8);
        StorageDead(_7);
        StorageDead(_6);
        StorageDead(_5);
        StorageDead(_4);
        StorageDead(_3);
        return;
    }
}

In particular, those &&usize locals are of no use to anyone, and mean that @saethlin has found things like how x.as_bytes() == y.as_bytes() can end up being worse than *x.as_bytes() == *y.as_bytes() because of those extra things.

Similarly, changes like this are great because the

          scope 6 (inlined NonNull::<[u8]>::as_ptr) {
-             debug self => _5;
              let mut _12: *const [u8];
          }

helps reduce that extra cost of using helpers, to avoid the "should we just use as casts because the functions are slower/bigger/whatever" conversations.

But those locals are only kept around because of the debuginfo pointing at them. They're otherwise dead, and existing MIR optimizations know that.

So this PR makes just one tiny change: when inlining things in MIR, don't inline the debuginfo into the caller if the caller is using -C debuginfo=0. That then means that the other post-inlining mir-opt passes will notice that the locals and assignments aren't needed, and remove them.

But it doesn't remove all the debug info. The debuginfo for the original function is still kept, for example. It's only when inlining that it disappears. Which is hopefully a good tradeoff anyway, since if things are small enough to be inlined they're often not interesting to debug into anyway. But we also don't remove all the debug info, so there will still be some step-into-it debug info available if you inline a debuginfo=0 callee into a debuginfo=2 caller -- at least the parameter debuginfo will still be there, for example.

That'll hopefully help avoid the impact that ended up with me not figuring out how to land that previous PR. And it'll still address the main pain point of compiled-with-debuginfo-std keeping a bunch of unnecessary stuff when inlined into your release-without-debuginfo build -- and, as the perf run shows, saving a bunch of instructions and binary size as it does so.

@scottmcm scottmcm marked this pull request as ready for review April 17, 2024 06:43
@rustbot
Copy link
Collaborator

rustbot commented Apr 17, 2024

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

Comment on lines -55 to -61
_6 = &mut _5;
StorageLive(_12);
StorageLive(_11);
StorageLive(_7);
_7 = &(_5.0: u32);
StorageLive(_8);
_8 = &(_5.1: u32);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These references are another good example -- nothing actually needed them other than the inlined debuginfo, so they just disappear entirely.

StorageLive(_2);
_2 = char::methods::<impl char>::to_digit(_1, const 8_u32) -> [return: bb1, unwind unreachable];
}

bb1: {
_3 = &_2;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another nice concrete example: using Option::is_some used to leave a reference to the option in the statements here, even though that was never used for anything.

@oli-obk
Copy link
Contributor

oli-obk commented Apr 17, 2024

This makes a lot of sense, none of the removed debug info seemed very useful, and is a perf improvement when not having debug info enabled due to stripping debug info from libstd functions when inlining them.

r? @oli-obk

@bors r+

@bors
Copy link
Contributor

bors commented Apr 17, 2024

📌 Commit 83f97a0 has been approved by oli-obk

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 17, 2024
Comment on lines 9 to 10
scope 1 {
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to PR, but this empty scopes looks useless, is they needed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These empty ones that aren't from inlining do seem useless to me, but I don't know.

As you say, not this PR, so can you open an issue or a zulip thread about this? Cleaning up these scopes after SimplifyLocals, or something, does seem like a reasonable option.

@scottmcm
Copy link
Member Author

This is 4 lines of "real" code and 2000 lines of test updates, so
@bors rollup=iffy (will conflict with most mir test changes)

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 18, 2024
…o-when-inlined, r=oli-obk

At debuginfo=0, don't inline debuginfo when inlining

See rust-lang#123949 (comment) for info.
@bors
Copy link
Contributor

bors commented Apr 18, 2024

⌛ Testing commit 5d5b1d4 with merge c563dcc...

@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Apr 18, 2024

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Apr 18, 2024
@scottmcm scottmcm force-pushed the debug-no-means-no-debug-info-when-inlined branch from 5d5b1d4 to b8ac5c0 Compare April 18, 2024 16:35
@scottmcm
Copy link
Member Author

Rebased to pick up the tests/crashes folder and updated the ICE test to explicitly ask for debuginfo so it keeps ICEing.

@bors r=oli-obk,onur-ozkan

@bors
Copy link
Contributor

bors commented Apr 18, 2024

📌 Commit 20cf595 has been approved by oli-obk,onur-ozkan

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 18, 2024
@bors
Copy link
Contributor

bors commented Apr 18, 2024

⌛ Testing commit 20cf595 with merge 3412f01...

@bors
Copy link
Contributor

bors commented Apr 18, 2024

☀️ Test successful - checks-actions
Approved by: oli-obk,onur-ozkan
Pushing 3412f01 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Apr 18, 2024
@bors bors merged commit 3412f01 into rust-lang:master Apr 18, 2024
13 checks passed
@rustbot rustbot added this to the 1.79.0 milestone Apr 18, 2024
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (3412f01): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.0% [0.3%, 1.8%] 5
Regressions ❌
(secondary)
1.0% [0.2%, 3.4%] 4
Improvements ✅
(primary)
-0.9% [-2.4%, -0.1%] 22
Improvements ✅
(secondary)
-1.2% [-1.8%, -0.2%] 10
All ❌✅ (primary) -0.6% [-2.4%, 1.8%] 27

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-1.8% [-3.9%, -0.9%] 5
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -1.8% [-3.9%, -0.9%] 5

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.3% [1.0%, 1.7%] 3
Regressions ❌
(secondary)
2.0% [1.7%, 2.4%] 4
Improvements ✅
(primary)
-2.0% [-2.3%, -1.8%] 2
Improvements ✅
(secondary)
-3.3% [-4.0%, -2.4%] 5
All ❌✅ (primary) -0.1% [-2.3%, 1.7%] 5

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.0% [0.0%, 0.0%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-2.5% [-6.8%, -0.0%] 13
Improvements ✅
(secondary)
-4.5% [-10.9%, -0.5%] 4
All ❌✅ (primary) -2.4% [-6.8%, 0.0%] 14

Bootstrap: 676.731s -> 672.221s (-0.67%)
Artifact size: 316.11 MiB -> 315.28 MiB (-0.26%)

@scottmcm scottmcm deleted the debug-no-means-no-debug-info-when-inlined branch April 19, 2024 00:13
@therealprof
Copy link
Contributor

Whee, finally... 🕺🏻

@rylev
Copy link
Member

rylev commented Apr 23, 2024

Seems like @oli-obk signed off on the perf regressions seen here since the improvements outweighed them.

@rustbot label: +perf-regression-triaged

.opts
.unstable_opts
.inline_mir_preserve_debug
.unwrap_or(self.tcx.sess.opts.debuginfo != DebugInfo::None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this == DebugInfo::Full? Isn't this current implementation retaining variable debuginfo when compiling with line-tables-only?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before, it was always copied during inlining, so it at least shouldn't be a regression, right? And when the debuginfo is not set to generate variable debuginfo, shouldn't var_debug_info be empty anyway?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so it at least shouldn't be a regression, right?

Yeah, this ought not to be a regression, but I was reading this code and this condition is surprising.

And when the debuginfo is not set to generate variable debuginfo, shouldn't var_debug_info be empty anyway?

When we load optimized_mir for the standard library, it should always have variable debuginfo.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see. So, I thought about familiarizing myself a bit with the debuginfo handling, I'll try to send a PR that improves this then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.