[Performance]: dotnet/runtime tests are bottlenecked by MSBuild performance #10021

agocke · 2024-04-16T22:02:17Z

Issue Description

The dotnet/runtime tests consist of thousands of tests in the src/tests tree. In the lab these tests take upwards of 20 minutes to build.

However, it looks like most of this time is in MSBuild. When I build the first fraction of these tests (1100 tests) locally, it takes MSBuild 6:50s.

If I run the same projects through Roslyn's "Replay" runner, which runs the csc commands from the binlog, this takes 1:39s.

That's a roughly 5x speedup.

The overall cost here is significant complexity in the runtime build. The other piece of important info is that the libraries tests, take only a few minutes to build. That means that if we were to build the entire test tree together we would delay starting the libraries tests by > 20 minutes. To counteract this we make the build substantially more complicated by splitting out a separate job in our pipeline just to build the runtime tests. If the overall build time were significantly reduced, we could remove a lot of complexity and delay in our CI testing.

Steps to Reproduce

See dotnet/runtime src/tests tree. The command line I used to build was src/tests/build.sh allTargets skipnative skipgeneratelayout skiptestwrappers checked x64 /p:LibrariesConfiguration=Release -bl

Data

MSBuild 6:50s.
Csc 1:39s.

Analysis

No response

Versions & Configurations

No response

Regression

yes
no

Regression Details

No response

The text was updated successfully, but these errors were encountered:

agocke · 2024-04-16T22:32:47Z

fyi @rainersigwald

danmoseley · 2024-04-16T23:03:37Z

I thought the North Star here was far fewer assemblies?

agocke · 2024-04-16T23:37:16Z

You mean fewer projects? We've already performed a significant amount of consolidation. We can continue to shrink this down over time, but it will still likely be a lot of projects. And it doesn't look like csc has a problem with this scale.

agocke · 2024-04-17T02:31:26Z

The other reason this came up is that the 1ES pipeline work revealed how extremely complicated the runtime pipeline is. It's easily the largest pipeline in dotnet, exceeding standard AzDO rate limits by two orders of magnitude. The runtime tests currently represent an unresolvable complication in our AzDO pipelines wherein we would like to remove the extra job + upload + join, but can't do so if it would regress time-to-test-results by 20 min.

danmoseley · 2024-04-17T02:52:44Z

You mean fewer projects? We've already performed a significant amount of consolidation. We can continue to shrink this down over time, but it will still likely be a lot of projects.

Right. I'm not defending MSBuild. But why are more than say O(100) assemblies needed? I'm just curious, not suggesting I have any special insight.

agocke · 2024-04-17T05:21:00Z

I believe it's a mix of: Native AOT tests can't be mixed because more code has trimming side effects, some tests contaminate the process such that it can't be reused, some simply have code patterns that can't easily be combined, and finally test consolidation still ends up with a fair amount of manual effort and we can only commit so much per unit time.

rokonec · 2024-06-04T15:01:38Z

I am not saying there is no room for improvement, but MSBuild does lots more than csc tasks. Most of the work spent by MSBuild outside of csc task is to ensure it is up-to-date and consistent. For that have IO operations has to run, mostly involving checking file existence, timestamps, globing, etc... to detect changes in file system.

MichalPavlik · 2024-08-07T12:46:22Z

Could you please provide a binlog you collected during tests build? Thanks.

MichalPavlik · 2024-08-13T13:47:05Z

@agocke, although I wasn't able to build tests without error, I collected reasonable large binlog. From statistics, I see that most expensive task by far is ExecuteGenerateTestsScript defined in the Directory.Build.targets located in src\tests\JIT\HardwareIntrinsics\<arch>\Directory.Build.targets folder.
This targets takes ~11s (on the DevBox I ran the build) per project and spawns 2 processes with Exec task. The first exec starts another build and it takes usually ~6s. Instead of spawning a new process, I would recommend to use MSBuildTask to save time by avoiding new process creation for each tested intrinsic.
Another advice would be to decrease number of projects using this target. It looks like you have a separate project for each intrinsic. If you are able to group intrinsic tests and reduce number of projects, it would drastically reduce the build time.

@rainersigwald, pinging you for awareness. The target for x86 is located here, but other archs have the same approach.

agocke · 2024-08-13T20:35:57Z

Hi Michal, thanks for the investigation. I don't have time to dig deeply into it right now, but I'll look later.

Another advice would be to decrease number of projects using this target. It looks like you have a separate project for each intrinsic. If you are able to group intrinsic tests and reduce number of projects, it would drastically reduce the build time.

This is exactly what we don't want to do. It is something we've been doing slowly over the past couple of years, but the tests often have side effects that can pollute the process space and make diagnosing failures difficult. We have been able to mechanically move some, but that's difficult.

What we would like from MSBuild is to scale to thousands of projects without significant overhead.

MichalPavlik · 2024-08-14T08:51:42Z

...but the tests often have side effects that can pollute the process space and make diagnosing failures difficult

Yeah, my guess was that you need a strong isolation exactly for this reason. Unfortunately, creating hundreds of processes have a cost (+ CLR initialization).

What we would like from MSBuild is to scale to thousands of projects without significant overhead.

In this case, the overhead is caused mostly by custom actions in your build definition. As I mentioned earlier, I see one low hanging fruit that could improve your build. Anyway, this is interesting problem and I like interesting problems :) If you want, I could propose creating a task force to our team. It would contain member(s) of MSBuild team and Runtime team to figure out what we can do. We would need to understand how your build works and probably asking for some details.

agocke added the performance label Apr 16, 2024

AR-May added triaged Priority:2 Work that is important, but not critical for the release labels Apr 23, 2024

rokonec self-assigned this May 2, 2024

rokonec removed their assignment Jun 4, 2024

JanKrivanek mentioned this issue Aug 5, 2024

[Kitten] Kitten work - August/2024 #10479

Closed

14 tasks

JanKrivanek assigned JanKrivanek and MichalPavlik and unassigned JanKrivanek Aug 21, 2024

MichalPavlik added the size:3 label Sep 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Performance]: dotnet/runtime tests are bottlenecked by MSBuild performance #10021

[Performance]: dotnet/runtime tests are bottlenecked by MSBuild performance #10021

agocke commented Apr 16, 2024

agocke commented Apr 16, 2024

danmoseley commented Apr 16, 2024

agocke commented Apr 16, 2024

agocke commented Apr 17, 2024

danmoseley commented Apr 17, 2024

agocke commented Apr 17, 2024

rokonec commented Jun 4, 2024

MichalPavlik commented Aug 7, 2024

MichalPavlik commented Aug 13, 2024

agocke commented Aug 13, 2024

MichalPavlik commented Aug 14, 2024 •

edited

Loading

[Performance]: dotnet/runtime tests are bottlenecked by MSBuild performance #10021

[Performance]: dotnet/runtime tests are bottlenecked by MSBuild performance #10021

Comments

agocke commented Apr 16, 2024

Issue Description

Steps to Reproduce

Data

Analysis

Versions & Configurations

Regression

Regression Details

agocke commented Apr 16, 2024

danmoseley commented Apr 16, 2024

agocke commented Apr 16, 2024

agocke commented Apr 17, 2024

danmoseley commented Apr 17, 2024

agocke commented Apr 17, 2024

rokonec commented Jun 4, 2024

MichalPavlik commented Aug 7, 2024

MichalPavlik commented Aug 13, 2024

agocke commented Aug 13, 2024

MichalPavlik commented Aug 14, 2024 • edited Loading

MichalPavlik commented Aug 14, 2024 •

edited

Loading