Massive xunit test performance degradation in .NET Core 2.1.700 and 2.2.300 #2775
Did you get a chance to collect traces and compare where the difference comes from?
@karelz It's the latest versions that are slow.
Not yet, how should I go about collecting traces?
Oh, I didn't notice it is minutes, not seconds. That is bad. @livarcocc @nguerrera is it a known issue?
@karelz I'm unable to replicate it on my machine 😉 I've tried on Windows, WSL and Docker (there's a helper Dockerfile in the root of the repo), but given it's happening on two different CI systems, and three different OSS projects, I figured it worth reporting before being able to replicate on my laptop. The fastest way to execute the tests is to run coverage.cmd on Windows and coverage.sh on Linux; coverage.cmd only tests .NET Framework though.
OK, that is interesting - I assume we will have similar problems replicating it.
Quite possibly.
Nope, I haven't tried that. I'm not in a position to try that at the moment either.
I'm going to try the Dockerfile out and see if I can reproduce the regression.
First I hear of it.
Strangely, the dotnet test runs are reporting mere seconds of total time while the overall job takes much longer. Is it possible to get timestamps at the start/end of each command?
CC @singhsarab
@brianrob I've just pushed an updated Dockerfile, and I'm experimenting with tweaking the following:

```sh
docker build --cpuset-cpus=0 --cpuset-mems=0 --cpu-period=1000 --build-arg CACHE_BUST=1 .
```

You need to keep changing CACHE_BUST; it will cache up to the test execution.
@nguerrera I've just pushed an update to display the dates, check out the builds of this PR JSkimming/abioc#82
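A minimal sketch of such a timestamp wrapper, assuming a POSIX shell (the script name and layout are illustrative; the actual CI change may differ):

```sh
#!/bin/sh
# timed.sh (hypothetical): print a UTC date and the elapsed time around a command.
date -u                               # e.g. "Wed May 22 20:53:05 UTC 2019"
start=$(date +%s)
"$@"                                  # run the wrapped command
end=$(date +%s)
elapsed=$((end - start))
echo "$((elapsed / 60)) minutes and $((elapsed % 60)) seconds elapsed."
```

Invoked as, for example, ./timed.sh dotnet test --no-restore --no-build ... it brackets each command with the date and elapsed lines seen in the build output quoted later in this thread.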
It seems that I'm not able to reproduce this either.
@brianrob is there something we can ask @JSkimming to collect that we can then analyze?
Yes. If you're able to reproduce the issue, please follow the instructions at https://github.com/dotnet/coreclr/blob/master/Documentation/project-docs/linux-performance-tracing.md to create a trace for each of the good and bad cases. You don't need to capture the whole thing - let's just start with 20 seconds for each.
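For reference, the collection flow in the linked document is roughly the following, assuming the standard perfcollect workflow (the trace name is illustrative):

```sh
# One-time setup: fetch perfcollect and install the perf/LTTng prerequisites.
curl -OL https://aka.ms/perfcollect
chmod +x perfcollect
sudo ./perfcollect install

# Enable managed symbol resolution before launching the .NET process.
export COMPlus_PerfMapEnabled=1

# Start collection in one shell (stop with Ctrl+C after ~20 seconds)
# while the slow `dotnet test` run executes in another shell.
sudo ./perfcollect collect sampleTrace
```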
@brianrob any tips on how to get that running in a hosted CI system? So far I've only been able to replicate the issue on CircleCI and Travis. If it's something that can be enabled on Azure DevOps, I'm willing to set up the build there too, though until I try, I don't know if I'll be able to replicate things.
@nguerrera I've updated the main description with links to the builds with timings between the commands.
@JSkimming sadly we are not familiar with those environments.
OK, I think I've found the issue.

TL;DR
On a hunch, I changed the script to stop displaying the output of dotnet test, and the test runs returned to their previous speed.

The Hunch
I noticed the test output flickered when displaying a message. Here's an example. I get the impression the CLI is writing that message a lot, which eventually results in the flickering because a buffer somewhere is filled. Hosted CI systems capture the stdout of the tools being executed, but in their case, all those writes are going through code (that ultimately streams to the browser) rather than directly to a terminal, and all those stdout writes from dotnet test become very expensive.

This also explains the discrepancy between the test execution timings reported by dotnet test and the overall duration of each command. In the run below, the logger reports a total time of 2.1397 seconds, yet the surrounding timestamps show the command took almost 11 minutes:

```
Wed May 22 20:53:05 UTC 2019
0 minutes and 7 seconds elapsed.
$ dotnet test --no-restore --no-build -f netcoreapp2.1 -c Release /root/project/test/Abioc.Tests.Internal/Abioc.Tests.Internal.csproj --results-directory /root/project/test/TestResults/output/ --logger "trx;LogFileName=Abioc.Tests.Internal.trx"
Test run for /root/project/test/Abioc.Tests.Internal/bin/Release/netcoreapp2.1/Abioc.Tests.Internal.dll(.NETCoreApp,Version=v2.1)
Microsoft (R) Test Execution Command Line Tool Version 16.1.0
Copyright (c) Microsoft Corporation. All rights reserved.

Starting test execution, please wait...
Test run in progress.Test run in progress ... Results File: /root/project/test/TestResults/output/Abioc.Tests.Internal.trx

Test Run Successful.
Total tests: 310
     Passed: 310
 Total time: 2.1397 Seconds
Wed May 22 21:03:54 UTC 2019
10 minutes and 56 seconds elapsed.
```
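A minimal sketch of the workaround described above, assuming a POSIX shell (the log file name is illustrative): redirect the output of dotnet test to a file so the progress messages never reach the CI-captured stdout.

```sh
# Send all dotnet test output to a file instead of the captured stdout.
dotnet test --no-restore --no-build -c Release > test-output.log 2>&1
status=$?

# Print the captured output once, after the run, and preserve the exit code.
cat test-output.log
exit $status
```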
Actually, I misread the versions; the issue was introduced in v16.1.0, see below.
Awesome to see we are closer to a root cause!
Looks like microsoft/vstest#1964
@nguerrera Ahh, I misread the numbers. The earlier (working) release was 16.0.1, and those changes went into 16.1.0. So it is in vstest.
From a quick reading of the code I think I may have identified the issue. The ProgressIndicator clears and redisplays the progress message whenever it receives successive calls to Pause() then Start(). This occurs whenever ConsoleLogger.TestMessageHandler() or ConsoleLogger.TestResultHandler() is called, regardless of the verbosity level. As a result, the progress message is constantly being redrawn, even if the verbosity level means no message is actually logged. I'm going to have a look at a potential fix and submit a PR later. I'll keep you updated on progress.
@karelz This should be moved to the microsoft/vstest repo.
@JSkimming Awesome work tracking this down!
@karelz I've raised this PR: microsoft/vstest#2024, all feedback is welcome. Do you want to raise this as an issue in the microsoft/vstest repo? I'm happy to do it if it helps.
Awesome. Please open an issue there if they need it, but maybe the PR is enough. We can't move issues cross-org. Awesome work, thanks a ton!!!
I observed a degradation of performance even in a Linux host environment (not only in a CI environment). One of our test projects (around 2800 tests) took about 30 seconds (on 2.1.604) and now takes about 1.87 minutes. That's about 4.5 times longer to run the tests than with the previous SDK. Do you know if the PR involved will fix that too?
@karelz this issue is still not resolved. The fix I implemented was merged on 28 May, and the NuGet package was released on 30 May, but until there is a .NET Core SDK release where the latest version of VSTest is "inserted", the problem persists. Is there any update on when a new SDK incorporating the latest VSTest will be released?
The .NET SDK releases 2.1.701 & 2.2.301 are now available and contain the fixes. |
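To check whether a machine has picked up the fix, a quick sanity check (assuming a Unix-like shell; the grep pattern matches the banner shown in the output above):

```sh
# List installed SDKs; 2.1.701 / 2.2.301 or later contain the patched vstest.
dotnet --list-sdks

# The vstest version is printed in the banner of any test run
# (run this inside a test project directory):
dotnet test 2>&1 | grep "Test Execution Command Line Tool"
```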
This should fix the performance per dotnet/core#2775. N.B., I'm force pushing so that the previous build gets cancelled.
TL;DR
Update: The issue has been found and is now tracked via PR microsoft/vstest#2024.
Update 2: The issue has now been fixed with the .NET SDK releases 2.1.701 & 2.2.301, demonstrated by the following builds.
General
Operating System
I've found that the latest releases of .NET Core 2.1 & 2.2 have introduced a performance degradation when executing xunit tests via dotnet test. I have found this with 3 different OSS projects I maintain, and across two different CI providers (CircleCI & Travis). The specific details are below.
FYI, I haven't raised this against xunit, because the only difference is the two releases of .NET Core.
The following are builds of identical source code running against 4 different releases of .NET Core, executed in CircleCI using containers.
Please see the compare-dotnet-releases branch of https://github.com/JSkimming/abioc.
As you can see, the test execution takes an order of magnitude longer.
Here is the same build in Travis. NOTE: The Mac build is fine because it is pinned to an earlier version of .NET Core.
Here are two completely different projects, both exhibiting the same issue in CircleCI and Travis:
https://circleci.com/gh/JSkimming/tesla-net/tree/master
https://travis-ci.org/JSkimming/tesla-net/builds/531160500
https://circleci.com/gh/JSkimming/Castle.Core.AsyncInterceptor/tree/master
https://travis-ci.org/JSkimming/Castle.Core.AsyncInterceptor/builds/530615608
Update
Latest builds with timings between steps