System.Threading.Tasks.Tests.Perf_AsyncMethods.Yield regressed on ARM64 #66837

Open
adamsitnik opened this issue Mar 18, 2022 · 3 comments

@adamsitnik
Member

System.Threading.Tasks.Tests.Perf_AsyncMethods.Yield seems to be quite noisy, but it has definitely regressed on ARM64.

The reporting system does not show it for Windows arm64, but I can consistently reproduce it on a Surface Pro X. It may therefore be caused by something that is enabled in the SDK but not with corerun: perf lab runs use corerun from a local dotnet/runtime build, while the monthly perf runs use the SDK that we ship.

image

Surprisingly, for Ubuntu arm64 the reporting system shows an improvement. This time I have not received any Ubuntu arm64 inputs, so I can't confirm or deny it.

image

@AndyAyersMS you should be able to reproduce it on your M1

Repro:

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net6.0 net7.0 --filter System.Threading.Tasks.Tests.Perf_AsyncMethods.Yield --architecture arm64
| Result | Base | Diff | Ratio | Modality | Operating System | Bit |
| --- | --- | --- | --- | --- | --- | --- |
| Faster | 1295.71 | 520.60 | 2.49 | | Windows 11 | X64 |
| Faster | 990.40 | 328.88 | 3.01 | | Windows 11 | X64 |
| Same | 403.16 | 368.82 | 1.09 | | Windows 11 | X64 |
| Faster | 588.08 | 294.85 | 1.99 | | Windows 10 | X64 |
| Faster | 855.19 | 352.81 | 2.42 | | Windows 11 | X64 |
| Slower | 273.40 | 345.52 | 0.79 | bimodal | Windows 11 | X64 |
| Faster | 1057.37 | 336.21 | 3.14 | | ubuntu 18.04 | X64 |
| Faster | 1048.20 | 347.24 | 3.02 | | ubuntu 20.04 | X64 |
| Faster | 851.95 | 362.10 | 2.35 | | ubuntu 18.04 | X64 |
| Same | 551.51 | 575.17 | 0.96 | | ubuntu 18.04 | X64 |
| Slower | 314.53 | 370.69 | 0.85 | | pop 20.04 | X64 |
| Same | 333.26 | 318.63 | 1.05 | several? | alpine 3.13 | X64 |
| Same | 310.12 | 320.67 | 0.97 | | debian 11 | X64 |
| Slower | 157.43 | 391.50 | 0.40 | | macOS Monterey 12.2.1 | Arm64 |
| Slower | 450.64 | 876.74 | 0.51 | bimodal | Windows 10 | Arm64 |
| Slower | 326.61 | 822.06 | 0.40 | | Windows 11 | Arm64 |
| Same | 441.79 | 441.64 | 1.00 | several? | Windows 10 | X86 |
| Same | 310.05 | 350.47 | 0.88 | bimodal | Windows 10 | X86 |
| Same | 370.33 | 348.84 | 1.06 | | Windows 10 | X86 |
| Slower | 609.16 | 999.41 | 0.61 | | Windows 10 | Arm |
| Slower | 262.75 | 312.75 | 0.84 | bimodal | macOS Big Sur 11.6.3 | X64 |
| Slower | 278.15 | 329.83 | 0.84 | | macOS Monterey 12.2.1 | X64 |
| Same | 254.11 | 275.46 | 0.92 | | macOS Monterey 12.2.1 | X64 |
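
For anyone reading the table above: the Ratio column appears to be Base divided by Diff, so a value below 1 means Diff took longer. Base and Diff presumably correspond to net6.0 and net7.0, following the order of the `-f` arguments in the repro, but that mapping is an assumption. A minimal sketch that recomputes the ratio for the Arm64 and Arm rows (the `ARM_ROWS` list and `ratio` helper are names made up for illustration):

```python
# Rows copied from the table above: (operating system, bit, base, diff).
ARM_ROWS = [
    ("macOS Monterey 12.2.1", "Arm64", 157.43, 391.50),
    ("Windows 10", "Arm64", 450.64, 876.74),
    ("Windows 11", "Arm64", 326.61, 822.06),
    ("Windows 10", "Arm", 609.16, 999.41),
]


def ratio(base: float, diff: float) -> float:
    """Base / Diff, rounded to two decimals like the Ratio column."""
    return round(base / diff, 2)


for os_name, bit, base, diff in ARM_ROWS:
    # A ratio below 1.00 means the Diff measurement was slower than Base.
    print(f"{os_name:<22} {bit:<6} ratio={ratio(base, diff):.2f}")
```

The recomputed values match the reported ones for these four rows, which is what suggests the Base / Diff reading.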
@ghost

ghost commented Mar 18, 2022

Tagging subscribers to this area: @dotnet/area-system-threading-tasks
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: adamsitnik
Assignees: -
Labels: arch-arm64, area-System.Threading.Tasks, tenet-performance

Milestone: -

@stephentoub
Member

All await Task.Yield() does is queue a work item to the ThreadPool, so if there's a regression here, it's almost certainly around the ThreadPool.
cc: @kouvel
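
To illustrate why that points at the thread pool: if each iteration does little more than hand an empty work item to a pool and resume once it has run, the measured time is almost entirely the pool's queue and dispatch path. The sketch below is only a Python analogy built on `concurrent.futures.ThreadPoolExecutor`; it is not the .NET ThreadPool and not the actual benchmark, and `round_trip_ns` is a hypothetical helper name.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def round_trip_ns(pool: ThreadPoolExecutor, iterations: int) -> float:
    """Average time to queue an empty work item and wait for it to run."""
    start = time.perf_counter_ns()
    for _ in range(iterations):
        # The work item does nothing, so the elapsed time is almost all
        # queueing, worker wake-up, and hand-off overhead.
        pool.submit(lambda: None).result()
    return (time.perf_counter_ns() - start) / iterations


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=4) as pool:
        print(f"{round_trip_ns(pool, 10_000):.0f} ns per round trip")
```

With a body this small, any change in how the pool enqueues work or wakes workers shows up directly in the per-iteration number.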

@ghost

ghost commented Mar 18, 2022

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: adamsitnik
Assignees: -
Labels: arch-arm64, area-System.Threading, tenet-performance

Milestone: -

@kouvel kouvel added this to the 7.0.0 milestone Mar 18, 2022
@mangod9 mangod9 modified the milestones: 7.0.0, 8.0.0 Jul 27, 2022
@kouvel kouvel modified the milestones: 8.0.0, 9.0.0 Aug 10, 2023
@mangod9 mangod9 modified the milestones: 9.0.0, Future Jul 18, 2024