WIP queue benchmarks #6127
Notes about the benchmarks:
.NET Core 3.1
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19044.2006 (21H2)
AMD Ryzen 7 1700, 1 CPU, 16 logical and 8 physical cores
.NET SDK=6.0.201
[Host] : .NET Core 3.1.23 (CoreCLR 4.700.22.11601, CoreFX 4.700.22.12208), X64 RyuJIT
Job-GCFXAQ : .NET Core 3.1.23 (CoreCLR 4.700.22.11601, CoreFX 4.700.22.12208), X64 RyuJIT
InvocationCount=1 UnrollFactor=1
.NET 6
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19044.2006 (21H2)
AMD Ryzen 7 1700, 1 CPU, 16 logical and 8 physical cores
.NET SDK=6.0.201
[Host] : .NET 6.0.3 (6.0.322.12309), X64 RyuJIT
Job-LYWECL : .NET 6.0.3 (6.0.322.12309), X64 RyuJIT
InvocationCount=1 UnrollFactor=1
Performance is higher across the board for .NET 6, which is what I'd expect, but not what's being reported...
Dropped the CallingThreadDispatcher and ran some larger sample sizes.
.NET Core 3.1
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19044.2006 (21H2)
AMD Ryzen 7 1700, 1 CPU, 16 logical and 8 physical cores
.NET SDK=6.0.201
[Host] : .NET Core 3.1.23 (CoreCLR 4.700.22.11601, CoreFX 4.700.22.12208), X64 RyuJIT
Job-OQAWPR : .NET Core 3.1.23 (CoreCLR 4.700.22.11601, CoreFX 4.700.22.12208), X64 RyuJIT
InvocationCount=1 UnrollFactor=1
.NET 6
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19044.2006 (21H2)
AMD Ryzen 7 1700, 1 CPU, 16 logical and 8 physical cores
.NET SDK=6.0.201
[Host] : .NET 6.0.3 (6.0.322.12309), X64 RyuJIT
Job-OIYZGM : .NET 6.0.3 (6.0.322.12309), X64 RyuJIT
InvocationCount=1 UnrollFactor=1
Again, .NET 6 is faster across the board. Whatever latency increase users are seeing, I can't reproduce it easily here...
One thing that is interesting though - the second iteration of 100,000 is consistently slower on both readings for .NET 6 than it is for .NET Core 3.1 when the default dispatcher (
…ontheweb/akka.net into dotnet6-queuing-benchmarks
Adding an
Method | MsgCount | Mean | Error | StdDev | Median | Gen 0 | Allocated |
---|---|---|---|---|---|---|---|
EnqueuePerformance | 10000 | 275.0 μs | 23.66 μs | 69.39 μs | 311.2 μs | - | 384 KB |
RunPerformance | 10000 | 2,380.3 μs | 449.23 μs | 1,324.58 μs | 1,521.8 μs | - | 30 KB |
EnqueuePerformance | 100000 | 1,732.5 μs | 32.15 μs | 77.64 μs | 1,717.3 μs | - | 3,073 KB |
RunPerformance | 100000 | 11,895.4 μs | 391.42 μs | 1,147.98 μs | 11,964.4 μs | - | 303 KB |
EnqueuePerformance | 1000000 | 17,848.0 μs | 282.91 μs | 264.63 μs | 17,842.8 μs | - | 24,578 KB |
RunPerformance | 1000000 | 122,506.7 μs | 2,427.81 μs | 4,560.01 μs | 123,052.4 μs | - | 3,024 KB |
EnqueuePerformance | 10000000 | 184,835.0 μs | 1,333.19 μs | 1,113.28 μs | 184,686.7 μs | - | 245,764 KB |
RunPerformance | 10000000 | 1,193,947.7 μs | 22,128.39 μs | 20,698.91 μs | 1,193,306.7 μs | 7000.0000 | 30,243 KB |
.NET 6
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19044.2006 (21H2)
AMD Ryzen 7 1700, 1 CPU, 16 logical and 8 physical cores
.NET SDK=6.0.201
[Host] : .NET 6.0.3 (6.0.322.12309), X64 RyuJIT
Job-ICYPTK : .NET 6.0.3 (6.0.322.12309), X64 RyuJIT
InvocationCount=1 UnrollFactor=1
Method | MsgCount | Mean | Error | StdDev | Median | Gen 0 | Allocated |
---|---|---|---|---|---|---|---|
EnqueuePerformance | 10000 | 237.9 μs | 5.13 μs | 15.13 μs | 235.3 μs | - | 385 KB |
RunPerformance | 10000 | 2,671.1 μs | 330.58 μs | 974.74 μs | 3,163.6 μs | - | 31 KB |
EnqueuePerformance | 100000 | 2,199.2 μs | 39.01 μs | 34.58 μs | 2,199.2 μs | - | 3,074 KB |
RunPerformance | 100000 | 11,007.1 μs | 314.61 μs | 922.70 μs | 10,923.7 μs | - | 304 KB |
EnqueuePerformance | 1000000 | 16,574.7 μs | 319.58 μs | 426.63 μs | 16,468.3 μs | - | 24,578 KB |
RunPerformance | 1000000 | 120,354.4 μs | 2,369.00 μs | 2,728.15 μs | 121,167.5 μs | - | 3,028 KB |
EnqueuePerformance | 10000000 | 166,294.2 μs | 2,433.99 μs | 2,276.75 μs | 166,628.0 μs | - | 245,765 KB |
RunPerformance | 10000000 | 1,082,541.1 μs | 14,659.78 μs | 12,995.51 μs | 1,081,883.4 μs | 7000.0000 | 30,243 KB |
Looks like the .NET 6 numbers suffered a lot more with an
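To make the comparison between the two tables above easier to read, here is a small sketch that computes the relative change in mean time per row, plus an approximate throughput. The mean values are copied from the tables; treating the first table as the .NET Core 3.1 run is an assumption, since its header did not survive, and the helper names (`percent_change`, `throughput_msgs_per_sec`, `summarize`) are made up for this illustration.

```python
# Mean times in microseconds, copied from the two tables above.
# Assumption: the first (unlabelled) table is the .NET Core 3.1 run.
# Key: (method, message count) -> (netcore31_mean_us, net6_mean_us)
MEANS_US = {
    ("EnqueuePerformance", 10_000): (275.0, 237.9),
    ("RunPerformance", 10_000): (2_380.3, 2_671.1),
    ("EnqueuePerformance", 100_000): (1_732.5, 2_199.2),
    ("RunPerformance", 100_000): (11_895.4, 11_007.1),
    ("EnqueuePerformance", 1_000_000): (17_848.0, 16_574.7),
    ("RunPerformance", 1_000_000): (122_506.7, 120_354.4),
    ("EnqueuePerformance", 10_000_000): (184_835.0, 166_294.2),
    ("RunPerformance", 10_000_000): (1_193_947.7, 1_082_541.1),
}


def percent_change(baseline_us, candidate_us):
    """Relative change of candidate vs. baseline in percent; positive means slower."""
    return (candidate_us - baseline_us) / baseline_us * 100.0


def throughput_msgs_per_sec(msg_count, mean_us):
    """Messages per second implied by a mean time given in microseconds."""
    return msg_count / (mean_us / 1_000_000.0)


def summarize(means=MEANS_US):
    """Return (method, msg_count, pct_change, net6_throughput) for every row."""
    rows = []
    for (method, msg_count), (core31_us, net6_us) in means.items():
        rows.append((
            method,
            msg_count,
            percent_change(core31_us, net6_us),
            throughput_msgs_per_sec(msg_count, net6_us),
        ))
    return rows


if __name__ == "__main__":
    for method, msg_count, pct, msgs_per_sec in summarize():
        verdict = "slower" if pct > 0 else "faster"
        print(f"{method:<20} {msg_count:>10,} {pct:+7.2f}% ({verdict} on .NET 6), "
              f"{msgs_per_sec:,.0f} msg/s")
```

Under that assumption, only two rows come out slower on .NET 6 by mean time: `RunPerformance` at 10,000 messages and `EnqueuePerformance` at 100,000 messages. The `RunPerformance` rows at 10,000 messages have a standard deviation of roughly 1,000 μs or more on both runtimes, so that row's difference is likely within noise.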
@Aaronontheweb If IStash is contributing to performance issues in .NET 6, would this affect ReceivePersistentActors? UntypedPersistentActor inherits from Eventsourced, which implements a stash.
@iress-ljm indeed it would - the numbers are slightly but consistently worse with stashing. I'm still leaning towards this being a dispatcher issue, though, rather than a data structure problem.
Thanks @Aaronontheweb. Have you opened a new PR to track the dispatcher testing? It'd be good for the team to be able to track this issue.
We have some old NBench benchmarks for tracking dispatcher overhead: https://github.com/akkadotnet/akka.net/tree/dev/src/core/Akka.Tests.Performance/Dispatch - we should port some of those to BenchmarkDotNet. I'll see about doing that today; I'm already doing some performance work in this area (see #6134).
@iress-ljm Dispatcher benchmarks have been added here: #6140. So far it looks like .NET 6 performance is better than .NET Core 3.1, but again - I can still reliably demonstrate the .NET 6-specific drop with RemotePingPong.
* WIP queue benchmarks
* completed MailboxThroughputBenchmarks
* disable `CallingThreadDispatcher`