added LRU cache support to inbound Akka.Remote IActorRef resolution #5240

Aaronontheweb

Before (current `dev`)

OSVersion:                         Microsoft Windows NT 6.2.9200.0
ProcessorCount:                    16
ClockSpeed:                        0 MHZ
Actor Count:                       32
Messages sent/received per client: 200000  (2e5)
Is Server GC:                      True
Thread count:                      111

Num clients, Total [msg], Msgs/sec, Total [ms]
         1,  200000,     98232,    2036.89
         5, 1000000,    187266,    5340.11
        10, 2000000,    187776,   10651.83
        15, 3000000,    186963,   16046.41
        20, 4000000,    186142,   21489.67
        25, 5000000,    186721,   26778.20
        30, 6000000,    186568,   32160.53

After

OSVersion:                         Microsoft Windows NT 6.2.9200.0
ProcessorCount:                    16
ClockSpeed:                        0 MHZ
Actor Count:                       32
Messages sent/received per client: 200000  (2e5)
Is Server GC:                      True
Thread count:                      111

Num clients, Total [msg], Msgs/sec, Total [ms]
         1,  200000,    124147,    1611.16
         5, 1000000,    238550,    4192.89
        10, 2000000,    235433,    8495.12
        15, 3000000,    231965,   12933.28
        20, 4000000,    231603,   17271.60
        25, 5000000,    232051,   21547.47
        30, 6000000,    231161,   25956.73

Combined totals when also factoring in #5228 were as high as 240k+ msgs/sec.
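
To put a number on the difference between the two runs, the per-row throughput gain can be computed directly from the Msgs/sec columns. The sketch below is only an illustration: the class and member names (`ThroughputGain`, `PercentGains`) are made up for this example, and the figures are copied from the "Before" and "After" tables above.

```csharp
using System;
using System.Linq;

public static class ThroughputGain
{
    // Msgs/sec columns copied from the "Before" and "After" tables,
    // ordered by client count: 1, 5, 10, 15, 20, 25, 30.
    public static readonly double[] Before =
        { 98232, 187266, 187776, 186963, 186142, 186721, 186568 };

    public static readonly double[] After =
        { 124147, 238550, 235433, 231965, 231603, 232051, 231161 };

    // Percentage change in throughput for each row: (after - before) / before * 100.
    public static double[] PercentGains() =>
        Before.Zip(After, (b, a) => (a - b) / b * 100.0).ToArray();
}
```

Printing each element of `ThroughputGain.PercentGains()` with one decimal place gives roughly 24 to 27 percent across the seven rows, for this one run on this one machine.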


Aaronontheweb left a comment


Described changes

@@ -65,7 +65,8 @@ protected override int Hash(string k)

 protected override bool IsCacheable(IActorRef v)
 {
-    return !(v is EmptyLocalActorRef);
+    // don't cache any FutureActorRefs, et al
+    return !(v is MinimalActorRef && !(v is FunctionRef));

This filters temp actors out of the cache unless they're a FunctionRef, which is a rare but necessary edge case to support some Akka.Streams scenarios.

Should resolve #5230
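
For readers who want to see the effect of the new predicate in isolation, here is a minimal sketch. It is not the Akka.NET implementation: the types below are hypothetical stand-ins that only mirror the inheritance relationships the diff relies on, `LocalRef`, `ResolveCacheSketch` and `GetOrCompute` are names invented for this example, and a plain dictionary stands in for the LRU cache (eviction is omitted).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the Akka.NET types named in the diff.
// The real classes live in Akka.Actor and carry far more behavior.
public interface IActorRef { }
public class LocalRef : IActorRef { }              // ordinary, long-lived actor (assumed name)
public class MinimalActorRef : IActorRef { }       // base type of temp refs
public class FutureActorRef : MinimalActorRef { }  // short-lived reply target
public class FunctionRef : MinimalActorRef { }     // the Akka.Streams edge case

public static class ResolveCacheSketch
{
    // Same predicate as the added line in the diff.
    public static bool IsCacheable(IActorRef v)
    {
        return !(v is MinimalActorRef && !(v is FunctionRef));
    }

    // Look the path up in the cache; on a miss, resolve it and store the
    // result only when the predicate allows it.
    public static IActorRef GetOrCompute(
        Dictionary<string, IActorRef> cache,
        string path,
        Func<string, IActorRef> resolve)
    {
        if (cache.TryGetValue(path, out var hit))
        {
            return hit;
        }

        var resolved = resolve(path);
        if (IsCacheable(resolved))
        {
            cache[path] = resolved;
        }
        return resolved;
    }
}
```

With these stand-ins, resolving a path to a `FutureActorRef` leaves the cache untouched, while a `LocalRef` or a `FunctionRef` is stored and returned from the cache on the next lookup.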

@Aaronontheweb

Sharding benchmark performance with these changes:

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19041.1165 (2004/May2020Update/20H1)
AMD Ryzen 7 1700, 1 CPU, 16 logical and 8 physical cores
.NET SDK=5.0.302
  [Host]     : .NET Core 3.1.17 (CoreCLR 4.700.21.31506, CoreFX 4.700.21.31502), X64 RyuJIT
  Job-LWNJGO : .NET Core 3.1.17 (CoreCLR 4.700.21.31506, CoreFX 4.700.21.31502), X64 RyuJIT

InvocationCount=1  UnrollFactor=1

| Method | StateMode | MsgCount | Mean | Error | StdDev |
|---|---|---:|---:|---:|---:|
| SingleRequestResponseToLocalEntity | Persistence | 10000 | 119.025 ms | 2.3701 ms | 5.5867 ms |
| StreamingToLocalEntity | Persistence | 10000 | 5.589 ms | 0.4031 ms | 1.1632 ms |
| SingleRequestResponseToRemoteEntity | Persistence | 10000 | 4,377.781 ms | 39.1379 ms | 36.6097 ms |
| SingleRequestResponseToRemoteEntityWithLocalProxy | Persistence | 10000 | 4,644.217 ms | 38.2285 ms | 35.7590 ms |
| StreamingToRemoteEntity | Persistence | 10000 | 473.164 ms | 9.4075 ms | 9.6608 ms |
| SingleRequestResponseToLocalEntity | DData | 10000 | 117.487 ms | 2.2693 ms | 5.4370 ms |
| StreamingToLocalEntity | DData | 10000 | 5.110 ms | 0.3269 ms | 0.9325 ms |
| SingleRequestResponseToRemoteEntity | DData | 10000 | 4,355.930 ms | 38.7493 ms | 36.2461 ms |
| SingleRequestResponseToRemoteEntityWithLocalProxy | DData | 10000 | 4,558.669 ms | 27.9259 ms | 26.1219 ms |
| StreamingToRemoteEntity | DData | 10000 | 465.830 ms | 8.9796 ms | 11.0278 ms |

@Aaronontheweb

Numbers for those same benchmarks on dev:

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19041.1165 (2004/May2020Update/20H1)
AMD Ryzen 7 1700, 1 CPU, 16 logical and 8 physical cores
.NET SDK=5.0.302
  [Host]     : .NET Core 3.1.17 (CoreCLR 4.700.21.31506, CoreFX 4.700.21.31502), X64 RyuJIT
  Job-CCOYAC : .NET Core 3.1.17 (CoreCLR 4.700.21.31506, CoreFX 4.700.21.31502), X64 RyuJIT

InvocationCount=1  UnrollFactor=1

| Method | StateMode | MsgCount | Mean | Error | StdDev | Median |
|---|---|---:|---:|---:|---:|---:|
| SingleRequestResponseToLocalEntity | Persistence | 10000 | 119.832 ms | 2.5806 ms | 7.487 ms | 118.132 ms |
| StreamingToLocalEntity | Persistence | 10000 | 6.260 ms | 0.4383 ms | 1.272 ms | 6.080 ms |
| SingleRequestResponseToRemoteEntity | Persistence | 10000 | 4,383.980 ms | 27.1027 ms | 22.632 ms | 4,383.768 ms |
| SingleRequestResponseToRemoteEntityWithLocalProxy | Persistence | 10000 | 4,748.068 ms | 29.8746 ms | 24.947 ms | 4,750.990 ms |
| StreamingToRemoteEntity | Persistence | 10000 | 516.840 ms | 9.7221 ms | 9.094 ms | 518.284 ms |
| SingleRequestResponseToLocalEntity | DData | 10000 | 120.264 ms | 2.4026 ms | 6.855 ms | 118.534 ms |
| StreamingToLocalEntity | DData | 10000 | 6.474 ms | 0.7923 ms | 2.311 ms | 5.733 ms |
| SingleRequestResponseToRemoteEntity | DData | 10000 | 4,388.124 ms | 65.3584 ms | 54.577 ms | 4,388.402 ms |
| SingleRequestResponseToRemoteEntityWithLocalProxy | DData | 10000 | 4,640.986 ms | 17.7719 ms | 14.840 ms | 4,639.012 ms |
| StreamingToRemoteEntity | DData | 10000 | 511.530 ms | 7.9896 ms | 7.473 ms | 511.900 ms |

@Aaronontheweb

Virtually zero benefit for the sharding benchmark - which tells me that the costs of cache misses are probably negligible.

@Aaronontheweb

With the latest Actor.GetChild improvements merged in, looking at a 10-15% improvement on the Akka.Cluster.Sharding benchmarks in this PR so far:

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19041.1165 (2004/May2020Update/20H1)
AMD Ryzen 7 1700, 1 CPU, 16 logical and 8 physical cores
.NET SDK=5.0.302
  [Host]     : .NET Core 3.1.17 (CoreCLR 4.700.21.31506, CoreFX 4.700.21.31502), X64 RyuJIT
  Job-WSNPID : .NET Core 3.1.17 (CoreCLR 4.700.21.31506, CoreFX 4.700.21.31502), X64 RyuJIT

InvocationCount=1  UnrollFactor=1

| Method | StateMode | MsgCount | Mean | Error | StdDev |
|---|---|---:|---:|---:|---:|
| SingleRequestResponseToLocalEntity | Persistence | 10000 | 108.365 ms | 2.1277 ms | 2.9124 ms |
| StreamingToLocalEntity | Persistence | 10000 | 5.142 ms | 0.3590 ms | 1.0302 ms |
| SingleRequestResponseToRemoteEntity | Persistence | 10000 | 3,897.897 ms | 14.0622 ms | 12.4658 ms |
| SingleRequestResponseToRemoteEntityWithLocalProxy | Persistence | 10000 | NA | NA | NA |
| StreamingToRemoteEntity | Persistence | 10000 | 447.964 ms | 6.3798 ms | 5.3274 ms |
| SingleRequestResponseToLocalEntity | DData | 10000 | 107.760 ms | 2.1452 ms | 2.5537 ms |
| StreamingToLocalEntity | DData | 10000 | 4.748 ms | 0.2883 ms | 0.8226 ms |
| SingleRequestResponseToRemoteEntity | DData | 10000 | 3,810.028 ms | 5.0696 ms | 3.9580 ms |
| SingleRequestResponseToRemoteEntityWithLocalProxy | DData | 10000 | 4,165.030 ms | 19.2049 ms | 17.9642 ms |
| StreamingToRemoteEntity | DData | 10000 | 442.327 ms | 3.7282 ms | 3.4873 ms |

Benchmarks with issues:
ShardMessageRoutingBenchmarks.SingleRequestResponseToRemoteEntityWithLocalProxy: Job-WSNPID(InvocationCount=1, UnrollFactor=1) [StateMode=Persistence, MsgCount=10000]


Arkatufus left a comment


LGTM

Aaronontheweb merged commit c3b6880 into akkadotnet:dev on Sep 2, 2021
Aaronontheweb deleted the perf/use-LRUCache-resolve branch on September 2, 2021 23:01