Memory leak when disposing actor system with non default ActorRefProvider #2640
Comments
It leaks even with a local system, but very slowly. You need this (the more strings in the config, the faster the leak):
Then parse and inject this into the local system, lower the memory-leak sensitivity from 100mb to 2, and increase IterationCount to 2000 :)
It looks like the cluster system leaks much more data, as the wasted memory grows much faster.
I reduced the reproduction steps to just creating and disposing the actor system. Output is now for
using System;
using Akka.Actor;
using Akka.Configuration;
using Xunit;
using Xunit.Abstractions;

namespace Akka.Cluster.Tools.Tests.ClusterClient
{
    public class AkkaTests
    {
        private readonly ITestOutputHelper _output;

        public AkkaTests(ITestOutputHelper output)
        {
            _output = output;
        }

        [Fact]
        public void IfActorSystemWithDefaultActorRefProviderIsCreatedAndDisposed_ThenThereShouldBeNoMemoryLeak()
        {
            TestForMemoryLeak(() => CreateAndDisposeActorSystem(null));
        }

        [Fact]
        public void IfActorSystemWithRemoteActorRefProviderIsCreatedAndDisposed_ThenThereShouldBeNoMemoryLeak()
        {
            const string ConfigStringRemote = @"
                akka {
                    actor {
                        provider = ""Akka.Remote.RemoteActorRefProvider, Akka.Remote""
                    }
                }";
            TestForMemoryLeak(() => CreateAndDisposeActorSystem(ConfigStringRemote));
        }

        [Fact]
        public void IfActorSystemWithClusterActorRefProviderIsCreatedAndDisposed_ThenThereShouldBeNoMemoryLeak()
        {
            const string ConfigStringCluster = @"
                akka {
                    actor {
                        provider = ""Akka.Cluster.ClusterActorRefProvider, Akka.Cluster""
                    }
                }";
            TestForMemoryLeak(() => CreateAndDisposeActorSystem(ConfigStringCluster));
        }

        private void CreateAndDisposeActorSystem(string configString)
        {
            ActorSystem system;
            if (configString == null)
                system = ActorSystem.Create("Local");
            else
            {
                var config = ConfigurationFactory.ParseString(configString);
                system = ActorSystem.Create("Local", config);
            }

            // ensure that the actor system did some work
            var actor = system.ActorOf<TestActor>();
            var result = actor.Ask<ActorIdentity>(new Identify(42)).Result;

            system.Terminate().Wait();
            system.Dispose();
        }

        private void TestForMemoryLeak(Action action)
        {
            const int iterationCount = 100;
            const long memoryThreshold = 10 * 1024 * 1024; // 10 MB of allowed growth

            // the first run serves as the baseline
            action();
            var memoryAfterFirstRun = GC.GetTotalMemory(true);
            Log($"After first run - MemoryUsage: {memoryAfterFirstRun}");

            for (var i = 1; i <= iterationCount; i++)
            {
                action();

                // sample memory every 10th iteration
                if (i % 10 == 0)
                {
                    var currentMemory = GC.GetTotalMemory(true);
                    Log($"Iteration: {i} - MemoryUsage: {currentMemory}");
                    if (currentMemory > memoryAfterFirstRun + memoryThreshold)
                        throw new InvalidOperationException("There seems to be a memory leak!");
                }
            }
        }

        private void Log(string text)
        {
            _output.WriteLine(text);
        }

        private class TestActor : ReceiveActor
        {
        }
    }
}
After some debugging of the Terminate() method:
Question/Statement. While it sounds like there could be some leaking occurring, I would think that you would want to force collection in your tests, since the dispose pattern on its own may not guarantee that all memory is freed. Things should dispose correctly, but there's a difference (to me, anyway) between a soft leak that is cleaned up when a full GC is done and a hard leak that never gets handled. What does it look like if a GC.Collect() is thrown in?
According to MSDN, passing true as the first parameter of "GC.GetTotalMemory(true)" forces a full collection: I also reran the tests with old-school memory cleanup like
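The snippet that followed this comment is not preserved in this copy of the thread. As a minimal sketch of what "old-school" cleanup commonly looks like (an assumption, not necessarily the code that was actually run), the usual pattern is to collect, wait for finalizers, and collect again before reading the heap size. `MemoryProbe` and `MeasureAfterFullCollection` are hypothetical names introduced only for this illustration.

```csharp
using System;

// Hypothetical helper, not part of the original test code.
public static class MemoryProbe
{
    public static long MeasureAfterFullCollection()
    {
        // First pass: reclaim unreachable objects and queue finalizable ones.
        GC.Collect();
        // Block until the finalizer thread has drained its queue.
        GC.WaitForPendingFinalizers();
        // Second pass: reclaim objects that were only kept alive for finalization.
        GC.Collect();
        // The collection has already been forced above, so do not force it again.
        return GC.GetTotalMemory(forceFullCollection: false);
    }
}
```

In the test above, this helper could stand in for the `GC.GetTotalMemory(true)` calls. If memory still grows between samples with this in place, the growth is unlikely to be explained by garbage that simply had not been collected yet.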
Retested with Akka 1.2.3.
Pretty sure this issue and the problems we were having on #3668 are related. Going to be reproing it and looking into it.
Took @Ralf1108's reproduction code and turned it into this so I could run DotMemory profiling on it. Looks like a leak in the HOCON tokenizer: https://github.com/Aaronontheweb/Akka.NET264BugRepro
So I've conclusively found the issue; it's still an issue in Akka.NET v1.3.11; and my research shows that @Ralf1108's original theory on its origins is correct - all of the
The root cause is this function call:
akka.net/src/core/Akka/Dispatch/AbstractDispatcher.cs Lines 576 to 596 in 4f0fbb8
By default, the
Workaround and Evidence
If I change
Memory holds pretty steady at around 25mb. It eventually climbs to 30mb after starting and stopping 1000
If I turn this setting back to its default, however...
Climbs up to 41mb and then fails early, since it exceeded its 10mb max allowance for memory creep.
So, as a workaround for this issue you could do what I did here and just set the following in your HOCON:
That should help.
Permanent Fix
I'm going to work on a reproduction spec for this issue so we can regression-test it, but what I think I'm going to recommend doing is simply shutting down all dispatcher executors synchronously - that way there's nothing left behind and no dependency on the order in which the scheduler vs. the dispatcher gets shut down. I don't entirely know what the side-effects will be of doing this, but I suspect not much: the dispatcher can't be shutdown until 100% of actors registered on it for use have stopped, which occurs during
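To illustrate the idea of shutting an executor down synchronously, here is a minimal, self-contained sketch. It does not use Akka.NET's actual dispatcher or executor types; `HypotheticalExecutor`, `ShutdownNow`, and `LiveThreadCount` are names invented for this example. The point it tries to show is that when shutdown joins the worker threads directly, nothing has to be scheduled for later, so cleanup no longer depends on a scheduler still being alive.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;

// Hypothetical stand-in for a dispatcher's dedicated thread pool.
public sealed class HypotheticalExecutor
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread[] _workers;

    public HypotheticalExecutor(int threadCount)
    {
        _workers = Enumerable.Range(0, threadCount)
            .Select(_ => new Thread(RunLoop) { IsBackground = true })
            .ToArray();
        foreach (var worker in _workers)
            worker.Start();
    }

    public void Execute(Action work) => _queue.Add(work);

    private void RunLoop()
    {
        // The enumeration ends once CompleteAdding has been called and the queue is empty.
        foreach (var work in _queue.GetConsumingEnumerable())
            work();
    }

    // Synchronous shutdown: returns only after every worker thread has exited.
    // No timer or scheduler callback is involved. Call it once; it is not idempotent.
    public void ShutdownNow()
    {
        _queue.CompleteAdding();
        foreach (var worker in _workers)
            worker.Join();
        _queue.Dispose();
    }

    public int LiveThreadCount => _workers.Count(t => t.IsAlive);
}
```

A delayed variant would hand the same cleanup to a timer and return immediately. If that timer is torn down before it fires, the worker threads and everything they reference stay alive, which is the kind of ordering dependency the comment above wants to remove.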
I also think, based on the data from DotMemory, there might be some memory issues with
…f ForkJoinDispatcher variants
Closed via #3734
I updated a local copy of https://github.com/Aaronontheweb/Akka.NET264BugRepro to 1.3.12, bumped the memory sensitivity up to 100 MB, and it still throws at approximately 300 iterations.
@EJantzerGitHub that'd be because of #3735. It was blowing up at ~30 before. Pretty sure the issue is related to some closures inside
Thanks Aaron. I will be watching that bug then with great interest.
@EJantzerGitHub no problem! If you'd like to help send in a pull request for it, I definitely recommend taking a look at that reproduction program using a profiler like DotMemory. That's how I usually track this sort of stuff down. They have a pretty useful tutorial on the subject too: https://www.jetbrains.com/help/dotmemory/How_to_Find_a_Memory_Leak.html
… .NET 4.5.2 (#3668)
* migrated to 'dotnet test'
* added missing variable for detecing TeamCity
* added more targets for end-to-end testing and building
* fixed issue with dotnet test lockup for Akka.Cluster.Tests
* upgraded all core projects to standards
* fixed all major build issues thus far
* upgraded all contrib/cluster projects
* completed standardizing all projects
* fixed issue with Akka.DI.Core
* upgrade Linux to .NET Core 2.0 SDK
* further fixes to build.sh
* changed search location for MNTR assemblies
* upgraded MNTR to .NET 4.6.1
* fixed build.sh dotnet-install command
* fixed .NET Core test execution
* fixed issue with Akka.Remote.Tests.MultiNode outputting to wrong spot
* added channel to build.sh
* changed to wget
* fixed dotnet installer url
* skip API approvals on .NET Core
* fixed issue with MNTR NuGet packaging
* disabled FsCheck
* attempted to address Akka.Persistence memory leak
* rebased on dev
* fixed issue with Akka.Streams tests
* standardized FluentAssertions version
* fixed compilation of TCK
* upgraded to .NET Core 2.1 SDK
* removed restore stage - no longer needed
* bumpe tests to .NET Core 2.1
* Revert "bumpe tests to .NET Core 2.1" This reverts commit f76e09f.
* workaround dotnet/msbuild#2275 until .NET Core 2.1 migration
* Revert "upgraded to .NET Core 2.1 SDK" This reverts commit b000b76.
* improved test error result handling
* Revert "Revert "upgraded to .NET Core 2.1 SDK"" This reverts commit 1b1a836.
* Revert "Revert "bumpe tests to .NET Core 2.1"" This reverts commit 175d6ca.
* moving onto .NET Standard 2.0
* standardized most test projects
* fixed common.props references
* fixed .NET Core 2.1 build systems
* fixed issue with packing MNTR
* fixed issue with single test failure stopping build
* fixed failure handling
* fixed issues with Akka.Streams specs
* fixed scan for incremental tests
* working on FsCheck standardization issues
* removed more net implicit standard junk
* cleaning up implicit package versions; bumped to JSON.NET 12.0.1
* fixed port bindings for Akka.Cluster.Tools and Akka.Cluster.Sharding so suites could theoretically run in parallel
* fixed more ports
* fixed compilation errors
* rolled back to Newtonsoft.Json 9.0.1
* disabled parallelization in Akka.Streams.Tests
* added xunit.runner.json
* Disabled xunit.runner.json for Akka.Streams.Tests
* added more debug logging to scriptedtest
* issue appears to be the 1ms deadline not being long enough on .NET Core - stream isn't even wired up yet
* fixed race condition with Bug2640Spec for #2640 needed to give the system more messages to process so we guarantee hitting all four dispatcher threads when running the test suite in parallel.
* updated API approvals
* fixed issue with Bug2640Spec again No longer looking for an exact thread count since the CPU may not schedule it that. Instead, just ensure that all of the threads hit on the dispatcher shut down when the dispatcher is terminated.
* same fix as previous
* close akkadotnet#2640 - fixed shutdown routine of HashedWheelTimerScheduler
Using Akka 1.2.0.
If a local actor system is created and disposed repeatedly, everything is fine.
If the same is done with a cluster actor system, there seems to be a memory leak after disposing.
Check the tests:
IfLocalActorSystemIsStartedAndDisposedManyTimes_ThenThereShouldBeNoMemoryLeak
Output:
Got ActorIdentity: 42
After first run - MemoryUsage: 1mb
Iteration: 2 - MemoryUsage: 1mb
Got ActorIdentity: 42
Got ActorIdentity: 42
Iteration: 4 - MemoryUsage: 1mb
Got ActorIdentity: 42
Got ActorIdentity: 42
Iteration: 6 - MemoryUsage: 1mb
...
Got ActorIdentity: 42
Got ActorIdentity: 42
Iteration: 98 - MemoryUsage: 1mb
Got ActorIdentity: 42
Got ActorIdentity: 42
Iteration: 100 - MemoryUsage: 1mb
Got ActorIdentity: 42
IfClusterActorSystemIsCreatedAndDisposedManyTimes_ThenThereShouldBeNoMemoryLeak
Output:
Got ActorIdentity: 42
After first run - MemoryUsage: 35mb
Iteration: 2 - MemoryUsage: 35mb
Got ActorIdentity: 42
Got ActorIdentity: 42
Iteration: 4 - MemoryUsage: 102mb
Got ActorIdentity: 42
Got ActorIdentity: 42
Iteration: 6 - MemoryUsage: 169mb
System.InvalidOperationException : There seems to be a memory leak!