GC doesn't run when using TransactionScope + async #50683
Any chance I could get an update on this? There is a pretty serious leak in TransactionScope with TransactionScopeAsyncFlowOption.Enabled.
@karlra, out of curiosity, does omitting TransactionScopeAsyncFlowOption.Enabled make a difference? I'm also curious whether .NET 7 (current latest) makes any difference. Finally, what about .NET 7 with TieredPGO?
Yes, that's the whole point: omitting TransactionScopeAsyncFlowOption.Enabled fixes the issue (but then TransactionScope is unusable with async code). I just tested with .NET 7 and there is no appreciable difference vs .NET 6; TieredPGO also makes no difference. With TransactionScopeAsyncFlowOption.Enabled, the GC does run, but it doesn't collect whatever is being leaked here. Without TransactionScopeAsyncFlowOption.Enabled, the GC runs often and RAM usage is stable.
Note the similarity with #108762. I'm going to go ahead and close this issue, as it has been open for a while and there's no evidence of a leak: growing memory usage in itself is not a problem, and only means that the GC hasn't yet decided to reclaim that memory. An actual leak would mean that an OutOfMemoryException would eventually be thrown; that has been flagged in #108762 for Android only. If someone can put together a repro that shows an OOM for non-mobile .NET, that would indeed be interesting and we'd want to look into it.
Interesting attitude. And you don't feel like it's a problem that performance gets worse by 800% in this example? |
@karlra, first: I put together a quick BenchmarkDotNet test. When filing performance and/or memory issues, please always include such a BenchmarkDotNet benchmark rather than a Stopwatch-based one - that greatly helps ensure you're seeing correct results and not e.g. getting skewed by warm-up effects (and it generally helps get your issue addressed more quickly). Here are the results (code just below):
Benchmark code:

```csharp
using System.Transactions;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkRunner.Run<Benchmark>();

[MemoryDiagnoser]
public class Benchmark
{
    [Benchmark]
    public void Default()
    {
        using (new TransactionScope())
        {
            // Do nothing
        }
    }

    [Benchmark]
    public void TransactionScopeAsyncFlowOption_Enabled()
    {
        using (new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
        {
            // Do nothing
        }
    }
}
```

In other words, TransactionScope is indeed ~3.5x slower with TransactionScopeAsyncFlowOption.Enabled, and it also allocates more memory, with some of it living long enough to get into gen2. While it may be possible to optimize the TransactionScopeAsyncFlowOption.Enabled handling, the benchmark doesn't show a problem per se.

I read through #1419, and if I understand correctly, this issue is basically a duplicate of that, which you opened because #1419 was closed. I essentially agree with what @mconnew wrote there: there's no evidence of a problem with TransactionScope itself - and certainly no memory leak as you've indicated above - although there may be a GC issue lurking here. I'll reopen and tag this as a GC issue; hopefully someone from the GC team can chime in (I recommend also looking at #1419 for the previous discussion). I'm also happy to try and help if I can.
Tagging subscribers to this area: @dotnet/gc |
I hear what you are saying; however, I don't agree that this is a case where BenchmarkDotNet is appropriate or useful. The problem is that performance gets progressively worse, and in the end the application will use all the RAM available on the system without doing anything at all. Those issues don't come through at all in your benchmarks.
The point of the BenchmarkDotNet benchmark here is to show exactly what's happening at the TransactionScope level, i.e. how much time it's taking, how much memory it's allocating, and whether that memory sticks around or not. It serves precisely to help isolate things, since we're suspecting a GC issue.
I have a partial explanation for this - will try to post later today.
I believe that there are many pieces that fit together to cause this problem. Unfortunately, each part makes sense on its own. I'll attempt to describe it all here.
I've experimented locally with several changes to ConditionalWeakTable.
Note: "during an object resurrection" is sloppy phrasing here - the point is that the Container can still be in use when its finalizer runs, so the cleanup has to account for resurrection.
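As background for the discussion above: a ConditionalWeakTable entry keeps its value alive only as long as its key is reachable, so in principle the value should become collectible as soon as the key does. A minimal illustration of those semantics (the names here are stand-ins, not the System.Transactions internals):

```csharp
using System;
using System.Runtime.CompilerServices;

class Key { }
class Data { public byte[] Payload = new byte[1024]; }

class Demo
{
    static readonly ConditionalWeakTable<Key, Data> Table = new();

    static void AddEntry()
    {
        var key = new Key();
        Table.Add(key, new Data());
        // 'key' goes out of scope here; no strong reference to it remains,
        // so the entry (key and Data) becomes eligible for collection.
    }

    static void Main()
    {
        AddEntry();
        // The entry itself can be collected once the key is unreachable, but
        // reclaiming the table's internal Container storage additionally
        // requires finalization to run - which is where the lifetime
        // extension discussed in this thread comes in.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
    }
}
```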
@markples, after this bit I kept reading to see if you'd mention the ConditionalWeakTable. Why does this behavior trigger only for the async flow case? Just asking for clarification.
The source code does seem to show that the behavior is async-vs-not:

```csharp
if (AsyncFlowEnabled)
{
    Debug.Assert(ContextKey != null);
    // Async Flow is enabled and CallContext will be used for ambient transaction.
    _threadContextData = CallContextCurrentData.CreateOrGetCurrentData(ContextKey);
// ...

public static ContextData CreateOrGetCurrentData(ContextKey contextKey)
{
    s_currentTransaction.Value = contextKey;
    return s_contextDataTable.GetValue(contextKey, (env) => new ContextData(true));
}
// ...

private static readonly ConditionalWeakTable<ContextKey, ContextData> s_contextDataTable =
    new ConditionalWeakTable<ContextKey, ContextData>();
```

I assume that this is due to additional bookkeeping needed to make the async scenario work, but I don't know any details here. @roji, is this something that you can answer?
There are two remaining things on my list:
@markples thanks for doing the deep dive here, very interesting stuff...!
Unfortunately not, not without some investigation... I will note that the code has been this way for a very long time, basically since TransactionScopeAsyncFlowOption.Enabled was introduced (I'm almost certain the same logic exists on .NET Framework). I'll also note that in general, the bar for changing things in System.Transactions is quite high - this isn't a part of .NET that's being actively evolved (and in any case, unless we say that using CWT is discouraged and should be removed, I'm not sure there's an actionable Sys.Tx-side thing to do...). |
Thanks @roji. Given some of the details in #1419, it looks like the performance problem has been around for a very long time as well. It looks like the CWT implementation was changed between .NET Framework and .NET Core. The previous version locked on reads, which slows them down, but then cleanup is much simpler (it can be immediate). I don't know the history of that change (though it's certainly understandable to not want a lock on reads). It's possible that one or both of the above ideas will handle real scenarios. It would certainly be possible to build a microbenchmark that defeats them, so I worry a bit that a real scenario could still hit trouble. It would be interesting to try these on @karlra's workload. Cloning the entries into a new table is another option.
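The cloning idea mentioned above could look roughly like this (purely a sketch of the workaround concept, not proposed runtime code; the `Compact` helper and its types are hypothetical):

```csharp
using System.Runtime.CompilerServices;

class Key { }
class Data { }

static class CwtCompaction
{
    // Sketch: periodically migrate the live entries into a fresh table so the
    // old table's internal Containers (and their chain of replacement links)
    // become unreachable and can be reclaimed.
    public static ConditionalWeakTable<Key, Data> Compact(
        ConditionalWeakTable<Key, Data> old)
    {
        var fresh = new ConditionalWeakTable<Key, Data>();
        foreach (var (key, value) in old) // enumeration yields only live entries
        {
            fresh.Add(key, value);
        }
        return fresh; // the caller drops 'old', releasing its Containers
    }
}
```

The caveat, as noted above, is that a microbenchmark (or an unlucky real workload) could defeat this by keeping entries or enumerators alive across compactions.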
There seems to be a comment in the relevant code that mentions multiple app domains and remoting, which .NET Core does not support. Is this no longer useful, and could it be removed to mitigate the issue? (See runtime/src/libraries/System.Transactions.Local/src/System/Transactions/Transaction.cs, lines 1053 to 1057 at 43813ac.)
@aromaa thanks for looking into it... That may be true, I'm not sure - there may still be scenarios where the CWT usage is needed etc. This would require deeper investigation, and as I said, the bar on doing changes in System.Transactions is relatively high. In any case, I'm still not convinced we should be necessarily working in the direction of dropping CWT from Sys.Tx; the issues analyzed by @markples above are general to CWT, so we should be thinking about them regardless of Sys.Tx (unless the decision is made that CWT is now discouraged etc.). |
…sferred (#108941)

ConditionalWeakTable uses internal Container objects for the underlying table. Container entries are write-once because the read path is lock-free. When a Container is full, a new Container is allocated, entries are copied, and compaction can occur (if there aren't any currently live enumerators relying on specific indices).

A two-pass finalization scheme is used to free the entries (dependent handles) and then the Containers themselves. Finalization provides a guarantee that the Container is no longer in use, and the second pass accounts for finalization resurrection. Because entries can be duplicated across Containers, each Container contains a link to the one that replaces it. This can _greatly_ extend the lifetime of Containers. (See #50683 and #108447.)

However, if a Container is empty and not being enumerated, there is no need to link it to the next Container. This PR handles that case, which includes microbenchmarks where single entries are added to and removed from ConditionalWeakTable, and equivalent tests where TransactionScope functions as a wrapper around a ConditionalWeakTable entry. Of course, this is only a partial solution, because a single live entry or enumerator retains the old behavior. Another caveat is that the finalization queue can be filled faster than it can be emptied, though this is more likely in microbenchmarks where no other work is being done.
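The lifetime-extension mechanism described in this commit message can be modeled abstractly as follows (an illustrative model, not the actual runtime Container implementation):

```csharp
using System;

// Illustrative model: a retired container links forward to its replacement,
// so any old container that is still reachable (e.g. via a live enumerator)
// transitively keeps every newer container alive as well.
class ContainerModel
{
    public ContainerModel? Next;   // set when this container is replaced
    public object?[] Entries;

    public ContainerModel(int capacity) => Entries = new object?[capacity];

    // When full, allocate a replacement and link to it. Entries are copied
    // rather than mutated in place, because write-once entries are what make
    // the lock-free read path safe.
    public ContainerModel Grow()
    {
        var bigger = new ContainerModel(Entries.Length * 2);
        Array.Copy(Entries, bigger.Entries, Entries.Length);
        Next = bigger;             // the link that extends lifetime
        return bigger;
    }
}
```

The PR's observation is that when the retiring container is empty and un-enumerated, the `Next` link can simply be skipped, breaking the chain.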
Description
I have an application that uses the TransactionScope class to control SQL transactions. We also use the same library for mathematical simulations, but in that case it injects a dummy repository to stop the persistence to SQL. When doing this, I noticed that performance became progressively worse, and eventually the app crashed with OOM errors.
I had previously created a bug report here: #1419 but it was closed because it was determined that this is a GC issue and not a TransactionScope issue.
Take this code:
This program's memory usage will grow forever, and the time it takes to complete 1 million TransactionScopes will also grow forever.
Uncommenting the GC.Collect(2) fixes the ever-increasing memory issue, but for some reason the GC does not run on its own in this program. You might think this is a contrived example, but I found this issue in production code, where we have console apps that do extremely heavy calculations.
Note that even if GC.Collect(2) is called manually, the performance per 1 million rounds still gets worse by around 75% until it levels out. This suggests that our data access code is impacted even in situations where the GC might run gen2 collections on its own due to other allocations.
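A minimal console repro along the lines described above might look like this (a sketch reconstructed from the description, not the original code attached to the issue; the iteration count and reporting are illustrative):

```csharp
using System;
using System.Diagnostics;
using System.Transactions;

class Program
{
    static void Main()
    {
        var sw = new Stopwatch();
        while (true)
        {
            sw.Restart();
            for (int i = 0; i < 1_000_000; i++)
            {
                // With async flow enabled, each scope registers an entry in an
                // internal ConditionalWeakTable (see the analysis below).
                using (new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
                {
                    // Do nothing - no actual transaction work is needed.
                }
            }
            long heapMb = GC.GetTotalMemory(forceFullCollection: false) / (1024 * 1024);
            Console.WriteLine($"1M scopes: {sw.ElapsedMilliseconds} ms, heap: {heapMb} MB");

            // Per the report, uncommenting this keeps memory stable:
            // GC.Collect(2);
        }
    }
}
```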
Configuration
.NET Core 3.1, Windows 10 x64
Regression?
This does not seem to happen on .NET Framework.
Other information
The situation improves somewhat with .NET 5 - I haven't managed to get OOM errors - but it still gets massively, progressively worse.