Make ConditionalWeakTable reads lock-free #8507
Conversation
Shoot, I just realized my second commit doesn't correctly handle removes. Will fix.
Force-pushed from 54cf0c6 to af98557
I fixed the issue with removed handles not being freed.
I addressed @mjp41's observation about freeing the handles for collected objects. But I'm not sure how to address @jkotas' concern about resurrection. Is there some special runtime finalization-related support we can use to help in this case? Otherwise, the whole approach from corert seems flawed.

In looking over the code, the aim was to make reads not only lock-free but also interlocked-free (at least in managed code), and the latter is the key part. If reads are interlocked-free, then while we're resizing or removing we can't know whether some other thread has already grabbed a dependent handle but not yet used it, which means we can't free it then, which means we need to rely on finalization; but in the finalizer we may similarly be racing with this object getting resurrected and another thread accessing the dependent handles.

If we were to use interlocked operations, we could probably ref count every handle access; per read, it would likely cost about the same as (or maybe even slightly more than) a lock, but without serializing parallel access. Thoughts?
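A minimal sketch of the ref-counting idea floated above, with hypothetical names; the real container also tracks buckets, entries, and dependency handles, none of which is modeled here. Each read pays two interlocked operations, roughly the cost of an uncontended lock, but readers never block one another:

```csharp
using System;
using System.Threading;

// Hypothetical sketch of "ref count every handle access": readers bracket each
// lookup with an interlocked increment/decrement, and a resize/remove waits for
// the count to drain to zero before freeing any dependency handle.
internal sealed class RefCountedReads
{
    private int _activeReaders;

    public bool TryRead(Func<bool> lockFreeLookup)
    {
        Interlocked.Increment(ref _activeReaders);
        try
        {
            return lockFreeLookup(); // would consult the entry/handle tables
        }
        finally
        {
            Interlocked.Decrement(ref _activeReaders);
        }
    }

    // Called by a mutator before freeing handles that a reader might hold.
    public void WaitForReadersToDrain()
    {
        SpinWait spinner = default;
        while (Volatile.Read(ref _activeReaders) != 0)
        {
            spinner.SpinOnce();
        }
    }
}
```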
Would the following work? The Containers have a field, `_finalizationCount`. At the start of the finalizer, add something like:

```csharp
// Need to take the parent lock, so that the _parent._container assignment does not race.
lock (_parent._lock)
{
    if (_oldKeepAlive == null
        || _finalizationCount == 0)
    {
        // This is the currently active container, and the first time it has been finalised.
        //
        // Unlink from _parent to make this object unreachable, and unresurrectable.
        _parent._container = null;

        // Resurrect this object.
        GC.ReRegisterForFinalize(this);
        _finalizationCount = 1;
        return; // A
    }
}
```

At A, the container has been resurrected, but is unreachable in the heap, so only current readers can be accessing it. They will be keeping it live in the GC, but once they no longer have it as a root, it will be fully unreachable, and cannot be reaccessed or resurrected, so the second time finalization is called, there won't be a race.
Thanks, @mjp41. I hadn't considered re-registration, but that makes sense as a way to fight against resurrection. I'm also not too concerned about the costs here, as in my experience the majority of use for a ConditionalWeakTable is in very long-lived scenarios, e.g. a statically rooted CWT. Question on your example, though... I assume you meant `_oldKeepAlive == null && _finalizationCount == 0` rather than `_oldKeepAlive == null || _finalizationCount == 0`? We shouldn't need to play this trick with old containers (which would have `_oldKeepAlive != null`), since they're no longer referenced from the heap anyway.
@stephentoub yes, you are correct. I meant `&&`.
I am a little unsure whether all your code for `_oldKeepAlive` is necessary in Resizing/Constructing, as all the objects are live, so do they really need such a hand-over of ownership? Is this to protect against the thread going away mid-execution?
By "all your code", you mean the two lines that set _oldKeepAlive to a singleton and then set it to null? This was part of my trying to protect against things like thread aborts, but I already removed part of that solution, since thread aborts aren't possible in coreclr/corert, so I guess I can remove those two lines as well. (I also need to fix up Clear given your finalization suggestion.) |
Actually, I think we're both wrong, and unfortunately I don't think we can factor `_oldKeepAlive` into this logic. Consider this usage:
That unfortunately means the logic would need to be more like:

```csharp
...
if (!_finalized)
{
    _finalized = true;
    lock (_parent._lock)
    {
        if (_parent._container == this)
            _parent._container = null;
    }
    GC.ReRegisterForFinalize(this);
    return;
}
...
```

and every container will end up surviving multiple rounds of GC. Maybe given typical usage that's not bad?
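For context, a hedged sketch of what the rest of such a finalizer could do on the second pass, once the guard above has detached and re-registered the container. This is not necessarily the PR's exact code: `GCHandle` stands in for the runtime's dependency handles, and the `Entry` type and `HashCode == -1` removed-entry marker are assumptions patterned on the discussion:

```csharp
using System;
using System.Runtime.InteropServices;

internal sealed class Container
{
    internal struct Entry
    {
        internal int HashCode;    // assume -1 marks an entry removed from this container
        internal GCHandle depHnd; // stand-in for the runtime's dependency handle
    }

    private Entry[] _entries = Array.Empty<Entry>();
    private volatile object _oldKeepAlive; // set (in the omitted Resize) once handles are transferred
    private bool _finalized;

    ~Container()
    {
        if (!_finalized)
        {
            // First pass (as in the comment above): detach from the parent,
            // re-register for finalization, and return without freeing anything.
            _finalized = true;
            GC.ReRegisterForFinalize(this);
            return;
        }

        // Second pass: the container is now guaranteed unreachable, so no reader
        // can race with the freeing. Free only handles this container still owns:
        // everything if ownership was never transferred, otherwise just the
        // entries removed from this container, which a newer container never
        // took over.
        Entry[] entries = _entries;
        _entries = null;
        foreach (Entry e in entries)
        {
            if ((_oldKeepAlive == null || e.HashCode == -1) && e.depHnd.IsAllocated)
            {
                e.depHnd.Free();
            }
        }
    }
}
```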
Force-pushed from 81519d7 to 3b8a1dd
I've updated with this change, but I'm going to also try out the reference-counting approach and see how it fares in comparison.
... the ref counting approach I tried is much worse... not quite as bad as a lock on every operation, but pretty close.
It is a pretty standard pattern in this kind of lock-free scheme around unmanaged resources. Java has the PhantomReference primitive for this exact purpose, which in theory allows the runtime to make it a bit more efficient. We are simulating it with what's available. We have other places with similar patterns that require two GC ticks for cleanup, e.g. here:
Thanks, @jkotas. In that case, I don't have further changes planned; please let me know if you have any other feedback, are ok with the port + fixes, etc. |
LGTM
The CoreRT ConditionalWeakTable was modified to support lock-free reads. This ports the implementation back to coreclr.
Force-pushed from 3b8a1dd to 1689861
The CoreRT implementation of ConditionalWeakTable that was ported back to CoreCLR uses a special scheme to make reads lock-free. When the container needs to grow, it allocates new arrays and duplicates all of the dependency handles from the previous array, rather than just copying them. This avoids issues stemming from a thread getting a dependency handle in an operation on the container, then having that handle destroyed, and then trying to use it; the handle won't be destroyed as long as the container is referenced.

However, this also carries a significant cost: every time the container grows, it allocates another N dependency handles, where N is the current size of the container. For example, with an initial size of 8, if 64 objects are added to the container, it'll allocate 8 dependency handles, then another 16, then another 32, and then another 64, resulting in significantly more handles than in the old implementation, which would only allocate 64 handles total.

This commit fixes that by changing the scheme slightly. A container still frees its handles in its finalizer. However, rather than duplicating all of the handles, the responsibility for freeing them is transferred from one container to the next. Then, to avoid issues where, for example, the second container is released while the first is still in use, a reference is maintained from the first to the second, so that the second can't be finalized while the first is still in use.

The commit also fixes a race condition between resurrection and finalization, whereby dependency handles could be used while or after they're being freed by the finalizer. It's addressed by only freeing handles in a second finalization, after clearing out state in the first finalization to guarantee no possible usage during the second.
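A simplified, hypothetical model of the hand-off described above; the real container also manages hash buckets and the entries themselves, and `_oldKeepAlive` is the only field name taken from the discussion:

```csharp
using System;

// Sketch: on growth, the new container reuses the old container's dependency
// handles instead of duplicating them, so growing to 64 entries allocates 64
// handles total rather than 8 + 16 + 32 + 64.
internal sealed class Container
{
    private readonly IntPtr[] _handles;    // stand-ins for dependency handles
    private volatile object _oldKeepAlive; // replacement container, once one exists

    internal Container(IntPtr[] handles) => _handles = handles;

    internal Container Grow()
    {
        var bigger = new IntPtr[_handles.Length * 2];
        Array.Copy(_handles, bigger, _handles.Length);
        var next = new Container(bigger);

        // Hand off ownership: once this is set, this container's (omitted)
        // finalizer must not free the transferred handles, and the reference
        // keeps the new container from being finalized while this one is
        // still in use by a reader.
        _oldKeepAlive = next;
        return next;
    }
}
```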
Force-pushed from 1689861 to 03525a4
@stephentoub I agree we were both wrong :-) LGTM |
@stephentoub I expect to have them approved by today, but you can go ahead and merge and then I'll merge my changes with yours :) |
Ok. Thanks, @safern. |
@jkotas, I assume the fixes from here should be ported back to corert? |
Yes, it would be nice to port them. |
Make ConditionalWeakTable reads lock-free

Commit migrated from dotnet/coreclr@7adcc81
The ConditionalWeakTable implementation in corert supports lock-free reads, whereas the current implementation in coreclr locks on every operation. The first commit in this PR ports the corert implementation back to coreclr, to provide lock-free reads. In a simple benchmark that repeatedly and serially reads items from the table, this improves throughput by ~25%. But in a simple benchmark that repeatedly and in parallel reads items from the table, this improves throughput significantly, changing something that was entirely serialized to something that runs in parallel, e.g. on my quad-core, throughput improved by ~5x.
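A hypothetical micro-benchmark in the spirit of the ones described above, not the author's harness; sizes and iteration counts are arbitrary:

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

internal static class CwtReadBenchmark
{
    private static void Main()
    {
        var table = new ConditionalWeakTable<object, object>();
        var keys = new object[1024];
        for (int i = 0; i < keys.Length; i++)
        {
            keys[i] = new object();
            table.Add(keys[i], new object());
        }

        // Parallel readers: with per-operation locking these serialize; with
        // lock-free reads they scale with core count.
        var sw = Stopwatch.StartNew();
        Parallel.For(0, Environment.ProcessorCount, worker =>
        {
            for (int i = 0; i < 10_000_000; i++)
            {
                table.TryGetValue(keys[i % keys.Length], out object value);
            }
        });
        Console.WriteLine(sw.Elapsed);
        GC.KeepAlive(keys); // keep keys (and thus entries) alive during timing
    }
}
```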
However, the implementation did make one situation significantly worse: adding lots and lots of objects to the table resulted in a massive slowdown, due to how dependency handles were managed to enable the lock-freedom. For example, in a test that added 5M object pairs to the table, before the change it took ~4s, and after the change anywhere from ~15s to ~30s. The second commit addresses this by slightly changing how dependency handles are managed, specifically avoiding duplicating them. This brings addition speed back to approximately what it was before, while maintaining the lock-freedom and read-speed improvements.
I also added more tests in dotnet/corefx#14279, and they pass before and after this change.
(I will wait to merge this until after #8490 goes in and incorporate those changes.)
Fixes #3206
cc: @jkotas, @mjp41, @Maoni0