Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support struct Enumerator for ConcurrentDictionary #25448

Open
stephentoub opened this issue Mar 14, 2018 · 33 comments
Open

Support struct Enumerator for ConcurrentDictionary #25448

stephentoub opened this issue Mar 14, 2018 · 33 comments
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.Collections backlog-cleanup-candidate An inactive issue that has been marked for automated closure.
Milestone

Comments

@stephentoub
Copy link
Member

ConcurrentDictionary enables enumerating the dictionary while other threads are modifying it, making it very useful in a variety of scenarios. But its GetEnumerator allocates an enumerator object. We can't change the return type of GetEnumerator, but we can at least expose an enumerator struct to enable enumeration without allocating.

Proposal:

namespace System.Collections.Concurrent
{
    public class ConcurrentDictionary<TKey, TValue>
    {
        public IEnumerator<KeyValuePair<TKey, TValue>> GetEnumerator(); // existing

        public struct Enumerator // new
        {
            public Enumerator(ConcurrentDictionary<TKey, TValue> dictionary);
            public bool MoveNext();
            public KeyValuePair<TKey, TValue> Current { get; }
        }
    }
}

Example usage:

private static ConcurrentDictionary<string, int> s_dict = new ConcurrentDictionary<string, int>();
...
var enumerator = new ConcurrentDictionary<string, int>.Enumerator(s_dict);
while (enumerator.MoveNext())
{
    Use(enumerator.Current);
}
@karelz
Copy link
Member

karelz commented Mar 18, 2018

@stephentoub is Future ok? Or do you think we should push it into 2.1?

@stephentoub
Copy link
Member Author

Future is fine

@davidfowl
Copy link
Member

davidfowl commented Mar 31, 2018

Is this difficult to do? I'd love to measure our broadcast performance scenario with this. We currently filed this issue https://github.com/aspnet/SignalR/issues/1796 to try and work around allocating the enumerator per message (which we're currently doing).

@Joe4evr
Copy link
Contributor

Joe4evr commented Mar 31, 2018

We can't change the return type of GetEnumerator, but we can at least expose an enumerator struct to enable enumeration without allocating.

Isn't the compiler (and/or JIT) already smart enough to emit direct calls for struct enumerators? 🤔

Unless you meant that changing from a class to a struct enumerator is a binary breaking change, in which case.... Would it be suitable to quirk instead?

I just wouldn't like have having to remember to use the more verbose syntax over the simplicity of foreach for one specific collection type to keep GC pressure down.

@stephentoub
Copy link
Member Author

Is this difficult to do?

No, the iterator just needs to be rewritten manually.

Unless you meant that changing from a class to a struct enumerator is a binary breaking change

Correct. Changing the GetEnumerator signature is a breaking change.

Would it be suitable to quirk instead?

No, not for API signature changes.

@stephentoub
Copy link
Member Author

Is this difficult to do?

@davidfowl, e.g. something like this:
stephentoub/corefx@d11bdcd

@terrajobst
Copy link
Member

terrajobst commented Apr 24, 2018

I'm not sure I like it because it seems heavy-handed and the consuming code cannot use foreach.

@jaredpar is there any way we could change the compiler rules for IEnumerable to prefer a struct-based enumerator when GetEnumerator() is already taken? I suspect we'll have this problem in other areas too. @bartonjs suggested that we could add another property/method that returns a struct that provides a struct-based GetEnumerator() method. Again, feels heavy-handed...

@jaredpar
Copy link
Member

is there any way we could change the compiler rules for IEnumerable ...

No

This is one of the most used patterns in .NET. Changing the compiler rules for how we bind existing IEnumerable is a back compat issue.

@stephentoub
Copy link
Member Author

stephentoub commented May 3, 2018

This has probably been discussed before, but is there anything reasonable we could do in the runtime to make it a non-breaking change to change a signature from:

public IEnumerator<T> GetEnumerator() { ... }

to

public Enumerator GetEnumerator() { ... }

public struct Enumerator : IEnumerator<T> { ... }

such that code compiled against the former would "just work" with an implementation containing the latter? Then code compiled against the new signature would get the allocation benefit but code compiled against the older signature would still work.

cc: @jkotas

@jkotas
Copy link
Member

jkotas commented May 3, 2018

You can have both Enumerator GetEnumerator() and IEnumerator<T> GetEnumerator() methods in the IL. It is just that you cannot do that in C# today. It is possible to hack it via IL rewriting, e.g. the implementation can contain both methods, but reference assembly just the new one. The problem with introducing such non-standard overload in the implementation is that it would break reflection or scripting scenarios.

The other possible approach is to wait for the JIT to get better and optimize out the allocation in the current less than ideal pattern for you.

@svick
Copy link
Contributor

svick commented May 3, 2018

@jkotas

It is possible to hack it via IL rewriting, e.g. the implementation can contain both methods, but reference assembly just the new one.

Wouldn't that mean that if you upgrade to the newer reference assembly, code that does any kind of type inference will start behaving differently? E.g.:

var e = cdict.GetEnumerator(); // is the type of e Enumerator or IEnumerator<T>?

void F<T>(T x) {}
F(cdict.GetEnumerator()); // is the inferred T Enumerator or IEnumerator<T>?

I think that's an unacceptable breaking change.

@jkotas
Copy link
Member

jkotas commented May 3, 2018

This is a general problem with introducing overloads. If the new overload has same behavior as the old one (ie it is unlikely to change the behavior of existing code silently), I believe it has been considered as an acceptable option. It has been treated on case by case.

@terrajobst
Copy link
Member

terrajobst commented May 22, 2018

Video

The options listed here are all somewhat unfortunate (that is, they suck for various reasons :-)).

One compromise would be to add a new method called GetEnumerator2() and change the compiler to support foreach-ing enumerators and code would like this:

var dict = new ConcurrentDictionary<string, int>();
// add stuff

foreach (var kv in dict.GetEnumerator2())
{
    Use(enumerator.Current);
}

The established patterns for struct-based alternative is ValueXxx, like ValueTuple or ValueTask but in this case value might be a bad idea due to the generic parameter TValue.

@jaredpar, are you still supportive of this change to the compiler?
@stephentoub, are you OK with the resulting usage?

@svick
Copy link
Contributor

svick commented May 22, 2018

@terrajobst How is that better than a method like GetStructEnumerable() that returns a struct whose GetEnumerator() returns the struct enumerator (which you called "heavy-handed" before)?

With GetStructEnumerable():

  • You don't need to change the language (or the compiler).
  • The usage stays the same as with GetEnumerator2 (you just change the name of the called method).
  • The API surface does become larger, but if that's an issue, you could combine the enumerable and the enumerator into a single type.
  • Performance should not be worse, since the additional call should be inlineable.

@jaredpar
Copy link
Member

jaredpar commented May 23, 2018

are you still supportive of this change to the compiler?

I'm personally comfortable with the idea of allowing foreach on IEnumerator<T>. It's a language change though hence has to go through LDM. Happy to represent the issue though.

@jnm2
Copy link
Contributor

jnm2 commented May 28, 2018

There have been a number of times I've wanted foreach on an IEnumerator<T>-patterned struct. This language change would be welcome!

@buybackoff
Copy link

Suddenly this has become almost the only allocator on my hot path (c.3Gb per minute). What are other current alternatives for small number of values without lock? I'm using it instead of missing but long-awaited in CoreFx ConcurrentHashSet (only keys).

The enumerator is implemented with yield keyword that does not support Reset. However, the Reset method is still not deprecated officially, but it is optional. What if we could reuse the enumerator and call Reset on it instead of GetEnumerator()? That will allow to keep current API unchanged.

buybackoff referenced this issue in Spreads/Spreads Jul 11, 2018
…ocates enumerators (3+Gb/min in the test). Now zero allocations and 20% faster.

See https://github.com/dotnet/corefx/issues/28046, revert to CD when this is fixed.
@stephentoub
Copy link
Member Author

The other possible approach is to wait for the JIT to get better and optimize out the allocation in the current less than ideal pattern for you.

@jkotas, are we any closer to this happening? It's been discussed for years. :)

What if we could reuse the enumerator and call Reset on it instead of GetEnumerator()?

Interesting idea. I'd be ok with that as a stopgap measure, but it wouldn't fully replace the benefits of GetEnumerator being allocation-free, in particular because you couldn't share that enumerator instance across multiple threads. Plus, it's very non-discoverable.

@buybackoff
Copy link

buybackoff commented Jul 11, 2018

in particular because you couldn't share that enumerator instance across multiple threads.

If thread that resets could exclusively use it after reset that it's ok. The new struct enumerator could be a field of a class enumerator and just re-created on Reset.

Plus, it's very non-discoverable.

Those who care about enumerator allocations will find it. This issue was the first Google result yesterday :)

Another issue is that it will still have interface method call and no inlining, but this is still so much better than allocations on every small broadcast message like in SignalR and my case.

@davidfowl
Copy link
Member

It would be great to have something usable that didn't allocate (not sure about the reset hack). Here's what we currently do in SignalR:

https://github.com/aspnet/SignalR/blob/2d4fb0af6fd2ef2e3a059a2a15a56484d8800d35/src/Microsoft.AspNetCore.SignalR.Core/HubConnectionStore.cs#L44

@znakeeye
Copy link

Suddenly this has become almost the only allocator on my hot path (c.3Gb per minute). What are other current alternatives for small number of values without lock? I'm using it instead of missing but long-awaited in CoreFx ConcurrentHashSet (only keys).

The enumerator is implemented with yield keyword that does not support Reset. However, the Reset method is still not deprecated officially, but it is optional. What if we could reuse the enumerator and call Reset on it instead of GetEnumerator()? That will allow to keep current API unchanged.

Funny. This allocator has become an outstanding hotspot in our memory intensive project as well. Maybe time to write a custom concurrent dictionary in the meantime? Or did you find a way to get rid of the allocations?

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 5.0 milestone Jan 31, 2020
@maryamariyan maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 23, 2020
@stephentoub stephentoub modified the milestones: 5.0.0, Future Jun 23, 2020
@layomia layomia removed the untriaged New issue has not been triaged by the area owner label Jun 24, 2020
@ghost
Copy link

ghost commented Oct 26, 2021

Due to lack of recent activity, this issue has been marked as a candidate for backlog cleanup. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will undo this process.

This process is part of the experimental issue cleanup initiative we are currently trialing in a limited number of areas. Please share any feedback you might have in the linked issue.

@znakeeye
Copy link

Will this be included in .NET 6? .NET 7?

That mswtfbot wants to move this to the backlog? Meanwhile, I'm thinking about this issue every single night. In our mega large codebase, I tried using a lock + regular Dictionary, but soon realized it had even worse performance than ConcurrentDictionary. Obviously, I soon need to write a new concurrent dictionary 😓 Nice...

@ghost ghost removed the no-recent-activity label Oct 28, 2021
@stonstad
Copy link

This concept has implications for game dev scenarios in which zero allocation enumeration is needed, i.e. Unity.

@davidfowl
Copy link
Member

Profiling a SignalR based game again, there's per frame cost allocating these that I would love to see go away in .NET 7.

@stephentoub
Copy link
Member Author

@jaredpar, another category of examples binary obsoletion could help with, if we could come up with a decent way of having both signatures in the C# and not having to drop to IL to overload by return type.

@davidfowl
Copy link
Member

@jaredpar, another category of examples binary obsoletion could help with, if we could come up with a decent way of having both signatures in the C# and not having to drop to IL to overload by return type.

We have another scenario for this in ASP.NET Core. My attempt was going to be introducing a source breaking change but not a binary breaking one by manipulating the ref assembly.

@jaredpar
Copy link
Member

@davidfowl can you point me to the ASP.NET scenarios in this area? Trying to get the list together so we can find the best way to approach this problem in compiler / language.

@davidfowl
Copy link
Member

Sent mail to avoid polluting this thread 😄

@AndyAyersMS
Copy link
Member

I'd appreciate seeing some canonical examples here too, perhaps we can find opportunities for escape analysis to allow stack allocation of one of these objects.

@znakeeye
Copy link

Do you want a GC benchmark example?

I found this issue when hunting GC waits on a very hot path where GetEnumerator() was significant.

@stephentoub
Copy link
Member Author

I'd appreciate seeing some canonical examples here too, perhaps we can find opportunities for escape analysis to allow stack allocation of one of these objects.

Specific to ConcurrentDictionary<>, they're generally of the form:

private ConcurrentDictionary<Key, Value> _data = new();
...
void Handle()
{
    foreach (KeyValuePair<Key, Value> entry in _data)
    {
       ...
    }
}

with variations, like the dictionary being a static readonly field, or being a method parameter.
e.g.
https://github.com/dotnet/aspnetcore/blob/46562b1435bf111a7425b40f507b157b42a016a4/src/Components/Components/src/CascadingValueSource.cs#L95
https://github.com/dotnet/aspnetcore/blob/46562b1435bf111a7425b40f507b157b42a016a4/src/SignalR/server/Core/src/StreamTracker.cs#L79
https://github.com/dotnet/aspnetcore/blob/46562b1435bf111a7425b40f507b157b42a016a4/src/SignalR/server/StackExchangeRedis/src/Internal/AckHandler.cs#L53
https://github.com/dotnet/aspnetcore/blob/46562b1435bf111a7425b40f507b157b42a016a4/src/Servers/Kestrel/Core/src/Internal/Infrastructure/ConnectionManager.cs#L69
https://github.com/dotnet/aspnetcore/blob/46562b1435bf111a7425b40f507b157b42a016a4/src/Servers/Kestrel/Core/src/Internal/Infrastructure/TransportConnectionManager.cs#L59
https://github.com/dotnet/aspnetcore/blob/46562b1435bf111a7425b40f507b157b42a016a4/src/SignalR/server/Core/src/DefaultHubLifetimeManager.cs#L129C36-L129C47

foreach (KeyValuePair<string, Logger> registeredLogger in _loggers)

foreach (KeyValuePair<object, CacheEntry> entry in oldState._entries)

foreach (KeyValuePair<HttpConnectionKey, HttpConnectionPool> entry in _pools)

More generally, we'll often see these enumerator allocations from methods that return an interface, e.g. IList<T> or IEnumerable<T>, and the consumer then just foreach's it or uses LINQ with it. In such cases, even if there's a type like List<T> that provides a struct-based Enumerator, it still ends up allocating due to the boxing.

@AndyAyersMS
Copy link
Member

Thanks. With PGO the JIT can see through the interface calls, but it is not yet adept at converting a connected graph of checks into a check-free region that will let the JIT see that the enumerator does not escape (see eg #62457). And escape analysis is disabled for ref classes and can't handle boxes yet.

The ultimate plan here would be to use GDV to devirtualize, fully inline the enumerator construction and enumerator calls, stack allocate the enumerator (either boxed struct or class), and then promote the enumerator fields. At that point the entire enumerator object likely evaporates, and we have something akin to a for loop over the concrete collection.

I don't know if we can get all this into .NET 9 but would like to continue to work on the necessary technology.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.Collections backlog-cleanup-candidate An inactive issue that has been marked for automated closure.
Projects
None yet
Development

No branches or pull requests