Add CollectionsMarshal.SetCount API for List<T> #55217
Tagging subscribers to this area: @eiriktsarpalis
Issue Details
Background and Motivation
The existing CollectionsMarshal.AsSpan(List<T>) API lets developers treat a List<T> as a Span<T> in performance-sensitive code, but it offers no way to change the Length of the List, which limits the modifications that can be made through the span. Exposing an API in CollectionsMarshal for doing that would solve this and help remove redundant array allocations and AddRange copies when adding data to a List that comes from an API that takes a Span (AddRange doesn't accept Spans, so the buffers can't even be stack allocated).
Implementation details:
Proposed API
namespace System.Runtime.InteropServices
{
    public static class CollectionsMarshal
    {
+       public static void SetLength<T>(List<T> list, int length);
    }
}
Example implementation:
public static void SetLength<T>(List<T> list, int length)
{
    if (length < 0)
        throw new ArgumentOutOfRangeException(); // would probably be a throw helper
    if (length == list._size)
        return;
    list._version++; // would need to be made internal
    if (length < list._size)
    {
        if (RuntimeHelpers.IsReferenceOrContainsReferences<T>())
            Array.Clear(list._items, length, list._size - length);
        list._size = length;
        return;
    }
    if (length > list.Capacity)
        list.Capacity = length;
    list._size = length;
}
Usage Examples
RandomNumberGenerator.Fill is an example Span API here; a more usual case would be using less expensive APIs.
List<byte> list = new(16);
CollectionsMarshal.SetLength(list, 16);
RandomNumberGenerator.Fill(CollectionsMarshal.AsSpan(list));
// we have a length 16 list filled with random data
Alternative Designs
Currently the developer is required to do this, which requires an array allocation and a copy coming from AddRange:
List<byte> list = new(16);
byte[] array = new byte[16];
RandomNumberGenerator.Fill(array);
list.AddRange(array);
Risks
The API could expose uncleared memory for value types with no GC references (when the List was cleared, or if Lists were to use GC.AllocateUninitializedArray so that the initial array was allocated uninitialized).
|
IMO this would be more useful and discoverable as an instance method |
While an instance Resize that zero-initializes would be more discoverable, I'd rather add both APIs than only that one, since many of the use cases for AsSpan that modify the list probably don't care about its previous contents, so zeroing would only be an unnecessary cost. Edit: I guess there could also be an optional parameter, something like |
Maybe just expose the list's entire array as a span? E.g. the following new API should be added (name is a draft):
namespace System.Runtime.InteropServices
{
    public static class CollectionsMarshal
    {
+       public static Span<T> AsCapacitySpan<T>(List<T>? list);
    }
}
Implementation is similar to existing AsSpan:
Usage example with already mentioned RandomNumberGenerator.Fill:
|
This wouldn't really help, as an API like this wouldn't modify the Count, meaning that the data written there would be inaccessible without using a span. It is also already possible to create something like that without using reflection with |
Could I get some movement on the proposal @eiriktsarpalis? It's been sitting as |
As a meta point aside, it takes work to review API proposals, validate that the concept is something we want included, ensure the design makes sense given the rest of what's exposed, etc. That work all happens prior to an issue being marked as API approved. So the fact that it was added to the cited "planned work" issue is in fact movement. |
@MichalPetryka I'm also not in favor of this API as proposed; I much prefer @GrabYourPitchforks's proposal. If you care about perf, why do you use List rather than a pre-allocated buffer? Perhaps I simply cannot see an end-to-end scenario where I'd use List with RNG.Fill and care much about perf. If you shared an E2E scenario, it would be more convincing. |
RNG was used as an example API that takes in a span. If you want a more realistic use case, the example span AddRange from #1530 (comment) shows a scenario with less overhead. I'd be fine with a general-purpose API as proposed by @GrabYourPitchforks, but an "unsafe" version would still be useful, for example in that proposed AddRange, to avoid the overhead of writing the data there twice. |
I'd suggest changing the proposal to what @GrabYourPitchforks has suggested, and once you have that API, perhaps create a second proposal (only if you feel you'll need it; perhaps share a benchmark with the potential perf improvement to support that) |
This issue has been marked |
This issue has been automatically marked |
I've tried benchmarking both a zeroing and a non-zeroing version with an AddRange from #1530, each adding 50 ints to an empty list with preallocated memory, but I and 3 other people have gotten wildly different results:
One person:
Another person:
While the ratios are completely different, they all show a clear win for the non-zeroing version, so I'd prefer both to be exposed. (I have absolutely no idea what would cause the ratio to be so wildly different on my machine, but it persists on every single run) |
@layomia, I see you assigned this to yourself. Does that mean you're working on this? |
@stephentoub yes. |
I personally am not a big fan of the decision not to zero the non-GC elements; somebody someday will shoot themselves in the foot with this:
List<int> a = Enumerable.Range(0, 1024).ToList();
a.SetCount(10); // doesn't clear upper 1024-10 elements. OK
...
a.SetCount(100); // doesn't clear on expansion either. Not OK
// I now have garbage in my upper 90 elements (as implemented in #80311) |
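If zeroed growth is wanted on top of the API as implemented, the caller can clear the newly exposed range itself. A minimal sketch, assuming .NET 8 or later (where CollectionsMarshal.SetCount is available); ResizeZeroed and ListResizeHelpers are hypothetical names, not BCL members:

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

public static class ListResizeHelpers
{
    // Hypothetical wrapper: resizes the list and zeroes any elements
    // that become visible when the count grows.
    public static void ResizeZeroed<T>(List<T> list, int count)
    {
        int oldCount = list.Count;

        // SetCount does not guarantee the contents of newly exposed elements.
        CollectionsMarshal.SetCount(list, count);

        if (count > oldCount)
        {
            // Clear only the tail that was just exposed: [oldCount, count).
            CollectionsMarshal.AsSpan(list).Slice(oldCount).Clear();
        }
    }
}
```

This pays for the extra write only on growth, so callers that overwrite the new elements immediately can keep calling SetCount directly.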
This is a low-level API where the intent of use is for developers to write their own performant APIs that extend It exists in the The API can be used "incorrectly" but the intended usage scenario is that after It is no more unsafe than Particularly for large |
Honestly, it doesn't read for me as "it's unsafe", but rather "a set of tools to help with interop" |
Interop and marshalling is by definition unsafe. It's one of the many things we'd mark as That's why we didn't name this |
* Add CollectionsMarshal.SetCount(list, count)
  Adds the ability to resize lists, exposed in CollectionsMarshal due to potentially risky behaviour caused by the lack of element initialization. Supersedes #77794. Fixes #55217.
* Update XML doc
* Add missing using
* Fix test
* Update CollectionsMarshalTests.cs
* Update CollectionsMarshal.cs
* Update CollectionsMarshalTests.cs
* Update CollectionsMarshalTests.cs
Background and Motivation
The existing CollectionsMarshal.AsSpan(List<T>) API lets developers treat a List<T> as a Span<T> in performance-sensitive code, but it offers no way to change the Count of the List, which limits the modifications that can be made through the span. Exposing an API in CollectionsMarshal for doing that would solve this and help remove redundant array allocations and AddRange copies when adding data to a List that comes from an API that takes a Span (AddRange doesn't accept Spans, so the buffers can't even be stack allocated).
Implementation details:
Proposed API
namespace System.Runtime.InteropServices
{
    public static class CollectionsMarshal
    {
+       public static void SetCount<T>(List<T> list, int count); // alternative could be EnsureCount
    }
}
Example implementation:
Usage Examples
The API could be used for creating an extension method in place of the missing ReadOnlySpan AddRange:
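A minimal sketch of such an extension, assuming .NET 8 or later (where CollectionsMarshal.SetCount is available). The names AppendSpan and ListSpanExtensions are hypothetical, chosen to avoid clashing with any existing AddRange overloads:

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

public static class ListSpanExtensions
{
    // Hypothetical helper: appends the contents of a span to the list
    // without an intermediate array allocation.
    public static void AppendSpan<T>(this List<T> list, ReadOnlySpan<T> source)
    {
        if (source.IsEmpty)
            return;

        int oldCount = list.Count;

        // Grow the list first; the new elements have unspecified contents
        // until they are overwritten below.
        CollectionsMarshal.SetCount(list, oldCount + source.Length);

        // Take the span only after resizing, since SetCount may have
        // replaced the backing array.
        source.CopyTo(CollectionsMarshal.AsSpan(list).Slice(oldCount));
    }
}
```

Every newly exposed element is overwritten by the copy, so the lack of zeroing in SetCount is never observable here.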
Random.GetBytes is an example API that fills a Span with data here.
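As an illustration of filling a list through a span-taking API, here is a sketch that uses RandomNumberGenerator.Fill (the API from the original version of this proposal) as the filler, assuming .NET 8 or later. RandomListExample and CreateRandomList are hypothetical names:

```csharp
using System.Collections.Generic;
using System.Runtime.InteropServices;
using System.Security.Cryptography;

public static class RandomListExample
{
    // Hypothetical helper: builds a list of `count` random bytes without
    // allocating a temporary array or copying through AddRange.
    public static List<byte> CreateRandomList(int count)
    {
        List<byte> list = new(count);

        // Make all `count` elements part of the list; their contents are
        // unspecified until the fill below overwrites them.
        CollectionsMarshal.SetCount(list, count);

        // Write directly into the list's backing storage.
        RandomNumberGenerator.Fill(CollectionsMarshal.AsSpan(list));
        return list;
    }
}
```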
This would be as valid, just a bit less performant:
With this the end count would also be 16
With this the end count would also be 16, and after SetCount we'd have the original content from before Clear (not guaranteed, implementation detail)
Alternative Designs
Currently the developer is required to do this, which requires an array allocation and a copy coming from AddRange:
The first example usage would look like this:
Risks
The API could expose uncleared memory for value types with no GC references (when the List was cleared, or if Lists were to use GC.AllocateUninitializedArray so that the initial array was allocated uninitialized).