Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ArraySegmentPool #41240

Closed
2 tasks done
wizardars opened this issue Aug 23, 2020 · 20 comments
Closed
2 tasks done

Add ArraySegmentPool #41240

wizardars opened this issue Aug 23, 2020 · 20 comments
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Buffers
Milestone

Comments

@wizardars
Copy link

wizardars commented Aug 23, 2020

Background and Motivation

This pool will be necessary for developers who do not agree to compromise and want to write the fastest solutions. For example, protocols with zero allocations over UDP with a bandwidth of more than 1 gigabit .

Implementation explanation

Pool for ArraySegments.

Features:

  • Thread-safe
  • Autoresize

Model:

                         POOL
   .---------------------------------------------.
   |                                             |
   |  arrayLayout => [1][2][3]                   |
   |                  |  |  +-----------+        |
   |                  |  +-----+        |        |
   |                  |        |        |        |
   |                  v        v        v        |
   |        array => [1][1][1][2][2][2][3][3][3] |
   |                                             |
   |  if arrayLayout[x] = 0 (free)               |
   |  if arrayLayout[x] = 1 (rented)             |
   '---------------------------------------------'

Usage Examples

Initialize (Auto resize mode)

int maxArraySegmentLength = 9_038;
int initialCapacity = 1;
int maxCapacity = 100_000;
var arraySegmentPool = new ArraySegmentPool<byte>(maxArraySegmentLength, initialCapacity, maxCapacity);

Initialize (Fixed size mode)

int maxArraySegmentLength = 9_038;
int fixedCapacity = 100_000;
var arraySegmentPool = new ArraySegmentPool<byte>(maxArraySegmentLength, fixedCapacity);

Rent

int arraySegmentLength = 1250;
ArraySegment<byte> arraySegment = arraySegmentPool.DangerousRent(arraySegmentLength);

Return

arraySegmentPool.Return(ref arraySegment);

Alternative Designs

No other options.

Risks

The ArraySegment will be modified and size will be increased by 1 byte.
https://github.com/dotnet/runtime/pull/41237/files#diff-aa8a8894d7639b5648acaf9aac2a36c4

Proposed implementation

namespace System.Buffers
{
    /// <summary>
    /// Implements a dangerous, thread safe, auto resizable Pool that uses an array of objects to store <see cref="ArraySegment{T}"/>.
    /// The best fit is when one thread <see cref="DangerousRent"/>, same or others <see cref="Return"/> back short-lived <see cref="ArraySegment{T}"/>.
    /// (!) WARNING: If the <see cref="ArraySegment{T}"/> is not returned, a memory leak will occur.
    /// (!) WARNING: Do not use <see cref="ArraySegmentPool{T}"/> for long-lived objects, otherwise the memory consumption will be enormous.
    /// (!) WARNING: Slice <see cref="ArraySegment{T}"/> to zero not permitted.
    /// (!) WARNING: User of <see cref="ArraySegmentPool{T}"/> must strictly understand how the structure differs from the class.
    /// (!) WARNING: If the user has made many copies of the <see cref="ArraySegment{T}"/>, then only one copy needs to be returned to the <see cref="ArraySegmentPool{T}"/>. After returning, you should not use the leftover copies, as this will corrupt the data.
    /// </summary>
    /// <typeparam name="T"></typeparam>
    public class ArraySegmentPool<T>
    {
        private volatile _pool _currentPool;
        private readonly bool _autoResize;
        private readonly int _maxArraySegmentLength;
        private readonly int _maxCapacity;
        private readonly object _poolLock = new object();
        /// <summary>
        /// Gets the number of rented <see cref="ArraySegment{T}"/> after last resize.
        /// </summary>
        public int Count
        {
            get
            {
                return _currentPool.Count;
            }
        }
        /// <summary>
        /// Gets the total number of <see cref="ArraySegment{T}"/> after last resize.
        /// </summary>
        public int Capacity
        {
            get
            {
                return _currentPool.ArrayLayout.Length;
            }
        }
        /// <summary>
        /// Gets the maximum capacity of <see cref="ArraySegmentPool{T}"/>.
        /// </summary>
        public int MaxCapacity
        {
            get
            {
                return _maxCapacity;
            }
        }
        /// <summary>
        /// Gets a value that indicates whether the <see cref="ArraySegmentPool{T}"/> is resizeble.
        /// </summary>
        /// <returns></returns>
        public bool IsResizeble
        {
            get
            {
                return _autoResize;
            }
        }
        //TODO: maybe visible to ?
#if UT
        /// <summary>
        /// (Debug) Shows how many unsuccessful attempts were made before the <see cref="ArraySegment{T}"/> was taken.
        /// </summary>
        private long _failsCount;
        public long FailsCount
        {
            get
            {
                return _failsCount;
            }
        }
        /// <summary>
        /// (Debug)
        /// </summary>        
        public int LastRentedSegment
        {
            get
            {
                return _currentPool.LastRentedSegment;
            }
        }
        /// <summary>
        /// (Debug)
        /// </summary>        
        public int[] UnderlyingLayoutArray
        {
            get
            {
                return _currentPool.ArrayLayout;
            }
        }        
#endif
        /// <summary>
        /// Main storage of the <see cref="ArraySegmentPool{T}"/>.
        /// </summary>
        private class _pool
        {
            volatile public T[] Array; // Stores segments.
            volatile public int[] ArrayLayout; // Stores information about rented segments.(Interlocked not support byte)
            volatile public int Count; // Stores number of used segments.
            volatile public int LastRentedSegment; // DangerousRent using that value to find next free segment.
        }
        /// <summary>
        /// Constructs a new <see cref="ArraySegmentPool{T}"/> with fixed capacity size.
        /// </summary>
        /// <param name="maxArraySegmentLength">Maximum length of the <see cref="ArraySegment{T}"/></param>
        /// <param name="fixedCapacity">Maximum count of <see cref="ArraySegment{T}"/></param>            
        public ArraySegmentPool(int maxArraySegmentLength, int fixedCapacity) : this(maxArraySegmentLength, fixedCapacity, fixedCapacity, false) { }
        /// <summary>
        /// Constructs a new auto resizeble <see cref="ArraySegmentPool{T}"/>.
        /// </summary>
        /// <param name="maxArraySegmentLength">Maximum length of the <see cref="ArraySegment{T}"/></param>
        /// <param name="initialCapacity">Initial <see cref="ArraySegment{T}"/> count</param>
        /// <param name="maxCapacity">Maximum count of <see cref="ArraySegment{T}"/></param>
        public ArraySegmentPool(int maxArraySegmentLength, int initialCapacity, int maxCapacity) : this(maxArraySegmentLength, initialCapacity, maxCapacity, true) { }
        private ArraySegmentPool(int maxArraySegmentLength, int initialCapacity, int maxCapacity, bool autoResize)
        {
            if (maxArraySegmentLength < 1 | initialCapacity < 1 | maxCapacity < 1)
                throw new ArgumentOutOfRangeException("Arguments must be greater than 1");
            if (initialCapacity > maxCapacity)
                throw new ArgumentOutOfRangeException("InitialCapacity > MaxCapacity");
            if ((long)maxArraySegmentLength * maxCapacity > int.MaxValue)
                throw new OverflowException("MaxCapacity");
            _maxArraySegmentLength = maxArraySegmentLength;
            _maxCapacity = maxCapacity;
            _autoResize = autoResize;
            _currentPool = new _pool() { ArrayLayout = new int[initialCapacity], Array = new T[_maxArraySegmentLength * initialCapacity] };
        }
        /// <summary>
        /// (!) Dangerous. Gets an <see cref="ArraySegment{T}"/> of the default length. <see cref="ArraySegment{T}"/> must be returned via <see cref="Return"/> on the same <see cref="ArraySegmentPool{T}"/> instance to avoid memory leaks.
        /// </summary>
        /// <returns><see cref="ArraySegment{T}"/></returns>
        public ArraySegment<T> DangerousRent()
        {
            return DangerousRent(_maxArraySegmentLength);
        }
        /// <summary>
        /// (!) Dangerous. Gets an <see cref="ArraySegment{T}"/> of the custom length. <see cref="ArraySegment{T}"/> must be returned via <see cref="Return"/> on the same <see cref="ArraySegmentPool{T}"/> instance to avoid memory leaks.
        /// </summary>    
        /// <param name="length">Lenght of the rented <see cref="ArraySegment{T}"/>. Lenght must be equal or smaller than <see cref="defaultLength"/></param>
        /// <returns><see cref="ArraySegment{T}"/></returns>
        public ArraySegment<T> DangerousRent(int length)
        {
            if (length < 0 | length > _maxArraySegmentLength)
                throw new ArgumentOutOfRangeException("Length must be positive and smaller or equal than default length");
            _pool pool = _currentPool;
            //Get new resized pool if free segment not finded.
            if (pool.Count >= pool.ArrayLayout.Length)
                pool = GetNewPool(pool);
            //Try find free segment and ocupy.
            int position = pool.LastRentedSegment + 1;
            int searchCount = 0;
            do
            {
                if (position > pool.ArrayLayout.GetUpperBound(0))
                    position = 0;
                if (Threading.Interlocked.CompareExchange(ref pool.ArrayLayout[position], 1, 0) == 0)
                {
                    Threading.Interlocked.Increment(ref pool.Count);
                    pool.LastRentedSegment = position;
                    return new ArraySegment<T>(pool.Array, position * _maxArraySegmentLength, _maxArraySegmentLength).Slice(0, length);
                }
                //TODO: maybe visible to ?
#if UT
                Threading.Interlocked.Increment(ref _failsCount);
#endif
                position += 1;
                searchCount += 1;
                //That check prevent state, where thread will loop forever.
                if (searchCount == pool.ArrayLayout.Length)
                {
                    pool = GetNewPool(pool);
                    position = 0;
                    searchCount = 0;
                }
            }
            while (true);
        }
        /// <summary>
        /// Returns to the <see cref="ArraySegmentPool{T}"/> an <see cref="ArraySegment{T}"/> that was previously obtained via <see cref="DangerousRent"/> on the same <see cref="ArraySegmentPool{T}"/> instance.
        /// </summary>
        /// <param name="arraySegment"></param>
        public void Return(ref ArraySegment<T> arraySegment)
        {
            _pool pool = _currentPool;
            if (arraySegment.Array == pool.Array)
            {
                //return segment.
                int position = arraySegment.Offset / _maxArraySegmentLength;
                if (arraySegment.IsSlicedToEnd == true)
                    position--;
                if (Threading.Interlocked.Exchange(ref pool.ArrayLayout[position], 0) == 0)
                    throw new Exception("ArraySegment was returned already");
                Threading.Interlocked.Decrement(ref pool.Count);
            }
            arraySegment = ArraySegment<T>.Empty;
        }
        /// <summary>
        /// Sets the capacity of <see cref="ArraySegmentPool{T}"/> to the size of the used <see cref="ArraySegment{T}"/>. This method can be used to minimize a pool's memory overhead once it is known that no new <see cref = "ArraySegment{T}" /> will be added to the <see cref= "ArraySegmentPool{T}"/>.
        /// </summary>
        public void TrimExcess()
        {
            lock (_poolLock)
            {
                if (_autoResize == false)
                    throw new AccessViolationException("Can't trim while auto resize false");
                int count = _currentPool.Count;
                int new_layout_length = count > 1 ? count : 1;
                int new_length = new_layout_length * _maxArraySegmentLength;
                _currentPool = new _pool() { ArrayLayout = new int[new_layout_length], Array = new T[new_length] };
            }
        }
        /// <summary>
        /// Resize the <see cref="ArraySegmentPool{T}"/> and update the instance reference.
        /// </summary>
        /// <param name="pool"></param>
        /// <returns>New pool</returns>
        private _pool GetNewPool(_pool pool)
        {
            lock (_poolLock)
            {
                if (_autoResize == false)
                    throw new OverflowException("ArraySegmentPool size out of max capacity");
                //check if other thread already create new resized pool.
                if (pool != _currentPool)
                    return _currentPool;
                //check limits.
                if (pool.ArrayLayout.Length == _maxCapacity)
                    throw new OverflowException("ArraySegmentPool size out of max capacity");
                //create new resized pool and refresh current ref.
                int newLayoutLength = pool.ArrayLayout.Length * 2L < _maxCapacity ? pool.ArrayLayout.Length * 2 : _maxCapacity;
                int newLength = _maxArraySegmentLength * newLayoutLength;
                _currentPool = new _pool() { ArrayLayout = new int[newLayoutLength], Array = new T[newLength] };
                //return new pool.
                return _currentPool;
            }
        }
    }
}

source.zip

Proposed test

https://github.com/wizardars/test

@wizardars wizardars added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Aug 23, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Aug 23, 2020
@Dotnet-GitSync-Bot
Copy link
Collaborator

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@ghost
Copy link

ghost commented Aug 23, 2020

Tagging subscribers to this area: @tannergooding, @pgovind
See info in area-owners.md if you want to be subscribed.

@davidfowl
Copy link
Member

davidfowl commented Aug 23, 2020

@wizardars you need to propose an API. Here's an example of an API proposal #41118 or #41191.

Also ArraySegment<T> is basically legacy at this point, replaced by Memory<T>

@GrabYourPitchforks
Copy link
Member

Also ArraySegment<T> is basically legacy at this point, replaced by Memory<T>

Eh, I wouldn't go quite that far. It's true that we're adding Memory<T> throughout our async API surface, but that's because Memory<T> is a more useful abstraction for most of those scenarios. I'd still use ArraySegment<T> over Memory<T> for scenarios where we know we're dealing with arrays and don't need all of the complexity that Memory<T> brings along. I see the type having significant life left in it.

@davidfowl
Copy link
Member

davidfowl commented Aug 24, 2020

I would never use ArraySegment<T> in a new API. It's leaky abstraction at best and hardly anything uses it today in the BCL (good think it's convertible to Memory<T> and Span<T>)

@wizardars
Copy link
Author

@davidfowl We do not have a template for the proposed implementations, so I improvised.
Span and Memory do not replace a ArraySegment, they are just different. With a ArraySegment, you have access to the original array. Yes of course you're right, don't use it. This is a silver bullet. For critical scenarios only. You can easily shoot yourself in the leg. You always have to pay for speed. The market dictates the conditions, he will not be satisfied with 400 Mbps, he wants 1000.

@davidfowl
Copy link
Member

You don't need the implementation until the issue just the public API surface. Memory is a replacement for ArraySegment and you can get at the Array (if it happens to be backed by one).

Can you explain why this API is needed in the BCL? Why can't this just exist in your project?

@FiniteReality
Copy link

Since ArraySegment just wraps an array, couldn't you just use ArrayPool and wrap the array in an ArraySegment?

@Clockwork-Muse
Copy link
Contributor

Your tests don't demonstrate that using a pool yields a benefit (no comparison with no pool). The preferred way to set these up is to use something like Benchmarkdotnet (which is what most of the code here uses).

Also, none of this code actually pools ArraySegment<whatever>s - you're just pooling an array (silently). Pooling ArraySegment itself is problematic anyways, given the fact that it's a struct.

@wizardars
Copy link
Author

@davidfowl I think the reason is the same as for the ArrayPool and other tools.
And to be specific:

  • I see no alternatives.
  • Tested in production and showed as an indispensable tool.
  • It is well designed internally. (Linear reading in hot path).
  • It is well designed for the user. (Embodies the old .net style and is understandable as a generic list, punishes for negligence).
  • It is as dangerous as an ArraySegment, but it makes it clear. (Yes, it shoots at the knees but there are plenty of places in .Net where you can shoot yourself as well and I think most developers are aware that ArraySegments are better used inside the abstraction than outside).
  • I assume that it will be popular in some scenarios with high load, more than a ArrayPool.

@wizardars
Copy link
Author

wizardars commented Aug 25, 2020

@FiniteReality I think not. ArrayPool and ArraySegmentPool are completely different. ArrayPool for arrays and ArraySegmentPool for array segments.

@wizardars
Copy link
Author

wizardars commented Aug 25, 2020

@Clockwork-Muse ArraySegmentPool is not an improved version of ArrayPool, it is a completely different tool. I don't know how to compare ArraySegment without the pool, because ArraySegment it's just a segment of array. So w/o array segment can't exist. Actually, an array is a pool for a segment, but without the convenient management that a ArraySegmentPool provides.

@Clockwork-Muse
Copy link
Contributor

@wizardars -
What @davidfowl is getting at is "does this solve a problem in the BCL?" or "is this covering an extremely common pain point not handled (or not handled well) by an existing library?".

Taking that list:

// Lines crossed out are not relevant
I see no alternatives.
Tested in production and showed as an indispensable tool.
It is well designed internally. (Linear reading in hot path).
It is well designed for the user. (Embodies the old .net style and is understandable as a generic list, punishes for negligence).
It is as dangerous as an ArraySegment, but it makes it clear. (Yes, it shoots at the knees but there are plenty of places in .Net where you can shoot yourself as well and I think most developers are aware that ArraySegments are better used inside the abstraction than outside).
I assume that it will be popular in some scenarios with high load, more than a ArrayPool.

There aren't many (any?) alternatives, because people were probably pooling the arrays (even before ArrayPool - some uses might only need a single array for the entire program lifetime!), because that's a more general solution. Because if you happened to need a segment, getting one was trivial, once you had the array:

var pooled = GetArrayFromPool();
var segment = new ArraySegment(pooled);

.... and once you had that segment, most of the code down that stack would be focused on working with that single segment by slicing it, not needing to go back to a pool (contrast with working with raw arrays, where you should be hitting a pool each time, or stackalloc-ing).

And that's the problem, actually. Across API boundaries, you should be using Span/Memory. Inside an API, if you're not stackalloc-ing (and thus probably using Span), you're probably going to be in a single call stack, and so you're usually going to only need the one base array anyways, like I showed earlier.

@wizardars
Copy link
Author

@Clockwork-Muse Thank you for the discussion, but I did not understand which scenario you are describing and what exactly you think about adding a pool to the BCL.
To make the conversation more productive, you'd better answer the next question.
Are you in favor of adding or not? If not why?

@Clockwork-Muse
Copy link
Contributor

I am not in favor of adding this pool, because it doesn't provide a significant enhancement to what was possible before. At best, this pool turns two lines into one, and that rarely. If I was interfacing with APIs that took ArraySegment, I'd be using the existing ArrayPool (if not something stackalloced), because wrapping one in a segment is trivial. More recent APIs are taking Span/Memory, though, so there's diminishing call for it.

@wizardars
Copy link
Author

@Clockwork-Muse Thank you for doing such a job and comparing the speed. I will be extremely grateful to you if you attach your test here).

@Clockwork-Muse
Copy link
Contributor

.... what exactly are you expecting me to compare here? Using ArrayPool vs ArraySegmentPool? That's your job; you're the one that has to show that your proposed API has a net benefit, possibly including things like speed and memory pressure, over what is currently possible.
The current performance tests you've provided don't do anything like that, they just show "what using this pool costs". And at that, not even in relation to anything like doing it without the pool at all - you haven't shown that your proposal is an actual benefit over just allocating new arrays each call. (Mind, I assume it probably is, but you haven't actually demonstrated that at all)

This file has the performance tests for ArrayPool. If you rewrite your performance tests to look like this and run both sets, how do they compare?

@teo-tsirpanis
Copy link
Contributor

teo-tsirpanis commented Aug 26, 2020

I think that a better solution would be to extend the already existing MemoryPool class into something more than a thin layer over ArrayPool.

Such efficient implementation that could be used for MemoryPool.Shared would essentially require a full-blown memory manager with free lists, on top of the managed garbage collector.

Because such a managed manual memory manager seems meaningless and inefficient, value types with no references could use unmanaged memory with AllocHGlobal to fully harness the benefits of manual memory management (and to automatically pin the memory as well) and the other types could fall back to the existing implementation.

User-created memory pools can use a fixed-size implementation like the one @wizardars proposed. Two different implementations also exist between the shared and custom array pools.

@Clockwork-Muse
Copy link
Contributor

I think that a better solution would be to extend the already existing MemoryPool class into something more than a thin layer over ArrayPool.

The class is abstract, so the intention is that other types can be created. Including something that interfaces with unmanaged memory (... at which point it becomes managed by the implementing class....)

In the majority case, though, just using the default memory/array pool is going to be way more than sufficient (and may perform better in some scenarios), because an array is going to take (almost) the same amount of space both managed and "unmanaged". There may be cases that need to be working over native/device memory, and want to pool it (say, GPU buffers), but these are either going to be one-offs unique to that library, or a common library used by an extremely small subset of all Nuget packages. That is, not something we would design/build.

@pgovind pgovind removed the untriaged New issue has not been triaged by the area owner label Sep 14, 2020
@pgovind pgovind added this to the Future milestone Sep 14, 2020
@stephentoub
Copy link
Member

Thanks for the suggestion. We're not going to add a pool dedicated to ArraySegment. ArrayPool suffices for that, since a segment is just a tuple of an array, offset, and length.

@stephentoub stephentoub closed this as not planned Won't fix, can't repro, duplicate, stale Jan 14, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Feb 13, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Buffers
Projects
None yet
Development

No branches or pull requests

10 participants