Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CollectionsMarshal support for the Stack<T> collection #82999

Open
theodorzoulias opened this issue Mar 5, 2023 · 8 comments
Open

Add CollectionsMarshal support for the Stack<T> collection #82999

theodorzoulias opened this issue Mar 5, 2023 · 8 comments

Comments

@theodorzoulias
Copy link
Contributor

theodorzoulias commented Mar 5, 2023

Background and motivation

I am coming from the API proposal #15922 about allowing random access in the Stack<T> (and Queue<T>) collection, after having experienced recently myself a lack of features in the Stack<T> collection, that induced me to use reluctantly a List<T> instead. I would like to suggest giving to the Stack<T> collection the same CollectionsMarshal support that the List<T> enjoys, in order to make it a more flexible and powerful collection: the AsSpan and SetCount APIs.

Specific scenario: A Stack<T> is the collection of choice when someone wants to transform a recursive algorithm to an iterative algorithm, for example traversing a tree. It is desirable that the children of each node are pushed in the stack in the reverse order, so that the first child is popped first. The Stack<T> offers no way to push a sequence of items in the reverse order, without materializing the sequence, causing multiple array allocations along the way. If the CollectionsMarshal.AsSpan was available, someone could easily implement an efficient PushReverseRange extension method for the Stack<T> collection, instead of being forced to switch to a List<T>, in the detriment of the readability of their code (List<T> stack = new(); is semantically incorrect).

API Proposal

namespace System.Runtime.InteropServices
{
    public static partial class CollectionsMarshal
    {
        public static Span<T> AsSpan<T> (Stack<T> stack);
        public static void SetCount<T>(Stack<T> stack, int count);
    }    
}

API Usage

Below is the aforementioned PushReverseRange extension method, that pushes an enumerable sequence in the Stack<T> in reverse order, with minimal allocations. The items are inserted in normal order, and then they are reversed directly on the backing array of the Stack<T>. In case of an exception during the enumeration of the items, the original Count of the stack is restored with the CollectionsMarshal.SetCount API:

public static void PushReverseRange<T>(this Stack<T> source, IEnumerable<T> items)
{
    int originalCount = source.Count;
    try
    {
        foreach (T item in items)
            source.Push(item);
    }
    catch
    {
        CollectionsMarshal.SetCount(source, originalCount); throw;
    }

    Span<T> span = CollectionsMarshal.AsSpan(source);
    span.Slice(originalCount, source.Count - originalCount).Reverse();
}

Alternative Designs

Another option could be to add directly in the Stack<T> class a couple of APIs, intended to optimize common scenarios that currently require intermediate buffers:

namespace System.Collections.Generic
{
    public class Stack<T>
    {
        public Stack<T>.ReverseItems ReverseItems { get; }
        public void PushReverseRange<T>(IEnumerable<T> items);
    }    
}

Cloning a Stack<T>, a scenario mentioned below by @eiriktsarpalis, could be accomplished with either of these two APIs:

Stack<T> clone = new(original.ReverseItems);
Stack<T> clone = new();
clone.PushReverseRange(original);

Risks

Exposing the backing storage of the Stack<T> will reduce the freedom of optimizing the internal structure of the collection in the future, for example switching from the array to a linked-list.

Labels: area-System.Collections / api-suggestion.

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Mar 5, 2023
@ghost
Copy link

ghost commented Mar 5, 2023

Tagging subscribers to this area: @dotnet/area-system-collections
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and motivation

I am coming from the API proposal #15922 about allowing random access in the Stack<T> (and Queue<T>) collection, after having experienced recently myself a lack of features in the Stack<T> collection, that induced me to use reluctantly a List<T> instead. I would like to suggest giving to the Stack<T> collection the same CollectionsMarshal support that the List<T> enjoys, in order to make it a more flexible and powerful collection: the AsSpan and SetCount APIs.

Specific scenario: A Stack<T> is the collection of choice when someone wants to transform a recursive algorithm to an iterative algorithm, for example traversing a tree. It is desirable that the children of each node are pushed in the stack in the reverse order, so that the first child is popped first. The Stack<T> offers no way to push a sequence of items in the reverse order, without materializing the sequence, causing multiple array allocations along the way. If the CollectionsMarshal.AsSpan was available, someone could easily implement an efficient PushReverseRange extension method for the Stack<T> collection, instead of being forced to switch to a List<T>, in the detriment of the readability of their code (List<T> stack = new(); is semantically incorrect).

API Proposal

namespace System.Runtime.InteropServices
{
    public static partial class CollectionsMarshal
    {
        public static Span<T> AsSpan<T> (Stack<T> stack);
        public static void SetCount<T>(Stack<T> stack, int count);
    }    
}

API Usage

Below is the aforementioned PushReverseRange extension method, that pushes an enumerable sequence in the Stack<T> in reverse order, with minimal allocations. The items are inserted in normal order, and then they are reversed directly on the backing array of the Stack<T>. In case of an exception during the enumeration of the items, the original Count of the stack is restored with the CollectionsMarshal.SetCount API:

public static void PushReverseRange<T>(this Stack<T> source, IEnumerable<T> items)
{
    int originalCount = source.Count;
    try
    {
        foreach (T item in items)
            source.Push(item);
    }
    catch
    {
        CollectionsMarshal.SetCount(source, originalCount); throw;
    }

    Span<T> span = CollectionsMarshal.AsSpan(source);
    span.Slice(originalCount, source.Count - originalCount).Reverse();
}

Alternative Designs

No response

Risks

Exposing the backing storage of the Stack<T> will reduce the freedom of optimizing the internal structure of the collection in the future, for example switching from the array to a linked-list.

Labels: area-System.Collections / api-suggestion.

Author: theodorzoulias
Assignees: -
Labels:

area-System.Collections

Milestone: -

@eiriktsarpalis
Copy link
Member

I've often had need for something similar to "PushReverseRange" as currently there is no good way to clone a stack without using an intermediate buffer. Perhaps though having a set of methods in CollectionMarshal is overkill when trying to solve this problem? Could a new instance method be more helpful here?

@theodorzoulias
Copy link
Contributor Author

theodorzoulias commented Mar 6, 2023

@eiriktsarpalis having the PushReverseRange available as a public member of the Stack<T> would be great, but I am hesitant to make such proposals because they are usually rejected for reasons related with the impact on the static footprint of applications. Exposing the backing storage of the Stack<T> through the CollectionsMarshal has other risks, but at least AFAIK it has no such impact!

Another scenario where the CollectionsMarshal.AsSpan could be handy is for enumerating a Stack<T> in reverse order. In the same algorithm where I needed the PushReverseRange, I also needed to split occasionally a stack to multiple smaller stacks, and enumerating the original stack in reverse order (starting from the bottom of the stack) made the implementation simpler and less mind-bending. If I wanted to go crazy with avoiding allocations, I could also preserve the original stack as one of the resulting stacks (instead of discarding it), by rearranging it internally in a way similar with the List<T>.RemoveAll implementation. With the AsSpan and SetCount APIs, I could do with my stacks anything that I could imagine. Even things too clever for my own good!

@tannergooding
Copy link
Member

Perhaps though having a set of methods in CollectionMarshal is overkill when trying to solve this problem

I think the general point of things like CollectionMarshal is to provide a couple of primitive APIs that allow broader or more general APIs to be built by the community without needing to come back and ask for a new instance method each time.

Notably, we have to consider whether such changes "block" our ability to change or otherwise optimize the data structure later on.

For example, List<T> is always going to be a linear list.

For Stack<T> we currently implement it effectively as a linear list of items and I expect its incredibly unlikely we do any kind of specialization to the layout so these APIs should be fine/safe. However, since its in general FILO without arbitrary indexing/lookup, there is in general some things which are possible to do and which such CollectionMarshal APIs would fully lock out.

For Queue<T>, it's implemented using a circular buffer that grows when the entire thing becomes full and so we couldn't easily expose such CollectionMarshal APIs

@Joe4evr
Copy link
Contributor

Joe4evr commented Mar 7, 2023

It is desirable that the children of each node are pushed in the stack in the reverse order, so that the first child is popped first. The Stack<T> offers no way to push a sequence of items in the reverse order, without materializing the sequence, causing multiple array allocations along the way.

Sounds like you should be using Queue<T> here instead.

@eiriktsarpalis
Copy link
Member

My point is that unlike List<T>, I'm questioning whether there is real need for a CollectionsMarshal API for stack, even assuming that the underlying representation is never going to change.

That being said, @theodorzoulias's use case is definitely there, and cannot be addressed currently without using an intermediate buffer. See also #83086. I think any static footprint concerns that have prevented us from adding a number of instance methods to List do not apply here: Stack isn't nearly as ubiquitous as List.

@theodorzoulias
Copy link
Contributor Author

Sounds like you should be using Queue<T> here instead.

@Joe4evr no, definitely not. Traversing a tree using a Queue<T> results in a ton of nodes being stored in the queue during the traversal. At some moment half of the total number of nodes might be inside the queue. Using a Stack<T> results in a much smaller max Count for the buffer. With a queue you get a breadth-first traversal, and with a stack a depth-first traversal.

@eiriktsarpalis
Copy link
Member

Cloning a Stack<T>, a scenario mentioned below by @eiriktsarpalis, could be accomplished with either of these two APIs:

Note that the proposed ReverseItems property works only works when you have the original stack at hand, which is not the case in many scenaria (e.g. roundtrippping serializations).

@eiriktsarpalis eiriktsarpalis removed the untriaged New issue has not been triaged by the area owner label Mar 8, 2023
@eiriktsarpalis eiriktsarpalis added this to the Future milestone Mar 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants