Proposal: using ArrayPool<byte> in MemoryStream with new APIs #22428
Comments
BTW, I have recently added a note to the documentation of |
Would it be less dangerous to return ReadOnlySpan in another overload of TryGetBuffer? |
A little maybe, but many of the same problems apply, e.g.
stream.TryGetBuffer(out ArraySegment<byte> buffer);
stream.Write(...);
Use(buffer);
vs
stream.TryGetBuffer(out ReadOnlySpan<byte> buffer);
stream.Write(...);
Use(buffer);
In both cases, the stream may have replaced the array that the earlier call handed out. Plus, many of the cases for which TryGetBuffer is used really need access to the array. |
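The hazard described in this comment can be reproduced with today's MemoryStream. The following is a minimal sketch, not code from the thread: the class and method names are hypothetical, and it assumes the default constructor's behavior of growing by allocating a new array when a write exceeds the current capacity.

```csharp
using System;
using System.IO;

// Hypothetical demo type, for illustration only.
public static class StaleBufferDemo
{
    // Returns true if the array obtained before the second write is still
    // the stream's backing array afterwards.
    public static bool SegmentStillCurrentAfterGrowth()
    {
        var stream = new MemoryStream();   // expandable, buffer is exposable
        stream.Write(new byte[16], 0, 16);
        stream.TryGetBuffer(out ArraySegment<byte> before);

        // Writing far past the current capacity forces the stream to
        // allocate a larger array and copy the old contents into it.
        stream.Write(new byte[4096], 0, 4096);
        stream.TryGetBuffer(out ArraySegment<byte> after);

        return ReferenceEquals(before.Array, after.Array);
    }
}
```

Calling `StaleBufferDemo.SegmentStillCurrentAfterGrowth()` returns `false`: the segment captured before the large write refers to an array the stream no longer uses. With a plain MemoryStream that stale array is merely outdated; with a pooled array it could already belong to another renter.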
We use |
@svick If a particular class that implements |
How about having a |
@Timovzl Are you saying you I think that in the cases where |
Just adding another voice to this, given the widespread use of re: Disposing memory streams: There's so much tooling, helpers, VS Add-Ins, guidance etc that all falls to a pit of 'always dispose your disposables'. We treat any undisposed (especially I/O) class as a red flag in code reviews, and feel dirty when forced to hand such things off with no concrete confirmation of cleanup ( MVC's FileResult() is a great example there) |
@svick I uh... always use Still, I maintain my position, exceptions notwithstanding. As this very issue demonstrates, allowing our class documentation to say, "hey, you may use this class without dispose, just for laziness", puts us in this awkward position when the implementation can no longer support that liberty. Even if disposing is not necessary, why share this in the documentation at all? Why let people skip dispose on some Stream types? Just have them disposed as usual, even where it was not strictly necessary. And then we can keep the exceptions minimal, such as for |
@markrendle , if you need MemoryStreamPool - you can just use Looks like it has all that you want:
|
@imanushin |
Maybe something that could be considered here also. |
Using chunks internally seems inconsistent with supporting GetBuffer(). I do think using chunks internally is interesting to consider, but I'm not sure how to square this with the GetBuffer() API; seems like we might need a separate type for this. |
It gets a new larger contiguous buffer if necessary to present a single buffer with all content: The only problem there is that the buffer might be from the pool, and now you have a second potential owner of said buffer. Would be safer to just treat GetBuffer() like ToArray() with a fresh non-pooled array. |
In that case, at the very least an array from the array pool could be returned. That leaves the user with the option of returning it, making the code mostly allocation-free. As for users unaware of this... renting an array from the array pool and not returning it has no downsides over just allocating a new array, right? |
Good point, as long as the MemoryStream doesn't return it when Disposed in that case. That's the tricky part of it... the buffer can't be owned by both the MemoryStream and the consumer because it's fine if neither returns it to the pool but bad if one returns it and the other continues using it. |
Agreed. Once you call This does require some attention to the single-block case too. Do we implement a special case to relinquish ownership of the single block? That seems hard to keep track of. We might need to switch to "large buffers" as soon as Any clever ideas? It would be so nice if the single-block case could stay copy-free... |
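One way to picture "relinquishing ownership" is a holder that either returns its rented array on dispose or hands it to the caller, never both. This is only a sketch under assumptions: `PooledBuffer`, `Detach`, and `CountingPool` are hypothetical names invented here, and `CountingPool` exists only to make the number of returns observable.

```csharp
using System;
using System.Buffers;

// Hypothetical pool wrapper that counts returns so the demo is observable.
public sealed class CountingPool : ArrayPool<byte>
{
    public int Returned { get; private set; }

    public override byte[] Rent(int minimumLength) =>
        ArrayPool<byte>.Shared.Rent(minimumLength);

    public override void Return(byte[] array, bool clearArray = false)
    {
        Returned++;
        ArrayPool<byte>.Shared.Return(array, clearArray);
    }
}

// Hypothetical single-owner holder for one rented array.
public sealed class PooledBuffer : IDisposable
{
    private readonly ArrayPool<byte> _pool;
    private byte[]? _array;

    public PooledBuffer(ArrayPool<byte> pool, int size)
    {
        _pool = pool;
        _array = pool.Rent(size);
    }

    // Hands the array to the caller. After this call the holder no longer
    // owns it, so Dispose will not return it to the pool.
    public byte[] Detach()
    {
        byte[] array = _array ?? throw new ObjectDisposedException(nameof(PooledBuffer));
        _array = null;
        return array;
    }

    public void Dispose()
    {
        if (_array is { } array)
        {
            _array = null;
            _pool.Return(array);
        }
    }
}

public static class OwnershipDemo
{
    // Disposes one holder normally and one after detaching; returns the
    // number of arrays that went back to the pool.
    public static int CountReturns()
    {
        var pool = new CountingPool();
        using (var kept = new PooledBuffer(pool, 1024)) { }
        using (var detached = new PooledBuffer(pool, 1024))
        {
            byte[] mine = detached.Detach(); // caller may return it later, or not
        }
        return pool.Returned;
    }
}
```

`OwnershipDemo.CountReturns()` yields 1: only the holder that kept ownership returned its array. The sketch sidesteps the hard part raised in the thread, which is tracking this state inside a stream that may rent a new array on a later write.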
public MemoryStream (byte[] buffer, int index, int count, bool writable, bool publiclyVisible);
public MemoryStream(ArrayPool<byte> pool, bool publiclyVisible) When
|
The more I've thought about this, the more I think it needs to be done separately from MemoryStream:
|
Also the GetBuffer() issues mentioned above. |
Actually, with the above discussion about this, the intent was that the returned array becomes the stream's own buffer (at least until you write to the stream), so changing the data certainly works. Hopefully the discussion makes more sense with that in mind.
I imagine a less ideal
That would be the easiest. It is a bit disappointing that the simple case of a single chunk would then involve copying, so I'm still holding out hopes that we can come up with something more clever. But it might be acceptable. |
I think a separate, more suitable API would certainly be worth pursuing. I'm wondering, when we wish to supply a low-allocation implementation, in how many cases we must pass a If a I'm not fond of the idea of honoring API misuse. If you neglect to dispose an If a type unrelated to Admittedly, as for not disposing, the documentation says that's allowed, so that needs to be addressed. (Strange implementation detail to mention in the documentation, but I digress.) If we can make sure that not disposing merely degenerates to the equivalent of using non-pooled streams (analogous to not returning arrays rented from an array pool), then this is a non-issue, right? Potential gains, with no potential losses. |
Arrays from the pool are more "valuable" than others, as they're much more likely to have been around for longer, be in higher generations, etc. There's also been discussion about using pinned/aligned arrays in the pool. So taking arrays from the pool and not returning them is generally worse on a system than allocating new ones, degrading other unrelated components using the pool.
Not disposing a MemoryStream is not misuse, nor is using ToArray after disposal. Concurrent use is misuse, but at the same time, MemoryStream is used by millions of libraries and applications; there is guaranteed to be misuse. And we can't afford to introduce changes into such a type that could introduce a myriad of difficult-to-diagnose race conditions in entirely unrelated parts of the app, e.g. misuse of a MemoryStream over here causes sensitive data to be leaked to an http response over there. We need to be cognizant of such issues for new APIs, but it's a much bigger deal for existing ones, especially ones as core as MemoryStream. |
I did not realize! That does limit our options... and supports your proposition to not inherit from I'm still very curious if we even have examples where inheriting from |
To my knowledge there aren't any public APIs in the core libraries strongly-typed to accept a MemoryStream. No doubt there are cases outside of the core libraries. |
@stephentoub Would this possibly fall under Up For Grabs? Or have you made some traction internally? I think ArrayPoolStream would work nicely, and would add a little creative breathing room without the need for backwards compatibility. It could also operate as a MemoryStream drop-in replacement for a large number of use cases (those not misusing the Stream) to begin with. |
@houseofcat, sorry for my prolonged delay in responding; I missed the GitHub notification and only just saw the response. It's up for grabs, but at this point that's about coming up with a design proposal rather than actually submitting a PR with the implementation. |
Weighing the value of this against other Stream-related work, we've concluded that we will not pursue bringing this feature into .NET 8. We do expect to bring this feature into dotnet/runtime in the future, but for the time being we encourage usage of the RecyclableMemoryStream implementation. If folks encounter blockers for using RecyclableMemoryStream, the .NET team would consider contributing to that library to unblock those scenarios. |
When not constructed with a specific byte[], MemoryStream allocates byte[]s every time it needs to grow. It would be tempting to just change the implementation to use ArrayPool<byte>.Shared.Rent to get that array, but this is problematic for a few reasons, mostly to do with existing code:
ArrayPool<byte>, it's important to release the currently used buffer back to the pool when the stream is disposed. And it would be expensive to make MemoryStream finalizable to deal with this.
One solution would just be to introduce a new Stream type, e.g.
That has its own downsides, though. In particular, there's something nice about this just being a part of MemoryStream, being easily used in existing code that uses MemoryStreams, etc. Another option would be to just plumb this into MemoryStream, but in an opt-in manner, and accept some of the resulting limitations because they're opt-in:
public MemoryStream(ArrayPool<byte> pool).
I'm leaning towards the second option: just add the new MemoryStream(ArrayPool<byte>) ctor, and allow TryGetBuffer to work; devs just need to know that when they create the stream with this ctor, they should dispose the stream, and they shouldn't use the array retrieved from TryGetBuffer after doing something else to the stream.
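For comparison, the first option (a separate Stream type) might look roughly like the sketch below. It is not the API proposed in this issue and not part of .NET: `PooledWriteStream` is a hypothetical name, the type is write-only to stay short, it never exposes the rented array, and it does not guard against integer overflow on very large writes.

```csharp
using System;
using System.Buffers;
using System.IO;

// Hypothetical type: a write-only, growable stream whose backing array is
// rented from an ArrayPool<byte> and returned when the stream is disposed.
public sealed class PooledWriteStream : Stream
{
    private readonly ArrayPool<byte> _pool;
    private byte[] _buffer;
    private int _length;
    private bool _disposed;

    public PooledWriteStream(ArrayPool<byte> pool, int initialCapacity = 256)
    {
        _pool = pool ?? throw new ArgumentNullException(nameof(pool));
        _buffer = pool.Rent(initialCapacity);
    }

    public override bool CanRead => false;
    public override bool CanSeek => false;
    public override bool CanWrite => !_disposed;
    public override long Length => _length;
    public override long Position
    {
        get => _length;
        set => throw new NotSupportedException();
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        if (_disposed) throw new ObjectDisposedException(nameof(PooledWriteStream));
        if (buffer is null) throw new ArgumentNullException(nameof(buffer));
        if (offset < 0 || count < 0 || count > buffer.Length - offset)
            throw new ArgumentOutOfRangeException(nameof(count));

        EnsureCapacity(_length + count);
        Buffer.BlockCopy(buffer, offset, _buffer, _length, count);
        _length += count;
    }

    // Copies the content out, so callers never hold the pooled array.
    public byte[] ToArray()
    {
        if (_disposed) throw new ObjectDisposedException(nameof(PooledWriteStream));
        var copy = new byte[_length];
        Buffer.BlockCopy(_buffer, 0, copy, 0, _length);
        return copy;
    }

    private void EnsureCapacity(int required)
    {
        if (required <= _buffer.Length) return;

        // Rent a larger array, copy, and give the old one back right away.
        byte[] larger = _pool.Rent(Math.Max(required, _buffer.Length * 2));
        Buffer.BlockCopy(_buffer, 0, larger, 0, _length);
        _pool.Return(_buffer);
        _buffer = larger;
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing && !_disposed)
        {
            _disposed = true;
            _pool.Return(_buffer);
            _buffer = Array.Empty<byte>();
        }
        base.Dispose(disposing);
    }

    public override void Flush() { }
    public override int Read(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
}
```

A caller would write `using var s = new PooledWriteStream(ArrayPool<byte>.Shared);`, write to it, and call `ToArray()` before disposal. Because the type has no finalizer, a stream that is never disposed simply never returns its array, and because `ToArray()` copies, it avoids the TryGetBuffer ownership questions at the cost of an extra allocation.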