-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Provide a source generator for Stream that emits the boilerplate for most members and enforces using the most-efficient members #79839
Comments
I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label. |
Tagging subscribers to this area: @dotnet/area-system-io Issue DetailsBased on #79261 (comment), there's the possibility of using a source generator to guide developers to implement Stream correctly i.e: use span-based and task-based APIs instead of byte-array overloads and APM (Begin*/End*). API Proposalnamespace System.IO
{
/// <summary>
/// Instructs the System.IO.Stream source generator to generate boilerplate implementations for multiple legacy members, leaving the implementer only with the task of providing implementations for the core operations.
/// </summary>
[AttributeUsage(AttributeTargets.Class, AllowMultiple = false, Inherited = false)]
public sealed class GenerateStreamBoilerplateAttribute : Attribute
{
public bool CanRead { get; set; } = true;
public bool CanSeek { get; set; } = true;
public bool CanWrite { get; set; } = true;
}
} API UsageThis new source generator most likely will ship as an OOB NuGet package. Once installed, you opt-in into boilerplate generation by annotating any partial Stream implementation with [GenerateStreamBoilerplate]
public partial class MyStream : Stream
{
public override long Length
{
get { throw new NotImplementedException(); }
}
public override long Position
{
get { throw new NotImplementedException(); }
set { throw new NotImplementedException(); }
}
public override void Flush()
{
throw new NotImplementedException();
}
private partial int ReadCore(Span<byte> buffer)
{
// Do Read
}
private partial ValueTask<int> ReadCoreAsync(Memory<byte> buffer, CancellationToken cancellationToken)
{
// Do ReadAsync
}
private partial long SeekCore(long offset, SeekOrigin origin)
{
// Do Seek
}
private partial void SetLengthCore(long value)
{
// Do SetLength
}
private partial void WriteCore(ReadOnlySpan<byte> buffer)
{
// Do Write
}
private partial ValueTask WriteCoreAsync(ReadOnlyMemory<byte> buffer, CancellationToken cancellationToken)
{
// Do WriteAsync
}
} Notes:If you opt into boilerplate generation, you will get most methods sourcegen'd and you will be asked to provide bodies for You can specify which member your stream supports by specifying If one operation is specified as unsupported or if you override all the methods related to one kind of operation, e.g: Alternative DesignsBe more aggressive on generation and emit boilerplate for all Stream-derived classes with a partial modifier.We can relax the filter and encompass more types if we decide that any Stream-derived type would be a good candidate for boilerplate generation. This would also mean that we don't need an attribute for opting-in but we may want to offer one for opting-out. Provide analyzers+fixers instead.We already have an analyzer+fixer for preferring Memory overloads for Stream.Read/WriteAsync (#33790). NotesTaskToApm is not public API so we will need to emit the helpers with the source generator. As suggested in #79261 (comment), I considered adding an RisksN/A Thanks to @madelson, @teo-tsirpanis for their suggestions in #79261.
|
I think the codegen for
In addition, I think it is worth considering whether These changes would lower the implementation burden for implementers (2 get/set properties instead of 1 get property, 1 get/set property, and 2 methods). |
@madelson you mean to have LengthCore as a partial property? That is not yet supported on C#, see dotnet/csharplang#6420. Alternatively what we can do is:
That will make it 1 get property (Length) and 3 methods (SeekCore, GetPosition, and SetLengthCore). |
But I presume that |
Length is not always available for a seekable stream. Some file systems like |
@jozkee good call. Guess this would have to be separate getter and setter methods if we went with my approach instead.
This is an interesting approach. Technically I think we could avoid The upside is eliminating a redundant method. The downside is that it implicitly wants this seek edge-case to be implemented efficiently as a position retrieval, which users may not consider. I suppose that if we keep On balance, I'd probably lean towards having fewer methods to implement. Thoughts?
@adamsitnik thanks I was not aware of this edge case. However, since
I'm not saying that this would be a good default implementation of If we need to support |
2 additional ideas: AsyncI think we should add a property like this to the attribute:
If this property is set to If CanTimeoutI think we should add a property like this to the attribute:
This would influence the codegen for the CanTimeout property and Read/WriteTimeout properties (forcing you to implement vs. throwing, checking CanRead/Write, checking disposal, and validating arguments). It would still be up to users to incorporate the values of the read/write timeout properties into their read/write implementations. The main benefit here is clarity for users. With this addition, we can entirely eliminate optional virtual method sets in favor of required implementations. This gives the user a guarantee that the are providing only and all the necessary implementations for their desired set of capabilities at compile time. |
@madelson thanks for the input, sorry for the delay.
Instead, we could make that the generator emit partials for FlushAsyncCore and DisposeAsyncCore based on a hint, say, ReadAsync or WriteAsync are implemented by the user.
I agree that adding CanTimeout is good, but I'm concerned with it being false by default. It will be practically non-existent, unless the user discovers that they can set it on the Attribute ctor. Will include it in the proposal and see if there's more input against it. |
@stephentoub please take a look. |
@jozkee can you explain what this means in practice? I'd like it to be the case that if I decide to implement async IO then I need to provide implementation of FlushAsync/DisposeAsync and otherwise I should ignore them. Do you agree?
That's fair, but FWIW, I think that if we can eliminate optional virtual methods from the equation that will be a really nice boon for implementers. It is much easier to implement an abstract class when the compiler tells you exactly what set of methods you MUST implement. |
I provided feedback to @jozkee offline, but I'll summarize here. We're going to have to make tradeoffs here between how flexible we are and how prescriptive we are. For me, I'd start by erring on the side of flexibility with the goal of reducing as much boilerplate as possible. Summarizing, I'd like to see us start with an approach where the developer writes only the minimal amount they need to do what they want and the generator fills in the rest as best as possible, while also issuing diagnostics about opportunities a developer has to do better. For example, if a developer just writes: [GeneratedStream]
public class MyStream
{
public int Read(byte[] buffer, int offset, int count) { ... }
} then the generator would generate a CanRead that returns true, CanWrite and CanSeek that return false, and throwing implementations for all the Write methods. It would also issue an info-level diagnostic that without a span-based Read implementation, various inefficiencies result. Or for example, if a developer just writes: [GeneratedStream]
public class MyStream
{
public ValueTask WriteAsync(ReadOnlyMemory<byte> source, CancellationToken cancellationToken) { ... }
} CanWrite would be generated to return true, CanRead and CanSeek to return false, and all of the Read methods would be generated to throw. Most interestingly, the Write(byte[] buffer, ...) method would be generated as a sync-over-async delegation to WriteAsync, the BeginWrite/EndWrite methods would be generated to delegate to WriteAsync as well, and a diagnostic would be issued about the lack of a synchronous Write. We could debate what diagnostics are issued and their default levels, but this would not only enable a developer to opt-in to things being more or less impactful (e.g. maybe they want to change the level about lack of async methods to be warnings instead of info), it would also enable evolution if/as more virtuals are added to Stream in the future (the source generator would generate things as best it could and issue a new diagnostic about the new method being available). Effectively, the generator would heuristically look at everything the developer wrote and then choose to override some additional set of Stream's methods based on that. As for next steps, I think we should start with a prototype in dotnet/runtimelab. No matter what path we choose here, I expect there's going to be a lot of work in tuning the approach and the specifics to get it "right". I'd also like to see us try it out on all of the streams in dotnet/runtime and other repos; how close to absolute minimal boilerplate does it let us get? |
@stephentoub thanks for sharing. I think that with sufficient diagnostics this can achieve what I'd personally hope to get out of this feature which is giving developers confidence that they've implemented exactly what they need to, no more and no less. I'd love to see the diagnostics be fairly opinionated by default in terms of guiding you to a complete and high-performing implementation. |
That is the intent of my suggested approach: the generator fills in literally everything such that no matter how much you write, you have a compiling implementation (e.g. if you write literally nothing in the class implementation, you end up with all the CanXx properties returning false and any method that requires an implementation and that can nop be implemented to throw), and it generates the best possible implementation based on what you wrote. And then a potential slew of diagnostics are issued to tell you all the things you could/should be doing better. In some cases the diagnostics could be suggesting you override additional methods, in some cases they could be suggesting you delete code you added, in some cases they could be analyzing code you wrote and making recommendations for how to do it better, etc. |
I think we can split this work into two items, one is the sole generation being as flexible as possible, which will be addressed by dotnet/runtimelab#2185, and the next one is emitting warnings and errors that get caught by the source-gen. I've edited the top description to reflect this. We can continue adding more items as they keep appearing. |
I'm moving this to Future. For 8.0, we'll get Add Stream boilerplate source-generator and attribute by Jozkee · #2185 merged into runtimelab and available for trials, and then we'll assess if that should be further productized during .NET 9. |
Background and motivation
Based on #79261 (comment), there's the possibility of using a source generator to guide developers to implement Stream correctly i.e: use span-based and task-based APIs instead of byte-array overloads and APM (Begin*/End*).
We can emit boilerplate for
virtual
s, non-span-based, and non-task-based methods; and then emitpartial
s to enforce using the newest members inStream
implementations in projects that install the source generator proposed here.A working prototype can be found here: https://github.com/Jozkee/StreamSourceGen.
API Proposal
API Usage
This new source generator most likely will ship as an OOB NuGet package. Once installed, you opt-in into boilerplate generation by annotating any partial Stream implementation with
[GenerateStreamBoilerplate]
.Notes:
If you opt into boilerplate generation, you will get most methods sourcegen'd and you will be asked to provide bodies for
partial
*Core*
methods. Opting in doesn't prevent you from overriding any member.You can specify which members your stream supports by specifying
CanRead
,CanSeek
, andCanWrite
properties inGenerateStreamBoilerplateAttribute
, i.e: if you want to turn-off one or more stream capabilities e.g: seeking, you need to specifyCanSeek = false
in the attribute, this will tell the source generator that for seek-related methods (Seek, SetLength), we will implement them to alwaysthrow NotSupportedException("Stream does not support seeking")
.If one operation is specified as unsupported or if you override all the methods related to one kind of operation, e.g:
Read
, thepartial ReadCore
method won't be emitted and you will need to remove it if you were previously providing a body to it.Alternative Designs
Be more aggressive on generation and emit boilerplate for all Stream-derived classes with a partial modifier.
We can relax the filter and encompass more types if we decide that any Stream-derived type would be a good candidate for boilerplate generation. This would also mean that we don't need an attribute for opting-in but we may want to offer one for opting-out.
Provide analyzers+fixers instead.
We already have an analyzer+fixer for preferring Memory overloads for Stream.Read/WriteAsync (#33790).
We could provide similar ones for sync Read/Write, and for APM (Begin*/End*).
Notes
TaskToApm is not public API so we will need to emit the helpers with the source generator.
We could lower the amount of sourcegen'd code for Begin*/End* if we reconsider #61729.
As suggested in #79261 (comment), I considered adding an
Async
property toGenerateStreamBoilerplateAttribute
(to tell whether the stream was going to support sync, async, or both) but removed it at the end since I think custom streams are more likely to support both contexts and you can still manually override what you don't want to support.Risks
N/A
Thanks to @madelson, @teo-tsirpanis for their suggestions in #79261.
Update
Based on recent conversations, we will start by providing a source generator with flexibility as most important aspect. Once done, tune the flexible vs prescriptive dial based on user experience.
This work can be split into 2 items:
The text was updated successfully, but these errors were encountered: