Proposal: language support for async sequences #261
I'd personally love to see support for both Ultimately switching between the two would be pretty easy (and already supported in Rx/Ix), but from a sequence producer point of view they would behave a little differently in that Also, if considering support for From the consumer point of view they should probably behave the same way. |
@HaloFour
|
I don't disagree. Rx is awesome like that. I advocate for it mostly to bring Rx closer to the BCL so that people are more aware that it exists, and also because those core interfaces are at least a part of the BCL whereas |
I'm not familiar with Ix, so I can't comment on any existing IAsyncEnumerable, but I would rather the team start fresh when thinking about async enumerables rather than try to build off IObservable. Rx was an interesting project, but it was designed mostly before async existed and then later tried to bolt the two concepts together with varying success. Present-day Rx also has a very cluttered API surface area with poor documentation all around. async/await enables code that looks almost identical to synchronous code- I'd like to be able to work with asynchronous sequences as effortlessly as you can work with IEnumerable today. I've definitely wanted to mix |
Indeed, there is a lot of duplication between the two because they were developed independently and Rx never had the resources that BCL/PFX had. I also don't think that Rx/Ix could be merged into the BCL as is. The Ix In my opinion supporting both would be worthwhile. The compiler could generate different state machines depending on the return type of the async iterator. |
I've been wishing for this feature ever since C# 5 came out. Being able to write something like I noticed that Roslyn already has an |
@thomaslevesque, the Roslyn link is 404. |
Uh... looks like it's no longer there. A search for |
Entity Framework uses the IAsyncEnumerable pattern to enable async database queries. In EF6 we had our own version of the interface, but in EF7 we have taken a dependency on IX-Async. |
@anpete Seems to me that if async streams depends specifically on a new BCL Perhaps the compiler could support the different interfaces by convention, or have an easy way to unify them through a common utility extension method. But if, for whatever reason, they need to be converted back to their proper specific interface that would still pose problems. I believe quite strongly that not at least trying to integrate the BCL and the Roslyn languages with Rx/Ix is a massive wasted opportunity. |
Just to provide some additional background, this can already be done in F# (because F# "computation expressions", which is a mechanism behind both iterators and asyncs is flexible enough). So, the C# design might learn something useful from the F# approach to this. See:
Probably the most interesting consideration here is what is the programming model:
You can convert between the two, but going from Rx to AsyncSeq is tricky (you can either drop values when the caller is not accepting them, or cache values and produce them later). The thing that makes AsyncSeq nicer from sequential programming perspective (i.e. when you write statement-based method) is that it works well with things like
Here, we wait 1 second before consuming the next value from So, I think that if C# gets something like asynchronous sequences (mixing iterators and await), the pull-based design that is used by F# asyncSeq is a lot more sensible. Rx works much better when you use it through LINQ-style queries. EDIT: (partly in reply to @HaloFour's comment below) - I think that it makes sense to support the async iterator syntax for (*) As a side-note, I find |
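The pull-based behaviour described above (the consumer decides when the next value is requested) can be sketched in C#. This is a minimal illustration written with the `await foreach` form that later shipped in C# 8, not the F# code from the comment; `PullDemo`, `Numbers`, and `RunAsync` are hypothetical names, and `Task.Yield` stands in for real asynchronous work.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class PullDemo
{
    // Hypothetical producer: yields 0..count-1 and logs each time it produces a value.
    public static async IAsyncEnumerable<int> Numbers(int count, List<string> log)
    {
        for (int i = 0; i < count; i++)
        {
            await Task.Yield(); // stand-in for real asynchronous work
            log.Add($"produce {i}");
            yield return i;
        }
    }

    // Consumer: waits between items; the producer stays suspended during that wait.
    public static async Task<List<string>> RunAsync(int count, int delayMs)
    {
        var log = new List<string>();
        await foreach (int n in Numbers(count, log))
        {
            log.Add($"consume {n}");
            await Task.Delay(delayMs);
        }
        return log;
    }
}
```

Because the iterator only resumes when the loop asks for the next element, the log strictly alternates between "produce" and "consume" entries, however long the consumer waits.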
@tpetricek The difference in behavior between Beyond that hopefully both interfaces will enjoy support of all of the common LINQ operators plus those operators that apply to asynchronous streams. |
@tpetricek - The FSharp.Control.AsyncSeq documentation has been clarified to use the terminology "asynchronous pull", rather than just "pull", i.e. a pull operation that returns asynchronously, |
It would be nice if the reading from async sequence had constant stack usage and simple associative operations like concatenation had decent performance no matter whether left- or right-associated. Eg. reading from
causes |
@radekm For your kind of usage (sequence is materialized, size is known in advance) you can already use
Does your request mean that all possible implementations of |
@vladd It was only a simple example, you can take for instance
static IEnumerable<BigInteger> Fib()
{
    return Fib(BigInteger.Zero, BigInteger.One);
}
static IEnumerable<BigInteger> Fib(BigInteger a, BigInteger b)
{
    yield return a;
    foreach (var x in Fib(b, a + b))
    {
        yield return x;
    }
}
which has the same problems. What I want is to compose complex asynchronous sequences from very simple and reusable parts. To do this the operators like concatenation must be efficient. Since I don't know how to do this in C# I'll give a few examples in Scala with scalaz-stream.
def fib(a: BigInt = 0, b: BigInt = 1): Process[Nothing, BigInt] =
  emit(a) ++ fib(b, a + b)
There is no risk of stack overflow and reading the first n items takes O(n) not O(n^2) (assuming that
process1.take[Int](5).filter(_ > 0) ++ process1.id
This applies the filter only to the first 5 integers of the stream. You can use it with operator
Process(1, 2, -3, -4, -5, -6, -7) |> (process1.take[Int](5).filter(_ > 0) ++ process1.id)
and it gives you 1, 2, -6, -7. |
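For contrast with the recursive C# `Fib` above, here is a flat iterator that keeps a single enumerator alive, so each item costs one `MoveNext` call and the stack depth stays constant. It is only a sketch of the non-nested form (`FibDemo` is a hypothetical name); it does not address the request that composed sequences such as concatenation be efficient in general.

```csharp
using System.Collections.Generic;
using System.Numerics;

public static class FibDemo
{
    // Single iterator: no nested enumerators, so reading n items is O(n) MoveNext calls.
    public static IEnumerable<BigInteger> Fib()
    {
        BigInteger a = BigInteger.Zero, b = BigInteger.One;
        while (true)
        {
            yield return a;
            (a, b) = (b, a + b); // advance the pair in place
        }
    }
}
```

The recursive version wraps one more iterator around the sequence per element, which is where the quadratic cost and the growing call depth come from.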
I think that it would be wise to have language parity with F# for supporting async pull sequences (e.g. That leads me on to making some comments about Rx and why I don't think it needs any language support (right now).
@thomaslevesque says "I've been wishing for this feature ever since C# 5 came out.". @HaloFour "Observable.ForEach" shudder. |
@LeeCampbell I'd largely be happy if the C# team did the same thing they did with I think that there is a massive amount of information for Rx out there, but if nobody knows to look it might as well not exist. I think that it needs the same kind of campaign from MS that LINQ and asynchrony received. Some kind of inclusion in the languages pushes that point. I've been doing a lot of Java dev lately and it annoys me how much excitement there seems to be building around Rx that I don't see on the .NET side. |
I am interested to see how you would see this work. I think the way you work with an AsyncEnum and the way you work with an IObservable sequence are quite different. The former you poll and pull from until complete and then you move on to the next statement.
The latter you set up a subscription providing callbacks and then move on immediately. The callbacks for an Observable sequence are called at some future point in time.
With this in mind, they (to me at least) are totally different things, so I am not sure why or how language support would help here. Would like to see a sample of your ideas. I can see some usefulness for language support of AsyncEnum sequences, again, at least to get language parity with F#. |
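The two consumption styles contrasted above can be put side by side. This is a sketch under stated assumptions: it uses only the BCL `IObservable<T>`/`IObserver<T>` interfaces and the `await foreach` form that later shipped in C# 8; `PullVsPush`, `ManualSource`, and the other names are hypothetical, and `ManualSource` is a deliberately tiny single-subscriber source, not an Rx type.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class PullVsPush
{
    public static async IAsyncEnumerable<int> Range(int start, int count)
    {
        for (int i = 0; i < count; i++)
        {
            await Task.Yield(); // stand-in for asynchronous work
            yield return start + i;
        }
    }

    // Pull: the loop awaits each item; the return statement runs only after completion.
    public static async Task<int> SumByPullingAsync(IAsyncEnumerable<int> source)
    {
        int sum = 0;
        await foreach (int x in source) sum += x;
        return sum;
    }

    // Push: Subscribe returns immediately; the callbacks run whenever the source emits.
    public static IDisposable SumByPush(IObservable<int> source, Action<int> onDone)
        => source.Subscribe(new SumObserver(onDone));

    private sealed class SumObserver : IObserver<int>
    {
        private readonly Action<int> _onDone;
        private int _sum;
        public SumObserver(Action<int> onDone) => _onDone = onDone;
        public void OnNext(int value) => _sum += value;
        public void OnError(Exception error) { }
        public void OnCompleted() => _onDone(_sum);
    }
}

// Hypothetical single-subscriber source, driven by hand for illustration.
public sealed class ManualSource : IObservable<int>
{
    private IObserver<int> _observer;
    public IDisposable Subscribe(IObserver<int> observer) { _observer = observer; return new Noop(); }
    public void Emit(int value) => _observer?.OnNext(value);
    public void Complete() => _observer?.OnCompleted();
    private sealed class Noop : IDisposable { public void Dispose() { } }
}
```

With the pull version the caller gets the sum by awaiting; with the push version the caller continues straight away and the sum arrives later through the `onDone` callback.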
To give you an idea, I already currently have an extension method for
public IObservable<int> Range(int start, int count, int delay) {
    // The element type is given explicitly; it cannot be inferred from the lambda alone.
    return Observable.Create<int>(async observer => {
        for (int i = 0; i < count; i++) {
            await Task.Delay(delay);
            observer.OnNext(i + start);
        }
    });
}
public async Task TestRx() {
    Random random = new Random();
    IObservable<int> observable = Range(0, 20, 1000);
    using (IAsyncEnumerator<int> enumerator = observable.GetAsyncEnumerator()) {
        while (await enumerator.MoveNextAsync()) {
            Console.WriteLine(enumerator.Current);
            await Task.Delay(random.Next(0, 2000));
        }
    }
}
There are several overloads to I hope that kind of answers your question. It's early and I didn't get much sleep last night. Basically, I am treating the |
Guys who are interested in IObservable support -- can you describe the benefit integrating this into the language would bring? |
To Devil's Advocate my own arguments:
Now, given the probability of Devil's Advocate point 1, some streaming analog to
Now, for my arguments from the generating side, I'd like to revisit my use case of dispatching a bunch of asynchronous operations. This is something that the current project I work on does incredibly frequently, basically n+1 operations against web services where the first response provides a bunch of IDs that then need to be resolved individually*. If async streams return
public async IAsyncEnumerable<User> QueryUsers(int organizationId, CancellationToken ct) {
    Organization organization = await ws.GetOrganization(organizationId, ct);
    foreach (int userId in organization.UserIds) {
        User user = await ws.GetUser(userId);
        yield return user; // can't continue until consumer calls IAsyncEnumerator.MoveNext
    }
}
Granted, there could be BCL methods to make this a little easier, but it feels like something that can be supported out of the box:
public async IObservable<User> QueryUsers(int organizationId, CancellationToken ct) {
    Organization organization = await ws.GetOrganization(organizationId, ct);
    foreach (int userId in organization.UserIds) {
        User user = await ws.GetUser(userId);
        yield return user; // Effectively the same as calling IObserver.OnNext(user)
    }
}
|
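Without an `async IObservable<T>` iterator, the "effectively the same as calling IObserver.OnNext(user)" behaviour can be approximated by pumping a pull-based iterator into an observer. The sketch below assumes the `await foreach` form that later shipped in C# 8; `PullToPush`, `QueryIds`, `PumpAsync`, and `RecordingObserver` are hypothetical names, and `QueryIds` is only a stand-in for the web-service calls in `QueryUsers`.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class PullToPush
{
    // Hypothetical stand-in for QueryUsers: yields ids asynchronously.
    public static async IAsyncEnumerable<int> QueryIds(int count)
    {
        for (int id = 1; id <= count; id++)
        {
            await Task.Yield(); // stand-in for a web-service call
            yield return id;
        }
    }

    // Drains the sequence and forwards every item, the failure, or completion to the observer.
    public static async Task PumpAsync<T>(IAsyncEnumerable<T> source, IObserver<T> observer)
    {
        try
        {
            await foreach (T item in source)
                observer.OnNext(item);
        }
        catch (Exception ex)
        {
            // Note: this also catches exceptions thrown by OnNext itself.
            observer.OnError(ex);
            return;
        }
        observer.OnCompleted();
    }
}

public sealed class RecordingObserver<T> : IObserver<T>
{
    public List<string> Events { get; } = new List<string>();
    public void OnNext(T value) => Events.Add($"next {value}");
    public void OnError(Exception error) => Events.Add("error");
    public void OnCompleted() => Events.Add("completed");
}
```

The pump requests the next element as soon as `OnNext` returns, so the producer is paced by how quickly the callbacks return rather than by an explicit `MoveNext` call from the consumer.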
I kinda like that. I don't think handwritten consumption shapes should weigh heavily on the design here. |
That has two problems. First, it binds this proposal to the fate of another much-less-certain proposal, one that would likely require CLR and BCL changes in order to accomplish. Second, it still results in the loss of variance since structs can't be variant. |
Pattern: familiar vs efficient
(I'm summing up some currently-under-discussion design points)...
What should the async foreach pattern be like? First option is to be familiar like
interface IAsyncEnumerable<out T> {
    IAsyncEnumerator<T> GetAsyncEnumerator();
}
interface IAsyncEnumerator<out T> {
    T Current {get;}
    Task<bool> MoveNextAsync();
}
The other option is to be more efficient. We can be more efficient in a few ways: (1) by avoiding heap allocation and returning just structs which include state machine and method builder and enumerator, so the caller can allocate them all on the stack; (2) by avoiding the double-interface dispatch; (3) by having a tight non-async inner loop. There's been lots of discussion on fine-tuning the precise best way to achieve these efficiency goals, but here are some simplistic versions:
// (1) avoiding heap allocation entirely
// Declaration side:
async iterator MyAsyncEnumerable<int, struct StateMachine> GetStream() { ... }
// Consumption side:
foreach (var x in GetStream()) { ... }
// (2) avoiding the double-interface-dispatch
interface IAsyncEnumerator<out T> {
    Task<Tuple<T,bool>> TryGetNextAsync();
}
// (3) avoiding async when working through the batch...
while (await enumerator.GetNextChunkAsync())
{
    while (enumerator.TryMoveNext(out value)) { ... }
}
}
_As the discussion has progressed, I've seen the "efficient" versions become steadily less appealing..._
Heap allocations. Why do we care about avoiding heap allocations entirely? I see that eliminating heap allocation is desirable for in-memory data structures that you iterate over with In all, the heavy-weight language work needed to support heap-free async streams seems way disproportionate to the benefit. (That language work might be Heap-free streams as we've envisaged them will only apply to consumption by
// This is familiar LINQ extension method
static void Select<T,U>(this IEnumerable<T> src, Func<T,U> lambda)
// We could plumb more information through to avoid boxing
static void Select<T,Table, Tator, ...>(this IEnumerable<T,Table,Tator> src, ...)
At this point we waved our hands and said "We've discussed and investigated escape analysis in the past -- something where some part of the infrastructure can see that the
Conclusion: should give up on heap-free async streams.
Avoid double interface dispatch. Sometimes we believe that interface dispatch onto a struct is one of the slowest parts of .NET. Other times we think it's cached and so pretty fast. We haven't benchmarked this yet. There's no point pursuing it for efficiency's sake unless it's been benchmarked. The downside of "avoid double interface dispatch" is that it doesn't work nicely with covariance. And covariance is more important. The only way to retain covariance in just a single interface dispatch is something like this: One attractive feature of a
Avoid async when working through the batch. Let's work through concretely how this would work when you iterate over an async iterator method. There are subtleties here that aren't at first obvious...
async iterator IAsyncEnumerable<int> GetStream()
{
    while (true)
    {
        await buf.GetNextByteAsync();
        yield return buf.NextByte;
    }
}
var enumerator = GetStream().GetEnumerator();
while (await enumerator.GetNextChunkAsync())
{
    while (enumerator.TryMoveNext(out value)) { ... }
}
}
The question is, what granularity do the chunks come in? The easy answer is that A more complicated answer is that
Conclusion: we should of course benchmark the single-interface-dispatch and the buffers. But they would have to show dramatic wins to be worth the concomitant ugliness. |
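To make option (3) concrete, here is one way a chunked enumerator could be shaped. This is a sketch under stated assumptions, not the design under discussion: the method names `GetNextChunkAsync` and `TryMoveNext` follow the comment above, while `ChunkedEnumerator<T>`, the `fetchChunk` delegate, and the convention that an empty array marks the end of the stream are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical type: buffers one chunk at a time and serves it synchronously.
public sealed class ChunkedEnumerator<T>
{
    private readonly Func<Task<T[]>> _fetchChunk; // assumed to return an empty array at end of stream
    private T[] _chunk = Array.Empty<T>();
    private int _index;

    public ChunkedEnumerator(Func<Task<T[]>> fetchChunk) => _fetchChunk = fetchChunk;

    // Asynchronous step: load the next chunk; false once the stream is exhausted.
    public async Task<bool> GetNextChunkAsync()
    {
        _chunk = await _fetchChunk();
        _index = 0;
        return _chunk.Length > 0;
    }

    // Synchronous step: hand out buffered items without any awaiting.
    public bool TryMoveNext(out T value)
    {
        if (_index < _chunk.Length)
        {
            value = _chunk[_index++];
            return true;
        }
        value = default;
        return false;
    }
}

public static class ChunkDemo
{
    // The consumption loop from the comment: async outer loop, tight non-async inner loop.
    public static async Task<List<int>> DrainAsync(ChunkedEnumerator<int> enumerator)
    {
        var items = new List<int>();
        while (await enumerator.GetNextChunkAsync())
            while (enumerator.TryMoveNext(out int value))
                items.Add(value);
        return items;
    }
}
```

The granularity question raised above shows up here as the size of the arrays that `fetchChunk` returns: with one-element chunks the inner loop saves nothing, and larger chunks delay the first item.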
Cancellation and ConfigureAwait
(I'm summing up some currently-under-discussion design points)...
We'd talked about cancellation being done in two ways:
// way 1
using (var ator = able.GetAsyncEnumerator(token))
// way 2
foreach (await var x in xs.WithCancellation(token))
To avoid the compiler+language having to know about cancellation, we could define the first method with a default parameter:
We'd talked about
foreach (await var x in xs.ConfigureAwait(false))
We'd talked about a shorthand
We'd talked about how when you obtain an object that implements
_QUESTIONS_.
Q1. Why do we need both "way1" and "way2"? Can't we just do it with "way2"?
Q2. Normally you can do
Q3. Does
Q4. Does
Q5. For folks like ServiceFabric who wish to force you to provide a cancellation token, would we make it so |
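For comparison, this is roughly how "way 2" and `ConfigureAwait` look in the form that later shipped with C# 8 and .NET Core 3.0, where the keyword order is `await foreach` rather than the `foreach (await ...)` spelling used in the notes above. `WithCancellation`, `ConfigureAwait`, and `[EnumeratorCancellation]` are the shipped BCL members; `CancellationDemo`, `Ticks`, and `CountUntilCancelledAsync` are hypothetical names for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationDemo
{
    // [EnumeratorCancellation] lets the token given to WithCancellation
    // (or GetAsyncEnumerator) flow into the iterator body.
    public static async IAsyncEnumerable<int> Ticks(
        [EnumeratorCancellation] CancellationToken token = default)
    {
        for (int i = 0; ; i++)
        {
            await Task.Delay(10, token); // throws once the token is cancelled
            yield return i;
        }
    }

    // Consumes the endless stream and cancels it after `limit` items.
    public static async Task<int> CountUntilCancelledAsync(int limit)
    {
        using var cts = new CancellationTokenSource();
        int seen = 0;
        try
        {
            await foreach (int x in Ticks().WithCancellation(cts.Token).ConfigureAwait(false))
            {
                if (++seen == limit) cts.Cancel();
            }
        }
        catch (OperationCanceledException)
        {
            // expected: the next MoveNextAsync observes the cancelled token
        }
        return seen;
    }
}
```

In this shape the compiler only needs to know about the attribute; the token itself travels through an ordinary extension method.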
Home of IAsyncEnumerable
(I'm summing up some currently-under-discussion design points)...
Where is the home of It feels like the interface type Not sure about the extension method What do folks @onovotny think about this? |
I think that it makes sense to have the interface itself live somewhere central so that methods from mscorlib or System could potentially return it as a signature with its own internal implementation. For the home of the LINQ implementation, I would recommend IX. We would of course welcome any and all contributions from Microsoft and other teams. IX would adapt and use whatever the final signature is for the I would think the extension methods should go along with IX as well -- basically, if you want to do IX Async, that's the real library to reference for the main logic. |
@ljw1004 I think new extension methods for Personally, I also find RX really hard to use because the documentation is spotty / dated. We shouldn't tie a brand-new language feature to an old library designed for a world in which the language feature didn't yet exist. |
And that's even more true of IX |
I completely agree. However I think the reason for that failing is largely due to the lack of attention Rx/Ix received from Microsoft. My opinion has always been that since Rx is so far along with its implementation of LINQ on both push and pull streams that it makes a great deal of sense for Microsoft to take advantage of their own work and build async streams on top of that. If that entailed bringing Rx under the BCL proper, which I imagine would require a bit of a refactoring, I would not be opposed to that. It likely requires it just from a consistency point of view since Rx and TPL diverged quite a bit in common areas. It makes me sad that Rx seems to be getting more love on the JVM these days than on .NET where it was born. And here we are arguing over the details of reimplementing much of it. |
@HaloFour By the time C# 7+1 ships Rx/Ix will likely be over 10 years old. It is filled with outdated paradigms and the web is filled with outdated documentation. I think a much better tack would be adding a new library for
|
This thread has raised several good points about Ix and Rx. A few thoughts here:
Some of the comments have addressed Rx as opposed to Ix. Ix, despite living in the Rx repository, can version and release independently of Rx. For the purposes of this thread and discussion, I think it would be helpful to focus on Ix rather than Rx. Of course, we'd be more than happy to have discussions about the current state and future of Rx over on the Rx repo.
There was some discussion here around the lack of discoverability around Ix. What do you think could help fix that? Ix today already has the
When it comes to documentation, we wholeheartedly agree that it could be made better. That's also an area that can and will be improved. We would be very interested in what kinds of changes would make the documentation better.
About suggestions that Ix hasn't had the same sort of API review and was designed for a time pre-async and has outdated techniques, I would respectfully disagree. The API of Ix has been carefully thought out by the team and treated with the same level of review that an "in-box" library would have. Furthermore, Ix Async itself was rewritten this summer to use a modern approach based on what
I would suggest that Ix overall is hardened and time-tested with very thorough test coverage and review practices. The location of the code should not matter, but lest that be a blocker, the code could split into its own repo should there be a need. Overall, the path to implementing whatever shape of the interface results from this discussion is far shorter and mostly done already. That's a huge boost.
I would also like to state that community contributions are very welcome. In fact, it was a community member that did the initial refactor to introduce |
My main issue with the documentation of Rx, is that the only API reference on MSDN is horribly out of date. The situation seems to be even worse for Ix: there's nothing on MSDN or anywhere else, and sometimes methods don't even have XML documentation comments. For example, when I search for "AsyncEnumerable.Buffer" (or "IAsyncEnumerable Buffer"), I find:
|
Interesting to compare notes with the concurrent development in JavaScript. http://www.2ality.com/2016/10/asynchronous-iteration.html |
@onovotny I agree with your comments about Ix being carefully designed and implemented. As you know, we were able to switch vNext of EF Core (currently on our dev branch) to the new version that contains the reimplementation of the operators, and we appreciate the advantages of the new implementations. That said, there are still a few issues we should discuss with regard to the API alignment with the idiomatic patterns used across .NET for async. In particular we believe that query operators that return a single In summary, I actually don't have a strong opinion on where the LINQ operators for cc @anpete |
@divega AFAIK, the reason the |
@onovotny are you referring to VB query comprehension syntax? Otherwise do you know which operators map to C# comprehension syntax for which this would be a problem? As far as I remember VB provides sugar for many more operators but is strict about their return types. If I am remembering correctly, this means the current naming probably doesn't help. To clarify, I said previously that it was about BTW, this could be an interesting criterion to decide where to place operators. Let's say there are two groups:
I believe the second group is for more advanced/less common scenarios and having to include an extra dependency to use them wouldn't hurt too much. |
I'm specifically referring to the LINQ syntax I agree there seems to be two groups as you describe, and there's no keyword equiv of |
@onovotny ok, I was referring specifically to the ones that can be awaited. For |
This is now tracked at dotnet/csharplang#43 |
Here's my take on async enumerators/sequences using C# 7's Task-Like types. The approach could potentially act as a playground for the language feature. |
@Andrew-Hanlon |
@jnm2 That's a good point. Making the |
Shouldn't it be |
Just wanted to add my vote for enabling |
Both C# and VB have support for iterator methods and for async methods, but no support for a method that is both an iterator and async. Its return type would presumably be something like IObservable<T> or IAsyncEnumerable<T> (which would be like IEnumerable but with a MoveNext returning Task<bool>).
This issue is a placeholder for a feature that needs design work.