Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Awaiting records. #2321

Open
lrhn opened this issue Jun 28, 2022 · 15 comments
Open

Awaiting records. #2321

lrhn opened this issue Jun 28, 2022 · 15 comments
Assignees
Labels
feature Proposed language feature that solves one or more problems

Comments

@lrhn
Copy link
Member

lrhn commented Jun 28, 2022

Dart Record Await

The current definition of await e in Dart checks the value of e is a Future<T> where T is flatten(S), and S is the static type of e. If so, it waits for the future to complete, otherwise it waits for a short while (microtask) before evaluating to the value of e.

If you have multiple futures, and you want to wait for all of them, then Future.wait works on a list of similarly typed futures. It has special affordances for cleaning up the values of successful futures for when some, but not all, futures fail.

With records, it's possible to have a typed record of differently typed futures that you want to await all of. To do this in a well-typed way, this is a proposal to allow await to do that directly.

Proposal

If we do nothing, await record will always evaluate to the value of record with a short delay inserted. A record cannot implement Future.

Since awaiting a record is always useless, we will instead enhance await on a record type as follows.

Terminology: A record has a number of positional elements and a number of named elements with distinct names. The shape of a record or a record type is the number of positional elements and the set of named element names, written as a pair (n, S) ∈ ℕ×𝒫(Id). The set of positions in a tuple type T with that shape is the set PT = {1..n}∪S ⊆ ℕ∪Id. We write the type of a record with shape (n, {x1, …, xk}) as (T1, … ,Tn, x1: Tx1, …, xk: Txk), and the record itself as (v1, … ,vn, x1: vx1, …, xk: vxk).

Semantics

Let e be an expression with static type T, which is a record type (and not Record) with a given shape (number of positional places, set of names of named places) which is (n, S), and static element types Ti for i ∈ {1..n} and Tx for x ∈ S.

We extend flatten to record types as:

  • flatten((T1, … ,Tn, x1: Tx1, …, xk: Txk)) = (flatten(T1), … ,flatten(Tn), x1: flatten(Tx1), …, xk: flatten(Txk))

That is, it applies point-wise to each element type of the record type, yielding a type with the same shape.

Let a be the expression await e, where e has static type T.
The static type of a is flatten(T).

The expression await e is then evaluated as follows:

  • Evaluate e to a value v. Let V be the runtime type of v.
  • Await v with static type T to a value w and errors E (a set of pairs of objects and stacktraces). (Shorthand ℰ(v, T) = (w, E).) This is defined (recursively) as:
    • If T is FutureOr<S> for some S (so flatten(T) is S),
      • If V is a subtype of Future<S>, then when the future v completes, the await of v completes with the corresponding await result: If the future completes with a value w, then ℰ(v, T) = (w, {}), and if the future completes with an error r and stacktrace s, then ℰ(v, T) = (null, {(r, s)}).
      • Otherwise ℰ(v, T) = (v, {}).
    • If T implements Future<S> for some S, then V must be a subtype of Future<S>. Wait for v to complete, and the await of v completes with the corresponding await result.
    • If T is a record type (T1, … ,Tn, x1: Tx1, …, xk: Txk):
      • Then v must be a record value (v1, … ,vn, x1: vx1, …, xk: vxk).
      • For each position p ∈ PT of the record shape (in asynchronous parallel):
        • The runtime type of vp is a subtype of Tp by definition.
        • await vp at type Tp to a value wp with errors Ep.
      • When all positions have (asynchronously) awaited to a value wp and errors Ep,
      • then let w be the tuple (w1, … ,wn, x1: wx1, …, xk: wxk).
      • Then ℰ(v, T) = (w, ⋃pPTEp).
    • Otherwise, if T is a supertype of Future<Object>, so flatten(T) is T, and V is a subtype of Future<T>, behave as if T had been Future<T> above.
    • Otherwise if T is a supertype of Record and V is a record type (T1, … ,Tn, x1: Tx1, …, xk: Txk), behave as if T had been (Object?, … ,Object?, x1: Object?, …, xk: Object?) above. That is, the same shape, but unknown type on each position.
    • Otherwise, ℰ(v, T) = (v, {}). (Nothing to await.)
  • At some time after v with static type T has awaited to w and errors E, the await e completes as follows:
    • If E is not empty
      • let L be an unmodifiable List object containing, for each (r, s) in E, an object as if created by the object creation invocation AsyncError(r, s). There is no specified order to these errors.
      • Let R be the type (T1?, … ,Tn?, x1: Tx1?, …, xk: Txk?).
      • Let C be an object as if created by the object creation invocation RecordAwaitError<R>(w, L), which is an error containing a record and a list of AsyncErrors.
      • Then await e completes by throwing C with the current stack trace.
    • Otherwise, await e completes with the value w.

This behavior ensures that a nested record is traversed in its entirety and each non-record element is awaited as if by await on that expression, at its known static type, only all the awaits happen in parallel, and the result is only available when all of the individual results are available. If any of those awaits result in an error, all the errors are collected, and a single error is thrown by the record await, containing all the values of the successfully awaited elements. That allows the client code to clean up and release any resources related to those values.

The definition of RecordAwaitError can be:

class RecordAwaitError<T extends Record> extends Error {
  final T values;
  final List<AsyncError> errors;
  RecordAwaitError(this.values, this.errors);
  String toString() => "Record await errors: ${errors.length}";
}

Awaiting dynamic or Object containing a record.

Currently, if you await an expression with static type dynamic or Object, and it contains a Future<Object> (or a subtype), that future is awaited, even though it wasn't visible in the static type.

We probably want to await the individual elements of an expression of type dynamic or Object that containing a record containing futures as well.

So, when awaiting a value v with static type T which is a non-record type, and the runtime type of v is a record type, dynamically await each element of the record, recursively, as above, at type Object? or dynamic.

Implementation

If we assume that the implementation will know the memory layout of every record shape, and most likely store the elements as contiguous memory cells, it should be able to traverse and await each element, and then create a new record with the same shape from the results, as well as capture errors in single growable accumulator List.

It is necessary to check every element for being a record, as well as for being a Future of an appropriate type.

@lrhn lrhn added feature Proposed language feature that solves one or more problems patterns Issues related to pattern matching. labels Jun 28, 2022
@munificent
Copy link
Member

The structure of a record or a record type is the number of positional elements and the set of named element names, written as a pair (n, S) ∈ ℕ×𝒫(Id).

FYI, the records proposal calls this concept a record's shape. ("Structure" is a pretty heavily overloaded term.)

Otherwise, I love this.

I wish we could extend this to awaiting List objects too, but I suppose that isn't always valid since some crazy user class could implement both List and Future.

@lrhn
Copy link
Member Author

lrhn commented Jun 29, 2022

Updated "structure" -> "shape".

We could allow it to work for List as well, based on static type. Maybe.
Depends on whether we can precisely define flatten and have runtime behavior which matches,.

If the static type is List<X>, it works on the elements. If the static type is Future<X>, it awaits the future. If the static type is SillySubtypeOfBoth<X, Y>, I'd probably go with the future behavior.

That means:

  • flatten(FutureOr<List<T>>) = List<T>
  • flatten(List<FutureOr<T>>) = List<T>
  • flatten(SillySubtypeOfBoth<F, L>) = F -- favor future aspect of implement Future<F>, List<L>

and the runtime behavior should match. The algorithm above seems like it should work, as long as the List case is after the Future case.

For dynamic we generally try to dynamically do what we would have done if the runtime type had been the static type. Which means again favoring the future aspect. Should probably do the same for await (x as Object) or another non-committant supertype of both Future and List/Record/future-special-cases.

@rakudrama
Copy link
Member

If any of those awaits result in an error, all the errors are collected, and a single error is thrown by the record await, containing all the values of the successfully awaited elements.

Is this correct: the error value is a record-tree with the same shape-tree as the await argument, but with null in the error positions.

Can you confirm that R in RecordAwaitError<R>(w, L) is a static type that can be precomputed by the compiler?
In the case of await e where the static type of e is a supertype of Record, we have a new requirement to be able to synthesize the top type of the a record of the given shape.

i.e

Future get problem => throw ...;
...
try {
  Object record = (x: (1, problem), y: (problem, 2));
  await record;
} catch (error) {
   // error has type RecordAwaitError<(x: Object?, y: Object?)>
   // error.values is `(x: (1, null), y: (null, 2))`
}

If we assume that the implementation will know the memory layout of every record shape, and most likely store the elements as contiguous memory cells

I was not planning on either being true for the dart2js implementation, but it might be the most compact way to implement toString.

I wish we could extend this to awaiting List objects too, but I suppose that isn't always valid since some crazy user class could implement both List and Future.

So would await list require scanning list to see if it contains Futures?
Unlike records, Lists have object identity and are mutable, so await list may or may not copy the list which may be confusing.

@lrhn
Copy link
Member Author

lrhn commented Aug 6, 2022

Is this correct: the error value is a record-tree with the same shape-tree as the await argument, but with null in the error positions.

Yes. That allows us to provide the values of the successful awaits, which might be needed for clean-up.

Can you confirm that R in RecordAwaitError<R>(w, L) is a static type that can be precomputed by the compiler?

It should be. It has the same shape as the input record, and where the original record had type T, the error record has type flatten(T)? instead, except if T is a statically a record type, then this conversion is recursed into that record as well.

I'm not sure we actually need to create the top-type at a given shape. We recurse into the record as if the static type had been a record type with all Object?s, but we don't actually use that record type for anything. We can just use Object? as context type for each individual record element we are awaiting.

I was not planning on either being true for the dart2js implementation, but it might be the most compact way to implement toString.

Maybe JS compilation can make the record elements the only enumerable properties, and use object property enumeration for toString.

And about awaiting lists ... yeah. It's too weird. Sadly. Would be nice to not need Future.wait. But we do.

@munificent
Copy link
Member

munificent commented Aug 17, 2022

We discussed this in the language meeting today. We'd like some way to wait multiple futures in parallel that preserves the type of each future (which Future.wait() doesn't do).

We have an opportunity to something now around awaiting records because there is no existing Dart code that uses records that would break by changing that behavior. If we ship records under the current behavior that awaiting any non-future value just returns the original value, then changing that behavior later to await the individual fields is a breaking change.

I don't think we're certain yet that having awaiting a record implicitly await its fields is the right answer. There's some open questions around how deeply it should traverse. Must all fields of the record be futures for this to be allowed? Does it recursively await records that are subfields (like Lasse proposes here) or only one level deep? If a field is a Future<Future<Foo>> does it recursively await both futures? If we support awaiting records, should we also support awaiting structs?

There are a few other options we could consider:

Define extensions on tuples of futures

In the core library, we could define extensions like:

extension TupleFuture2<T1, T2> on (Future<T1>, Future<T2>) {
  Future<(T1, T2)> wait() {
    // Set up completers for the two fields, and return a Future that completes
    // with a record of the results when both fields complete...
  }
}

extension TupleFuture3<T1, T2, T3> on (Future<T1>, Future<T2>, Future<T3>) {
  Future<(T1, T2)> wait() {
    // Set up completers for the two fields, and return a Future that completes
    // with a record of the results when both fields complete...
  }
}

// 4, 5, 6, etc.

Then you'd use it like:

example(Future<String> nameFuture, Future<int> ageFuture) async {
  var (name, age) = await (nameFuture, ageFuture).wait();
}

Or we could define these as static methods on Future if we wanted to keep them closer to Future.wait().

Pros:

  • Requires no language changes.
  • Avoids adding even more complexity to flatten().
  • Can be shipped with records or after.
  • Simple.
  • The compiler only lets you wait records that do only contain futures.

Cons:

  • Since we have no support for polymorphism over arity, we'd have to define extensions for each arity we want to support and there would be some arbitrary maximum.
  • No support for records with named fields, though that's likely not an important use case anyway.
  • A little verbose to have to write both await and .wait(). (Though Future.wait() is the same.)

Await recurses into a record's fields only when the expression is statically known to be a record type

So an await of a record does recurse into the fields but deciding to do that is done at compile-time based on the static type of the expression being awaited, not its runtime type.

It's not clear (to me at least) if this is better or worse.

Await recurses into record fields only if the expression is a record literal

In other words, we would treat await (<fields...>) as special syntax that awaits each of the individual fields and produces a future of a record of their results.

We still have to decide what happens if you await a value that happens to be a record object. It would probably just yield the original object. It would likely be confusing that hoisting a record literal out of an await into a local variable and then awaiting that variable would completely change its behavior.

Define a different "deep await" syntax

@stereotype441 suggested a new syntax like await * to mean "expand the value and awaits its subcomponents. So:

var recordOfFutures = (Future.value(1), Future.value("str"));
var a = await recordOfFutures;
// a is the same as recordOfFutures.

var b = await* recordOfFutures;
// b is Future<(int, String)>.

We'd have to decide what kinds of values are valid for await* and what happens if you use it on an unsupported kind of value. We could probably make it a static error to use await* on any type where it isn't meaningful, like we do with yield*.

Pros:

  • Keeps await focused on awaiting single values and being able to gracefully handle any value of any type.
  • Lets us ship this as a future extension to the language after we ship records if needed.
  • Allows us to also support iterables and lists with await*, which is shorter than Future.wait().

Cons:

  • Now there's two kinds of awaits and potential user confusion about what they do and when to use which. (The parallel with yield and yield* should hopefully help.)

My current preference would be to define a couple of static methods on Future like:

class Future<T> {
  static Future<(T1, T2)> wait2<T1, T2>(
      Future<T1> future1, Future<T2> future2) {
    ...
  }

  static Future<(T1, T2, T3)> wait3<T1, T2, T3>(
      Future<T1> future1, Future<T2> future2, Future<T3> future3) {
    ...
  }

  // Up to some max arity. Object.hash() goes up to 20.
}

Then in a future release, we might consider adding await* as sugar for these and Future.wait() on iterables. But I suspect that parallel waiting isn't common enough to really justify baked in syntax/semantics for it.

For now this is still an open discussion.

@Levi-Lesches
Copy link

I like await* for awaiting a collection, as it not only parallels yield*, but also sync* and async*. If records get their own type (#1290), you can have Iterable get allFields on the Record type (order could be positional fields first, order unspecified for named fields). That would be pretty bad for tree-shaking, but if someone's using a getter called allFields then that's a good sign they probably want all the fields.

So you could have two ways of awaiting records:

(Future<int>, Future<int>) myRecord;
(int, int) awaitedRecord = await record;  // awaiting a record returns a typed record
List<int> awaitedIterable = await* record.allFields;  // awaiting an iterable returns a list (or another iterable). 

As for recursion, I think await myRecord should be equivalent to calling await field on every field in the record, as if it were await* myRecord.allFields. await myRecord would return a new record with each field awaited by one level. So

(int, Future<int>, Future<Future<int>>) myRecord;
(int, int, Future<int>) awaited = await myRecord;

await* would return the upper bound of all the types passed to it, but awaited one level. So

List<Future<int>> numbers;
List<int> awaitedNumbers = await* numbers;

/// This list contains futures, nested futures, and non-futures, so awaiting it returns [List<Object>]
List<Object> misc;
List<Object> awaitedMisc = await* misc;

That answers a few questions:

Must all fields of the record be futures for this to be allowed?

No, since you can await a non-future as well.

Does it recursively await records that are subfields (like Lasse proposes here) or only one level deep?

No, since [await Future<T>, await Future<Future<T>>] is [T, Future<T>]

If a field is a Future<Future> does it recursively await both futures?

No, since await Future<Future<T>> is Future<T>.

@lrhn
Copy link
Member Author

lrhn commented Aug 18, 2022

There are complications around parallel wait that must be addressed.

  • If more than one future throws, the result should throw. What should it throw?
  • If some futures complete and others throw, the successful values should be available, because they might need to be properly disposed. (The Future.wait has a clean-up function which gets called with the successful values).

I like the idea of letting await only be special on a record literal, where it's more like a parallel await syntax than an operation on record values.

Heck, we could just introduce `await` `(` exprlist `)` as a syntax for parallel await, which is ambiguous with `` await expression` when expressions can be records, and then let the disambiguation chose the parallel await meaning.
We still need records for the result.

But then, we could also introduce await* ( exprlist ) as an await-parallelizer context operator, where each expression in the list is evaluated in parallel. You can write await in the expressions, so await* (foo(await bar()), baz(await qux()), await fifi()) would evaluate each expression until its first async suspension, then move to the next one, and only when all three expressions have completed will the await* complete.
That is: A regular "parallel computation" operator, where the individual computations can contain await, rather than spreading the await to individual futures.
Usage:

(int x, int y) = await* (await isolateFib(12), await isolateFib(16));

(I'd still make errors be reported as an ParallelAwaitError<(int?, int?), (Object?, Object?)> containing a record with the successful values and a record with the actual errors, per position.)

@munificent munificent added records Issues related to records. and removed patterns Issues related to pattern matching. labels Aug 19, 2022
@munificent
Copy link
Member

  • If more than one future throws, the result should throw. What should it throw?
  • If some futures complete and others throw, the successful values should be available, because they might need to be properly disposed. (The Future.wait has a clean-up function which gets called with the successful values).

All of this suggests to me that the semantics are complex enough that it should be a core library API and not built-in language syntax. If we have methods like Future.wait2(), etc., that makes it easier to support optional parameters for error handlers, etc.

@lrhn
Copy link
Member Author

lrhn commented Aug 20, 2022

The only real problem with using library methods/extensions is that it doesn't scale infinitely.

The good thing is that if all you want is parallel computation, it doesn't need to scale. You can parallelize parallel multiple computations.

   var ((v1, v2, v3, v4), (v5, v6, v7, v8)) = await Future.wait2(Future.wait4(e1, e2, e3, e4).wait(), Future.wait4(e5, e6, e7, e8));

if you need to wait for 8 expressions. Or, more likely, using extension methods:

   var ((v1, v2, v3, v4), (v5, v6, v7, v8)) = await ((e1, e2, e3, e4).wait, (e5, e6, e7, e8).wait).wait;

Implementation could be something like:

import "dart:async";
import "package:async/async.dart";

extension WaitRecord2<T1, T2> on (Future<T1> f1, Future<T2> f2) {
  Future<(T1, T2)> get wait {
    var c = Completer<(T1, T2)>(sync: true);
    Result<T1>? r1;
    Result<T2>? r2;
    int pending = 2;
    void done() {
      if (--pending > 0) return;
      var result1 = r1!;
      var result2 = r2!;
      var v1 = result1.asValue();
      var v2 = result2.asValue();
      if (v1 != null && v2 != null) {
         c.complete((v1.value, v2.value));
         return;
      }
      var values = (v1?.value, v2?.value);
      var errors = (result1.asError()?.error, result2.asError()?.error);
      c.completeError(ParallelWaitError<(T1?, T2?)>(values, errors));
    }
    Result.capture(f1).then((r) {
      r1 = r;
      done();
    });
    Result.capture(f2).then((r) {
      r2 = r;
      done();
    });
    return c.future;
  }
}

This code grows linearly in the size of the tuple, so we may want to keep the supported size limit fairly low.

@munificent
Copy link
Member

The only real problem with using library methods/extensions is that it doesn't scale infinitely.

In practice, I think that won't be a problem. The only reason to prefer awaiting a tuple over awaiting a list is because the futures you're awaiting have different types. That implies that as the number of fields you need to await goes up, so does the complexity of the surrounding code since it must be dealing with an increasing number of different types that it's using in different ways. I think that places a natural upper limit on how many fields we would need to support.

In cases where you need to await a large number of fields of a small number of different types, you can always await a tuple with a field containing a list of each types.

@lrhn
Copy link
Member Author

lrhn commented Aug 23, 2022

@munificent That too. A record is a fixed number, n, of separate and distinct values. If you have a fixed large number of independet futures that you need to await, maybe you'r doing something weird.

We could implement parallel wait up to five or six to begin with, and it would probably be fine.

@munificent
Copy link
Member

OK, I'm going to close this. I'm OK with shipping support for records without special await support for them. (That means that awaiting a record just returns the record like awaiting any other non-future value.) If anyone disagrees and wants to keep discussing this, feel free to re-open.

@lrhn lrhn removed the records Issues related to records. label Mar 29, 2023
@lrhn
Copy link
Member Author

lrhn commented Mar 29, 2023

I'm going to reopen this and request it as a separate language feature.

It doesn't have to be part of the original records feature, but it may still carry its own weight as a canonical (and compiler-optimized) way to do parallel awaits for multiple futures.
If nothing else, it might allow some compiler optimizations.

I have an implementation of extension methods which allow you to convert a (Future<T1>, Future<T2>) into a Future<(T1, T2)>, but it has a code-size overhead, and is limited to 2-9 positional fields. (Only the overhead really worries me. If you need to await more than nine statically known and distinctly typed futures at the same time, you can probably live with being the one person who need to nest the records.)

@lrhn lrhn reopened this Mar 29, 2023
@lrhn
Copy link
Member Author

lrhn commented Apr 5, 2023

The alternative, should we ever be able to abstract over record structure, could be an operation on futures, such that intFuture | stringFuture would get type Future<(int, String)> and then Future<(int, String)> | Future<(bool, num)> would become Future<(int, String, bool, num)>, in some predictable and definable way.

That would be preferable, but it's just not something we can do with the existing language.

(And it would probably require a whole slew of new features, including generic operators, operator overloading and record type spreads. Then it could be something like:

extension ParallelFutureValue<T> on Future<T> {
  Future<(T, S)> operator | <S>(Future<S> other) { 
     return (this, other).wait; // Current implementation.
  }
  Future<(T, ...R)> operator | <R extends Record>(Future<R> other) { 
     try {
        var (t, r) = await (this, other).wait;
        return (t, ...r);
     } on ParallelWaitError<(T?, R?), (AsyncError?, AsyncError?)> catch (e, s) {
       // New record type with same shape as R, but nullable on each. 
       // Basically applying a known type function (alias) to each position in the record.
       typedef RecordErrorValues = R::map((X) -> X?); position.
       typedef RecordErrorErrors = R::map((X) -> AsyncError?); // Similar.
       // R::new(...) creates new record value with same shape as R and `null` in each field, which has type:
       // R::map((X) => Null).
       RecordErrorValues recordErrorValues = e.values.$2 ?? R::new(() => null);
       var recordErrors = e.errors.$2 as ParallelWaitError<RecordErrorValues, RecordErrorErrors>? ?? R::new(() => null);
       throw ParallelWaitError<(T?, ...RecordErrorValues), (AsyncError?, ... RecordErrorErrors>(
         (e.values.$1, ...recordErrorValues), (e.errors.$1, ...recordErrors));
     }
  }
}
extension ParallelFutureRecord<R extends Record> on Future<R> {
  Future<(...R, S)> operator | <S>(Future<S> other) { ...similar... }
  Future<(...R, ...R2)> operator | <R2 extends Record>(Future<R2> other) { ...similar... }
}

And that's just scratching the surface of the features that might be needed. Might need first class, higher order type aliases.)

@TekExplorer
Copy link

i'd really like those features. being able to operate on record types would be awesome.

especially if we could possibly dynamically create one with a List<Object?> and Map<String, Object?> similar to Function.apply, or convert a record to the same

perhaps a (R extends Record).construct([], {for (final (String name, <T>) in R::named) ...}) or something like that which would throw if the shape doesn't match.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Proposed language feature that solves one or more problems
Projects
None yet
Development

No branches or pull requests

5 participants