Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ES6 Generators #534

Closed
wants to merge 9 commits into from
Closed

Conversation

samwgoldman
Copy link
Member

Behavior

Generators are iterable. Each value yielded from the generator function
is iterated.

Generators are iterators. Values can be passed in through the next
method. The iterator result values will be either a) the yielded values,
b) the value returned from the generator (the completion value) or c)
undefined, if next is called after completion.

Generators can delegate to other iterables, including other generators.
Values passed to next on the delegator generator will be passed
through to the delegatee. Similarly, values yielded by the delegatee
will be returned along the iterable/iterator interfaces.

The "delegate yield" expression, yield *, returns the completion value
of the delegatee.

Implementation

While normal functions' return type is the type of the return
statement, generators are different. The return type is always a
Generator, where the second type argument is the type of the return
statement.

We use the same process to infer this type--an internal tvar named
"return"--but then override the return type when creating the function
type, if the function is a generator.

In order to track the next and yield types, we introduce new
internal tvars, which accumulate lower bounds according to their use,
and upper bounds from explicit annotation of the Generator type.

Caveats

The generator interface is highly dynamic, and not particularly amenable
to static typing.

The type of Generator<Y,R,N> enforces that all next values and all
yielded values be described by a single type, but it's quite normal to
use values of different types. Consider the following example:

function *userGen() {
  var name = yield("What is your name?");
  var age = yield("How old are you?");
  return { name, age }
}

var gen = userGen();
var name, age, user;
if (name = prompt(gen.next().value)) {
  if (age = prompt(gen.next(name).value)) {
    user = gen.next(parseInt(age, 10)).value;
    // user is { name: string|number, age: string|number }
    // but we "want" { name: string, age: number }
  }
}

To escape this pitfall, you can use Generator<any,any,any> or write
the necessary dynamic type tests. Ideally, Flow would enforce that you
first pass a string to next, then a number, but that isn't
possible in general, and in specific instances is still very hard, so
I'm leaving it for future work (if ever).

Another difficulty is the fact that the argument to next is optional.
This is because, in order to "start" the generator, you need to call
next once. Put differently, the first argument to next is not
returned by the first yield expression; the second argument to next
is. Consider the following example:

function *bmiCalc() {
  var height: number = yield("What is your height, in meters?");
  var weight: number = yield("What is your weight, in kilograms?");
  return weight / (height * height);
}

// The first argument to next is always `void`. The value of this
// expression is the string "What is your height..."
bmiCalc.next();

// This call to `next` expects the value for `height`, above, but
// because the type of `next` marks its argument optional, we allow
// this. :(
bmiCalc.next();

In this implementation, I wanted to make things strict, so the return
type of a yield expression is always Y|void, and the generator needs
to do a dynamic type test in order to get at the expected Y.

The above caveats, and the strict interpretation of this implementation,
has the consequence that most, if not all, generator code "in the wild"
will result in type errors from Flow. We might consider sacrificing
correctness here for practical purposes, but I think it will be easier
to go from strict to less strict than the other way around.

/* Generators */
interface Generator<Y,R,N> {
@@iterator(): Iterator<Y>;
next(value?: N): IteratorResult<R|Y>;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I change the union here from R|Y to Y|R, the test outcomes change. Tests that currently exhibit expected failures start to succeed unexpectedly. It almost seems like the speculative match failure in the first branch isn't working—instead of falling through to the other case, we see an error right away.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I half wonder if you're hitting the horrible Cache. Could you rebase past 7a4a629 and try again?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!!!

That did it! I also get a much nicer error message for one of the unexpected failures.

@samwgoldman
Copy link
Member Author

OK, so thanks to @gabelevi's insight, I was able to resolve one of the more confusing problems here. There are 2 outstanding issues here:

  1. yield* expression return values are not captured correctly—can't make them fail as expected.
  2. super confusing error messages around passing mistyped arguments to next "through" a generator in a delegate yield scenario.

@gabelevi
Copy link
Contributor

Really glad that the termination strategies commit helped! Go @avikchaudhuri!

We were speaking a little on irc about declare function and how to deal with async and generators. A little background. I believe non-declare async functions and generator functions should annotated like this:

async function foo(): Promise<number> { return 42; }
function *bar(): Generator<number, bool, string> {
  var str = yield 42;
  return str === "hello";
}

The type of foo is () => Promise<number>. The async-ness of foo is an implementation detail and not included in its type. Someone using foo doesn't care if foo is an async function or just a function that returns a Promise. Sure, we check that an async function has a Promise return type, but then we can forget whether foo is async. This is similar to how we check the body of a function, but how the body itself doesn't show up in the function's type.

Same thing for generators. The type of bar is () => Generator<number, bool, string>. The generator-ness of bar is an implementation detail and not included in the type. A caller of bar doesn't care if bar is a generator or just a function that returns a generator.

declare declarations are really just declaring the interfaces/APIs/types/whatever. It's not necessary for them to declare async-ness or generator-ness. It's sufficient for them to just express the type of the function. So if we were to declare foo and bar we would just write

declare function foo(): Promise<number>;
declare function bar(): Generator<number, bool, string>;

@samwgoldman
Copy link
Member Author

@gabelevi Thank you for clarifying. I'm glad you're thinking about these things, because now that you've put it so plainly, it's obvious. I will revert the part of this patch that adds parser support for *function syntax in declaration files—the convert part was nonsense anyway.

@gabelevi
Copy link
Contributor

I should also mention how impressed I am with this PR. This is tricky stuff and you're doing an awesome job with it. Thanks a ton for working on this!

@@iterator(): Iterator<Y>;
next(value?: N): IteratorResult<Y|R>;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super rough idea, but I wonder if we could express this as something like

interface $Iterator<Y, R, N> {
  next(value?: N): IteratorResult<Y>;
}
type Iterator<T> = $Iterator<T, *, *>

interface $Iterable<Y, R, N> {
  @@iterator(): $Iterator<Y, R, N>;
}
type Iterable<T> = $Iterable<T, *, *>;

// Generator<Y, R, N> should be an $Iterable<Y, R, N>
// Generator<Y, R, N> should be an Iterable<Y>
// Generator<Y, R, N> should be an $Iterator<Y, R, N>
// Generator<Y, R, N> should be an Iterator<Y>
interface Generator<Y, R, N> {
    @@iterator(): Iterator<Y>;
    next(value?: N): IteratorResult<Y|R>;
}

declare function $yieldDelegate<R, N>(g: $Iterable<*, R, N>, n:N): R

If we do this, I think we could deal with something like

yield *({ [Symbol.iterator]: function *() { return yield 4; } })

ignoring the fact that Flow doesn't have support for Symbol.iterator, of course :P

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm a bit unclear about what that buys us. Could you extend your example with a test case showing an expected type error?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So here's what I'm trying to do. If you yield * to a generator, your PR accurately tries to match the delegating generator's R and N to the delegated generator's R and N. This is right and correct and awesome.

If you're yield *'ing to something that is NOT a generator but IS an Iterable, then you're not doing anything with the delegating generator's R and N. This is fine if the Iterable is something like an array, since R is void and N is void. However, I can conceive of a custom Iterable who's Iterator is something fancy, like a generator.

So given the example above

yield *({ [Symbol.iterator]: function *() { return yield 4; } })

If you call next('batman'), this yield * expression should evaluate to batman. (The example in babel: https://goo.gl/1Xq5VV)

I think it's possible to capture this idea, by creating $Iterator and $Iterable which keep track of the N and R types too. I'm not possible it will work, but it would be cool!

@samwgoldman
Copy link
Member Author

So I thought of another situation with typing Generators that is basically a Huge Bummer.

In the original issue I showed this example, showing how a common usage of generators is not terribly amenable to type checking:

function *userGen() {
  var name = yield("What is your name?");
  var age = yield("How old are you?");
  return { name, age }
}

var gen = userGen();
var name, age, user;
if (name = prompt(gen.next().value)) {
  if (age = prompt(gen.next(name).value)) {
    user = gen.next(parseInt(age, 10)).value;
    // what is the type of user?
    // ideally { name: string; age: number }
    // but we can't match up `next` calls to `yield` calls
    // so we get { name: string|number, age: string|number }
  }
}

There's a related issue as well, caused by the fact that the argument to next is optional.

function *bmiCalc() {
  var height: number = yield("What is your height, in meters?");
  var weight: number = yield("What is your weight, in kilograms?");
  return weight / (height * height);
}
bmiCalc.next(); // initial value to next is always blank
bmiCalc.next(); // this should be an error, but we allow it

So, unsound. Bummer. We have a few options:

  • Leave it as is—super unsatisfying because of soundness issues
  • Force yield return values to always be N | void—bad ergonomics, but also similar the union building we do for next arguments already
  • Properly model generators: allow next/yield pairs to have different types—not possible in general, afaict, and way beyond my ability to implement

@samwgoldman samwgoldman mentioned this pull request Jun 24, 2015
64 tasks
}

// we use this signature when typing yield* expressions
declare var $yieldDelegate: (<R,N>(g: Generator<*,R,N>, n:N) => R) & ((i: Iterable<*>, n:mixed) => void);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The void return type might also be wrong. As in my example above, if the Iterable has a generator for an Iterable, that generator might return and that returned value will be used here.

@gabelevi
Copy link
Contributor

gabelevi commented Jul 1, 2015

I wouldn't get too bummed. There are a ton of common practices in dynamic programming languages that aren't amenable to static typing. When one encounters these there are a couple options.

  1. Fight for soundness at all costs. Soundness is great, but this will cause users pain and they'll be unhappy.
  2. Allow the users to do clowny things at the cost of soundness. Users will be happy until they shoot themselves in the foot and blame us for not warning them.

There isn't a right answer. We just need to balance how hard we're making the users' lives versus how likely they are to shoot themselves in the foot.

Force yield return values to always be N | void

There's precedence for going either way. We make users check for undefined in optional function parameters, but we ignore the fact that indexing into an Array<number> might return undefined. I'm slightly more tempted to make the return type N | void since you're right, the user already has to deal with the union building. If they need to inspect the union before they use it, they might as well check for undefined too.

In any case, we have a somewhat iterative approach to Flow. It's not terrible to make a choice now and change it later if we come to regret it. It's always easier to start stricter and relax the rules later than to go from relaxed to strict (cause then you need to go fix your code). So that's my 2 cents...take it or leave it!

Properly model generators: allow next/yield pairs to have different types—not possible in general, afaict, and way beyond my ability to implement

It does seem really bloody hard in general, especially with tricky control flows in the generator or in the calling code. I definitely think we shouldn't try this at the moment.

@nmn
Copy link
Contributor

nmn commented Aug 5, 2015

Any update on this?

@samwgoldman
Copy link
Member Author

@nmn This is pretty close to being done, but I'm having trouble triggering an expected error, which confounded me for a while. To be honest, I haven't spent much time on it in a while because I've been busy with my day job. Definitely want to dig into this soon, though.

@nmn
Copy link
Contributor

nmn commented Aug 5, 2015

@samwgoldman I'm sorry to bug you about this. Love all the work you're putting in.

Anyway, I've been learning a lot about the AST recently. Though I'm still completely illiterate in ocaml.
I hope to learn some ocaml soon, so I can contribute instead of just complaining.

If it's something that can be explained, I'm very interested in knowing what the missing error is?

@samwgoldman
Copy link
Member Author

@nmn If you read through the activity on this PR, you'll find this comment which attempts to explain it.

Also, @gabelevi made a good point about iterables with generators as iterators, which I need to handle, although that should be easier.

Looking forward to seeing your contributions! Learning OCaml isn't too too hard. :)

@nmn
Copy link
Contributor

nmn commented Aug 6, 2015

@samwgoldman oh i did read that note, but I wasn't sure you were blocked by the same thing. The whole AST, compiling stuff is new for me. Currently practicing my skills with eslint (and now Babel) plugins.

Reading ocaml tutorials on the side. That whole +. to add floats is quite funky, but I seem to get most of it.

Behavior:

Generators are iterable. Each value yielded from the generator function
is iterated.

Generators are iterators. Values can be passed in through the `next`
method. The iterator result values will be either a) the yielded values,
b) the value returned from the generator (the completion value) or c)
`undefined`, if `next` is called after completion.

Generators can delegate to other iterables, including other generators.
Values passed to `next` on the delegator generator will be passed
through to the delegatee. Similarly, values yielded by the delegatee
will be returned along the iterable/iterator interfaces.

The "delegate yield" expression, `yield *`, returns the completion value
of the delegatee.

Implementation:

While normal functions' return type is the type of the `return`
statement, generators are different. The return type is always a
Generator, where the second type argument is the type of the `return`
statement.

We use the same process to infer this type--an internal tvar named
"return"--but then override the return type when creating the function
type, if the function is a generator.

In order to track the `next` and `yield` types, we introduce new
internal tvars, which accumulate lower bounds according to their use,
and upper bounds from explicit annotation of the Generator type.

Caveats:

The generator interface is highly dynamic, and not particularly amenable
to static typing.

The type of Generator<Y,R,N> enforces that all next values and all
yielded values be described by a single type, but it's quite normal to
use values of different types. Consider the following example:

```javascript
function *userGen() {
  var name = yield("What is your name?");
  var age = yield("How old are you?");
  return { name, age }
}

var gen = userGen();
var name, age, user;
if (name = prompt(gen.next().value)) {
  if (age = prompt(gen.next(name).value)) {
    user = gen.next(parseInt(age, 10)).value;
    // user is { name: string|number, age: string|number }
    // but we "want" { name: string, age: number }
  }
}
```

To escape this pitfall, you can use `Generator<any,any,any>` or write
the necessary dynamic type tests. Ideally, Flow would enforce that you
first pass a `string` to `next`, then a `number`, but that isn't
possible in general, and in specific instances is still very hard, so
I'm leaving it for future work (if ever).

Another difficulty is the fact that the argument to `next` is optional.
This is because, in order to "start" the generator, you need to call
`next` once. Put differently, the first argument to `next` is not
returned by the first `yield` expression; the second argument to `next`
is. Consider the following example:

```javascript
function *bmiCalc() {
  var height: number = yield("What is your height, in meters?");
  var weight: number = yield("What is your weight, in kilograms?");
  return weight / (height * height);
}

// The first argument to next is always `void`. The value of this
// expression is the string "What is your height..."
bmiCalc.next();

// This call to `next` expects the value for `height`, above, but
// because the type of `next` marks its argument optional, we allow
// this. :(
bmiCalc.next();
```

In this implementation, I wanted to make things strict, so the return
type of a yield expression is always `Y|void`, and the generator needs
to do a dynamic type test in order to get at the expected `Y`.

The above caveats, and the strict interpretation of this implementation,
has the consequence that most, if not all, generator code "in the wild"
will result in type errors from Flow. We might consider sacrificing
correctness here for practical purposes, but I think it will be easier
to go from strict to less strict than the other way around.
@samwgoldman
Copy link
Member Author

@gabelevi OK, I went with your suggestion to include <Y,R,N> type params into the Iterable and Iterator interface, $-prefixed to keep the convenient definitions we have now.

I also rebased this and re-wrote the description on this issue to reflect the current state of things. This is ready for review!

@nmn
Copy link
Contributor

nmn commented Aug 6, 2015

👏👏👏👏
So excited about this.

(please now let/const is the only pending ES6 feature, right?)

@AprilArcus
Copy link
Contributor

I can't wait to see if this is practical to use with Koa.

@samwgoldman
Copy link
Member Author

I brought up in IRC that it might be worth exploring "soundiness" w.r.t. generators. I think the highest-reward trade-off we could make is the return type from yield expressions. Right now we wrap with OptionalT because the next argument is optional, but unwrapping the VoidT is crazy annoying.

return "";
}

var x: number = yield *inner()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you comment that this is an error?

done: boolean;
value?: T;
value?: Y|R;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So for Iterator<string>, you'll have value?: string | void. It's a little weird that there's a void due to being optional and a void due to there being no return, but I think it's fine.

I actually wonder if maybe we can make value not be optional, since the union'd void captures the behavior perfectly. Like if iterate over a Generator<string, number, void>, will the IteratorResult ever be omitted? Looking at the mozilla docs it seems like it shouldn't be undefined if there is a return

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value is optional—it becomes undefined upon completion of the iterator. i.e., when done=true, value=undefined. (Edit: mixed up true and false.)

@gabelevi
Copy link
Contributor

gabelevi commented Sep 4, 2015

It's always easier to go from stricter->less strict than the other way around. I think you're making the right call with the OptionalT, and if we change our minds later we can go the less sound route.

But yeah, this PR is looking great! I think we can get this merged for the next deploy!

In reality, the type of IteratorResult is something like:

```
type IteratorResult<Y,R> = { done: false, value: Y } | { done: true, value: ?R }
```

However, boolean sentinels is not implemented.

It's simpler for everyone to use the single-parameter interface, so
let's keep things simple.
@samwgoldman
Copy link
Member Author

@gabelevi OK. Unless I'm mistaken regarding the optional-ness of IteratorResult's value, the changes should cover your comments. Merged into latest master as well.


interface Iterable<T> {
@@iterator(): Iterator<T>;
interface $Iterable<Y,R,N> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we name these type vars more explicitly? They can show up in error messages, and sometimes without context it makes the error more confusing/unclear.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an opinion on the more explicit names? Yield, Return, and Next?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no objection. I can add a commit here, but it seems like an easy enough change to make in your internal merge, too, if you want to make the change yourself. (I'm not sure what the internal merge looks like, of course, so this could be wrong-headed of me.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, Yield/Return/Next seem good. I'll pass it along to @gabelevi who's submitted the internal merge request

Class methods go through a quite different, and a bit more convoluted
process. The hack I was using for function declarations where I swapped
out the inferred return type with a Generator typeapp no longer applies.

This is a good thing, because it forced me to replace the hack with
something better. Like async functions, the fancy generator return type
is established in the Return statement handler directly. If there is no
explicit return, we default to a Generator typeapp where R=void.

I intentionally left the capacity for async=true/generator=true var
scopes. Babel, via regenerator[1], seems to understand `async function*`
using the AsyncIterator[2] proposal.

For now, the parser will reject async generator functions, but we might
choose to support async iterators in the future, so having the capacity
to do so in VarScope seems right.

1. https://github.com/facebook/regenerator/blob/9423b4ade99173b603430a151599cf8637166f16/test/async.es6.js#L259
2. https://github.com/zenparsing/async-iteration/#async-generator-functions
This one comes down to personal opinion I suppose. I don't feel strongly
about it.
@samwgoldman
Copy link
Member Author

@gabelevi @jeffmo OK, so I think I've responded to all feedback here, now. There's some commentary in the individual commits.

The changes that enable class method generators also fixed a stupid bug with generators causing unexpected BoundT errors. However, I'm not able to get simple generic generator definitions to type check. For example:

function *generic_yield<Y>(y: Y): Generator<Y,void,void> {
  yield y;
}
generators.js:108:25,25: Y
This type is incompatible with
generators.js:108:1,110:1: some incompatible instantiation of Y

I don't see anything wrong with the above definition, though. At the moment I'm pretty stuck on that, and I'm not sure banging on it more is going to lead to a lot of insight. Either of you able to show that a) the error is correct or b) what's wrong with the implementation?

@jeffmo
Copy link
Contributor

jeffmo commented Sep 8, 2015

I know several people who've been asking for this for a long time now... You're going to be a hero!

We're going to do a deploy today or tomorrow and wanted to be able to get this in. We can just circle back on the generics stuff later.

@gabelevi
Copy link
Contributor

gabelevi commented Sep 8, 2015

Woo generators are in! Nice job @samwgoldman!

@jeffmo ++ on the generics stuff. You're right, that example shouldn't error, but what you have is great and I really want it in this deploy.

Again, thanks so much for building this. It's awesome, you're awesome, and I'm excited to get it out!

cagdasbozman pushed a commit to OCamlPro/flow that referenced this pull request Sep 10, 2015
Summary: Generators are iterable. Each value yielded from the generator function
is iterated.

Generators are iterators. Values can be passed in through the `next`
method. The iterator result values will be either a) the yielded values,
b) the value returned from the generator (the completion value) or c)
`undefined`, if `next` is called after completion.

Generators can delegate to other iterables, including other generators.
Values passed to `next` on the delegator generator will be passed
through to the delegatee. Similarly, values yielded by the delegatee
will be returned along the iterable/iterator interfaces.

The "delegate yield" expression, `yield *`, returns the completion value
of the delegatee.

While normal functions' return type is the type of the `return`
statement, generators are different. The return type is always a
Generator, where the second type argument is the type of the `return`
statement.

We use the same process to infer this type--an
Closes facebook#534

Reviewed By: @jeffmo

Differential Revision: D2416847

Pulled By: @gabelevi
@dchambers
Copy link

@samwgoldman / @gabelevi: is there any documentation explaining the work that was done for ES6 generators as I've been unable to find anything by skimming the Flow docs? For example, this snippet was shown in the preceding comments:

function *generic_yield<Y>(y: Y): Generator<Y,void,void> {
  yield y;
}

yet I get errors when I try to use Generator. Additionally, I'd like to know how to insist that the type of an object after yielding won't be undefined, since for my generator it never will be.

@samwgoldman
Copy link
Member Author

@dchambers Right now the only docs are on the blog, but hopefully that should cover everything you want to know. Let me know if it doesn't.

@dchambers
Copy link

Excellent documentation @samwgoldman. First class stuff!

In case it helps anyone else I had to mark Generator as being a global in my .eslint config to prevent linting errors, for example:

"globals": {
  "Generator"
},

but everything else worked exactly as described in the documentation.

@samwgoldman
Copy link
Member Author

Excellent! That's really nice to hear. Also, thanks for reminding me that I'll need to update that post soon, since @mroch landed some refinement improvements that will make using generators a lot more convenient.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants