Co-routines #1342

Open
nicowilliams opened this issue Feb 16, 2017 · 13 comments

Comments

@nicowilliams
Contributor

It'd be nice to be able to write:

def f(a; b; c):
    with_coroutine(a) as $a |
    with_coroutine(b) as $b |
    with_coroutine(c) as $c | 
    [co($a), co($b), co($c)];

f(range(3); range(3;6); range(6;9)) # -> [0, 3, 6] [1, 4, 7] [2, 5, 8]

or even better:

def f(@a; @b; @c):
    [@a, @b, @c];

f(range(3); range(3;6); range(6;9)) # -> [0, 3, 6] [1, 4, 7] [2, 5, 8]

This is somewhat inspired by Icon's co-routines. In Icon one can also pass new inputs to co-routines, and even refresh (restart) them, but for jq I think passing a new input to a co-routine would be the same as restarting it. Restarting a co-routine could be restart(@name), which will get whatever . is passed in as its new input.
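The lock-step behaviour in the examples above can be modelled with Python generators. This is only a sketch of one reading of the proposal, not jq code: the function name `f` mirrors the example, and stopping as soon as any source is exhausted is an assumption, since the issue does not say what happens when the co-routines have different lengths.

```python
def f(a, b, c):
    """Pull one value from each source per round, like [@a, @b, @c]."""
    a, b, c = iter(a), iter(b), iter(c)
    while True:
        try:
            # Each next() resumes one "co-routine" and takes its next output.
            yield [next(a), next(b), next(c)]
        except StopIteration:
            # Assumption: stop as soon as any source runs dry.
            return


rows = list(f(range(3), range(3, 6), range(6, 9)))
print(rows)  # [[0, 3, 6], [1, 4, 7], [2, 5, 8]]
```

Each source keeps its own position between rounds, which is the state a plain jq closure argument does not retain today.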

@nicowilliams
Contributor Author

def f(@a; @b; @c):
    [@a//null, @b//null, @c//null]; # //null in case the different co-routines have different numbers of outputs
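A rough Python analog of the `//null` variant, for illustration only. The name `f_padded` is hypothetical, and ending once every source is exhausted is an assumption (the issue does not specify a stopping rule). Note also that jq's `//` replaces `false` as well as `null`, which this sketch does not model.

```python
def f_padded(*sources):
    """Emit one row per round, padding exhausted sources with None."""
    iters = [iter(s) for s in sources]
    done = object()  # sentinel marking "this source produced nothing"
    while True:
        row = [next(it, done) for it in iters]
        if all(v is done for v in row):
            return  # assumption: stop once every source is exhausted
        yield [None if v is done else v for v in row]


padded = list(f_padded(range(2), range(3, 6)))
print(padded)  # [[0, 3], [1, 4], [None, 5]]
```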

@nicowilliams
Contributor Author

And with varargs (see #1341):

def f(@args[]):
     range(args$[]) as $i | @args$[$i];

Ahh, we need a way to determine the number of varargs arguments; here I used args$[].

@nicowilliams
Contributor Author

So, I really like the idea of @name as syntax for referring to things like co-routines, and, really, also open file handles!

Basically, a @name reference would be a lot like a def, but closing over internal state other than jvs and other defs. Because a @name would be like a function, it can take an input value (.), and will output zero, one, or more values.

A @name representing a file open for reading would ignore its input and output either the next input from the file, or all of them (depending on open-time options).

A @name representing a file open for writing would write its inputs and either output them too or output empty, depending on open-time options.

A @name representing a co-routine would ignore its input and output the next output of the co-routine, or empty if the co-routine is complete. A rewind @name would reset a co-routine (or file handle, where sensible) and feed it a new input.

This would mean there's no need to have jv-like file handles. And you could not store @names, only pass them around.
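To make the co-routine handle semantics above concrete, here is a small Python model. Everything in it is hypothetical naming (`CoHandle`, `rewind`, the `make` callback); it assumes a handle ignores its input, yields the next output or nothing, and that rewinding restarts the body with a new input. Zero-or-one outputs are modelled as a list of length zero or one, standing in for jq's `empty`.

```python
class CoHandle:
    """Toy model of a @name handle wrapping a co-routine."""

    def __init__(self, make, initial_input=None):
        self._make = make  # function: input -> iterable of outputs
        self._it = iter(make(initial_input))

    def __call__(self, _ignored_input=None):
        """Ignore the input; return [next output], or [] when complete."""
        try:
            return [next(self._it)]
        except StopIteration:
            return []

    def rewind(self, new_input):
        """Restart the co-routine, feeding it a new input."""
        self._it = iter(self._make(new_input))


h = CoHandle(lambda n: range(n), 2)
outputs = [h(), h("ignored"), h()]  # [[0], [1], []]
h.rewind(1)
after_rewind = [h(), h()]           # [[0], []]
print(outputs, after_rewind)
```

The handle is only ever called or rewound, never inspected, which matches the idea that a `@name` can be passed around but not stored as a value.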

@nicowilliams
Contributor Author

Another thing, in Icon one can pass new values to co-routines, which values are then available via a special keyword. That would work for jq, though it'd be a bit weird since it would be like the inputs builtin: non-deterministic, but we've already crossed that Rubicon.

OTOH, @name would not work well for full-duplex I/O, but we could use two handles, one for each direction.

We could even create threads to run co-routines in the background, and have a builtin that takes an arbitrary number of handles and returns the name/index of one that is ready for I/O, and this could be the basis for async I/O support in jq. Varargs would absolutely be a requirement here.

Ultimately, the nice thing about @name syntax is that it would make handles [to co-routines/threads, open files, pipes, sockets, databases, ...] lexically scoped, just like $name syntax, with handles closed automatically when their creation expressions are backtracked through, and with no internal details leaking out to the jq program.

@nicowilliams
Contributor Author

Actually, there should be a single operator / syntax for creating co-routines. Co-routines should access inputs passed on each invocation via input/inputs and should get a new input every time they are context-switched to. The input to the left-most filter in a co-routine should probably be null. There should be an operator for restarting a co-routine.
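Python generators offer a close analog of "a new input on every context switch": the value passed to `send()` plays the role that `input` would play inside the proposed co-routine. The sketch below is illustrative only; `running_total` is a made-up example body, and restarting is modelled simply by creating a fresh generator.

```python
def running_total():
    """Co-routine body: receives one input per resume, emits a running sum."""
    total = 0
    received = yield  # wait for the first input
    while True:
        total += received
        received = yield total  # emit an output, then wait for the next input


co = running_total()
next(co)  # prime: run up to the first yield
sums = [co.send(1), co.send(2), co.send(10)]
print(sums)  # [1, 3, 13]

co = running_total()  # "restart": fresh state
next(co)
restarted = co.send(5)
print(restarted)  # 5
```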

This will be most similar to... Icon!

@nicowilliams
Contributor Author

And there should be a flushinputs builtin too.

@nicowilliams
Contributor Author

Possible syntax:

def alternate(@a; @b): while (a; .,b);

alternate(range(5); range(4; -1; -1))

This allows a function to decide to make co-routines out of some of its arguments. The co-routines look like and are functions. When a function exits its frame, it cleans up the co-routines.

@nicowilliams
Contributor Author

nicowilliams commented Apr 13, 2017

One interesting thing will be handling tail calls: a tail call from a function frame that has co-routines cannot be made a proper tail call without first cleaning up the co-routines. The way I envision this is to have a stack of {jq_state instance pointer, jq stack address} entries recording where each co-routine was allocated. When doing a tail call this stack has to be checked, and either the tail call is disabled or the co-routines are cleaned up (by forcibly unwinding/backtracking to hit all the co-routine creation instructions).

There would have to be a new instruction for making a co-routine. It would create a jq_state with a copy of the parent but set to start at the right place. When backtracking through this instruction the jq_state would be cleaned up.
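The cleanup discipline described here (state created by an instruction, released when that instruction is backtracked through or the frame exits) resembles scoped resource management. The Python sketch below uses `contextlib.ExitStack` as a stand-in for the per-frame co-routine stack. It is an analogy, not a model of jq's bytecode interpreter, and the names `coroutine_state`, `frame`, and `log` are invented for illustration.

```python
from contextlib import ExitStack, contextmanager

log = []


@contextmanager
def coroutine_state(name):
    """Stand-in for the 'create co-routine' instruction."""
    log.append(f"create {name}")
    try:
        yield iter(range(3))  # the co-routine's private state
    finally:
        # Runs when the frame unwinds, whether normally or on error.
        log.append(f"cleanup {name}")


def frame():
    """A function frame owning two co-routines."""
    with ExitStack() as stack:
        a = stack.enter_context(coroutine_state("a"))
        b = stack.enter_context(coroutine_state("b"))
        return [next(a), next(b)]  # cleanup happens before the caller resumes


result = frame()
print(result)  # [0, 0]
print(log)     # ['create a', 'create b', 'cleanup b', 'cleanup a']
```

Cleanup runs in reverse creation order, which is the order a forced unwind through the creation instructions would visit them.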

@nicowilliams
Contributor Author

An alternative syntax could be @<expr> as <name> |, and then we could make def f(@a; @b): ... work like it does for $formal_argument. I like this.

@fadado

fadado commented Apr 14, 2017

I really wish I could start testing the coroutines. I actually own and have studied the Icon book. It's obviously out of print, but is available to download! In fact I reread all the old Icon and SNOBOL books and articles in order to learn to program with jq ;-)

@nicowilliams
Contributor Author

nicowilliams commented Apr 15, 2017

@fadado :]

Yes, I have a soft spot for Icon. I do wish it had closures. I also wish it still compiled to C, and preferably C with GCC extensions like local functions and computed gotos. Examining the old Icon compiler output was a fun way to learn what continuation passing style (CPS) is and how it works.

Regarding co-routines, I guess an implementation plan would look like this:

  • finish the C-coded generators branch and make sure that tail calls clean up C-coded generator states
  • add a slot in frames for a list of co-routines
  • add an opcode to create a co-routine which on backtrack/raising cleans up co-routines as with C-coded generators
  • add machinery for "cloning" jq_state instances, sharing bytecode
    • this will need a way to start a co-routine in a specific code block other than the top-level
    • this will need a way to change how input/inputs work in co-routines (access inputs provided by callers)
    • add a flushinputs builtin -- hmmm, maybe a latestinput that discards earlier unconsumed inputs
  • add syntax to generate the new opcode
  • lastly, add simple I/O builtins as discussed, including a sandbox/soft-chroot option for the command-line

@nicowilliams
Contributor Author

The good news is that I am getting confident about both the design and the syntax.

@nicowilliams
Contributor Author

Also, I want this as much as you, @fadado.


2 participants