Task advances on done() not next() ? #6125

filmackay · 2014-03-12T11:14:26Z

Unlike other iterable types, Task seem to advance on done instead of next. Is there a reason for this - it seems to break the iteration contract. eg.

function p()
    produce("A")
    produce("B")
    produce("C")
end

t = Task(p)

println("1")
s = start(t)
println("2")
done(t, s)
println("3")
(v, s) = next(t, s)
println("4 $v")
done(t, s)
println("5")
(v, s) = next(t, s)
println("6 $v")
(v, s) = next(t, s)
println("7 $v")
done(t, s)
println("8")

Produces:

The text was updated successfully, but these errors were encountered:

JeffBezanson · 2014-03-12T17:42:51Z

It's because you can't tell whether a task is done without trying to run it, at which point you also get the first value it produces.

ivarne · 2014-03-12T17:57:52Z

You are also kind of breaking the contract by not checking if the iteration is done before getting the next value. Where is the iteration contract that says you are allowed to do that? next(1:3,4) does not give a BoundsError.

StefanKarpinski · 2014-03-12T18:03:35Z

This used to be the other way, but there was a complaint about that too. In the current arrangement, if you always call next before done the way you're supposed to, this works as one would expect.

ivarne · 2014-03-12T18:08:03Z

@StefanKarpinski I assume you intended to write "done before next"?

StefanKarpinski · 2014-03-12T20:00:25Z

Right, yes that.

filmackay · 2014-03-12T23:55:15Z

Thanks guys.

OK - so the contract is to always check done before each next, thereby avoiding that is undefined. Would it make more sense for iterators to have a next that returns a (done, value, state) tuple? At the moment an iterator needs to check for doneness in next too - to throw a bounds error. This would remove any ambiguity by binding the two operations together. value would be () if done is true.

ivarne · 2014-03-13T08:20:00Z

I also wondered why we need two functions (done and next), but I think I managed to find some arguments.

Many implementations of next does not check bounds. Only methods that would otherwise segfault does that in order to prevent memory corruption. You are an example to show that it is not safe to put the @inbounds declaration in next because we assume that it will not be called when done is false. Maybe we could have something like a unsafe_next that would be allowed to segfault if not called right after done?
If next returned (done, value, state) we would not always have a valid value of the correct type to return for value. That would lead to type unstable, slow code, because we can not raise an exception as we do now. The alternative would be to make the done returned from next tell if there are one more element in the iterable, but that would make you unable to process the result of one task before the next has arrived.
Python generators uses exceptions to stop iteration of generators, and while that is convenient in a lot of cases, it is much slower than the done check, and thus not suitable for Julia.

mbauman · 2014-03-13T15:21:34Z

Yes, that makes sense. Thanks for the explanation. It's surprisingly tricky.

It still feels weird to be mutating the task within the done check. Would it make more sense to do this in start? It's even simpler if you store the next result in the state, but that breaks a few other methods. As an aside, are there no tests for Tasks?

start(t::Task) = next(t)[2]
done(t::Task, val) = istaskdone(t)
next(t::Task, val) = (t.result, (t.result = consume(t); nothing))

JeffBezanson · 2014-03-13T15:51:05Z

These definitions work fairly well:

start(t::Task) = consume(t)
done(t::Task, val) = istaskdone(t)
next(t::Task, val) = (val, consume(t))

The only downside is that they switch to the task twice before you actually get the first value. It would be nice to be able to compute the second thing that next returns at the end of the loop body rather than at the top. Interestingly, this can lead to better code generation too (as I found in #5469), but fortunately there was a workaround for now.

That seems to ask for nextval and nextstate functions, which would also cut down on tuples. While that would be the most breaking change of all time, it's still interesting to think about.

mbauman · 2014-03-13T15:53:15Z

That's exactly what I played with first… it's wonderfully simple but it seems that some other methods (like schedule) depend upon t.result being used. But that can be fixed.

JeffBezanson · 2014-03-13T15:54:32Z

I don't think any other code depends on the iterator setting t.result. Really, that mutating during iteration is just a big nuisance.

StefanKarpinski · 2014-03-13T16:15:37Z

The nextval nextstate thing could actually be made to be relatively non-disruptive with these interim definitions:

nextval(itr,state) = next(itr,stat)[1]
nextstate(itr,state) = next(itr,stat)[2]
next(itr,state) = (nextval(itr,state), nextstate(itr,state))

These seem circular, but of course the circularity would be broken when any itr type had one or the other set of definitions – next or nextval and nextstate. I'm not advocating that since I think this is kind of ugly, but it's worth considering alternatives. As I recall, one of the motivations for the way next simultaneously returns a value and computes the next index was iteration over UTF-8 strings, where the character decoding process also naturally gives you the starting index of the next character.

JeffBezanson · 2014-03-13T16:29:17Z

Quite true; the other side of the coin is being able to share computation between nextval and nextstate.

StefanKarpinski · 2014-03-13T16:41:33Z

It's arguable that you could just do that work twice most of the time and as long as the nextval and nextstate calls get inlined, LLVM would probably be able to eliminate the duplication. There are cases where there's enough work to be done that inlining wouldn't happen, but in such cases maybe some other approach to avoiding the duplicated work would be warranted.

StefanKarpinski closed this as completed Mar 12, 2014

This was referenced Mar 20, 2014

Filter(f, Task) is fundamentally broken #6224

Closed

enumerate(eachline(file)) produces every other line of file #6284

Closed

JeffBezanson mentioned this issue Apr 7, 2014

task iter does not consume in done call. fixes #6447 #6448

Closed

JeffBezanson mentioned this issue Aug 26, 2014

iterators that don't know they're done until they've tried #8149

Closed

nolta mentioned this issue Nov 27, 2014

WIP: split next into nextval & nextstate #9182

Closed

simonster mentioned this issue Dec 12, 2014

Code generation and cartesian iteration #9080

Closed

JeffBezanson mentioned this issue Feb 10, 2016

add Flatten iterator #14805

Merged

timholy mentioned this issue Jun 11, 2016

julep: extended iteration API, with proof-of-principle for fixing performance of Cartesian iteration #16878

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Task advances on done() not next() ? #6125

Task advances on done() not next() ? #6125

filmackay commented Mar 12, 2014

JeffBezanson commented Mar 12, 2014

ivarne commented Mar 12, 2014

StefanKarpinski commented Mar 12, 2014

ivarne commented Mar 12, 2014

StefanKarpinski commented Mar 12, 2014

filmackay commented Mar 12, 2014

ivarne commented Mar 13, 2014

mbauman commented Mar 13, 2014

JeffBezanson commented Mar 13, 2014

mbauman commented Mar 13, 2014

JeffBezanson commented Mar 13, 2014

StefanKarpinski commented Mar 13, 2014

JeffBezanson commented Mar 13, 2014

StefanKarpinski commented Mar 13, 2014

Task advances on done() not next() ? #6125

Task advances on done() not next() ? #6125

Comments

filmackay commented Mar 12, 2014

JeffBezanson commented Mar 12, 2014

ivarne commented Mar 12, 2014

StefanKarpinski commented Mar 12, 2014

ivarne commented Mar 12, 2014

StefanKarpinski commented Mar 12, 2014

filmackay commented Mar 12, 2014

ivarne commented Mar 13, 2014

mbauman commented Mar 13, 2014

JeffBezanson commented Mar 13, 2014

mbauman commented Mar 13, 2014

JeffBezanson commented Mar 13, 2014

StefanKarpinski commented Mar 13, 2014

JeffBezanson commented Mar 13, 2014

StefanKarpinski commented Mar 13, 2014