RFC: efficient product iterator #14596
Also, I tried to be careful to make the same start/done/next calls that nested loops would. I think it's pretty close but it makes the code slightly more awkward than it could be. It's really quite amazing how many ways there are to write
I suspect this would be easier, and probably faster, with #9182---you almost surely wouldn't need the whole With the current architecture, is there any way to avoid the
    s1, s2, Nullable{eltype(p.b)}(), (done(p.a,s1) || done(p.b,s2))
end

@inline function prod_next(p, st)
rename as next and delete the definition below?
It's used by both Prod and Prod2.
Ah, I missed that.
I'm not sure that's true --- to avoid calling
You could call
True, that would work for iterators where you want to avoid extra state changes, but it still depends on the iterator being able to return the same value again, so somebody has to store it.
I'm afraid I don't follow. Currently It seems to me that with #9182, this would reduce to
start(p::Prod2) = (start(p.a), start(p.b))
nextval(p::Prod2, s) = nextval(p.a, s[1]), nextval(p.b, s[2])
nextstate(p::Prod2, s) = isdone(p.a, s[1]) ? (start(p.a), nextstate(p.b, s[2])) : (nextstate(p.a, s[1]), s[2])
Or, if
nextval(p::Prod2, s, vs) = (v1,vs1 = nextval(p.a, s[1], vs[1]); v2,vs2 = nextval(p.b, s[2], vs[2]); (v1,v2), (vs1,vs2)
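The split protocol proposed above can be tried out in isolation. The following is a minimal sketch, not the PR's implementation: `start`, `isdone`, `nextval`, and `nextstate` are the hypothetical protocol names from the comment (they are defined locally here, not taken from Base), the vector methods and the `collect_all` driver are made up for illustration, and the code assumes Julia 1.0 or later. One deliberate variation: `nextstate` checks whether the inner iterator is exhausted after advancing it, so that `nextval` is never called on a finished state.

```julia
# Hypothetical protocol from the discussion; none of these are Base APIs.
struct Prod2{A,B}
    a::A
    b::B
end

# Illustrative protocol methods for vectors: the state is simply an index.
start(v::AbstractVector) = 1
isdone(v::AbstractVector, s) = s > length(v)
nextval(v::AbstractVector, s) = v[s]
nextstate(v::AbstractVector, s) = s + 1

# Product of two iterators: the state is a pair of component states.
start(p::Prod2) = (start(p.a), start(p.b))
isdone(p::Prod2, s) = isdone(p.a, s[1]) || isdone(p.b, s[2])
nextval(p::Prod2, s) = (nextval(p.a, s[1]), nextval(p.b, s[2]))

function nextstate(p::Prod2, s)
    s1 = nextstate(p.a, s[1])
    # When the inner iterator runs out, restart it and advance the outer one.
    return isdone(p.a, s1) ? (start(p.a), nextstate(p.b, s[2])) : (s1, s[2])
end

# Illustrative driver that walks any iterator implementing the protocol.
function collect_all(p)
    out = Any[]
    s = start(p)
    while !isdone(p, s)
        push!(out, nextval(p, s))
        s = nextstate(p, s)
    end
    return out
end

result = collect_all(Prod2([1, 2], ['x', 'y', 'z']))
```

Because reading a value and advancing the state are separate calls, the product never has to store a value from either component, which is the simplification the comment is pointing at.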
Yes. If you allow repeated calls to |
Wouldn't it just be
nextval(iter, state, valstate) = fetch_value(iter, state), 0
when you can look up a value (which is probably >90% of all cases?), and
function nextval(iter, state, valstate)
    state == valstate.state && return get(valstate.value), valstate
    nv = fetch_value(iter, state)
    nv, ValState(Nullable(nv), state)
end
for a streaming container? This restricts usage of The main point being that, once you've implemented "persistence" for any iterators that need it, building more complex iterators becomes a more composable operation.
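To make the "persistence" idea concrete, here is a small sketch of a streaming source whose values can be fetched only once, with the last value cached next to the state it was fetched at. Everything in it is an assumption for illustration: `OneShotStream` is a made-up type, `ValState`, `fetch_value`, and `nextval` follow the hypothetical names used in the comment, and `Union{Nothing,Some{T}}` stands in for `Nullable`, which no longer exists in Julia 1.x.

```julia
# Hypothetical names throughout; none of this is a Base API.
struct ValState{T,S}
    value::Union{Nothing,Some{T}}  # cached value, or nothing if none yet
    state::S                       # iteration state the value was fetched at
end

# Made-up streaming source that counts how often it is actually read.
mutable struct OneShotStream
    items::Vector{Int}
    reads::Int
end

# Fetching has a side effect, so repeating it for one state must be avoided.
function fetch_value(stream::OneShotStream, state)
    stream.reads += 1
    return stream.items[state]
end

function nextval(stream::OneShotStream, state, valstate::ValState)
    # Reuse the cached value when asked again for the same state.
    if valstate.value !== nothing && valstate.state == state
        return something(valstate.value), valstate
    end
    nv = fetch_value(stream, state)
    return nv, ValState{Int,Int}(Some(nv), state)
end

stream = OneShotStream([10, 20, 30], 0)
empty_cache = ValState{Int,Int}(nothing, 0)

v1, cache = nextval(stream, 1, empty_cache)  # reads the stream
v2, cache = nextval(stream, 1, cache)        # served from the cache
v3, cache = nextval(stream, 2, cache)        # new state, reads again
```

Only the streaming type pays for the cache; an indexable container could return its value directly and ignore the value state.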
But you're probably right that it's not much harder for an "uncooperative" iterator with the current architecture (EDIT: if we required that iterators support repeated calls to
Regarding ordering, @marcusps had a PR JuliaCollections/Iterators.jl#40 to implement Lexicographic ordering. So, there are use cases.
It would be quite natural to also define
and declare
I don't love exporting the name for this, could be confusing. Why can't this be/stay in Iterators.jl?
@JeffBezanson's policy has been to move commonly useful iterators into Base if they have really efficient implementations (i.e. as fast as what you would write out by hand).
Why should performance determine whether something is in base or a package? Without context, exporting We could use a bit more namespacing and modularization in base for this kind of thing.
Obviously not everything that is fast should be in base, but certainly things that are slow should generally not be. This is not my argument, I'm just telling you what the policy has been.
Stuff in Base has a stronger implied warranty because you didn't choose to install it. So it's silly to put things in there that we then have to tell people not to use because they're too slow. The next question is whether the functionality is important enough. I suspect it is; I'm considering basing N-d comprehensions on this. I can understand the naming concern; the term "product" is quite overloaded. The same applies to iterators that might have both a lazy and eager version. So maybe a module makes sense.
+1 to having these in a
If you are thinking about it currently: a shape/
How should we name
Base.Iter sounds okay. Thinking to export or not export the Iter module?
🚲 🏠 Should probably be
Not loving it. I would propose not exporting the module. If we had #1255, I would prefer leaving them both with the name
import Base.Iterators
import Iterators as Iterators2
or
using Base.Iterators
using Iterators as Iterators2 # since "using" imports the module name into the current scope
I'm with @kmsquire on this one.
Alternatively,
@kmsquire I agree that's nicer, but it seems odd to introduce a deliberate name conflict with a package.
What if the
I suppose that could make sense, but we don't currently allow that. It would especially complicate package precompilation.
Python and Rust use the name
How about this: for now, I will add this functionality without exporting
Sounds great.
note: not exported yet
5bb4184 to c4c826a
This uses the same general trick as the zip iterator to get performance without generated functions. I'm sure it could be better, but it's at least reasonably fast:
The times are pretty variable but this looks something like a factor of 2.
This implements the same iteration order as comprehensions (first iterator is the innermost loop). Would anybody want lexicographic order in addition or instead?
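For readers weighing the two orders, the difference can be seen in a short sketch. It assumes a recent Julia release (1.0 or later), where a product iterator is reachable as `Iterators.product`; the variable names are illustrative only.

```julia
# Assumes Julia 1.0 or later, where Iterators.product is available.
a = 1:2
b = 'x':'z'

# Order produced by the product iterator: the first argument varies fastest.
product_order = vec(collect(Iterators.product(a, b)))

# The same order written as explicit nested loops, with `a` innermost.
loop_order = Tuple{Int,Char}[]
for y in b
    for x in a
        push!(loop_order, (x, y))
    end
end

# Lexicographic order for comparison: the last argument varies fastest.
lexicographic = [(x, y) for x in a for y in b]
```

Here `product_order` starts `(1, 'x'), (2, 'x'), (1, 'y')`, while `lexicographic` starts `(1, 'x'), (1, 'y'), (1, 'z')`.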