Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support init keyword in sum/prod/maximum/minimum #36188

Merged
merged 26 commits into from
Jun 11, 2020
Merged

Conversation

tkf
Copy link
Member

@tkf tkf commented Jun 8, 2020

This PR extends/finalizes @timholy's #35839. It adds init kwarg also to sum and prod.

close #35839

@tkf tkf requested a review from nalimilan June 8, 2020 04:32
base/reduce.jl Outdated Show resolved Hide resolved
base/reduce.jl Show resolved Hide resolved
base/reduce.jl Outdated Show resolved Hide resolved
test/reduce.jl Outdated Show resolved Hide resolved
Copy link
Member

@timholy timholy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, thanks so much for picking this up and getting it across the finish line!

base/reduce.jl Show resolved Hide resolved
base/reduce.jl Show resolved Hide resolved
base/reflection.jl Outdated Show resolved Hide resolved
@tkf
Copy link
Member Author

tkf commented Jun 8, 2020

BTW, why don't we guarantee that init is only used for empty collections?

how about removing "neutral"?

@nalimilan I believe we should keep these caveats. It is highly beneficial that reduce implementation can use init that is guaranteed to be the neutral element if we want to parallelize reduce. Or for having some parallelized API that is Base-compatible.

Consider reduce(op, xs; init = x0) with collect(xs) = [a, b, c, d]. If xs is a vector, we'd split xs in multiple threads equally:

x = reduce(op, [a, b]; init = x0)  # in thread 1
y = reduce(op, [c, d]; init = x0)  # in thread 2
ans = op(x, y)

However, it's not always possible to guarantee that split collections are non-empty. For example, xs might be:

xs = (x for x in [bad, bad, bad, bad, a, b, c, d] if x !== bad)

then the collection-splitting routine might produce

left = (x for x in [bad, bad, bad, bad] if x !== bad)
right = (x for x in [a, b, c, d] if x !== bad)

(That's how SplittablesBase.jl supports Iterators.filter)

Then, we'd have

x = reduce(op, []; init = x0)            # in thread 1
y = reduce(op, [a, b, c, d]; init = x0)  # in thread 2

@assert x == x0

ans = op(x, y)

As you can see, x0 is used even though xs itself is not empty and it is important that this is the neutral element so that we have ans == y.

Of course, the parallel implementation of reduce can completely ignore the init keyword and internally use appropriate neutral element. However, having a concrete init helps a lot for JIT'ing single-threaded basecase code. I think it'd be better to keep the current semantics documented in reduce.

@nalimilan
Copy link
Member

OK, I was just curious whether we currently make use of that possibility. But for "neutral", IMHO the docstring should either mention that the behavior is unspecified (like for other docstrings), or the word should be removed.

@tkf
Copy link
Member Author

tkf commented Jun 8, 2020

I was just curious whether we currently make use of that possibility

It's possible but I'm not sure. I just think it's a reasonable API to keep it as-is.

base/reduce.jl Outdated Show resolved Hide resolved
base/reduce.jl Outdated Show resolved Hide resolved
base/reduce.jl Outdated Show resolved Hide resolved
base/reduce.jl Show resolved Hide resolved
base/reduce.jl Outdated Show resolved Hide resolved
base/reduce.jl Outdated Show resolved Hide resolved
base/reduce.jl Outdated Show resolved Hide resolved
base/reduce.jl Outdated Show resolved Hide resolved
base/reduce.jl Outdated Show resolved Hide resolved
base/reduce.jl Outdated Show resolved Hide resolved
base/reduce.jl Outdated Show resolved Hide resolved
base/reduce.jl Show resolved Hide resolved
Copy link
Member

@nalimilan nalimilan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@tkf tkf merged commit 70d8497 into JuliaLang:master Jun 11, 2020
@tkf tkf deleted the sum-init branch June 11, 2020 07:50
@tkf
Copy link
Member Author

tkf commented Jun 11, 2020

@nalimilan @timholy Thanks a lot for your detailed review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fold sum, maximum, reduce, foldl, etc.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants