Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Document that mapreduce(f, op, _) may exploit commutativity of op and purity of f #36424

Closed
wants to merge 2 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions base/reduce.jl
Original file line number Diff line number Diff line change
Expand Up @@ -270,6 +270,10 @@ In general, it will be necessary to provide `init` to work with empty collection
intermediate collection needs to be created. See documentation for [`reduce`](@ref) and
[`map`](@ref).

Known commutativity of the operation `op` such as `+` and `max` may be exploited in the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "known commutativity" mean? Are we strict about the fact that max should be commutative, or are we limiting ourselves to only some max(::T, ::T)s?

We already say that the associativity of reduce isn't specified (and that if you want to demand a particular associativity to use fold[lr])... I know that's not the same as commutativity, but it feels very similar since it means the accumuland might appear on either side.

Copy link
Member Author

@tkf tkf Jun 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant by "known commutativity" was that max implementer must make it commutative.

I guess we can instead restrict this to known functions and input types (so methods). But then we can't use this for something like maximum(f, xs) because robustly dispatching on return type is not really possible.

Are there use cases for non-commutative max, min, +, &, | etc.? If I need something similar to max but not commutative, I think I'm OK with creating a new function maxish. I think it's better to demand more to the implementer so that the higher order functions can optimize more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bigger point here is simply: are there reduction algorithms that care about commutativity but not associativity? Associativity seems like the bigger deal to me, but perhaps that's just a limitation of my own imagination.

My gut reaction would lean towards making (map)fold[lr] the implementations that guarantee both associativity and commutativity, and free mapreduce to do whatever it wants on both properties.

Copy link
Member Author

@tkf tkf Jun 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Associativity seems like the bigger deal to me

That's my understanding, too.

From a sequential computation tree

        +
       / \
      +   4
     / \
    +   3
   / \
  +   2
 / \
0   1

you can't lower the height of the tree (= data dependency) by using only commutativity:

  +
 / \
4    +
    / \
   +   3
  / \
 2   +
    / \
   0   1

while it's possible with associativity

     +
   /   \
  +     +
 / \   / \
1   2 3   4

I can't see how modifying the tree using commutativity helps the computation. I think commutativity is useful only if the op already is associative. (Though I'd love to know if there is something you can do only with commutativity.)

free mapreduce to do whatever it wants on both properties

I think my proposal is a bit more subtle. For the associativity, it's the caller that asserts it. For the commutativity, it's the implementer of op. I guess this "asymmetry" of who asserts what is not really the most beautiful API. Some alternatives are:

(1) Add a wrapper type

struct Commutative{F} <: Function
    op::F
end

(f::Commutative)(a, b) = f.op(a, b)

so that mapreduce can dispatch on op::Commutative{<:Union{typeof(max), typeof(min)}}. This way, it'd always be the caller who asserts certain properties. For example, you'd write maximum(f, xs) = mapreduce(f, Commutative(max), xs) and document that max(::eltype(xs), ::eltype(xs)) must be commutative. (This is something I meant to explore in JuliaFolds/Transducers.jl#143.)

(2) Add reduce_commutative (with a better name) that requires op to be associative and commutative. If we go this way, maybe we can use a consistent naming scheme like folda (= reduce) for associative op and foldac for associative and commutative op.

(3) Add reduce_dwim (with a better name) that also auto-detect known associativity. This way, it's always the implementer who asserts the associativity.

Ultimately, I think using reduce for (3) would be ideal for users. They can just call reduce and it dispatches to an optimal implementation. But it is not possible to do this until Julia 2.0. I think what I suggest in this PR can be used as an intermediate step for implementing (3) in Julia 2.0.

implementation. This means that the elements of `itr` may be accessed out of order and
the order in which `f` is called is implementation-defined.
Comment on lines +274 to +275
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mbauman What do you think about the purity/out-of-order part?


!!! compat "Julia 1.2"
`mapreduce` with multiple iterators requires Julia 1.2 or later.

Expand Down Expand Up @@ -440,6 +444,9 @@ use non-associative operations like `-` because it is undefined whether `reduce(
should be evaluated as `(1-2)-3` or `1-(2-3)`. Use [`foldl`](@ref) or
[`foldr`](@ref) instead for guaranteed left or right associativity.

Known commutativity of the operation `op` such as `+` and `max` may be exploited in the
implementation. This means that the elements of `itr` may be accessed out of order.

Some operations accumulate error. Parallelism will be easier if the reduction can be
executed in groups. Future versions of Julia might change the algorithm. Note that the
elements are not reordered if you use an ordered collection.
Expand Down