RFC: Document that mapreduce(f, op, _) may exploit commutativity of op and purity of f #36424
@@ -270,6 +270,10 @@ In general, it will be necessary to provide `init` to work with empty collection
intermediate collection needs to be created. See documentation for [`reduce`](@ref) and
[`map`](@ref).

Known commutativity of the operation `op` such as `+` and `max` may be exploited in the

What does "known commutativity" mean? Are we strict about the fact that `max` should be commutative, or are we limiting ourselves to only some `max(::T, ::T)`s?
We already say that the associativity of `reduce` isn't specified (and that if you want to demand a particular associativity you should use `fold[lr]`)... I know that's not the same as commutativity, but it feels very similar since it means the accumuland might appear on either side.
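
For reference, the difference between a fixed association and an unspecified one is easiest to see with a non-associative operator. This is a minimal sketch using only Base functions; the choice of `-` and `1:4` is purely illustrative.

```julia
# `-` is neither associative nor commutative, so the grouping is observable.
left  = foldl(-, 1:4)   # ((1 - 2) - 3) - 4
right = foldr(-, 1:4)   # 1 - (2 - (3 - 4))

println((left, right))  # (-8, -2)

# `reduce` documents that its associativity is implementation-dependent,
# so `reduce(-, 1:4)` should not be relied on to match either value.
```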

What I meant by "known commutativity" was that the implementer of `max` must make it commutative.
I guess we can instead restrict this to known functions and input types (so methods). But then we can't use this for something like `maximum(f, xs)` because robustly dispatching on return type is not really possible.
Are there use cases for non-commutative `max`, `min`, `+`, `&`, `|`, etc.? If I need something similar to `max` but not commutative, I think I'm OK with creating a new function `maxish`. I think it's better to demand more of the implementer so that higher-order functions can optimize more.

The bigger point here is simply: are there reduction algorithms that care about commutativity but not associativity? Associativity seems like the bigger deal to me, but perhaps that's just a limitation of my own imagination.
My gut reaction would lean towards making `(map)fold[lr]` the implementations that guarantee both associativity and commutativity, and free `mapreduce` to do whatever it wants on both properties.

> Associativity seems like the bigger deal to me

That's my understanding, too.
From a sequential computation tree

            +
           / \
          +   4
         / \
        +   3
       / \
      +   2
     / \
    0   1

you can't lower the height of the tree (= data dependency) by using only commutativity:

      +
     / \
    4   +
       / \
      +   3
     / \
    2   +
       / \
      0   1

while it's possible with associativity:

        +
      /   \
     +     +
    / \   / \
   1   2 3   4

I can't see how modifying the tree using commutativity helps the computation. I think commutativity is useful only if the `op` is already associative. (Though I'd love to know if there is something you can do only with commutativity.)
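
The balanced tree above can be sketched in a few lines. `tree_reduce` is a hypothetical helper written for this illustration, not something in Base; it regroups the operands without reordering them, so it relies on associativity only.

```julia
# Hypothetical helper (not part of Base): reduce `xs[lo:hi]` as a balanced tree.
# Only the grouping changes; operands keep their left-to-right order,
# so this is valid for any associative `op`, commutative or not.
function tree_reduce(op, xs, lo = firstindex(xs), hi = lastindex(xs))
    lo > hi && throw(ArgumentError("collection must be non-empty"))
    lo == hi && return xs[lo]
    mid = div(lo + hi, 2)
    return op(tree_reduce(op, xs, lo, mid), tree_reduce(op, xs, mid + 1, hi))
end

words = ["a", "b", "c", "d"]

# String concatenation `*` is associative but not commutative.
sequential = foldl(*, words)        # (("a" * "b") * "c") * "d"
balanced   = tree_reduce(*, words)  # ("a" * "b") * ("c" * "d")

println(sequential == balanced)     # true
```

The recursion depth is logarithmic in the length of `xs`, which is the reduced data dependency that the third tree shows.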

> free `mapreduce` to do whatever it wants on both properties

I think my proposal is a bit more subtle. For the associativity, it's the caller that asserts it. For the commutativity, it's the implementer of `op`. I guess this "asymmetry" of who asserts what is not really the most beautiful API. Some alternatives are:
(1) Add a wrapper type

    struct Commutative{F} <: Function
        op::F
    end
    (f::Commutative)(a, b) = f.op(a, b)

so that `mapreduce` can dispatch on `op::Commutative{<:Union{typeof(max), typeof(min)}}`. This way, it'd always be the caller who asserts certain properties. For example, you'd write `maximum(f, xs) = mapreduce(f, Commutative(max), xs)` and document that `max(::eltype(xs), ::eltype(xs))` must be commutative. (This is something I meant to explore in JuliaFolds/Transducers.jl#143.)
(2) Add `reduce_commutative` (with a better name) that requires `op` to be associative and commutative. If we go this way, maybe we can use a consistent naming scheme like `folda` (= `reduce`) for associative `op` and `foldac` for associative and commutative `op`.
(3) Add `reduce_dwim` (with a better name) that also auto-detects known associativity. This way, it's always the implementer who asserts the associativity.
Ultimately, I think using `reduce` for (3) would be ideal for users. They can just call `reduce` and it dispatches to an optimal implementation. But it is not possible to do this until Julia 2.0. I think what I suggest in this PR can be used as an intermediate step for implementing (3) in Julia 2.0.
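
To make alternative (1) concrete, here is a rough sketch of how dispatch on the wrapper might look. `my_mapreduce` and `my_maximum` are hypothetical names invented for this example, not Base API. For brevity the sketch dispatches on any `Commutative` rather than on the narrower `Commutative{<:Union{typeof(max), typeof(min)}}` mentioned above, and reverse iteration merely stands in for whatever reordering a real implementation would choose.

```julia
# Wrapper from alternative (1): the caller asserts that `op` is commutative.
struct Commutative{F} <: Function
    op::F
end
(f::Commutative)(a, b) = f.op(a, b)

# Hypothetical reducer (not Base API). The generic method keeps operand order.
my_mapreduce(f, op, xs) = mapfoldl(f, op, xs)

# With the commutativity assertion, an implementation is free to visit
# elements in any order; reverse order stands in for "any order" here.
my_mapreduce(f, op::Commutative, xs) = mapfoldl(f, op.op, Iterators.reverse(xs))

# Hypothetical `maximum`-like entry point built on the wrapper.
my_maximum(f, xs) = my_mapreduce(f, Commutative(max), xs)

println(my_maximum(abs, [-3, 1, 2]))         # 3
println(my_mapreduce(string, *, [1, 2, 3]))  # 123
```

The unwrapped call with `*` on strings keeps its left-to-right order because nothing asserted commutativity for it.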
Co-authored-by: Stefan Karpinski <[email protected]>
implementation. This means that the elements of `itr` may be accessed out of order and
the order in which `f` is called is implementation-defined.
@mbauman What do you think about the purity/out-of-order part?
@mbauman should we close this, since the docs already mention associativity, and the additional mention of commutativity doesn't sound that useful? perhaps we should actually just suggest that all implementations be commutative, and someday worry about changing some like |
Commutativity can be useful for table join operations, because join(smalltable, bigtable) is often written with a full sequential scan on the left and a hash lookup on the right. Being able to swap the operands can make it much faster. It doesn't have to be written this way of course, but I think it's a good example of where commutativity is helpful. Commutativity will also help whenever you're receiving items asynchronously out of order and you want to combine them with an accumulator. If the 4th item arrives before the 3rd, you can proceed without waiting. Consider too when the arguments are in different locations. For example, |
Currently `reduce` etc. document that they may exploit associativity of the operator. However, we already exploit commutativity of certain operations like `max`:
julia/base/reduce.jl
Lines 627 to 630 in 98a845f
Also, using `@simd` arguably requires this caveat. So, I propose to document the current behavior.
If we do this, I think it makes sense to also require that `f` and `getindex(A, _)` for `mapreduce(f, op, A)` are reasonably pure and that they may be called out of order. Again, this is already assumed in `mapreduce(f, max, array)` since it runs over the array (up to) twice.
julia/base/reduce.jl
Lines 642 to 650 in 98a845f
In particular, it means that it's possible to implement a more efficient `maximum(f, ::PermutedDimsArray)` etc.
close #36081
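
As a small illustration of why purity of `f` matters, the sketch below records the order in which the mapped function is called. The names `calls` and `record` are made up for this example. Only the `mapfoldl` call order is something the documentation guarantees; for `mapreduce` the sketch deliberately relies on the result alone.

```julia
# Record the order in which the mapped function is called.
calls = Int[]
record(x) = (push!(calls, x); x)

# `mapfoldl` guarantees left-to-right evaluation, so the call order is fixed.
mapfoldl(record, max, [3, 1, 2])
println(calls == [3, 1, 2])   # true

# For `mapreduce`, only the result should be relied on; under the proposed
# wording the order in which `record` is called is implementation-defined.
empty!(calls)
result = mapreduce(record, max, [3, 1, 2])
println(result)               # 3
```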