-
Notifications
You must be signed in to change notification settings - Fork 149
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement accumulate and friends #702
Conversation
This looks pretty good to me at a quick read, though I should take a closer look. It's impressive that you can get away with this high level of abstraction. |
I wrote too many |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I took so long to look at this.
I think it's basically good, though I worry just a bit that it's very different from the rest of the code in this package and is asking rather a lot from the compiler ;-) But hey, we should arguably be moving away from using so many generated functions, so perhaps we should just go with it.
src/mapreduce.jl
Outdated
data = _map(a, CartesianIndices(a)) do _, CI | ||
D = _valof(dims) | ||
I = Tuple(CI) | ||
J = Base.setindex(I, 1, D) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already import Base.setindex
elsewhere in deque.jl so it should be available. Should we just move that import to the main StaticArrays.jl file and use setindex
unqualified here?
src/mapreduce.jl
Outdated
|
||
struct _InitialValue end | ||
|
||
_maybeval(dims::Integer) = Val(Int(dims)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's just me, but I can't help reading this as _maybe_eval
:-/ Can we change to _maybe_val
?
results = _reduce( | ||
a, | ||
dims, | ||
(init = (similar_type(a, Union{}, Size(0))(), init),), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I noticed here is that for length-0 input, the result has eltype of Union{}
. I can see why, but in other cases we do try to to have a "sensible" output eltype for convenience, eg SA{Int}[] .+ SA{Int}[]
preserves the Int eltype via some hopefully-not-dubious trickery (#664). Is it possible to do something similar here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I noticed here is that for length-0 input, the result has eltype of
Union{}
.
I don't think it's possible to solve this in general as the output value depends on the function op
. I think another (somewhat) sensible output is to use the eltype
of input array. But this is not always the valid answer; e.g., accumulate(//, [1, 2])
(note that Base
's accumulate(//, Int[])
magically returns Rational{Int64}[]
as it uses promote_op
).
As all static arrays (including MArray
) are shape-immutable, I don't think it is so harmful to return Union{}
element type.
My preference is:
Union{}
promote_op(op, eltype(input), eltype(input))
eltype(input)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's possible to solve this in general as the output value depends on the function op
.
This is what #664 hacks around for map
by using return_type
😬 (but it's no worse than Base which does the same thing for broadcast
and collect
on empty containers). I've never been quite sure that's the right choice, but if it can't be made to work or is otherwise terrible I'm ok with Union{}
:-) Prior to #664 we used Union{}
for map
and people mostly seemed ok with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. I misread the last half of your first comment.
But I think one difficulty of accumulate
compared to map
is that output "eltype
" is kind of ill-defined (or maybe it's better to say length-dependent). It's conceivable to have "type sequence" like this
op(::A, ::A) :: B
op(::B, ::A) :: C
op(::C, ::A) :: D
op(::D, ::A) :: D # finally "stabilized"
But, even though B
, C
, D
are all different types, promote_type(B, C, D)
may return type E
that is a concrete type. It is also easy to have "oscillatory" output:
osc_op(::Nothing, ::Any) = missing
osc_op(::Missing, ::Any) = nothing
So, maybe we should use something like this?
function f(op, acc, x)
T = typeof(acc)
while true
acc = op(acc, x)
T = promote_type(T, typeof(acc))
non_existing_value && return T
# `non_existing_value` modeling truncation arbitrary-sized input.
end
end
return_type(f, Tuple{typeof(op), typeof(init), eltype(a)})
Impressively, julia
can figure out some non-trivial examples like Base._return_type(f, Tuple{typeof(+), Int, Union{Int,Missing}})
and Base._return_type(f, Tuple{typeof(osc_op), Missing, Int})
.
I'm not sure if we need to go this far, though. Base
is not doing this for accumulate
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, it's confusing to know what the empty case should do here and it's worse than map. With your fix we have
julia> accumulate((a,b)->(a+b)/2, SA[1,2])
2-element SArray{Tuple{2},Float64,1,2} with indices SOneTo(2):
1.0
1.5
julia> accumulate((a,b)->(a+b)/2, SA[1])
1-element SArray{Tuple{1},Int64,1,1} with indices SOneTo(1):
1
julia> accumulate((a,b)->(a+b)/2, SA{Int}[])
0-element SArray{Tuple{0},Int64,1,0} with indices SOneTo(0)
It seems like we must make a somewhat arbitrary choice for the length-0 case with init
. You've got
julia> accumulate((a,b)->(a+b)/2, SA{Int}[1], init=0)
1-element SArray{Tuple{1},Float64,1,1} with indices SOneTo(1):
0.5
julia> accumulate((a,b)->(a+b)/2, SA{Int}[], init=0)
0-element SArray{Tuple{0},Float64,1,0} with indices SOneTo(0)
Overall do you feel like this is an improvement on using Union
? I do like that cumsum
now preserves numeric eltypes and I feel like that could be helpful for users for the same reason that "fixing" map was.
We do have oddities like
julia> cumsum(Int8[1])
1-element Array{Int64,1}:
1
julia> cumsum(SA{Int8}[1])
1-element SArray{Tuple{1},Int8,1,1} with indices SOneTo(1):
1
julia> cumsum(SA{Int8}[1,2])
2-element SArray{Tuple{2},Int64,1,2} with indices SOneTo(2):
1
3
I think that inconsistency between here and Base might be harmless though, and it's certainly not clear how to "fix" it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall do you feel like this is an improvement on using
Union
?
That's a tough question... 😅 Personally, I think we should be using Union{}
eltype everywhere when it's not possible to figure out eltype
without relying on the compiler. However, we don't have a good toolset and coding convention for dealing with Union{}
eltype ("mutate-or-widen" style, even at the surface API level).
But I think the consistency of the API within a package and with respect to Base
matters. As map
is already using return_type
, overall, I think using return_type
seems to be a decent solution here.
Co-Authored-By: Chris Foster <[email protected]>
This reverts commit 4ca0144.
Looks great. I'd still like to see whether we can get everything inlined for the Is this good to go from your end? |
Yeah, I did everything I wanted to add. By the way, I added the inference-based eltype handling but I forgot to comment that. Did you check it? Re: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, I added the inference-based eltype handling but I forgot to comment that
Yes I noticed that 👍 I had a play around with it and it seems reasonable enough. I think the zero size choice may be somewhat arbitrary for cases other than init=_InitialValue
? I guess there might be some weird sense in taking typeof(init)
... considering the left-extended sequence init, init+a[1], init+a[1]+a[2], ...
. However that seems a bit theoretical rather than practical, given the value of init
isn't actually used in the empty case.
I'd be happy to go with what's here.
src/mapreduce.jl
Outdated
@inline Base.cumsum(a::StaticArray; kw...) = | ||
accumulate(Base.add_sum, a; init = Base.reduce_empty(Base.add_sum, eltype(a)), kw...) | ||
@inline Base.cumprod(a::StaticArray; kw...) = | ||
accumulate(Base.mul_prod, a; init = Base.reduce_empty(Base.mul_prod, eltype(a)), kw...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re: cumsum(SA{Int8}[1])
#702 (comment)
It's easy to make it Base
-compatible. So maybe it's worth doing this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I guess it's not so bad for +
given that already has a specialized add_sum
. So sure, if it's easy let's make this compatible.
I considered using struct _CustomInitialValue end
function mapaccumulate(f, op, xs)
accumulate(xs; init = _CustomInitialValue()) do acc, x
if acc isa _CustomInitialValue
f(x)
else
op(acc, f(x))
end
end
end |
With the use of |
Yeah, I think it's better this way as we are using I tried to merge this but I don't see the merge button. Also, it looks like I can't push to StaticArrays.jl. Maybe I don't have the right permission? |
Sorry, just a confusion with organization level permissions. Should be fixed now. |
Thanks! I just merged it. I appreciate that you reviewed this PR very thoroughly! |
Very nice — thanks for all the thought you put into addressing my quibbles and for some some nice improvement in functionality! |
This PR implements
accumulate
,cumsum
andcumprod
forStaticArray
.It const-props with vectors:
Functions are not completely inlined with matrices. Benchmarks show no allocations:
(edited by @c42f to add: fixes #730)