Mapreduce performance #35
Comments
The tweaks in 76309a0#diff-88161f25d2abf2976952b3a2c500b755 appear to have closed some of the performance gaps that can be seen above. The remaining gaps seem restricted to cases in which a mapreduce-related method is called on a IMO both approaches to determining the type of the empty I'm slightly more of the mind that such cases ought to return Most recent run:
I'm not sure I completely understand the problem, but this sounds similar to the question of the type to return from a comprehension on an empty iterator input. One of the solutions discussed there is to return an Would it make sense to return a
That issue does seem quite similar, insofar as it too concerns what to do with the Currently, lifted binary operators such as
julia> Y = NullableArray(Int, 5)
5-element NullableArrays.NullableArray{Int64,1}:
Nullable{Int64}()
Nullable{Int64}()
Nullable{Int64}()
Nullable{Int64}()
Nullable{Int64}()
julia> Y[2:5] = [2:5...]
4-element Array{Int64,1}:
2
3
4
5
julia> Y
5-element NullableArrays.NullableArray{Int64,1}:
Nullable{Int64}()
Nullable(2)
Nullable(3)
Nullable(4)
Nullable(5)
julia> reduce(+, Y)
Nullable{Int64}()
julia> ans.value
12884901905
This is fine, since you oughtn't to expect a meaningful (or even safe) answer from retrieving the
julia> op(x::Nullable, y::Nullable) = Nullable(x.value + y.value < 10 ? 5 : 5.0, x.isnull | y.isnull)
op (generic function with 1 method)
julia> X = NullableArray([1:5...], [fill(false, 4); true])
5-element NullableArrays.NullableArray{Int64,1}:
Nullable(1)
Nullable(2)
Nullable(3)
Nullable(4)
Nullable{Int64}()
julia> reduce(op, X)
Nullable{Float64}()
julia> X.values[5] = 1
1
julia> X[5]
Nullable{Int64}()
julia> reduce(op, X)
Nullable{Int64}()
Arguably, any user who writes such a whimsical function probably deserves equally whimsical behavior when reducing with that function. That being said, I think this behavior is less than desirable, since it is reasonable to expect that computing the return type of an empty Note that this issue only concerns instances where one calls a reducing method with One alternative is to forgo the reducing computations entirely in the presence of null entries and simply return
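To make the type-instability concern concrete, here is a minimal sketch that runs without NullableArrays.jl. `Maybe{T}` is a hypothetical stand-in for `Nullable{T}`, and `reduce_shortcircuit` is an illustrative name, not a function from the package. It assumes that the short-circuiting alternative would return an empty value typed by the array's element type, and it uses `zero(T)` as an arbitrary placeholder for the hidden value.

```julia
# Hypothetical stand-in for Nullable{T}: a value plus a null flag.
struct Maybe{T}
    value::T
    isnull::Bool
end
Maybe(v) = Maybe(v, false)

# Same "whimsical" operator as in the session above: the result type
# depends on the stored values, including the value hidden behind a null.
op(x::Maybe, y::Maybe) = Maybe(x.value + y.value < 10 ? 5 : 5.0, x.isnull | y.isnull)

# Two arrays that differ only in the value hidden behind the null entry.
xs = [Maybe(1), Maybe(2), Maybe(3), Maybe(4), Maybe(5, true)]
ys = [Maybe(1), Maybe(2), Maybe(3), Maybe(4), Maybe(1, true)]

# Assumed alternative: skip the reduction when any entry is null and
# return an empty Maybe typed by the element type of the input.
function reduce_shortcircuit(op, xs::AbstractVector{Maybe{T}}) where {T}
    any(x -> x.isnull, xs) && return Maybe{T}(zero(T), true)
    return foldl(op, xs)
end

typeof(foldl(op, xs))                 # depends on the hidden 5
typeof(foldl(op, ys))                 # depends on the hidden 1
typeof(reduce_shortcircuit(op, xs))   # depends only on eltype(xs)
```

With the plain fold, the type of the empty result follows whatever value happens to sit behind the null entry. With the short-circuit, the null case no longer depends on it, although the non-null case is still only as type-stable as `op` itself.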
Thoughts about these options:
Some more thoughts:
and
seems to provide good reason at least to adopt option 3 for now, since, at least based on these numbers, this ought to perform much better than the current implementation and possibly on a par with the
Here are my results from running the profiling methods included in https://github.com/johnmyleswhite/NullableArrays.jl/blob/master/perf/mapreduce.jl:
Though more than for broadcast, there is still relatively little specialized code for mapreduce
other specialized methods for skipping over null entries and hooking them into the general mapreduce interface. A few noteworthy items:
sumabs and sumabs2 are 10x faster than they were last week without my having touched them. I suspect this is due to improvements in Base Julia, possibly to do with codegen? Now I'm beginning to understand the importance of rigorously tracking performance against environmental variables, since it would have been interesting to see what caused the speedup.
NullableArrays as it is for DataArrays, but that speedup is lost by the time we get to the exposed interface.
NullableArrays.
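For readers unfamiliar with what "skipping over null entries" involves, here is a rough sketch of such a reduction over a values array and a parallel Boolean mask. `mapreduce_skipnull` is a hypothetical name and this is not the package's implementation. Returning `nothing` when every entry is null is an assumption made to keep the example short.

```julia
# Fold `op` over `f(values[i])` for every index whose mask entry is false.
# Returns `nothing` if all entries are null (an assumption of this sketch).
function mapreduce_skipnull(f, op, values::AbstractVector, isnull::AbstractVector{Bool})
    acc = nothing
    for i in eachindex(values, isnull)
        isnull[i] && continue          # skip null entries entirely
        v = f(values[i])
        acc = acc === nothing ? v : op(acc, v)
    end
    return acc
end

# Sum of squares over the non-null entries, in the spirit of sumabs2.
vals = [1, 2, 3, 4, 5]
mask = [false, false, false, false, true]
mapreduce_skipnull(abs2, +, vals, mask)
```

The hidden value behind the null entry is never read, so it cannot influence either the result or its type.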