Skip to content
This repository has been archived by the owner on May 4, 2019. It is now read-only.

Rewrite broadcast() and map() based on lift() #166

Merged
merged 19 commits into from
Jan 23, 2017
Merged

Conversation

nalimilan
Copy link
Member

@nalimilan nalimilan commented Nov 11, 2016

Remove the custom implementation of broadcast(), and just call
the base method on the lift()ed method. This implements the
blacklist approach: methods with a custom lifting behavior like
isnull() should override lift() to get passed Nullable values
directly; if they do not return a Nullable, the result is a standard
Array rather than a NullableArray.

Performance will probably be worse than before, but at least the
semantics will be correct. We can always re-implement the custom
and faster versions later when broadcast() has stabilized in Base
and Nullable support has settled.

This is WIP since I haven't thought much about this yet, because it copies lift from JuliaLang/julia#18758 without too much thinking either, and because it only works on 0.6 at the moment.

Cf. discussions at #144 and #163.

X[1] = Nullable()
A[1]= 0
# FIXME: element-wise syntax should give the same result
# R = get.(X, 0)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nalimilan
Copy link
Member Author

@davidagold Where do you think lift should live? In NullableOps?

@davidagold
Copy link
Contributor

Preferably NullableArrays. Or am I wrong for thinking 'lift' and lifted ops
are sufficiently different strategies that they ought to live in different
packages?

On Sunday, November 13, 2016, Milan Bouchet-Valat [email protected]
wrote:

@davidagold https://github.com/davidagold Where do you think lift
should live? In NullableOps?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#166 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ALUCm4fyI9IQ1hlGuon_zLjcVTdMHRJPks5q9xxvgaJpZM4Kv-IA
.

@nalimilan
Copy link
Member Author

No, that's fine with me, I just wanted to be sure that was OK for you.

@nalimilan
Copy link
Member Author

JuliaLang/julia#19313 is kind of annoying: in some cases, the f.(x) syntax doesn't work for two-argument functions.

There's another related issue which is even more annoying: loop fusion prevents lift from detecting special (blacklisted) functions. So e.g. identity.(isnull.(x)) returns NULL instead of false for missing entries.

These two issues mostly affect corner cases, so this change is probably worth it anyway, but in the longer term I'm not sure what to do. The f.(x) syntax could be improved in Julia to be more flexible, i.e. calling a method with the function and input types as arguments, which could return a transformed function (lifted, in our case). Then loop fusion would combine these functions instead of the original ones. But I don't see this being accepted in Base without a strong use case, i.e. more or less official support for the lifting framework; we may as well merge JuliaLang/julia#18758 at the same time.

I'm starting to think (again) that all function calls should lift by default; that would be a much more consistent option.

@johnmyleswhite
Copy link
Member

I'm not sure we should move forward with this just yet. It might be the right decision (and I'm inclined to think it is), but it might be easier to start with a less ambitious approach in which people have to explicitly call special functions like isnull. People love to complain about the burdens of those kinds of things, but I worry that this approach will be too magical and will actually prevent people from learning how to handle non-standard semantics for lifting.

@nalimilan
Copy link
Member Author

So that means lifting by default, without any blacklist?

@bramtayl
Copy link

bramtayl commented Nov 15, 2016

This kind of thing could be integrated pretty easily in ChainMap. Once a lift function is available, I could make

@chain begin
    x
    @over is.null
    @lift_over identity
end

go to

lift(broadcast, identity, broadcast(is.null, x) )

Edit: works now on master

@nalimilan nalimilan force-pushed the nl/broadcast branch 2 times, most recently from 5f7c6a1 to 5de334f Compare November 20, 2016 16:34
@codecov-io
Copy link

codecov-io commented Nov 20, 2016

Current coverage is 60.18% (diff: 85.00%)

Merging #166 into master will decrease coverage by 24.67%

@@             master       #166   diff @@
==========================================
  Files            14         13     -1   
  Lines           865        663   -202   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
- Hits            734        399   -335   
- Misses          131        264   +133   
  Partials          0          0          

Powered by Codecov. Last update 2cc2894...4efaebe

@nalimilan nalimilan force-pushed the nl/broadcast branch 3 times, most recently from dcd8797 to 4d1a1f8 Compare November 25, 2016 19:59
@nalimilan
Copy link
Member Author

nalimilan commented Nov 25, 2016

I've updated the branch to always lift. It now uses a private broadcast_lift function, since for now we don't want people to override the lifting behaviour.

Surprisingly, new implementation appears to be just as fast as the current one. Since it passes the tests on 0.5 and 0.6, I think we can seriously consider merging it. It's not yet clear how to get non-standard lifting to work, but fixing standard cases sounds like the highest priority to me.
EDIT: I got the benchmarks wrong. There's still an allocation problem which kills performance. I'll investigate.

@nalimilan nalimilan changed the title WIP: Rewrite broadcast() based on lift() Rewrite broadcast() based on lift() Nov 25, 2016
@nalimilan
Copy link
Member Author

Really good news: with the latest changes, I get a very similar performance to master, even slightly better, for one- and two-argument cases. Three-argument and beyond is still terribly slow; I can add a special case for 3 if needed, waiting for a more general solution.

Example on a simple benchmark:

using NullableArrays
X = NullableArray(1:100000);
Y = NullableArray(1:100000);
using BenchmarkTools

# With this branch:
julia> @benchmark broadcast!(+, X, Y)
BenchmarkTools.Trial: 
  memory estimate:  176.00 bytes
  allocs estimate:  6
  --------------
  minimum time:     107.072 μs (0.00% GC)
  median time:      108.384 μs (0.00% GC)
  mean time:        114.363 μs (0.00% GC)
  maximum time:     360.646 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

julia> @benchmark broadcast!(+, X, X, Y)
BenchmarkTools.Trial: 
  memory estimate:  224.00 bytes
  allocs estimate:  6
  --------------
  minimum time:     222.965 μs (0.00% GC)
  median time:      228.248 μs (0.00% GC)
  mean time:        246.007 μs (0.00% GC)
  maximum time:     1.621 ms (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

# With master:
julia> @benchmark broadcast!(+, X, Y)
BenchmarkTools.Trial: 
  memory estimate:  224.00 bytes
  allocs estimate:  6
  --------------
  minimum time:     117.858 μs (0.00% GC)
  median time:      151.116 μs (0.00% GC)
  mean time:        197.762 μs (0.00% GC)
  maximum time:     1.576 ms (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

julia> @benchmark broadcast!(+, X, X, Y)
BenchmarkTools.Trial: 
  memory estimate:  272.00 bytes
  allocs estimate:  6
  --------------
  minimum time:     243.590 μs (0.00% GC)
  median time:      251.746 μs (0.00% GC)
  mean time:        265.030 μs (0.00% GC)
  maximum time:     1.100 ms (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

More generally, running perf/broadcast.jl shows a slight advantage to this PR, except in a handful of cases where the difference isn't that large:
https://gist.github.com/nalimilan/57ea0640ffb8fc4a8424b558efe3ff4d

Comments?

@nalimilan nalimilan force-pushed the nl/broadcast branch 2 times, most recently from 4ca67ef to 5f3e5c0 Compare November 26, 2016 21:29
ftype(f, A) = typeof(f)
ftype(f, A...) = typeof(a -> f(a...))
ftype(T::DataType, A) = Type{T}
ftype(T::DataType, A...) = Type{T}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in JuliaLang/julia#19421 might be relevant here. Best!

end
ziptype(A) = Tuple{eltype(A)}
ziptype(A, B) = Zip2{Tuple{eltype(A)}, Tuple{eltype(B)}}
@inline ziptype(A, B, C, D...) = Zip{Tuple{eltype(A)}, ziptype(B, C, D...)}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise here, the changes in JuliaLang/julia#19421 might be relevant. Best!

@nalimilan nalimilan force-pushed the nl/broadcast branch 3 times, most recently from 89336d1 to dc34174 Compare December 4, 2016 16:31
@nalimilan nalimilan changed the title Rewrite broadcast() based on lift() Rewrite broadcast() and map() based on lift() Dec 4, 2016
@nalimilan
Copy link
Member Author

nalimilan commented Dec 4, 2016

I've added the corresponding changes for map. Since the current implementation directly accesses the values and isnull fields, it is about twice faster as the new one. It's a bit annoying, but I'd say we should do it anyway, as we don't really have the manpower to maintain this while improving at the same time the API. With some luck, the compiler will also improve.

Calling the generic map implementation required changing the semantics a bit when an array is empty and no function matches the element type: we used to choose a Union{} eltype, while the generic implementation returns Any. I don't think it matters a lot anyway, but better be consistent.

EDIT: failure on 0.6 is already on master, and fixed by #169.

@@ -1,3 +1,4 @@
julia 0.4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we delete this line?

"""
broadcast!(f, B::NullableArray, As::NullableArray...; lift::Bool=false)
invoke_broadcast!{F, N}(f::F, dest, As::Vararg{NullableArray, N}) =
invoke(broadcast!, Tuple{Function, AbstractArray, Vararg{AbstractArray, N}}, f, dest, As...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should Function be F? Probably won't make a difference in most cases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, I guess it's more correct if e.g. f is a type and there's a specific method for them.

_to_shape(broadcast_indices(Xs...))),
Xs...; lift=lift)
function Base.broadcast!{F, N}(f::F, dest::NullableArray, As::Vararg{NullableArray, N})
# These definitions are needed to avoid allocation due to splatting
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you verified that allocations are avoided for f2 of more than two arguments? Also, did you try @inline decorations for f2 to avoid the splatting penalty?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no allocations on Julia master, though beyond 2 arguments performance is lower. I think I've tried everything I could (given my skills at least), without success (except a generated function, but that's probably overkill for one line). Since there are several issues open about varargs being slow, my hope is that it's going to improve at some point.

if null_safe_op(f, eltype(x))
return @compat Nullable(f(unsafe_get(x)), !isnull(x))
else
U = Core.Inference.return_type(f, Tuple{eltype(x)})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably check if U is leaf, yes? If so, we should decide what to do when U is abstract. Return Union{} as in 16961?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.

if hasnulls(xs...)
return Nullable{U}()
else
return Nullable(f(_unsafe_get(xs...)...))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, why the recursive approach as opposed to map(unsafe_get, xs)... ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my tests it avoided allocations, which isn't the case with map.

# broadcast!(f, Z::NullableArrays.NullableArray, Xs::NullableArrays.NullableArray...)
@inline Base.broadcast!(f, Z::NullableArray; lift=false) =
broadcast!(f, Z, Z; lift=lift)
Call `broadcast` with nullable lifting semantics and return a `NullableArray`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should revise this line in light of the of the behavior defined in line 62.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Actually, since in this PR for now lift isn't supposed to be overloaded (so that lifting is always done), the description is correct. I've removed line 62. I hope we can improve this later in another PR.

"""
map(f, As::NullableArray...)

Call `map` with nullable lifting semantics and return a `NullableArray`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly should revise this description.

using Base: collect_similar, Generator

invoke_map!{F, N}(f::F, dest, As::Vararg{NullableArray, N}) =
invoke(map!, Tuple{Function, AbstractArray, Vararg{AbstractArray, N}}, f, dest, As...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, should we have Function -> F?

@@ -37,8 +36,7 @@ Types declared as safe can benefit from higher performance for operations on nul
always computing the result even for null values, a branch is avoided, which helps
vectorization.
"""
null_safe_op(f::Any, ::Type) = false
null_safe_op(f::Any, ::Type, ::Type) = false
null_safe_op(f::Any, ::Type...) = false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused. Isn't this in Base?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still carry a copy of null_safe_op because it doesn't exist on 0.4, and even on 0.5 it only comes with definitions for isless and isequal. But indeed we should merge them at some point (I'll do that in another PR).

Remove the custom implementation of broadcast(), and just call
the base method on the lifted method. For now, standard lifting
semantics are always used, even for logical operators and isnull/get.
This only covers one- and two-argument cases.
In my benchmarks, this approach is just as fast as manually repeating
isnull(x) | isnull(y)...
This introduces a small behavior change for consistency with the generic
map(): mapping over an empty array returns a NullableArray{Any} when the
function does not exist, rather than a NullableArray{Union{}}.
The 1- and 2-argument versions are slower on 0.5 due to allocations,
but faster (especially the latter) on 0.6.
…d function

Performance is now the same on Julia 0.5 and 0.6, and the 1- and 2-argument
specialization give the same result.
Rebasing mistake probably.
These have recently been simplified. map() in Base does not use
the same path as broadcast(), but doing so here is much simpler
and results should be equivalent for consistency anyway.
@nalimilan
Copy link
Member Author

Will squash and merge tomorrow unless somebody objects or beats me to it.

if VERSION >= v"0.5.0-dev+3294"
@test isequal(Z1, NullableArray(Union{}[])) # no type inference
@test isempty(Z1)
if VERSION >= v"0.6.0-dev"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please be specific with VERSION cutoffs like this, will this really work for all 0.6-dev versions?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, looks like I fixed this in https://github.com/JuliaStats/NullableArrays.jl/pull/177/files. Will remove the if.

(FWIW, at the time I added these conditions, the issue was that inference failed on 0.5.0, and I had no idea which commit(s) fixed it except for bisecting it...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #183.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants