Rewrite broadcast() and map() based on lift() #166

nalimilan · 2016-11-11T17:36:01Z

Remove the custom implementation of broadcast(), and just call
the base method on the lift()ed method. This implements the
blacklist approach: methods with a custom lifting behavior like
isnull() should override lift() to get passed Nullable values
directly; if they do not return a Nullable, the result is a standard
Array rather than a NullableArray.
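
As a rough illustration of the blacklist approach described above, here is a minimal sketch in current Julia. `Nullable` is no longer in Base, so `Opt`, `lift` and `lifted_broadcast` are hypothetical stand-ins defined only for this example; they are not the code in this PR.

```julia
# Hypothetical stand-in for Nullable{T}; illustration only.
struct Opt{T}
    hasvalue::Bool
    value::Union{T, Nothing}
end
Opt(v::T) where {T} = Opt{T}(true, v)          # wrap a present value
Opt{T}() where {T} = Opt{T}(false, nothing)    # a null of element type T

isnull(x::Opt) = !x.hasvalue

# Standard lifting: any null argument gives a null result; otherwise
# unwrap, call f, and rewrap. The null is typed Opt{Any} for brevity.
function lift(f)
    return (xs::Opt...) -> any(isnull, xs) ? Opt{Any}() : Opt(f(map(x -> x.value, xs)...))
end

# Blacklisted function: isnull overrides lift so it receives the Opt itself.
lift(::typeof(isnull)) = isnull

# broadcast() itself stays generic; only the function is transformed.
lifted_broadcast(f, As...) = broadcast(lift(f), As...)

X = [Opt(1), Opt{Int}(), Opt(3)]
R = lifted_broadcast(x -> x + 1, X)   # elements are Opt values, null stays null
B = lifted_broadcast(isnull, X)       # elements are plain Bools, not Opt values
```

Because the `isnull` override does not return an `Opt`, the second result holds plain `Bool`s, which mirrors the "standard Array rather than a NullableArray" behavior.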

Performance will probably be worse than before, but at least the
semantics will be correct. We can always re-implement the custom
and faster versions later when broadcast() has stabilized in Base
and Nullable support has settled.

This is WIP because I haven't thought much about this yet, because it copies lift from JuliaLang/julia#18758 without much thought either, ~~and because it only works on 0.6 at the moment~~.

Cf. discussions at #144 and #163.

nalimilan · 2016-11-13T14:18:30Z

test/broadcast.jl

+    X[1] = Nullable()
+    A[1]= 0
+    # FIXME: element-wise syntax should give the same result
+    # R = get.(X, 0)


Cf. JuliaLang/julia#19313.

nalimilan · 2016-11-13T14:19:26Z

@davidagold Where do you think lift should live? In NullableOps?

davidagold · 2016-11-13T20:13:11Z

Preferably NullableArrays. Or am I wrong for thinking 'lift' and lifted ops
are sufficiently different strategies that they ought to live in different
packages?


nalimilan · 2016-11-13T21:10:07Z

No, that's fine with me, I just wanted to be sure that was OK for you.

nalimilan · 2016-11-14T09:04:05Z

JuliaLang/julia#19313 is kind of annoying: in some cases, the f.(x) syntax doesn't work for two-argument functions.

There's another related issue which is even more annoying: loop fusion prevents lift from detecting special (blacklisted) functions. So e.g. identity.(isnull.(x)) returns null instead of true for missing entries.
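
The fusion problem can be reproduced in a small sketch. Everything here is an assumption made for illustration: `Opt` stands in for `Nullable`, `lift` dispatches on the function's type, and fusion is emulated by hand with a single anonymous function.

```julia
# Hypothetical stand-in for Nullable{T}; illustration only.
struct Opt{T}
    hasvalue::Bool
    value::Union{T, Nothing}
end
Opt(v::T) where {T} = Opt{T}(true, v)
Opt{T}() where {T} = Opt{T}(false, nothing)

isnull(x::Opt) = !x.hasvalue
isnull(x) = false                 # plain values are never null
unwrap(x::Opt) = x.value
unwrap(x) = x

# Standard lifting, plus a blacklist entry for isnull.
lift(f) = (xs...) -> any(isnull, xs) ? Opt{Any}() : Opt(f(map(unwrap, xs)...))
lift(::typeof(isnull)) = isnull

X = [Opt(1), Opt{Int}()]

# Unfused: each call is lifted separately, so isnull is recognized.
step1 = broadcast(lift(isnull), X)            # plain Bools
unfused = broadcast(lift(identity), step1)    # non-null Opt{Bool} values

# Fused: broadcast only sees one anonymous function, so dispatch on
# typeof(isnull) never happens and standard lifting is applied instead.
fused_f = x -> identity(isnull(x))
fused = broadcast(lift(fused_f), X)
```

In the unfused path the null entry becomes a non-null Boolean; in the fused path it comes back null, because `lift` never gets to see `isnull` on its own.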

These two issues mostly affect corner cases, so this change is probably worth it anyway, but in the longer term I'm not sure what to do. The f.(x) syntax could be improved in Julia to be more flexible, i.e. calling a method with the function and input types as arguments, which could return a transformed function (lifted, in our case). Then loop fusion would combine these functions instead of the original ones. But I don't see this being accepted in Base without a strong use case, i.e. more or less official support for the lifting framework; we may as well merge JuliaLang/julia#18758 at the same time.

I'm starting to think (again) that all function calls should lift by default; that would be a much more consistent option.

johnmyleswhite · 2016-11-14T17:26:33Z

I'm not sure we should move forward with this just yet. It might be the right decision (and I'm inclined to think it is), but it might be easier to start with a less ambitious approach in which people have to explicitly call special functions like isnull. People love to complain about the burdens of those kinds of things, but I worry that this approach will be too magical and will actually prevent people from learning how to handle non-standard semantics for lifting.

nalimilan · 2016-11-14T17:34:23Z

So that means lifting by default, without any blacklist?

bramtayl · 2016-11-15T19:18:45Z

This kind of thing could be integrated pretty easily in ChainMap. Once a lift function is available, I could make

@chain begin
    x
    @over is.null
    @lift_over identity
end

go to

lift(broadcast, identity, broadcast(is.null, x) )

Edit: works now on master

codecov-io · 2016-11-20T16:34:41Z

Current coverage is 60.18% (diff: 85.00%)

Merging #166 into master will decrease coverage by 24.67%

@@             master       #166   diff @@
==========================================
  Files            14         13     -1   
  Lines           865        663   -202   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
- Hits            734        399   -335   
- Misses          131        264   +133   
  Partials          0          0

Powered by Codecov. Last update 2cc2894...4efaebe

nalimilan · 2016-11-25T20:04:38Z

I've updated the branch to always lift. It now uses a private broadcast_lift function, since for now we don't want people to override the lifting behaviour.

Surprisingly, the new implementation appears to be just as fast as the current one. Since it passes the tests on 0.5 and 0.6, I think we can seriously consider merging it. It's not yet clear how to get non-standard lifting to work, but fixing standard cases sounds like the highest priority to me.
EDIT: I got the benchmarks wrong. There's still an allocation problem which kills performance. I'll investigate.

nalimilan · 2016-11-26T18:00:15Z

Really good news: with the latest changes, performance is very similar to master, even slightly better, for one- and two-argument cases. Three arguments and beyond are still terribly slow; I can add a special case for 3 if needed, pending a more general solution.

Example on a simple benchmark:

using NullableArrays
X = NullableArray(1:100000);
Y = NullableArray(1:100000);
using BenchmarkTools

# With this branch:
julia> @benchmark broadcast!(+, X, Y)
BenchmarkTools.Trial: 
  memory estimate:  176.00 bytes
  allocs estimate:  6
  --------------
  minimum time:     107.072 μs (0.00% GC)
  median time:      108.384 μs (0.00% GC)
  mean time:        114.363 μs (0.00% GC)
  maximum time:     360.646 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

julia> @benchmark broadcast!(+, X, X, Y)
BenchmarkTools.Trial: 
  memory estimate:  224.00 bytes
  allocs estimate:  6
  --------------
  minimum time:     222.965 μs (0.00% GC)
  median time:      228.248 μs (0.00% GC)
  mean time:        246.007 μs (0.00% GC)
  maximum time:     1.621 ms (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

# With master:
julia> @benchmark broadcast!(+, X, Y)
BenchmarkTools.Trial: 
  memory estimate:  224.00 bytes
  allocs estimate:  6
  --------------
  minimum time:     117.858 μs (0.00% GC)
  median time:      151.116 μs (0.00% GC)
  mean time:        197.762 μs (0.00% GC)
  maximum time:     1.576 ms (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

julia> @benchmark broadcast!(+, X, X, Y)
BenchmarkTools.Trial: 
  memory estimate:  272.00 bytes
  allocs estimate:  6
  --------------
  minimum time:     243.590 μs (0.00% GC)
  median time:      251.746 μs (0.00% GC)
  mean time:        265.030 μs (0.00% GC)
  maximum time:     1.100 ms (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

More generally, running perf/broadcast.jl shows a slight advantage to this PR, except in a handful of cases where the difference isn't that large:
https://gist.github.com/nalimilan/57ea0640ffb8fc4a8424b558efe3ff4d

Comments?

Sacha0 · 2016-11-27T18:07:04Z

src/broadcast.jl

+    ftype(f, A) = typeof(f)
+    ftype(f, A...) = typeof(a -> f(a...))
+    ftype(T::DataType, A) = Type{T}
+    ftype(T::DataType, A...) = Type{T}


The changes in JuliaLang/julia#19421 might be relevant here. Best!

Sacha0 · 2016-11-27T18:08:32Z

src/broadcast.jl

    end
+    ziptype(A) = Tuple{eltype(A)}
+    ziptype(A, B) = Zip2{Tuple{eltype(A)}, Tuple{eltype(B)}}
+    @inline ziptype(A, B, C, D...) = Zip{Tuple{eltype(A)}, ziptype(B, C, D...)}


Likewise here, the changes in JuliaLang/julia#19421 might be relevant. Best!

nalimilan · 2016-12-04T16:35:02Z

I've added the corresponding changes for map. Since the current implementation directly accesses the values and isnull fields, it is about twice as fast as the new one. It's a bit annoying, but I'd say we should do it anyway, as we don't really have the manpower to maintain this while also improving the API. With some luck, the compiler will also improve.

Calling the generic map implementation required changing the semantics a bit when an array is empty and no function matches the element type: we used to choose a Union{} eltype, while the generic implementation returns Any. I don't think it matters a lot anyway, but it's better to be consistent.

EDIT: failure on 0.6 is already on master, and fixed by #169.

davidagold · 2016-12-04T20:27:23Z

REQUIRE

@@ -1,3 +1,4 @@
 julia 0.4


Should we delete this line?

davidagold · 2016-12-04T20:31:13Z

src/broadcast.jl

-    """
-        broadcast!(f, B::NullableArray, As::NullableArray...; lift::Bool=false)
+invoke_broadcast!{F, N}(f::F, dest, As::Vararg{NullableArray, N}) =
+    invoke(broadcast!, Tuple{Function, AbstractArray, Vararg{AbstractArray, N}}, f, dest, As...)


Should Function be F? Probably won't make a difference in most cases.

Probably, I guess it's more correct if e.g. f is a type and there's a specific method for them.
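
A small sketch of why the signature passed to `invoke` matters, written for current Julia. The names `Adder`, `apply_all`, `forward_hardcoded` and `forward_param` are hypothetical; a callable struct is used here as the non-`Function` callable, where the comment above mentions types.

```julia
# A callable object that is not a subtype of Function.
struct Adder
    n::Int
end
(a::Adder)(x) = x + a.n

# A generic method and a more specific one, standing in for the
# AbstractArray and NullableArray methods.
apply_all(f, xs::AbstractArray) = "generic"
apply_all(f, xs::Vector{Int}) = "specialized"

# Hard-coding Function in the invoke signature only accepts Function subtypes.
forward_hardcoded(f, xs) = invoke(apply_all, Tuple{Function, AbstractArray}, f, xs)

# Using the method's own type parameter F accepts any callable.
forward_param(f::F, xs) where {F} = invoke(apply_all, Tuple{F, AbstractArray}, f, xs)

via_function = forward_hardcoded(sin, [1, 2])     # fine: sin is a Function
via_param = forward_param(Adder(1), [1, 2])       # fine: F is Adder

# With a non-Function callable, the hard-coded signature is rejected by invoke.
hardcoded_failed = try
    forward_hardcoded(Adder(1), [1, 2])
    false
catch
    true
end
```

Both forwarding helpers reach the generic method even though a more specific one exists for `Vector{Int}`, which is the purpose of `invoke` here.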

davidagold · 2016-12-04T20:33:36Z

src/broadcast.jl

-                                       _to_shape(broadcast_indices(Xs...))),
-                      Xs...; lift=lift)
+function Base.broadcast!{F, N}(f::F, dest::NullableArray, As::Vararg{NullableArray, N})
+    # These definitions are needed to avoid allocation due to splatting


Have you verified that allocations are avoided for f2 of more than two arguments? Also, did you try @inline decorations for f2 to avoid the splatting penalty?

There are no allocations on Julia master, though beyond 2 arguments performance is lower. I think I've tried everything I could (given my skills at least), without success (except a generated function, but that's probably overkill for one line). Since there are several issues open about varargs being slow, my hope is that it's going to improve at some point.

davidagold · 2016-12-04T20:36:22Z

src/lift.jl

+    if null_safe_op(f, eltype(x))
+        return @compat Nullable(f(unsafe_get(x)), !isnull(x))
+    else
+        U = Core.Inference.return_type(f, Tuple{eltype(x)})


We should probably check if U is leaf, yes? If so, we should decide what to do when U is abstract. Return Union{} as in 16961?

Makes sense.
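
One possible shape for that check, sketched in current Julia. This is an assumption rather than what the PR does: `Base.promote_op` stands in for `Core.Inference.return_type`, `isconcretetype` is today's name for the leaf-type test, and `null_eltype` and `unstable` are hypothetical names.

```julia
# A type-unstable function: inference can only give a Union for Int input.
unstable(x) = x > 0 ? 1 : 1.0

# Choose the element type of a null result: keep the inferred type when it
# is concrete, otherwise fall back to Union{} as suggested above.
function null_eltype(f, T::Type)
    U = Base.promote_op(f, T)   # inference-based, so results may vary across Julia versions
    return isconcretetype(U) ? U : Union{}
end

stable_case = null_eltype(abs, Int)
unstable_case = null_eltype(unstable, Int)
```

The fallback only changes the type parameter of null results; non-null results still carry the type of the value actually computed.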

davidagold · 2016-12-04T20:40:19Z

src/lift.jl

+        if hasnulls(xs...)
+            return Nullable{U}()
+        else
+            return Nullable(f(_unsafe_get(xs...)...))


Out of curiosity, why the recursive approach as opposed to map(unsafe_get, xs)... ?

In my tests it avoided allocations, which isn't the case with map.
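
The recursive pattern can be sketched as follows. The helper names `hasnulls` and `_unsafe_get` come from the diff above, but their bodies here are guesses at their shape, and `Opt` is a hypothetical stand-in for `Nullable`. Whether this still beats `map` on a given Julia version is something to measure, not assume.

```julia
# Hypothetical stand-in for Nullable{T}; illustration only.
struct Opt{T}
    hasvalue::Bool
    value::Union{T, Nothing}
end
Opt(v::T) where {T} = Opt{T}(true, v)
Opt{T}() where {T} = Opt{T}(false, nothing)

# Peel off one argument per call; the empty method ends the recursion,
# so the compiler can unroll the whole chain for a fixed argument count.
_unsafe_get() = ()
_unsafe_get(x, rest...) = (x.value, _unsafe_get(rest...)...)

# Same pattern for the null check, combining flags with non-short-circuit |.
hasnulls() = false
hasnulls(x, rest...) = (!x.hasvalue) | hasnulls(rest...)

function lift(f, xs::Opt...)
    hasnulls(xs...) && return Opt{Any}()   # the PR uses an inferred type U here
    return Opt(f(_unsafe_get(xs...)...))
end

total = lift(+, Opt(1), Opt(2), Opt(3))
missing_case = lift(+, Opt(1), Opt{Int}())
```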

davidagold · 2016-12-04T20:42:34Z

src/broadcast.jl

-    #   broadcast!(f, Z::NullableArrays.NullableArray, Xs::NullableArrays.NullableArray...)
-    @inline Base.broadcast!(f, Z::NullableArray; lift=false) =
-        broadcast!(f, Z, Z; lift=lift)
+Call `broadcast` with nullable lifting semantics and return a `NullableArray`.


We should revise this line in light of the behavior defined in line 62.

Good catch. Actually, since in this PR for now lift isn't supposed to be overloaded (so that lifting is always done), the description is correct. I've removed line 62. I hope we can improve this later in another PR.

davidagold · 2016-12-04T20:43:10Z

src/map.jl

+"""
+    map(f, As::NullableArray...)
+
+Call `map` with nullable lifting semantics and return a `NullableArray`.


Similarly should revise this description.

davidagold · 2016-12-04T20:44:10Z

src/map.jl

+using Base: collect_similar, Generator
+
+invoke_map!{F, N}(f::F, dest, As::Vararg{NullableArray, N}) =
+    invoke(map!, Tuple{Function, AbstractArray, Vararg{AbstractArray, N}}, f, dest, As...)


Similarly, should we have Function -> F?

davidagold · 2016-12-04T20:47:14Z

src/operators.jl

@@ -37,8 +36,7 @@ Types declared as safe can benefit from higher performance for operations on nul
 always computing the result even for null values, a branch is avoided, which helps
 vectorization.
 """
-null_safe_op(f::Any, ::Type) = false
-null_safe_op(f::Any, ::Type, ::Type) = false
+null_safe_op(f::Any, ::Type...) = false


I'm confused. Isn't this in Base?

We still carry a copy of null_safe_op because it doesn't exist on 0.4, and even on 0.5 it only comes with definitions for isless and isequal. But indeed we should merge them at some point (I'll do that in another PR).
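
To show what `null_safe_op` buys, here is a minimal sketch in current Julia. `Opt` and `lift2` are hypothetical stand-ins, the whitelist entry for `+` on `Int` is an assumption made for the example, and the field order of `Opt` differs from `Nullable`.

```julia
# Hypothetical stand-in for Nullable{T}; a null leaves `value` unset,
# which is harmless to read for bits types such as Int.
struct Opt{T}
    hasvalue::Bool
    value::T
    Opt{T}() where {T} = new{T}(false)
    Opt{T}(hasvalue::Bool, v) where {T} = new{T}(hasvalue, v)
end
Opt(hasvalue::Bool, v::T) where {T} = Opt{T}(hasvalue, v)
Opt(v::T) where {T} = Opt{T}(true, v)

# Conservative default, as in the diff above, plus one assumed safe case.
null_safe_op(f, ::Type...) = false
null_safe_op(::typeof(+), ::Type{Int}, ::Type{Int}) = true

function lift2(f, x::Opt{S}, y::Opt{T}) where {S, T}
    if null_safe_op(f, S, T)
        # Branch-free: always compute, then combine the null flags.
        return Opt(x.hasvalue & y.hasvalue, f(x.value, y.value))
    elseif x.hasvalue & y.hasvalue
        return Opt(f(x.value, y.value))
    else
        # Not safe to call f on an unset value (div could hit a zero).
        return Opt{Base.promote_op(f, S, T)}()
    end
end

a = Opt(1)
n = Opt{Int}()
safe_sum = lift2(+, a, a)
safe_null = lift2(+, a, n)
checked_div = lift2(div, Opt(7), Opt(2))
checked_null = lift2(div, a, n)
```

The safe path never inspects the flags before calling `f`, which is the branch avoidance the docstring refers to.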


nalimilan · 2017-01-22T16:31:13Z

Will squash and merge tomorrow unless somebody objects or beats me to it.

tkelman · 2017-02-15T21:12:05Z

test/map.jl

-    if VERSION >= v"0.5.0-dev+3294"
-        @test isequal(Z1, NullableArray(Union{}[])) # no type inference
+    @test isempty(Z1)
+    if VERSION >= v"0.6.0-dev"


please be specific with VERSION cutoffs like this, will this really work for all 0.6-dev versions?

Actually, looks like I fixed this in https://github.com/JuliaStats/NullableArrays.jl/pull/177/files. Will remove the if.

(FWIW, at the time I added these conditions, the issue was that inference failed on 0.5.0, and I had no idea which commit(s) fixed it except for bisecting it...)

nalimilan force-pushed the nl/broadcast branch from 3432db6 to 5b8a1bd Compare November 13, 2016 14:05

nalimilan mentioned this pull request Nov 13, 2016

Element-wise call syntax does not always lower to standard broadcast() JuliaLang/julia#19313

Closed

nalimilan commented Nov 13, 2016


nalimilan force-pushed the nl/broadcast branch 2 times, most recently from 5f7c6a1 to 5de334f Compare November 20, 2016 16:34

nalimilan force-pushed the nl/broadcast branch 3 times, most recently from dcd8797 to 4d1a1f8 Compare November 25, 2016 19:59

nalimilan changed the title ~~WIP: Rewrite broadcast() based on lift()~~ Rewrite broadcast() based on lift() Nov 25, 2016

nalimilan force-pushed the nl/broadcast branch 2 times, most recently from 4ca67ef to 5f3e5c0 Compare November 26, 2016 21:29

Sacha0 reviewed Nov 27, 2016


nalimilan force-pushed the nl/broadcast branch 3 times, most recently from 89336d1 to dc34174 Compare December 4, 2016 16:31

nalimilan changed the title ~~Rewrite broadcast() based on lift()~~ Rewrite broadcast() and map() based on lift() Dec 4, 2016

davidagold reviewed Dec 4, 2016


nalimilan force-pushed the nl/broadcast branch from a2e873a to d5d1aad Compare December 5, 2016 09:34

nalimilan added 19 commits January 19, 2017 17:35

Rewrite broadcast() using Base implementation and lifting semantics

4233744

Remove the custom implementation of broadcast(), and just call the base method on the lifted method. For now, standard lifting semantics are always used, even for logical operators and isnull/get.

Use generated function to avoid allocation

899ed75

This only covers one- and two-argument cases.

Fix performance without generated functions

23ec9f2

Improve efficiency of null checking for generic case

2953516

In my benchmarks, this approach is just as fast as manually repeating isnull(x) | isnull(y)...

Fix call to null_safe_op() for generic case

82d9fde

Fix remaining allocations in generic method

cc787d9

Add non-splatting methods up to 7 arguments

0d649dd

Also port map() to lift()

6518c47

This introduces a small behavior change for consistency with the generic map(): mapping over an empty array returns a NullableArray{Any} when the function does not exist, rather than a NullableArray{Union{}}.

Fix broadcast() tests

a69bd5f

Small naming improvement

a1926c7

Review comments

0eab4dc

Use generated function to avoid allocations on Julia 0.5

0a00622

The 1- and 2-argument versions are slower on 0.5 due to allocations, but faster (especially the latter) on 0.6.

Simplify code and improve performance on Julia 0.5, use only generated function

615d084

Performance is now the same on Julia 0.5 and 0.6, and the 1- and 2-argument specializations give the same result.

Remove Julia 0.4 lines

75edfcb

Rebasing mistake probably.

Update eltype computation code to match Base

2d8670d

These have recently been simplified. map() in Base does not use the same path as broadcast(), but doing so here is much simpler and results should be equivalent for consistency anyway.

Fix deprecation warnings

5f10b07

Fix 0.6 deprecations

83ea4cf

Require Compat 0.13.0

3f17571

Fix computing eltype

5a88dab

nalimilan force-pushed the nl/broadcast branch from 4efaebe to 5a88dab Compare January 19, 2017 16:36

nalimilan mentioned this pull request Jan 22, 2017

min/max on differences of DataFrame columns in Julia JuliaData/DataFrames.jl#1150

Closed

nalimilan mentioned this pull request Jan 23, 2017

[RFC] Fix [map]reduce with no non-null and single non-null element #158

Open

nalimilan merged commit 801f529 into master Jan 23, 2017

nalimilan deleted the nl/broadcast branch January 23, 2017 21:35

piever mentioned this pull request Jan 24, 2017

broadcast with arguments of different type #174

Closed

nalimilan mentioned this pull request Feb 9, 2017

isnull.(x) #180

Open

tkelman reviewed Feb 15, 2017


nalimilan mentioned this pull request Feb 18, 2017

Add special lifting semantics for isnull() with map() and broadcast() #185

Open

nalimilan mentioned this pull request Feb 25, 2017

Wrong return type for .== #142

Closed
