
Operations on SparseMatrices depend heavily on inference #59

Open
martinholters opened this issue Dec 14, 2016 · 15 comments

Comments

@martinholters
Contributor

In JuliaLang/julia#19518, broadcast for sparse matrices underwent a major revision, overall a move in the right direction, IMHO. But it also resulted in a regression (in my eyes) if type inference fails for the operation/element type combination. This issue is meant to continue the discussion started in JuliaLang/julia#19518 (comment).

There are three aspects here that make the problem more acute than for ordinary Arrays.

  1. For sparse matrices, even basic operations like + and - just invoke broadcast internally. OTOH, for Arrays, these do some special tricks for the result type, which helps for the empty case. However, getting the type right for the empty case may not be important enough to warrant replicating such special-casing.
  2. However, broadcast for sparse matrices relies on inference to determine the output element type not only in the empty (as in all-dimensions-zero) case, but also whenever no entries need to be stored (everything zero), making this case much more likely to occur.
  3. Once inference fails for an all-zero (or truly empty) case, the element type becomes Any, which precludes all further operations, as zero(::Any) is undefined.

E.g.:

julia> x = spzeros(Real, 2, 2)
2×2 sparse matrix with 0 Real nonzero entries

julia> x + x
2×2 sparse matrix with 0 Any nonzero entries

julia> x + x + x
ERROR: MethodError: no method matching zero(::Type{Any})

Edit: Similarly also

julia> speye(Real, 3) * 1 * 1
ERROR: MethodError: no method matching zero(::Type{Any})

The immediate fix I have in mind (though I find the sparse broadcast machinery too intimidating at first glance to try to implement this properly) is the following: If the result of broadcast(op, a, b) would have element type Any (i.e. inference failed) and nzval is empty, use typeof(op(zero(eltype(a)), zero(eltype(b)))) as element type instead. (Likewise for the one- and more-argument cases, of course.) For the example above, this would give an element type of Int, which would be much more useful than Any.
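A minimal sketch of that fallback (the helper name `refine_empty_eltype` is invented for illustration; this is not meant as the actual implementation):

```julia
# Hypothetical post-processing step: if inference produced `Any` for a result
# with no stored entries, recompute the element type from op applied to zeros.
function refine_empty_eltype(op, a::SparseMatrixCSC, b::SparseMatrixCSC, res::SparseMatrixCSC)
    if eltype(res) === Any && isempty(res.nzval)
        T = typeof(op(zero(eltype(a)), zero(eltype(b))))  # e.g. Int for op = + and eltype Real
        return convert(SparseMatrixCSC{T,eltype(res.colptr)}, res)
    end
    return res
end
```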

@martinholters
Contributor Author

Ah, no, point 2 does not hold, things are even worse:

julia> x = sparse(eye(Real,3,3))
3×3 sparse matrix with 3 Real nonzero entries:
  [1, 1]  =  1
  [2, 2]  =  1
  [3, 3]  =  1

julia> x + x
ERROR: MethodError: no method matching zero(::Type{Any})

Edit:
Oh, I see, that is in part fixed by JuliaLang/julia#19589.

But even on that branch:

julia> x + x
3×3 sparse matrix with 3 Any nonzero entries:
  [1, 1]  =  2
  [2, 2]  =  2
  [3, 3]  =  2

Element type is Any, while broadcast on Array gives the much more useful element type of Int:

julia> broadcast(+, eye(Real,3,3), eye(Real,3,3))
3×3 Array{Int64,2}:
 2  0  0
 0  2  0
 0  0  2

We really should get that for sparse matrices, too.

@martinholters
Contributor Author

I have added

--- a/base/sparse/sparsematrix.jl
+++ b/base/sparse/sparsematrix.jl
@@ -1434,7 +1434,15 @@ function broadcast!{Tf,N}(f::Tf, C::SparseMatrixCSC, A::SparseMatrixCSC, Bs::Var
                         _broadcast_notzeropres!(f, fofzeros, C, A, Bs...)
 end
 function broadcast{Tf,N}(f::Tf, A::SparseMatrixCSC, Bs::Vararg{SparseMatrixCSC,N})
-    _aresameshape(A, Bs...) && return map(f, A, Bs...) # could avoid a second dims check in map
+    if _aresameshape(A, Bs...)
+        res = map(f, A, Bs...) # could avoid a second dims check in map
+        if !isleaftype(eltype(res))
+            t = mapreduce(typeof, typejoin, typeof(f(zero(eltype(A)), zero.(eltype.(Bs))...)), res.nzval)
+            return convert(SparseMatrixCSC{t,indtype(res)}, res)
+        else
+            return res
+        end
+    end
     fofzeros = f(_zeros_eltypes(A, Bs...)...)
     fpreszeros = !_broadcast_isnonzero(fofzeros)
     indextypeC = _promote_indtype(A, Bs...)

on top of JuliaLang/julia#19589 and that has fixed my most pressing troubles. I guess we would want this behavior

  • in all cases, not just when delegating to map
  • implemented in a more efficient manner, if possible. (Similar to how it is done for Array, perhaps?)

@Sacha0 do you agree and if so, could you tackle that?

@oscardssmith

What would happen if we made 0 (as in the Int) the result of zero(::Any)? This would not be type stable, but it only comes up in code that already isn't, so I don't think we lose anything there. It also has the advantage that it should lead to relatively sane behavior.
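For concreteness, the suggestion amounts to a definition like the following sketch (shown only to illustrate the idea, not as a vetted addition to Base):

```julia
# Sketch: make the Int zero the fallback when the element type is unknown.
# Deliberately type-unstable, but only reachable from code that already is.
Base.zero(::Type{Any}) = 0
```

With that in place, the `x + x + x` example above would presumably proceed (with `Int` zeros) instead of throwing a `MethodError`.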

@Sacha0
Member

Sacha0 commented Dec 14, 2016

do you agree and if so, could you tackle that?

Before expressing an opinion I would like to think this issue through carefully (and possibly play with an implementation). Completing the overall functionality (extending the present work to sparse vectors, combinations of sparse vectors / matrices and scalars, et cetera) seems higher priority and likely to consume all time I have prior to feature freeze. Could we revisit this topic in a few weeks?

In the interim, if moving from promote_op to _promote_op(f, typestuple(As...)) in map/broadcast over one or two input sparse matrices is causing substantial breakage, as a stopgap we could return to calling promote_op in those cases.

Thoughts? Thanks and best!

@martinholters
Contributor Author

Completing the overall functionality (extending the present work to sparse vectors, combinations of sparse vectors / matrices and scalars, et cetera) seems higher priority and likely to consume all time I have prior to feature freeze. Could we revisit this topic in a few weeks?

Counterproposal: We try to fix any mistakes first before repeating them for the other cases.

Having thought about this some more, I am pretty sure that the sparse broadcasting code should never depend on type inference. For the Array code, inference is used as a last resort (and maybe as a performance optimization?), namely when broadcasting over empty inputs, where no actual output elements have been computed whose types could be used. For sparse matrices, we compute f(zero(...), ...) anyway, so we can use the type of that for the empty/no-stored-elements case, and the typejoin of all computed output values (including f(zero(...), ...)) otherwise, similar to how it is done for Arrays.
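A sketch of that rule (helper name invented; for brevity it assumes equal shapes and identical sparsity patterns, which a real implementation would of course not require):

```julia
# Hypothetical: derive the output eltype from actually computed values, seeded
# with f applied to zeros so the no-stored-entries case is covered as well.
function sparse_result_eltype(f, As::SparseMatrixCSC...)
    fofzeros = f(map(A -> zero(eltype(A)), As)...)  # value for non-stored entries
    stored = map(f, map(A -> A.nzval, As)...)       # values for the stored entries
    return mapreduce(typeof, typejoin, typeof(fofzeros), stored)
end
```

No call into inference is needed: even with no stored entries, `typeof(fofzeros)` provides a usable element type.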

@martinholters
Contributor Author

Marking as regression and adding milestone, as this means arithmetic on sparse matrices becomes broken whenever inference hiccups.

@martinholters martinholters changed the title Make SparseMatrices less dependent on inference Operations on SparseMatrices depend heavily on inference Dec 16, 2016
martinholters referenced this issue in JuliaLang/julia Dec 16, 2016
Instead of determining the output element type beforehand by querying
inference, the element type is deduced from the actually computed output
values (similar to broadcast over Array, but taking into account the
output for the all-inputs-zero case). For the type-unstable case,
performance is sub-optimal, but at least it gives the correct result.

Closes #19595.
@Sacha0
Member

Sacha0 commented Dec 19, 2016

Thoughts:

The result eltype of sparse broadcast should match that of dense broadcast. Dense broadcast determines result eltype as follows: It retrieves the inferred eltype. If (1) the inferred eltype is concrete, it uses the inferred eltype. If (2) the result is empty, it uses the inferred eltype. If (3) the inferred eltype is abstract and the result is nonempty, it computes and uses the typejoin of the types of the result's values.
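In rough pseudocode, that decision looks like the following sketch (using 0.5-era names such as `isleaftype` and the internal inference query; not the literal Base code):

```julia
# Sketch of the dense-broadcast eltype rule described above.
function dense_broadcast_eltype(f, As...)
    T = Core.Inference.return_type(f, Tuple{map(eltype, As)...})  # ask inference
    isleaftype(T) && return T                 # (1) concrete inferred eltype: use it
    vals = map(f, As...)
    isempty(vals) && return T                 # (2) empty result: fall back to inference
    return mapreduce(typeof, typejoin, vals)  # (3) typejoin of the computed values' types
end
```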

Note that a sparse vector/matrix resulting from broadcast that stores no entries is not generally empty: It contains some combination of zeros. In that situation sparse broadcast's return eltype should also match that of dense broadcast, using the inferred eltype if the inferred eltype is concrete or, if the inferred eltype is abstract and the result is nonempty, the typejoin of the types of the result's values. Similarly note that a truly empty sparse vector/matrix resulting from broadcast does not contain zeros of any kind, in which situation consistency with dense broadcast's result eltype (the inferred eltype) seems the most important concern.

I have a plan for realizing this behavior fully. At present sparse broadcast's behavior matches dense broadcast's behavior in cases (1) and (2) of the three above, covering most use. Though I hope to match case (3) in the near-term (next few weeks), prioritizing overall functionality (involving feature introduction) over matching case (3) seems prudent thirteen days out from feature freeze.

Marking as regression and adding milestone, as this means arithmetic on sparse matrices becomes broken whenever inference hiccups.

Kindly remember that as of a few weeks ago:

  • generic sparse broadcast over a single matrix did not exist, only a few specializations for common operations (which, having no promotion mechanism, often got the result type incorrect even in simple cases, e.g. JuliaLang/julia#18974);
  • sparse broadcast over a pair of matrices existed, but was not generic (silently returning incorrect results as e.g. in JuliaLang/julia#19110), and its "promotion mechanism" was promote_type on the input eltypes (often incorrect, though 'accidentally' correct in some of the cases mentioned above);
  • sparse broadcast did not exist for more than two input matrices.

With that in mind, please consider that characterizing generic sparse broadcast's present promotion behavior as a regression seems a bit of a stretch 😉. In contrast, suboptimal, inconsistent, or incomplete all seem like reasonable characterizations.

As noted above, promotion cases (1) and (2) cover most use, and case (3) can be matched after feature freeze. On the other hand, extending generic sparse broadcast to sparse vectors and combinations involving scalars needs to be complete by feature freeze. As such, prioritizing the latter seems prudent at this time; please bear with me for the short while that requires.

Best!

@martinholters
Contributor Author

truly empty sparse vector/matrix resulting from broadcast does not contain zeros of any kind, in which situation consistency with dense broadcast's result eltype (the inferred eltype) seems the most important concern.

Well, type inference is used in the empty case for dense broadcast primarily due to the lack of alternatives. Whether that should be replicated for the sparse case might be open for debate, as here, using zero to obtain a pseudo-element might be ok, while it is definitely not ok for dense arrays. So I'm not sure this is the "most important concern".

With that in mind, please consider that characterizing generic sparse broadcast's present promotion behavior as a regression seems a bit of a stretch

Oh, your work on sparse broadcast is certainly a huge step forward, no doubt. I'm not characterizing it as a regression. I'm merely saying it led to a (fairly specific) regression: Something that ought to work and worked before stopped working. Unfortunately, this regression completely broke ACME, which is why I'm personally very interested in seeing it fixed. I think I have found a workaround, though, so I'm a bit more relaxed now. 😄

On a related note:

julia> a=Real[0];

julia> a+a
1-element Array{Real,1}:
 0

julia> broadcast(+, a, a)
1-element Array{Int64,1}:
 0

because + uses the (hacky?) promote_op while broadcast(+, ...) does not. Would you say we should make dense and sparse behave the same? If so, maybe it's a good opportunity to rethink the dense behavior?
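The two rules side by side (a hedged illustration; the indicated results follow from the examples above rather than from freshly verified output):

```julia
# What `+` consults: promote_op on the input eltypes keeps the abstract type.
Base.promote_op(+, Real, Real)  # == Real  → Real[0] + Real[0] isa Vector{Real}

# What broadcast effectively records: the types of the computed values.
typeof(0 + 0)                   # == Int64 → broadcast(+, Real[0], Real[0]) isa Vector{Int64}
```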

martinholters referenced this issue in HSU-ANT/ACME.jl Dec 19, 2016
To avoid hitting JuliaLang/julia#19595, + and - are overloaded for the
problematic case in a rather ad-hoc fashion.
@Sacha0
Member

Sacha0 commented Dec 19, 2016

I think I have found a workaround

Cheers, glad to hear :).

On a related note:

I agree wholeheartedly that the (Real[0] + Real[0])::Vector{Real} versus broadcast(+, Real[0], Real[0])::Vector{Int} discrepancy is disconcerting, particularly given that after JuliaLang/julia#17623 Real[0] + Real[0] and Real[0] .+ Real[0] will bear different eltypes.

With the disclaimer that I haven't fully thought this issue through yet, like @pabloferz I err on the side of Real[0] + Real[0] yielding a Vector{Int} (consistent with broadcast): Generally applying promote_op's present behavior seems suboptimal (e.g. having round.(Int, Real[0]) return Vector{Real} rather than Vector{Int}), and what a better container-abstract-eltype preservation rule would say is not clear to me. In contrast, broadcast's eltype algorithm is simple, clear, and mostly works well (notwithstanding the troublesome empty case). On the other hand, (Real[0] + Real[0])::Vector{Real} certainly has its appeal.

Thoughts? Thanks and best!

@martinholters
Contributor Author

To me, changing the behavior of + for dense matrices to be in line with broadcast(+, ...) (i.e. returning Vector{Int} for the example above) looks like the only option that has the potential of being consistent and maintainable. Maybe we should open a dedicated issue for further discussion?

@Sacha0
Member

Sacha0 commented Dec 20, 2016

Maybe we should open a dedicated issue for further discussion?

Sounds good! Best!

@martinholters
Contributor Author

Issue is at JuliaLang/julia#19669.

@tkelman

tkelman commented Dec 22, 2016

While it would be ideal not to change return types of these operations post-feature-freeze, if the current behavior is considered buggy/incomplete then this can hopefully be worked on after feature freeze but prior to/during the RCs. Probably not absolutely release blocking?

mbauman referenced this issue in JuliaLang/julia Apr 23, 2018
This patch represents the combined efforts of four individuals, over 60
commits, and an iterated design over (at least) three pull requests that
spanned nearly an entire year (closes #22063, #23692, #25377 by superseding
them).

This introduces a pure Julia data structure that represents a fused broadcast
expression.  For example, the expression `2 .* (x .+ 1)` lowers to:

```julia
julia> Meta.@lower 2 .* (x .+ 1)
:($(Expr(:thunk, CodeInfo(:(begin
      Core.SSAValue(0) = (Base.getproperty)(Base.Broadcast, :materialize)
      Core.SSAValue(1) = (Base.getproperty)(Base.Broadcast, :make)
      Core.SSAValue(2) = (Base.getproperty)(Base.Broadcast, :make)
      Core.SSAValue(3) = (Core.SSAValue(2))(+, x, 1)
      Core.SSAValue(4) = (Core.SSAValue(1))(*, 2, Core.SSAValue(3))
      Core.SSAValue(5) = (Core.SSAValue(0))(Core.SSAValue(4))
      return Core.SSAValue(5)
  end)))))
```

Or, slightly more readably as:

```julia
using .Broadcast: materialize, make
materialize(make(*, 2, make(+, x, 1)))
```

The `Broadcast.make` function serves two purposes. Its primary purpose is to
construct the `Broadcast.Broadcasted` objects that hold onto the function, the
tuple of arguments (potentially including nested `Broadcasted` arguments), and
sometimes a set of `axes` to include knowledge of the outer shape. The
secondary purpose, however, is to allow an "out" for objects that _don't_ want
to participate in fusion. For example, if `x` is a range in the above `2 .* (x
.+ 1)` expression, it needn't allocate an array and operate elementwise — it
can just compute and return a new range. Thus custom structures are able to
specialize `Broadcast.make(f, args...)` just as they'd specialize on `f`
normally to return an immediate result.

`Broadcast.materialize` is identity for everything _except_ `Broadcasted`
objects for which it allocates an appropriate result and computes the
broadcast. It does two things: it `initialize`s the outermost `Broadcasted`
object to compute its axes and then `copy`s it.

Similarly, an in-place fused broadcast like `y .= 2 .* (x .+ 1)` uses the exact
same expression tree to compute the right-hand side of the expression as above,
and then uses `materialize!(y, make(*, 2, make(+, x, 1)))` to `instantiate` the
`Broadcasted` expression tree and then `copyto!` it into the given destination.

Altogether, this forms a complete API for custom types to extend and
customize the behavior of broadcast (fixes #22060). It uses the existing
`BroadcastStyle`s throughout to simplify dispatch on many arguments:

* Custom types can opt-out of broadcast fusion by specializing
  `Broadcast.make(f, args...)` or `Broadcast.make(::BroadcastStyle, f, args...)`
  (see the sketch after this list).

* The `Broadcasted` object computes and stores the type of the combined
  `BroadcastStyle` of its arguments as its first type parameter, allowing for
  easy dispatch and specialization.

* Custom Broadcast storage is still allocated via `broadcast_similar`, however
  instead of passing just a function as a first argument, the entire
  `Broadcasted` object is passed as a final argument. This potentially allows
  for much more runtime specialization dependent upon the exact expression
  given.

* Custom broadcast implementations for a `CustomStyle` are defined by
  specializing `copy(bc::Broadcasted{CustomStyle})` or
  `copyto!(dest::AbstractArray, bc::Broadcasted{CustomStyle})`.

* Fallback broadcast specializations for a given output object of type `Dest`
  (for the `DefaultArrayStyle` or another such style that hasn't implemented
  assignments into such an object) are defined by specializing
  `copyto!(dest::Dest, bc::Broadcasted{Nothing})`.
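As a hedged illustration of that opt-out hook (the `MyStep` type is invented for this sketch, and `Broadcast.make` is this patch's name for the entry point):

```julia
import Base.Broadcast  # `Broadcast.make`, as introduced by this patch

# Invented range-like type, used only for illustration.
struct MyStep
    offset::Int
    len::Int
end

# Opt out of fusion for `r .+ x`: eagerly return a new `MyStep` instead of a
# lazy `Broadcasted` tree that would materialize into an `Array`.
Broadcast.make(::typeof(+), r::MyStep, x::Int) = MyStep(r.offset + x, r.len)
```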

As it fully supports range broadcasting, this now deprecates `(1:5) + 2` to
`.+`, just as had been done for all `AbstractArray`s in general.

As a first-mover proof of concept, LinearAlgebra uses this new system to
improve broadcasting over structured arrays. Before, broadcasting over a
structured matrix would result in a sparse array. Now, broadcasting over a
structured matrix will _either_ return an appropriately structured matrix _or_
a dense array. This does incur a type instability (in the form of a
discriminated union) in some situations, but thanks to type-based introspection
of the `Broadcasted` wrapper commonly used functions can be special cased to be
type stable.  For example:

```julia
julia> f(d) = round.(Int, d)
f (generic function with 1 method)

julia> @inferred f(Diagonal(rand(3)))
3×3 Diagonal{Int64,Array{Int64,1}}:
 0  ⋅  ⋅
 ⋅  0  ⋅
 ⋅  ⋅  1

julia> @inferred Diagonal(rand(3)) .* 3
ERROR: return type Diagonal{Float64,Array{Float64,1}} does not match inferred return type Union{Array{Float64,2}, Diagonal{Float64,Array{Float64,1}}}
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] top-level scope

julia> @inferred Diagonal(1:4) .+ Bidiagonal(rand(4), rand(3), 'U') .* Tridiagonal(1:3, 1:4, 1:3)
4×4 Tridiagonal{Float64,Array{Float64,1}}:
 1.30771  0.838589   ⋅          ⋅
 0.0      3.89109   0.0459757   ⋅
  ⋅       0.0       4.48033    2.51508
  ⋅        ⋅        0.0        6.23739
```

In addition to the issues referenced above, it fixes:

* Fixes #19313, #22053, #23445, and #24586: Literals are no longer treated
  specially in a fused broadcast; they're just arguments in a `Broadcasted`
  object like everything else.

* Fixes #21094: Since broadcasting is now represented by a pure Julia
  datastructure it can be created within `@generated` functions and serialized.

* Fixes #26097: The fallback destination-array specialization method of
  `copyto!` is specifically implemented as `Broadcasted{Nothing}` and will not
  be confused by `nothing` arguments.

* Fixes the broadcast-specific element of #25499: The default base broadcast
  implementation no longer depends upon `Base._return_type` to allocate its
  array (except in the empty or concretely-typed cases). Note that the sparse
  implementation (#19595) is still dependent upon inference and is _not_ fixed.

* Fixes #25340: Functions are treated like normal values just like arguments
  and only evaluated once.

* Fixes #22255, and is performant with 12+ fused broadcasts. Okay, that one was
  fixed on master already, but this fixes it now, too.

* Fixes #25521.

* The performance of this patch has been thoroughly tested through its
  iterative development process in #25377. There remain [two classes of
  performance regressions](#25377) that Nanosoldier flagged.

* #25691: Constant literals still lose their constant-ness upon going
  through the broadcast machinery. I believe quite a large number of
  functions would need to be marked as `@pure` to support this -- including
  functions that are intended to be specialized.

(For bookkeeping, this is the squashed version of the [teh-jn/lazydotfuse](#25377)
branch as of a1d4e7e. Squashed and separated
out to make it easier to review and commit)

Co-authored-by: Tim Holy <[email protected]>
Co-authored-by: Jameson Nash <[email protected]>
Co-authored-by: Andrew Keller <[email protected]>
Keno referenced this issue in JuliaLang/julia Apr 27, 2018
@ChrisRackauckas

Is this not handled by the new broadcast, @mbauman?

@mbauman
Contributor

mbauman commented Jun 1, 2018

No, unfortunately not. The default dense case is fixed, but the sparse infrastructure is still built around relying on inference. It's a pretty major refactor to fix it as I recall.
