Matrix Multiplication API #473

jebej · 2017-09-28T18:12:09Z

Currently, there is the following line in the sparse matmult code:

https://github.com/JuliaLang/julia/blob/056b374919e11977d5a8d57b446ad1c72f3e6b1d/base/sparse/linalg.jl#L94-L95

I am assuming that this means we want to have the more general A_mul_B*!(α,A,B,β,C) = αAB + βC methods which overwrite C (the BLAS gemm API) for dense arrays. Is that still the case? (It also seems possible to keep both APIs, i.e. keep the A_mul_B*!(C,A,B) methods, which would simply be defined as A_mul_B*!(C,A,B) = A_mul_B*!(1,A,B,0,C).)

I would personally like to have the gemm API defined for all array types (this has been expressed elsewhere). Implementing these methods for dense arrays seems fairly simple, since they would just directly call gemm et al. The sparse case is already implemented. The only real modification would be to the pure julia generic matmult, which does not accept α and β arguments.

This would lead to generic code that works with any type of array/number. I currently have a simple implementation of expm (after having done the modification to _generic_matmatmult!) which works with bigfloats and sparse arrays.

The text was updated successfully, but these errors were encountered:

Sacha0 · 2017-09-28T18:45:46Z

Ref. #160, JuliaLang/julia#20053, #464. Best!

jebej · 2017-09-28T18:52:49Z

Thanks for the references. I suppose this issue has more to do with adding the gemm-style methods than an API revamp, but it can be closed if we think it still is too similar to #160.

jebej · 2017-09-30T14:05:53Z

As a starting point, would there be support for having _generic_matmatmul! have the gemm API? It's a fairly simple change, and purely additive/nonbreaking, since the current method would simply be implemented by having α=1 and β=0. I can make the PR. I'd probably go similarly to this version (in that version I cut all the transpose stuff because I didn't need it, I wouldn't do that here).

andreasnoack · 2017-10-01T08:10:04Z

Yes. That would be a good start. We need to consider the ordering of the arguments though. Originally, I thought it was more natural to follow the BLAS ordering but we fairly consistent about having output arguments first which is also the case for the current three-argument A_mul_B!. Furthermore and as you have already pointed out, the three-argument version would correspond to the five-argument version with α=1 and β=0 and default value arguments are last. Of course, we don't necessarily have to use the default value syntax for this but it would make sense to use it here.

StefanKarpinski · 2017-10-02T17:28:41Z

Why not just introduce a generic gemm function?

jebej · 2017-10-03T00:21:06Z

Yes. That would be a good start. We need to consider the ordering of the arguments though. Originally, I thought it was more natural to follow the BLAS ordering but we fairly consistent about having output arguments first which is also the case for the current three-argument A_mul_B!. Furthermore and as you have already pointed out, the three-argument version would correspond to the five-argument version with α=1 and β=0 and default value arguments are last. Of course, we don't necessarily have to use the default value syntax for this but it would make sense to use it here.

Sounds good. We can continue the discussion about the actual argument ordering and method renaming in #160. This is more about just having the five-argument version available, so I'll keep the current Ax_mul_Bx!(α,A,B,β,C) interface.

Why not just introduce a generic gemm function?

Are you suggesting renaming _generic_matmatmul! to gemm! in addition to the changes above?

jebej · 2017-10-03T00:26:03Z

To be clearer, I do think we should end up having a single method mul(C,A,B,α=1,β=0), along with lazy transpose/adjoint types to dispatch on, but that will be an other PR.

andreasnoack · 2017-10-03T06:38:27Z

Why not just introduce a generic gemm function?

I think gemm is a misnomer in Julia. In BLAS, the ge part indicates that the matrices are general, i.e. have no special structure, the first m is multiply and the list m is matrix. In Julia, the ge (general) part is encoded in the signature as is the last m (matrix) so I think we should just call it mul!.

dlfivefifty · 2018-02-14T01:21:57Z

Is the notation mul!(α, A, B, β, C) from SparseArrays.jl

https://github.com/JuliaLang/julia/blob/160a46704fd1b349b5425f104a4ac8b323ea85af/stdlib/SparseArrays/src/linalg.jl#L32

"finalised" as the official syntax? And this would be in addition to mul!(C, A, B), lmul!(A, B), and rmul!(A,B)?

I'm not too big a fan of having A, B, and C in different orders.

andreasnoack · 2018-02-14T08:27:41Z

Is the notation mul!(α, A, B, β, C) from SparseArrays.jl "finalised" as the official syntax?

I'd say no. Originally, I liked the idea of following BLAS (and the order also matched how this is usually written mathematically) but now I think it makes sense to just add the scaling arguments as optional fourth and fifth arguments.

dlfivefifty · 2018-02-14T21:21:11Z

So just to clarify, you'd like optional arguments in the sense

function mul!(C, A, B, α=1, β=0)
 ...
end

The other option would be optional keyword arguments

function mul!(C, A, B; α=1, β=0)
...
end

but I'm not sure people are too happy with unicode.

andreasnoack · 2018-02-14T22:15:12Z

I'm very happy with unicode but it is true that we try always have an ascii option and it would be possible here. The names α and β are also not super intuitive unless you know BLAS so I think using positional arguments the better solution here.

#26117) * mul!(α, A, x, β, y) -> mul!(y, A, x, a, b) (#23919) * missing )

perrutquist · 2018-07-06T07:40:55Z

In my opinion, a more logical nomenclature would be to let muladd!(A, B, C, α=1, β=1) map to the various BLAS routines that do multiplication and addition. (gemm as above, but also e.g. axpy when A is a scalar.)

The mul! function could then have an interface like mul!(Y, A, B, ...) taking an arbitrary number of arguments (just like * does) as long as all intermediate results can be stored in Y. (An optional kwarg could specify the order of multiplication, with a reasonable default)

mul!(Y, A::AbstractVecOrMat, B:AbstractVecOrMat, α::Number) would have the default implementation muladd!(A, B, Y, α=α, β=0), as would the other permutations of two matrix/vectors and a scalar.

matthieugomez · 2018-07-08T00:34:44Z

Another vote to have mul!(C, A, B, α, β) defined for dense matrices. This would allow to write generic code for dense and sparse matrices. I wanted to define such a function in my non linear least squares package but I guess this is type piracy.

dmbates · 2018-10-03T17:41:49Z

I also have been tempted to write mul!(C, A, B, α, β) methods for the MixedModels package and engage in a bit of type piracy, but it would be much better if such methods were in the LinearAlgebra package. Having methods for a muladd! generic for this operation would be fine with me too.

simonbyrne · 2018-10-03T21:23:25Z

I'm in favour, though I think it should probably have a different name than just mul!. muladd! seems reasonable, but am certainly open to suggestions.

simonbyrne · 2018-10-03T21:23:55Z

Maybe mulinc! for multiply-increment?

YingboMa · 2018-10-03T22:26:28Z

Maybe we can have something like addmul!(C, A, B, α=1, β=1)?

StefanKarpinski · 2018-10-04T03:59:28Z

Isn’t this a form of muladd!? Or is the idea behind calling it addmul! that it mutates the add argument rather than the multiply argument? Would one ever mutate the multiply argument?

simonbyrne · 2018-10-04T04:31:32Z

Note that we do mutate non-first elements in some cases, e.g. lmul! and ldiv!, so we could do them in the usual "muladd" order (i.e. muladd!(A,B,C)). The question is what order should α and β go? One option would be to make the keyword arguments?

tkf · 2018-10-04T05:07:01Z

Wouldn't it be nice if you leave an option to the implementers to dispatch on the types of the scalars α and β? It's easy to add sugars for end-users.

andreasnoack · 2018-10-04T06:57:51Z

I thought we already settled on mul!(C, A, B, α, β) with default values for α, β. We are using this version in https://github.com/JuliaLang/julia/blob/b8ca1a499ff4044b9cb1ba3881d8c6fbb1f3c03b/stdlib/SparseArrays/src/linalg.jl#L32-L50. I think some packages are also using this form but I don't remember which on top of my head.

tkf · 2018-10-04T07:48:08Z

Thanks! It would be nice if that's documented.

simonbyrne · 2018-10-04T15:32:35Z

I thought we already settled on mul!(C, A, B, α, β) with default values for α, β.

SparseArrays uses it, but I don't recall it being discussed anywhere.

dmbates · 2018-10-04T16:04:52Z

In some ways the muladd! name is more natural because it is a multiplication followed by an addition. However, the default values of α and β, muladd!(C, A, B, α=1, β=0) (note that the default for β is zero, not one), turn it back into mul!(C, A, B).

It seems to be a case of pedantry vs consistency whether to call this mul! or muladd! and I think the case of the existing method in SparseArrays would argue for mul!. I find myself in the curious case of arguing for consistency despite my favorite Oscar Wilde quote, "Consistency is the last refuge of the unimaginative."

simonbyrne · 2018-10-04T18:05:21Z

In some ways the muladd! name is more natural because it is a multiplication followed by an addition. However, the default values of α and β, muladd!(C, A, B, α=1, β=0) (note that the default for β is zero, not one), turn it back into mul!(C, A, B).

There is an interesting exception to this when C contains Infs or NaNs: theoretically, if β==0, the result should still be NaNs. This doesn't happen in practice because BLAS and our sparse matrix code explicitly check for β==0 then replace it with zeros.

tkf · 2019-08-12T02:27:58Z

I'd vote for strong zero (:heart:) if it were "Merge for 1.x" instead of "Merge for 1.3". Wouldn't the options make sense if all "Merge for 1.3" are "Merge 1.x"?

chriscoey · 2019-08-12T03:46:04Z

Thanks @chethega. @tkf I am in the position of really needing the new mul! ASAP without caring too much about the NaN decision (as long as performance is not compromised).

tkf · 2019-08-12T03:59:19Z

Did you check out LazyArrays.jl? FYI, it has really nice fused matrix multiplication support. You can also use BLAS.gemm! etc. safely since they are public methods https://docs.julialang.org/en/latest/stdlib/LinearAlgebra/#LinearAlgebra.BLAS.gemm!

chriscoey · 2019-08-12T04:05:11Z

I actually truly need the generic mul! since we use a wide variety of structured and sparse and plain old dense matrices, for representing different optimization problems most efficiently. I'm here for the genericity and speed.

tkf · 2019-08-12T04:37:59Z

I see. And I just remembered that we discussed things in LazyArrays.jl so of course you already knew it...

Regarding "ASAP", Julia's four months release cycle is, at least as a design, to avoid "merge rush" just before feature freeze. I know it's not fair for me to say this because I've tried the same thing before... But I think someone needs to mention this as a reminder. The bright side is Julia is super easy to build. You can start using it right after it is merged until next release.

Edit: release -> merge

chriscoey · 2019-08-12T05:04:55Z

Thanks. I find deadlines to be helpful motivators, and I would like to avoid letting this go stale again. So I would propose that we try to use the deadline as a goal.

tkf · 2019-08-12T05:16:20Z

It's great that you are actively injecting energy to this thread!

dlfivefifty · 2019-08-12T06:07:38Z

I actually truly need the generic mul! since we use a wide variety of structured and sparse and plain old dense matrices, for representing different optimization problems most efficiently. I'm here for the genericity and speed.

A 5-argument mul! won't work well when you have many different types: you'll need combinatorially many overrides to avoid ambiguity. This is one of the motivations behind LazyArrays.jl MemoryLayout system. It's used for "structured and sparse" matrices precisely for this reason in BandedMatrices.jl and BlockBandedMatrices.jl. (Here even sub views of banded matrices dispatch to banded BLAS routines.)

chriscoey · 2019-08-12T06:45:46Z

Thanks, I will try LazyArrays again.

Since the 5-arg mul seems to be generally considered as a temporary stopgap (until some solution like LazyArrays can be used in 2.0), I think we can aim to get it merged without necessarily being the ideal or perfect solution for the long run.

@chethega when do you think we should count the tallies for the new nonbinding vote?

chethega · 2019-08-12T08:56:34Z

@tkf Sure, you're right that strong/weak/undef zeros make sense for 1.x as well.

However, I think there are quite a few people who would rather have 1.3 mul! now than wait until 1.4 to get 5-arg mul!. If there was no deadline, then I'd wait some more and take some more time for thinking how to conduct a proper poll (at least 10 days for voting). Most importantly, we cannot have a meaningful vote without first presenting and benchmarking competing implementations on the speed and elegance of weak/strong zeros. I personally suspect that weak zeros could be made almost as fast as strong zeros, by first checking iszero(alpha), then scanning the matrix for !isfinite values, and only then using a slow path with extra allocation; but I prefer strong zero semantics anyway.

@chethega when do you think we should count the tallies for the new nonbinding vote?

Triage must make a decision (delay / strong / weak / backstop) this week for the 1.3 alpha. I think Thursday 15th or Wednesday 14th are sensible options for triage to make a count, and take it into consideration. I probably won't be able to join on Thursday, so someone else will have to count.

Realistically, it is OK to be conservative here, and miss the deadline, and continue the discussion and wait for 1.4.

On the other hand, we may already have reached consensus without noticing it: @andreasnoack made some powerful arguments that a zero coefficient should be a strong zero. It may well be that he managed to convince all the weak zero proponents. It may well be that there is a large majority that really wants 5-arg mul!, preferably last year, and does not really care about this little detail. If that was the case, then it would be a pity to further delay the feature, just because no one wants to shut up the discussion.

dlfivefifty · 2019-08-12T09:25:14Z

Why not just throw an error for now:

β == 0.0 && any(isnan,C) && throw(ArgumentError("use β = false"))

chethega · 2019-08-12T12:08:07Z

Why not just throw an error for now

I added that option to the poll. Great compromise idea!

StefanKarpinski · 2019-08-12T14:06:37Z

Just to set expectations: the feature freeze for 1.3 is in three days so there’s basically no way this can make it in time. We’re being pretty strict about feature freezes and branching since it’s the only part of the release cycle that we can really control the timing of.

andreasnoack · 2019-08-12T14:16:26Z

The work is already done in JuliaLang/julia#29634 though. Just needs to be adjusted and rebased.

chriscoey · 2019-08-12T14:27:53Z

@tkf for JuliaLang/julia#29634 could you list the work that remains to be done (incl renaming and handling zeros according to the vote)? I know you're busy so perhaps we can find a way to split any remaining todos so the burden won't fall on you again.

tkf · 2019-08-13T01:24:34Z

The TODOs I can think of ATM are:

rebase
rename addmul! -> mul!
- and adjust code assuming that the new API wouldn't be called mul! (e.g., @deprecate mul!(C, A, B, α, β) addmul!(C, A, B, α, β))
get OK from core dev about the implementation strategy of MulAddMul etc.; see Implement multiply-add interface in LinearAlgebra julia#29634 (comment). @andreasnoack, did you have chance to look at it?
ideally add tests using Quaternion; see Implement multiply-add interface in LinearAlgebra julia#29634 (comment)

My PR implements BLAS semantics of the β = 0 handling. So other handling, like throwing an error, also has to be implemented.

tkf · 2019-08-13T03:05:45Z

My PR implements BLAS semantics of the β = 0 handling.

Sorry, my memory was stale; my implementation was not consistent and propagates NaN sometimes. So additional TODO is to make the behavior of β = 0.0 consistent.

andreasnoack · 2019-08-13T07:21:19Z

The MulAddMul type would just be for internal use, right?

tkf · 2019-08-13T07:48:27Z

Yes, it's completely an internal detail. My worries were (1) there may be too many specializations (beta=0 etc. is encoded in the type parameter) and (2) it decreases the readability of the source code.

andreasnoack · 2019-08-13T08:20:45Z

Those are valid concerns. We already produce a ton of specializations in the linear algebra code so it's good to think through if we really need to specialize here. My thinking has generally been that we should optimize for small matrices since it not free (as you said, it complicates the source code and could increase compile times) and people are better off using StaticArrays for small matrix multiplications. Hence, I lean towards just checking the values at runtime but we can always adjust this later if we change our mind so we shouldn't let this cause any delays.

dlfivefifty · 2019-08-13T08:52:45Z

FYI soft zeros do have simple implementations:

if iszero(β) && β !== false && !iszero(α)
   lmul!(zero(T),y) # this handles soft zeros correctly
   BLAS.gemv!(α, A, x, one(T), y) # preserves soft zeros
elseif iszero(α) && iszero(β)
   BLAS.gemv!(one(T), A, x, one(T), y) # puts NaNs in the correct place
   lmul!(zero(T), y) # everything not NaN should be zero
elseif iszero(α) && !iszero(β)
   BLAS.gemv!(one(T), A, x, β, y) # puts NaNs in the correct place
   BLAS.gemv!(-one(T), A, x, one(T), y) # subtracts out non-NaN changes
end

tkf · 2019-08-13T09:04:23Z

@andreasnoack Sorry, I forgot that we actually needed the specialization to optimize the inner-most loop for some structured matrices like mul!(C, A::BiTriSym, B, α, β) JuliaLang/julia#29634 (comment). Some specializations can be removed but that's actually more work (so delays).

chriscoey · 2019-08-13T12:09:30Z

Hence, I lean towards just checking the values at runtime but we can always adjust this later if we change our mind so we shouldn't let this cause any delays.

Great!

andreasnoack · 2019-08-14T21:32:42Z

Fixed by JuliaLang/julia#29634

tkf · 2019-08-14T21:39:42Z

@andreasnoack Thanks a lot for reviewing and merging this in time!

Now that it is merged for 1.3 it started makes me very nervous about the implementation 😄. I appreciate if people here can test their linear algebra code more thoroughly when 1.3-rc comes out!

KristofferC · 2019-08-15T08:17:01Z

No need to worry, there will be plenty of time for 1.3 RCs + PkgEvals to shake out bugs.

Sacha0 added the linear algebra label Sep 28, 2017

dlfivefifty referenced this issue in dlfivefifty/julia Feb 19, 2018

mul!(α, A, x, β, y) -> mul!(y, A, x, a, b) (#23919)

9db2a59

dlfivefifty mentioned this issue Feb 19, 2018

mul!(α, A, x, β, y) -> mul!(y, A, x, a, b) in SparseArrays.jl (#23919) JuliaLang/julia#26117

Merged

andreasnoack referenced this issue in JuliaLang/julia Feb 24, 2018

mul!(α, A, x, β, y) -> mul!(y, A, x, a, b) in SparseArrays.jl (#23919) (

e4b6677

#26117) * mul!(α, A, x, β, y) -> mul!(y, A, x, a, b) (#23919) * missing )

StefanKarpinski closed this as completed Aug 12, 2019

StefanKarpinski reopened this Aug 12, 2019

andreasnoack closed this as completed Aug 14, 2019

KristofferC transferred this issue from JuliaLang/julia Nov 26, 2024

Matrix Multiplication API #473

Matrix Multiplication API #473

Comments

jebej commented Sep 28, 2017

Sacha0 commented Sep 28, 2017

jebej commented Sep 28, 2017

jebej commented Sep 30, 2017

andreasnoack commented Oct 1, 2017

StefanKarpinski commented Oct 2, 2017

jebej commented Oct 3, 2017

jebej commented Oct 3, 2017

andreasnoack commented Oct 3, 2017

dlfivefifty commented Feb 14, 2018

andreasnoack commented Feb 14, 2018 • edited Loading

dlfivefifty commented Feb 14, 2018

andreasnoack commented Feb 14, 2018

perrutquist commented Jul 6, 2018

matthieugomez commented Jul 8, 2018 • edited Loading

dmbates commented Oct 3, 2018

simonbyrne commented Oct 3, 2018

simonbyrne commented Oct 3, 2018

YingboMa commented Oct 3, 2018

StefanKarpinski commented Oct 4, 2018

simonbyrne commented Oct 4, 2018

tkf commented Oct 4, 2018

andreasnoack commented Oct 4, 2018

tkf commented Oct 4, 2018

simonbyrne commented Oct 4, 2018

dmbates commented Oct 4, 2018

simonbyrne commented Oct 4, 2018 • edited Loading

tkf commented Aug 12, 2019

chriscoey commented Aug 12, 2019

tkf commented Aug 12, 2019 • edited Loading

chriscoey commented Aug 12, 2019

tkf commented Aug 12, 2019 • edited Loading

chriscoey commented Aug 12, 2019

tkf commented Aug 12, 2019

dlfivefifty commented Aug 12, 2019

chriscoey commented Aug 12, 2019

chethega commented Aug 12, 2019

dlfivefifty commented Aug 12, 2019

chethega commented Aug 12, 2019

StefanKarpinski commented Aug 12, 2019 • edited Loading

andreasnoack commented Aug 12, 2019

chriscoey commented Aug 12, 2019

tkf commented Aug 13, 2019

tkf commented Aug 13, 2019

andreasnoack commented Aug 13, 2019

tkf commented Aug 13, 2019

andreasnoack commented Aug 13, 2019

dlfivefifty commented Aug 13, 2019

tkf commented Aug 13, 2019

chriscoey commented Aug 13, 2019

andreasnoack commented Aug 14, 2019

tkf commented Aug 14, 2019

KristofferC commented Aug 15, 2019

andreasnoack commented Feb 14, 2018 •

edited

Loading

matthieugomez commented Jul 8, 2018 •

edited

Loading

simonbyrne commented Oct 4, 2018 •

edited

Loading

tkf commented Aug 12, 2019 •

edited

Loading

tkf commented Aug 12, 2019 •

edited

Loading

StefanKarpinski commented Aug 12, 2019 •

edited

Loading