-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
faster A_mul_B! and * involving bidiagonal and tridiagonal matrices #15505
Conversation
end | ||
|
||
function A_mul_B_td!(C::AbstractMatrix, A::Tridiagonal, B::Tridiagonal) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove the unnecessary empty line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concretely, the vast majority of the existing code base does not have blank lines after signature, even when the body starts with a comment. Therefore, if we allow them we lose consistency an leave room for discussions, which wastes time when reviewing PRs. A simple rule without exceptions is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is an open discussion in the issues I linked to.
Thanks. It's great with the specialized methods. |
Thanks for all the feedback. Btw have you any idea, why the AppVeyor build failed? |
Someone might have a sharper eye than I do, but I think you just lucked out. The AppVeyor plan used allows for the total run time to be no more than 60 minutes. The entire thing on |
|
||
check_A_mul_B!_sizes(C, A, B) | ||
n = size(A,1) | ||
n <= 3 && return A_mul_B!(C, full(A), full(B)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there's any advantage to using full
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is stuff like C[2,4] = A[2,3]*B[3,4]
in the code, which violates bounds of small matrices. Thats why n <= 3
is special.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'd likely get much better performance by hoisting out and manually doing the transformation between A[2,3]
vs Adu = A.du
outside the loop then referring to Adu[2]
within the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tkelman I think I already did this. For indexing I used the following rules:
- Outside loops explicit indexing like
A[1,2]
is allowed for clarity. The performance overhead is negligible. - Inside loops do all indexing of BiTriSym matrices manually.
If you ever find code like A[i, j]
inside a loop, its because A
is not BiTriSym e.g. full or something.
Rebased and squashed, hopefully ready to merge now. |
A_mul_B!(C::AbstractMatrix, A::BiTri, B::BiTriSym) = A_mul_B_td!(C, A, B) | ||
A_mul_B!(C::AbstractMatrix, A::BiTriSym, B::BiTriSym) = A_mul_B_td!(C, A, B) | ||
A_mul_B!(C::AbstractMatrix, A::AbstractTriangular, B::BiTriSym) = A_mul_B_td!(C, A, B) | ||
A_mul_B!(C::AbstractMatrix, A::AbstractMatrix, B::BiTriSym) = A_mul_B_td!(C, A, B) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these 5 lines should probably be expressed with a Union. And the A::BiTri
line is redundant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mhh I don't know a simpler way then these 5 lines. Its easy to produce ambiguous definitions. For example if I remove the BiTri line I get
WARNING: New definition
A_mul_B!(AbstractArray{T<:Any, 2}, Union{Base.LinAlg.Tridiagonal, Base.LinAlg.Bidiagonal}, Union{AbstractArray{T<:Any, 2}, AbstractArray{T<:Any, 1}}) at linalg/bidiag.jl:236
is ambiguous with:
A_mul_B!(AbstractArray{T<:Any, 2}, Union{Base.LinAlg.Tridiagonal, Base.LinAlg.Bidiagonal, Base.LinAlg.SymTridiagonal}, Union{Base.LinAlg.Tridiagonal, Base.LinAlg.Bidiagonal, Base.LinAlg.SymTridiagonal}) at linalg/bidiag.jl:232.
To fix, define
A_mul_B!(AbstractArray{T<:Any, 2}, Union{Base.LinAlg.Tridiagonal, Base.LinAlg.Bidiagonal}, Union{Base.LinAlg.Tridiagonal, Base.LinAlg.Bidiagonal, Base.LinAlg.SymTridiagonal})
before the new definition.
WARNING: New definition
A_mul_B!(AbstractArray{T<:Any, 2}, Union{Base.LinAlg.Tridiagonal, Base.LinAlg.Bidiagonal}, Union{AbstractArray{T<:Any, 2}, AbstractArray{T<:Any, 1}}) at linalg/bidiag.jl:236
is ambiguous with:
A_mul_B!(AbstractArray{T<:Any, 2}, AbstractArray{T<:Any, 2}, Union{Base.LinAlg.Tridiagonal, Base.LinAlg.Bidiagonal, Base.LinAlg.SymTridiagonal}) at linalg/bidiag.jl:234.
To fix, define
A_mul_B!(AbstractArray{T<:Any, 2}, Union{Base.LinAlg.Tridiagonal, Base.LinAlg.Bidiagonal}, Union{Base.LinAlg.Tridiagonal, Base.LinAlg.Bidiagonal, Base.LinAlg.SymTridiagonal})
before the new definition.
WARNING: New definition
A_mul_B!(Union{AbstractArray{T<:Any, 2}, AbstractArray{T<:Any, 1}}, Union{Base.LinAlg.Tridiagonal, Base.LinAlg.Bidiagonal}, Union{AbstractArray{T<:Any, 2}, AbstractArray{T<:Any, 1}}) at linalg/bidiag.jl:237
is ambiguous with:
A_mul_B!(AbstractArray{T<:Any, 2}, Union{Base.LinAlg.Tridiagonal, Base.LinAlg.Bidiagonal, Base.LinAlg.SymTridiagonal}, Union{Base.LinAlg.Tridiagonal, Base.LinAlg.Bidiagonal, Base.LinAlg.SymTridiagonal}) at linalg/bidiag.jl:232.
To fix, define
A_mul_B!(AbstractArray{T<:Any, 2}, Union{Base.LinAlg.Tridiagonal, Base.LinAlg.Bidiagonal}, Union{Base.LinAlg.Tridiagonal, Base.LinAlg.Bidiagonal, Base.LinAlg.SymTridiagonal})
before the new definition.
WARNING: New definition
A_mul_B!(Union{AbstractArray{T<:Any, 2}, AbstractArray{T<:Any, 1}}, Union{Base.LinAlg.Tridiagonal, Base.LinAlg.Bidiagonal}, Union{AbstractArray{T<:Any, 2}, AbstractArray{T<:Any, 1}}) at linalg/bidiag.jl:237
is ambiguous with:
A_mul_B!(AbstractArray{T<:Any, 2}, AbstractArray{T<:Any, 2}, Union{Base.LinAlg.Tridiagonal, Base.LinAlg.Bidiagonal, Base.LinAlg.SymTridiagonal}) at linalg/bidiag.jl:234.
To fix, define
A_mul_B!(AbstractArray{T<:Any, 2}, Union{Base.LinAlg.Tridiagonal, Base.LinAlg.Bidiagonal}, Union{Base.LinAlg.Tridiagonal, Base.LinAlg.Bidiagonal, Base.LinAlg.SymTridiagonal})
One big union also definitely does not work. In fact I started with AbstractMatrix (which the Union boils down to)
+A_mul_B!(C::AbstractMatrix, A::AbstractMatrix, B::BiTriSym) = A_mul_B_td!(C, A, B)
and these five lines are what I had to do to fix ambiguities.
@tkelman Because life is too short to be waiting for sparse arrays being filled by inefficient methods.
|
function _fillnonzero!{Tv,Ti}(arr::SparseMatrixCSC{Tv, Ti}, val) | ||
m, n = size(arr) | ||
arr.colptr = convert(Vector{Ti}, collect(1:m:n*m+1)) | ||
arr.rowval = convert(Vector{Ti}, vcat([1:m for _ in 1:n]...)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should actually be in place, not replacing these with new arrays
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this and fill!
also should not be in this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You want things to be inplace for another epsilon of performance or is there another reason?
Where should this fill! be? It needs to be at a point, where SparseVector, SparseMatrixCSC type is already defined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SparseMatrixCSC is likely to become an immutable in the future (SparseVector already is) so that would force in place anyway.
Also the array might be pointed to by other variables and this would decouple them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah thanks for the hint, I changed to inplace. I already wondered why SparseVector is immutable, what is the reason to make these types immutable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few reasons, It for example enables various compiler optimizations like hoisting field loads out of loops. If you look at the various SparseMatrix functions now most of them manually hoist the loads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the terminology is a bit confusing. sparsematrix and sparsevector are composite types, where you very rarely need or want to wholesale replace the arrays they contain. the fields themselves are usually mutable arrays that can be modified, but only as long as it remains a reference to the same object. though being an immutable type would make it tough to modify the dimension parameters which I have needed to do once in a while.
Any remaining issues about this PR? If not, I would squash the commits again. |
C[i, j] = A[i, j-1] * Bj₋1j + A[i, j]*Bjj + A[i, j+1] * Bj₊1j | ||
end | ||
end | ||
end#inbounds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(nit) looks a little nicer with a space before and after the #
I wish this were achievable in a bit less code, but since it probably isn't, I think this is ready. Couple minor style nits. |
I agree there are some not so elegant places in this code, but I don't see how to make it shorter. Though I think writing a code generator, which creates A_mul_B! methods such as these automatically for lots of sparsity + memory layout patterns is in principle possible. Of course that would be a much much bigger project. Big thanks @tkelman and the others for all that feedback! |
Meh that travis build timed out, how can I force it to rerun, do I have to push something? |
…al, symmetric tridiagonal... matrices
Two
A_mul_B!(C, A, B)
methods whereB
orA, B
are tridiagonal are implemented. This gives faster matrix multiplication methods for some bi and tridiagonal argument combinations, which would fall back to full matrix multiplication before. See JuliaLang/LinearAlgebra.jl#263 and timings below.