Make sure reductions benefit from sparsity #244

Merged
merged 13 commits into from
Sep 16, 2022

Conversation

dkarrasch
Member

Fixes #237.

Review thread on src/sparsematrix.jl (outdated, resolved)
@SobhanMP
Member

SobhanMP commented Sep 1, 2022

maybe also for vectors?

    Co-authored-by: Sobhan Mohammadpour <[email protected]>
@dkarrasch
Member Author

dkarrasch commented Sep 1, 2022

I took the AdjOrTrans stuff from your PR; not sure why the co-authorship doesn't show up correctly. So, sum, all, any, etc. over the entire array are all instances of mapreduce, so things need to be managed at a lower level. In fact, I believe they are handled quite well already. One could think about specializing even more and have mapreduce(..., typeof(|), ...) and mapreduce(..., typeof(&), ...), which correspond to any and all, respectively, be computed via the difference between length and nnz or something. EDIT: I don't think that will work, because nonzeros(x) can still contain stored zeros.
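
To illustrate the EDIT above, here is a minimal sketch (not code from this PR) of why comparing `length` with `nnz` cannot decide `all`: `nnz` counts stored entries, and a stored entry may hold an explicit zero. The vector `x` is made up for the example.

```julia
using SparseArrays

# Build a sparse vector in which every position is stored,
# but the entry at index 3 is an explicit (stored) zero.
x = SparseVector(3, [1, 2, 3], [true, true, false])

nnz(x) == length(x)          # true: all three positions are stored
all(x)                       # false: the stored value at index 3 is `false`
count(!iszero, nonzeros(x))  # 2: only two stored values are actually nonzero
```

So a shortcut like `all(x) == (nnz(x) == length(x))` would give the wrong answer here; the stored values still have to be inspected.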

@SobhanMP
Member

SobhanMP commented Sep 1, 2022

the only thing bugging me is any of a matrix

julia> using Revise, SparseArrays, LinearAlgebra
       for f in [sum, any, all],
           t in [Int, Float64, Bool],
           a in [
               sprand(t, 100000, 100000, 0.00000001),
               sprand(t, 100000, 0.0001),
           ]
           f != sum && t != Bool && continue
           @show f
           f(a)
           f(transpose(a))
           f(a.nzval)
           @time f(a)
           @time f(transpose(a))
           @time f(a.nzval)
       end
[ Info: Precompiling SparseArrays [3f01184e-e22b-5df5-ae63-d93ebab69eaf]
f = sum
  0.000005 seconds (1 allocation: 16 bytes)
  0.000006 seconds (2 allocations: 64 bytes)
  0.000007 seconds (2 allocations: 64 bytes)
f = sum
  0.000003 seconds (1 allocation: 16 bytes)
  0.000003 seconds (2 allocations: 48 bytes)
  0.000003 seconds (2 allocations: 48 bytes)
f = sum
  0.000004 seconds (1 allocation: 16 bytes)
  0.000002 seconds (2 allocations: 64 bytes)
  0.000004 seconds (2 allocations: 64 bytes)
f = sum
  0.000001 seconds (1 allocation: 16 bytes)
  0.000002 seconds (2 allocations: 48 bytes)
  0.000002 seconds (2 allocations: 48 bytes)
f = sum
  0.000004 seconds
  0.000005 seconds (1 allocation: 48 bytes)
  0.000005 seconds (1 allocation: 48 bytes)
f = sum
  0.000239 seconds
  0.000232 seconds (1 allocation: 32 bytes)
  0.000003 seconds (1 allocation: 32 bytes)
f = any
  0.067989 seconds
  0.223590 seconds (1 allocation: 48 bytes)
  0.000019 seconds (1 allocation: 48 bytes)
f = any
  0.000007 seconds
  0.000006 seconds (1 allocation: 32 bytes)
  0.000001 seconds (1 allocation: 32 bytes)
f = all
  0.000002 seconds
  0.000003 seconds (1 allocation: 48 bytes)
  0.000002 seconds (1 allocation: 48 bytes)
f = all
  0.000003 seconds
  0.000003 seconds (1 allocation: 32 bytes)
  0.000002 seconds (1 allocation: 32 bytes)

i agree it's better to handle things at a lower level

i think it's fine?
[image]

@dkarrasch
Member Author

> i think it's fine?

I thought it should show the portrait. 😄

Are the timings without compilation?

@SobhanMP
Member

SobhanMP commented Sep 1, 2022

i think so (updated)

@codecov-commenter

codecov-commenter commented Sep 1, 2022

Codecov Report

Merging #244 (170a811) into main (dfcc48a) will increase coverage by 0.22%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #244      +/-   ##
==========================================
+ Coverage   91.82%   92.05%   +0.22%     
==========================================
  Files          12       12              
  Lines        7307     7314       +7     
==========================================
+ Hits         6710     6733      +23     
+ Misses        597      581      -16     
| Impacted Files | Coverage Δ | |
| --- | --- | --- |
| src/sparsematrix.jl | 95.37% <100.00%> (+0.67%) | ⬆️ |
| src/sparsevector.jl | 95.14% <100.00%> (+0.01%) | ⬆️ |


@dkarrasch
Member Author

Wonderful benchmark! The issue with any (and all, though less visible because it drops out early) is that

https://github.com/JuliaLang/julia/blob/71131c97cb00483597fcd357625c054693171aab/base/reduce.jl#L1210-L1221

has precedence over

https://github.com/JuliaLang/julia/blob/71131c97cb00483597fcd357625c054693171aab/base/reducedim.jl#L1010-L1025

see line 1023, because the latter is completely generic. So we need to hook in here to avoid that iterator-based implementation and redirect to mapreduce.
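
As a rough sketch of what "benefiting from sparsity" means for `any` and `all`, the helpers below visit only the stored entries and account for all structural zeros with a single call of the predicate. The names `sparse_any`, `sparse_all`, and `SparseColl` are hypothetical and exist only for this illustration; they are not the methods this PR adds, which hook into Base's reduction machinery instead.

```julia
using SparseArrays

# Hypothetical alias for the two container types used in the benchmark above.
const SparseColl = Union{SparseMatrixCSC, SparseVector}

# Hypothetical helper: `any` that never iterates over structural zeros.
function sparse_any(f, A::SparseColl)
    # Every structural zero maps to the same value, so evaluate f(zero) once.
    nnz(A) < length(A) && f(zero(eltype(A))) && return true
    # Stored entries (which may include explicit zeros) are checked individually.
    return any(f, nonzeros(A))
end

# Hypothetical helper: `all` with the same idea.
function sparse_all(f, A::SparseColl)
    nnz(A) < length(A) && !f(zero(eltype(A))) && return false
    return all(f, nonzeros(A))
end

A = sprand(Bool, 100_000, 100_000, 1e-8)
sparse_any(identity, A)  # cost scales with nnz(A), not with length(A)
sparse_all(identity, A)  # returns early: A has structural zeros
```

The generic iterator-based fallback, by contrast, has to walk all `length(A)` positions when no element satisfies the predicate, which is the slow `any` case in the benchmark.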

@dkarrasch
Member Author

Could you double check your benchmark? I still can't get a local dev version running.

@SobhanMP
Member

SobhanMP commented Sep 1, 2022

nvm give me sec.

@SobhanMP
Member

SobhanMP commented Sep 1, 2022

yup looks good

julia> using Revise, SparseArrays, LinearAlgebra
       for f in [sum, any, all],
           t in [Int, Float64, Bool],
           a in [
               sprand(t, 100000, 100000, 0.00000001),
               sprand(t, 100000, 0.0001),
           ]
           f != sum && t != Bool && continue
           println("\n\n")
           @show f, typeof(a)
           f(a)
           f(transpose(a))
           f(a.nzval)
       
           @time f(a)
           # f(view(a, axes(a)...))
           # @time f(view(a, axes(a)...))
           @time f(transpose(a))
           @time f(a.nzval)
       end
[ Info: Precompiling SparseArrays [3f01184e-e22b-5df5-ae63-d93ebab69eaf]



(f, typeof(a)) = (sum, SparseMatrixCSC{Int64, Int64})
  0.000004 seconds (1 allocation: 16 bytes)
  0.000007 seconds (2 allocations: 64 bytes)
  0.000008 seconds (2 allocations: 64 bytes)



(f, typeof(a)) = (sum, SparseVector{Int64, Int64})
  0.000005 seconds (1 allocation: 16 bytes)
  0.000004 seconds (2 allocations: 48 bytes)
  0.000005 seconds (2 allocations: 48 bytes)



(f, typeof(a)) = (sum, SparseMatrixCSC{Float64, Int64})
  0.000005 seconds (1 allocation: 16 bytes)
  0.000004 seconds (2 allocations: 64 bytes)
  0.000005 seconds (2 allocations: 64 bytes)



(f, typeof(a)) = (sum, SparseVector{Float64, Int64})
  0.000006 seconds (1 allocation: 16 bytes)
  0.000004 seconds (2 allocations: 48 bytes)
  0.000005 seconds (2 allocations: 48 bytes)



(f, typeof(a)) = (sum, SparseMatrixCSC{Bool, Int64})
  0.000004 seconds
  0.000003 seconds (1 allocation: 48 bytes)
  0.000004 seconds (1 allocation: 48 bytes)



(f, typeof(a)) = (sum, SparseVector{Bool, Int64})
  0.000004 seconds
  0.000004 seconds (1 allocation: 32 bytes)
  0.000004 seconds (1 allocation: 32 bytes)



(f, typeof(a)) = (any, SparseMatrixCSC{Bool, Int64})
  0.000012 seconds (2 allocations: 64 bytes)
  0.000007 seconds (3 allocations: 112 bytes)
  0.000003 seconds (1 allocation: 48 bytes)



(f, typeof(a)) = (any, SparseVector{Bool, Int64})
  0.000003 seconds
  0.000003 seconds (1 allocation: 32 bytes)
  0.000001 seconds (1 allocation: 32 bytes)



(f, typeof(a)) = (all, SparseMatrixCSC{Bool, Int64})
  0.000007 seconds (2 allocations: 64 bytes)
  0.000004 seconds (3 allocations: 112 bytes)
  0.000002 seconds (1 allocation: 48 bytes)



(f, typeof(a)) = (all, SparseVector{Bool, Int64})
  0.000004 seconds
  0.000004 seconds (1 allocation: 32 bytes)
  0.000003 seconds (1 allocation: 32 bytes)

Review thread on src/sparsevector.jl (outdated, resolved)
@dkarrasch changed the title from "Add count w/o predicate" to "Make sure reductions benefit from sparsity" on Sep 6, 2022
@dkarrasch
Member Author

@SobhanMP Could you please check that performance is good, without the transpose cases? If yes, then I think this is ready to go.

@dkarrasch
Member Author

I realized we already have

https://github.com/JuliaLang/julia/blob/fa3981bf83a016e2fb48f51204ccbf9d8d66397c/stdlib/LinearAlgebra/src/adjtrans.jl#L382-L385

so the desired adjoint/transpose behavior should already be included. So, even if we don't have tests for which specific code route should be taken, we should test that reduction over adjoints of sparse matrices is fast.
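
A minimal consistency check along these lines might look as follows. It is only a sketch, not a test from this PR: it verifies that whole-array reductions over `transpose`/`adjoint` wrappers agree with the plain sparse matrix, and leaves the speed aspect to timing runs like the `@time` benchmark earlier in this thread. The matrix size and density are arbitrary.

```julia
using SparseArrays, LinearAlgebra

# Arbitrary small test matrix; Bool entries as in the benchmark above.
A = sprand(Bool, 1_000, 1_000, 0.001)

for f in (sum, any, all, count)
    # Whole-array reductions do not depend on the orientation of the matrix.
    @assert f(transpose(A)) == f(A)
    @assert f(A') == f(A)
end
```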

@dkarrasch
Member Author

Together with JuliaLang/julia#46605, all benchmarks run within nanoseconds and plain sparse arrays and their transpose take pretty much the same amount of time. Let's go with this.

@dkarrasch merged commit 0d63db0 into main on Sep 16, 2022
@dkarrasch deleted the dk/count branch on September 16, 2022 10:42
fredrikekre added a commit to JuliaLang/julia that referenced this pull request Sep 22, 2022
This patch updates SparseArrays. In particular it contains
JuliaSparse/SparseArrays.jl#260 which is
necessary to make progress in #46759.

All changes:
 - Fix ambiguities with Base. (JuliaSparse/SparseArrays.jl#268)
 - add == for vectors (JuliaSparse/SparseArrays.jl#248)
 - add undef initializers (JuliaSparse/SparseArrays.jl#263)
 - Make sure reductions benefit from sparsity (JuliaSparse/SparseArrays.jl#244)
 - Remove fkeep! from the documentation (JuliaSparse/SparseArrays.jl#261)
 - Fix direction of circshift (JuliaSparse/SparseArrays.jl#260)
 - Fix `vcat` of sparse vectors with numbers (JuliaSparse/SparseArrays.jl#253)
 - decrement should always return a vector (JuliaSparse/SparseArrays.jl#241)
 - change order of arguments in fkeep, fix bug with fixed elements (JuliaSparse/SparseArrays.jl#240)
 - Sparse matrix/vectors with fixed sparsity pattern. (JuliaSparse/SparseArrays.jl#201)
fredrikekre added a commit to JuliaLang/julia that referenced this pull request Sep 27, 2022
Successfully merging this pull request may close these issues.

sum of a Bool sparse array is slow