Switch from internal 5-arg searchsorted* methods to views #440

Merged: 2 commits merged into main from lh/searchsorted on Sep 10, 2023

LilithHafner
Member

There should be little performance impact:

julia> x = cumsum(rand(1000));

julia> @btime searchsortedfirst($x, 250, 400, 600, Base.Forward)
  14.612 ns (0 allocations: 0 bytes)
510

julia> @btime searchsortedfirst(view($x, 400:600), 250)
  14.612 ns (0 allocations: 0 bytes)
111

See JuliaLang/julia#51177 and JuliaLang/julia#51176 for motivation.

@codecov

codecov bot commented Sep 5, 2023

Codecov Report

Merging #440 (cf4aac5) into main (4e6776a) will increase coverage by 0.78%.
Report is 2 commits behind head on main.
The diff coverage is 80.64%.

@@            Coverage Diff             @@
##             main     #440      +/-   ##
==========================================
+ Coverage   92.42%   93.20%   +0.78%     
==========================================
  Files          12       12              
  Lines        7667     7668       +1     
==========================================
+ Hits         7086     7147      +61     
+ Misses        581      521      -60     
| Files Changed | Coverage Δ | |
|---|---|---|
| src/linalg.jl | 91.56% <75.00%> (+4.56%) | ⬆️ |
| src/sparsematrix.jl | 95.68% <77.77%> (+<0.01%) | ⬆️ |
| src/solvers/cholmod.jl | 89.81% <100.00%> (ø) | |
| src/sparsevector.jl | 95.46% <100.00%> (ø) | |

... and 1 file with indirect coverage changes

Member

@dkarrasch dkarrasch left a comment

LGTM

@LilithHafner LilithHafner merged commit ac5c8ed into main Sep 10, 2023
@LilithHafner LilithHafner deleted the lh/searchsorted branch September 10, 2023 16:43
@fredrikekre
Member

The benchmark in the OP is not benchmarking the direct replacement. Is there a benchmark for e.g. getindex(::SparseMatrixCSC, ::Int, ::Int)?

@LilithHafner
Member Author

Adding the drop-in replacement to the benchmark in the OP doesn't change much:

julia> @btime searchsortedfirst($x, 250, 400, 600, Base.Forward)
  18.597 ns (0 allocations: 0 bytes)
498

julia> @btime searchsortedfirst(view($x, 400:600), 250) + 400 - 1
  18.806 ns (0 allocations: 0 bytes)
498

julia> @btime searchsortedfirst(view($x, 400:600), 250)
  18.806 ns (0 allocations: 0 bytes)
99

I expect the extra addition and decrement to be negligible in a larger program: the decrement can probably be eliminated one way or another, and the addition might increase or decrease the total work, depending on whether it cancels out another operation.
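
To make the index arithmetic concrete, here is a minimal sketch of why the `+ lo - 1` shift recovers parent indices from a search over a view. The helper names `first_geq_in_range` and `first_geq_via_view` are hypothetical and exist only for this illustration. The reference is a plain linear scan rather than the internal 5-arg method, since that method is internal to Base and may not be available.

```julia
# Hypothetical reference: first index k in lo:hi with v[k] >= x, else hi + 1.
function first_geq_in_range(v::AbstractVector, x, lo::Integer, hi::Integer)
    for k in lo:hi
        v[k] >= x && return k
    end
    return hi + 1
end

# View-based form: search inside the window, then shift the result back to
# indices of the parent vector. Index 1 of the view is index lo of the parent.
function first_geq_via_view(v::AbstractVector, x, lo::Integer, hi::Integer)
    return searchsortedfirst(view(v, lo:hi), x) + lo - 1
end

v = cumsum(fill(0.5, 1000))   # sorted and deterministic: v[k] == 0.5k
lo, hi = 400, 600

# Both forms agree, including the "not found" case, which yields hi + 1.
println(first_geq_via_view(v, 250, lo, hi) == first_geq_in_range(v, 250, lo, hi))
println(first_geq_via_view(v, 1000, lo, hi) == hi + 1)
```

The shift is a single integer addition per lookup, which is the extra work discussed above.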

For end-to-end benchmarks, I'm getting noisy and unreliable results :( However, they do not show a regression, and I'd be somewhat surprised if this change caused a noticeable one.

Before

julia> versioninfo()
Julia Version 1.11.0-DEV.471
Commit 153b538ff0* (2023-09-05 21:23 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  CPU: 8 × Apple M2
  WORD_SIZE: 64
  LLVM: libLLVM-15.0.7 (ORCJIT, apple-m1)
  Threads: 1 on 4 virtual cores
Environment:
  JULIA_EDITOR = code

julia> using SparseArrays, BenchmarkTools, Random

julia> Random.seed!(1729)
TaskLocalRNG()

julia> x = sprand(1000, 1000, .001)
1000×1000 SparseMatrixCSC{Float64, Int64} with 1057 stored entries:
⎡⢉⠀⠐⠀⠁⠁⠂⠀⠁⠄⠀⠠⠀⠀⠈⠁⠈⠀⡁⡁⠀⢀⠀⠠⠃⠉⠆⡀⠢⠠⢀⠂⠂⠀⠠⠀⠄⠄⠄⠀⎤
⎢⠰⡗⠈⠥⠁⠄⠠⢀⠡⠂⠀⠀⠄⠄⠤⡅⢀⠂⡄⡀⠀⠂⠁⠔⡀⠘⠀⢈⠀⡰⠌⠀⢰⠀⠒⠂⠀⠒⢂⠀⎥
⎢⢤⡀⠀⠈⠐⠀⠈⠄⠀⠀⠀⡤⡀⠀⠄⡐⠌⢂⠔⠀⢐⠖⢀⠥⠀⡀⠐⡠⠄⠐⡐⠀⢠⠀⡀⠒⠘⡘⠒⠀⎥
⎢⠐⠄⠀⠅⠀⠀⠤⠀⢀⠀⠀⠀⠀⡀⡈⠰⠡⠀⠄⡀⡀⠀⠈⠢⠈⡄⠀⠀⠊⠜⠀⠘⠠⠠⠁⠊⠅⠒⠎⢀⎥
⎢⠀⠂⡐⠀⠌⠁⢡⢐⢄⠈⠂⠀⠀⠐⠒⢄⠠⠄⠁⠁⠀⢀⣀⠀⠀⠃⠂⠀⡐⠄⠄⠀⠀⢀⠀⠍⠠⠄⡠⠌⎥
⎢⠀⠐⠐⠁⠀⠰⢨⢀⠂⠀⠀⡠⡂⠐⡒⠄⠀⠆⠘⢘⢂⠀⠀⠈⠈⡀⠆⡘⠌⠃⠀⠂⡄⠀⠈⠀⢂⠐⠁⠀⎥
⎢⠀⠁⠑⠀⢀⡄⠂⠀⠂⢀⡁⡀⠲⢈⡠⠀⢀⠀⠀⠆⠀⢂⢔⠀⠀⠡⠄⠠⠠⢠⠄⠈⢈⠐⠋⡨⡀⡀⠌⠀⎥
⎢⠀⠠⠀⡈⠐⡀⠂⠀⠠⠀⠈⢅⠁⠀⠀⠀⠀⠀⡄⢉⠀⠅⠁⠠⠁⢐⠀⠀⡀⠈⠀⠀⠀⠠⠠⠄⠁⠀⢀⢠⎥
⎢⠐⣠⢂⠊⡠⡲⠘⠀⠀⠁⠐⠄⠍⠁⢄⠀⠠⠀⠀⠄⠐⠀⡀⢋⠀⠠⠄⠡⠀⠁⠐⡀⠀⠀⠈⡩⠀⡀⠠⠐⎥
⎢⠀⠉⠁⠀⠈⢐⠰⠂⠀⢄⠀⠠⠆⡌⢅⠀⣑⠅⠓⠐⡄⢀⠅⡀⠠⠁⡀⠡⠀⠠⠠⡐⠈⡄⠔⠴⠤⡡⡀⠁⎥
⎢⠁⡁⠀⠀⠝⠃⠉⠀⠰⡀⠀⠈⠐⡌⠂⠀⠠⢡⠈⡀⠱⠐⠀⢀⢀⠵⠀⠒⡂⠐⠈⣤⣀⡀⡌⠀⠠⠈⠀⢀⎥
⎢⠂⢀⠰⠀⡀⠁⠈⢉⢑⡂⢀⡈⣀⠂⢁⠀⠊⢀⡀⢀⡀⠄⡂⠈⠀⠁⡀⠁⠂⠀⡄⠀⠀⠀⠐⠄⡜⠈⠀⠀⎥
⎢⠠⡀⠂⠀⢄⠄⠀⠠⡀⢁⠂⢈⢰⠉⠂⡀⠂⠠⠀⠐⡀⠠⠂⡀⡀⢀⢑⢀⠐⠐⠠⡄⠀⠀⠐⠀⢒⠄⢁⠌⎥
⎢⠀⢀⠌⠐⠂⢀⠉⠌⠄⢁⠰⠁⡀⠀⡀⠄⠀⠈⡀⠀⢄⢀⠀⢢⠆⠊⡀⠤⡀⠀⡈⠐⠀⠅⠀⠌⢎⠐⠁⠅⎥
⎢⠊⠀⠀⠀⠰⡰⡨⡀⠀⢀⡄⠀⡀⡠⠰⠀⠀⠀⠀⣒⢆⣁⠂⠈⠀⠐⠐⠂⡈⠀⠃⠂⠠⠠⠌⠠⠨⠂⠢⡀⎥
⎢⠁⢀⡒⠂⠈⢩⢀⠀⢨⡄⡀⢄⠈⡂⠐⠀⡀⠠⠤⡰⢀⠀⠀⠀⠀⡄⠀⠪⠀⠁⠈⠀⡄⠠⠃⠀⠈⢠⡡⠄⎥
⎢⠀⡂⠕⠡⠀⠆⠠⠅⠁⠀⡑⠀⢌⠀⡂⠂⠅⠩⠁⢀⠑⠀⢠⠄⠠⡆⠂⠣⠀⠄⠂⠢⠂⢰⠄⠖⠥⠐⠂⠈⎥
⎢⠐⠀⠀⡄⠀⠌⠠⢄⢀⠌⡀⠎⠐⡁⠐⠁⡀⠣⠁⠂⠃⠁⢴⠃⠀⡁⠉⠊⠀⠒⠁⠀⠀⠨⠈⠀⠠⠒⡄⠀⎥
⎢⡂⠀⠠⠍⠀⠀⠄⡀⠔⢀⠁⠈⢄⣅⢄⡂⢈⠰⠠⠃⡁⠱⠒⢀⡀⠀⠀⠙⠂⠈⠁⠀⠘⠔⠄⠀⠒⡀⠂⣀⎥
⎣⠈⠀⠈⢀⠉⠀⠀⠑⠀⠀⡐⠀⠆⠀⢈⠀⠀⡄⠂⠉⠈⠀⡘⠀⠀⠈⠒⠐⢒⢉⠈⠁⠁⠀⠂⠀⠠⠂⢀⠄⎦

julia> @benchmark getindex($x, i, j) setup=(i=rand(1:1000); j=rand(1:1000)) gctrial=false gcsample=false evals=1_000 samples=10_000
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  2.125 ns … 22.875 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.542 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.616 ns ±  0.469 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

    █▁        ▃                 ▄                             
  ▂▃██▃▁▄▃▂▂▁▇█▃▅▁█▇▂▂▃▂▃▂▂▂▁▂▂▅█▁▃▂▂▂▁▂▂▂▂▂▁▂▂▂▂▁▁▁▁▁▁▁▂▂▂▂ ▃
  2.12 ns        Histogram: frequency by time        4.08 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> nonzeros = findall(x .!= 0);

julia> @benchmark getindex($x, i, j) setup=((i, j)=Tuple(rand(nonzeros))) gctrial=false gcsample=false evals=1_000 samples=10_000
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  20.292 ns …  14.414 μs  ┊ GC (min … max):  0.00% … 99.58%
 Time  (median):     22.375 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):   26.562 ns ± 168.124 ns  ┊ GC (mean ± σ):  12.13% ±  2.60%

         ▂▁ ▁▃▃█                                                
  ▁▂▂▂▃▅▇██▆█████▄▃▂▂▃▂▂▂▂▂▂▃▄▃▄▄▃▄▅▄▅▄▃▃▃▃▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
  20.3 ns         Histogram: frequency by time         29.5 ns <

 Memory estimate: 64 bytes, allocs estimate: 2.

julia> @benchmark getindex($x, i, j) setup=((i, j)=rand((rand(1:1000), rand(1:1000)), Tuple(rand(nonzeros)))) gctrial=false gcsample=false evals=1_000 samples=10_000 seconds=10
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  18.917 ns … 58.875 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     20.375 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   20.537 ns ±  1.161 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

            ▁█▅▆█▃▆▁ ▂                                         
  ▂▁▂▂▂▃▃▄▇▇██████████▆▅▆▄▃▄▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▁▂▁▂▂▂▂▂▁▂▂▁▂▂▂▂▂▂ ▃
  18.9 ns         Histogram: frequency by time        24.7 ns <

 Memory estimate: 64 bytes, allocs estimate: 2.

julia> @less getindex(x, 1, 1)

@RCI @propagate_inbounds function getindex(A::AbstractSparseMatrixCSC{T}, i0::Integer, i1::Integer) where T
    @boundscheck checkbounds(A, i0, i1)
    r1 = Int(@inbounds getcolptr(A)[i1])
    r2 = Int(@inbounds getcolptr(A)[i1+1]-1)
    (r1 > r2) && return zero(T)
    r1 = searchsortedfirst(rowvals(A), i0, r1, r2, Forward)
    ((r1 > r2) || (rowvals(A)[r1] != i0)) ? zero(T) : nonzeros(A)[r1]
end

After

(@v1.11) pkg> activate --temp
  Activating new project at `/var/folders/hc/fn82kz1j5vl8w7lwd4l079y80000gn/T/jl_08EiCA`

(jl_08EiCA) pkg> dev SparseArrays
   Resolving package versions...
    Updating `/private/var/folders/hc/fn82kz1j5vl8w7lwd4l079y80000gn/T/jl_08EiCA/Project.toml`
  [2f01184e] + SparseArrays v1.11.0 `~/.julia/dev/SparseArrays`
    Updating `/private/var/folders/hc/fn82kz1j5vl8w7lwd4l079y80000gn/T/jl_08EiCA/Manifest.toml`
  [0dad84c5] + ArgTools v1.1.1
  [56f22d72] + Artifacts
  [2a0f44e3] + Base64
  [ade2ca70] + Dates
  [f43a241f] + Downloads v1.6.0
  [7b1f6079] + FileWatching
  [b77e0a4c] + InteractiveUtils
  [b27032c2] + LibCURL v0.6.4
  [76f85450] + LibGit2
  [8f399da3] + Libdl
  [37e2e46d] + LinearAlgebra
  [56ddb016] + Logging
  [d6f4376e] + Markdown
  [ca575930] + NetworkOptions v1.2.0
  [44cfe95a] + Pkg v1.11.0
  [de0858da] + Printf
  [3fa0cd96] + REPL
  [9a3f8284] + Random
  [ea8e919c] + SHA v0.7.0
  [9e88b42a] + Serialization
  [6462fe0b] + Sockets
  [2f01184e] + SparseArrays v1.11.0 `~/.julia/dev/SparseArrays`
  [fa267f1f] + TOML v1.0.3
  [a4e569a6] + Tar v1.10.0
  [cf7118a7] + UUIDs
  [4ec0a83e] + Unicode
  [e66e0078] + CompilerSupportLibraries_jll v1.0.5+1
  [deac9b47] + LibCURL_jll v8.0.1+1
  [29816b5a] + LibSSH2_jll v1.11.0+1
  [c8ffd9c3] + MbedTLS_jll v2.28.2+1
  [14a3606d] + MozillaCACerts_jll v2023.1.10
  [4536629a] + OpenBLAS_jll v0.3.23+2
  [bea87d4a] + SuiteSparse_jll v7.2.0+1
  [83775a58] + Zlib_jll v1.2.13+1
  [8e850b90] + libblastrampoline_jll v5.8.0+1
  [8e850ede] + nghttp2_jll v1.52.0+1
  [3f19e933] + p7zip_jll v17.4.0+2

julia> versioninfo()
Julia Version 1.11.0-DEV.471
Commit 153b538ff0* (2023-09-05 21:23 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  CPU: 8 × Apple M2
  WORD_SIZE: 64
  LLVM: libLLVM-15.0.7 (ORCJIT, apple-m1)
  Threads: 1 on 4 virtual cores
Environment:
  JULIA_EDITOR = code

julia> using SparseArrays, BenchmarkTools, Random

julia> Random.seed!(1729)
TaskLocalRNG()

julia> x = sprand(1000, 1000, .001)
1000×1000 SparseMatrixCSC{Float64, Int64} with 1057 stored entries:
⎡⢉⠀⠐⠀⠁⠁⠂⠀⠁⠄⠀⠠⠀⠀⠈⠁⠈⠀⡁⡁⠀⢀⠀⠠⠃⠉⠆⡀⠢⠠⢀⠂⠂⠀⠠⠀⠄⠄⠄⠀⎤
⎢⠰⡗⠈⠥⠁⠄⠠⢀⠡⠂⠀⠀⠄⠄⠤⡅⢀⠂⡄⡀⠀⠂⠁⠔⡀⠘⠀⢈⠀⡰⠌⠀⢰⠀⠒⠂⠀⠒⢂⠀⎥
⎢⢤⡀⠀⠈⠐⠀⠈⠄⠀⠀⠀⡤⡀⠀⠄⡐⠌⢂⠔⠀⢐⠖⢀⠥⠀⡀⠐⡠⠄⠐⡐⠀⢠⠀⡀⠒⠘⡘⠒⠀⎥
⎢⠐⠄⠀⠅⠀⠀⠤⠀⢀⠀⠀⠀⠀⡀⡈⠰⠡⠀⠄⡀⡀⠀⠈⠢⠈⡄⠀⠀⠊⠜⠀⠘⠠⠠⠁⠊⠅⠒⠎⢀⎥
⎢⠀⠂⡐⠀⠌⠁⢡⢐⢄⠈⠂⠀⠀⠐⠒⢄⠠⠄⠁⠁⠀⢀⣀⠀⠀⠃⠂⠀⡐⠄⠄⠀⠀⢀⠀⠍⠠⠄⡠⠌⎥
⎢⠀⠐⠐⠁⠀⠰⢨⢀⠂⠀⠀⡠⡂⠐⡒⠄⠀⠆⠘⢘⢂⠀⠀⠈⠈⡀⠆⡘⠌⠃⠀⠂⡄⠀⠈⠀⢂⠐⠁⠀⎥
⎢⠀⠁⠑⠀⢀⡄⠂⠀⠂⢀⡁⡀⠲⢈⡠⠀⢀⠀⠀⠆⠀⢂⢔⠀⠀⠡⠄⠠⠠⢠⠄⠈⢈⠐⠋⡨⡀⡀⠌⠀⎥
⎢⠀⠠⠀⡈⠐⡀⠂⠀⠠⠀⠈⢅⠁⠀⠀⠀⠀⠀⡄⢉⠀⠅⠁⠠⠁⢐⠀⠀⡀⠈⠀⠀⠀⠠⠠⠄⠁⠀⢀⢠⎥
⎢⠐⣠⢂⠊⡠⡲⠘⠀⠀⠁⠐⠄⠍⠁⢄⠀⠠⠀⠀⠄⠐⠀⡀⢋⠀⠠⠄⠡⠀⠁⠐⡀⠀⠀⠈⡩⠀⡀⠠⠐⎥
⎢⠀⠉⠁⠀⠈⢐⠰⠂⠀⢄⠀⠠⠆⡌⢅⠀⣑⠅⠓⠐⡄⢀⠅⡀⠠⠁⡀⠡⠀⠠⠠⡐⠈⡄⠔⠴⠤⡡⡀⠁⎥
⎢⠁⡁⠀⠀⠝⠃⠉⠀⠰⡀⠀⠈⠐⡌⠂⠀⠠⢡⠈⡀⠱⠐⠀⢀⢀⠵⠀⠒⡂⠐⠈⣤⣀⡀⡌⠀⠠⠈⠀⢀⎥
⎢⠂⢀⠰⠀⡀⠁⠈⢉⢑⡂⢀⡈⣀⠂⢁⠀⠊⢀⡀⢀⡀⠄⡂⠈⠀⠁⡀⠁⠂⠀⡄⠀⠀⠀⠐⠄⡜⠈⠀⠀⎥
⎢⠠⡀⠂⠀⢄⠄⠀⠠⡀⢁⠂⢈⢰⠉⠂⡀⠂⠠⠀⠐⡀⠠⠂⡀⡀⢀⢑⢀⠐⠐⠠⡄⠀⠀⠐⠀⢒⠄⢁⠌⎥
⎢⠀⢀⠌⠐⠂⢀⠉⠌⠄⢁⠰⠁⡀⠀⡀⠄⠀⠈⡀⠀⢄⢀⠀⢢⠆⠊⡀⠤⡀⠀⡈⠐⠀⠅⠀⠌⢎⠐⠁⠅⎥
⎢⠊⠀⠀⠀⠰⡰⡨⡀⠀⢀⡄⠀⡀⡠⠰⠀⠀⠀⠀⣒⢆⣁⠂⠈⠀⠐⠐⠂⡈⠀⠃⠂⠠⠠⠌⠠⠨⠂⠢⡀⎥
⎢⠁⢀⡒⠂⠈⢩⢀⠀⢨⡄⡀⢄⠈⡂⠐⠀⡀⠠⠤⡰⢀⠀⠀⠀⠀⡄⠀⠪⠀⠁⠈⠀⡄⠠⠃⠀⠈⢠⡡⠄⎥
⎢⠀⡂⠕⠡⠀⠆⠠⠅⠁⠀⡑⠀⢌⠀⡂⠂⠅⠩⠁⢀⠑⠀⢠⠄⠠⡆⠂⠣⠀⠄⠂⠢⠂⢰⠄⠖⠥⠐⠂⠈⎥
⎢⠐⠀⠀⡄⠀⠌⠠⢄⢀⠌⡀⠎⠐⡁⠐⠁⡀⠣⠁⠂⠃⠁⢴⠃⠀⡁⠉⠊⠀⠒⠁⠀⠀⠨⠈⠀⠠⠒⡄⠀⎥
⎢⡂⠀⠠⠍⠀⠀⠄⡀⠔⢀⠁⠈⢄⣅⢄⡂⢈⠰⠠⠃⡁⠱⠒⢀⡀⠀⠀⠙⠂⠈⠁⠀⠘⠔⠄⠀⠒⡀⠂⣀⎥
⎣⠈⠀⠈⢀⠉⠀⠀⠑⠀⠀⡐⠀⠆⠀⢈⠀⠀⡄⠂⠉⠈⠀⡘⠀⠀⠈⠒⠐⢒⢉⠈⠁⠁⠀⠂⠀⠠⠂⢀⠄⎦

julia> @benchmark getindex($x, i, j) setup=(i=rand(1:1000); j=rand(1:1000)) gctrial=false gcsample=false evals=1_000 samples=10_000
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  2.083 ns … 26.459 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.875 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.870 ns ±  0.813 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

     █                 ▁                 ▁                    
  ▂▂▄██▄▁▂▂▂▂▁▁▁▂▂▂▂▂▁▃█▅▄▃▁▂▂▂▅▆▄▁▄▃▂▂▃▁█▄▄▄▃▃▁▂▂▂▁▂▁▂▂▁▂▂▂ ▃
  2.08 ns        Histogram: frequency by time        4.12 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> nonzeros = findall(x .!= 0);

julia> @benchmark getindex($x, i, j) setup=((i, j)=Tuple(rand(nonzeros))) gctrial=false gcsample=false evals=1_000 samples=10_000
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  16.000 ns …  22.415 μs  ┊ GC (min … max):  0.00% … 99.71%
 Time  (median):     19.959 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):   24.752 ns ± 252.301 ns  ┊ GC (mean ± σ):  16.53% ±  2.21%

    ▂█▇▃       ▁▁                                               
  ▂▃████▇▅▅▅▅▅▇████▇▆▅▅▅▅▅▄▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▁▁▂▂▂▂▂▂▂ ▃
  16 ns           Histogram: frequency by time         36.5 ns <

 Memory estimate: 64 bytes, allocs estimate: 2.

julia> @benchmark getindex($x, i, j) setup=((i, j)=rand((rand(1:1000), rand(1:1000)), Tuple(rand(nonzeros)))) gctrial=false gcsample=false evals=1_000 samples=10_000 seconds=10
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  14.125 ns … 99.250 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     16.875 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   17.289 ns ±  3.290 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

       ▁▂ ▁▁▃▂▆█▇▃▄▃                                           
  ▁▁▂▄███████████████▆▆▆▅▃▃▃▂▂▂▂▂▂▂▂▂▁▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
  14.1 ns         Histogram: frequency by time        25.8 ns <

 Memory estimate: 64 bytes, allocs estimate: 2.

julia> @less getindex(x, 1, 1)

@RCI @propagate_inbounds function getindex(A::AbstractSparseMatrixCSC{T}, i0::Integer, i1::Integer) where T
    @boundscheck checkbounds(A, i0, i1)
    r1 = Int(@inbounds getcolptr(A)[i1])
    r2 = Int(@inbounds getcolptr(A)[i1+1]-1)
    (r1 > r2) && return zero(T)
    r1 = searchsortedfirst(view(rowvals(A), r1:r2), i0) + r1 - 1
    ((r1 > r2) || (rowvals(A)[r1] != i0)) ? zero(T) : nonzeros(A)[r1]
end
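
For readers who want to try the view-based lookup outside the package, the following is a rough standalone sketch of the same pattern. It is not the package's implementation: `csc_lookup` is a hypothetical name, and it uses the exported accessors `nzrange`, `rowvals`, and `nonzeros` in place of the internal `getcolptr`.

```julia
using SparseArrays

# Hypothetical helper: look up A[i, j] by binary search over column j's row indices.
function csc_lookup(A::SparseMatrixCSC{T}, i::Integer, j::Integer) where {T}
    checkbounds(A, i, j)
    r = nzrange(A, j)                  # positions of column j's stored entries
    isempty(r) && return zero(T)       # empty column: every entry is zero
    rows = rowvals(A)
    # Search only this column's slice, then shift back to a position in `rows`.
    k = searchsortedfirst(view(rows, r), i) + first(r) - 1
    return (k > last(r) || rows[k] != i) ? zero(T) : nonzeros(A)[k]
end

# Small example matrix; column 2 has no stored entries.
A = sparse([1, 3, 2, 4], [1, 1, 3, 4], [10.0, 30.0, 20.0, 40.0], 4, 4)

# The sketch should agree with the built-in getindex at every position.
println(all(csc_lookup(A, i, j) == A[i, j] for i in 1:4, j in 1:4))
```

The check `k > last(r)` covers the case where the row index is larger than every stored row in the column, so `rows[k]` is never read out of the column's range.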
