Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recursive implementation of searchsortedfirst and searchsortedlast #155

Merged
merged 4 commits into from
Jan 17, 2021

Conversation

jipolanco
Copy link
Contributor

Similarly to #133, this PR reimplements searchsortedfirst and searchsortedlast for concatenated arrays recursively.

As shown below, the changes can result in important performance gains from both functions, especially when the concatenated arrays are composed of different array types.

The example below was run on Julia 1.5.4, but results are similar on 1.6-beta1.

using BenchmarkTools
using LazyArrays
using StaticArrays

A = Vcat(
    collect(1:10),
    11:2:20,
    SVector{4}(22:25),
)

@assert issorted(A)

Before this PR:

julia> @btime searchsortedfirst($A, 4);
  11.841 ns (0 allocations: 0 bytes)

julia> @btime searchsortedfirst($A, 23);
  112.249 ns (4 allocations: 240 bytes)

julia> @btime searchsortedlast($A, 4);
  92.327 ns (4 allocations: 240 bytes)

julia> @btime searchsortedlast($A, 23);
  241.237 ns (4 allocations: 224 bytes)

After this PR:

julia> @btime searchsortedfirst($A, 4);
  8.244 ns (0 allocations: 0 bytes)

julia> @btime searchsortedfirst($A, 23);
  30.821 ns (0 allocations: 0 bytes)

julia> @btime searchsortedlast($A, 4);
  32.848 ns (0 allocations: 0 bytes)

julia> @btime searchsortedlast($A, 23);
  12.541 ns (0 allocations: 0 bytes)

This PR also fixes an issue in which the return type of searchsortedlast was not inferred, which explains in part its poor performance above. This is before the proposed changes:

julia> @code_warntype searchsortedlast(A, 23);
Variables
  #self#::Core.Compiler.Const(searchsortedlast, false)
  f::ApplyArray{Int64,1,typeof(vcat),Tuple{Array{Int64,1},StepRange{Int64,Int64},SArray{Tuple{4},Int64,1,4}}}
  x::Int64
  args::Tuple{Array{Int64,1},StepRange{Int64,Int64},SArray{Tuple{4},Int64,1,4}}
  @_5::Union{Nothing, Tuple{Int64,Int64}}
  k::Int64
  r::Int64

Body::Any
1 ─       (args = LazyArrays.arguments(f))
│   %2  = LazyArrays.length(args)::Core.Compiler.Const(3, false)
│   %3  = (%2:-1:2)::Core.Compiler.Const(3:-1:2, false)
│         (@_5 = Base.iterate(%3))
│   %5  = (@_5::Core.Compiler.Const((3, 3), false) === nothing)::Core.Compiler.Const(false, false)
│   %6  = Base.not_int(%5)::Core.Compiler.Const(true, false)
└──       goto #6 if not %6
2%8  = @_5::Tuple{Int64,Int64}::Tuple{Int64,Int64}
│         (k = Core.getfield(%8, 1))
│   %10 = Core.getfield(%8, 2)::Int64%11 = Base.getindex(args, k)::Union{StepRange{Int64,Int64}, SArray{Tuple{4},Int64,1,4}, Array{Int64,1}}
│         (r = LazyArrays._searchsortedlast(%11, x))
│   %13 = (r > 0)::Bool
└──       goto #4 if not %13
3%15 = args::Tuple{Array{Int64,1},StepRange{Int64,Int64},SArray{Tuple{4},Int64,1,4}}%16 = (k - 1)::Int64%17 = (1:%16)::Core.Compiler.PartialStruct(UnitRange{Int64}, Any[Core.Compiler.Const(1, false), Int64])
│   %18 = Base.getindex(%15, %17)::Tuple%19 = LazyArrays.mapreduce(LazyArrays.length, LazyArrays.:+, %18)::Any%20 = (%19 + r)::Any
└──       return %20
4 ─       (@_5 = Base.iterate(%3, %10))
│   %23 = (@_5 === nothing)::Bool%24 = Base.not_int(%23)::Bool
└──       goto #6 if not %24
5 ─       goto #2
6%27 = Base.getindex(args, 1)::Array{Int64,1}%28 = LazyArrays._searchsortedlast(%27, x)::Int64
└──       return %28

@dlfivefifty
Copy link
Member

Fantastic, please bump the patch version so I can merge and tag

@jipolanco
Copy link
Contributor Author

Done, thanks for the quick answer!

@codecov
Copy link

codecov bot commented Jan 16, 2021

Codecov Report

Merging #155 (6fc41e4) into master (200c147) will decrease coverage by 0.05%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #155      +/-   ##
==========================================
- Coverage   88.56%   88.50%   -0.06%     
==========================================
  Files          12       12              
  Lines        1670     1670              
==========================================
- Hits         1479     1478       -1     
- Misses        191      192       +1     
Impacted Files Coverage Δ
src/lazyconcat.jl 94.06% <100.00%> (ø)
src/lazyoperations.jl 88.60% <0.00%> (-0.43%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 200c147...6fc41e4. Read the comment docs.

@dlfivefifty
Copy link
Member

Can you add a test using @inferred? And get the code coverage tests passing

@jipolanco
Copy link
Contributor Author

I added the inference tests.

There's not much that I can do for the coverage tests, since the coverage regression concerns a single line in a file that I didn't touch (somewhere in lazyoperations.jl, in the shuffle_algorithm function), which seems to be a bit random.

@dlfivefifty dlfivefifty merged commit 731ac76 into JuliaArrays:master Jan 17, 2021
@jipolanco jipolanco deleted the searchsorted branch January 17, 2021 15:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants