make simd loop over ranges perform better #28166

Merged 1 commit on Jul 18, 2018

KristofferC (Member) commented Jul 18, 2018

Fixes #27773 (and likely the sumlinear_view benchmarks in #27030)

julia> function perf_sumlinear_view(A)
           s = zero(eltype(A))
           @inbounds @simd for I in 1:length(A)
               val = view(A, I)
               s += val[]
           end
           return s
       end
perf_sumlinear_view (generic function with 1 method)

julia> A = 1:1000000
1:1000000

julia> using BenchmarkTools

julia> @btime perf_sumlinear_view(A)
  110.620 μs (1 allocation: 16 bytes)
500000500000

julia> Base.firstindex(::UnitRange) = 1

julia> @btime perf_sumlinear_view(A)
  28.288 ns (1 allocation: 16 bytes)
500000500000

Apparently firstindex(::UnitRange) is currently too complex for the compiler to see through. Arguably it would be better to figure out why the call cannot be folded, but the chain of calls is quite deep...
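
One way to check whether a call like this folds on a given Julia version is to look at the optimized typed IR. The following is only a minimal sketch, not part of the PR: it assumes the InteractiveUtils standard library is available, the helper name firstindex_ir is hypothetical, and the printed IR will differ between Julia versions. The loop is the one from the description above, used here only as a correctness check.

```julia
using InteractiveUtils  # provides code_typed outside the REPL

# Hypothetical helper (not from the PR): optimized typed IR for
# firstindex called on a UnitRange{Int}.
firstindex_ir() = code_typed(firstindex, Tuple{UnitRange{Int}}; optimize = true)

# Same loop as in the PR description.
function perf_sumlinear_view(A)
    s = zero(eltype(A))
    @inbounds @simd for I in 1:length(A)
        val = view(A, I)
        s += val[]
    end
    return s
end

ir = firstindex_ir()
# A short body that simply returns a constant suggests the call folds;
# a long body with many calls suggests it does not.
println(first(ir))

# The loop should agree with the built-in sum regardless of how fast it runs.
@assert perf_sumlinear_view(1:1000) == sum(1:1000)
```

If the printed IR is long, that is consistent with the "deep chain of calls" mentioned above, although the IR alone does not show whether LLVM later vectorizes the loop.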

The regression was introduced in #27038 (which had no Nanosoldier run; perhaps Nanosoldier was nonfunctional at the time).

KristofferC added the labels performance (Must go faster) and compiler:simd (instruction-level vectorization) on Jul 18, 2018
KristofferC requested a review from mbauman on July 18, 2018 13:43
KristofferC force-pushed the kc/fix_simd_unitrange branch from 07995a7 to 06e7ad4 on July 18, 2018 13:59
KristofferC (Member, Author) commented:

@nanosoldier runbenchmarks(ALL, vs = ":master")

nanosoldier (Collaborator) commented:

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

KristofferC (Member, Author) commented:

Some bonus SIMD improvements in there too. Unexpected, but I'll take it.

KristofferC merged commit 6c47c24 into master on Jul 18, 2018
KristofferC deleted the kc/fix_simd_unitrange branch on July 18, 2018 21:17
LilithHafner added a commit to LilithHafner/julia that referenced this pull request May 8, 2022