
Add generic fallback for Base.LinAlg.BLAS: scal!, blascopy! ... #351

Closed
lopezm94 opened this issue Jul 30, 2016 · 9 comments

Comments

@lopezm94

In cases where one wants to use BLAS functions to reduce memory allocation, it is annoying that they are only defined for Arrays of BlasFloat types. Adding custom fallback operators inside packages is becoming increasingly common for me; see IterativeSolvers.jl#79 and InplaceOps#9.

A change has already been made for axpy! (see #5189), but I think it is now time for the rest of the functions. Is it OK for me to open a PR about this?

@lopezm94 lopezm94 changed the title Add generic fallback for Blas.LinAlg: scal!, blascopy! ... Add generic fallback for Blas.LinAlg.BLAS: scal!, blascopy! ... Jul 30, 2016
@lopezm94 lopezm94 changed the title Add generic fallback for Blas.LinAlg.BLAS: scal!, blascopy! ... Add generic fallback for Base.LinAlg.BLAS: scal!, blascopy! ... Jul 30, 2016
@andreasnoack
Member

I think we have most of these already, just under different names: e.g. scale! and copy! should cover the functionality of BLAS.scal! and blascopy!, and they have generic fallbacks. We couldn't come up with a new name for axpy!, so in that case we use the BLAS name.
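For anyone landing here later, a minimal sketch of the generic equivalents (assuming the Julia 0.5-era Base API, where these functions live in Base and Base.LinAlg):

```julia
A = rand(100); B = similar(A)

copy!(B, A)                   # generic; covers BLAS.blascopy!(length(A), A, 1, B, 1)
scale!(A, 2.0)                # generic fallback; covers BLAS.scal!(length(A), 2.0, A, 1)
Base.LinAlg.axpy!(2.0, A, B)  # keeps the BLAS name: B += 2.0 * A, in place
```

For Arrays of BlasFloat element types these hit optimized paths (BLAS or a memmove-style copy); for other element types they fall back to generic loops.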

@lopezm94
Author

lopezm94 commented Jul 30, 2016

Thank you, I should have searched better. The generic method is even faster than the BLAS call:

julia> @benchmark Base.LinAlg.BLAS.blascopy!(length(A),A,1,B,1)
BenchmarkTools.Trial: 
  samples:          10000
  evals/sample:     907
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  0.00 bytes
  allocs estimate:  0
  minimum time:     126.00 ns (0.00% GC)
  median time:      128.00 ns (0.00% GC)
  mean time:        133.54 ns (0.00% GC)
  maximum time:     291.00 ns (0.00% GC)

julia> @benchmark copy!(A,B)
BenchmarkTools.Trial: 
  samples:          10000
  evals/sample:     973
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  0.00 bytes
  allocs estimate:  0
  minimum time:     73.00 ns (0.00% GC)
  median time:      74.00 ns (0.00% GC)
  mean time:        78.62 ns (0.00% GC)
  maximum time:     458.00 ns (0.00% GC)

I had assumed BLAS was always the fastest option.

@lopezm94
Author

lopezm94 commented Aug 1, 2016

Hello, I think this might need reopening. copy! is fine, but scale! doesn't seem to dispatch to BLAS.scal! when possible:

julia> @benchmark Base.LinAlg.BLAS.scal!(length(A),1.0,A,1)
BenchmarkTools.Trial: 
  samples:          10000
  evals/sample:     990
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  16.00 bytes
  allocs estimate:  1
  minimum time:     42.00 ns (0.00% GC)
  median time:      44.00 ns (0.00% GC)
  mean time:        45.65 ns (1.66% GC)
  maximum time:     1.78 μs (96.25% GC)

julia> @benchmark scale!(1.0,A)
BenchmarkTools.Trial: 
  samples:          6602
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  0.00 bytes
  allocs estimate:  0
  minimum time:     623.05 μs (0.00% GC)
  median time:      752.92 μs (0.00% GC)
  mean time:        753.82 μs (0.00% GC)
  maximum time:     1.04 ms (0.00% GC)

@KristofferC
Member

Scaling with 1 takes the fast path in BLAS.

@andreasnoack
Member

Yes. This is pretty stupid and needs some cleanup.

julia> @benchmark scale!(A,1.0)
BenchmarkTools.Trial:
  samples:          10000
  evals/sample:     997
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  0.00 bytes
  allocs estimate:  0
  minimum time:     19.00 ns (0.00% GC)
  median time:      19.00 ns (0.00% GC)
  mean time:        19.35 ns (0.00% GC)
  maximum time:     72.00 ns (0.00% GC)

julia> @benchmark scale!(1.0,A)
BenchmarkTools.Trial:
  samples:          10000
  evals/sample:     9
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  0.00 bytes
  allocs estimate:  0
  minimum time:     2.50 μs (0.00% GC)
  median time:      2.53 μs (0.00% GC)
  mean time:        2.55 μs (0.00% GC)
  maximum time:     10.84 μs (0.00% GC)

@KristofferC
Member

Oh. That is interesting.

@lopezm94 lopezm94 reopened this Aug 1, 2016
@andreasnoack
Member

@lopezm94 You probably know this, but what @KristofferC meant is that BLAS checks whether the scaling factor is one and, if so, returns immediately. For all other values (i.e. all real use cases), Julia is much closer to BLAS speed and should even be faster for small arrays. In any case, we should make sure that either none or both scale! methods call BLAS when beneficial. If we want to track that here, we should adjust the issue title.
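A fairer comparison would use a non-unit scalar, so BLAS cannot take its early-return path; a sketch (array size and scalar chosen arbitrarily here, using BenchmarkTools `$` interpolation to avoid benchmarking global-variable access):

```julia
using BenchmarkTools

A = rand(10^5)
# With s = 1.1 there is no trivial fast path, so this measures the real kernels.
# (Repeated scaling grows the values across samples, which is harmless for timing.)
@benchmark Base.LinAlg.BLAS.scal!(length($A), 1.1, $A, 1)
@benchmark scale!($A, 1.1)
```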

@KristofferC
Member

diff --git a/base/linalg/dense.jl b/base/linalg/dense.jl
index b0fbe81..a4324e8 100644
--- a/base/linalg/dense.jl
+++ b/base/linalg/dense.jl
@@ -10,6 +10,8 @@ const ASUM_CUTOFF = 32
 const NRM2_CUTOFF = 32

 function scale!{T<:BlasFloat}(X::Array{T}, s::T)
+    s == 0 && return fill!(X, zero(T))
+    s == 1 && return X
     if length(X) < SCAL_CUTOFF
         generic_scale!(X, s)
     else
@@ -18,6 +20,8 @@ function scale!{T<:BlasFloat}(X::Array{T}, s::T)
     X
 end

+scale!{T<:BlasFloat}(s::T, X::Array{T}) = scale!(X, s)
+
 scale!{T<:BlasFloat}(X::Array{T}, s::Number) = scale!(X, convert(T, s))
 function scale!{T<:BlasComplex}(X::Array{T}, s::Real)
     R = typeof(real(zero(T)))

maybe

@andreasnoack
Member

Seems reasonable.
