Vectorized map and broadcast #3
A few comments:
Demo of the issue:
julia> using SIMDPirates, BenchmarkTools
julia> x = ntuple(Val(8)) do i; Core.VecElement(randn()) end
(VecElement{Float64}(-1.0759056141831105), VecElement{Float64}(-1.1579962137902238), VecElement{Float64}(2.560294914431641), VecElement{Float64}(-1.1957407117264527), VecElement{Float64}(-2.132117101923461), VecElement{Float64}(-0.5877224346584126), VecElement{Float64}(0.8586640177057222), VecElement{Float64}(-0.7302450769652871))
julia> sx = SVec(x)
SVec{8,Float64}<-1.0759056141831105, -1.1579962137902238, 2.560294914431641, -1.1957407117264527, -2.132117101923461, -0.5877224346584126, 0.8586640177057222, -0.7302450769652871>
julia> exp(sx)
SVec{8,Float64}<0.3409888111263572, 0.3141149699378415, 12.939632837353068, 0.3024798208329842, 0.11858596932271068, 0.5555912401702848, 2.3600056608637496, 0.4817908997685664>
julia> @btime sum(exp($sx))
7.194 ns (0 allocations: 0 bytes)
17.413190209375564
julia> @btime exp($sx)
julia: /home/chriselrod/Documents/languages/julia/src/cgutils.cpp:514: unsigned int convert_struct_offset(llvm::Type*, unsigned int): Assertion `SL->getElementOffset(idx) == byte_offset' failed.
signal (6): Aborted
in expression starting at REPL[6]:1
raise at /usr/src/debug/glibc-2.30-298.x86_64/signal/../sysdeps/unix/sysv/linux/internal-signals.h:84
abort at /usr/src/debug/glibc-2.30-298.x86_64/stdlib/abort.c:79
__assert_fail_base at /usr/src/debug/glibc-2.30-298.x86_64/assert/assert.c:92
__assert_fail at /usr/src/debug/glibc-2.30-298.x86_64/assert/assert.c:101
convert_struct_offset at /home/chriselrod/Documents/languages/julia/src/cgutils.cpp:514 [inlined]
convert_struct_offset at /home/chriselrod/Documents/languages/julia/src/cgutils.cpp:509
convert_struct_offset at /home/chriselrod/Documents/languages/julia/src/cgutils.cpp:520 [inlined]
emit_new_struct at /home/chriselrod/Documents/languages/julia/src/cgutils.cpp:2619
emit_builtin_call at /home/chriselrod/Documents/languages/julia/src/codegen.cpp:2596
emit_call at /home/chriselrod/Documents/languages/julia/src/codegen.cpp:3331
emit_expr at /home/chriselrod/Documents/languages/julia/src/codegen.cpp:4103
emit_ssaval_assign at /home/chriselrod/Documents/languages/julia/src/codegen.cpp:3807
emit_stmtpos at /home/chriselrod/Documents/languages/julia/src/codegen.cpp:4000 [inlined]
emit_function at /home/chriselrod/Documents/languages/julia/src/codegen.cpp:6578
jl_compile_linfo at /home/chriselrod/Documents/languages/julia/src/codegen.cpp:1230
emit_invoke at /home/chriselrod/Documents/languages/julia/src/codegen.cpp:3277
emit_expr at /home/chriselrod/Documents/languages/julia/src/codegen.cpp:4094
emit_ssaval_assign at /home/chriselrod/Documents/languages/julia/src/codegen.cpp:3807
emit_stmtpos at /home/chriselrod/Documents/languages/julia/src/codegen.cpp:4000 [inlined]
emit_function at /home/chriselrod/Documents/languages/julia/src/codegen.cpp:6578
jl_compile_linfo at /home/chriselrod/Documents/languages/julia/src/codegen.cpp:1230
emit_invoke at /home/chriselrod/Documents/languages/julia/src/codegen.cpp:3277
emit_expr at /home/chriselrod/Documents/languages/julia/src/codegen.cpp:4094
emit_ssaval_assign at /home/chriselrod/Documents/languages/julia/src/codegen.cpp:3807
emit_stmtpos at /home/chriselrod/Documents/languages/julia/src/codegen.cpp:4000 [inlined]
emit_function at /home/chriselrod/Documents/languages/julia/src/codegen.cpp:6578
jl_compile_linfo at /home/chriselrod/Documents/languages/julia/src/codegen.cpp:1230
jl_compile_method_internal at /home/chriselrod/Documents/languages/julia/src/gf.c:1889
_jl_invoke at /home/chriselrod/Documents/languages/julia/src/gf.c:2153 [inlined]
jl_apply_generic at /home/chriselrod/Documents/languages/julia/src/gf.c:2322
jl_apply at /home/chriselrod/Documents/languages/julia/src/julia.h:1654 [inlined]
do_apply at /home/chriselrod/Documents/languages/julia/src/builtins.c:634
jl_f__apply at /home/chriselrod/Documents/languages/julia/src/builtins.c:648 [inlined]
jl_f__apply_latest at /home/chriselrod/Documents/languages/julia/src/builtins.c:684
#invokelatest#1 at ./essentials.jl:715 [inlined]
invokelatest##kw at ./essentials.jl:710 [inlined]
#run_result#37 at /home/chriselrod/.julia/packages/BenchmarkTools/7aqwe/src/execution.jl:32 [inlined]
run_result##kw at /home/chriselrod/.julia/packages/BenchmarkTools/7aqwe/src/execution.jl:32 [inlined]
#run#39 at /home/chriselrod/.julia/packages/BenchmarkTools/7aqwe/src/execution.jl:46
run##kw at /home/chriselrod/.julia/packages/BenchmarkTools/7aqwe/src/execution.jl:46 [inlined]
run##kw at /home/chriselrod/.julia/packages/BenchmarkTools/7aqwe/src/execution.jl:46 [inlined]
#warmup#42 at /home/chriselrod/.julia/packages/BenchmarkTools/7aqwe/src/execution.jl:79 [inlined]
warmup at /home/chriselrod/.julia/packages/BenchmarkTools/7aqwe/src/execution.jl:79
jl_apply at /home/chriselrod/Documents/languages/julia/src/julia.h:1654 [inlined]
do_call at /home/chriselrod/Documents/languages/julia/src/interpreter.c:328
eval_value at /home/chriselrod/Documents/languages/julia/src/interpreter.c:417
eval_stmt_value at /home/chriselrod/Documents/languages/julia/src/interpreter.c:368 [inlined]
eval_body at /home/chriselrod/Documents/languages/julia/src/interpreter.c:760
jl_interpret_toplevel_thunk_callback at /home/chriselrod/Documents/languages/julia/src/interpreter.c:888
Lenter_interpreter_frame_start_val at /home/chriselrod/Documents/languages/julia/usr/bin/../lib/libjulia.so.1 (unknown line)
jl_interpret_toplevel_thunk at /home/chriselrod/Documents/languages/julia/src/interpreter.c:897
jl_toplevel_eval_flex at /home/chriselrod/Documents/languages/julia/src/toplevel.c:814
jl_toplevel_eval_flex at /home/chriselrod/Documents/languages/julia/src/toplevel.c:764
jl_toplevel_eval at /home/chriselrod/Documents/languages/julia/src/toplevel.c:823 [inlined]
jl_toplevel_eval_in at /home/chriselrod/Documents/languages/julia/src/toplevel.c:843
eval at ./boot.jl:331
eval_user_input at /home/chriselrod/Documents/languages/julia/usr/share/julia/stdlib/v1.4/REPL/src/REPL.jl:86
run_backend at /home/chriselrod/.julia/packages/Revise/0KQ7U/src/Revise.jl:1033
#85 at ./task.jl:349
jl_apply at /home/chriselrod/Documents/languages/julia/src/julia.h:1654 [inlined]
start_task at /home/chriselrod/Documents/languages/julia/src/task.c:687
unknown function (ip: (nil))
Allocations: 32354569 (Pool: 32346408; Big: 8161); GC: 42
fish: “/home/chriselrod/Documents/lang…” terminated by signal SIGABRT (Abort)
Returning an
x = rand(4);
y = rand(3);
A = rand(4,3);
B = @vectorize log.(x) .* sin.(A) .+ exp.(y')
we would want to be able to create our own version of a
R, C = size(A)
Rrep, Rrem = divrem(R, W) # W is the vector width
@inbounds for r in 0:Rrep-1
    # log.(x) depends only on the row, so it is hoisted out of the column loop
    vlogx = log(vload(Vec{W,eltype(x)}, pointer(x) + r*W*sizeof(eltype(x))))
    for c in 0:C-1
        # exp(y[c+1]) is a single scalar per column, broadcast across the vector
        vexpy = vbroadcast(Vec{W,eltype(y)}, exp(y[c+1]))
        vsinA = sin(vload(Vec{W,eltype(A)}, pointer(A) + (r*W + c*stride(A,2))*sizeof(eltype(A))))
        vB = vmuladd(vlogx, vsinA, vexpy)
        vstore!(pointer(B) + (r*W + c*stride(A,2))*sizeof(eltype(B)), vB)
    end
end
(plus similar code to handle the remaining rows).
I have been working on code to try and do this inside the graphs branch. If input arrays are all "dense", then it can consider fusing loops. That is, if we're calculating
To actually do a reasonable job modeling computational cost, the functions being evaluated cannot be opaque.
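For readers without the SIMD packages installed, the loop structure sketched above (full chunks of W rows, then a remainder loop) can be tried in plain Julia. This is only a rough sketch under stated assumptions: the name `chunked_broadcast!`, the choice W = 4, and the 6-row inputs (picked so the remainder loop runs) are hypothetical and not part of SIMDPirates or LoopVectorization. Scalar indexing stands in for `vload`/`vstore!`.

```julia
# Plain-Julia sketch of the chunked broadcast B = log.(x) .* sin.(A) .+ exp.(y').
# `chunked_broadcast!` is a hypothetical name; W is an assumed vector width.
function chunked_broadcast!(B, x, A, y, ::Val{W}) where {W}
    R, C = size(A)
    Rrep, Rrem = divrem(R, W)
    @inbounds for r in 0:Rrep-1
        rows = (r*W + 1):(r*W + W)
        # log.(x) depends only on the row, so compute it once per chunk
        logx = ntuple(w -> log(x[rows[w]]), Val(W))
        for c in 1:C
            expy = exp(y[c])  # one scalar per column, shared by the whole chunk
            for w in 1:W
                B[rows[w], c] = muladd(logx[w], sin(A[rows[w], c]), expy)
            end
        end
    end
    # Remainder rows that do not fill a whole chunk
    @inbounds for i in (R - Rrem + 1):R, c in 1:C
        B[i, c] = muladd(log(x[i]), sin(A[i, c]), exp(y[c]))
    end
    return B
end

x = rand(6) .+ 0.5   # offset keeps log(x) finite
y = rand(3)
A = rand(6, 3)
B = chunked_broadcast!(similar(A), x, A, y, Val(4))
```

Comparing `B` against `log.(x) .* sin.(A) .+ exp.(y')` with `≈` is a reasonable sanity check; exact equality is not expected because `muladd` may fuse the multiply and add.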
EDIT: @vectorize log.(x) .* sin.(A) .+ exp.(y') so that we aren't reevaluating
@vectorize for m in 1:M, n in 1:N, k in 1:K
    C[m,n] += A[m,k] * B[k,n]
end
into an optimized
Ideally, it should also be able to consider splitting loops up into multiple loops (rather than just letting you fuse them), but all of this will take quite some time.
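As a baseline for the matrix-multiply loop nest mentioned above, a plain triple loop can be written and checked against the built-in `*`. This is a minimal sketch, not the macro's output: `matmul_ref!` is a hypothetical name, and the loop order (n, k, m) was chosen here by hand for column-major access rather than by any cost model.

```julia
# Reference implementation of C[m,n] += A[m,k] * B[k,n].
# `matmul_ref!` is a hypothetical helper, not part of any package.
function matmul_ref!(C, A, B)
    M, K = size(A)
    N = size(B, 2)
    @assert size(B, 1) == K && size(C) == (M, N)
    # m is innermost so that C[:, n] and A[:, k] are traversed contiguously
    @inbounds for n in 1:N, k in 1:K, m in 1:M
        C[m, n] += A[m, k] * B[k, n]
    end
    return C
end

A = rand(5, 4)
B = rand(4, 3)
C = matmul_ref!(zeros(5, 3), A, B)
```

Reordering the three loops does not change the result beyond floating-point rounding, which is why a tool is free to pick whichever order it estimates to be cheapest.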
It does now support vectorized broadcast. I'll try and add a
Awesome developments! I'll be looking forward to experimenting with this in MonteCarloMeasurements :D
This is just me dreaming about the future, but wouldn't it be neat if one could write stuff like
? 😃