Allow chunksize choice in autodiff setup? #84
I'm not up to speed on the latest best practices on the use of the autodiff packages, but I think I recall seeing that chunksize is or will soon be deprecated. Is that possible, or am I remembering wrong? cc @jrevels
Yeah, that keyword argument is deprecated - chunk size is now specified by a type parameter of the `*Config` types.

Keyword argument performance is unfortunately quite a pain point for Julia right now... type-inferable/fast keyword arguments are one of my most wanted features. I'm really hoping something like JuliaLang/julia#16580 makes it into 1.0. I struggled with keyword argument performance when I first started working on ForwardDiff, and it's the reason I refactored ForwardDiff's API to get rid of keyword arguments and use an inferable configuration type instead.
What I am proposing is a way to pass the chosen chunksize to the type parameter of the `*Config` type. Specifically, from the code I linked:

```julia
Base.@pure function autodiff_setup{CS}(f!, initial_x::Vector, chunk_size::Type{Val{CS}})
    permf! = (fx, x) -> f!(x, fx)
    fx2 = copy(initial_x)
    jac_cfg = ForwardDiff.JacobianConfig{CS}(initial_x, initial_x)
    g! = (x, gx) -> ForwardDiff.jacobian!(gx, permf!, fx2, x, jac_cfg)
    fg! = (x, fx, gx) -> begin
        jac_res = DiffBase.DiffResult(fx, gx)
        ForwardDiff.jacobian!(jac_res, permf!, fx2, x, jac_cfg)
        DiffBase.value(jac_res)
    end
    return DifferentiableMultivariateFunction(f!, g!, fg!)
end
```

In NLsolve.jl there is currently no way to set the config unless you build the `DifferentiableMultivariateFunction` yourself.
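For readers not steeped in forward-mode AD: the chunk size is how many input directions get seeded per sweep, so an n-dimensional Jacobian takes roughly `n / chunk_size` passes over the function. A language-agnostic sketch of that looping structure in Python — finite differences stand in here for the dual-number arithmetic ForwardDiff actually uses, and all names are illustrative, not from either package:

```python
def jacobian_chunked(f, x, chunk_size=2, eps=1e-7):
    """Forward-difference Jacobian of f at x, seeding `chunk_size`
    input directions per pass -- the same looping structure that
    chunked forward-mode AD uses."""
    fx = f(x)
    m, n = len(fx), len(x)
    J = [[0.0] * n for _ in range(m)]
    for start in range(0, n, chunk_size):               # one pass per chunk
        for j in range(start, min(start + chunk_size, n)):
            xp = list(x)
            xp[j] += eps                                # seed direction j
            fxp = f(xp)
            for i in range(m):
                J[i][j] = (fxp[i] - fx[i]) / eps
    return J

# f(x) = [x0*x1, x0 + x1] has Jacobian [[x1, x0], [1, 1]]
J = jacobian_chunked(lambda x: [x[0] * x[1], x[0] + x[1]],
                     [2.0, 3.0], chunk_size=1)
```

The tradeoff the thread is discussing is that in ForwardDiff the chunk count is baked into the dual-number type, so picking it dynamically versus statically has compilation and optimization consequences that this sketch can't show.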
Ahh I see, ok sounds like you guys are the ones to figure out how to implement this! |
Well, that code would just add to this: https://github.com/EconForge/NLsolve.jl/blob/master/src/autodiff.jl#L1

Chunksize would have to be an optional argument which would call the ForwardDiff default chunksize function when it's not specified. My question before was how to set this all up type-stably, but it sounds like from the other thread that I shouldn't worry about that and instead should just pass the number as a keyword argument here, @KristofferC?
Luckily, if your API uses ForwardDiff's in-place methods (like in your code), then it's not as big of a deal for the chunk size to be dynamic (since the result type is known regardless). It does mean you probably can't take advantage of certain optimizations (e.g. SIMD), though.
Why not use it directly? I'm not super familiar with NLsolve, but why can't the code you linked simply be the API for using AD with NLsolve? In other words, remove the extra layer?
I am surprised about this. Isn't it the case that, as long as the function doesn't get inlined, the generated code for the keyword argument function will be exactly the same as the one without keywords? For example:

```julia
julia> @inline f(;x=1.0) = (z = 0.0; for i in 1:10^5; z += sin(x); end; z)
f (generic function with 2 methods)

julia> @noinline g(;x=1.0) = (z = 0.0; for i in 1:10^5; z += sin(x); end; z)
g (generic function with 1 method)

julia> @btime f(x = 0.5)
  3.228 ms (200001 allocations: 3.05 MiB)
47942.55386052721

julia> @btime g(x = 0.5)
  642.804 μs (2 allocations: 112 bytes)
47942.55386052721
```
This statement doesn't (directly) have anything to do with keyword arguments - apologies if I was unclear. I was saying that certain optimizations aren't going to be available to you if the chunk size is purely dynamic, i.e. isn't known at compile time. SIMD on dual numbers is a good example, because AFAIK the tuple length has to be statically inferable for the SLP vectorizer to work.

My earlier problem with keyword arguments is that arguments passed in via keyword arguments are inferred poorly. For example:

```julia
julia> f{N}(; x::Type{Val{N}} = Val{3}) = Val{N}()
f (generic function with 1 method)

# yes, this is fine because the default
# value for `x` is well inferred
julia> @code_warntype f()
Variables:
  #self#::#f
Body:
  begin
      return $(QuoteNode(Val{3}()))
  end::Val{3}

# ...but as soon as you actually use
# the kwarg, you're poisoning the well.
# Note this happens regardless of
# whether or not f is inlined (on v0.6).
julia> @code_warntype f(x = Val{2})
Variables:
  #unused#::#kw##f
  #temp#@_2::Array{Any,1}
  ::#f
  #temp#@_4::Int64
  #temp#@_5::Int64
  #temp#@_6::Any
  #temp#@_7::Int64
  #temp#@_8::Bool
  x::Any
Body:
  begin
      NewvarNode(:(x::Any))
      #temp#@_8::Bool = true
      SSAValue(2) = (Base.arraylen)(#temp#@_2::Array{Any,1})::Int64
      SSAValue(3) = (Base.select_value)((Base.sle_int)(0,1)::Bool,(Base.ashr_int)(SSAValue(2),(Base.bitcast)(UInt64,1))::Int64,(Base.shl_int)(SSAValue(2),(Base.bitcast)(UInt64,(Base.neg_int)(1)::Int64))::Int64)::Int64
      SSAValue(4) = (Base.select_value)((Base.sle_int)(1,SSAValue(3))::Bool,SSAValue(3),(Base.sub_int)(1,1)::Int64)::Int64
      #temp#@_7::Int64 = 1
      7:
      unless (Base.not_int)((#temp#@_7::Int64 === (Base.add_int)(SSAValue(4),1)::Int64)::Bool)::Bool goto 23
      SSAValue(5) = #temp#@_7::Int64
      SSAValue(6) = (Base.add_int)(#temp#@_7::Int64,1)::Int64
      #temp#@_4::Int64 = SSAValue(5)
      #temp#@_7::Int64 = SSAValue(6)
      #temp#@_5::Int64 = (Base.sub_int)((Base.mul_int)(#temp#@_4::Int64,2)::Int64,1)::Int64
      #temp#@_6::Any = (Core.arrayref)(#temp#@_2::Array{Any,1},#temp#@_5::Int64)::Any
      unless (#temp#@_6::Any === :x)::Bool goto 19
      x::Any = (Core.arrayref)(#temp#@_2::Array{Any,1},(Base.add_int)(#temp#@_5::Int64,1)::Int64)::Any
      #temp#@_8::Bool = false
      goto 21
      19:
      (Base.throw)($(Expr(:new, :(Base.MethodError), :((Core.getfield)((Core.getfield)((Core.getfield)(#f,:name)::TypeName,:mt),:kwsorter)), :((Core.tuple)(#temp#@_2,)::Tuple{Array{Any,1},#f}), :($(QuoteNode(0xffffffffffffffff))))))::Union{}
      21:
      goto 7
      23:
      unless #temp#@_8::Bool goto 26
      x::Any = Val{3}
      26:
      return (Main.#f#1)(x::Any,::#f)::Any
  end::Any
```
It has to be available at compilation time of the function. My point is that there is a function barrier between the keyword argument function and the function actually doing the work. For example, in the generated code you linked, all of that is just setting up the keyword argument sorting stuff until finally calling the inner function `#f#1`. Note that I am not talking about type inference of the return value, only about the generated code of the function.

This also describes what my understanding is: JuliaLang/julia#10443 (comment)
Ah, that makes sense. What's the right way to test whether or not the operations below the barrier are getting SIMD vectorized? For example:

```julia
julia> using ForwardDiff: Dual

julia> f(x::Dual) = x + x
f (generic function with 1 method)

julia> @code_llvm f(Dual(2,3,4,5))

define void @julia_f_65900(%Dual* noalias sret, %Dual*) #0 !dbg !5 {
top:
  %2 = bitcast %Dual* %1 to <4 x i64>*
  %3 = load <4 x i64>, <4 x i64>* %2, align 8
  %4 = shl <4 x i64> %3, <i64 1, i64 1, i64 1, i64 1>
  %5 = bitcast %Dual* %0 to <4 x i64>*
  store <4 x i64> %4, <4 x i64>* %5, align 8
  ret void
}

julia> f2(; x::Dual=Dual(1,1)) = x + x

# If my understanding is correct, this call
# should dynamically dispatch to a compiled
# representation that is essentially equivalent
# to the compiled representation of
# `f(::Dual{4,Int})`, which we confirmed
# above uses SIMD.
#
# How do we confirm this? Just using
# `@code_llvm` etc. will only show
# the dispatch instructions, yes?
julia> f2(x = Dual(1,2,3,4))
```

Of course, a few points still stand; for keyword arguments to be considered "performant", the work being done below the function barrier must be expensive enough that the cost of dynamic dispatch is negligible, and the caller's return type can't depend on the keyword argument if we want it to be inferable.
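For context on the example above: `Dual(2,3,4,5)` is a dual number with value 2 and partials `(3,4,5)` stored contiguously, and `x + x` doubles every component — the componentwise arithmetic that the SLP vectorizer turns into the single `<4 x i64>` shift in the LLVM IR. A rough Python analogue of that componentwise rule (a sketch, not ForwardDiff's actual representation):

```python
class Dual:
    """A value plus a fixed-length tuple of partial derivatives,
    loosely mirroring a forward-mode dual number (sketch only)."""
    def __init__(self, value, *partials):
        self.value = value
        self.partials = tuple(partials)

    def __add__(self, other):
        # componentwise add over the value and every partial: the
        # elementwise loop a SIMD unit can vectorize when the tuple
        # length is known statically
        return Dual(self.value + other.value,
                    *(a + b for a, b in zip(self.partials, other.partials)))

d = Dual(2, 3, 4, 5)
s = d + d   # value 4, partials (6, 8, 10)
```

In Python the tuple length is always dynamic; the thread's point is that in Julia it is a type parameter, which is exactly what makes the vectorized code above possible — and what a purely runtime chunk size would forfeit.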
Agree on all points. I'm not sure how to "look into" the inner function though...
This is now implemented, see the
I was wondering if you think allowing the chunksize choice in the autodiff setup is a good idea. I have a working implementation that I am using, since it's currently not allowed:

https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/blob/master/src/misc_utils.jl#L33

But the issue is, if I were to make a PR, I can't think of a good way to pass it in. You'd want to make it a kwarg like `autodiff`, but then it wouldn't be type stable, since you would need to pass it by value-type for type stability. I handle this with a "pre-stage" where I can use a `@pure` constructor, but I'm not sure that applies here.
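One way to read the "pre-stage" idea is as an ordinary setup-time object: resolve the chunk size (explicit, or defaulted by some rule) once while building a config, so the hot path never re-inspects keyword arguments. A hedged Python sketch of that pattern — the class name, default rule, and fields are all assumptions for illustration, not ForwardDiff's real API:

```python
class JacobianConfig:
    """Setup-time bundle of a chunk size plus scratch buffers,
    analogous in spirit to a *Config object (sketch only)."""
    DEFAULT_CHUNK_LIMIT = 8   # assumed stand-in for the library's heuristic

    def __init__(self, n, chunk_size=None):
        # resolve the chunk size exactly once, here, so downstream
        # code never touches keyword arguments again ("pre-stage")
        if chunk_size is None:
            chunk_size = min(n, self.DEFAULT_CHUNK_LIMIT)
        self.chunk_size = chunk_size
        self.scratch = [0.0] * n   # buffers reused across calls

cfg_default = JacobianConfig(20)                 # falls back to the default rule
cfg_explicit = JacobianConfig(20, chunk_size=4)  # user-chosen chunk size
```

In Julia the extra step is that `chunk_size` becomes a type parameter of the config rather than a plain field, which is where the `Val{CS}` / `@pure` machinery in the thread comes in.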