-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Anderson for inputs of arbitrary dimension #223
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, thanks! I'd revert the change that makes it unstable, just to be sure we don't annoy somebody who relies on this (it doesn't change much anyway). Although be aware that @pkofod is focusing on a rewrite and might not come around to merge this soon
γs = nothing | ||
Q = nothing | ||
R = nothing | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is unstable. Not sure if that's a big problem in practice since small type unions are supposed to be fast these days, and anyway Anderson is mostly useful for large scale in which the overhead doesn't matter...
@@ -14,34 +12,40 @@ struct AndersonCache{Tx,To,Tdg,Tg,TQ,TR} <: AbstractSolverCache | |||
R::TR | |||
end | |||
|
|||
function AndersonCache(df, ::Anderson{m}) where m | |||
function AndersonCache(df, m) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing this was done to have m be known to the compiler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, the current implementation has the same type instability problem, it's just better hidden. In
NLsolve.jl/src/solvers/anderson.jl
Line 175 in ddd16a5
anderson(df, initial_x, xtol, ftol, iterations, store_trace, show_trace, extended_trace, beta, aa_start, droptol, AndersonCache(df, Anderson{m}())) |
the creation of
Anderson{m}
is also type unstable.
As far as I understand, both these instabilities in the current and the new implementation should not be a problem, since the type instability should not affect the output type when calling nlsolve
and after the first call of Anderson{m}
and AndersonCache(df, m)
everything can be inferred. To me it seems, the current implementation is just more complicated without providing any benefits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I retract my objection then!
The following simple script (taken from using NLsolve
using InteractiveUtils
function f_2by2!(F, x)
F[1] = (x[1]+3)*(x[2]^3-7)+18
F[2] = sin(x[2]*exp(x[1])-1)
end
function g_2by2!(J, x)
J[1, 1] = x[2]^3-7
J[1, 2] = 3*x[2]^2*(x[1]+3)
u = exp(x[1])*cos(x[2]*exp(x[1])-1)
J[2, 1] = x[2]*u
J[2, 2] = u
end
const df = OnceDifferentiable(f_2by2!, g_2by2!, [ -0.5; 1.4], [ -0.5; 1.4])
@code_warntype nlsolve(df, [ 0.01; .99], method = :anderson, m = 10, beta=.01) shows that on Julia 1.1 the current implementation of Body::NLsolve.SolverResults{Float64,Float64,Array{Float64,1},_1} where _1 while with this PR we get Body::NLsolve.SolverResults{Float64,Float64,Array{Float64,1},Array{Float64,1}} I'm not sure why the current implementation is type unstable since I don't know how to interpret the output of Cthulhu. Comparing f(df, m) = NLsolve.AndersonCache(df, Anderson{m}())
@code_warntype f(df, 0)
@code_warntype f(df, 10) on master with g(df, m) = NLsolve.AndersonCache(df, m)
@code_warntype g(df, 0)
@code_warntype g(df, 10) on this PR shows that with the current implementation none of the type parameters can be inferred Body::NLsolve.AndersonCache
1 ─ %1 = NLsolve.AndersonCache::Core.Compiler.Const(NLsolve.AndersonCache, false)
│ %2 = (Core.apply_type)(Main.Anderson, m)::Type
│ %3 = (%2)()::Any
│ %4 = (isa)(%3, NLsolve.Anderson{0})::Bool
└── goto #3 if not %4
2 ─ %6 = (Base.getfield)(df, :x_f)::Array{Float64,1}
│ %7 = (Base.arraysize)(%6, 1)::Int64
│ %8 = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Array{Float64,1}, svec(Any, Int64), :(:ccall), 2, Array{Float64,1}, :(%7), :(%7)))::Array{Float64,1}
│ %9 = (Base.getfield)(df, :x_f)::Array{Float64,1}
│ %10 = (Base.arraysize)(%9, 1)::Int64
│ %11 = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Array{Float64,1}, svec(Any, Int64), :(:ccall), 2, Array{Float64,1}, :(%10), :(%10)))::Array{Float64,1}
│ %12 = %new(NLsolve.AndersonCache{Array{Float64,1},Nothing,Nothing,Nothing,Nothing,Nothing}, %8, %11, nothing, nothing, nothing, nothing, nothing, nothing)::NLsolve.AndersonCache{Array{Float64,1},Nothing,Nothing,Nothing,Nothing,Nothing}
└── goto #4
3 ─ %14 = (%1)(df, %3)::NLsolve.AndersonCache
└── goto #4
4 ┄ %16 = φ (#2 => %12, #3 => %14)::NLsolve.AndersonCache
└── return %16
Body::NLsolve.AndersonCache
1 ─ %1 = NLsolve.AndersonCache::Core.Compiler.Const(NLsolve.AndersonCache, false)
│ %2 = (Core.apply_type)(Main.Anderson, m)::Type
│ %3 = (%2)()::Any
│ %4 = (isa)(%3, NLsolve.Anderson{0})::Bool
└── goto #3 if not %4
2 ─ %6 = (Base.getfield)(df, :x_f)::Array{Float64,1}
│ %7 = (Base.arraysize)(%6, 1)::Int64
│ %8 = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Array{Float64,1}, svec(Any, Int64), :(:ccall), 2, Array{Float64,1}, :(%7), :(%7)))::Array{Float64,1}
│ %9 = (Base.getfield)(df, :x_f)::Array{Float64,1}
│ %10 = (Base.arraysize)(%9, 1)::Int64
│ %11 = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Array{Float64,1}, svec(Any, Int64), :(:ccall), 2, Array{Float64,1}, :(%10), :(%10)))::Array{Float64,1}
│ %12 = %new(NLsolve.AndersonCache{Array{Float64,1},Nothing,Nothing,Nothing,Nothing,Nothing}, %8, %11, nothing, nothing, nothing, nothing, nothing, nothing)::NLsolve.AndersonCache{Array{Float64,1},Nothing,Nothing,Nothing,Nothing,Nothing}
└── goto #4
3 ─ %14 = (%1)(df, %3)::NLsolve.AndersonCache
└── goto #4
4 ┄ %16 = φ (#2 => %12, #3 => %14)::NLsolve.AndersonCache
└── return %16 whereas with this PR at least the first type parameter can be inferred: Body::NLsolve.AndersonCache{Array{Float64,1},_1,_2,_3,_4,_5} where _5 where _4 where _3 where _2 where _1
1 ─ %1 = NLsolve.AndersonCache::Core.Compiler.Const(NLsolve.AndersonCache, false)
│ %2 = invoke %1(_2::OnceDifferentiable{Array{Float64,1},Array{Float64,2},Array{Float64,1}}, _3::Int64)::NLsolve.AndersonCache{Array{Float64,1},_1,_2,_3,_4,_5} where _5 where _4 where _3 where _2 where _1
└── return %2
Body::NLsolve.AndersonCache{Array{Float64,1},_1,_2,_3,_4,_5} where _5 where _4 where _3 where _2 where _1
1 ─ %1 = NLsolve.AndersonCache::Core.Compiler.Const(NLsolve.AndersonCache, false)
│ %2 = invoke %1(_2::OnceDifferentiable{Array{Float64,1},Array{Float64,2},Array{Float64,1}}, _3::Int64)::NLsolve.AndersonCache{Array{Float64,1},_1,_2,_3,_4,_5} where _5 where _4 where _3 where _2 where _1
└── return %2 So actually I think this PR improves type stability. |
David, Thanks! It looks good to me. If you agree it's done, then I'll merge. |
Great! I don't want to add anything else to this PR 😃 |
Thanks! |
This PR fixes the use of matrices (and higher-dimensional arrays) as inputs for Anderson.
The issue was not detected by the tests since
m = 0
was the default history size. I changed the defaults tom = 10
anddroptol = 1e10
, as used in Walker's implementation.However, for the default setting
beta = 1
the Wood example diverges, leading to infinite values in the standard fixed-point iteration andNaN
with Anderson acceleration (due to the computation ofQ
andR
). Thus I explicitly chosem = 2
andbeta = 1e-3
, such that the algorithm does not diverge within the first 1000 iterations, and added a check for finite values in the algorithm.Moreover, I removed the
Anderson
struct which seemed unnecessary.