Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Anderson for inputs of arbitrary dimension #223

Merged
merged 7 commits into from
Aug 14, 2019

Conversation

devmotion
Copy link
Contributor

This PR fixes the use of matrices (and higher-dimensional arrays) as inputs for Anderson.

The issue was not detected by the tests since m = 0 was the default history size. I changed the defaults to m = 10 and droptol = 1e10, as used in Walker's implementation.

However, for the default setting beta = 1 the Wood example diverges, leading to infinite values in the standard fixed-point iteration and NaN with Anderson acceleration (due to the computation of Q and R). Thus I explicitly chose m = 2 and beta = 1e-3, such that the algorithm does not diverge within the first 1000 iterations, and added a check for finite values in the algorithm.

Moreover, I removed the Anderson struct which seemed unnecessary.

Copy link
Contributor

@antoine-levitt antoine-levitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks! I'd revert the change that makes it unstable, just to be sure we don't annoy somebody who relies on this (it doesn't change much anyway). Although be aware that @pkofod is focusing on a rewrite and might not come around to merge this soon

γs = nothing
Q = nothing
R = nothing
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unstable. Not sure if that's a big problem in practice since small type unions are supposed to be fast these days, and anyway Anderson is mostly useful for large scale in which the overhead doesn't matter...

@@ -14,34 +12,40 @@ struct AndersonCache{Tx,To,Tdg,Tg,TQ,TR} <: AbstractSolverCache
R::TR
end

function AndersonCache(df, ::Anderson{m}) where m
function AndersonCache(df, m)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing this was done to have m be known to the compiler

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, the current implementation has the same type instability problem, it's just better hidden. In

anderson(df, initial_x, xtol, ftol, iterations, store_trace, show_trace, extended_trace, beta, aa_start, droptol, AndersonCache(df, Anderson{m}()))

the creation of Anderson{m} is also type unstable.

As far as I understand, both these instabilities in the current and the new implementation should not be a problem, since the type instability should not affect the output type when calling nlsolve and after the first call of Anderson{m} and AndersonCache(df, m) everything can be inferred. To me it seems, the current implementation is just more complicated without providing any benefits.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I retract my objection then!

@devmotion
Copy link
Contributor Author

The following simple script (taken from test/2by2.jl)

using NLsolve
using InteractiveUtils

function f_2by2!(F, x)
    F[1] = (x[1]+3)*(x[2]^3-7)+18
    F[2] = sin(x[2]*exp(x[1])-1)
end

function g_2by2!(J, x)
    J[1, 1] = x[2]^3-7
    J[1, 2] = 3*x[2]^2*(x[1]+3)
    u = exp(x[1])*cos(x[2]*exp(x[1])-1)
    J[2, 1] = x[2]*u
    J[2, 2] = u
end

const df = OnceDifferentiable(f_2by2!, g_2by2!, [ -0.5; 1.4], [ -0.5; 1.4])

@code_warntype nlsolve(df, [ 0.01; .99], method = :anderson, m = 10, beta=.01)

shows that on Julia 1.1 the current implementation of nlsolve is actually type unstable

Body::NLsolve.SolverResults{Float64,Float64,Array{Float64,1},_1} where _1                  

while with this PR we get

Body::NLsolve.SolverResults{Float64,Float64,Array{Float64,1},Array{Float64,1}}

I'm not sure why the current implementation is type unstable since I don't know how to interpret the output of Cthulhu.

Comparing

f(df, m) = NLsolve.AndersonCache(df, Anderson{m}())

@code_warntype f(df, 0)
@code_warntype f(df, 10)

on master with

g(df, m) = NLsolve.AndersonCache(df, m)

@code_warntype g(df, 0)
@code_warntype g(df, 10)

on this PR shows that with the current implementation none of the type parameters can be inferred

Body::NLsolve.AndersonCache
1%1  = NLsolve.AndersonCache::Core.Compiler.Const(NLsolve.AndersonCache, false)
│   %2  = (Core.apply_type)(Main.Anderson, m)::Type%3  = (%2)()::Any%4  = (isa)(%3, NLsolve.Anderson{0})::Bool
└──       goto #3 if not %4
2%6  = (Base.getfield)(df, :x_f)::Array{Float64,1}%7  = (Base.arraysize)(%6, 1)::Int64%8  = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Array{Float64,1}, svec(Any, Int64), :(:ccall), 2, Array{Float64,1}, :(%7), :(%7)))::Array{Float64,1}%9  = (Base.getfield)(df, :x_f)::Array{Float64,1}%10 = (Base.arraysize)(%9, 1)::Int64%11 = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Array{Float64,1}, svec(Any, Int64), :(:ccall), 2, Array{Float64,1}, :(%10), :(%10)))::Array{Float64,1}%12 = %new(NLsolve.AndersonCache{Array{Float64,1},Nothing,Nothing,Nothing,Nothing,Nothing}, %8, %11, nothing, nothing, nothing, nothing, nothing, nothing)::NLsolve.AndersonCache{Array{Float64,1},Nothing,Nothing,Nothing,Nothing,Nothing}
└──       goto #4
3%14 = (%1)(df, %3)::NLsolve.AndersonCache
└──       goto #4
4%16 = φ (#2 => %12, #3 => %14)::NLsolve.AndersonCache
└──       return %16
Body::NLsolve.AndersonCache
1%1  = NLsolve.AndersonCache::Core.Compiler.Const(NLsolve.AndersonCache, false)
│   %2  = (Core.apply_type)(Main.Anderson, m)::Type%3  = (%2)()::Any%4  = (isa)(%3, NLsolve.Anderson{0})::Bool
└──       goto #3 if not %4
2%6  = (Base.getfield)(df, :x_f)::Array{Float64,1}%7  = (Base.arraysize)(%6, 1)::Int64%8  = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Array{Float64,1}, svec(Any, Int64), :(:ccall), 2, Array{Float64,1}, :(%7), :(%7)))::Array{Float64,1}%9  = (Base.getfield)(df, :x_f)::Array{Float64,1}%10 = (Base.arraysize)(%9, 1)::Int64%11 = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Array{Float64,1}, svec(Any, Int64), :(:ccall), 2, Array{Float64,1}, :(%10), :(%10)))::Array{Float64,1}%12 = %new(NLsolve.AndersonCache{Array{Float64,1},Nothing,Nothing,Nothing,Nothing,Nothing}, %8, %11, nothing, nothing, nothing, nothing, nothing, nothing)::NLsolve.AndersonCache{Array{Float64,1},Nothing,Nothing,Nothing,Nothing,Nothing}
└──       goto #4
3%14 = (%1)(df, %3)::NLsolve.AndersonCache
└──       goto #4
4%16 = φ (#2 => %12, #3 => %14)::NLsolve.AndersonCache
└──       return %16

whereas with this PR at least the first type parameter can be inferred:

Body::NLsolve.AndersonCache{Array{Float64,1},_1,_2,_3,_4,_5} where _5 where _4 where _3 where _2 where _1
1%1 = NLsolve.AndersonCache::Core.Compiler.Const(NLsolve.AndersonCache, false)
│   %2 = invoke %1(_2::OnceDifferentiable{Array{Float64,1},Array{Float64,2},Array{Float64,1}}, _3::Int64)::NLsolve.AndersonCache{Array{Float64,1},_1,_2,_3,_4,_5} where _5 where _4 where _3 where _2 where _1
└──      return %2
Body::NLsolve.AndersonCache{Array{Float64,1},_1,_2,_3,_4,_5} where _5 where _4 where _3 where _2 where _1
1%1 = NLsolve.AndersonCache::Core.Compiler.Const(NLsolve.AndersonCache, false)
│   %2 = invoke %1(_2::OnceDifferentiable{Array{Float64,1},Array{Float64,2},Array{Float64,1}}, _3::Int64)::NLsolve.AndersonCache{Array{Float64,1},_1,_2,_3,_4,_5} where _5 where _4 where _3 where _2 where _1
└──      return %2

So actually I think this PR improves type stability.

@pkofod
Copy link
Member

pkofod commented Aug 14, 2019

David,

Thanks! It looks good to me. If you agree it's done, then I'll merge.

@devmotion
Copy link
Contributor Author

Great! I don't want to add anything else to this PR 😃

@pkofod pkofod merged commit 9b4c7b3 into JuliaNLSolvers:master Aug 14, 2019
@pkofod
Copy link
Member

pkofod commented Aug 14, 2019

Thanks!

@devmotion devmotion deleted the anderson branch August 14, 2019 20:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants