Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added a Trust Region solver. #116

Merged
merged 27 commits into from
Jan 18, 2023
Merged

Added a Trust Region solver. #116

merged 27 commits into from
Jan 18, 2023

Conversation

Deltadahl
Copy link
Contributor

Implementation of a TrustRegion solver.
The solver is similar to the TrustRegion solver in SimpleNonlinearSolve.jl, but this new solver is implemented with NewtonRaphson as a template.

Questions:

  1. Now the two TrustRegion solvers (in SimpleNonlinearSolve.jl and NonlinearSolve.jl) have the same name, should I change the name of the TurstRegion solver in SimpleNonlinearSolve.jl to SimpleTrustRegion?
  2. Regarding Error when using a neural network inside a system of nonlinear equations. #114, the TurstRegion solver does not give any error (as NewtonRaphson does). But the solver finds a solution with a "high" error. I suppose this is because it's getting stuck in a local minimum. But if you think that's not the case, please tell me.

As always, just tell me if there is anything I can improve/change.

src/trustRegion.jl Outdated Show resolved Hide resolved
src/trustRegion.jl Outdated Show resolved Hide resolved
src/trustRegion.jl Outdated Show resolved Hide resolved
src/trustRegion.jl Outdated Show resolved Hide resolved
src/trustRegion.jl Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Jan 17, 2023

Codecov Report

Merging #116 (45d6195) into master (a791aca) will increase coverage by 3.25%.
The diff coverage is 96.52%.

@@            Coverage Diff             @@
##           master     #116      +/-   ##
==========================================
+ Coverage   83.33%   86.58%   +3.25%     
==========================================
  Files           5        6       +1     
  Lines         228      343     +115     
==========================================
+ Hits          190      297     +107     
- Misses         38       46       +8     
Impacted Files Coverage Δ
src/NonlinearSolve.jl 60.00% <ø> (ø)
src/ad.jl 100.00% <ø> (ø)
src/utils.jl 58.46% <50.00%> (-3.45%) ⬇️
src/trustRegion.jl 97.34% <97.34%> (ø)
src/jacobian.jl 84.61% <0.00%> (-2.57%) ⬇️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

src/trustRegion.jl Outdated Show resolved Hide resolved
src/utils.jl Outdated Show resolved Hide resolved
@ChrisRackauckas
Copy link
Member

Note to @oscardssmith if he wants to take a look and help this get over the finish line.

@ChrisRackauckas
Copy link
Member

Now the two TrustRegion solvers (in SimpleNonlinearSolve.jl and NonlinearSolve.jl) have the same name, should I change the name of the TurstRegion solver in SimpleNonlinearSolve.jl to SimpleTrustRegion?

I handled that. SciML/SimpleNonlinearSolve.jl#28

differentiation. Defaults to true, which thus uses the `NonlinearSolveTag`. If `Val{false}`,
then ForwardDiff's default function naming tag is used, which results in larger stack
traces.
- `concrete_jac`: whether to build a concrete Jacobian. If a Krylov-subspace method is used,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should just be jac to be consistent with the rest of SciML

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, it's always concrete_jac

@oscardssmith
Copy link
Contributor

How does this benchmark compared to NLSolve?

@Deltadahl
Copy link
Contributor Author

Deltadahl commented Jan 17, 2023

I'll have a look at benchmarking as soon as I finish this up :)

@ChrisRackauckas
Copy link
Member

If the user inputs the exact solution as the forward diff seems to output the wrong value, Is this something I should handle as a special case?

What do you mean? Example?

@Deltadahl
Copy link
Contributor Author

Sorry, just deleted the message.
give me 1 sec

@Deltadahl
Copy link
Contributor Author

Deltadahl commented Jan 17, 2023

This input gives error

for p in 1.1:0.1:100.0
    @test g(p) ≈ sqrt(p)
    @test ForwardDiff.derivative(g, p) ≈ 1 / (2 * sqrt(p))
end

ONLY for the last iteration (p = 100.0), when

g = function (p)
    probN = NonlinearProblem(f, oftype(p, 1.0), p)
    sol = solve(probN, TrustRegion(), abstol = 1e-10)
    return sol.u
end
  Expression: ForwardDiff.derivative(g, p) ≈ 1 / (2 * sqrt(p))
   Evaluated: 0.09090909090909091 ≈ 0.05

@ChrisRackauckas
Copy link
Member

how far off is it?

@Deltadahl
Copy link
Contributor Author

  Expression: ForwardDiff.derivative(g, p) ≈ 1 / (2 * sqrt(p))
   Evaluated: 0.09090909090909091 ≈ 0.05
Stacktrace:
 [1] macro expansion
   @ C:\Users\ccsim\AppData\Local\Programs\julia-1.8.5\share\julia\stdlib\v1.8\Test\src\Test.jl:464 [inlined]
 [2] top-level scope
   @ c:\Users\ccsim\.julia\dev\NonlinearSolve.jl\test\basictests.jl:209
Test Summary:         | Pass  Fail  Total   Time
Basic Tests + Some AD | 7961     1   7962  53.6s

@Deltadahl
Copy link
Contributor Author

https://github.com/SciML/NonlinearSolve.jl/blob/master/src/ad.jl#L28-L37 expand this.

Solved the problem with alg::Union{NewtonRaphson, TrustRegion} Not 100% sure if this is ok, but it works

@ChrisRackauckas
Copy link
Member

Solved the problem with alg::Union{NewtonRaphson, TrustRegion} Not 100% sure if this is ok, but it works

That's the correct thing to do.

Comment on lines +137 to +138
solver = init(probN, TrustRegion(), abstol = 1e-9)
sol = solve!(solver)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test that solve! on the mutable code path does not allocate? I think if that passes then it's done.

cache.loss_new = get_loss(fu_new)

# Compute the ratio of the actual reduction to the predicted reduction.
cache.r = -(loss - cache.loss_new) / (step_size' * g + step_size' * H * step_size / 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still needs to be made non-allocating.

@Deltadahl
Copy link
Contributor Author

Tried to fix the allocation problem (get them < 400), but why did you make benchmark_mutable and benchmark_immutable the same with the same input in your code. I just used yours as a template, but I can't really understand why you do this.

solver = init(probN, TrustRegion(), abstol = 1e-9)
sol = solve!(solver)
end

function benchmark_mutable(f, u0)
probN = NonlinearProblem(f, u0)
probN = NonlinearProblem{false}(f, u0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this one should be true

@@ -161,6 +161,10 @@ sol = benchmark_scalar(sf, csu0)
@test sol.retcode === ReturnCode.Success
@test sol.u * sol.u - 2 < 1e-9

@test (@ballocated benchmark_immutable(ff, cu0)) < 400
@test (@ballocated benchmark_mutable(ff, cu0)) < 400
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the allocation test we should check on this one shouldn't include the problem building.

@ChrisRackauckas
Copy link
Member

but why did you make benchmark_mutable and benchmark_immutable the same with the same input in your code

That looks like it's a typo.

@Deltadahl
Copy link
Contributor Author

NewtonRaphson allocates 48 while TrustRegion allocates 112 (for the inplace case)

@ChrisRackauckas
Copy link
Member

yeah my guess is that it's mostly line 299.

Let's get this into a form that can be merged and follow up with fixing that.

@Deltadahl
Copy link
Contributor Author

Sure!
What needs to be done to get it into that form?

@Deltadahl
Copy link
Contributor Author

Deltadahl commented Jan 18, 2023

(I don't really understand why some checks, e.g "IntegrationTest / OrdinaryDiffEq.jl" fail and what to do about it)

@ChrisRackauckas
Copy link
Member

It seems like it's a v1.6 only error?

@Deltadahl
Copy link
Contributor Author

yeah, sorry for not understanding... but I don't know what to do to fix it :(

@ChrisRackauckas
Copy link
Member

Precompilation of TrustRegion causes segfaults only on v1.6. That seems like it's a Julia bug that was fixed. Let's merge like this if tests pass, but then follow up by enabling it for VERSION >= v"1.7"

@Deltadahl
Copy link
Contributor Author

I want to say a huge thank you for all the help @ChrisRackauckas!
I'm learning so much from this (and is a lot of fun) :)
And sorry for all the time you have to put on me, hopefully, it will pay off soon as I intend to continue to do PRs.

If you would like me to continue with something specific, please tell me, otherwise I assume a good starting point is to implement the new solvers you just added to the issues.

PS. Thank you for the invite to SciML, means a lot to me :D

@ChrisRackauckas ChrisRackauckas merged commit 8ae8f8a into SciML:master Jan 18, 2023
avik-pal pushed a commit that referenced this pull request Nov 1, 2024
Add Line Search to (L)Broyden
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants