From 47f22774b4100c822a3a2c8e067793139b423a97 Mon Sep 17 00:00:00 2001
From: Tim Holy
Date: Sun, 27 Oct 2024 13:41:51 -0500
Subject: [PATCH] Expand documentation on convergence options

Fixes #1102
---
 docs/src/user/config.md       | 11 +++++++----
 docs/src/user/minimization.md | 14 +++++++++++++-
 2 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/docs/src/user/config.md b/docs/src/user/config.md
index 5126adaf..9d67fa74 100644
--- a/docs/src/user/config.md
+++ b/docs/src/user/config.md
@@ -35,12 +35,15 @@ Special methods for bounded univariate optimization:
 * `Brent()`
 * `GoldenSection()`
 
-### General Options
+### [General Options](@id config-general)
+
 In addition to the solver, you can alter the behavior of the Optim package by using the following keywords:
 
-* `x_tol`: Absolute tolerance in changes of the input vector `x`, in infinity norm. Defaults to `0.0`.
-* `f_tol`: Relative tolerance in changes of the objective value. Defaults to `0.0`.
-* `g_tol`: Absolute tolerance in the gradient. If `g_tol` is a scalar (the default), convergence is achieved when `norm(g, Inf) ≤ g_tol`; if `g_tol` is supplied as a vector, then each component must satisfy `abs(g[i]) ≤ g_tol[i]`. Defaults to `1e-8`. For gradient-free methods (e.g., Nelder-Meade), this gets re-purposed to control the main convergence tolerance in a solver-specific manner.
+* `x_tol` (alternatively, `x_abstol`): Absolute tolerance in changes of the input vector `x`, in infinity norm. Concretely, if `|x-x'| ≤ x_tol` for successive evaluation points `x` and `x'`, convergence is achieved. Defaults to `0.0`.
+* `x_reltol`: Relative tolerance in changes of the input vector `x`, in infinity norm. Concretely, if `|x-x'| ≤ x_reltol * |x|`, convergence is achieved. Defaults to `0.0`.
+* `f_tol` (alternatively, `f_reltol`): Relative tolerance in changes of the objective value. Defaults to `0.0`.
+* `f_abstol`: Absolute tolerance in changes of the objective value. Defaults to `0.0`.
+* `g_tol` (alternatively, `g_abstol`): Absolute tolerance in the gradient. If `g_tol` is a scalar (the default), convergence is achieved when `norm(g, Inf) ≤ g_tol`; if `g_tol` is supplied as a vector, then each component must satisfy `abs(g[i]) ≤ g_tol[i]`. Defaults to `1e-8`. For gradient-free methods (e.g., Nelder-Mead), it is re-purposed to control the main convergence tolerance in a solver-specific manner.
 * `f_calls_limit`: A soft upper limit on the number of objective calls. Defaults to `0` (unlimited).
 * `g_calls_limit`: A soft upper limit on the number of gradient calls. Defaults to `0` (unlimited).
 * `h_calls_limit`: A soft upper limit on the number of Hessian calls. Defaults to `0` (unlimited).
diff --git a/docs/src/user/minimization.md b/docs/src/user/minimization.md
index 238ad591..1a55a588 100644
--- a/docs/src/user/minimization.md
+++ b/docs/src/user/minimization.md
@@ -31,6 +31,12 @@ For better performance and greater precision, you can pass your own gradient fun
 optimize(f, x0, LBFGS(); autodiff = :forward)
 ```
 
+!!! note
+    For most real-world problems, you may want to consider the appropriate convergence criteria carefully.
+    By default, algorithms that support gradients converge if `|g| ≤ 1e-8`. Depending on how your variables are scaled,
+    this may or may not be appropriate. See [configuration](@ref config-general) for more information about your options.
+    Examining traces (`Options(show_trace=true)`) during optimization may provide insight into when convergence is achieved in practice.
+
 For the Rosenbrock example, the analytical gradient can be shown to be:
 ```jl
 function g!(G, x)
@@ -65,7 +71,7 @@ Now we can use Newton's method for optimization by running:
 ```jl
 optimize(f, g!, h!, x0)
 ```
-Which defaults to `Newton()` since a Hessian function was provided. Like gradients, the Hessian function will be ignored if you use a method that does not require it:
+which defaults to `Newton()` since a Hessian function was provided. Like gradients, the Hessian function will be ignored if you use a method that does not require it:
 ```jl
 optimize(f, g!, h!, x0, LBFGS())
 ```
@@ -74,6 +80,12 @@ because of the potentially low accuracy of approximations to the Hessians.
 
 Other than Newton's method, none of the algorithms provided by the Optim package employ exact Hessians.
 
+As a reminder, it's advisable to set your convergence criteria manually based on
+your knowledge of the problem:
+```jl
+optimize(f, g!, h!, x0, Optim.Options(g_tol = 1e-12))
+```
+
 ## Box Constrained Optimization
 
 A primal interior-point algorithm for simple "box" constraints (lower and upper bounds) is available. Reusing our Rosenbrock example from above, boxed minimization is performed as follows:
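To make the new guidance concrete, here is a small illustrative sketch (not part of the patch itself) that combines several of the keywords documented above for the Rosenbrock example used throughout the manual. The gradient and Hessian follow the standard Rosenbrock derivatives, and the tolerance values are arbitrary placeholders to be chosen from the scaling of your own problem:

```jl
using Optim

# Rosenbrock objective, gradient, and Hessian as used in the manual.
f(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

function g!(G, x)
    G[1] = -2.0 * (1.0 - x[1]) - 400.0 * (x[2] - x[1]^2) * x[1]
    G[2] = 200.0 * (x[2] - x[1]^2)
end

function h!(H, x)
    H[1, 1] = 2.0 - 400.0 * x[2] + 1200.0 * x[1]^2
    H[1, 2] = -400.0 * x[1]
    H[2, 1] = -400.0 * x[1]
    H[2, 2] = 200.0
end

x0 = [0.0, 0.0]

# Tighter gradient tolerance plus a relative tolerance on x, with a trace
# printed so the convergence measures can be inspected at each iteration.
opts = Optim.Options(g_tol = 1e-12, x_reltol = 1e-10, show_trace = true)

result = optimize(f, g!, h!, x0, opts)  # defaults to Newton() since h! is supplied
```

With `show_trace = true`, the per-iteration output typically includes the gradient norm, which makes it easier to judge whether a given `g_tol` is realistic for the problem's scaling.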