FiniteDifferences.jl estimates derivatives with finite differences.
See also the Python package FDM.
FiniteDiff.jl and FiniteDifferences.jl are similar libraries: both calculate approximate derivatives numerically. You should definitely use one or the other, rather than the legacy Calculus.jl finite differencing, or reimplementing it yourself. At some point in the future they might merge, or one might depend on the other. Right now here are the differences:
- FiniteDifferences.jl supports basically any type, whereas FiniteDiff.jl supports only array-ish types
- FiniteDifferences.jl supports higher order approximation and step size adaptation
- FiniteDiff.jl supports caching and in-place computation
- FiniteDiff.jl supports coloring vectors for efficient calculation of sparse Jacobians
This package was formerly called FDM.jl. We recommend users of FDM.jl update to FiniteDifferences.jl.
Compute the first derivative of sin
with a 5th order central method:
julia> central_fdm(5, 1)(sin, 1) - cos(1)
-2.4313884239290928e-14
Finite difference methods are optimised to minimise allocations:
julia> m = central_fdm(5, 1);
julia> @benchmark $m(sin, 1)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 486.621 ns (0.00% GC)
median time: 507.677 ns (0.00% GC)
mean time: 539.806 ns (0.00% GC)
maximum time: 1.352 μs (0.00% GC)
--------------
samples: 10000
evals/sample: 195
Compute the second derivative of sin
with a 5th order central method:
julia> central_fdm(5, 2)(sin, 1) - (-sin(1))
-8.767431225464861e-11
To obtain better accuracy, you can increase the order of the method:
julia> central_fdm(12, 2)(sin, 1) - (-sin(1))
5.240252676230739e-14
The functions forward_fdm
and backward_fdm
can be used to construct
forward differences and backward differences, respectively.
The function log(x)
is only defined for x > 0
.
If we try to use central_fdm
to estimate the derivative of log
near x = 0
,
then we run into DomainError
s, because central_fdm
happens to evaluate log
at some x < 0
.
julia> central_fdm(5, 1)(log, 1e-3)
ERROR: DomainError with -0.02069596546590111
To deal with this situation, you have two options.
The first option is to use forward_fdm
, which only evaluates log
at inputs
greater or equal to x
:
julia> forward_fdm(5, 1)(log, 1e-3) - 1000
-3.741856744454708e-7
This works fine, but the downside is that you're restricted to one-sided
methods (forward_fdm
), which tend to perform worse than central methods
(central_fdm
).
The second option is to limit the distance that the finite difference method is
allowed to evaluate log
away from x
. Since x = 1e-3
, a reasonable value
for this limit is 9e-4
:
julia> central_fdm(5, 1, max_range=9e-4)(log, 1e-3) - 1000
-4.027924660476856e-10
Another commonly encountered example is logdet
, which is only defined
for positive-definite matrices.
Here you can use a forward method in combination with a positive-definite
deviation from x
:
julia> x = diagm([1.0, 2.0, 3.0]); v = Matrix(1.0I, 3, 3);
julia> forward_fdm(5, 1)(ε -> logdet(x .+ ε .* v), 0) - sum(1 ./ diag(x))
-4.222400207254395e-12
A range-limited central method is also possible:
julia> central_fdm(5, 1, max_range=9e-1)(ε -> logdet(x .+ ε .* v), 0) - sum(1 ./ diag(x))
-1.283417816466681e-13
It could be the case that the function f
you'd like compute the derivative of
suffers from numerical noise.
For example, f(x)
could be computed through some iterative procedure with some
error tolerance ε
.
In such cases, finite difference estimates can fail catastrophically.
To illustrate this, consider sin_noisy(x) = sin(x) * (1 + 1e-6 * randn())
.
Then
julia> central_fdm(5, 1)(sin_noisy, 1) - cos(1)
-0.027016678790599657
which is a terrible performance.
To deal with this, you can set the keyword argument factor
, which specifies
the level of numerical noise on the function evaluations relative to the
machine epsilon.
In this example, the relative error on the function evaluations
is 2e-6
(1e-6 * randn()
roughly produces a number in [-2e-6, 2e-6]
)
and the machine epsilon is eps(Float64) ≈ 2.22e-16
, so
factor = 2e-6 / 2e-16 = 1e10
should be appropriate:
julia> central_fdm(5, 1; factor=1e10)(sin_noisy, 1) - cos(1)
-1.9243663490486895e-6
As a rule of thumb, if you're dealing with numerical noise and Float64
s,
factor = 1e6
is not a bad first attempt.
The finite difference methods defined in this package can be extrapolated using Richardson extrapolation. This can offer superior numerical accuracy: Richardson extrapolation attempts polynomial extrapolation of the finite difference estimate as a function of the step size until a convergence criterion is reached.
julia> extrapolate_fdm(central_fdm(2, 1), sin, 1)[1] - cos(1)
1.6653345369377348e-14
Similarly, you can estimate higher order derivatives:
julia> extrapolate_fdm(central_fdm(5, 4), sin, 1)[1] - sin(1)
-1.626274487942503e-5
In this case, the accuracy can be improved by making the
contraction rate closer to
1
:
julia> extrapolate_fdm(central_fdm(5, 4), sin, 1, contract=0.8)[1] - sin(1)
2.0725743343774639e-10
This performs similarly to a 10
th order central method:
julia> central_fdm(10, 4)(sin, 1) - sin(1)
-1.0301381969668455e-10
If you really want, you can even extrapolate the 10
th order central method,
but that provides no further gains:
julia> extrapolate_fdm(central_fdm(10, 4), sin, 1, contract=0.8)[1] - sin(1)
5.673617131662922e-10
In this case, the central method can be pushed to a high order to obtain improved accuracy:
julia> central_fdm(20, 4)(sin, 1) - sin(1)
-5.2513549064769904e-14
julia> method = FiniteDifferenceMethod([-2, 0, 5], 1)
FiniteDifferenceMethod:
order of method: 3
order of derivative: 1
grid: [-2, 0, 5]
coefficients: [-0.35714285714285715, 0.3, 0.05714285714285714]
julia> method(sin, 1) - cos(1)
-3.701483564100272e-13
Consider a quadratic function:
julia> a = randn(3, 3); a = a * a'
3×3 Matrix{Float64}:
4.31995 -2.80614 0.0829128
-2.80614 3.91982 0.764388
0.0829128 0.764388 1.18055
julia> f(x) = 0.5 * x' * a * x
Compute the gradient:
julia> x = randn(3)
3-element Vector{Float64}:
-0.18563161988700727
-0.4659836395595666
2.304584409826511
julia> grad(central_fdm(5, 1), f, x)[1] - a * x
3-element Vector{Float64}:
4.1744385725905886e-14
-6.611378111642807e-14
-8.615330671091215e-14
Compute the Jacobian:
julia> jacobian(central_fdm(5, 1), f, x)[1] - (a * x)'
1×3 Matrix{Float64}:
4.17444e-14 -6.61138e-14 -8.61533e-14
The Jacobian can also be computed for non-scalar functions:
julia> a = randn(3, 3)
3×3 Matrix{Float64}:
0.844846 1.04772 1.0173
-0.867721 0.154146 -0.938077
1.34078 -0.630105 -1.13287
julia> f(x) = a * x
julia> jacobian(central_fdm(5, 1), f, x)[1] - a
3×3 Matrix{Float64}:
2.91989e-14 1.77636e-15 4.996e-14
-5.55112e-15 -7.63278e-15 2.4758e-14
4.66294e-15 -2.05391e-14 -1.04361e-14
To compute Jacobian--vector products, use jvp
and j′vp
:
julia> v = randn(3)
3-element Array{Float64,1}:
-1.290782164377614
-0.37701592844250903
-1.4288108966380777
julia> jvp(central_fdm(5, 1), f, (x, v)) - a * v
3-element Vector{Float64}:
-7.993605777301127e-15
-8.881784197001252e-16
-3.22519788653608e-14
julia> j′vp(central_fdm(5, 1), f, x, v)[1] - a'x
3-element Vector{Float64}:
-2.1316282072803006e-14
2.4646951146678475e-14
6.661338147750939e-15