State estimation in nonlinear systems is difficult due to the non-Gaussianity of posterior state distributions. For linear systems, an exact solution is attained by running the Kalman filtering/smoother. However for nonlinear systems, one typically relies on either crude Gaussian approximations by linearising the system (e.g. the Extended Kalman filter/smoother) or to use a Monte-Carlo method (particle filter/smoother) that sample the non-Gaussian posterior, but at the cost of more compute.
We propose an intermediate nonlinear state estimation method based on (approximate) Expectation Propagation (EP), which allows for an iterative refinement of the Gaussian approximation based on message passing. It turns out that this method generalises any standard Gaussian smoother such as the Extended Kalman smoother and the Unscented Kalman smoother, in the sense that these well-known smoothers are special cases of (approximate) EP. Moreover, they have the same computational cost up to the number of iterations, making it a practical solution to improving state estimates.
The aim of state estimation is to provide an estimate of a time-evolving latent state (given by a probability distribution) based on noisy observations of the dynamical system. This can be formulated mathematically using the state-space model:
Here,
Graphical representation of a state-space model.
We distinguish between two types of solutions. In filtering, a solution to the state estimation problem is given by the probability distribution
Expectation propagation (EP) [1] gives us a way to approximate the intractable marginal distribution of the nodes in a Bayesian network, such as the one in the figure above. Assuming that the marginal factorises as
EP approximates this using a simpler distribution of the form
-
Form the cavity distribution:
$q_{\backslash i}(x_t) \propto q(x_t)/q_i(x_t)$ . -
Projection
- Update
When the KL-divergence in Step 2 is replaced by the
in Step 3 with damping factor
In practice, the projection in Step 2 cannot be solved exactly when the true factor
Our implementation of approximate EP primarily uses numpy
and scipy
. To perform the Taylor linearisation, we also use automatic differentiation with the autograd
package. We have kept the number of required packages minimal. You can install the necessary packages by running:
pip install -r requirements.txt
State estimation with EP can be done as follows:
- Set up a state-space model with the
DynamicSystemModel
class inSystems.DynamicSystem
. - Set up the nodes in the Markov chain with
ExpectationPropagation.build_nodes
. - Add the state-space model and observations to the nodes using
ExpectationPropagation.node_system
. This completes the information required to form the factor graph for a dynamical system. - Equip the nodes with an
Estimator
object usingExpectationPropagation.node_estimator
. This object contains information about the state estimation procedure, such as linearisation method (e.g. Taylor transformation), and values for$\alpha$ and$\gamma$ . - Run a single EP sweep with
ExpectationPropagation.Iterations.ep_fwd_back_updates
. - Iterate step 5 until a stopping criterion is met.
A jupyter notebook demonstrating this basic procedure can be found in Notebooks/Demo.ipynb
.
The uniform nonlinear growth model (UNGM) is a well-known 1D nonlinear benchmark for state estimation, given as follows.
The video below shows the result of EP iterations with the following setup.
- Linearisation method: unscented transform
- Power:
$\alpha = 0.9$ - Damping:
$\gamma = 0.4$ - Number of iterations: 50
Next, we demonstrate our algorithm on the problem of bearings-only tracking of a turning target. This is a five dimensional nonlinear system in the variables
- Linearisation method: Taylor transform
- Power:
$\alpha = 1.0$ - Damping:
$\gamma = 0.6$ - Number of iterations: 10
The video above only displays the spatial components
The Lorenz 96 model is another well-known benchmark for nonlinear state-estimation. This is governed by the following system of ODEs:
for
- Linearisation method: unscented transform
- Power:
$\alpha = 1.0$ - Damping:
$\gamma = 1.0$ - Number of iterations: 5
The video below displays the Hovmöller representation of a single simulation of the L96 model, the absolute error of the prediction, and componentwise negative log likelihood loss.
[1] Thomas P. Minka. Expectation Propagation for Approximate Bayesian Inference. In Proceedings of the Conference on Uncertainty in Artificial Intelligence, 2001.
[2] Thomas P. Minka. Power EP. Technical Report MSR-TR-2004-149, Microsoft Research, 2004.
[3] Sanket Kamthe, So Takao, Shakir Mohamed, Marc P. Deisenroth. Iterative State Estimation in Non-linear Dynamical Systems Using Approximate Expectation Propagation. Transactions on Machine Learning Research, 2022.