
Using a model


Once you have constructed a model class, you can instantiate it and obtain an object. This usually looks like this:

P = {'param1': 1, 'param2': 2}
H = MyFactorialHMM(params=P, n_steps=10)

Simulating from the model

To draw a sequence of observations from the model:

Z, X = H.Simulate(random_seed=None)

This returns:

  • Z - the drawn sequence of hidden states
  • X - the drawn sequence of observations

You can set random_seed to a specific seed for reproducibility.

The method SimulateHiddenStates simulates only the sequence of hidden states.
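
A minimal sketch of its use (assuming it accepts the same random_seed argument as Simulate):

# Draw only the sequence of hidden states; no observations are generated
Z = H.SimulateHiddenStates(random_seed=42)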

The Forward algorithm

We follow the alpha-beta forward-backward algorithm formulation, with scaling factors (see Bishop, and note the online errata). The Forward algorithm gives the probability of each hidden state at each step, conditional on the observations up to and including that step. These are called alphas, and their corresponding scaling constants are returned as well. The product of all elements of scaling_constants equals the likelihood of the observations given the model (as illustrated in the sketch below).

alphas, scaling_constants, log_likelihood = H.Forward(observed_states=X)

Returns:

  • alphas - a multidimensional array, where, e.g., alphas[h1, h2, h3, t] corresponds to the probability of seeing the hidden state (h1, h2, h3) at step t, conditional on X1, ... Xt.
  • scaling_constants - The corresponding scaling constants. These are needed to compute the pairwise posterior probability xi (see below).
  • log_likelihood - The log-likelihood of the observations X given the model.
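
Since the scaling constants multiply to the likelihood, the returned log-likelihood can be recovered from them directly. A small sanity-check sketch, assuming numpy and a 1D scaling_constants array:

import numpy as np

# The product of the scaling constants equals the likelihood, so the sum of
# their logs should match the returned log-likelihood
assert np.isclose(np.sum(np.log(scaling_constants)), log_likelihood)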

The Forward-Backward algorithm (E-Step in EM)

This runs both the forward and the backward algorithm, as follows:

alphas, betas, gammas, scaling_constants, log_likelihood = H.EStep(observed_states=X)

Returns:

  • alphas, betas, scaling_constants - the results of the forward-backward algorithm (see the SI and Bishop). Note that the returned alphas and betas are scaled.
  • log_likelihood - The log-likelihood of the observations X given the model.
  • gammas - a multidimensional array, where, e.g., gammas[h1, h2, h3, t] corresponds to the posterior probability of seeing the hidden state (h1, h2, h3) at step t, conditional on all observations.
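
For example, since gammas[..., t] is a posterior distribution over the joint hidden state at step t, it should sum to one. A minimal check, under the indexing assumptions above:

import numpy as np

# The posterior over the joint hidden state at any single step sums to 1
t = 0
assert np.isclose(gammas[..., t].sum(), 1.0)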

Pairwise posterior probability

The forward-backward algorithm allows us to calculate the pairwise posterior probability of pairs of hidden states at consecutive steps. Warning: in most cases you do not want to calculate this explicitly, as doing so will result in a memory blowup. Proceed with caution.

alphas, betas, gammas, scaling_constants, log_likelihood = H.EStep(observed_states=X)
xis = H.CalculateXis(observed_states=X, alphas=alphas, betas=betas, scaling_constants=scaling_constants)

Returns:

  • xis - a multidimensional array, where, e.g., xis[h1, h2, h3, g1, g2, g3, t] corresponds to the posterior probability of seeing the pair of hidden states (h1, h2, h3), (g1, g2, g3) at steps t and t+1, respectively, conditional on all observations.
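
A useful consistency check: summing the pairwise posterior over the second step's state recovers the marginal posterior at the first step. A sketch, assuming three hidden chains as in the indexing example above and that xis has one fewer time slice than gammas:

import numpy as np

# Sum xis[h1, h2, h3, g1, g2, g3, t] over (g1, g2, g3); this should equal
# gammas[h1, h2, h3, t] for all but the last step
marginal_from_xis = xis.sum(axis=(3, 4, 5))
assert np.allclose(marginal_from_xis, gammas[..., :-1])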

Posterior marginal probability of a subset of Markov chains

Often we are interested only in a single hidden chain. We would therefore like to marginalize (collapse) over all the other chains. This can be done as follows:

alphas, betas, gammas, scaling_constants, log_likelihood = H.EStep(observed_states=X)
collapsed_gammas = H.CollapseGammas(gammas, fields)

where:

  • fields - A field or a list of fields to marginalize to, e.g. [H.I.trit].
  • collapsed_gammas - a multidimensional array, where, e.g., collapsed_gammas[h2, t] corresponds to the posterior probability of seeing the hidden state h2 for the Markov chain corresponding to I.trit at step t, conditional on all observations.
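
Collapsing is simply marginalization, so the same result can be obtained by summing gammas over the axes of the other chains. A sketch, under the assumption that H.I.trit is the second of three hidden chains (axis 1 of gammas) and that the time axis is last:

import numpy as np

collapsed_gammas = H.CollapseGammas(gammas, [H.I.trit])
# Marginalize manually over the other two chains (axes 0 and 2) and compare
assert np.allclose(collapsed_gammas, gammas.sum(axis=(0, 2)))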

Posterior pairwise marginal probability of a subset of Markov chains

Similarly, we may be interested in the pairwise marginal posterior over a field or a subset of fields. Warning: this requires calculating the full pairwise posterior first; see CalculateAndCollapseXis below for how to avoid this.

alphas, betas, gammas, scaling_constants, log_likelihood = H.EStep(observed_states=X)
xis = H.CalculateXis(observed_states=X, alphas=alphas, betas=betas, scaling_constants=scaling_constants)
collapsed_xis = H.CollapseXis(xis, fields)

where:

  • fields - A field or a list of fields to marginalize to, e.g. [H.I.trit].
  • collapsed_xis - a multidimensional array, where, e.g., collapsed_xis[h2, g2, t] corresponds to the posterior probability of seeing the pair of hidden states h2, g2 for the chain corresponding to I.trit at steps t and t+1, conditional on all observations.

Since calculating the full pairwise posterior is computationally expensive, we show in our paper that if we want the posterior pairwise distribution for one field (or a subset of fields), we can circumvent the full calculation and compute the pairwise distribution for that field directly:

alphas, betas, gammas, scaling_constants, log_likelihood = H.EStep(observed_states=X)
collapsed_xis = H.CalculateAndCollapseXis(field, observed_states=X, alphas=alphas, betas=betas, scaling_constants=scaling_constants)
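
The result should agree with first calculating the full xis and then collapsing them; a small consistency sketch under that assumption, using numpy and the H.I.trit field from the examples above:

import numpy as np

# Full pairwise posterior, then collapse to the H.I.trit field (expensive)
xis = H.CalculateXis(observed_states=X, alphas=alphas, betas=betas,
                     scaling_constants=scaling_constants)
collapsed_from_full = H.CollapseXis(xis, [H.I.trit])

# Direct calculation for the same field, without forming the full xis array
collapsed_direct = H.CalculateAndCollapseXis(H.I.trit, observed_states=X,
    alphas=alphas, betas=betas, scaling_constants=scaling_constants)

assert np.allclose(collapsed_from_full, collapsed_direct)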

Viterbi

The Viterbi algorithm calculates the most likely sequence of hidden states (conditional on the observations).

most_likely, back_pointers, lls = H.Viterbi(observed_states=X)

Returns:

  • most_likely - a matrix (number of hidden chains × number of steps) of the most likely sequence of hidden states.
  • back_pointers, lls - internal structures of the Viterbi algorithm.
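
On simulated data, the Viterbi path can be compared against the true hidden states. A sketch, assuming Z and most_likely share the same (number of hidden chains × number of steps) layout:

import numpy as np

# Fraction of (chain, step) entries where the Viterbi path matches the
# simulated hidden states Z from Simulate above
accuracy = np.mean(most_likely == Z)
print("Per-state accuracy of the Viterbi path: {:.3f}".format(accuracy))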

Drawing a sequence of hidden states from the posterior distribution

Often we are not interested in the most likely sequence of hidden states, but rather would like to sample from the posterior distribution over sequences of hidden states, conditional on the observations. We can do this for any subsequence (subrange of steps), without the need to sample the entire sequence (see the sketch below).

alphas, betas, gammas, scaling_constants, log_likelihood = H.EStep(observed_states=X)
Y = H.DrawFromPosterior(observed_states=X, alphas=alphas, betas=betas, scaling_constants=scaling_constants, start=0, end=None, random_seed=None)

Where:

  • start and end specify the subrange of steps to sample (end may be None to denote the end of the sequence).
  • You can set random_seed to a specific seed for reproducibility.

This returns:

  • Y - the drawn sequence of hidden states
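
Repeated posterior draws can be used, for example, to estimate posterior marginals empirically. A sketch, assuming Y has the same (number of hidden chains × number of steps) layout as Z:

import numpy as np

# Draw several posterior samples and estimate, per step, how often the first
# chain is in state 0; this should approximate the corresponding collapsed gammas
draws = [H.DrawFromPosterior(observed_states=X, alphas=alphas, betas=betas,
                             scaling_constants=scaling_constants,
                             start=0, end=None, random_seed=seed)
         for seed in range(100)]
freq_state0 = np.mean([draw[0, :] == 0 for draw in draws], axis=0)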

Using the simple discrete HMM

Our package also includes the class FullDiscreteFactorialHMM, which implements a special case of a discrete-observation HMM under some simplifying assumptions on the parameters (see Constructing a model). To use the model, an object needs to be created, as follows:

F = MyFullDiscreteFactorialHMM(n_steps=100)

where the class MyFullDiscreteFactorialHMM is the one defined in Constructing a model. All HMM operations are available for this class, exactly as for the standard discrete HMM. For example, to simulate from the model:

Z, X = F.Simulate()

The key additional property of this class is that it also provides a complete implementation of the EM algorithm. This can be run as:

R = F.EM(X, likelihood_precision=0.1, n_iterations=1000, verbose=True, print_every=1, random_seed=None)

where:

  • X - a sequence of observations.
  • n_iterations - the maximum number of iterations to run the algorithm.
  • likelihood_precision - the algorithm stops when the improvement in the log-likelihood between iterations falls below this value.
  • verbose - whether to print the log-likelihood during the run.
  • print_every - print the log-likelihood once every this many iterations.

The function returns an object R of the same class, FullDiscreteFactorialHMM, whose member variables contain the inferred HMM parameters. For example, the inferred initial state probabilities can be obtained as R.initial_hidden_state_tensor, the inferred transition probabilities as R.transition_matrices_tensor[0] (the index 0 refers to time step 0, but any other time step would do), and the inferred emission probabilities as R.obs_given_hidden[0].
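
A typical end-to-end sketch (parameter names as described above) is to simulate data, fit a model with EM, and inspect the fitted parameters:

# Simulate a sequence, fit with EM, and inspect the inferred parameters
Z, X = F.Simulate(random_seed=0)
R = F.EM(X, likelihood_precision=0.1, n_iterations=1000,
         verbose=True, print_every=10, random_seed=0)

print(R.initial_hidden_state_tensor)       # inferred initial state probabilities
print(R.transition_matrices_tensor[0])     # inferred transition probabilities
print(R.obs_given_hidden[0])               # inferred emission probabilities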