Handling latent parameters #595
I don't know if I'm familiar enough with Turing to help, but I would be very enthusiastic if it landed in the official package / docs, so ping me whenever you run into HMM-related troubles! |
I like the idea here; also agree this can be developed into a complete replacement for submodels. |
In addition to x ~ returned(some_turing_model(...)) to replace @submodel x = some_turing_model(...) Here, |
Though not fundamentally against this, I guess the issue with all of this is that it really conflates what |
Good question. The returned variables in a submodel induce an implicit distribution (sometimes a Delta distribution if RVs do not influence these variables in the submodel). However, the log density of this implicit distribution might be hard to compute or intractable in certain cases. In those cases the condition syntax cannot be applied, since it depends on the likelihood function being tractable. We can probably update the docs to inform the user that certain model operations are not applicable for submodels or, more generally, distributions without closed-form log density functions. |
But because we don't know anything about the underlying model, this is the case for all of the models in DPPL.
We would most certainly have to make the user aware of this if we were to use this syntax. My issue is that we don't want to end up in a scenario where the user has to ask themselves "is this valid?" every time they write I'm not sure which side I'm on tbh. We might be able to handle it properly if we introduce a lot of useful error messages, but it requires care. |
Also, regarding this particular PR, I was naively thinking that we could just use the forward sampling conditioned on the inferred HMM parameters to sample the latents conditioned on data, but of course (thanks to @THargreaves for pointing this out), this doesn't work since in general we don't have This means that we need three different modus operandi:
Buuut currently we don't really have a good way of knowing when to use (1) or (3). This is relevant to the discussion had in #589 regarding how to specify whether we are performing "post inference"-analysis or performing inference. |
My view is that the latents shouldn't be handled by Turing's inference engine by default (unless it is a Turing submodel). Instead, they should be inferred by manually specified external algorithms. |
I'm unsubscribing since I can't help right now, ping me if there is some HMM-related stuff coming up as a test case! |
As written above, if we have a generative model of the form

θ ~ p(θ)
z ~ p(z ∣ θ)
y ~ p(y ∣ θ, z)

and we want to perform the following:
The first two are "easy" (both are achieved in the current PR), while the third is not so trivial. The problem with the third one is as follows:
We could fix this by also adding varname as an argument to the
|
I don't think we can handle inference for |
Two things:
|
It feels like this belongs to the doc! |
This can now be straightforwardly supported via the new
"""
hmm(K, T)
A Hidden Markov Model with `K` states and `T` observations with marginalized hidden states.
"""
@model function hmm(K, T)
# Transition matrix.
π ~ product_distribution(fill(Dirichlet(fill(1 / K, K)), K))
# Mean of emission distribution.
μ ~ Bijectors.ordered(MvNormal(zeros(K), 10I))
# HMM(init, trans, emissions).
hmm = HMM(π[:, 1], permutedims(π), Normal.(μ, 1))
y ~ to_distribution(hmm, T)
return y
end
For
|
Okay, so this all started from this gist https://gist.github.com/JasonPekos/82be830e4bf390fd1cc2886a7518aede by @JasonPekos using Turing.jl combined with @gdalle's HiddenMarkovModels.jl.
This PR demonstrates how we could support such latent parameter models on the RHS of `~` without any / minimal intervention of the user, aaannd it's fairly easy to handle.

We could also do the same to replace `@submodel`, though there it's a bit non-trivial because there are two "types" of realizations: the random variables involved in `~` and those `return`ed, so let's leave that for now.

But with this PR, one needs to implement a few methods:
- `latent(dist)`: returns a `Distribution` for the latent parameters.
- `conditional(dist, latents)`: returns a `Distribution` for the conditional distribution of the data given the latent parameters.
- `marginalize(dist)`: returns a `Distribution` for the marginal distribution of the data.

And that's it!
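
To make the three-method interface concrete, here is a minimal sketch on a toy model rather than an HMM. Everything in it is an assumption for illustration: `LatentMixture` is a hypothetical type, and `latent`, `conditional` and `marginalize` are defined locally as stand-ins for the methods listed above, not taken from this PR or from DynamicPPL. It only relies on Distributions.jl.

```julia
using Distributions

# Hypothetical type (not part of this PR): a single observation drawn from one
# of several Gaussian components, where the component index z is the latent.
struct LatentMixture{T<:Real}
    weights::Vector{T}  # p(z)
    means::Vector{T}    # mean of each component
    σ::T                # shared standard deviation
end

# Local stand-ins for the three methods named in the PR description.
# Distribution of the latent component index: z ~ p(z).
latent(d::LatentMixture) = Categorical(d.weights)
# Distribution of the data given the latent: y ~ p(y | z).
conditional(d::LatentMixture, z::Integer) = Normal(d.means[z], d.σ)
# Distribution of the data with the latent summed out: y ~ p(y).
marginalize(d::LatentMixture) =
    MixtureModel([Normal(m, d.σ) for m in d.means], d.weights)

d = LatentMixture([0.3, 0.7], [-2.0, 2.0], 1.0)
y = 0.5

# Consistency check: summing p(z) * p(y | z) over z should match the marginal.
manual = sum(pdf(latent(d), z) * pdf(conditional(d, z), y) for z in 1:2)
@assert isapprox(manual, pdf(marginalize(d), y))
```

The check at the end is the property that ties the three methods together: whichever distribution is plugged in, `marginalize` should agree with integrating (here, summing) `conditional` against `latent`.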
For example, to allow straight-forward usage of HiddenMarkovModels.jl with Turing.jl, the following "just works" (though it's a bit hacky, it's not particularly difficult):
The above requires the following to be implemented:
Some other people who might be interested in this: @THargreaves @yebai @devmotion