# Transforming PyMC models to ModelGraph and back #112
Replies: 1 comment
@lucianopaz thanks for the write-up. My original approach did not introduce the new dummy Value variables.

For others reading the discussion: The value variables define the conditioning points of the logp graph and, possibly, the input variables of a random graph (do-operator, posterior predictive based on traced values).

The goal obviously matters, but in general, I think we want to reason explicitly about the "placement" of value variables in our rewrites. To give a concrete example, note that a graph like:

```python
with pm.Model() as m:
    x = pm.Normal.dist()
    y = pm.Normal.dist() + x
    m.register_rv(x, "x")
    m.register_rv(y, "y")
```

Is very different from the following, for the purposes of logp evaluation / MCMC sampling:

```python
with pm.Model() as m:
    x = pm.Normal.dist()
    y = pm.Normal.dist()
    z = x + y
    m.register_rv(x, "x")
    m.register_rv(y, "y")
    z.name = "z"
    m.add_named_variable(z)  # Deterministic
```

Which is also different from the following (whose logp / MCMC sampling is currently unsupported by PyMC):

```python
with pm.Model() as m:
    x = pm.Normal.dist()
    y = pm.Normal.dist()
    z = x + y
    m.register_rv(z, "z")
```

The new …

Other times, our rewrites may be just about changing the conditioning points, without altering anything from the random graph:
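For example, a hedged sketch using `conditional_logp` from `pymc.logprob` (the helper's name and location have moved across versions; it descends from Aeppl's joint-logprob machinery): keep the random graph `z = x + y` fixed and only swap which variables are conditioned on.

```python
import pytensor.tensor as pt
import pymc as pm
from pymc.logprob.basic import conditional_logp  # location may vary by version

x = pm.Normal.dist()
y = pm.Normal.dist()
z = x + y

# Condition on (x, z) instead of (x, y): only the conditioning points change;
# the random graph z = x + y itself is untouched.
x_vv = pt.scalar("x_vv")
z_vv = pt.scalar("z_vv")
logp_terms = conditional_logp({x: x_vv, z: z_vv})
```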
Again it helps that they are an explicit part of the graph. We can use the same "language" to do these rewrites.

It may be worthwhile to note that this type of marker `Op` was introduced in the IR representation of Aeppl recently, because we always needed to check that the "source of measurability" was not being conditioned on already. This is a more specific reason related to the logp rewrites, but I think it shows how these markers may be generally useful: aesara-devs/aeppl#78

Some rewrites require manipulating the value variables themselves. Examples: splitting observed/missing components; splitting the initial and innovation steps of a time-series so that they can be sampled separately; removing a value variable during marginalization. Having the variables directly as inputs to these dummy Ops gives us a very natural hook to manipulate them. Old Aeppl and current PyMC had to add a …

**Potentials**

I also think it makes a lot of sense to label Potentials, because those correspond to expressions that exist on the logp space, and have nothing to do with the random space. We usually don't want to mess with them when we manipulate "random" graphs.

**Deterministics**

The exception here is Deterministics! Initially I didn't add a dummy Op for them, and they were just an "un-wrapped" output. I ended up adding them just because it looked cleaner, but I think my initial hunch was correct! Deterministics shouldn't constrain our rewrites at all. I think we should add them as new "copy" outputs, and leave them out of the main random graph. So the following user-defined graph:

```python
with pm.Model() as m:
    x = pm.Normal("x")
    exp_x = pm.Deterministic("exp_x", pm.math.exp(x))
    y = pm.Normal("y", exp_x)
```

Should be represented internally as the following:

```python
with pm.Model() as m:
    x = pm.Normal("x")
    exp_x = pm.math.exp(x)
    y = pm.Normal("y", exp_x)
    pm.Deterministic("exp_x", exp_x.copy())
```

We can still add the dummy Deterministic when we put …
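At the FunctionGraph level, the "copy output" idea could look like the following sketch (`.copy()` wraps the variable in pytensor's identity Op; the surrounding setup is illustrative):

```python
import pymc as pm
import pytensor.tensor as pt
from pytensor.graph import FunctionGraph

x = pm.Normal.dist()
exp_x = pt.exp(x)
y = pm.Normal.dist(mu=exp_x)

# The Deterministic is a side "copy" output; rewrites that manipulate the
# random graph (x, y) never have to reason about it.
fgraph = FunctionGraph(outputs=[x, y, exp_x.copy(name="exp_x")], clone=False)
```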
---
This started out as an internal discussion some months ago. Since @ricardoV94 has opened #111, I thought that it would be best to condense all of the discussion here and open it up to everyone that is interested.
## Goal

PyMC provides a way to define a generative model (and also non-generative models through the use of `Potential`s) and then gives access to automatic ways of drawing samples from the prior, posterior and posterior predictive distributions. Since random variables are now a `Tensor` (at some point via `theano`, then `aesara` and now `pytensor`), we can leverage the computational backend to do rewrites of models. I'll list a few relevant use cases of model rewrites:

- `do` operator (discussed here). Where we say, "replace some random variable with a given value" (see the sketch after this list).
- `observe` operator. This would allow us to first define a model, and then say "this rv should have these observed values". Maybe `do` and `observe` are equivalent (I don't know enough about do calculus to say), but from my naive point of view, `observe` adds a logp term conditioned on the observed values, and `do` simply sets the values but ignores the logp.
- `conditional` so that all variables and deterministics downstream now depend on the conditional when you do `sample_posterior_predictive`. Before, users had to recreate their deterministics on top of the conditional manually.
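To make the `do` case concrete, here is a minimal, hypothetical sketch (not an existing PyMC operator) that performs the substitution with pytensor's `graph_replace`:

```python
import pytensor.tensor as pt
from pytensor.graph.replace import graph_replace
import pymc as pm

x = pm.Normal.dist()
y = pm.Normal.dist(mu=x)

# do(x := 1.5): replace the random variable x with a fixed value downstream;
# y no longer depends on a draw of x, and no logp term for x is introduced.
[y_do] = graph_replace([y], {x: pt.constant(1.5, dtype=x.dtype)})
```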
PyTensor (and `aesara` before that) enables ways to rewrite the computational graph. The piece that is missing is to connect PyMC `Model`s with the entities that are used for applying rewrites: `FunctionGraph`s.

## What do PyMC models store?
PyMC models work as bookkeepers of a few things:

- the free and observed random variables, the deterministics and the potentials
- the mappings between random variables and their value variables
- extra configuration such as dimension/coordinate information and name scopes

Many of the above entities are simple `TensorVariable`s and mappings between them. This means that we could very plainly use a `FunctionGraph` that takes the random variables, deterministics and potentials as outputs, and we'd get the PyMC model's induced function graph. We will need to store the rest of the information somewhere to be able to make the leap back into a `Model` instance.
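A minimal sketch of that construction (pytensor's `FunctionGraph` infers the graph inputs when `inputs` is not given; the toy model is illustrative):

```python
import pymc as pm
from pytensor.graph import FunctionGraph

with pm.Model() as m:
    x = pm.Normal("x")
    pm.Normal("y", mu=x, observed=[0.1, -0.3])

# Outputs: free/observed RVs, deterministics and potentials; the graph
# inputs are inferred from these outputs.
fgraph = FunctionGraph(
    outputs=m.basic_RVs + m.deterministics + m.potentials,
    clone=True,
)
```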
## How to target rewrites?

We need to choose how to track some of the meta information. In particular, how to track which `TensorVariable` is a free random variable, an observed random variable, a deterministic or a potential. I see two alternatives:

1. Use the `tag` of the `Tensor`.
2. Wrap the `Tensor` using a new `Op` (the approach taken in Add utility to convert Model to and from FunctionGraph #111).
The benefit of the first is that rewrites don't need to reason about newly invented `Op`s since they can work with whatever was in the computational graph. The benefit of the second approach is that the new `Op`s can include extra information that conditions the shapes and value variables of the resulting RVs (e.g. #111 includes the dimension information in the `Op`s directly).
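For intuition, a bare-bones sketch of what such a wrapper `Op` might look like; the name `ModelFreeRVMarker` is illustrative, and the actual Ops in #111 carry more information (e.g. dims):

```python
from pytensor.graph import Apply, Op

class ModelFreeRVMarker(Op):
    """Identity Op that flags its first input as a free RV and keeps the
    value variable as an explicit input of the graph (illustrative only)."""

    def make_node(self, rv, value):
        return Apply(self, [rv, value], [rv.type()])

    def perform(self, node, inputs, output_storage):
        # Pure identity on the RV input; the value input only marks the
        # conditioning point.
        output_storage[0][0] = inputs[0]
```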
There is still a lot of extra information that we need to carry around with us: the mappings, the configuration and scopes. All of these could potentially be included through `Feature`s that are appended to the `FunctionGraph`.
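A sketch of that last idea, reusing the `fgraph` built earlier (`Feature` and `attach_feature` are real pytensor API; the metadata fields here are illustrative):

```python
from pytensor.graph.features import Feature

class ModelMetadata(Feature):
    """Carries the non-graph bookkeeping (value-variable mappings, coords,
    name scopes, ...) needed to rebuild a Model from the FunctionGraph."""

    def __init__(self, rvs_to_values, coords, scopes):
        self.rvs_to_values = rvs_to_values
        self.coords = coords
        self.scopes = scopes

fgraph.attach_feature(ModelMetadata(rvs_to_values={}, coords={}, scopes=[]))
```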