# Transforming PyMC models to ModelGraph and back #112
Replies: 1 comment
@lucianopaz thanks for the write-up. My original approach did not introduce the new dummy Value variables.

For others reading the discussion: The value variables define the conditioning points of the logp graph and, possibly, the input variables of a random graph (do-operator, posterior predictive based on traced values).

The goal obviously matters, but in general, I think we want to reason explicitly about the "placement" of value variables in our rewrites. To give a concrete example, note that a graph like:

```python
with pm.Model() as m:
    x = pm.Normal.dist()
    y = pm.Normal.dist() + x
    m.register_rv(x, "x")
    m.register_rv(y, "y")
```

Is very different from the following, for the purposes of logp evaluation / MCMC sampling:

```python
with pm.Model() as m:
    x = pm.Normal.dist()
    y = pm.Normal.dist()
    z = x + y
    m.register_rv(x, "x")
    m.register_rv(y, "y")
    z.name = "z"
    m.add_named_variable(z)  # Deterministic
```

Which is also different from the following (whose logp / MCMC sampling is currently unsupported by PyMC):

```python
with pm.Model() as m:
    x = pm.Normal.dist()
    y = pm.Normal.dist()
    z = x + y
    m.register_rv(z, "z")
```

The new …

Other times, our rewrites may be just about changing the conditioning points, without altering anything from the random graph:
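For example, a hedged sketch using `conditional_logp` from `pymc.logprob` (the helper's name and location have moved across versions; it descends from Aeppl's joint-logprob machinery): keep the random graph `z = x + y` fixed and only swap which variables are conditioned on.

```python
import pytensor.tensor as pt
import pymc as pm
from pymc.logprob.basic import conditional_logp  # location may vary by version

x = pm.Normal.dist()
y = pm.Normal.dist()
z = x + y

# Condition on (x, z) instead of (x, y): only the conditioning points change;
# the random graph z = x + y itself is untouched.
x_vv = pt.scalar("x_vv")
z_vv = pt.scalar("z_vv")
logp_terms = conditional_logp({x: x_vv, z: z_vv})
```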
Again it helps that they are an explicit part of the graph. We can use the same "language" to do these rewrites.

It may be worthwhile to note that this type of marker `Op` was introduced in the IR representation of Aeppl recently, because we always needed to check that the "source of measurability" was not being conditioned on already. This is a more specific reason related to the logp rewrites, but I think it shows how these markers may be generally useful: aesara-devs/aeppl#78

Some rewrites require manipulating the value variables themselves. Examples: splitting observed/missing components; splitting the initial and innovation steps of a time-series so that they can be sampled separately; removing a value variable during marginalization. Having the variables directly as inputs to these dummy Ops gives us a very natural hook to manipulate them. Old Aeppl and current PyMC had to add a …

**Potentials**

I also think it makes a lot of sense to label Potentials, because those correspond to expressions that exist on the logp space, and have nothing to do with the random space. We usually don't want to mess with them when we manipulate "random" graphs.

**Deterministics**

The exception here is Deterministics! Initially I didn't add a dummy Op for them, and they were just an "un-wrapped" output. I ended up adding them just because it looked cleaner, but I think my initial hunch was correct! Deterministics shouldn't constrain our rewrites at all. I think we should add them as new "copy" outputs, and leave them out of the main random graph. So the following user-defined graph:

```python
with pm.Model() as m:
    x = pm.Normal("x")
    exp_x = pm.Deterministic("exp_x", pm.math.exp(x))
    y = pm.Normal("y", exp_x)
```

Should be represented internally as the following:

```python
with pm.Model() as m:
    x = pm.Normal("x")
    exp_x = pm.math.exp(x)
    y = pm.Normal("y", exp_x)
    pm.Deterministic("exp_x", exp_x.copy())
```

We can still add the dummy Deterministic when we put …
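At the FunctionGraph level, the "copy output" idea could look like the following sketch (`.copy()` wraps the variable in pytensor's identity Op; the surrounding setup is illustrative):

```python
import pymc as pm
import pytensor.tensor as pt
from pytensor.graph import FunctionGraph

x = pm.Normal.dist()
exp_x = pt.exp(x)
y = pm.Normal.dist(mu=exp_x)

# The Deterministic is a side "copy" output; rewrites that manipulate the
# random graph (x, y) never have to reason about it.
fgraph = FunctionGraph(outputs=[x, y, exp_x.copy(name="exp_x")], clone=False)
```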
---
This started out as an internal discussion some months ago. Since @ricardoV94 has opened #111, I thought that it would be best to condense all of the discussion here and open it up to everyone that is interested.
## Goal

PyMC provides a way to define a generative model (and also non-generative models through the use of `Potential`s) and then gives access to automatic ways of drawing samples from the prior, posterior and posterior predictive distributions. Since random variables are now a `Tensor` (at some point via `theano`, then `aesara` and now `pytensor`), we can leverage the computational backend to do rewrites of models. I'll list a few relevant use cases of model rewrites:

- `do` operator (discussed here). Where we say, "replace some random variable with a given value" (see the sketch after this list).
- `observe` operator. This would allow us to first define a model, and then say "this rv should have these observed values". Maybe `do` and `observe` are equivalent (I don't know enough about do calculus to say), but from my naive point of view, `observe` adds a logp term conditioned on the observed values, and `do` simply sets the values but ignores the logp.
- `conditional` so that all variables and deterministics downstream now depend on the conditional when you do `sample_posterior_predictive`. Before, users had to recreate their deterministics on top of the conditional manually.
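To make the `do` case concrete, here is a minimal, hypothetical sketch (not an existing PyMC operator) that performs the substitution with pytensor's `graph_replace`:

```python
import pytensor.tensor as pt
from pytensor.graph.replace import graph_replace
import pymc as pm

x = pm.Normal.dist()
y = pm.Normal.dist(mu=x)

# do(x := 1.5): replace the random variable x with a fixed value downstream;
# y no longer depends on a draw of x, and no logp term for x is introduced.
[y_do] = graph_replace([y], {x: pt.constant(1.5, dtype=x.dtype)})
```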
PyTensor (and `aesara` before that) enables ways to rewrite the computational graph. The piece that is missing is to connect PyMC `Model`s with the entities that are used for applying rewrites: `FunctionGraph`s.

## What do PyMC models store?
PyMC models work as bookkeepers of a few things:

- the free and observed random variables, the deterministics and the potentials
- the mappings between random variables and their value variables
- extra configuration such as dimension/coordinate information and name scopes

Many of the above entities are simple `TensorVariable`s and mappings between them. This means that we could very plainly use a `FunctionGraph` that takes the random variables, deterministics and potentials as outputs, and we'd get the PyMC model's induced function graph. We will need to store the rest of the information somewhere to be able to make the leap back into a `Model` instance.
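A minimal sketch of that construction (pytensor's `FunctionGraph` infers the graph inputs when `inputs` is not given; the toy model is illustrative):

```python
import pymc as pm
from pytensor.graph import FunctionGraph

with pm.Model() as m:
    x = pm.Normal("x")
    pm.Normal("y", mu=x, observed=[0.1, -0.3])

# Outputs: free/observed RVs, deterministics and potentials; the graph
# inputs are inferred from these outputs.
fgraph = FunctionGraph(
    outputs=m.basic_RVs + m.deterministics + m.potentials,
    clone=True,
)
```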
## How to target rewrites?

We need to choose how to track some of the meta information. In particular, how to track which `TensorVariable` is a free random variable, an observed random variable, a deterministic or a potential. I see two alternatives:

1. Use the `tag` of the `Tensor`.
2. Wrap the `Tensor` using a new `Op` (the approach taken in Add utility to convert Model to and from FunctionGraph #111).
The benefit of the first is that rewrites don't need to reason about newly invented `Op`s since they can work with whatever was in the computational graph. The benefit of the second approach is that the new `Op`s can include extra information that conditions the shapes and value variables of the resulting RVs (e.g. #111 includes the dimension information in the `Op`s directly).
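For intuition, a bare-bones sketch of what such a wrapper `Op` might look like; the name `ModelFreeRVMarker` is illustrative, and the actual Ops in #111 carry more information (e.g. dims):

```python
from pytensor.graph import Apply, Op

class ModelFreeRVMarker(Op):
    """Identity Op that flags its first input as a free RV and keeps the
    value variable as an explicit input of the graph (illustrative only)."""

    def make_node(self, rv, value):
        return Apply(self, [rv, value], [rv.type()])

    def perform(self, node, inputs, output_storage):
        # Pure identity on the RV input; the value input only marks the
        # conditioning point.
        output_storage[0][0] = inputs[0]
```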
There is still a lot of extra information that we need to carry around with us: the mappings, the configuration and scopes. All of these could potentially be included through `Feature`s that are appended to the `FunctionGraph`.
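A sketch of that last idea, reusing the `fgraph` built earlier (`Feature` and `attach_feature` are real pytensor API; the metadata fields here are illustrative):

```python
from pytensor.graph.features import Feature

class ModelMetadata(Feature):
    """Carries the non-graph bookkeeping (value-variable mappings, coords,
    name scopes, ...) needed to rebuild a Model from the FunctionGraph."""

    def __init__(self, rvs_to_values, coords, scopes):
        self.rvs_to_values = rvs_to_values
        self.coords = coords
        self.scopes = scopes

fgraph.attach_feature(ModelMetadata(rvs_to_values={}, coords={}, scopes=[]))
```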