
Sampling Bounded Variables #1068

Closed
brandonwillard opened this issue Apr 13, 2016 · 16 comments

@brandonwillard
Contributor

I get errors when sampling a Bound'ed univariate scalar random variable with (effectively) non-scalar shape, e.g.

import numpy as np
import pymc3 as pm

x_shape = (2,1)
with pm.Model() as model:
    BoundedNormal = pm.Bound(pm.Normal, lower=0)
    x_var = BoundedNormal('x', mu=np.zeros(shape=x_shape),
            sd=np.ones(shape=x_shape), shape=x_shape)
    x_var.random(size=(1,))

The errors look like this.

First, is there anything fundamentally wrong about what's going on in that example?

Regardless, I've hacked up a quick solution (assuming that the underlying bounded [tensor] variable is a collection of i.i.d. variables) and I'm sending a pull request.

@twiecki
Member

twiecki commented Apr 13, 2016

Thanks! This looks good at first sight, but I would love to get @mwibrow's take on it, since he wrote the initial random() support.

@brandonwillard
Contributor Author

Sounds good, especially since I wasn't too sure about the backend sampling framework (see my comment in the pull request).

Also, the i.i.d. assumption doesn't seem very generalizable. The code doesn't explicitly check that the results from self.dist.random are i.i.d., yet it would still run if they weren't, right? For example, if self.dist is a multivariate normal distribution, the bounds check/rejection should be applied to the entire result array, sample, not element by element. Multivariate rejection degrades quickly, so avoiding it is understandable, but it seems like the current result would simply be incorrect.
Changing the code to perform whole-sample (batch) rejection is easy, but the performance penalty for genuinely i.i.d. variables could be large.
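To make the distinction concrete, here is a minimal NumPy sketch of the two rejection modes (independent of PyMC3's internals; bounded_normal_sample and its iid flag are purely illustrative, not existing PyMC3 arguments):

import numpy as np

rng = np.random.RandomState(42)

def bounded_normal_sample(mu, sd, lower, size, iid=True, max_iter=1000):
    """Toy rejection sampler for a normal bounded below by `lower`.

    iid=True refills rejected elements individually, which is only valid when
    every element has the same (scalar) distribution.  iid=False rejects a
    whole draw whenever any element violates the bound; that stays correct
    for non-identical or dependent elements, but the rejection rate can be
    much higher.
    """
    n = int(np.prod(size))
    if iid:
        accepted = np.empty(0)
        for _ in range(max_iter):
            s = rng.normal(mu, sd, size=size).ravel()
            accepted = np.concatenate([accepted, s[s >= lower]])
            if accepted.size >= n:
                return accepted[:n].reshape(size)
    else:
        for _ in range(max_iter):
            s = rng.normal(mu, sd, size=size)
            if np.all(s >= lower):
                return s
    raise RuntimeError("too many rejections")

# With identical parameters, element-wise refilling is fine:
print(bounded_normal_sample(0.0, 1.0, lower=0.0, size=(2, 1)))
# With per-element means like [[50], [500]] and lower=50, element-wise
# refilling would scramble which component each accepted value came from
# (the failure mode above), whereas whole-draw rejection (iid=False) stays
# correct at the cost of more rejections.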

@mwibrow

mwibrow commented Apr 16, 2016

Something definitely isn't right with what I did, and PR #1069 seems to address it, but if the following is a legitimate use-case (i.e., different mu values):

import numpy as np
import pymc3 as pm
x_shape = (2,1)
with pm.Model() as model:
    BoundedNormal = pm.Bound(pm.Normal, lower=0)
    x_var = BoundedNormal('x', mu=np.array([[50],[500]]),
            sd=np.ones(shape=x_shape), shape=x_shape)
    s = x_var.random(size=(1,))

PR #1069 produces [[ 49.6733539 50.35423156]], which is an improvement over the error raised by the existing code, but it isn't right either.

@brandonwillard
Contributor Author

brandonwillard commented Apr 19, 2016

@mwibrow, I don't seem to be able to reproduce that result with the given code; the second term in the sample is ~500, and the mean appears correct too (via s = x_var.random(size=(1000,)) and s.mean(axis=0)).
E.g.

In [20]: !git branch -vv
* bound-shape-fix  b50c566 [origin/bound-shape-fix] quick solution to Bounded sampling of arbitrary shapes
  master           2924910 [origin/master] DOC Add link to pymc3 paper and citation.
  njobs-import-fix e9883ec [origin/njobs-import-fix] added missing import alias for njobs=None condition
In [15]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:import numpy as np
:import pymc3 as pm
:
:x_shape = (2,1)
:
:with pm.Model() as model:
:    BoundedNormal = pm.Bound(pm.Normal, lower=0)
:    x_var = BoundedNormal('x', mu=np.array([[50],[500]]),
:            sd=np.ones(shape=x_shape), shape=x_shape)
:    s = x_var.random(size=(1000,))
:    s.mean(axis=0)
:--
Out[15]:
array([[  49.98727624],
       [ 499.97657311]])

Any recommendations (different settings, etc.)?

@brandonwillard
Contributor Author

brandonwillard commented Apr 19, 2016

Actually, @mwibrow, it looks like I can reproduce your result by setting lower=50! Looking into it...
Also, it seems like we should set RNG seeds in these examples.

@brandonwillard
Contributor Author

brandonwillard commented Apr 20, 2016

OK, so that example is really just an instance of the i.i.d. assumption being broken. When one of the samples is rejected, the code that follows is no longer consistent: we can't just push the remaining accepted samples onto the samples array (it's used like a stack), because after reshaping we end up with mismatched samples and dimensions.
In this exact case, we have a 2-D truncated normal, or two truncated normals, with different means (that's what breaks the "identical" in i.i.d.).

FYI: I've updated the sampler in that branch to take an iid parameter that toggles acceptance/rejection of entire samples. It needs testing, but it should make a truncated sampler behave consistently.

@brandonwillard
Contributor Author

brandonwillard commented May 3, 2016

To answer this problem somewhat thoroughly, we need to know:

  1. whether the underlying distribution is multivariate or a collection of independent univariates, and
  2. the set of distribution parameters, so we can check whether they're identical or not.

I don't see how those are possible through the object/class structure of PyMC3 distributions alone, but there are some hackish approaches that might work. For instance, using introspection, we could ask whether the distribution class is defined in the multivariate module. The distribution parameters seem to be available through self.dist.kwargs (in the context of a Bounded object), but it's not clear that kwargs is meant to hold only those parameters.
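For illustration, a rough sketch of that introspection hack (it assumes PyMC3's module layout; is_multivariate is a hypothetical helper, not part of PyMC3):

import pymc3.distributions.multivariate as pm_mv

def is_multivariate(dist):
    # True if the wrapped distribution class is defined in the multivariate module.
    return type(dist).__module__ == pm_mv.__name__

# Inside a Bounded object one might then branch on is_multivariate(self.dist)
# and read the parameters from self.dist.kwargs, with the caveat above that
# kwargs may contain more than just the distribution parameters.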

What do you folks say to a class structure that clearly specifies the dimension of the support (at least uni- vs. multivariate), plus a Distribution.params member that explicitly tracks the essential distribution parameters (unless such things already exist and I've missed them)?

@twiecki
Member

twiecki commented May 4, 2016

Could we add a new baseclass Multivariate that inherits from Continuous? Then we could do an isinstance check.
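A minimal sketch of what that could look like (illustrative names only; this is not existing PyMC3 code):

from pymc3.distributions import Continuous

class Multivariate(Continuous):
    """Marker base class for continuous distributions with non-scalar support."""

# Bounded.random could then choose a rejection strategy explicitly, e.g.:
#     if isinstance(self.dist, Multivariate):
#         ...  # accept/reject entire samples
#     else:
#         ...  # element-wise rejection (safe only for identical parameters)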

@brandonwillard
Contributor Author

That's what I was thinking.

@twiecki
Member

twiecki commented May 4, 2016

Great, can you add that here?

@brandonwillard
Contributor Author

brandonwillard commented May 4, 2016

Yep, and what about adding a Distribution.params member: worth it, or already covered by kwargs? I'm afraid extra stuff could creep into the latter.

@twiecki
Member

twiecki commented May 4, 2016

Just to make sure: your suggestion ("For instance, using introspection, we could ask if the distribution object is in the multivariate module") is different from what I proposed with the base class.

A bit undecided on Distribution.params. I suppose it can't hurt but has some overhead. I don't think anything but the parameters are passed in currently, but you are saying that you need to access all of them later on?

@brandonwillard
Contributor Author

brandonwillard commented May 4, 2016

Yes, I was just mentioning a "hack".

Regarding the shape parameter: I recall finding a PyMC3 issue with a long conversation about this. Personally, I think shape should refer to the dimension of the distribution's support, and a separate parameter--if any--should indicate the number of samples/replications. Actually, when multiple samples/replications are requested, it would be even better to return an instance of a multivariate class that wraps the univariate distribution. That class could even be a "collection of independent variates" class.

@brandonwillard
Contributor Author

brandonwillard commented May 6, 2016

I just tried specifying some MvNormal distributions and noticed that my assumptions don't seem to carry over to the current multivariate distributions.
Simply put, I was assuming that Distribution.shape = replications_shape + support_shape, so for the univariate distributions support_shape = () (i.e., they have scalar support). For the multivariate distributions, like MvNormal, I checked that assumption with the following, and the results were confusing.
First, establish the basics (with support_shape = (dim,) and integer dim):

def test_multi_shape(reps_shape=(), dim=2):
    supp_shape = (dim,)
    test_shape_mu = reps_shape + supp_shape
    test_shape_tau = reps_shape + supp_shape + supp_shape
    with pm.Model():
        x = pm.MvNormal('x'
                , mu=np.zeros(shape=test_shape_mu)
                , tau=np.ones(shape=test_shape_tau)
                , shape=test_shape_mu)
        assert x.random().shape == test_shape_mu

test_multi_shape((), 3)

Now, when I change the replications shape reps_shape to any other tuple of integers, it fails with something like this. That error comes from theano.tensor.nlinalg.det, which only accepts matrix arguments. Applying det across reps_shape would probably take care of that, though.
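For reference, a tiny sketch of the "apply det across reps_shape" idea, mapping det over a leading replication dimension with theano.scan (illustrative only; the values and shapes here are made up):

import numpy as np
import theano
import theano.tensor as T
from theano.tensor.nlinalg import det

# A stack of three 2x2 precision matrices, i.e. reps_shape=(3,).
taus = T.as_tensor_variable(np.tile(2 * np.eye(2), (3, 1, 1)))
dets, _ = theano.scan(fn=det, sequences=[taus])
print(dets.eval())  # [ 4.  4.  4.]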

Anyway, am I simply mistaken in these assumptions?

@brandonwillard
Contributor Author

brandonwillard commented May 6, 2016

Here's an example of a potential fix. I've written it to highlight the elements mentioned above.

import numpy as np
import theano
import theano.tensor as T
from theano.tensor.nlinalg import det

def mvnorm_logf(mu, tau, value):
    mu = T.as_tensor_variable(mu)
    tau = T.as_tensor_variable(tau)
    reps_shape_T = tau.shape[:-2]
    reps_shape_prod = T.prod(reps_shape_T, keepdims=True)
    dist_shape_T = mu.shape[-1:]

    # Collapse the replication dimensions into a single leading dimension.
    flat_supp_shape = T.concatenate([reps_shape_prod, dist_shape_T])
    # TODO: This could be made more generic.  If we had a collection of params,
    # we could simply loop through those.
    mus_collapsed = mu.reshape(flat_supp_shape, ndim=2)
    taus_collapsed = tau.reshape(T.concatenate([reps_shape_prod,
        dist_shape_T, dist_shape_T]), ndim=3)
    params_collapsed = [mus_collapsed, taus_collapsed]

    # Force value to conform to reps_shape.
    value_reshape = T.ones_like(mu) * value
    values_collapsed = value_reshape.reshape(flat_supp_shape, ndim=2)

    def single_logl(_mu, _tau, _value, k):
        # Log-density of one k-dimensional normal with mean _mu and precision _tau.
        delta = _value - _mu
        result = k * T.log(2 * np.pi) - T.log(det(_tau))
        result += (delta.dot(_tau) * delta).sum(axis=-1)
        return -result / 2

    # Apply the single-replication log-likelihood across the collapsed
    # replication dimension.
    res, _ = theano.scan(fn=single_logl,
                         sequences=params_collapsed + [values_collapsed],
                         non_sequences=[dist_shape_T],
                         strict=True)

    return res.sum()

My note about the params, and the use of a params_collapsed list, are only meant to suggest that we could probably take the single_logl function from a univariate distribution and use these generic operations as a default implementation in an "independent multivariate" class.
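As a quick sanity check of the sketch above (hypothetical usage, assuming a working Theano installation): two independent three-dimensional standard normals evaluated at their means should give a total log-density of -3 * log(2*pi), roughly -5.51.

# Hypothetical usage: reps_shape=(2,), support dimension k=3.
reps, k = 2, 3
mu = np.zeros((reps, k))
tau = np.tile(np.eye(k), (reps, 1, 1))  # identity precision for each replication
value = np.zeros((reps, k))

logp = mvnorm_logf(mu, tau, value)
print(logp.eval())  # expect -reps * (k / 2) * log(2*pi) ~ -5.51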

@twiecki
Member

twiecki commented May 10, 2016

Thanks @brandonwillard. I think fixing the shape issue is one of the last items standing in the way of a major release. Do you think you could help apply this to the rest of the code-base so that we have a good standard?

@twiecki closed this as completed Dec 22, 2018