
Custom Likelihood Op Fails v3.8 and v3.9 #4057

Closed
tvwenger opened this issue Aug 17, 2020 · 6 comments

@tvwenger (Contributor)

Description of your problem

Sampling a model with a custom likelihood implemented as a theano.tensor.Op fails in v3.8 when chains > 1, and always fails in v3.9. See the following example:

import numpy as np
import pymc3 as pm
import theano.tensor as tt
import theano.tests.unittest_tools as utt

class Loglike(tt.Op):
    itypes = [tt.dvector]
    otypes = [tt.dscalar]

    def __init__(self, data):
        self.data = data
        self.loglike_grad = LoglikeGrad(self.data)

    def perform(self, node, inputs, outputs):
        theta, = inputs
        mu, sigma = theta
        logp = -np.log(np.sqrt(2.0*np.pi)*sigma)
        logp += -np.sum((self.data - mu)**2.0)/(2.0*sigma**2.0)
        outputs[0][0] = np.array(logp)

    def grad(self, inputs, grad_outputs):
        theta, = inputs
        grads = self.loglike_grad(theta)
        return [grad_outputs[0]*grads]

class LoglikeGrad(tt.Op):
    itypes = [tt.dvector]
    otypes = [tt.dvector]

    def __init__(self, data):
        self.data = data

    def perform(self, node, inputs, outputs):
        theta, = inputs
        mu, sigma = theta
        dmu = np.sum(self.data - mu)/sigma**2.0
        dsigma = np.sum((self.data - mu)**2.0)/sigma**3.0 - 1.0/sigma
        outputs[0][0] = np.array([dmu, dsigma])

def main():
    mu_true = 5.0
    sigma_true = 1.0
    data = np.random.normal(loc=mu_true, scale=sigma_true, size=1000)
    loglike = Loglike(data)
    utt.verify_grad(loglike, [np.array([3.0, 2.0])])
    # verify_grad passes with no errors
    with pm.Model() as model:
        mu = pm.Normal('mu', mu=4.0, sigma=2.0, testval=4.0)
        sigma = pm.HalfNormal('sigma', sigma=5.0, testval=5.0)
        theta = tt.as_tensor_variable([mu, sigma])
        like = pm.DensityDist('like', lambda v: loglike(v), observed={'v': theta})
    with model:
        trace = pm.sample(cores=2, chains=2)

if __name__ == "__main__":
    print("pymc3 version: ", pm.__version__)
    main()
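For reference, the math implemented above: the perform method computes

\[
\log p(\theta) = -\log\!\left(\sqrt{2\pi}\,\sigma\right) - \frac{1}{2\sigma^{2}}\sum_{i}\left(d_{i}-\mu\right)^{2},
\]

and LoglikeGrad returns its exact partial derivatives,

\[
\frac{\partial \log p}{\partial \mu} = \frac{1}{\sigma^{2}}\sum_{i}\left(d_{i}-\mu\right),
\qquad
\frac{\partial \log p}{\partial \sigma} = \frac{1}{\sigma^{3}}\sum_{i}\left(d_{i}-\mu\right)^{2} - \frac{1}{\sigma},
\]

which is why verify_grad passes. (Note that this snippet omits the usual factor of N = len(data) on the normalisation term; the later example in this thread includes it.)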

This runs fine with PyMC3 v3.8 as long as I run only one chain. With chains > 1 on v3.8, or with any number of chains on v3.9.3, I get the following error after sampling finishes:

MissingInputError: Input 0 of the graph (indices start from 0), used to compute Elemwise{exp,no_inplace}(sigma_log__), was not provided and not given a value. Use the Theano flag exception_verbosity='high', for more information on this error.

Here is the output on v3.8 with chains=1

pymc3 version:  3.8
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Sequential sampling (1 chains in 1 job)
NUTS: [sigma, mu]
Sampling chain 0, 0 divergences: 100%|████████████████████████████████████████████████████████████████████████| 1000/1000 [00:01<00:00, 869.02it/s]
Only one chain was sampled, this makes it impossible to run some convergence checks

Here is the full 3.8 traceback with chains = 2:

pymc3 version:  3.8
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [sigma, mu]
Sampling 2 chains, 0 divergences: 100%|███████████████████████████████████████████████████████████████████| 2000/2000 [00:00<00:00, 2108.88draws/s]
Traceback (most recent call last):
  File "test.py", line 57, in <module>
    main()
  File "test.py", line 53, in main
    trace = pm.sample(cores=2, chains=2)
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/pymc3/sampling.py", line 498, in sample
    trace.report._run_convergence_checks(trace, model)
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/pymc3/backends/report.py", line 84, in _run_convergence_checks
    self._ess = ess = ess(trace, var_names=varnames)
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/pymc3/stats/__init__.py", line 24, in wrapped
    return func(*args, **kwargs)
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/arviz/stats/diagnostics.py", line 191, in ess
    dataset = convert_to_dataset(data, group="posterior")
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/arviz/data/converters.py", line 177, in convert_to_dataset
    inference_data = convert_to_inference_data(obj, group=group, coords=coords, dims=dims)
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/arviz/data/converters.py", line 91, in convert_to_inference_data
    return from_pymc3(trace=kwargs.pop(group), **kwargs)
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/arviz/data/io_pymc3.py", line 531, in from_pymc3
    save_warmup=save_warmup,
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/arviz/data/io_pymc3.py", line 159, in __init__
    self.observations, self.multi_observations = self.find_observations()
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/arviz/data/io_pymc3.py", line 172, in find_observations
    multi_observations[key] = val.eval() if hasattr(val, "eval") else val
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/theano/gof/graph.py", line 522, in eval
    self._fn_cache[inputs] = theano.function(inputs, self)
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/theano/compile/function.py", line 317, in function
    output_keys=output_keys)
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/theano/compile/pfunc.py", line 486, in pfunc
    output_keys=output_keys)
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/theano/compile/function_module.py", line 1839, in orig_function
    name=name)
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/theano/compile/function_module.py", line 1487, in __init__
    accept_inplace)
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/theano/compile/function_module.py", line 181, in std_fgraph
    update_mapping=update_mapping)
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/theano/gof/fg.py", line 175, in __init__
    self.__import_r__(output, reason="init")
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/theano/gof/fg.py", line 346, in __import_r__
    self.__import__(variable.owner, reason=reason)
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/theano/gof/fg.py", line 391, in __import__
    raise MissingInputError(error_msg, variable=r)
theano.gof.fg.MissingInputError: Input 0 of the graph (indices start from 0), used to compute Elemwise{exp,no_inplace}(sigma_log__), was not provided and not given a value. Use the Theano flag exception_verbosity='high', for more information on this error.

And here is the full v3.9.3 traceback with chains=anything:

pymc3 version:  3.9.3
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [sigma, mu]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 2 seconds.
Traceback (most recent call last):
  File "test.py", line 57, in <module>
    main()
  File "test.py", line 53, in main
    trace = pm.sample(cores=2, chains=2)
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/pymc3/sampling.py", line 624, in sample
    idata = arviz.from_pymc3(trace, **ikwargs)
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/arviz/data/io_pymc3.py", line 531, in from_pymc3
    save_warmup=save_warmup,
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/arviz/data/io_pymc3.py", line 159, in __init__
    self.observations, self.multi_observations = self.find_observations()
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/arviz/data/io_pymc3.py", line 172, in find_observations
    multi_observations[key] = val.eval() if hasattr(val, "eval") else val
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/theano/gof/graph.py", line 522, in eval
    self._fn_cache[inputs] = theano.function(inputs, self)
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/theano/compile/function.py", line 317, in function
    output_keys=output_keys)
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/theano/compile/pfunc.py", line 486, in pfunc
    output_keys=output_keys)
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/theano/compile/function_module.py", line 1839, in orig_function
    name=name)
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/theano/compile/function_module.py", line 1487, in __init__
    accept_inplace)
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/theano/compile/function_module.py", line 181, in std_fgraph
    update_mapping=update_mapping)
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/theano/gof/fg.py", line 175, in __init__
    self.__import_r__(output, reason="init")
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/theano/gof/fg.py", line 346, in __import_r__
    self.__import__(variable.owner, reason=reason)
  File "/home/twenger/anaconda3/lib/python3.7/site-packages/theano/gof/fg.py", line 391, in __import__
    raise MissingInputError(error_msg, variable=r)
theano.gof.fg.MissingInputError: Input 0 of the graph (indices start from 0), used to compute Elemwise{exp,no_inplace}(sigma_log__), was not provided and not given a value. Use the Theano flag exception_verbosity='high', for more information on this error.

Versions and main components

  • PyMC3 Version: 3.8 or 3.9.3
  • Theano Version: 1.0.5
  • Python Version: 3.7.6
  • Operating system: Ubuntu 20.04 on Windows Subsystem for Linux 2
  • How did you install PyMC3: pip
@tvwenger (Contributor, Author)

This worked in v3.7: I've just downgraded and confirmed that it still runs (with chains > 1) under v3.7:

pymc3 version:  3.7
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [sigma, mu]
Sampling 4 chains: 100%|██████████████████████████████████████████████████████████████████████████████████| 4000/4000 [00:00<00:00, 4000.38draws/s]

I've also tried it on Ubuntu 16.04 (i.e., not a WSL environment), and confirmed that the issues with v3.9.3 and v3.8 (chains > 1) exist there as well.

@junpenglao (Member)

Thanks for reporting; I can confirm the bug. It seems to be related to IO with ArviZ. @OriolAbril, could you take a look?

@OriolAbril (Member)

This is related to #4002 and arviz-devs/arviz#1279. It is currently an incompatibility between ArviZ and PyMC3, and we have to decide how best to fix it on either side, or on both at once. TL;DR from the other issue: the root of the problem is passing sampled variables through the observed keyword of the DensityDist; theta has a different value for each sample, which breaks ArviZ.

In the meantime I'd recommend using a Potential instead of a DensityDist, or going all the way to define a custom Distribution.

    with pm.Model() as model:
        mu = pm.Normal('mu', mu=4.0, sigma=2.0, testval=4.0)
        sigma = pm.HalfNormal('sigma', sigma=5.0, testval=5.0)
        theta = tt.as_tensor_variable([mu, sigma])
        pm.Potential("like", loglike(theta))
    with model:
        trace = pm.sample(cores=2, chains=2)

@tvwenger
Copy link
Contributor Author

Excellent - thanks for the recommendation. I can confirm that the following works. Are there any downsides to using Potential instead of DensityDist?

import numpy as np
import pymc3 as pm
import theano.tensor as tt
import theano.tests.unittest_tools as utt

class Loglike(tt.Op):
    itypes = [tt.dvector]
    otypes = [tt.dscalar]

    def __init__(self, data):
        self.data = data
        self.loglike_grad = LoglikeGrad(self.data)

    def perform(self, node, inputs, outputs):
        theta, = inputs
        mu, sigma = theta
        logp = -len(self.data)*np.log(np.sqrt(2.0*np.pi)*sigma)
        logp += -np.sum((self.data - mu)**2.0)/(2.0*sigma**2.0)
        outputs[0][0] = np.array(logp)

    def grad(self, inputs, grad_outputs):
        theta, = inputs
        grads = self.loglike_grad(theta)
        return [grad_outputs[0]*grads]

class LoglikeGrad(tt.Op):
    itypes = [tt.dvector]
    otypes = [tt.dvector]

    def __init__(self, data):
        self.data = data

    def perform(self, node, inputs, outputs):
        theta, = inputs
        mu, sigma = theta
        dmu = np.sum(self.data - mu)/sigma**2.0
        dsigma = np.sum((self.data - mu)**2.0)/sigma**3.0 - len(self.data)/sigma
        outputs[0][0] = np.array([dmu, dsigma])

def main():
    mu_true = 5.0
    sigma_true = 1.0
    data = np.random.normal(loc=mu_true, scale=sigma_true, size=10000)
    loglike = Loglike(data)
    utt.verify_grad(loglike, [np.array([3.0, 2.0])])
    # verify_grad passes with no errors
    with pm.Model() as model:
        mu = pm.Normal('mu', mu=4.0, sigma=2.0, testval=4.0)
        sigma = pm.HalfNormal('sigma', sigma=5.0, testval=2.0)
        theta = tt.as_tensor_variable([mu, sigma])
        like = pm.Potential('like', loglike(theta))
    with model:
        trace = pm.sample()
        print(pm.summary(trace).to_string())

if __name__ == "__main__":
    print("pymc3 version: ", pm.__version__)
    main()
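As a sanity check independent of Theano, the closed-form gradients above can be compared against central finite differences of the log-likelihood in plain NumPy. This is a minimal sketch; the helper names loglike_np, loglike_grad_np, and finite_diff are illustrative, not part of the code above.

```python
import numpy as np

# Minimal NumPy sketch (helper names are illustrative): check the
# closed-form gradients of the Normal log-likelihood against central
# finite differences, independently of Theano's verify_grad.

def loglike_np(theta, data):
    # Same expression as the corrected Loglike.perform above.
    mu, sigma = theta
    out = -len(data) * np.log(np.sqrt(2.0 * np.pi) * sigma)
    out += -np.sum((data - mu) ** 2.0) / (2.0 * sigma ** 2.0)
    return out

def loglike_grad_np(theta, data):
    # Same expressions as LoglikeGrad.perform above.
    mu, sigma = theta
    dmu = np.sum(data - mu) / sigma ** 2.0
    dsigma = np.sum((data - mu) ** 2.0) / sigma ** 3.0 - len(data) / sigma
    return np.array([dmu, dsigma])

def finite_diff(theta, data, eps=1.0e-6):
    # Central finite differences, one parameter at a time.
    grad = np.zeros_like(theta)
    for i in range(len(theta)):
        step = np.zeros_like(theta)
        step[i] = eps
        grad[i] = (loglike_np(theta + step, data)
                   - loglike_np(theta - step, data)) / (2.0 * eps)
    return grad

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(loc=5.0, scale=1.0, size=1000)
    theta = np.array([3.0, 2.0])
    print(np.allclose(loglike_grad_np(theta, data),
                      finite_diff(theta, data), rtol=1.0e-5))
```

If the analytic and numeric gradients disagree here, the bug is in the math itself rather than in the Op wiring.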

johannes-mueller added a commit to boschresearch/pylife that referenced this issue Oct 14, 2020
@tvwenger (Contributor, Author)

Any updates on this?

@OriolAbril (Member)

ArviZ 0.11.0 added the density_dist_obs argument to az.from_pymc3. Set it to False to get the old behaviour of DensityDist using FreeRVs as values passed with the observed kwarg. If the idata_kwargs argument is not yet present in pm.sample, you should do the following:

with ...:
    trace = pm.sample(..., compute_convergence_checks=False)
    idata = az.from_pymc3(trace, density_dist_obs=False)
# use idata for plotting and stats (ess, summary, ...) regardless of whether you
# call the az.* or pm.* function; both use the same ArviZ function under the hood.

As this concerns old PyMC3 versions, stick with ArviZ 0.11.0. The 0.11.1 and 0.11.2 releases were needed to fix some issues with the latest cmdstanpy release at the time; while they fix those issues and work seamlessly with the latest PyMC3 (which is what we test against), they break ArviZ compatibility with old PyMC3 versions.

Things will be completely different in PyMC3 4+: the converter to InferenceData will be part of the PyMC3 codebase, and DensityDist may no longer exist in the same form, so I am closing this issue.
